[REVIEW] Upgrade arrow to 4.0.1 #7495

Merged Jun 29, 2021 (97 commits)

galipremsagar
Contributor

@galipremsagar galipremsagar commented Mar 3, 2021

Fixes: #7224

This PR:

  • Adds support for arrow 4.0.1 in cudf.
  • Moves testing-related utilities to cudf.testing module.
  • Fixes miscellaneous errors related to arrow upgrade.

@galipremsagar galipremsagar self-assigned this Mar 3, 2021
@github-actions github-actions bot added CMake CMake build issue conda Python Affects Python cuDF API. libcudf Affects libcudf (C++/CUDA) code. labels Mar 3, 2021
@galipremsagar galipremsagar added 2 - In Progress Currently a work in progress CMake CMake build issue conda libcudf Affects libcudf (C++/CUDA) code. and removed CMake CMake build issue conda libcudf Affects libcudf (C++/CUDA) code. labels Mar 3, 2021
@galipremsagar galipremsagar added 0 - Blocked Cannot progress due to external reasons and removed 2 - In Progress Currently a work in progress labels Mar 4, 2021
Collaborator

@kkraus14 kkraus14 left a comment

We moved the cudf.tests.utils to cudf.testing.utils which is now in a public namespace. Should we make it cudf.testing._utils?

@kkraus14 kkraus14 added breaking Breaking change improvement Improvement / enhancement to an existing function feature request New feature or request and removed improvement Improvement / enhancement to an existing function labels Mar 4, 2021
@kkraus14 kkraus14 added this to PR-WIP in v0.19 Release via automation Mar 4, 2021
@galipremsagar galipremsagar added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 5 - DO NOT MERGE Hold off on merging; see PR for details labels Jun 28, 2021
@galipremsagar galipremsagar moved this from PR-Needs review to PR-Reviewer approved in v21.08 Release Jun 28, 2021
@galipremsagar
Contributor Author

rerun tests

2 similar comments
@trxcllnt
Contributor

rerun tests

@galipremsagar
Contributor Author

rerun tests

@galipremsagar
Contributor Author

@gpucibot merge

1 similar comment
@galipremsagar
Contributor Author

@gpucibot merge

@galipremsagar galipremsagar requested review from isVoid and removed request for isVoid June 29, 2021 23:21
@rapids-bot rapids-bot bot merged commit 1e53776 into rapidsai:branch-21.08 Jun 29, 2021
v21.08 Release automation moved this from PR-Reviewer approved to Done Jun 29, 2021
rapids-bot bot pushed a commit that referenced this pull request Jul 2, 2021
With #7495 merged, it seems like the dev environment files create an environment with the CPU packages for `pyarrow` and `arrow-cpp`; this results in failure when trying to compile libcudf or `import cudf`:

```python
ModuleNotFoundError: No module named 'pyarrow._cuda'
    from cudf import rmm
  File "/opt/conda/lib/python3.8/site-packages/cudf/__init__.py", line 11, in <module>
    from cudf import core, datasets, testing
  File "/opt/conda/lib/python3.8/site-packages/cudf/core/__init__.py", line 3, in <module>
    from cudf.core import _internals, buffer, column, column_accessor, common
  File "/opt/conda/lib/python3.8/site-packages/cudf/core/_internals/__init__.py", line 3, in <module>
    from cudf.core._internals.where import where
  File "/opt/conda/lib/python3.8/site-packages/cudf/core/_internals/where.py", line 11, in <module>
    from cudf.core.column import ColumnBase
  File "/opt/conda/lib/python3.8/site-packages/cudf/core/column/__init__.py", line 3, in <module>
    from cudf.core.column.categorical import CategoricalColumn
  File "/opt/conda/lib/python3.8/site-packages/cudf/core/column/categorical.py", line 25, in <module>
    from cudf import _lib as libcudf
  File "/opt/conda/lib/python3.8/site-packages/cudf/_lib/__init__.py", line 4, in <module>
    from . import (
ImportError: libarrow_cuda.so.400: cannot open shared object file: No such file or directory
```

This updates the dev environments and recipe to ensure that the GPU packages of `pyarrow` (and, accordingly, `arrow-cpp`) are used.
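
A quick way to tell whether an environment has this problem is to try the import directly. The following is a minimal sketch, not part of this change: `try_import` is a hypothetical helper, and the only name taken from the traceback above is the module `pyarrow._cuda`. It catches `ImportError`, which covers both the missing-module case and the shared-library load failure shown in the traceback.

```python
import importlib


def try_import(module_name):
    """Return (True, None) if the module imports, else (False, error text)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, None


if __name__ == "__main__":
    # "pyarrow._cuda" is the module named in the traceback above.
    ok, err = try_import("pyarrow._cuda")
    print("pyarrow CUDA module importable:", ok)
    if not ok:
        print("reason:", err)
```

A `False` result only indicates that the CUDA-enabled module could not be loaded in the current environment; it does not say which package build was installed.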

Authors:
  - Charles Blackmon-Luca (https://github.com/charlesbluca)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #8637
rapids-bot bot pushed a commit to rapidsai/cuspatial that referenced this pull request Jul 2, 2021
…ing (#430)

This PR contains three distinct changes required to get cuspatial builds working and tests passing again:
1. RMM switched to rapids-cmake (rapidsai/rmm#800), which requires CMake 3.20.1, so this PR includes the required updates for that.
2. The Arrow upgrade in cudf also moved the location of testing utilities (rapidsai/cudf#7495). Long term cuspatial needs to move away from use of the testing utilities, which are not part of cudf's public API, but we are currently blocked by rapidsai/cudf#8646, so this PR just imports the internal `assert_eq` method as a stopgap to get tests passing.
3. The changes in rapidsai/cudf#8373 altered the way that metadata is propagated to libcudf outputs from previously existing cuDF Python objects. The new code paths require cuspatial to override metadata copying at the GeoDataFrame level rather than the GeoColumn level, so that information about column types is not lost in the libcudf round trip; the metadata copying functions are now called on the output DataFrame rather than the input one.
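
The stopgap in item 2 amounts to importing a helper from whichever module path the installed cuDF provides. One way this could be sketched, without claiming it is what this PR does: `import_first_available` is a hypothetical helper, and the candidate paths are assumptions drawn from the review discussion in rapidsai/cudf#7495 (`cudf.testing._utils` is the private name proposed there, `cudf.tests.utils` the old location).

```python
import importlib


def import_first_available(candidates, attr):
    """Return (module_name, attribute) from the first candidate module providing attr."""
    errors = []
    for name in candidates:
        try:
            module = importlib.import_module(name)
            return name, getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append(f"{name}: {exc}")
    raise ImportError(f"could not find {attr!r} in any of: " + "; ".join(errors))


# Assumed candidate locations, newest first; which one exists depends on
# the installed cuDF version.
CUDF_ASSERT_EQ_CANDIDATES = ["cudf.testing._utils", "cudf.tests.utils"]
```

A test module could then call `_, assert_eq = import_first_available(CUDF_ASSERT_EQ_CANDIDATES, "assert_eq")` once at import time. This only papers over the relocation; it does not make the helper part of any public API.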

This PR supersedes #427, #428, and #429, all of which can now be closed.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Christopher Harris (https://github.com/cwharris)

URL: #430
rapids-bot bot pushed a commit to rapidsai/cuspatial that referenced this pull request Jul 21, 2021
As of rapidsai/cudf#7495, the `cudf.tests.utils` module (and in particular the `assert_eq` function) is no longer part of the public API. This PR switches tests to use the public testing functions in the `cudf.testing` subpackage.

This PR is currently blocked by #430 and rapidsai/cudf#8646.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Christopher Harris (https://github.com/cwharris)
  - Paul Taylor (https://github.com/trxcllnt)

URL: #431
Labels: 5 - Ready to Merge, breaking, CMake, feature request, libcudf, Python
Development

Successfully merging this pull request may close these issues.

[FEA] Upgrade Arrow support to 4.0.0