[RELEASE] cudf v0.16 #6547

Merged
merged 1,011 commits into from
Oct 21, 2020

Conversation

GPUtester
Collaborator

❄️ Code freeze for branch-0.16 and v0.16 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-0.16 until release (merging of this PR).

What is the purpose of this PR?

  • Update documentation
  • Allow testing for the new release
  • Enable a means to merge branch-0.16 into main for the release

rjzamora and others added 30 commits September 9, 2020 18:34
[REVIEW] Fix normalize_characters offset logic on sliced strings column
* Fix cmake build for arrow

* Updated Changelog

Co-authored-by: Keith Kraus <kkraus@nvidia.com>
* Hacky Python-only implementation of `scatter_to_table`

* Add pivot

* Make values a kwarg

* Add encode bindings

* Add a table encode

* Add droplevel

* Better handling of level argument

* Add unstack

* Add _level_index_from_level and _level_name_from_level helpers

* Fixes to _poplevel and add tests

* Can't have negative level because we sort later

* More droplevel tests

* Add docs for droplevel

* Add and pass tests for unstack()

* Fix null order in table encode and add docs

* Add table encode test

* Clarify null behaviour

* Document scatter_to_table and add default nrows/ncols

* Fix MultiIndex.__len__

* Logic fixes in pivot

* Add reason for xfail

* Changelog

* Remove split redefinition

* Add a detail::bitmask_and API that accepts a table view

* Replace multi-column sort with just a gather

* Remove overload of encode() that accepts a cudf::column

* Remove unused #include

* Review feedback

* Remove bindings for encode(column)

* Ensure columns are an `Index`

* Update pivot docstring

* Test for duplicate index/column pairs in pivot

* Docstrings for pivot/unstack

* Update python/cudf/cudf/core/multiindex.py

Co-authored-by: Keith Kraus <kkraus@nvidia.com>

* Remove nullable_pd_dtype arg

* Replace Series binop with column binops

* Review feedback

* Handle circular imports in core.reshape and copy pivot/unstack docstrings

* Use two loops over ilevels

* Handle empty levels in poplevels

* Add fill_value for Pandas compat

* Explain logic in encode

* Review feedback

* Fix typo

Co-authored-by: Ashwin Srinath <shwina@users.noreply.github.com>
Co-authored-by: Keith Kraus <kkraus@nvidia.com>
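
The commit messages above describe pivot as an encode step followed by a scatter into a table, with a check for duplicate index/column pairs. The following is a minimal pure-Python sketch of that idea, not the cudf implementation: the function names `encode` and `pivot`, their signatures, and the use of nested lists are all assumptions made for illustration.

```python
def encode(labels):
    """Return (sorted unique keys, integer code for each input label)."""
    keys = sorted(set(labels))
    lookup = {key: code for code, key in enumerate(keys)}
    return keys, [lookup[label] for label in labels]


def pivot(index, columns, values, fill_value=None):
    """Scatter `values` into a dense table addressed by encoded labels.

    Hypothetical helper: rows come from `index`, columns from `columns`.
    Cells that receive no value keep `fill_value`.
    """
    row_keys, row_codes = encode(index)
    col_keys, col_codes = encode(columns)
    table = [[fill_value] * len(col_keys) for _ in row_keys]
    seen = set()
    for r, c, v in zip(row_codes, col_codes, values):
        # A repeated (row, column) pair has no unique destination cell.
        if (r, c) in seen:
            raise ValueError("duplicate index/column pair")
        seen.add((r, c))
        table[r][c] = v
    return row_keys, col_keys, table


row_keys, col_keys, table = pivot(["a", "a", "b"], ["x", "y", "x"], [1, 2, 3])
```

Here the pair `("b", "y")` never appears in the input, so that cell keeps the fill value.
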
[REVIEW] Fix issue where `np.nan` is being returned instead of `NAT` for datetime/duration types
[REVIEW] Add builder API for cuIO `write_parquet_args` and `read_parquet_args`
Avoid guid use from deprecated pyarrow.compat
kaatish and others added 13 commits October 7, 2020 17:25
* Remove macros from CompactProtocolWriter

* CHANGELOG fix

* Formatting

* PR review comments

* Moved CompactProtocolWriter to its own file

* Use reference instead of pointer for output vector

* Add back CompactProtocolFieldWriter

* Style fix

* PR review comments
MD5 list column support, not adding support for nested lists at this point. Also augments existing functionality to optionally retain nulls in the resulting column if elements in that row are null.
* Bug fixes for ColumnBuilder

Co-authored-by: Kuhu Shukla <kuhus@nvidia.com>
* Remove repo url from build-info, as it contains offensive word

* Update CHANGELOG.md

* Update CHANGELOG.md

Co-authored-by: Jason Lowe <jlowe@nvidia.com>
…6498)

* Add helper method to ColumnBuilder with some nits

Co-authored-by: Kuhu Shukla <kuhus@nvidia.com>
Closes #6343. Fixes COUNT_ALL, COUNT_VALID for window functions. In rolling_window() operations, COUNT_VALID/COUNT_ALL should only return null rows if the min_periods requirement is not satisfied. For all other cases, the count produced must be valid, even if the input row is null.
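
The rule described above can be sketched in plain Python. This is an illustration of the stated semantics only, not libcudf code: the function name `rolling_count`, its parameters, and the choice to compare `min_periods` against the number of rows in the window are assumptions.

```python
def rolling_count(values, preceding, min_periods, count_nulls):
    """Rolling count over a trailing window of up to `preceding` rows.

    `None` stands in for a null. The result is null only when the window
    holds fewer than `min_periods` rows (an assumed interpretation);
    otherwise a count is always produced, even if the current row is null.
    """
    out = []
    for i in range(len(values)):
        start = max(0, i - preceding + 1)
        window = values[start:i + 1]
        if len(window) < min_periods:
            out.append(None)  # min_periods not satisfied
        elif count_nulls:
            out.append(len(window))  # COUNT_ALL style
        else:
            out.append(sum(v is not None for v in window))  # COUNT_VALID style
    return out


data = [1, None, 3, None]
count_valid = rolling_count(data, preceding=2, min_periods=2, count_nulls=False)
count_all = rolling_count(data, preceding=2, min_periods=2, count_nulls=True)
```

Only the first row is null in both results, because its window contains a single row. The rows whose own input is null still receive a count.
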
…df::io (#6386)

* removed redundant reinterpret_cast

* removed redundant reinterpret_cast

* removed redundant reinterpret_casts

* removed redundant reinterpret_casts

* removed redundant reinterpret_casts

* removed redundant reinterpret_casts

* removed redundant reinterpret_casts

* removed c-style pointer casts

* removed redundant reinterpret_cast

* removed c-style pointer casts

* removed redundant reinterpret_cast

* removed redundant reinterpret_cast

* removed c-style pointer casts

* changed source type to void* and removed rei-casts

* removed redundant reinterpret_casts

* removed redundant reinterpret_cast

* changed unknown pointer type to void*

* removed redundant reinterpret_casts

* removed c-style pointer casts

* removed redundant reinterpret_casts

* removed redundant reinterpret_casts

* removed c-style pointer casts

* removed c-style pointer casts

* removed c-style pointer casts

* removed redundant reinterpret_casts

* removed c-style pointer casts

* removed c-style pointer casts

* removed redundant reinterpret_casts

* added changelog

* changed decomp state member pointer to std::vector

* made variable's type automatically deduced

* clang-format fix

Co-authored-by: Mark Harris <mharris@nvidia.com>
Co-authored-by: Vukasin Milovanovic <vmilovanovic@nvidia.com>
Co-authored-by: Ryan Lee <ryanlee@nvidia.com>
…pe`) [skip ci] (#5861)

* Add fixed_point iterator test

* Add test demonstrating binary op

* Replace cout with asserts

* Test with fraction fixed_point values

* Attempt to implement proxy reference, but thrust doesn't like it

* Add hacked up transform_mutable_iterator

* Cleanup dead code

* Actually make mutable iterator work as input iterator

* Simplify diff from transform_output_iterator

* Add device implementation of fixed point iterator

* Add test with iterator random access in kernel

* Use thrust/transform_input_output_iterator

* Replace rescale with rescaled

* Add constexpr to numeric::scale_type ctor

* Cleanup forward declarations

* Add scale to data_type

* Fixed typo in clamp file

* Add make_numeric_column

* factories_test.cpp remove FixedWidthTypesWithoutFixedPoint

* Update CHANGELOG

* Fix mockup of ColumnLike, get unit tests working, clang-format

* Update CHANGELOG

* Fix formatting

* Add scale_type ctor to data_type

* Formatting of make_fixed_width_column for readability

* Fix typo

* Add `cudf::` to data_type member for consistency

* Update comment

* Add `scale` method to data_type

* Clean up column_view_printer for fixed_point

* Add decimal32 specialization for .element()

* Add initial `fixed_point_column_wrapper`

* Add temporary change to column_view_printer

* Add unit test for simple `fixed_point_column_wrapper`

* Add `thrust::optional` to `data_type::_scale`

* Make `fixed_point_column_wrapper` unit test for both reps

* Add decimal64 specialization for .element()

* Update column_view_printer for decimal64

* Small cleanup

* Fix for failing unit tests

* CUDF_EXPECTS data_type::_scale to be set

* Add unit test for data_type::_scale not set

* Remove TODO comment

* data_type ctor docs and CUDF_EXPECTS

* Add unit test for wrong type_id in data_type ctor

* Remove `thrust::optional` from scale in data_type

* Update CHANGELOG

* IIFE clean up

* Return fixed_point without temporary scaled_integer

* Change fixed_point unit test to 0 scale

* Add cudf::distance to avoid implicit casting

* Temporary fix for FILLING_TEST

* Add fixed_point specialization for make_elements

* Fixed FIXED_POINT_TEST

* Cleanup unit test

* Add specialization of cudf::to_host for fixed_point

* Add another specialization of make_elements

* Use .empty() instead of == 0

* Fix ROLLING_TEST

* CI fix

* Fix CI: get_current_default_resource

* Delete file

* Merge mistake: host_vector should be of RepType

* Use counting iterator + transform iterator

* Use lambda and transform iterator

* Add const

* Cleanup

* Fix (can't use generic lambda)

* Use std::for_each & std::any_of

* Fixed `RESHAPE_TEST` & `TRANSPOSE_TEST`

* Fix PARTITIONING_TEST and some of COPYING_TEST

* Fix ShiftTests of COPYING_TEST

* Cleanup: use ternary operator

* Fix MERGE_TEST

* Temporarily disable BINOP

* Fix for ScatterScalar tests of COPYING_TEST

* Fix DictionaryConcat tests of COPYING_TEST

* Disable copy_range for fixed_point

* Add specialization for copy_range fixed_point

* Temporarily disable REPLACE_TEST for fixed_point

* Use snake_case cleanup

* Add back REPLACE_TEST for fixed_point

* Use cudf::test::make_counting_transform_iterator

* Fix for REPLACE_TEST

* Add get_column_stored_type

* Add CUDF_EXPECTS on representation type

* Fix/cleanup cudf::to_host for fixed_point

* Add simple fixed_point_column_wrapper test

* fixed_point column_view_printer

* Small cleanup

* Remove [&]

* Remove 10 * (from trying to break tests)

* Rename _scale and adjust comment for clarity

* Add column_type_id_matches_column_stored_type

* Use thrust::copy_if with stencil

* Add missing header

* Docs and removing TODOs

* Disable binop ptx with fixed_point

* Add initial support for fixed_point binary operations

* Enable some of the binary op tests

* Refactor / cleanup

* Remove rmm_log.txt

* Remove unnecessary header

* Use .is_empty() instead of .size() == 0

* Make changes to fixed_point_scalar

* Changes to search.cu

* Changes to scalar_construction_helper

* Changes to null.cu

* Changes to src/dictionary/search.cu

* Changes to fixed_point_scalar

* Remove temporary changes

* Addressing PR comments

* Update cpp/include/cudf/utilities/type_dispatcher.hpp

Co-authored-by: Mark Harris <mharris@nvidia.com>

* Update cpp/include/cudf/column/column_factories.hpp

Co-authored-by: Mark Harris <mharris@nvidia.com>

* Update cpp/include/cudf/column/column_factories.hpp

Co-authored-by: Mark Harris <mharris@nvidia.com>

* Update cpp/include/cudf/utilities/type_dispatcher.hpp

Co-authored-by: Mark Harris <mharris@nvidia.com>

* Fix formatting

* Changes from PR 6063 and 6064

* Addressing PR comments

* CI Fix

* Update cpp/src/merge/merge.cu

Co-authored-by: Mark Harris <mharris@nvidia.com>

* Update cpp/src/merge/merge.cu

Co-authored-by: Mark Harris <mharris@nvidia.com>

* Update cpp/src/merge/merge.cu

Co-authored-by: Mark Harris <mharris@nvidia.com>

* Update cpp/src/merge/merge.cu

Co-authored-by: Mark Harris <mharris@nvidia.com>

* Update cpp/src/merge/merge.cu

Co-authored-by: Mark Harris <mharris@nvidia.com>

* Update cpp/src/merge/merge.cu

Co-authored-by: Mark Harris <mharris@nvidia.com>

* Auto commit fix

* Replace for_each with all_of

* Cleanup in types.hpp

* Changes for fixed_point binary operations

* fixed_point binop different scales

* Enable last binop fixed_point test

* Clean up tests (remove TODO) and support both decimal32/64

* Final binop cases for fixed_point

* Addressing PR comments

* fixed_point_scalar_device_view inheritance from scalar_device_*

* Testing CI

* Final TODOs (docs)

* Remove redundant is_fixed_point

* Fix for AST (thank you David)

* Remove headers + clean up

* Remove headers

* fixed_point scalar factory test

* scalar_test for fixed_point

* Rename fn to type_id_matched_device_storage_type

* Remove header

* Addressing PR comments, fixed_point_column_wrapper tests

* Revert type_id_matches... changes

* Simplify device_storage_type_t docs

* Add failing binop test

* Disable `fixed_point` binaryops

* Addressing PR comments

* Spelling fixes

Co-authored-by: Trevor Smith <trevorsm7@gmail.com>
Co-authored-by: Mark Harris <mharris@nvidia.com>
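
Several of the commits above concern `fixed_point` values that carry a scale, including binary operations between operands with different scales. As a rough sketch of the underlying arithmetic, and not of the libcudf types themselves, a decimal fixed-point number can be modeled as an integer representation plus a base-10 exponent. The class `FixedPoint`, its fields, and the rule of adding at the smaller of the two scales are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class FixedPoint:
    rep: int    # stored integer representation
    scale: int  # base-10 exponent: value == rep * 10**scale

    def value(self):
        """Exact value as a Fraction, for checking results."""
        return Fraction(self.rep) * Fraction(10) ** self.scale

    def rescaled(self, scale):
        """Return the same value at a smaller (finer) or equal scale."""
        if scale > self.scale:
            raise ValueError("rescaling to a coarser scale may lose digits")
        return FixedPoint(self.rep * 10 ** (self.scale - scale), scale)

    def __add__(self, other):
        # Bring both operands to the finer scale, then add the integers.
        scale = min(self.scale, other.scale)
        return FixedPoint(self.rescaled(scale).rep + other.rescaled(scale).rep, scale)

    def __mul__(self, other):
        # Multiplying the integers adds the exponents.
        return FixedPoint(self.rep * other.rep, self.scale + other.scale)


total = FixedPoint(15, -1) + FixedPoint(225, -2)   # 1.5 + 2.25
product = FixedPoint(15, -1) * FixedPoint(2, 0)    # 1.5 * 2
```

Keeping the arithmetic on integers avoids the rounding that binary floating point would introduce for values such as 1.5 and 2.25 at larger magnitudes.
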
@codecov

codecov bot commented Oct 16, 2020

Codecov Report

Merging #6547 into main will decrease coverage by 1.56%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #6547      +/-   ##
==========================================
- Coverage   84.54%   82.97%   -1.57%     
==========================================
  Files          81       95      +14     
  Lines       13845    14900    +1055     
==========================================
+ Hits        11705    12364     +659     
- Misses       2140     2536     +396     
| Impacted Files | Coverage Δ | |
|---|---|---|
| python/cudf/cudf/comm/gpuarrow.py | 79.76% <0.00%> (-9.53%) | ⬇️ |
| python/cudf/cudf/utils/gpu_utils.py | 53.65% <0.00%> (-4.88%) | ⬇️ |
| python/cudf/cudf/core/abc.py | 87.23% <0.00%> (-4.26%) | ⬇️ |
| python/cudf/cudf/core/buffer.py | 79.04% <0.00%> (-3.65%) | ⬇️ |
| python/cudf/cudf/utils/cudautils.py | 48.55% <0.00%> (-3.60%) | ⬇️ |
| python/cudf/cudf/utils/utils.py | 82.53% <0.00%> (-2.37%) | ⬇️ |
| python/dask_cudf/dask_cudf/io/orc.py | 90.90% <0.00%> (-1.28%) | ⬇️ |
| python/dask_cudf/dask_cudf/io/parquet.py | 91.35% <0.00%> (-0.85%) | ⬇️ |
| python/cudf/cudf/core/column/lists.py | 97.43% <0.00%> (-0.65%) | ⬇️ |
| python/cudf/cudf/core/column/categorical.py | 93.11% <0.00%> (-0.51%) | ⬇️ |
| ... and 49 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 71cb8c0...2cda39b.

Overview

First step before "Project Flash" to improve gpuCI scripts and allow for use of different node types for CPU builds. Post "Project Flash" there will be another round of improvements and updates with the goal to make these scripts easier to maintain.

Changes

- Consolidates some of the CPU gpuCI scripts, reducing number of scripts to maintain
- Standardizes naming of the upload script to upload.sh
- Changes logging to use gpuci_logger where practical
- Updates comments and documentation where necessary
- Enables gpuCI to override environment variables based on node types
- Reorders upload script to save ~5 mins or more in testing time
raydouglass merged commit 7ef8174 into main on Oct 21, 2020