[RELEASE] cudf v0.16 #6547

GPUtester · 2020-10-16T14:33:37Z

❄️ Code freeze for `branch-0.16` and v0.16 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-0.16 until release (merging of this PR).

What is the purpose of this PR?

- Update documentation
- Allow testing for the new release
- Enable a means to merge branch-0.16 into main for the release

* Hacky Python-only implementation of `scatter_to_table` * Add pivot * Make values a kwarg * Add encode bindings * Add a table encode * Add droplevel * Better handling of level argument * Add unstack * Add _level_index_from_level and _level_name_from_level helpers * Fixes to _poplevel and add tests * Can't have negative level because we sort later * More droplevel tests * Add docs for droplevel * Add and pass tests for unstack() * Fix null order in table encode and add docs * Add table encode test * Clarify null behaviour * Document scatter_to_table and add default nrows/ncols * Fix MultiIndex.__len__ * Logic fixes in pivot * Add reason for xfail * Changelog * Remove split redefinition * Add a detail::bitmask_and API that accepts a table view * Replace multi-column sort with just a gather * Remove overload of encode() that accepts a cudf::column * Remove unused #include * Review feedback * Remove bindings for encode(column) * Ensure columns are an `Index` * Update pivot docstring * Test for duplicate index/column pairs in pivot * Docstrings for pivot/unstack * Update python/cudf/cudf/core/multiindex.py Co-authored-by: Keith Kraus <kkraus@nvidia.com> * Remove nullable_pd_dtype arg * Replace Series binop with column binops * Review feedback * Handle circular imports in core.reshape and copy pivot/unstack docstrings * Use two loops over ilevels * Handle empty levels in poplevels * Add fill_value for Pandas compat * Explain logic in encode * Review feedback * Fix typo Co-authored-by: Ashwin Srinath <shwina@users.noreply.github.com> Co-authored-by: Keith Kraus <kkraus@nvidia.com>

* Remove macros from CompactProtocolWriter * CHANGELOG fix * Formatting * PR review comments * Moved CompactProtocolWriter to its own file * Use reference instead of pointer for output vector * Add back CompactProtocolFieldWriter * Style fix * PR review comments

Closes #6343. Fixes COUNT_ALL, COUNT_VALID for window functions. In rolling_window() operations, COUNT_VALID/COUNT_ALL should only return null rows if the min_periods requirement is not satisfied. For all other cases, the count produced must be valid, even if the input row is null.
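The rule described above can be illustrated with a small pure-Python sketch. It is not the libcudf implementation: the function name `rolling_count`, the trailing-window definition, and the assumption that `min_periods` is compared against the number of rows in the window (rather than the number of valid rows) are all illustrative choices, not taken from this PR.

```python
from typing import List, Optional, Sequence


def rolling_count(
    values: Sequence[Optional[int]],
    preceding: int,
    min_periods: int,
    count_nulls: bool,
) -> List[Optional[int]]:
    """Hypothetical model of COUNT_ALL / COUNT_VALID over a trailing window.

    None stands in for a null row. The window for row i covers up to
    `preceding` rows ending at i (an assumption made for this sketch).
    """
    out: List[Optional[int]] = []
    for i in range(len(values)):
        start = max(0, i - preceding + 1)
        window = values[start : i + 1]
        if len(window) < min_periods:
            # Only an unsatisfied min_periods produces a null result.
            out.append(None)
        elif count_nulls:
            # COUNT_ALL: every row in the window counts, null or not.
            out.append(len(window))
        else:
            # COUNT_VALID: only non-null rows count.
            out.append(sum(v is not None for v in window))
    return out


data = [1, None, 3, None]
count_valid = rolling_count(data, preceding=2, min_periods=2, count_nulls=False)
count_all = rolling_count(data, preceding=2, min_periods=2, count_nulls=True)
```

In this sketch the last input row is null, yet both results for that row are non-null counts; only the first row, whose window is shorter than `min_periods`, yields a null.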

…df::io (#6386) * removed redundant reinterpret_cast * removed redundant reinterpret_cast * removed redundant reinterpret_casts * removed redundant reinterpret_casts * removed redundant reinterpret_casts * removed redundant reinterpret_casts * removed redundant reinterpret_casts * removed c-style pointer casts * removed redundant reinterpret_cast * removed c-style pointer casts * removed redundant reinterpret_cast * removed redundant reinterpret_cast * removed c-style pointer casts * changed source type to void* and removed rei-casts * removed redundant reinterpret_casts * removed redundant reinterpret_cast * changed unknown pointer type to void* * removed redundant reinterpret_casts * removed c-style pointer casts * removed redundant reinterpret_casts * removed redundant reinterpret_casts * removed c-style pointer casts * removed c-style pointer casts * removed c-style pointer casts * removed redundant reinterpret_casts * removed c-style pointer casts * removed c-style pointer casts * removed redundant reinterpret_casts * added changelog * changed decomp state member pointer to std::vector * made variable's type automatically deduced * clang-format fix Co-authored-by: Mark Harris <mharris@nvidia.com> Co-authored-by: Vukasin Milovanovic <vmilovanovic@nvidia.com>

…pe`) [skip ci] (#5861) * Add fixed_point iterator test * Add test demonstrating binary op * Replace cout with asserts * Test with fraction fixed_point values * Attempt to implement proxy reference, but thrust doesn't like it * Add hacked up transform_mutable_iterator * Cleanup dead code * Actually make mutable iterator work as input iterator * Simplify diff from transform_output_iterator * Add device implementation of fixed point iterator * Add test with iterator random access in kernel * Use thrust/transform_input_output_iterator * Replace rescale with rescaled * Add constexpr to numeric::scale_type ctor * Cleanup forward declarations * Add scale to data_type * Fixed typo in clamp file * Add make_numeric_column * factories_test.cpp remove FixedWidthTypesWithoutFixedPoint * Update CHANGELOG * Fix mockup of ColumnLike, get unit tests working, clang-format * Update CHANGELOG * Fix formatting * Add scale_type ctor to data_type * Formatting of make_fixed_width_column for readability * Fix typo * Add `cudf::` to data_type member for consistency * Update comment * Add `scale` method to data_type * Clean up column_view_printer for fixed_point * Add decimal32 specialization for .element() * Add initial `fixed_point_column_wrapper` * Add temporary change to column_view_printer * Add unit test for simple `fixed_point_column_wrapper` * Add `thrust::optional` to `data_type::_scale` * Make `fixed_point_column_wrapper` unit test for both reps * Add decimal64 specialization for .element() * Update column_view_printer for decimal64 * Small cleanup * Fix for failing unit tests * CUDF_EXPECTS data_type::_scale to be set * Add unit test for data_type::_scale not set * Remove TODO comment * data_type ctor docs and CUDF_EXPECTS * Add unit test for wrong type_id in data_type ctor * Remove `thrust::optional` from scale in data_type * Update CHANGELOG * IIFE clean up * Return fixed_point without temporary scaled_integer * Change fixed_point unit test to 0 scale * Add cudf::distance to 
avoid implicit casting * Temporary fix for FILLING_TEST * Add fixed_point specialization for make_elements * Fixed FIXED_POINT_TEST * Cleanup unit test * Add specialization of cudf::to_host for fixed_point * Add another specialization of make_elements * Use .empty() instead of == 0 * Fix ROLLING_TEST * CI fix * Fix CI: get_current_default_resource * Delete file * Merge mistake: host_vector should be of RepType * Use counting iterator + transform iterator * Use lambda and transform iterator * Add const * Cleanup * Fix (can't use generic lambda) * Use std::for_each & std::any_of * Fixed `RESHAPE_TEST` & `TRANSPOSE_TEST` * Fix PARTITIONING_TEST and some of COPYING_TEST * Fix ShiftTests of COPYING_TEST * Cleanup: use ternary opreator * Fix MERGE_TEST * Temporarily disable BINOP * Fix for ScatterScalar tests of COPYING_TEST * Fix DictionaryConcat tests of COPYING_TEST * Disable copy_range for fixed_point * Add specialization for copy_range fixed_point * Temporarily disable REPLACE_TEST for fixed_point * Use snake_case cleanup * Add back REPLACE_TEST for fixed_point * Use cudf::test::make_counting_transform_iterator * Fix for REPLACE_TEST * Add get_column_stored_type * Add CUDF_EXPECTS on representation type * Fix/cleanup cudf::to_host for fixed_point * Add simple fixed_point_column_wrapper test * fixed_point column_view_printer * Small cleanup * Remove [&] * Remove 10 * (from trying to break tests) * Rename _scale and adjust comment for clarity * Add column_type_id_matches_column_stored_type * Use thrust::copy_if with stencil * Add missing header * Docs and removing TODOs * Disable binop ptx with fixed_point * Add initial support for fixed_point binary operations * Enable some of the binary op tests * Refactor / cleanup * Remove rmm_log.txt * Remove unnecessary header * Use .is_empty() instead of .size() == 0 * Make changes to fixed_point_scalar * Changes to search.cu * Changes to scalar_construction_helper * Changes to null.cu * Changes to src/dictionary/search.cu * 
Changes to fixed_point_scalar * Remove temporary changes * Addressing PR comments * Update cpp/include/cudf/utilities/type_dispatcher.hpp Co-authored-by: Mark Harris <mharris@nvidia.com> * Update cpp/include/cudf/column/column_factories.hpp Co-authored-by: Mark Harris <mharris@nvidia.com> * Update cpp/include/cudf/column/column_factories.hpp Co-authored-by: Mark Harris <mharris@nvidia.com> * Update cpp/include/cudf/utilities/type_dispatcher.hpp Co-authored-by: Mark Harris <mharris@nvidia.com> * Fix formatting * Changes from PR 6063 and 6064 * Addressing PR comments * CI Fix * Update cpp/src/merge/merge.cu Co-authored-by: Mark Harris <mharris@nvidia.com> * Update cpp/src/merge/merge.cu Co-authored-by: Mark Harris <mharris@nvidia.com> * Update cpp/src/merge/merge.cu Co-authored-by: Mark Harris <mharris@nvidia.com> * Update cpp/src/merge/merge.cu Co-authored-by: Mark Harris <mharris@nvidia.com> * Update cpp/src/merge/merge.cu Co-authored-by: Mark Harris <mharris@nvidia.com> * Update cpp/src/merge/merge.cu Co-authored-by: Mark Harris <mharris@nvidia.com> * Auto commit fix * Replace for_each with all_of * Cleanup in types.hpp * Changes for fixed_point binary operations * fixed_point binop different scales * Enable last binop fixed_point test * Clean up tests (remove TODO) and support both decimal32/64 * Final binop cases for fixed_point * Addressing PR comments * fixed_point_scalar_device_view inheritance from scalar_device_* * Testing CI * Final TODOs (docs) * Remove redundant is_fixed_point * Fix for AST (thank you David) * Remove headers + clean up * Remove headers * fixed_point scalar factory test * scalar_test for fixed_point * Rename fn to type_id_matched_device_storage_type * Remove header * Addressing PR comments, fixed_point_column_wrapper tests * Revert type_id_matches... 
changes * Simplify device_storage_type_t docs * Add failing binop test * Disable `fixed_point` binaryops * Addressing PR comments * Spelling fixes Co-authored-by: Trevor Smith <trevorsm7@gmail.com> Co-authored-by: Mark Harris <mharris@nvidia.com>


codecov · 2020-10-16T14:36:29Z

Codecov Report

Merging #6547 into main will decrease coverage by 1.56%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #6547      +/-   ##
==========================================
- Coverage   84.54%   82.97%   -1.57%     
==========================================
  Files          81       95      +14     
  Lines       13845    14900    +1055     
==========================================
+ Hits        11705    12364     +659     
- Misses       2140     2536     +396
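The percentages in the diff above can be reproduced from the Hits and Lines rows. The sketch below assumes that percentages are truncated (rounded down) to two decimals; the report does not state this, but it is consistent with every figure shown, including the 1.56% in the headline versus the 1.57% in the table. The helper name `truncate_pct` is made up for this example.

```python
from fractions import Fraction


def truncate_pct(ratio: Fraction) -> Fraction:
    """Convert a ratio to a percentage truncated to two decimals.

    Truncation (not rounding to nearest) is an assumption of this sketch.
    """
    hundredths = int(ratio * 10000)  # int() truncates toward zero
    return Fraction(hundredths, 100)


# Hits / Lines for main and for the merged result, taken from the report.
base = Fraction(11705, 13845)
head = Fraction(12364, 14900)

base_pct = truncate_pct(base)               # coverage on main
head_pct = truncate_pct(head)               # coverage after merging #6547
table_delta = base_pct - head_pct           # difference of truncated values
headline_delta = truncate_pct(base - head)  # truncated exact difference

print(float(base_pct), float(head_pct), float(table_delta), float(headline_delta))
```

Under that assumption the two deltas differ only because one subtracts already-truncated percentages and the other truncates the exact difference.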

| Impacted Files | Coverage Δ | |
|---|---|---|
| python/cudf/cudf/comm/gpuarrow.py | `79.76% <0.00%> (-9.53%)` | ⬇️ |
| python/cudf/cudf/utils/gpu_utils.py | `53.65% <0.00%> (-4.88%)` | ⬇️ |
| python/cudf/cudf/core/abc.py | `87.23% <0.00%> (-4.26%)` | ⬇️ |
| python/cudf/cudf/core/buffer.py | `79.04% <0.00%> (-3.65%)` | ⬇️ |
| python/cudf/cudf/utils/cudautils.py | `48.55% <0.00%> (-3.60%)` | ⬇️ |
| python/cudf/cudf/utils/utils.py | `82.53% <0.00%> (-2.37%)` | ⬇️ |
| python/dask_cudf/dask_cudf/io/orc.py | `90.90% <0.00%> (-1.28%)` | ⬇️ |
| python/dask_cudf/dask_cudf/io/parquet.py | `91.35% <0.00%> (-0.85%)` | ⬇️ |
| python/cudf/cudf/core/column/lists.py | `97.43% <0.00%> (-0.65%)` | ⬇️ |
| python/cudf/cudf/core/column/categorical.py | `93.11% <0.00%> (-0.51%)` | ⬇️ |
| ... and 49 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 71cb8c0...2cda39b.

Overview

First step before "Project Flash" to improve gpuCI scripts and allow for use of different node types for CPU builds. Post "Project Flash" there will be another round of improvements and updates with the goal to make these scripts easier to maintain.

Changes

- Consolidates some of the CPU gpuCI scripts, reducing number of scripts to maintain
- Standardizes naming of the upload script to upload.sh
- Changes logging to use gpuci_logger where practical
- Updates comments and documentation where necessary
- Enables gpuCI to override environment variables based on node types
- Reorders upload script to save ~5 mins or more in testing time

CHANGELOG.md

rjzamora and others added 30 commits September 9, 2020 18:34

Merge branch 'branch-0.16' into remove-pyarrow.compat

733c9f8

add remaining docs; small clean up around vector<bool>

35574a3

Merge branch 'fea-fancy-data-generator' of https://github.com/vuule/cudf into fea-fancy-data-generator

4e79643

Merge branch 'branch-0.16' into fea-fancy-data-generator

31315d7

Fix test

8f88bb8

Merge pull request #6173 from davidwendt/bug-subword-sliced-column

cbd5fc7

[REVIEW] Fix normalize_characters offset logic on sliced strings column

[REVIEW] Fix cmake build for arrow (#6182)

286e030

* Fix cmake build for arrow * Updated Changelog Co-authored-by: Keith Kraus <kkraus@nvidia.com>

Merge branch 'branch-0.16' into 5962_parquet_write_read_args

5bb9f31

Update move.pxd

a59cd0d

style changes

3115e23

Merge branch 'branch-0.16' into bug-make-elements-reserve

2b12ad4

Add CUDA_HOST_DEVICE_CALLABLE to function object call operators

3ffc667

Reverse changes to gather_test

00fb162

move max_column_size definition out of types.hpp

0af240c

fix merge conflicts

da823d2

Merge pull request #6137 from galipremsagar/6132

de78e58

[REVIEW] Fix issue where `np.nan` is being returned instead of `NAT` for datetime/duration types

Merge pull request #6051 from rgsl888prabhu/5962_parquet_write_read_args

ac39e37

[REVIEW] Add builder API for cuIO `write_parquet_args` and `read_parquet_args`

review changes

d09c982

Merge branch '5962_refactor_avro_and_json' of https://github.com/rgsl888prabhu/cudf into 5962_refactor_avro_and_json

68b595e

Merge branch 'branch-0.16' of https://github.com/rapidsai/cudf into 5962_refactor_avro_and_json

acd6092

fixing changes

d2d185e

update changelog

db8e7d4

remove unneeded mr parm from contains API

0c5fc60

Merge branch 'branch-0.16' of https://github.com/rapidsai/cudf into fea-fancy-data-generator

e254fd8

Merge pull request #6189 from rjzamora/remove-pyarrow.compat

671e8dd

Avoid guid use from deprecated pyarrow.compat

Merge branch 'branch-0.16' of https://github.com/rapidsai/cudf into fea-fancy-data-generator

9bc064c

Merge branch 'fea-fancy-data-generator' of https://github.com/vuule/cudf into fea-fancy-data-generator

0a4042b

Update JNI to use parquet options builder

f6ddc4c

changelog

060d88b

kaatish and others added 13 commits October 7, 2020 17:25

Fix ORC reader issue with decimal type (#6466)

64228dc

Make all the CI .sh scripts have a consistent set of permissions (#6309)

a656ac1

MD5 hashing list and null retention support (#6379)

7141d26

MD5 list column support, not adding support for nested lists at this point. Also augments existing functionality to optionally retain nulls in the resulting column if elements in that row are null.

Bug fixes for ColumnBuilder (#6462)

793baa0

* Bug fixes for ColumnBuilder Co-authored-by: Kuhu Shukla <kuhus@nvidia.com>

Remove repo URL from Java build-info [skip ci] (#6491)

421fddb

* Remove repo url from build-info, as it contains offensive word * Update CHANGELOG.md * Update CHANGELOG.md Co-authored-by: Jason Lowe <jlowe@nvidia.com>

[REVIEW] Add helper method to ColumnBuilder with some nits [skip ci] (#6498)

7c055c1

* Add helper method to ColumnBuilder with some nits Co-authored-by: Kuhu Shukla <kuhus@nvidia.com>

add support for list_topic for a specific topic (#6352)

4117d37

Java byte casting functionality (#6367)

06e4189

Co-authored-by: Ryan Lee <ryanlee@nvidia.com>

Fix issue reading parquet files containing lists with multiple row groups. (#6497)

c6dfa6e

Fixes : #6477 and adds tests.

GPUtester requested review from a team as code owners October 16, 2020 14:33

GPUtester requested review from vuule, davidwendt, kkraus14 and galipremsagar October 16, 2020 14:33

galipremsagar approved these changes Oct 16, 2020

harrism approved these changes Oct 18, 2020

davidwendt reviewed Oct 19, 2020

CHANGELOG.md (outdated, resolved)

Update CHANGELOG.md

2cda39b

raydouglass merged commit 7ef8174 into main Oct 21, 2020
