[RELEASE] cudf v0.17 #6935
Merged
344 commits
0fa8e29
Use correct stream in hash_join.
jrhemstad ecc8193
changelog.
jrhemstad 3644dbb
Merge remote-tracking branch 'upstream/branch-0.17' into mwilson/int96
hyperbolic2346 e620a73
Fix memory usage calculation (#6596)
galipremsagar e94ed01
Add function to create hashed vocabulary file from raw vocabulary (#6…
VibhuJawa b0b389e
Merge remote-tracking branch 'upstream/branch-0.17' into mwilson/int96
hyperbolic2346 c40e27a
Merge pull request #6603 from jrhemstad/fix-hash-join-stream
jrhemstad 1ddd81d
Merge remote-tracking branch 'upstream/branch-0.17' into mwilson/int96
hyperbolic2346 bdba041
Update JNI to new RMM cuda_stream_view API (#6612)
jlowe 13b06a0
Merge remote-tracking branch 'upstream/branch-0.17' into mwilson/int96
hyperbolic2346 d3f0fb6
int96 changes
hyperbolic2346 8b7730e
Fix JNI native dependency load order (#6617)
jlowe 5f35809
Improve subword tokenizer docs (#6608)
VibhuJawa 6b6f0cc
Add dictionary support to cudf::unary_operation (#6540)
davidwendt 64bab8f
Add missing device_scalar stream parameters. (#6582)
harrism a81d07b
Add in java column to row conversion (#6578)
revans2 35f9b23
Add strings::contains API with target column parameter (#6598)
davidwendt 851c881
Add support for conversion to Pandas nullable dtypes and fix related …
galipremsagar ea0b5d2
Fix integer overflow in ORC encoder (#6607)
vuule b0b3ad6
Updating to write INT96 type instead of INTERVAL
hyperbolic2346 4c6cbb1
linting
hyperbolic2346 b504d7a
Adding some documentation
hyperbolic2346 b04eb98
Adding changelog
hyperbolic2346 321c896
Merge branch 'branch-0.17' into mwilson/int96
vuule 06cb559
Revert bad CMake changes for JNI (#6629)
revans2 8aef966
Add operator overloading to column and clean up error messages (#6623)
galipremsagar 275d462
Merge remote-tracking branch 'upstream/branch-0.17' into mwilson/int96
hyperbolic2346 b2f875d
Add AVRO fuzz tests with varying function parameters (#6489)
galipremsagar 2452776
Fix timezone offset when reading ORC files (#6601)
vuule f7e6270
support `cudf.to_numeric` (#6592)
isVoid 1d72e0f
Fix Java HostColumnVector unnecessarily loading native dependencies (…
jlowe 54f3c0e
Update scatter APIs to use reference wrapper / const scalar (#6579)
brandon-b-miller 679a074
Fix ORC boolean column corruption issue (#6636)
rgsl888prabhu 2df2f3a
Fix subword tokenizer metadata for token count equal to max_sequence_…
davidwendt 2942e6f
Add error message for unsupported `axis` parameter in DataFrame APIs …
galipremsagar e01ab96
Update `to_pandas` api docs (#6622)
galipremsagar 6dfc6a3
small changes
rgsl888prabhu cdde9e4
Merge branch 'branch-0.17' of https://github.com/rapidsai/cudf into 6…
rgsl888prabhu ac351d2
Add support for `pipe` API (#6638)
galipremsagar b678dc2
review changes
rgsl888prabhu 1912637
Add dictionary support to libcudf join APIs (#6556)
davidwendt ff022e6
review changes
rgsl888prabhu 7bea264
Implement `cudf::round` floating point and integer types (`HALF_UP`) …
codereport 86cdb58
Add ability to set scalar values in `cudf.DataFrame` (#6610)
galipremsagar 094c325
Pin cmake policies to cmake 3.17 version (#6545)
06e74ef
review changes
rgsl888prabhu 21a0a33
Replaced SHFL_XOR calls with cub::WarpReduce (#6653)
kaatish 5cf7106
Add cudf::test::dictionary_column_wrapper class (#6635)
davidwendt 494b8aa
Fix the bug where if a PTX instruction contains "st.param" the whole …
hummingtree c643818
Add the python test for the originally failing python lambda.
hummingtree b769bfd
Changelog.
hummingtree 3f5e3d9
Fix the added pytest so it actually tests what we want to test ...
hummingtree 13a38c7
enable decimal type in HostColumnVector
sperlingxx ed9425b
fix typo
sperlingxx 31551c8
add changelog line
sperlingxx e57b3d7
enable decimalType within nestedTypes
sperlingxx eae0a10
Update java/src/test/java/ai/rapids/cudf/DecimalColumnVectorTest.java
sperlingxx c79c3d1
address comments
sperlingxx 83a0d89
address comments
sperlingxx 0413e17
Update java/src/test/java/ai/rapids/cudf/DecimalColumnVectorTest.java
sperlingxx 6452679
a lot of refinement
sperlingxx f56c4d9
some refinement
sperlingxx a8e9c83
addressed comments
sperlingxx 69786ed
add decimalFromDoubles
sperlingxx 5951de8
refine doc for cv.fromDecimals
sperlingxx 5cdda6f
refine
sperlingxx 06aba54
fix doc
sperlingxx 29cf5d6
refine
sperlingxx f6b8e02
refine
sperlingxx 8015692
INT96 changes for Parquet writer
razajafri a98eac8
updated changelog
razajafri ada1a0c
Support fixed-point decimal for ColumnVector
sperlingxx b4e651e
[REVIEW] Fix csv writer handling embedded comma delimiter (#6643)
davidwendt 2cf893c
[REVIEW] Implement `cudf::round` floating point and integer types (`H…
codereport 1556a49
[REVIEW] Disallow `fixed_point` `cudf::concatenate` with different sc…
codereport 65cb685
Make the explanation for st.param.*** clearer.
hummingtree c1baf12
review changes
rgsl888prabhu a9d9329
Reading ORC statistics (#6142)
calebwin 77351ef
changes
rgsl888prabhu 1f8f262
Add cudf::dictionary::make_dictionary_pair_iterator (#6651)
davidwendt f962a5d
Fix issue where index name of caller object is being modified in csv …
galipremsagar 390c34f
[REVIEW] Fix integer parsing in CSV and JSON for values outside of in…
kaatish 7f9578e
Updating based on review comments
hyperbolic2346 2fb54db
Merge remote-tracking branch 'upstream/branch-0.17' into mwilson/int96
hyperbolic2346 a268a9c
Updating comment and fixing an issue spotted in diff
hyperbolic2346 45fe6c9
Correct use of CUDA_ARCH.
jrhemstad 1e1c323
Add tests for release_assert.
jrhemstad be9e335
Cover different CSV reader/writer options in benchmarks (#6644)
vuule faada81
Parameterize avro and json benchmark (#6673)
rgsl888prabhu 8159c99
Update to use custom main() to avoid RMM interfering with death test.
jrhemstad 5fba8b9
Update death test logic.
jrhemstad 24f725d
format.
jrhemstad a4a6ef7
Doc.
jrhemstad 762b2df
changelog.
jrhemstad b1a3de2
Update error_handling_test.cu
jrhemstad d6d83af
Inadvertent change revert
hyperbolic2346 bfb1b88
Fix issue related to `na_values` input in `read_csv` (#6693)
galipremsagar 2ef56ed
Merge branch 'branch-0.17' into fix-release-assert
jrhemstad bc43878
Fix handling of empty column name in csv writer (#6692)
galipremsagar aac682b
Explicitly set legacy or per-thread default stream in JNI (#6690)
rongou 1c31f0e
Add DecimalDtype to cuDF (#6675)
codereport 3d44ed5
Implement `cudf::round` `decimal32` & `decimal64` (`HALF_UP` and `HAL…
codereport 7223835
Merge branch 'branch-0.17' into hotfix/python_lambda_fail
harrism ce848ee
Merge pull request #6670 from hummingtree/hotfix/python_lambda_fail
hummingtree 24dcc8e
Fix leak warnings in JNI unit tests (#6704)
jlowe ca8f998
Project Flash script changes
raydouglass 0485085
Raise informative error while converting a pandas dataframe with dupl…
galipremsagar 8dd9323
Fix issue when `numpy.str_` is given as input to string parameters in…
galipremsagar 940f1a4
Add call to cudaStreamSynchronize() in ::get_value()
nvdbaranec 73fb0e2
Changelog for 6713
nvdbaranec 84bf578
format.
jrhemstad 4db8ed9
Merge branch 'branch-0.17' into fix-release-assert
jrhemstad 3bd593d
Apply `na_rep` to column names in csv writer (#6708)
galipremsagar e4404e4
[REVIEW] Fix an out-of-bounds indexing error in gather() for LIST typ…
nvdbaranec 6042739
Merge branch 'branch-0.17' into get_value_fix
harrism 96ee051
Add nested type support to Java table serialization (#6705)
jlowe 88821fb
Handle index=False in dask_cudf.read_parquet (#6722)
rjzamora 75b1880
Add a comment to get_value indicating that it synchronizes the stream.
nvdbaranec 8be124d
Merge branch 'branch-0.17' into get_value_fix
nvdbaranec b85c52b
Merge branch 'get_value_fix' of github.com:nvdbaranec/cudf into get_v…
nvdbaranec 08254af
FIX Use artifact conda channel
raydouglass 7a493b9
Implement `cudf::cast` for `decimal32/64` to/from integer and floatin…
codereport 52bb044
Updating per review comments to computer julian calendar epoch differ…
hyperbolic2346 e7b69d1
Add serialization methods for ListColumn (#6721)
rjzamora e89bb41
review changes
rgsl888prabhu ef61bef
remove headers
rgsl888prabhu a7e975b
Merge pull request #6696 from jrhemstad/fix-release-assert
jrhemstad d356040
Ensure CONDA_PREFIX is on the LD_LIBRARY_PATH
raydouglass f6cf409
Fix cuDF benchmarks build with static Arrow lib and fix rapids-compos…
harrism e734d5b
Remove reinterpret_cast conversions between pointer types in Avro (#6…
vuule b2d281b
Add dictionary support to cudf::quantile (#6676)
davidwendt aaba250
Remove 2nd type-dispatcher call from cudf::reduce for simple operatio…
davidwendt 52979d1
Ensure CONDA_PREFIX is on the LD_LIBRARY_PATH
raydouglass 8502ccf
CLN Remove specific gpu arch spec
raydouglass e82294c
Grammar change in documentation
nvdbaranec 41eeff7
Adding map method for series (#6459)
marlenezw fae4769
DOC Update changelog
raydouglass 00e073b
Fix implementation of `dtype` parameter in `cudf.read_csv` (#6720)
galipremsagar 666bfae
Add Java bindings for is_timestamp (#6739)
andygrove 73c3e1d
Fix concat bug in dask_cudf Series/Index creation (#6742)
rjzamora 95b7edb
Merge remote-tracking branch 'origin/branch-0.17' into project-flash
raydouglass 899ca0e
Add Java API to concatenate serialized tables to ContiguousTable (#6748)
jlowe 974c4c4
Update nested JNI builder so we can do it incrementally. (#6749)
revans2 89c6cdd
Fix cudf python docs and associated build warnings (#6728)
galipremsagar fbf12f3
Add ORC fuzz tests with varying function parameters (#6571)
galipremsagar 627940d
[REVIEW] Fix orc read corruption on boolean column (#6702)
rgsl888prabhu f323721
`RangeIndex` support for step parameter (#6662)
isVoid 3fc8142
Remove macros from ORC reader and writer (#6698)
kaatish dea44d9
Replace raw streams with rmm::cuda_stream_view (part 1) (#6646)
harrism 400d9a7
Refactoring cooperative loading with single thread loading (#6559)
vuule fb789bf
Merge pull request #6713 from nvdbaranec/get_value_fix
jrhemstad 1d5eec6
cuDF python scalars (#6297)
brandon-b-miller bd564a0
Fix DataFrame initialization from list of dicts (#6632)
brandon-b-miller 4fa2391
Binary operations support for decimal type in cudf Java (#6734)
nartal1 158cb6b
Add Java/JNI bindings for round (#6761)
nartal1 fcf2eee
Fix sort order of parameters in `test_scalar_invalid_implicit_convers…
galipremsagar 0683df9
Filtering ORC (#6116)
calebwin 21b28ba
Struct column support for cudf::concatenate (#6652)
nvdbaranec aebb4dd
Merge branch 'branch-0.17' into project-flash
raydouglass 7220598
Implement `cudf::cast` for `decimal32/64` to/from different `type_id`…
codereport bee3229
Fix hash join hash values mapping to reserved empty value (#6735)
jrhemstad 42fe218
Support creating decimal vectors from scalar (#6723)
sperlingxx cf4b838
Merge remote-tracking branch 'upstream/branch-0.17' into mwilson/int96
hyperbolic2346 a7b4f1e
Move `cudf::cast` tests to separate test file (#6780)
codereport 69203f1
Merge pull request #6737 from raydouglass/project-flash
raydouglass 8ae857c
Update java reduction APIs to reflect C++ changes [skip ci] (#6787)
revans2 01b8b5c
Add nested type support to ColumnVector#getDeviceMemorySize (#6786)
jlowe 68dafdd
Adding docstring in to ioutils.py
hyperbolic2346 546b9c3
Support building decimal columns with Table.TestBuilder (#6770)
sperlingxx 3827052
Cover different ORC and Parquet reader/writer options in benchmarks (…
vuule 3bf5606
Use `void` return type for kernel wrapper functions instead of return…
vuule f093627
Implement `cudf::unary_operation` for `decimal32` & `decimal64` (#6777)
codereport a7fec22
Rework ColumnViewAccess and its usage (#6751)
4c1ed29
Fix race conditions in parquet (#6766)
rgsl888prabhu d834777
Fix output size for orc read for skip_rows option (#6686)
kaatish 1a80df9
[REVIEW] Rename `unary_op` to `unary_operator` (#6789)
codereport 953c24d
Implement `cudf::clamp` for `decimal32` and `decimal64` (#6792)
codereport 8c420ae
Fix AVRO reader issues with empty input (#6794)
vuule 99cee1c
Cupy fallback for __array_function__ and __array_ufunc__ for cudf.Se…
VibhuJawa 0f0e748
Parquet writer list statistics (#6703)
devavret f3ccf1c
Merge branch 'branch-0.17' into mwilson/int96
hyperbolic2346 263ec65
Add support for create_metadata_file in dask_cudf (#6796)
rjzamora 7e51022
Replace raw streams with rmm::cuda_stream_view (part 2) (#6648)
harrism de5577c
[REVIEW] Add dictionary support to cudf::minmax (#6764)
davidwendt 71d4c34
Merge pull request #6625 from hyperbolic2346/mwilson/int96
hyperbolic2346 5380744
Fix JNI build (#6824)
jlowe 9b656ef
Fix resource management in Java ColumnBuilder (#6826)
jlowe 062bf85
Fix `read_avro` docs (#6798)
galipremsagar bc154af
Add support for join parameter in cudf concat (#6336)
marlenezw b391cb7
Support scatter() for list columns (#6768)
mythrocks b5f2e3c
Use CMake 3.19 for RMM when building cuDF jar (#6819)
GaryShen2008 dbeac89
Use settings.xml if existing for internal build (#6833)
GaryShen2008 5cabd73
Enable copy_if for fixed-point decimal columns (#6805)
sperlingxx 591bead
[REVIEW] Optimization and nested type support for contiguous_split. (…
nvdbaranec db066de
Enable workaround to write categorical columns in csv (#6829)
galipremsagar cdd72c9
Fix result representation in groupby.apply (#6790)
galipremsagar fd72e5f
Fix categorical scalar insertion (#6830)
VibhuJawa 8cc23bd
Replace raw streams with rmm::cuda_stream_view (part 3) (#6744)
harrism 17666c4
Enable `expand=False` in `.str.split` and `.str.rsplit` (#6813)
galipremsagar 0e7ffcf
First class support for unbounded window function bounds (#6811)
mythrocks 632ac54
Add LogicalType to Parquet reader (#6511)
karthikeyann 6d9b139
fix uint32_t undefined errors
rongou 490f01a
add to changelog
rongou e9aedb2
Implement `cudf::copy_range` for `decimal32` and `decimal64` (#6843)
codereport e1e3047
Split out cudf::distinct_count from drop_duplicates.cu (#6822)
davidwendt 4eff46f
INT96 changes for Parquet writer
razajafri 2bb8480
updated changelog
razajafri 6e276be
addressed review comments
razajafri 38dc99d
Merge branch 'parquet_writer_int96' of github.com:razajafri/cudf into…
razajafri d91ddaf
reverted CMake changes
razajafri 6f198b4
Implement `cudf::copy_if_else` for `decimal32` and `decimal64` (#6845)
codereport 5d45e03
updated changelog
razajafri 250e405
Merge pull request #6848 from razajafri/parquet_writer_int96
razajafri f3b0e06
Avoid gather when copying strings view from start of strings column (…
jlowe c34d9bf
Add support for scatter() on lists-of-struct columns (#6817)
mythrocks a6331bf
Correct the param order of writeParquetBufferBegin
GaryShen2008 ff4f6f0
update Changelog
GaryShen2008 0b58244
reduce HtoD copies in `cudf::concatenate` #6605
karthikeyann 45bd967
Merge pull request #6854 from GaryShen2008/fix-writeParquetBufferBegin
razajafri f0f53c7
Fix `.str.replace_with_backrefs` docs examples (#6855)
galipremsagar 1771a8f
Replace cuio macros with constexpr and inline functions (#6782)
kaatish 0492519
Merge remote-tracking branch 'upstream/branch-0.17' into fix-cstdint
rongou 76799a1
Move template param to member var to improve compile of hash/groupby.…
davidwendt f3c9322
Fix contiguous split of null string columns (#6853)
sperlingxx b8e1ca6
Merge pull request #6844 from rongou/fix-cstdint
rongou cdc53b7
Move align_ptr_for_type() from cuda.cuh to alignment.hpp(#6859)
davidwendt 83d2146
Fix compile error in type_dispatch_benchmark.cu (#6861)
davidwendt 1c81827
Add dictionary support to cudf::reduce(#6666)
davidwendt bd537b6
Push DeviceScalar to cython-only (#6800)
brandon-b-miller 0ddba3d
Verify that concatenating columns does not overflow size_type(#6809)
nvdbaranec ff66c5e
Refactor `std::array` usage in row group index writing in ORC(#6807)
rgsl888prabhu 42644cc
Specify git branches to avoid pip unresolvable issues(#6869)
jdye64 edd1af1
Fix index handling in parquet reader and writer(#6771)
galipremsagar a4bdf24
add groupby hash mean aggregation, 2-pass method of hash groupby(#6392)
karthikeyann f854938
Force local artifact install(#6806)
raydouglass 220c988
Improve Dockerfile(#6619)
igormp a2d2726
Remove bounds check for `cudf::gather`(#6875)
isVoid f137ed1
Support selecting different hash functions in hash_partition(#6726)
gaohao95 5336301
Fix typo and `0-d` numpy array handling in binary operation(#6887)
rgsl888prabhu 70ebbee
Handle index when dispatching __array_function__ and __array_ufunc__ …
VibhuJawa 73cca47
Serial murmur3 hash with configurable seed(#6781)
rwlee 9fb69a6
Add parquet chunked writing ability for list columns(#6831)
devavret b9ef96c
Adding `decimal32` and `decimal64` support to parquet reading(#6808)
hyperbolic2346 f5e76fb
Support read_parquet with paths resolving to multiple files(#6815)
ayushdg 1af9bc0
Update JNI to new gather boundary check API [skip ci] (#6899)
jlowe e22c3ae
Fix missing clone overrides on derived aggregations(#6898)
jlowe cd7a0ad
Parquet option for strictly decimal reading (#6908)
sperlingxx 30bbb39
Create agg() function for dataframes(#6483)
skirui-source bd321d1
Enable groupby `list` aggregation for strings(#6914)
shwina 00ca246
Update CHANGELOG.md
raydouglass
@@ -156,3 +156,6 @@ ENV/

 # Dask
 dask-worker-space/
+
+# protobuf
+**/*_pb2.py