[RELEASE] cudf v0.18 #7405

Merged
merged 204 commits into from
Feb 24, 2021
Commits
80464ce
DOC v0.18 Updates
ajschmidt8 Nov 24, 2020
0e94bab
Add a cmake option to link to GDS/cuFile (#6847)
rongou Nov 30, 2020
2ed7e13
Merge pull request #6866 from rapidsai/branch-0.17
GPUtester Dec 1, 2020
a091304
Merge pull request #6867 from rapidsai/branch-0.17
GPUtester Dec 1, 2020
c0e03d6
Merge pull request #6874 from rapidsai/branch-0.17
GPUtester Dec 2, 2020
018d036
Merge pull request #6876 from rapidsai/branch-0.17
GPUtester Dec 2, 2020
7aa3863
Merge pull request #6877 from rapidsai/branch-0.17
GPUtester Dec 2, 2020
36c03a5
Merge pull request #6878 from rapidsai/branch-0.17
GPUtester Dec 2, 2020
48adcc0
Merge pull request #6879 from rapidsai/branch-0.17
GPUtester Dec 2, 2020
36d5205
Merge pull request #6880 from rapidsai/branch-0.17
GPUtester Dec 2, 2020
536d23a
Merge branch 'branch-0.17' into fix_automerge
kkraus14 Dec 3, 2020
737e715
Merge pull request #6890 from kkraus14/fix_automerge
Dec 3, 2020
3d80bb8
Merge pull request #6896 from rapidsai/branch-0.17
GPUtester Dec 3, 2020
c6f39b1
Merge pull request #6900 from rapidsai/branch-0.17
GPUtester Dec 4, 2020
009c307
Merge pull request #6904 from rapidsai/branch-0.17
GPUtester Dec 4, 2020
dd6cf15
Merge pull request #6906 from rapidsai/branch-0.17
GPUtester Dec 4, 2020
8c8e05f
Merge pull request #6910 from rapidsai/branch-0.17
GPUtester Dec 4, 2020
522103d
Merge pull request #6913 from rapidsai/branch-0.17
GPUtester Dec 4, 2020
214dccc
Implement DataFrame.quantile for datetime and timedelta data types(#6…
ChrisJar Dec 6, 2020
598a14d
Fix rmm_mode=managed parameter for gtests(#6912)
davidwendt Dec 7, 2020
917759b
Add `Index.set_names` api(#6929)
galipremsagar Dec 8, 2020
f6b16ab
Fix `columns` & `index` handling in dataframe constructor(#6838)
galipremsagar Dec 8, 2020
9120992
Implement `cudf::reduce` for `decimal32` and `decimal64` (part 1) (#6…
codereport Dec 8, 2020
78f9789
Update to official libcu++ on Github(#6275)
trxcllnt Dec 8, 2020
8a1a6d7
Remove **kwargs from string/categorical methods(#6750)
shwina Dec 8, 2020
17c8f97
Fix N/A detection for empty fields in CSV reader(#6922)
vuule Dec 9, 2020
83b1851
fix libcu++ include path for jni(#6948)
rongou Dec 9, 2020
b45fd4d
Fix cudf::merge gtest for dictionary columns(#6942)
davidwendt Dec 9, 2020
44eeb70
Update Java bindings version to 0.18-SNAPSHOT(#6949)
jlowe Dec 9, 2020
6d230ee
`fixed_point_value` double-shifts in `fixed_point` construction (#6950)
codereport Dec 9, 2020
a301e65
Add in basic support to JNI for logical_cast(#6954)
revans2 Dec 9, 2020
f117b68
Use simplified `rmm::exec_policy` (#6939)
harrism Dec 10, 2020
dc05261
Fix type comparison for java(#6970)
revans2 Dec 10, 2020
d028db6
Add Java bindings for URL conversion(#6972)
jlowe Dec 10, 2020
89938fa
Add JNI wrapper for the cuFile API (GDS)(#6940)
rongou Dec 10, 2020
f965d9a
Align `Series.groupby` API to match Pandas(#6964)
Dec 10, 2020
ea9c689
Fix typo in numerical.py(#6957)
rgsl888prabhu Dec 11, 2020
b136469
Fix groupby agg/apply behaviour when no key columns are provided(#6945)
shwina Dec 11, 2020
13acc98
Make `cudf::round` for `fixed_point` when `scale = -decimal_places` a…
codereport Dec 11, 2020
d842327
Remove duplicate file array_tests.cpp(#6953)
karthikeyann Dec 11, 2020
2b656b0
Add groupby idxmin, idxmax aggregation(#6856)
karthikeyann Dec 11, 2020
f2b9a36
Add `replace_null` API with `replace_policy` parameter, `fixed_width`…
isVoid Dec 11, 2020
c017cb4
Minor `cudf::round` internal refactoring(#6976)
codereport Dec 11, 2020
252f478
Add null mask `fixed_point_column_wrapper` constructors(#6951)
codereport Dec 11, 2020
df5d452
Fix default parameter values of `write_csv` and `write_parquet`(#6967)
vuule Dec 11, 2020
3c15d30
Fix java cufile tests when cufile is not installed(#6987)
revans2 Dec 11, 2020
5735da5
Merge branch 'branch-0.17' into branch-0.18-merge-0.17
shwina Dec 11, 2020
4c26155
Merge pull request #6995 from shwina/branch-0.18-merge-0.17
Dec 11, 2020
ab8c931
Avoid inserting null elements into join hash table when nulls are tre…
hyperbolic2346 Dec 11, 2020
929c3f4
Fix timestamp parsing in ORC reader for timezones without transitions…
vuule Dec 12, 2020
2ede7df
Fix int to datetime conversion in csv_read(#6991)
kaatish Dec 13, 2020
b986220
Disable some pragma unroll statements in thrust sort.h(#6982)
davidwendt Dec 13, 2020
8dbaa2f
Fix nullmask offset handling in parquet and orc writer(#6889)
kaatish Dec 13, 2020
b0cb9db
Pass numeric scalars of the same dtype through numeric binops(#6938)
brandon-b-miller Dec 13, 2020
29c0af1
Fix Thrust unroll patch command(#7002)
harrism Dec 14, 2020
f37d42d
Check output size overflow on strings gather(#6997)
davidwendt Dec 14, 2020
2a2b4d6
fix excluding cufile tests by default(#6988)
rongou Dec 14, 2020
a5515f2
Remove warning in from_dlpack and to_dlpack methods(#7001)
miguelusque Dec 14, 2020
15f9530
Add null count test for apply_boolean_mask(#6903)
harrism Dec 15, 2020
515a173
Skip Thrust sort patch if already applied(#7009)
harrism Dec 15, 2020
6d1b076
Enable strict_decimal_types in parquet reading(#6969)
sperlingxx Dec 15, 2020
b370963
Fix `cudf::hash_partition` for `decimal32` and `decimal64`(#7006)
codereport Dec 15, 2020
1963111
Implement cudf.DateOffset for months(#6775)
brandon-b-miller Dec 15, 2020
8c1f01e
Add pytest-xdist to dev environment.yml(#6958)
galipremsagar Dec 16, 2020
6bc71c8
Pin librdkakfa to gcc 7 compatible version (#7021)
raydouglass Dec 16, 2020
7ca3fad
Fix loc behaviour when key of incorrect type is used(#6993)
shwina Dec 16, 2020
e5d3742
Fix round operator's HALF_EVEN computation for negative integers(#7014)
nartal1 Dec 17, 2020
ae17c14
Implement `cudf::reduce` for `decimal32` and `decimal64` (part 2)(#6980)
codereport Dec 17, 2020
da60cce
Extend `replace_nulls_policy` to `string` and `dictionary` type(#7004)
isVoid Dec 17, 2020
1c8f2a8
Restore usual instance/subclass checking to cudf.DateOffset(#7029)
shwina Dec 17, 2020
8c8c421
Add compression="infer" as default for dask_cudf.read_csv(#7013)
rjzamora Dec 17, 2020
ce21296
Refactor rolling.cu to reduce compile time(#6512)
mythrocks Dec 17, 2020
c16a0a5
Decimal casts in JNI became a NOOP(#7032)
revans2 Dec 17, 2020
4385f54
Add `method` field to `fillna` for fixed width columns(#6998)
isVoid Dec 17, 2020
ff56585
Correct ORC docstring; other minor cuIO improvements(#7012)
vuule Dec 17, 2020
05653ef
Fix libcudf strings logic where size_type is used to access INT32 col…
davidwendt Dec 17, 2020
3be4428
Correct the sampling range when sampling with replacement(#6884)
ChrisJar Dec 17, 2020
ae90dd9
Add `ffill` and `bfill` to string columns(#7036)
isVoid Dec 18, 2020
c24171b
Fix `read_orc` for decimal type(#7034)
rgsl888prabhu Dec 18, 2020
442985a
Add Ufunc alias look up for appropriate numpy ufunc dispatching(#6973)
VibhuJawa Dec 18, 2020
9317361
Fix backward compatibility of loading a 0.16 pkl file(#7033)
galipremsagar Dec 18, 2020
923cf49
Share `factorize` implementation with Index and cudf module(#6885)
brandon-b-miller Dec 19, 2020
7556e23
Improve representation of `MultiIndex`(#6992)
galipremsagar Dec 21, 2020
2780a8c
Make Doxygen comments formatting consistent(#7041)
vuule Dec 23, 2020
4a1e465
Update cudf python docstrings with new null representation (`<NA>`)(#…
galipremsagar Dec 29, 2020
277bd9f
Reduce number of hostdevice_vector allocations in parquet reader(#7005)
devavret Dec 29, 2020
28d18d6
Implement `cudf::rolling` for `decimal32` and `decimal64`(#7037)
codereport Dec 31, 2020
af41136
Create sort gbenchmark for strings column(#7040)
davidwendt Jan 4, 2021
ca1a4d6
Fix to_csv delimiter handling of timestamp format(#7023)
davidwendt Jan 4, 2021
fc92bb9
`cudf::scan` support for `decimal32` and `decimal64`(#7063)
codereport Jan 4, 2021
8860baf
Spark Murmur3 hash functionality(#7024)
rwlee Jan 4, 2021
d641688
Upgrade nvcomp to 1.2.1(#7069)
rongou Jan 4, 2021
31c0d29
Adding decimal writing support to parquet(#7017)
hyperbolic2346 Jan 5, 2021
6ebd264
Add days check to cudf::is_timestamp using cuda::std::chrono classes(…
davidwendt Jan 5, 2021
873ab4a
Only upload packages that were built(#7077)
raydouglass Jan 5, 2021
91322ba
`cudf::rolling` `ROW_NUMBER` support for `decimal32` and `decimal64`(…
codereport Jan 5, 2021
7bf0505
Refactor ORC `ProtobufReader` to make it more extendable(#7055)
vuule Jan 5, 2021
6828e2c
Add dictionary support to libcudf groupby functions(#6585)
davidwendt Jan 5, 2021
c0920e6
Improve `digitize` API(#7071)
isVoid Jan 6, 2021
1930432
Add `unstack()` support for non-multiindexed dataframes(#7054)
isVoid Jan 6, 2021
8787a64
Implement `__hash__` method for ListDtype(#7081)
galipremsagar Jan 6, 2021
f768da7
JNI support for creating struct column from existing columns and fixe…
revans2 Jan 7, 2021
9439ed8
Handle `nan` values correctly in `Series.one_hot_encoding`(#7059)
galipremsagar Jan 7, 2021
ee65a47
Add `pyorc` to dev environment(#7085)
galipremsagar Jan 7, 2021
aa38f85
Add informative error message for `sep` in CSV writer(#7095)
galipremsagar Jan 7, 2021
30e154c
Add Java tests for decimal casts(#7051)
sperlingxx Jan 8, 2021
04aa30c
Add groupby docs(#7100)
shwina Jan 8, 2021
11ebc3e
Adds in JNI support for creating an list column from existing columns…
revans2 Jan 11, 2021
87e414c
Add segmented_gather(list_column, gather_list)(#7003)
karthikeyann Jan 12, 2021
9a66576
verify window operations on decimal with java tests(#7120)
sperlingxx Jan 12, 2021
4da8312
Add `scale` and `value` methods to `fixed_point`(#7109)
codereport Jan 12, 2021
d791e20
Handle nested string columns with no children in contiguous_split.(#6…
nvdbaranec Jan 12, 2021
9790ff7
Add `cudf::binary_operation` `NULL_MIN`, `NULL_MAX` & `NULL_EQUALS` f…
codereport Jan 12, 2021
68d4791
Build libcudf with -Wall(#7105)
trxcllnt Jan 12, 2021
0c7b36e
update GDS/cuFile location for 0.9 release(#7131)
rongou Jan 13, 2021
e647d1a
Fix compilation failure caused by `-Wall` addition.(#7134)
codereport Jan 13, 2021
e0e2cf8
Fix compilation errors in libcudf(#7138)
galipremsagar Jan 13, 2021
c2e9ffd
Fastpath single strings column in cudf::sort(#7075)
davidwendt Jan 15, 2021
5828cef
Fix JIT cache multi-process test flakiness in slow drives(#7142)
devavret Jan 15, 2021
bce9552
Add gbenchmarks for reduction aggregations any() and all()(#7129)
davidwendt Jan 15, 2021
e86cc65
Add documentation for support dtypes in all IO formats(#7139)
galipremsagar Jan 15, 2021
835ccf9
`cudf::rolling_window` `SUM` support for `decimal32` and `decimal64`(…
codereport Jan 18, 2021
e8ecb24
Enable logic for GPU auto-detection in cudfjni(#7155)
gerashegalov Jan 19, 2021
8d80d5c
Fix comparisons between Series and cudf.NA(#7072)
brandon-b-miller Jan 19, 2021
b0525f4
Fixing parquet precision writing failing if scale is equal to precisi…
hyperbolic2346 Jan 19, 2021
5828be5
Fix -Werror=sign-compare errors in device code(#7164)
trxcllnt Jan 20, 2021
7df4a4c
Update doxyfile project number(#7161)
davidwendt Jan 20, 2021
3e0af46
Add libcudf API for parsing of ORC statistics(#7136)
vuule Jan 20, 2021
5855bfa
Update s3 tests to use moto_server(#7144)
ayushdg Jan 20, 2021
0515a42
Fix importing list & struct types in `from_arrow`(#7162)
galipremsagar Jan 20, 2021
36f85dc
Replace offsets with iterators in cuIO utilities and CSV parser(#7150)
vuule Jan 20, 2021
d79da2c
Cross link RMM & libcudf Doxygen docs(#7149)
ajschmidt8 Jan 20, 2021
02e25b6
Add `MultiIndex.rename` API(#7172)
isVoid Jan 20, 2021
27893db
Implement `cudf::group_by` (sort) for `decimal32` and `decimal64` (#7…
codereport Jan 20, 2021
95059b8
Add encoding and compression argument to CSV writer (#7168)
VibhuJawa Jan 20, 2021
a51caa5
Enable round in cudf for DataFrame and Series (#7022)
ChrisJar Jan 20, 2021
81952d0
Export mock aws credentials for s3 tests (#7176)
ayushdg Jan 20, 2021
6390498
Replace ORC writer api with class (#7099)
rgsl888prabhu Jan 21, 2021
4111cb7
Java bindings for Fixed-point type support for Parquet (#7153)
razajafri Jan 21, 2021
4c6a57c
Add support for array-like inputs in `cudf.get_dummies` (#7181)
galipremsagar Jan 21, 2021
6c116e3
Implement update() function (#6883)
skirui-source Jan 21, 2021
797f004
Add Python DecimalColumn (#6715)
shwina Jan 22, 2021
78113f5
Fix `fillna` & `dropna` to also consider `np.nan` as a missing value …
galipremsagar Jan 22, 2021
70cefa4
Adding unit tests for `fixed_point` with extremely large `scale`s (#7…
codereport Jan 23, 2021
2e0889a
Add CudfSeriesGroupBy to optimize dask_cudf groupby-mean (#7194)
rjzamora Jan 23, 2021
f422391
Adding support for explode to cuDF (#7140)
hyperbolic2346 Jan 25, 2021
6c2675c
Add libcudf lists column count_elements API (#7173)
davidwendt Jan 25, 2021
bf0c37a
Add Java interface for the new API 'explode' (#7151)
firestarman Jan 25, 2021
93ef1d2
Default `groupby` to `sort=False` (#7180)
isVoid Jan 25, 2021
f09a75f
Add JIT cache per compute capability (#7090)
devavret Jan 25, 2021
103c41a
Refactor cudf::string_view host and device code (#7159)
davidwendt Jan 25, 2021
eb1336f
Fast path single column sort (#7167)
davidwendt Jan 25, 2021
b1e9e20
Support contains() on lists of primitives (#7039)
mythrocks Jan 25, 2021
a1db5c5
Modify the semantics of `end` pointers in cuIO to match standard libr…
vuule Jan 26, 2021
6a4c760
Replace parquet writer api with class (#7058)
rgsl888prabhu Jan 26, 2021
d97b09e
Fixing parquet benchmarks (#7214)
rgsl888prabhu Jan 26, 2021
ccf4ffa
Add coverage for `skiprows` and `num_rows` in parquet reader fuzz tes…
galipremsagar Jan 26, 2021
d19cb40
Remove floating point types from radix sort fast-path (#7215)
davidwendt Jan 27, 2021
fc40c52
Add static type checking via Mypy (#6381)
shwina Jan 27, 2021
dd1efe1
Add JNI and Java bindings for list_contains (#7125)
Jan 27, 2021
9631660
Fix missing null_count() comparison in test framework and related fai…
nvdbaranec Jan 27, 2021
cbc0394
Add JNI support for converting Arrow buffers to CUDF ColumnVectors (#…
tgravescs Jan 28, 2021
7d52970
Support `numeric_only` field for `rank()` (#7213)
isVoid Jan 28, 2021
ab34580
Fix test column vector leak (#7238)
Jan 28, 2021
02166da
Fix some bugs in java scalar support for decimal (#7237)
revans2 Jan 28, 2021
9672e3d
Fix Arrow column test leaks (#7241)
tgravescs Jan 28, 2021
b608832
Add dictionary column support to rolling_window (#7186)
davidwendt Jan 28, 2021
b097b5a
Add support for `cudf::binary_operation` `TRUE_DIV` for `decimal32` a…
codereport Jan 29, 2021
fe5e07d
Refactor io memory fetches to use hostdevice_vector methods (#7035)
ChrisJar Jan 29, 2021
019d7cc
Fix `loc` for Series with a MultiIndex (#7243)
shwina Jan 29, 2021
14b0900
Implement COLLECT rolling window aggregation (#7189)
mythrocks Jan 30, 2021
50be922
Add List types support in data generator (#7064)
galipremsagar Feb 1, 2021
b8cb8c7
Handle various parameter combinations in `replace` API (#7207)
galipremsagar Feb 1, 2021
ccc9173
Define and implement more behavior for merging on categorical variabl…
brandon-b-miller Feb 1, 2021
0ee8004
Disallow picking output columns from nested columns. (#7248)
devavret Feb 1, 2021
3ecde9d
Remove floating point types from cudf::sort fast-path (#7250)
davidwendt Feb 1, 2021
52f5b32
libcudf Developer Guide (#6977)
harrism Feb 3, 2021
900c1e1
Fix style issues related to NumPy (#7279)
shwina Feb 3, 2021
54cddb1
Prepare Changelog for Automation (#7272)
ajschmidt8 Feb 3, 2021
2e71b36
Add docs for working with missing data (#7010)
galipremsagar Feb 4, 2021
fd2d0e2
Pack/unpack functionality to convert tables to and from a serialized …
nvdbaranec Feb 4, 2021
fd38b4c
Move lists utility function definition out of header (#7266)
mythrocks Feb 4, 2021
369ec98
Add Segmented sort (#7122)
karthikeyann Feb 4, 2021
4f87a59
Throw if bool column would cause incorrect result when writing to ORC…
vuule Feb 4, 2021
110ef3e
Update JNI for contiguous_split packed results (#7127)
jlowe Feb 4, 2021
1062fbc
fix java cuFile tests (#7296)
rongou Feb 4, 2021
fc9a00f
Improve `assert_eq` handling of scalar (#7220)
isVoid Feb 4, 2021
568df5b
Prepare Changelog for Automation (#7309)
galipremsagar Feb 4, 2021
8334700
Add column_device_view pointers to EncColumnDesc (#7097)
kaatish Feb 4, 2021
253dfdf
Fix copying dtype metadata after calling libcudf functions (#7271)
shwina Feb 4, 2021
fb33b94
Use `uvector` in `replace_nulls`; Fix `sort_helper::grouped_value` do…
isVoid Feb 4, 2021
3a52d93
Fix failing CI ORC test (#7313)
vuule Feb 4, 2021
e2f6952
Add Java unit tests for window aggregate 'collect' (#7121)
firestarman Feb 4, 2021
3fef7f7
Fix typo in cudf.core.column.string.extract docs (#7253)
adelevie Feb 5, 2021
f1a6616
Remove incorrect std::move call on return variable (#7319)
davidwendt Feb 5, 2021
26b8c60
Disallow constructing frames from a ColumnAccessor (#7298)
shwina Feb 5, 2021
0410a36
Fix bug when `iloc` slice terminates at before-the-zero position (#7277)
isVoid Feb 5, 2021
658e91a
Update 10 minutes to cuDF and CuPy with new APIs (#7158)
ChrisJar Feb 5, 2021
da0e794
Update readme (#7318)
shwina Feb 5, 2021
a86d5dd
Auto-label PRs based on their content (#7044)
jolorunyomi Feb 8, 2021
d3f5add
Unpin from numpy < 1.20 (#7335)
shwina Feb 9, 2021
26c2dfe
Add GHA to mark issues/prs as stale/rotten (#7388)
jjacobelli Feb 16, 2021
53ed28e
Update stale GHA with exemptions & new labels (#7395)
mike-wendt Feb 17, 2021
1544474
update changelog
raydouglass Feb 24, 2021
4 changes: 2 additions & 2 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -39,9 +39,9 @@ Here are some guidelines to help the review process go smoothly.
features or make changes out of the scope of those requested by the reviewer
(doing this just add delays as already reviewed code ends up having to be
re-reviewed/it is hard to tell what is new etc!). Further, please do not
- rebase your branch on master/force push/rewrite history, doing any of these
+ rebase your branch on main/force push/rewrite history, doing any of these
causes the context of any comments made by reviewers to be lost. If
- conflicts occur against master they should be resolved by merging master
+ conflicts occur against main they should be resolved by merging main
into the branch used for making the pull request.

Many thanks in advance for your cooperation!
21 changes: 21 additions & 0 deletions .github/labeler.yml
@@ -0,0 +1,21 @@
# Documentation for config - https://github.com/actions/labeler#common-examples

cuDF (Python):
  - 'python/**'
  - 'notebooks/**'

libcudf:
  - 'cpp/**'

CMake:
  - '**/CMakeLists.txt'
  - '**/cmake/**'

cuDF (Java):
  - 'java/**'

gpuCI:
  - 'ci/**'

conda:
  - 'conda/**'
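As a rough illustration of how the path patterns above map changed files to labels, here is a minimal Python sketch. It is not the `actions/labeler` implementation: the helper name `labels_for` is hypothetical, and Python's `fnmatchcase` only approximates the glob matching the action performs (for example, `**/CMakeLists.txt` in this sketch does not match a top-level `CMakeLists.txt`, which the real matcher may treat differently).

```python
from fnmatch import fnmatchcase

# Label -> glob patterns, copied from the .github/labeler.yml diff above.
LABEL_PATTERNS = {
    "cuDF (Python)": ["python/**", "notebooks/**"],
    "libcudf": ["cpp/**"],
    "CMake": ["**/CMakeLists.txt", "**/cmake/**"],
    "cuDF (Java)": ["java/**"],
    "gpuCI": ["ci/**"],
    "conda": ["conda/**"],
}


def labels_for(changed_files):
    """Return the sorted labels whose patterns match any changed file.

    Hypothetical helper: fnmatchcase treats '*' as matching '/' too,
    which is only an approximation of the action's glob semantics.
    """
    matched = set()
    for label, patterns in LABEL_PATTERNS.items():
        if any(
            fnmatchcase(path, pattern)
            for path in changed_files
            for pattern in patterns
        ):
            matched.add(label)
    return sorted(matched)


if __name__ == "__main__":
    print(labels_for(["python/cudf/cudf/core/frame.py", "cpp/CMakeLists.txt"]))
```

Under these assumptions, a PR touching both `python/` and `cpp/CMakeLists.txt` would pick up the Python, libcudf, and CMake labels, while a change to a file outside every listed directory would get no label.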
11 changes: 11 additions & 0 deletions .github/workflows/labeler.yml
@@ -0,0 +1,11 @@
name: "Pull Request Labeler"
on:
  - pull_request_target

jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/labeler@main
        with:
          repo-token: "${{ secrets.GITHUB_TOKEN }}"
57 changes: 57 additions & 0 deletions .github/workflows/stale.yaml
@@ -0,0 +1,57 @@
name: Mark inactive issues and pull requests

on:
  schedule:
    - cron: "0 * * * *"

jobs:
  mark-inactive-30d:
    runs-on: ubuntu-latest
    steps:
      - name: Mark 30 day inactive issues
        uses: actions/stale@v3
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          stale-issue-message: >
            This issue has been labeled `inactive-30d` due to no recent activity in the past 30 days.
            Please close this issue if no further response or action is needed.
            Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.
            This issue will be labeled `inactive-90d` if there is no activity in the next 60 days.
          stale-issue-label: "inactive-30d"
          exempt-issue-labels: "0 - Blocked,0 - Backlog,good first issue"
          days-before-issue-stale: 30
          days-before-issue-close: -1
          stale-pr-message: >
            This PR has been labeled `inactive-30d` due to no recent activity in the past 30 days.
            Please close this PR if it is no longer required.
            Otherwise, please respond with a comment indicating any updates.
            This PR will be labeled `inactive-90d` if there is no activity in the next 60 days.
          stale-pr-label: "inactive-30d"
          exempt-pr-labels: "0 - Blocked,0 - Backlog,good first issue"
          days-before-pr-stale: 30
          days-before-pr-close: -1
          operations-per-run: 50
  mark-inactive-90d:
    runs-on: ubuntu-latest
    steps:
      - name: Mark 90 day inactive issues
        uses: actions/stale@v3
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          stale-issue-message: >
            This issue has been labeled `inactive-90d` due to no recent activity in the past 90 days.
            Please close this issue if no further response or action is needed.
            Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.
          stale-issue-label: "inactive-90d"
          exempt-issue-labels: "0 - Blocked,0 - Backlog,good first issue"
          days-before-issue-stale: 90
          days-before-issue-close: -1
          stale-pr-message: >
            This PR has been labeled `inactive-90d` due to no recent activity in the past 90 days.
            Please close this PR if it is no longer required.
            Otherwise, please respond with a comment indicating any updates.
          stale-pr-label: "inactive-90d"
          exempt-pr-labels: "0 - Blocked,0 - Backlog,good first issue"
          days-before-pr-stale: 90
          days-before-pr-close: -1
          operations-per-run: 50
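The two jobs above can be read as a simple threshold rule on idle time. The following Python sketch models that reading only; it is not how `actions/stale` is implemented. The function name `inactivity_label`, the `>=` boundary, and the way "activity" is measured are all assumptions, and the `-1` close settings are taken to mean that nothing is closed automatically.

```python
from datetime import datetime, timedelta

# Exempt labels, copied from the exempt-issue-labels / exempt-pr-labels
# settings in the workflow above.
EXEMPT_LABELS = {"0 - Blocked", "0 - Backlog", "good first issue"}


def inactivity_label(last_activity, now, labels=()):
    """Return the inactivity label this model would apply, or None.

    Hypothetical helper: the real action decides what counts as activity
    and how day boundaries are handled; this sketch just compares the
    idle time against the 30 and 90 day thresholds.
    """
    if EXEMPT_LABELS & set(labels):
        return None  # exempt items are never marked inactive
    idle = now - last_activity
    if idle >= timedelta(days=90):
        return "inactive-90d"
    if idle >= timedelta(days=30):
        return "inactive-30d"
    return None


if __name__ == "__main__":
    now = datetime(2021, 2, 24)
    for days in (10, 45, 120):
        print(days, inactivity_label(now - timedelta(days=days), now))
```

In this model an item idle for 45 days would be labeled `inactive-30d`, one idle for 120 days `inactive-90d`, and anything carrying an exempt label would be left alone regardless of age.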
9 changes: 9 additions & 0 deletions .pre-commit-config.yaml
@@ -32,6 +32,15 @@ repos:
        language: system
        files: \.(cu|cuh|h|hpp|cpp|inl)$
        args: ['-fallback-style=none']
  - repo: local
    hooks:
      - id: mypy
        name: mypy
        description: mypy
        pass_filenames: false
        entry: mypy --config-file=python/cudf/setup.cfg python/cudf/cudf
        language: system
        types: [python]

default_language_version:
  python: python3
211 changes: 211 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,213 @@
# cuDF 0.18.0 (24 Feb 2021)

## Breaking Changes 🚨

- Default `groupby` to `sort=False` (#7180) @isVoid
- Add libcudf API for parsing of ORC statistics (#7136) @vuule
- Replace ORC writer api with class (#7099) @rgsl888prabhu
- Pack/unpack functionality to convert tables to and from a serialized format. (#7096) @nvdbaranec
- Replace parquet writer api with class (#7058) @rgsl888prabhu
- Add days check to cudf::is_timestamp using cuda::std::chrono classes (#7028) @davidwendt
- Fix default parameter values of `write_csv` and `write_parquet` (#6967) @vuule
- Align `Series.groupby` API to match Pandas (#6964) @kkraus14
- Share `factorize` implementation with Index and cudf module (#6885) @brandon-b-miller

## Bug Fixes 🐛

- Remove incorrect std::move call on return variable (#7319) @davidwendt
- Fix failing CI ORC test (#7313) @vuule
- Disallow constructing frames from a ColumnAccessor (#7298) @shwina
- fix java cuFile tests (#7296) @rongou
- Fix style issues related to NumPy (#7279) @shwina
- Fix bug when `iloc` slice terminates at before-the-zero position (#7277) @isVoid
- Fix copying dtype metadata after calling libcudf functions (#7271) @shwina
- Move lists utility function definition out of header (#7266) @mythrocks
- Throw if bool column would cause incorrect result when writing to ORC (#7261) @vuule
- Use `uvector` in `replace_nulls`; Fix `sort_helper::grouped_value` doc (#7256) @isVoid
- Remove floating point types from cudf::sort fast-path (#7250) @davidwendt
- Disallow picking output columns from nested columns. (#7248) @devavret
- Fix `loc` for Series with a MultiIndex (#7243) @shwina
- Fix Arrow column test leaks (#7241) @tgravescs
- Fix test column vector leak (#7238) @kuhushukla
- Fix some bugs in java scalar support for decimal (#7237) @revans2
- Improve `assert_eq` handling of scalar (#7220) @isVoid
- Fix missing null_count() comparison in test framework and related failures (#7219) @nvdbaranec
- Remove floating point types from radix sort fast-path (#7215) @davidwendt
- Fixing parquet benchmarks (#7214) @rgsl888prabhu
- Handle various parameter combinations in `replace` API (#7207) @galipremsagar
- Export mock aws credentials for s3 tests (#7176) @ayushdg
- Add `MultiIndex.rename` API (#7172) @isVoid
- Fix importing list & struct types in `from_arrow` (#7162) @galipremsagar
- Fixing parquet precision writing failing if scale is equal to precision (#7146) @hyperbolic2346
- Update s3 tests to use moto_server (#7144) @ayushdg
- Fix JIT cache multi-process test flakiness in slow drives (#7142) @devavret
- Fix compilation errors in libcudf (#7138) @galipremsagar
- Fix compilation failure caused by `-Wall` addition. (#7134) @codereport
- Add informative error message for `sep` in CSV writer (#7095) @galipremsagar
- Add JIT cache per compute capability (#7090) @devavret
- Implement `__hash__` method for ListDtype (#7081) @galipremsagar
- Only upload packages that were built (#7077) @raydouglass
- Fix comparisons between Series and cudf.NA (#7072) @brandon-b-miller
- Handle `nan` values correctly in `Series.one_hot_encoding` (#7059) @galipremsagar
- Add `unstack()` support for non-multiindexed dataframes (#7054) @isVoid
- Fix `read_orc` for decimal type (#7034) @rgsl888prabhu
- Fix backward compatibility of loading a 0.16 pkl file (#7033) @galipremsagar
- Decimal casts in JNI became a NOOP (#7032) @revans2
- Restore usual instance/subclass checking to cudf.DateOffset (#7029) @shwina
- Add days check to cudf::is_timestamp using cuda::std::chrono classes (#7028) @davidwendt
- Fix to_csv delimiter handling of timestamp format (#7023) @davidwendt
- Pin librdkakfa to gcc 7 compatible version (#7021) @raydouglass
- Fix `fillna` & `dropna` to also consider `np.nan` as a missing value (#7019) @galipremsagar
- Fix round operator's HALF_EVEN computation for negative integers (#7014) @nartal1
- Skip Thrust sort patch if already applied (#7009) @harrism
- Fix `cudf::hash_partition` for `decimal32` and `decimal64` (#7006) @codereport
- Fix Thrust unroll patch command (#7002) @harrism
- Fix loc behaviour when key of incorrect type is used (#6993) @shwina
- Fix int to datetime conversion in csv_read (#6991) @kaatish
- fix excluding cufile tests by default (#6988) @rongou
- Fix java cufile tests when cufile is not installed (#6987) @revans2
- Make `cudf::round` for `fixed_point` when `scale = -decimal_places` a no-op (#6975) @codereport
- Fix type comparison for java (#6970) @revans2
- Fix default parameter values of `write_csv` and `write_parquet` (#6967) @vuule
- Align `Series.groupby` API to match Pandas (#6964) @kkraus14
- Fix timestamp parsing in ORC reader for timezones without transitions (#6959) @vuule
- Fix typo in numerical.py (#6957) @rgsl888prabhu
- `fixed_point_value` double-shifts in `fixed_point` construction (#6950) @codereport
- fix libcu++ include path for jni (#6948) @rongou
- Fix groupby agg/apply behaviour when no key columns are provided (#6945) @shwina
- Avoid inserting null elements into join hash table when nulls are treated as unequal (#6943) @hyperbolic2346
- Fix cudf::merge gtest for dictionary columns (#6942) @davidwendt
- Pass numeric scalars of the same dtype through numeric binops (#6938) @brandon-b-miller
- Fix N/A detection for empty fields in CSV reader (#6922) @vuule
- Fix rmm_mode=managed parameter for gtests (#6912) @davidwendt
- Fix nullmask offset handling in parquet and orc writer (#6889) @kaatish
- Correct the sampling range when sampling with replacement (#6884) @ChrisJar
- Handle nested string columns with no children in contiguous_split. (#6864) @nvdbaranec
- Fix `columns` & `index` handling in dataframe constructor (#6838) @galipremsagar

## Documentation 📖

- Update readme (#7318) @shwina
- Fix typo in cudf.core.column.string.extract docs (#7253) @adelevie
- Update doxyfile project number (#7161) @davidwendt
- Update 10 minutes to cuDF and CuPy with new APIs (#7158) @ChrisJar
- Cross link RMM & libcudf Doxygen docs (#7149) @ajschmidt8
- Add documentation for support dtypes in all IO formats (#7139) @galipremsagar
- Add groupby docs (#7100) @shwina
- Update cudf python docstrings with new null representation (`<NA>`) (#7050) @galipremsagar
- Make Doxygen comments formatting consistent (#7041) @vuule
- Add docs for working with missing data (#7010) @galipremsagar
- Remove warning in from_dlpack and to_dlpack methods (#7001) @miguelusque
- libcudf Developer Guide (#6977) @harrism
- Add JNI wrapper for the cuFile API (GDS) (#6940) @rongou

## New Features 🚀

- Support `numeric_only` field for `rank()` (#7213) @isVoid
- Add support for `cudf::binary_operation` `TRUE_DIV` for `decimal32` and `decimal64` (#7198) @codereport
- Implement COLLECT rolling window aggregation (#7189) @mythrocks
- Add support for array-like inputs in `cudf.get_dummies` (#7181) @galipremsagar
- Default `groupby` to `sort=False` (#7180) @isVoid
- Add libcudf lists column count_elements API (#7173) @davidwendt
- Implement `cudf::group_by` (sort) for `decimal32` and `decimal64` (#7169) @codereport
- Add encoding and compression argument to CSV writer (#7168) @VibhuJawa
- `cudf::rolling_window` `SUM` support for `decimal32` and `decimal64` (#7147) @codereport
- Adding support for explode to cuDF (#7140) @hyperbolic2346
- Add libcudf API for parsing of ORC statistics (#7136) @vuule
- update GDS/cuFile location for 0.9 release (#7131) @rongou
- Add Segmented sort (#7122) @karthikeyann
- Add `cudf::binary_operation` `NULL_MIN`, `NULL_MAX` & `NULL_EQUALS` for `decimal32` and `decimal64` (#7119) @codereport
- Add `scale` and `value` methods to `fixed_point` (#7109) @codereport
- Replace ORC writer api with class (#7099) @rgsl888prabhu
- Pack/unpack functionality to convert tables to and from a serialized format. (#7096) @nvdbaranec
- Improve `digitize` API (#7071) @isVoid
- Add List types support in data generator (#7064) @galipremsagar
- `cudf::scan` support for `decimal32` and `decimal64` (#7063) @codereport
- `cudf::rolling` `ROW_NUMBER` support for `decimal32` and `decimal64` (#7061) @codereport
- Replace parquet writer api with class (#7058) @rgsl888prabhu
- Support contains() on lists of primitives (#7039) @mythrocks
- Implement `cudf::rolling` for `decimal32` and `decimal64` (#7037) @codereport
- Add `ffill` and `bfill` to string columns (#7036) @isVoid
- Enable round in cudf for DataFrame and Series (#7022) @ChrisJar
- Extend `replace_nulls_policy` to `string` and `dictionary` type (#7004) @isVoid
- Add segmented_gather(list_column, gather_list) (#7003) @karthikeyann
- Add `method` field to `fillna` for fixed width columns (#6998) @isVoid
- Manual merge of branch 0.17 into branch 0.18 (#6995) @shwina
- Implement `cudf::reduce` for `decimal32` and `decimal64` (part 2) (#6980) @codereport
- Add Ufunc alias look up for appropriate numpy ufunc dispatching (#6973) @VibhuJawa
- Add pytest-xdist to dev environment.yml (#6958) @galipremsagar
- Add `Index.set_names` api (#6929) @galipremsagar
- Add `replace_null` API with `replace_policy` parameter, `fixed_width` column support (#6907) @isVoid
- Share `factorize` implementation with Index and cudf module (#6885) @brandon-b-miller
- Implement update() function (#6883) @skirui-source
- Add groupby idxmin, idxmax aggregation (#6856) @karthikeyann
- Implement `cudf::reduce` for `decimal32` and `decimal64` (part 1) (#6814) @codereport
- Implement cudf.DateOffset for months (#6775) @brandon-b-miller
- Add Python DecimalColumn (#6715) @shwina
- Add dictionary support to libcudf groupby functions (#6585) @davidwendt

## Improvements 🛠️

- Update stale GHA with exemptions & new labels (#7395) @mike-wendt
- Add GHA to mark issues/prs as stale/rotten (#7388) @Ethyling
- Unpin from numpy < 1.20 (#7335) @shwina
- Prepare Changelog for Automation (#7309) @galipremsagar
- Prepare Changelog for Automation (#7272) @ajschmidt8
- Add JNI support for converting Arrow buffers to CUDF ColumnVectors (#7222) @tgravescs
- Add coverage for `skiprows` and `num_rows` in parquet reader fuzz testing (#7216) @galipremsagar
- Define and implement more behavior for merging on categorical variables (#7209) @brandon-b-miller
- Add CudfSeriesGroupBy to optimize dask_cudf groupby-mean (#7194) @rjzamora
- Add dictionary column support to rolling_window (#7186) @davidwendt
- Modify the semantics of `end` pointers in cuIO to match standard library (#7179) @vuule
- Adding unit tests for `fixed_point` with extremely large `scale`s (#7178) @codereport
- Fast path single column sort (#7167) @davidwendt
- Fix -Werror=sign-compare errors in device code (#7164) @trxcllnt
- Refactor cudf::string_view host and device code (#7159) @davidwendt
- Enable logic for GPU auto-detection in cudfjni (#7155) @gerashegalov
- Java bindings for Fixed-point type support for Parquet (#7153) @razajafri
- Add Java interface for the new API 'explode' (#7151) @firestarman
- Replace offsets with iterators in cuIO utilities and CSV parser (#7150) @vuule
- Add gbenchmarks for reduction aggregations any() and all() (#7129) @davidwendt
- Update JNI for contiguous_split packed results (#7127) @jlowe
- Add JNI and Java bindings for list_contains (#7125) @kuhushukla
- Add Java unit tests for window aggregate 'collect' (#7121) @firestarman
- verify window operations on decimal with java tests (#7120) @sperlingxx
- Adds in JNI support for creating a list column from existing columns (#7112) @revans2
- Build libcudf with -Wall (#7105) @trxcllnt
- Add column_device_view pointers to EncColumnDesc (#7097) @kaatish
- Add `pyorc` to dev environment (#7085) @galipremsagar
- JNI support for creating struct column from existing columns and fixed bug in struct with no children (#7084) @revans2
- Fastpath single strings column in cudf::sort (#7075) @davidwendt
- Upgrade nvcomp to 1.2.1 (#7069) @rongou
- Refactor ORC `ProtobufReader` to make it more extendable (#7055) @vuule
- Add Java tests for decimal casts (#7051) @sperlingxx
- Auto-label PRs based on their content (#7044) @jolorunyomi
- Create sort gbenchmark for strings column (#7040) @davidwendt
- Refactor io memory fetches to use hostdevice_vector methods (#7035) @ChrisJar
- Spark Murmur3 hash functionality (#7024) @rwlee
- Fix libcudf strings logic where size_type is used to access INT32 column data (#7020) @davidwendt
- Adding decimal writing support to parquet (#7017) @hyperbolic2346
- Add `compression="infer"` as default for dask_cudf.read_csv (#7013) @rjzamora
- Correct ORC docstring; other minor cuIO improvements (#7012) @vuule
- Reduce number of hostdevice_vector allocations in parquet reader (#7005) @devavret
- Check output size overflow on strings gather (#6997) @davidwendt
- Improve representation of `MultiIndex` (#6992) @galipremsagar
- Disable some pragma unroll statements in thrust sort.h (#6982) @davidwendt
- Minor `cudf::round` internal refactoring (#6976) @codereport
- Add Java bindings for URL conversion (#6972) @jlowe
- Enable strict_decimal_types in parquet reading (#6969) @sperlingxx
- Add in basic support to JNI for logical_cast (#6954) @revans2
- Remove duplicate file array_tests.cpp (#6953) @karthikeyann
- Add null mask `fixed_point_column_wrapper` constructors (#6951) @codereport
- Update Java bindings version to 0.18-SNAPSHOT (#6949) @jlowe
- Use simplified `rmm::exec_policy` (#6939) @harrism
- Add null count test for apply_boolean_mask (#6903) @harrism
- Implement DataFrame.quantile for datetime and timedelta data types (#6902) @ChrisJar
- Remove `**kwargs` from string/categorical methods (#6750) @shwina
- Refactor rolling.cu to reduce compile time (#6512) @mythrocks
- Add static type checking via Mypy (#6381) @shwina
- Update to official libcu++ on Github (#6275) @trxcllnt

# cuDF 0.17.0 (10 Dec 2020)

## New Features
@@ -189,6 +399,7 @@
- PR #6855 Fix `.str.replace_with_backrefs` docs examples
- PR #6853 Fix contiguous split of null string columns
- PR #6861 Fix compile error in type_dispatch_benchmark.cu
- PR #6864 Handle contiguous_split corner case for nested string columns with no children
- PR #6869 Avoid dependency resolution failure in latest version of pip by explicitly specifying versions for dask and distributed
- PR #6806 Force install of local conda artifacts
- PR #6887 Fix typo and `0-d` numpy array handling in binary operation
10 changes: 5 additions & 5 deletions README.md
@@ -46,23 +46,23 @@ Please see the [Demo Docker Repository](https://hub.docker.com/r/rapidsai/rapids

### CUDA/GPU requirements

-* CUDA 10.0+
-* NVIDIA driver 410.48+
+* CUDA 10.1+
+* NVIDIA driver 418.39+
* Pascal architecture or better (Compute Capability >=6.0)

### Conda

cuDF can be installed with conda ([miniconda](https://conda.io/miniconda.html), or the full [Anaconda distribution](https://www.anaconda.com/download)) from the `rapidsai` channel:

-For `cudf version == 0.13` :
+For `cudf version == 0.18` :
```bash
# for CUDA 10.1
conda install -c rapidsai -c nvidia -c numba -c conda-forge \
-cudf=0.13 python=3.7 cudatoolkit=10.1
+cudf=0.18 python=3.7 cudatoolkit=10.1

# or, for CUDA 10.2
conda install -c rapidsai -c nvidia -c numba -c conda-forge \
-cudf=0.13 python=3.7 cudatoolkit=10.2
+cudf=0.18 python=3.7 cudatoolkit=10.2

```
