Rework logic in cudf::strings::split_record to improve performance #12729

Merged on Feb 21, 2023 (31 commits)
Changes from 25 commits
Commits
a8f4509
Rework logic in cudf::strings::split_record to improve performance
davidwendt Feb 8, 2023
596eb86
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 8, 2023
0d8c67a
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 8, 2023
436cd58
improve gpu utilization
davidwendt Feb 8, 2023
d7dcb2a
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 8, 2023
f5d1bb0
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 8, 2023
8132d44
Merge branch 'perf-split-record' of github.com:davidwendt/cudf into p…
davidwendt Feb 8, 2023
cae6bb0
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 9, 2023
498f697
use CRTP for split/rsplit functors
davidwendt Feb 10, 2023
7a55f0f
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 10, 2023
d37c0e5
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 10, 2023
e49eb02
Merge branch 'perf-split-record' of github.com:davidwendt/cudf into p…
davidwendt Feb 10, 2023
5f61d59
refactor common code to split.cuh
davidwendt Feb 10, 2023
6315e17
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 10, 2023
a68cb97
fix style check
davidwendt Feb 10, 2023
b7c89bb
fix overlapping delimiter logic in forward split
davidwendt Feb 10, 2023
9637d13
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 10, 2023
3476581
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 13, 2023
309c23d
Merge branch 'perf-split-record' of github.com:davidwendt/cudf into p…
davidwendt Feb 13, 2023
ce1dae1
fix overlapped delimiters logic
davidwendt Feb 13, 2023
cd041a5
null rows should not create tokens
davidwendt Feb 13, 2023
9bfb9aa
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 13, 2023
cec08f2
add multi-byte delim gtests for split/rsplit
davidwendt Feb 14, 2023
f98fb29
add CUDF_CUDA_TRY to cudaMemSetAsync call
davidwendt Feb 14, 2023
df4fa9e
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 15, 2023
06222ca
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 16, 2023
23ec34a
Merge branch 'perf-split-record' of github.com:davidwendt/cudf into p…
davidwendt Feb 16, 2023
795841e
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 16, 2023
c73f2dd
fix comments per review
davidwendt Feb 16, 2023
efa0490
make some const vars constexpr
davidwendt Feb 16, 2023
524e038
Merge branch 'branch-23.04' into perf-split-record
davidwendt Feb 19, 2023
4 changes: 2 additions & 2 deletions cpp/benchmarks/string/split.cpp
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2021-2022, NVIDIA CORPORATION.
+ * Copyright (c) 2021-2023, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -62,7 +62,7 @@ static void generate_bench_args(benchmark::internal::Benchmark* b)
int const row_mult = 8;
int const min_rowlen = 1 << 5;
int const max_rowlen = 1 << 13;
- int const len_mult = 4;
+ int const len_mult = 2;
for (int row_count = min_rows; row_count <= max_rows; row_count *= row_mult) {
for (int rowlen = min_rowlen; rowlen <= max_rowlen; rowlen *= len_mult) {
// avoid generating combinations that exceed the cudf column limit
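
To make the effect of the `len_mult` change concrete, here is a minimal standalone sketch that enumerates the row lengths the inner loop would visit. The helper `geometric_steps` is hypothetical and not part of cudf or the benchmark; it only mirrors the loop shape `for (rowlen = min_rowlen; rowlen <= max_rowlen; rowlen *= len_mult)` shown in the diff. It ignores the row-count loop and the column-size limit check, so the real number of benchmark cases may be smaller.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of cudf): collects the values visited by a
// geometric loop of the form `for (v = lo; v <= hi; v *= mult)`.
std::vector<int> geometric_steps(int lo, int hi, int mult)
{
  std::vector<int> steps;
  // Guard against arguments that would make the loop run forever.
  if (lo < 1 || mult < 2) { return steps; }
  for (int v = lo; v <= hi; v *= mult) {
    steps.push_back(v);
  }
  return steps;
}

// Number of row lengths generated for the bounds used in the diff
// (min_rowlen = 1 << 5, max_rowlen = 1 << 13) and a given multiplier.
int rowlen_count(int len_mult)
{
  return static_cast<int>(geometric_steps(1 << 5, 1 << 13, len_mult).size());
}
```

With the bounds from the diff, `rowlen_count(4)` is 5 (32, 128, 512, 2048, 8192) and `rowlen_count(2)` is 9 (every power of two from 32 through 8192), so the change gives the benchmark a finer sweep over row lengths.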