Simplify `bench/ann` scripts to Python based module #1642

divyegala · 2023-07-11T19:15:49Z

cpp/bench/ann/scripts/run.py

cpp/bench/ann/scripts/get_dataset.py

benfred · 2023-07-13T17:21:20Z

cpp/bench/ann/scripts/get_dataset.py

+def convert_hdf5_to_fbin(path, normalize):
+    if normalize and "angular" in path:
+        p = subprocess.Popen(["python", "scripts/hdf5_to_fbin.py", "-n",
+                              "%s" % path])
+    else:
+        p = subprocess.Popen(["python", "scripts/hdf5_to_fbin.py",
+                              "%s" % path])
+    p.wait()


Instead of invoking this via a subprocess, can we just call the Python function directly?

Unfortunately, that script doesn't have a callable function. Would you prefer that I refactor the script now to make this work, or can we do it later?
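One way the refactor discussed here could look, as a minimal sketch rather than the script's actual layout: the conversion logic moves into an importable function, and a thin `main` keeps the existing command-line behaviour. The names `convert` and `main` are hypothetical, and the body of `convert` is a placeholder that only reports what it was asked to do, since the real HDF5 reading and `.fbin` writing are not shown in this thread.

```python
import argparse


def convert(path, normalize=False):
    """Hypothetical entry point for the logic in scripts/hdf5_to_fbin.py.

    Placeholder body: the real script would read the HDF5 file and write
    the .fbin outputs here. It returns its arguments so the call path can
    be checked without touching any files.
    """
    return {"path": path, "normalize": normalize}


def main(argv=None):
    """Command-line wrapper that keeps the `-n` flag used in the diff."""
    parser = argparse.ArgumentParser(description="Convert HDF5 to fbin")
    parser.add_argument("-n", dest="normalize", action="store_true",
                        help="flag name taken from the diff")
    parser.add_argument("path", help="path to the .hdf5 file")
    args = parser.parse_args(argv)
    return convert(args.path, normalize=args.normalize)


def convert_hdf5_to_fbin(path, normalize):
    """In-process replacement for the subprocess call in get_dataset.py."""
    # Same condition as the diff: only pass the flag for angular datasets.
    return convert(path, normalize=bool(normalize and "angular" in path))
```

In the script itself, an `if __name__ == "__main__": main()` guard would preserve the current command-line usage while letting `get_dataset.py` import `convert` directly.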

Co-authored-by: Ben Frederickson <github@benfrederickson.com>

dantegd

Added a few comments

cpp/bench/ann/scripts/data_export.py

cpp/bench/ann/scripts/get_dataset.py

wphicks

Looks great! Just one suggestion to avoid old-style string formatting, but otherwise I think this is good.
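For the string-formatting suggestion, a hedged sketch of how the command in `convert_hdf5_to_fbin` could be assembled without `"%s" % path`. The helper name `build_convert_command` is hypothetical. Using `sys.executable` instead of the literal `"python"`, and `subprocess.run(..., check=True)` instead of `Popen` plus `wait()`, are optional substitutions and not something requested in the review.

```python
import subprocess
import sys


def build_convert_command(path, normalize):
    """Return the argument list for scripts/hdf5_to_fbin.py (hypothetical helper)."""
    cmd = [sys.executable, "scripts/hdf5_to_fbin.py"]
    if normalize and "angular" in path:
        cmd.append("-n")
    # str(path) replaces the old-style "%s" % path formatting.
    cmd.append(str(path))
    return cmd


def convert_hdf5_to_fbin(path, normalize):
    # check=True raises CalledProcessError if the script exits non-zero.
    subprocess.run(build_convert_command(path, normalize), check=True)
```

Building the list once also removes the duplicated `Popen` call from the two branches of the original `if`/`else`.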

cpp/bench/ann/scripts/get_dataset.py

cjnolet

The changes are looking good. I did a first pass to provide some feedback.

cpp/bench/ann/scripts/plot.py

dependencies.yaml

docs/source/cuda_ann_benchmarks.md

cjnolet

This is so close. I'm really, really excited about this. The readme is looking great, and the fact that we can now run these end-to-end is perfect. Most of my feedback now is about going just a little further and making everything as easy as possible for the new user.

docs/source/raft_ann_benchmarks.md

cjnolet · 2023-07-21T23:36:10Z

docs/source/raft_ann_benchmarks.md

+# (1) prepare dataset
+# download manually "Ground Truth" file of "Yandex DEEP"
+# suppose the file name is deep_new_groundtruth.public.10K.bin
+../../scripts/split_groundtruth.pl deep_new_groundtruth.public.10K.bin groundtruth


We should wrap this in Python for consistency. It's confusing to see a bunch of Python scripts and then see Perl.

I think we should slip this task to a follow-up PR.

I think that can work as an immediate follow-up. I'd prefer for it to still be worked into 23.08. We don't necessarily need to support every unique parameter combination initially, so long as we support what's needed to run a basic benchmark end-to-end for billion-scale.
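A minimal sketch of what a Python wrapper for the ground-truth split might look like. It assumes the downloaded file uses the common big-ann-benchmarks layout (two little-endian `uint32` values for the number of queries and `k`, then all neighbor ids as `uint32`, then all distances as `float32`) and that the outputs are `<prefix>.neighbors.ibin` and `<prefix>.distances.fbin`. Neither the layout nor the output names are confirmed in this thread, so both should be checked against `split_groundtruth.pl` before use. The function name is hypothetical.

```python
import struct


def split_groundtruth(src_path, out_prefix):
    """Split a combined ground-truth file into neighbor and distance files.

    Assumed input layout: uint32 n_queries, uint32 k, then n_queries * k
    uint32 neighbor ids, then n_queries * k float32 distances.
    """
    with open(src_path, "rb") as src:
        header = src.read(8)
        if len(header) != 8:
            raise ValueError("file too short to contain a header")
        n_queries, k = struct.unpack("<II", header)
        n_bytes = n_queries * k * 4  # both payloads use 4-byte elements
        neighbors = src.read(n_bytes)
        distances = src.read(n_bytes)
    if len(neighbors) != n_bytes or len(distances) != n_bytes:
        raise ValueError("file is shorter than its header implies")

    # Each output repeats the (n_queries, k) header followed by its payload.
    with open(f"{out_prefix}.neighbors.ibin", "wb") as out:
        out.write(header + neighbors)
    with open(f"{out_prefix}.distances.fbin", "wb") as out:
        out.write(header + distances)
    return n_queries, k
```

Under these assumptions, `split_groundtruth("deep_new_groundtruth.public.10K.bin", "groundtruth")` would mirror the Perl invocation shown in the docs.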

docs/source/raft_ann_benchmarks.md

scripts/ann-benchmarks/get_dataset.py

achirkin · 2023-07-24T09:07:33Z

docs/source/raft_ann_benchmarks.md

+mamba env create --name raft_ann_benchmarks -f conda/environments/bench_ann_cuda-118_arch-x86_64.yaml
+conda activate raft_ann_benchmarks
+
+mamba install -c rapidsai libraft-ann-bench


Would it be realistic to set up a conda environment that does not depend on CUDA conda packages and uses the system CUDA installation instead? I'd love to be able to use these scripts in Docker containers with the latest CUDA drivers. In fact, I think this would be the main use case on the devtech side: testing and adjusting the raft implementation for upcoming hardware.

Sure, with some testing we can come up with a more minimal environment that does not include any CUDA conda packages. Let me know if we can work on this together.

cjnolet

LGTM. Thanks so much @divyegala!

cjnolet · 2023-07-26T13:11:59Z

/merge

add utility to download and move files

d6b5f4e

divyegala added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Jul 11, 2023

github-actions bot added the cpp label Jul 11, 2023

divyegala added 3 commits July 11, 2023 12:18

run pre-commit

0c6c33b

add copyright

1be4eb1

start working on runner script

a50cd97

divyegala added the python label Jul 11, 2023

working build and search

e9b4eca

github-actions bot removed the python label Jul 13, 2023

divyegala added 5 commits July 13, 2023 08:17

Merge remote-tracking branch 'upstream/branch-23.08' into bench-ann-scripts

87ad2b3

fix spelling

3a12d40

run flake8 manually

eb6d9a2

add data_export.py script

cd86fa3

run flake8 manually

a548b22

benfred reviewed Jul 14, 2023

divyegala and others added 4 commits July 14, 2023 09:46

Update cpp/bench/ann/scripts/run.py

df16559

Co-authored-by: Ben Frederickson <github@benfrederickson.com>

review suggestions

d4f30de

add docs

b665a64

spelling check

079d8ef

dantegd reviewed Jul 14, 2023

divyegala added 2 commits July 14, 2023 11:39

address review

c62c423

Merge remote-tracking branch 'upstream/branch-23.08' into bench-ann-scripts

746c214

cjnolet assigned divyegala Jul 17, 2023

divyegala requested review from benfred and dantegd July 17, 2023 19:32

wphicks reviewed Jul 17, 2023

divyegala added 3 commits July 18, 2023 16:36

add faiss_gpu_ivf_sq

1430155

address review to use new string formatting, add plot.py

a38f21c

add end-to-end docs for b scale

94ddec4

divyegala added 2 commits July 18, 2023 18:32

add plotting

cf44279

correct executable=>algo strategy

7b4711e

cjnolet requested changes Jul 19, 2023

divyegala and others added 9 commits July 19, 2023 17:25

address review

7465b8d

Merge branch 'branch-23.08' into bench-ann-scripts

9b978c9

modify docs

76d45fd

fix some typos

494609e

run benchmarks with conda package

d46d49d

fix spelling

5adbf36

add build/search params to run.py

2f1e8ca

add destructors to fix running raft benchmarks

3ac8d76

move algos.yaml

ee61877

cjnolet requested changes Jul 21, 2023

achirkin reviewed Jul 24, 2023

cjnolet and others added 2 commits July 24, 2023 17:28

Merge branch 'branch-23.08' into bench-ann-scripts

dbfae90

address review

a0bf789

divyegala marked this pull request as ready for review July 25, 2023 01:20

divyegala requested review from a team as code owners July 25, 2023 01:20

add cmake example

7c1a6cf

divyegala requested a review from cjnolet July 25, 2023 19:22

cjnolet approved these changes Jul 26, 2023

raydouglass approved these changes Jul 26, 2023

rapids-bot bot merged commit f99a418 into rapidsai:branch-23.08 Jul 26, 2023
54 checks passed
