
Parallel map step for DistributedDataAnalyzer map-reduce #5291

Merged

Conversation

@bm-synth (Contributor) commented Mar 17, 2024

  • Adds multi-process CPU parallelism to the DistributedDataAnalyzer map operation (degree of parallelism set with the num_workers parameter). Uses a SharedMemory / Manager queue per metric, written to concurrently by the worker processes.
  • Much faster write_buffer_to_file in the DistributedDataAnalyzer reduce operation, achieved by copying the output tensor to CPU and detaching it from the autograd graph.
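The queue-per-metric pattern in the first bullet can be sketched as follows. This is a minimal illustration, not the actual DeepSpeed implementation: the metric name ("seqlen") and helper names are hypothetical, and a real analyzer would register one queue per configured metric.

```python
from multiprocessing import Manager, Process

def analyze_shard(shard, queues):
    # Each worker computes metrics for its shard of (index, sample) pairs
    # and pushes results onto the shared queue for that metric.
    for idx, sample in shard:
        queues["seqlen"].put((idx, len(sample)))

def parallel_map(dataset, num_workers=2):
    manager = Manager()
    queues = {"seqlen": manager.Queue()}  # one shared queue per metric
    indexed = list(enumerate(dataset))
    # Round-robin shard assignment across workers.
    shards = [indexed[i::num_workers] for i in range(num_workers)]
    workers = [Process(target=analyze_shard, args=(shard, queues))
               for shard in shards]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # Drain each queue; sort by sample index, since arrival order
    # across processes is nondeterministic.
    results = {}
    for name, q in queues.items():
        items = []
        while not q.empty():
            items.append(q.get())
        results[name] = sorted(items)
    return results
```

Because each metric has its own queue, workers never contend across metrics, and the parent only merges per-metric results after all workers have joined.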

@bm-synth bm-synth changed the title Parallel run in distributed data analyzer Parallel map step for DistributedDataAnalyzer map-reduce Mar 17, 2024
@bm-synth bm-synth marked this pull request as ready for review March 17, 2024 15:19
@bm-synth bm-synth mentioned this pull request Mar 18, 2024
@bm-synth (Contributor, Author)
@loadams @conglongli what's holding up this PR?

@conglongli conglongli self-assigned this Apr 18, 2024
@conglongli conglongli added this pull request to the merge queue Apr 18, 2024
Merged via the queue into microsoft:master with commit 64defe6 Apr 18, 2024
12 checks passed
rraminen pushed a commit to ROCm/DeepSpeed that referenced this pull request May 9, 2024
…#5291)

- adds multi CPU-processing to the `DistributedDataAnalyzer` map
operation (parallelism set with parameter `num_workers`). Works with a
`SharedMemory` / `Manager's` queue per metric, written concurrently by
processes.
- much faster `write_buffer_to_file` in `DistributedDataAnalyzer` reduce
operation by copying to cpu and "detaching" output tensor.

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
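The reduce-side speedup in the second bullet amounts to detaching the output tensor from the autograd graph and copying it to host memory once, then serializing the whole buffer in a single call rather than paying per-element device syncs. A minimal PyTorch sketch (the function name and signature here are illustrative, not the DeepSpeed API):

```python
import numpy as np
import torch

def write_buffer_to_file(tensor, path):
    # detach() drops the autograd graph reference; cpu() performs one
    # device-to-host copy. The resulting numpy view is then written to
    # disk in bulk with tofile() instead of element by element.
    buf = tensor.detach().cpu().numpy()
    buf.tofile(path)
```

Without detach(), calling .numpy() on a tensor that requires grad raises an error, and iterating a device tensor element-wise would force a sync per access; the one-shot copy avoids both.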
umchand pushed a commit to umchand/DeepSpeed that referenced this pull request May 20, 2024
Labels: none · Projects: none · Linked issues: none · 3 participants