[FEA] Add support for RMM logging to rapids-pytest-benchmark #27

rlratzel · 2020-05-22T00:09:19Z

The new API is here:
rapidsai/rmm#363

The rapids-pytest-benchmark code can effectively replace the code in gpu_metric_poller.py with a similar module responsible for reading the log file.
Since RMM logging adds significant overhead, the benchmark time measurements cannot be taken while logging is enabled. The same is true of GPU polling, so the approach already used in rapids-pytest-benchmark should not change (i.e., separate runs done internally).
Peak memory usage is what we want to see, since it shows the resource requirements of our algos, and we want those to be as low as possible so customers can run large datasets.
(need to confirm this detail) The log contains lines/entries for each alloc and how many bytes were requested, and each free and how many bytes were freed. Algos make several alloc/free calls throughout their lifetime, so the log could get large.
(this is based on the assumption above, which needs to be confirmed) To compute the peak memory usage, do the following:
- initialize vars max and total to 0.
- for each alloc, add the number of bytes to total. If total > max, set max = total
- for each free, subtract the number of bytes from total
- at the end of the benchmark run, return max as peak memory usage
- bonus: at the end of the benchmark run, total should be 0. If not, report total as the number of leaked bytes.
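
The steps above could be sketched roughly as follows. This is only an illustration under the same unconfirmed assumption about the log contents: the CSV layout, the column names `Action` and `Size`, and the action values `allocate` and `free` are placeholders chosen for the example, not a confirmed description of what RMM writes. The function name `peak_memory_usage` is also hypothetical.

```python
import csv
import io


def peak_memory_usage(log_lines, action_col="Action", size_col="Size"):
    """Return (peak_bytes, leaked_bytes) for an iterable of CSV log lines.

    Assumed (unconfirmed) format: a header row followed by one row per
    event, with an action column ("allocate" or "free") and a size column
    holding the number of bytes requested or freed.
    """
    peak = 0   # "max" in the steps above
    total = 0  # bytes currently allocated
    for row in csv.DictReader(log_lines):
        nbytes = int(row[size_col])
        action = row[action_col]
        if action == "allocate":
            total += nbytes
            peak = max(peak, total)
        elif action == "free":
            total -= nbytes
        # Any other action value is ignored in this sketch.
    # A non-zero total at the end is reported as leaked bytes.
    return peak, total


# Made-up sample data in the assumed format.
sample_log = io.StringIO(
    "Action,Size\n"
    "allocate,100\n"
    "allocate,50\n"
    "free,100\n"
    "allocate,30\n"
    "free,50\n"
)
peak, leaked = peak_memory_usage(sample_log)
print(f"peak={peak} bytes, leaked={leaked} bytes")
```

Reading the log row by row keeps memory use constant even if the log gets large, since only the two running counters are held.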

rlratzel · 2020-06-21T03:04:22Z

@dillon-cullinan is working on this and plans to have it done after the MVP timeframe.

ajschmidt8 · 2021-02-23T22:26:31Z

Adding a link to the relevant rmm issue below for reference.

ajschmidt8 · 2021-03-09T20:08:17Z

ajschmidt8 · 2021-03-19T14:33:39Z

Unblocked with the merge of rapidsai/rmm#722

ajschmidt8 · 2021-03-24T22:09:32Z

#62 is merged, which closes this issue.

0.0.14 packages have been built and uploaded to Anaconda.org:

rlratzel added this to the Update cugraph nightly ASV benchmark reports with new results from new benchmarks milestone May 22, 2020

rlratzel self-assigned this May 22, 2020

rlratzel removed this from the Update cugraph nightly ASV benchmark reports with new results from new benchmarks milestone Jun 9, 2020

rlratzel assigned ajschmidt8 and dillon-cullinan Jun 11, 2020

rlratzel added this to the Finish MVP work milestone Jun 15, 2020

rlratzel added the in progress label Jun 17, 2020

rlratzel removed this from the Finish MVP work milestone Jun 21, 2020

ajschmidt8 unassigned dillon-cullinan Mar 2, 2021

ajschmidt8 mentioned this issue Mar 3, 2021

Replace nvidia-smi poller with RMM logger #62

Merged

2 tasks

ajschmidt8 added the blocked label Mar 9, 2021

ajschmidt8 removed the blocked label Mar 19, 2021

ajschmidt8 closed this as completed in #62 Mar 24, 2021
