[FEATURE] Benchmarking memory #516

mynameisvinn · 2021-01-30T16:30:55Z

🚨🚨 Feature Request

Related to an existing Issue
A new implementation (Improvement, Extension)

We should benchmark memory usage when fetching from a Hub dataset.

If your feature will improve `HUB`

In the near term, well-scoped memory benchmarks will help assess new features. In the long term, they can be used to compare performance with other libraries such as Zarr and Tile.

Description of the possible solution

We could start with a client-side benchmark reading from a local volume, perhaps with memory-profiler.
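A minimal sketch of such a client-side benchmark, under some loud assumptions: it uses the standard library's `tracemalloc` as a stand-in for memory-profiler (so it counts Python-level allocations only, not process RSS), and `read_local_volume` is a hypothetical placeholder for the actual Hub fetch, reading a temporary local file instead of a real dataset.

```python
import os
import tempfile
import tracemalloc


def measure_peak_bytes(func, *args, **kwargs):
    """Run func and return (result, peak traced allocation in bytes)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def read_local_volume(path, chunk_size=1 << 20):
    """Hypothetical stand-in for a Hub fetch: read a local file in chunks."""
    chunks = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


# Write 8 MiB of placeholder data to a temporary file, then measure the read.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(8 << 20))
    sample_path = tmp.name

data, peak = measure_peak_bytes(read_local_volume, sample_path)
os.remove(sample_path)
print(f"read {len(data)} bytes, peak traced memory {peak / 2**20:.1f} MiB")
```

Swapping `read_local_volume` for a function that opens and reads a Hub dataset would give a first number to compare across features.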


KrishnaChaitanya1 · 2021-02-01T09:33:56Z

Hi @mynameisvinn. I am interested in doing this. However, I am new to benchmarking. Can you tell me how to do this? Jakub gave some pointers as to where to start. I am going through those.

haiyangdeperci · 2021-02-01T09:39:35Z

@KrishnaChaitanya1 you may start by going through the memory_profiler documentation and applying it to Hub functions.

KrishnaChaitanya1 · 2021-02-01T09:41:05Z

@haiyangdeperci Sure, will check that.

haiyangdeperci · 2021-02-01T09:43:32Z

Great! You might also check #520 out.

KrishnaChaitanya1 · 2021-02-01T15:07:56Z

@haiyangdeperci @mynameisvinn. Is this what you were expecting?

mynameisvinn · 2021-02-01T15:22:13Z

@KrishnaChaitanya1 so far so good. Don't forget to track parameters (e.g., caching arguments) for reproducibility.

KrishnaChaitanya1 · 2021-02-02T14:26:51Z

Hi @mynameisvinn . Can you elaborate a bit?

haiyangdeperci · 2021-02-02T14:46:09Z

@KrishnaChaitanya1 This is a good start. Whenever you set up an environment for benchmarking there are different conditions that can affect the measured performance. Here (I believe) @mynameisvinn wanted you to parametrize the function that you are profiling. For instance, the cache or storage_cache arguments could be False or True, and you should be mindful of both.
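One way to sketch this idea: a single profiling helper that takes the parameters as keyword arguments and stores them next to each measurement, so every result can be traced back to its settings. Everything named here is an assumption for illustration: `fetch_samples` is a hypothetical stand-in for the profiled Hub call (it accepts the flags but does not act on them), and `tracemalloc` is used in place of memory_profiler.

```python
import itertools
import tracemalloc


def fetch_samples(cache, storage_cache, n_items=10_000):
    """Hypothetical stand-in for the profiled Hub call.

    A real benchmark would open and read a Hub dataset here and pass
    `cache` and `storage_cache` through to it; this placeholder only
    allocates some objects and ignores both flags.
    """
    return [bytes(64) for _ in range(n_items)]


def profile(func, **params):
    """Run func(**params) and return the parameters plus the peak bytes."""
    tracemalloc.start()
    try:
        func(**params)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return {**params, "peak_bytes": peak}


# Same function, every combination of the two boolean parameters.
runs = [
    profile(fetch_samples, cache=cache, storage_cache=storage_cache)
    for cache, storage_cache in itertools.product([False, True], repeat=2)
]
for run in runs:
    print(run)
```

Because each record carries its own parameters, the output can be saved and compared later without relying on the order of the runs.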

mynameisvinn · 2021-02-02T14:53:03Z

@haiyangdeperci is on top of things per usual

KrishnaChaitanya1 · 2021-02-02T14:53:33Z

@haiyangdeperci Thanks for the info. So in a sense I should play a bit with the parameters right?

haiyangdeperci · 2021-02-02T15:07:04Z

@KrishnaChaitanya1 yeah, the idea is that you design the setup in a way that the same function can be run with different parameters

KrishnaChaitanya1 · 2021-02-02T15:07:57Z

@haiyangdeperci Got It. I will try doing it.

haiyangdeperci · 2021-02-02T15:08:58Z

Awesome, looking forward to your benchmarks

haiyangdeperci · 2021-02-08T17:36:10Z

@KrishnaChaitanya1 Hi, could you give us an update on this task? Let us know if you need help.

KrishnaChaitanya1 · 2021-02-09T12:59:04Z

Hi @haiyangdeperci. I was busy last week and didn't find time to work on this. I will make some progress this week.

haiyangdeperci · 2021-02-09T13:16:50Z

No worries, I was just checking what the status is 👍 Please take your time. If you happen to be free tomorrow, please join our benchmarking group call.

KrishnaChaitanya1 · 2021-02-12T13:38:42Z

Hi @haiyangdeperci. Here are some screenshots. The first one uses Hub and the second one uses TensorFlow. I loaded CIFAR-100 in both. Please check and let me know if this helps.

haiyangdeperci · 2021-02-12T13:45:48Z

Hi, thanks for helping us out. I would suggest defining parameters as variables instead of just duplicating lines. Also, in this case you should control for cache and storage_cache separately.
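A possible shape for this, as a sketch only: keep the parameters in variables and generate the configurations from them, changing one parameter at a time relative to a baseline so that the effect of each can be read separately. The names `BASELINE`, `VARIED`, and `one_at_a_time` are hypothetical, and the boolean values follow the discussion above rather than any confirmed Hub signature.

```python
# Parameters live in variables instead of being repeated on duplicated lines.
BASELINE = {"cache": False, "storage_cache": False}
VARIED = {"cache": [False, True], "storage_cache": [False, True]}


def one_at_a_time(baseline, varied):
    """Yield the baseline, then configs that change exactly one parameter."""
    yield dict(baseline)
    for name, values in varied.items():
        for value in values:
            if value != baseline[name]:
                yield {**baseline, name: value}


configs = list(one_at_a_time(BASELINE, VARIED))
for config in configs:
    # A real benchmark would call the profiled function with **config here.
    print(config)
```

With this layout, adding a third parameter means adding one entry to each dictionary rather than copying another block of profiling code.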

KrishnaChaitanya1 · 2021-02-12T13:56:30Z

@haiyangdeperci Sure, I will try changing them to variables. By controlling for cache and storage_cache, do you mean I should use them separately and not together?

haiyangdeperci · 2021-02-12T14:47:20Z

@KrishnaChaitanya1 correct! Sorry if I was not precise enough

KrishnaChaitanya1 · 2021-02-14T14:19:32Z

Hi @haiyangdeperci. I changed the parameters to variables. I have tested only cache so far. I also added basic system information for context. Let me know if I should make any improvements.
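For the system information part, a minimal sketch using only the Python standard library could look like the following. The choice of fields and the `system_info` helper name are assumptions; the code from the comment above is not available in this thread.

```python
import os
import platform


def system_info():
    """Collect basic details about the machine running the benchmark."""
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        "processor": platform.processor(),
        "cpu_count": os.cpu_count(),
    }


info = system_info()
for key, value in info.items():
    print(f"{key}: {value}")
```

Storing this dictionary alongside the measurements makes it easier to explain differences between runs on different machines.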

KrishnaChaitanya1 · 2021-03-02T14:09:27Z

Hi @haiyangdeperci. Can you check the above comment and let me know your thoughts? I changed the parameters to variables as you suggested.

haiyangdeperci · 2021-03-02T14:25:02Z

@KrishnaChaitanya1 can you submit the entire code as a PR? It would be easier for us to review it that way. I assume you are running this with the cache variable set to False and then True. That's all right. We need to gather these results in some readable format, so you may include that as well. If you need more support, come to the benchmark call tomorrow; the team and other community members will help you out.
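One readable format that pastes directly into a PR description is a Markdown table. The sketch below assumes each run is stored as a dictionary; the `to_markdown_table` helper is hypothetical, and the `peak_mib` values are placeholders, not measurements.

```python
def to_markdown_table(rows):
    """Render a list of dicts with identical keys as a Markdown table."""
    headers = list(rows[0])
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines)


# Placeholder rows for illustration only; real values come from the profiler.
rows = [
    {"cache": False, "storage_cache": False, "peak_mib": 1.0},
    {"cache": True, "storage_cache": False, "peak_mib": 2.0},
]
table = to_markdown_table(rows)
print(table)
```

The same list of dictionaries could also be written out with `csv.DictWriter` if the results need to be loaded into a spreadsheet or plotted later.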

KrishnaChaitanya1 · 2021-03-02T15:01:39Z

@haiyangdeperci Created PR #638. I have a question: how do you remove those merge commits?

mynameisvinn · 2021-03-04T17:45:14Z

Closing this feature request due to inactivity and lack of interest. Will revive it if more users request it.

mynameisvinn added this to Committed in Development Roadmap Jan 30, 2021

mynameisvinn added good first issue Good for newcomers help wanted Extra attention is needed labels Jan 30, 2021

haiyangdeperci mentioned this issue Feb 3, 2021

[FEATURE] Benchmarking Process Enhancement Plan #529

Closed

14 tasks

benchislett mentioned this issue Mar 3, 2021

Benchmark restructuring and memory profiling #642

Merged

mynameisvinn closed this as completed Mar 4, 2021
