[FEATURE] Benchmarking memory #516
Comments
Hi @mynameisvinn . I am interested in doing this. However I am new to benchmarking. Can you tell me how to do this? Jakub gave some pointers as to where to start. I am going through those |
@KrishnaChaitanya1 you may start from going through the documentation of the |
@haiyangdeperci . Sure will check that. |
Great! You might also check #520 out. |
@haiyangdeperci @mynameisvinn. Is this what you were expecting? |
@KrishnaChaitanya1 so far so good. Don't forget to track parameters (e.g. caching arguments) for reproducibility. |
Hi @mynameisvinn . Can you elaborate a bit? |
@KrishnaChaitanya1 This is a good start. Whenever you set up an environment for benchmarking there are different conditions that can affect the measured performance. Here (I believe) @mynameisvinn wanted you to parametrize the function that you are profiling. For instance, the |
@haiyangdeperci is on top of things per usual |
@haiyangdeperci Thanks for the info. So in a sense I should play a bit with the parameters right? |
@KrishnaChaitanya1 yeah, the idea is that you design the setup in a way that the same function can be run with different parameters |
@haiyangdeperci Got It. I will try doing it. |
Awesome, looking forward to your benchmarks |
@KrishnaChaitanya1 Hi, could you give us an update on this task? Let us know if you need help. |
Hi @haiyangdeperci . I was busy the last week. Didn't find time to work on this. I will make some progress this week. |
No worries, I was just checking what the status is 👍 Please take your time. If you happen to be free tomorrow, please join our benchmarking group call. |
Hi @haiyangdeperci . Here are some screenshots. The first one is using Hub and the second one is using tensorflow. Loaded Cifar100 on both. Check and let me know if this helps |
Hi, thanks for helping us out. I would suggest defining parameters as variables instead of just duplicating lines. Also, in this case you should control for |
@haiyangdeperci . Sure I will try changing them to variables. Control for |
@KrishnaChaitanya1 correct! Sorry if I was not precise enough |
Hi @haiyangdeperci . I modified the code to variables. Tested only for |
Hi @haiyangdeperci . Can you check the above comment and let me know your thoughts? I changed it to variables as you mentioned. |
@KrishnaChaitanya1 can you send in the entire code in the form of a PR? It would be easier for us to review that. I assume you are running this with |
@haiyangdeperci . Created PR #638. I have a question: how do you remove those merge commits? |
Closing this feature request due to inactivity and lack of interest. Will revive it if more users request it. |
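The parametrization advice in the thread above (define parameters as variables, run the same function with different parameters, and record them for reproducibility) can be sketched roughly as follows. This is only an illustration: `fetch_samples`, `batch_size`, and `use_cache` are hypothetical placeholders, not Hub APIs or the parameters discussed in the truncated comments.

```python
import itertools
import time


def fetch_samples(batch_size, use_cache):
    """Hypothetical stand-in for the dataset read being profiled."""
    data = [bytes(1024) for _ in range(batch_size)]
    # Pretend the uncached path does extra work by copying the list.
    return len(data) if use_cache else len(list(data))


# Parameters live in one place as variables instead of duplicated lines.
# The names and values here are assumptions chosen for illustration.
PARAM_GRID = {
    "batch_size": [16, 64],
    "use_cache": [True, False],
}


def run_benchmarks(func, grid):
    """Run func once per parameter combination and record the parameters used."""
    names = list(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        start = time.perf_counter()
        func(**params)
        elapsed = time.perf_counter() - start
        # Storing params next to the measurement keeps the run reproducible.
        results.append({"params": params, "seconds": elapsed})
    return results


results = run_benchmarks(fetch_samples, PARAM_GRID)
for row in results:
    print(row["params"], f"{row['seconds']:.6f}s")
```

Keeping the parameter grid separate from the measured function means a new condition can be added by editing one dictionary rather than copying benchmark lines.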
🚨🚨 Feature Request
We should benchmark memory usage when fetching from a Hub dataset.
If your feature will improve
HUB
In the near term, well-scoped memory benchmarks will help assess new features. In the long term, they can be used to compare performance with other libraries such as Zarr and Tile.
Description of the possible solution
We could start with a client-side benchmark reading from a local volume, perhaps with memory-profiler.
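A minimal sketch of such a client-side benchmark, under stated assumptions: it uses the standard-library `tracemalloc` module in place of memory-profiler so that it runs without extra dependencies, and `read_local_file` is a hypothetical stand-in for fetching from a Hub dataset. Note that `tracemalloc` only tracks allocations made through Python's allocator, so the numbers are not directly comparable to the process-level memory that memory-profiler reports.

```python
import os
import tempfile
import tracemalloc


def read_local_file(path, chunk_size):
    """Hypothetical stand-in for fetching from a dataset on a local volume."""
    chunks = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
    return chunks


def peak_memory_bytes(func, *args, **kwargs):
    """Return (result, peak bytes traced by tracemalloc while func ran)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


# Write a 1 MiB scratch file to act as the "local volume" for this sketch.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(1024 * 1024))
    scratch_path = tmp.name

try:
    chunks, peak = peak_memory_bytes(
        read_local_file, scratch_path, chunk_size=64 * 1024
    )
finally:
    os.remove(scratch_path)

total_read = sum(len(c) for c in chunks)
print(f"read {total_read} bytes, peak traced memory {peak} bytes")
```

To benchmark a real dataset read, the function passed to `peak_memory_bytes` would be replaced by the actual fetch, with its arguments recorded alongside the result.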