NetVLAD++ Model RAM consumption? #35
Hi @Wilann, NetVLAD++ considers a batch size of 256 by default. Maybe you can try a smaller batch size.
If your issue comes from the inference (or evaluation), which takes a full video as input, you might want to change the corresponding parameter.
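To illustrate the idea of bounding inference memory on a full-video input, here is a minimal sketch that splits a long feature tensor into fixed-size chunks so only one chunk is on the GPU at a time. The function name and chunk size are illustrative, not the repository's actual parameters:

```python
import torch

def chunked_forward(model, features, chunk_size=256):
    """Run inference on a long per-video feature tensor in fixed-size
    chunks, so at most `chunk_size` windows are processed at once."""
    outputs = []
    with torch.no_grad():
        for start in range(0, features.shape[0], chunk_size):
            chunk = features[start:start + chunk_size]
            # Move each partial result off the GPU immediately.
            outputs.append(model(chunk).cpu())
    return torch.cat(outputs, dim=0)
```

The peak memory then scales with `chunk_size` rather than with the video length.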
Hi @SilvioGiancola, thank you for your response! I tried with a batch size of 1 and the program still crashes after loading the 300 training games. Would you happen to know the minimum RAM required to run the code?
Hi @Wilann, you are right: that implementation of NetVLAD++ actually pre-loads all the features into RAM for faster training.
Hi @SilvioGiancola, I actually managed to install more RAM in my PC, and NetVLAD++ is now able to train on SoccerNet (with all 500 games, using ~55 GB of RAM). For my own dataset I indeed have 80 games, and training works on that dataset as well (~20 GB of RAM used). I also computed ResNet-152 features at 5 fps and would like to train on those, but my GPU RAM gets overloaded. I tried lowering the

Also, once the code reaches the dataloaders using the datasets using
But I noticed you used

Thank you for your response - really appreciate it.
Hi @Wilann,

In line 72, I am loading all the features in non-overlapping windows that I sample for training. That operation is memory-intensive, as it loads all the features into RAM. Your solution would be to push most of those operations into the getitem (line 136), but it will require further engineering for a decent loading speed.

Regarding the peak in evaluation, you can skip it altogether and validate on the loss to prevent overfitting. In here, I was checking overfitting either with the loss or with the Spotting Average-mAP, but actually using the loss was good enough. For testing or inference, you will need a sliding window with a stride of 1 frame, or skip a few frames for a lighter inference. Again, it loads all features into RAM for the sake of simplicity, but better engineering could solve your issue.

I hope that helps! Cheers,
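As a rough sketch of what pushing the loading into the getitem could look like (the per-game `.npy` file layout, dimensions, and class name here are assumptions, not the repository's actual code), each item opens only its own game's feature file, ideally memory-mapped, so construction time keeps only lightweight index metadata in RAM:

```python
import numpy as np
from torch.utils.data import Dataset

class LazyFeatureDataset(Dataset):
    """Hypothetical dataset that loads per-game .npy feature files on
    demand, instead of pre-loading every game into RAM at construction."""

    def __init__(self, feature_paths, window_size=15):
        self.feature_paths = feature_paths  # one .npy file per game (assumed layout)
        self.window_size = window_size
        # Only (game_idx, start_frame) pairs are kept in RAM.
        self.index = []
        for g, path in enumerate(feature_paths):
            n_frames = np.load(path, mmap_mode="r").shape[0]
            for start in range(0, n_frames - window_size + 1, window_size):
                self.index.append((g, start))

    def __len__(self):
        return len(self.index)

    def __getitem__(self, i):
        g, start = self.index[i]
        # mmap leaves paging to the OS; only the requested slice is copied.
        feats = np.load(self.feature_paths[g], mmap_mode="r")
        return np.array(feats[start:start + self.window_size])
```

The trade-off Silvio mentions is real: per-item file access is slower than an in-RAM array, so in practice you would pair this with `DataLoader` workers to hide the I/O latency.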
Hi @SilvioGiancola, thank you so much for your response! I understand what you mean by moving the loading portion into the getitem.

About this part:
I'm a bit confused when you say "validate on the loss". How do we validate on the loss? My understanding was that validation typically occurs on a portion of the dataset.

Thanks so much for your time again - I really appreciate your help!
Hi @Wilann,

Since the RAM peak appears in evaluation, you can validate on the loss instead. That is also the reason why I implemented two validation datasets, one for classification and one for spotting:
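To make "validate on the loss" concrete: it simply means computing the same training loss on the held-out validation split each epoch and keeping the checkpoint with the lowest value, instead of running the expensive spotting evaluation. A minimal, framework-agnostic sketch (the patience value is an arbitrary choice, not a project default):

```python
def train_with_early_stopping(train_epoch, eval_loss, max_epochs=50, patience=5):
    """Stop when the validation loss has not improved for `patience` epochs.
    `train_epoch()` runs one training epoch; `eval_loss()` returns the
    current loss on the validation split."""
    best_loss, best_epoch = float("inf"), -1
    for epoch in range(max_epochs):
        train_epoch()
        loss = eval_loss()
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch  # a checkpoint would be saved here
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs
    return best_epoch, best_loss
```

This sidesteps the RAM peak because no full-video sliding-window evaluation is needed during training.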
Hi @SilvioGiancola, When you say:
Is the spotting performance the Average-mAP? If so, the way I understand it is that I basically skip using the testing dataset for now, and only use

Another few questions:
2.1. Here again in

2.2. Here on the next line, why is 1 being added to the label index?
Which I interpret as:
I believe

Thank you so much for your help again - really appreciate it.
Hi @Wilann, Yes, by spotting performance I meant the Average-mAP, and yes, you can skip the test set for the sake of memory, and use either the
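On the label-index question above: a common convention, which appears to be what the `label + 1` shift implements here, is to reserve class 0 for "background / no event" and shift the annotated event classes up by one. A hypothetical sketch of building such targets (function name and annotation format are my own, for illustration):

```python
import numpy as np

def build_targets(n_frames, annotations, n_event_classes):
    """annotations: list of (frame_idx, event_label), event_label in [0, n_event_classes).
    Returns a one-hot target matrix with column 0 reserved for background."""
    targets = np.zeros((n_frames, n_event_classes + 1), dtype=np.float32)
    targets[:, 0] = 1.0                  # every frame starts as background
    for frame, label in annotations:
        targets[frame, 0] = 0.0          # this frame is no longer background
        targets[frame, label + 1] = 1.0  # shift by 1: column 0 is background
    return targets
```

With this layout, adding 1 to the raw label index is just the mapping from the annotation's class numbering into the one-hot columns that include the background slot.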
Hi @SilvioGiancola, Thank you for the reply again! Following up on my questions:
If I understand correctly, the extracted frame features may not span the entire video duration? What's the cause of this? Also, printing something like this
I can see that the sanity check works, and I believe
Again, in line 119, why is the background class being set to 0 here? Why not just omit it altogether if it's not being used? (I don't think it's being used, at least.)
Thank you for your time again - really appreciate all the help.
Hi again,
Thank you for your time again - really appreciate the help!
It can come from the optimization, which diverged for some reason. Reduce the LR and check. Also, flipping the video inverts the past and the future, so you might want to use the NetVLAD baseline instead of NetVLAD++/CALF.
Any idea what the recommended GPU RAM size is for training and inference? @SilvioGiancola
Hi @ldfandian, the message you quote contains your answer: ~55 GB of RAM.
Thanks for the quick reply, @SilvioGiancola. So I need to implement a getitem in order to reduce RAM usage. As for GPU RAM, it looks to be tunable (via batch_size) to be as small as I like, correct?
The answer is no. Please read the whole thread and you will understand why. |
Hi SoccerNet Dev Team,
I've managed to plug my own dataset into NetVLAD++, but am unable to train due to overloading my 32 GB of RAM.
I have ~80 badminton matches of ~50 minutes each, with ResNet-152 features sampled at 5 fps. After loading my dataset, ~18 of my 32 GB of RAM are used. The program then gets killed while loading the model. I'm confused why, as the model is only ~5.5 GB as shown in the TorchInfo summary below; I believe I should still have ~8 GB to spare. Is this a feature of NetVLAD++ specifically? I noticed that in #28 experiments were done with 60-90 GB of RAM.
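For a back-of-the-envelope check (assuming 2048-dimensional ResNet-152 features stored as float32; the actual dtype and frame counts may differ), the raw feature tensor alone is already on the order of 9 GB, so ~18 GB with dataloader copies and Python overhead is plausible:

```python
games = 80
minutes = 50
fps = 5
feat_dim = 2048      # ResNet-152 penultimate-layer width (assumed)
bytes_per_val = 4    # float32

frames = games * minutes * 60 * fps                     # total frames
total_gb = frames * feat_dim * bytes_per_val / 1024**3  # raw feature size
print(f"{frames} frames, {total_gb:.1f} GB")            # prints "1200000 frames, 9.2 GB"
```

Doubling to float64 anywhere in the pipeline would already push the features alone past 18 GB, which is one thing worth checking.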
Thank you for reading, and looking forward to your insights!
TorchInfo Summary: