Steady RAM Usage Increase During Video Inference using video.py #39
Comments
@dusty-nv bump. Basically, this is what I have mentioned a few times during our conversations. In my use case where I inference Tried to do I did a memory profiling and looks like it's
Hi guys, thanks for reporting this and providing the charts; I will look into this. @ms1design are you using streaming mode for generation? Is your generation script essentially like nano_llm/chat/example.py? Can you try setting this line to Also, I take it you are running the normal latest NanoLLM container on JetPack 6? Thanks for the debugging info you have gathered!
On my end, yes, I'm running the latest NanoLLM container on JetPack 6. Thanks for looking into this issue.
Ok yea, thanks. Weird thing here is that I have recently been running VLM/VLA benchmarks for hours at a time and have not encountered this. I wonder if your circumstances are resolved in main branch? I will have the 24.8 container release out in the next couple days.
I noticed a much more pronounced RAM increase when streaming 4K images; it is not as noticeable with lower-resolution streams. Looking forward to testing the new release when it's available. Thanks.
It's streaming and yes it's still not yet using Plugins.
@dusty-nv yes, I did that as well, but unfortunately with the same results: @dusty-nv would you be so kind as to share your benchmark logic? Let me explain how mine works; maybe the issue is in my loop:
I'm running
@dusty-nv any hints on this?
@chain-of-immortals Same for me: when I reduce the length of the system prompt, I can go beyond 250 samples before OOM.
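For anyone trying to put a number on the growth described in these comments, here is a minimal sketch that reads the process's resident set size from `/proc` on Linux and estimates the average growth per sample with a least-squares slope. The helper names (`parse_vmrss_kib`, `read_rss_kib`, `growth_per_sample`) are hypothetical and not part of NanoLLM. RSS is only one signal and may not reflect every GPU-side allocation, so it is worth cross-checking against system-level RAM readings.

```python
import re


def parse_vmrss_kib(status_text):
    """Extract VmRSS (resident set size, in KiB) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS field not found")
    return int(match.group(1))


def read_rss_kib(pid="self"):
    """Read the current RSS of a process (Linux only; hypothetical helper)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vmrss_kib(f.read())


def growth_per_sample(rss_values):
    """Least-squares slope of RSS against sample index (KiB per sample)."""
    n = len(rss_values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(rss_values) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(rss_values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

One way to use it, assuming a loop that produces one reply per sample: append `read_rss_kib()` to a list after each reply, then call `growth_per_sample` on the list. A slope that stays clearly positive across hundreds of samples matches the steady climb reported here, while a slope near zero after warm-up suggests the usage has plateaued.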
Hello,
I’ve been running some tests using the nano_llm.vision.video module with live camera streaming on an AGX Orin 64GB,
with the following parameters:
--model Efficient-Large-Model/VILA1.5-13b
--max-images 5
--max-new-tokens 3
--prompt 'do you see a moniter in the frame? reply in binary 0 is no and 1 is yes'
I noticed a steady increase in RAM usage during these tests and wanted to get some clarification on what might be causing this.
Here are the details:
Setup:
First, I used a USB camera streaming at 640x480 resolution.
Then, I tested with another camera streaming at 4K resolution. I have attached graphs of the RAM usage in both cases.
Observation: In both cases, I observed a continuous climb in RAM usage over time, which persisted throughout the streaming session. The ramp-up was much quicker with 4K images.
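As a rough illustration of why resolution could matter so much if frames (or buffers derived from them) are being retained, the arithmetic below compares uncompressed frame sizes. It assumes 3 bytes per pixel (8-bit RGB) and that "4K" means 3840x2160; both are assumptions, not details confirmed in this report, and the function name is hypothetical.

```python
def frame_bytes(width, height, bytes_per_pixel=3):
    """Size of one uncompressed frame in bytes (assumes packed 8-bit RGB)."""
    return width * height * bytes_per_pixel


small = frame_bytes(640, 480)     # 921,600 bytes per frame
large = frame_bytes(3840, 2160)   # 24,883,200 bytes per frame
ratio = large / small             # 27.0

print(f"640x480:   {small / 1e6:.2f} MB per frame")
print(f"3840x2160: {large / 1e6:.2f} MB per frame")
print(f"ratio:     {ratio:.0f}x")
```

Under these assumptions, each retained 4K frame costs about 27 times as much memory as a retained 640x480 frame, which would be consistent with a much faster ramp-up at 4K. It does not by itself show where the memory is being held.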
I’m wondering if this behavior could be attributed to how frames are handled or any other aspects of the video processing pipeline in the script. Is there any known issue or specific configuration I should be aware of that might help address this?
Also, how should I think about the optimal size of the video frames I should feed to this OpenVila1.5 13b model?
Any insights or suggestions would be greatly appreciated.
Thank you!