Overheat problem? - Jetson crash with two cameras simultaneously #882
The crash does not happen if I operate the Jetson in 5W mode. So maybe a combination of power supply and overheating...
I've now attached a fan and run the Jetson in 5W mode, so there shouldn't be any heat problems anymore. The power supply is able to provide at least 12 W. All USB cameras are attached to a powered USB hub. The object detection runs at 16 fps, about 5 fps for each camera. However, the entire solution crashes very quickly... That being said, at the moment the Jetson Nano with its GPU cannot compete with an RPI4B with a Coral TPU. That setup is not only performing at > 22 fps with three cams, it is also rock stable :(
I think I found the reason for the crash: I'm now using two networks of the same kind, feeding them separately from each cam instead of just one. That doesn't crash anymore. The fan is still attached (I guess it will not work without it), but the power supply is still the Raspberry Pi plug with 5V 2.5A. Both windows display 24 fps, but I don't believe it. Power mode is back to 10 W. At least it doesn't crash anymore. Hurray...
If it doesn't happen in 5W mode, and if when it does happen the Nano completely shuts off, then it is very likely a power supply issue. I recommend upgrading your power supply to a 5V/4A barrel jack adapter.
@dusty-nv It didn't happen again in 2-cam mode with the changes mentioned in my last post (two nets). But it is constantly happening with 3 cams. I will consider purchasing a better power plug. Thanks
Hi! I bought the "LEICKE 5V 4A" power supply, and it works perfectly.
Yes, I will get mine on Friday. BTW: Is the problem you mentioned with inference and RTSP somehow related to mine here, #885?
Hi! I've run your code on my Jetson with:
And it gives me 24 fps for each window, but that's the network speed. Here's the code I used to see the display frame rate:
I've also modified a bit of your code to use OpenCV and GStreamer instead of the jetson-utils methods. For me it works at 20 fps on one camera and about 6 fps on the second one, detection included. The reason for this choice is that I find jetson-inference very buggy and not so well documented; on the other hand, I'm used to working with OpenCV in my personal projects. It's been only a month that I have a Jetson Nano, so I'm still figuring out how it works :) Btw, here is the code for OpenCV + GStreamer that I used. Please note that I'm also still figuring out how GStreamer works, but I know that it is very capable and NVIDIA DeepStream uses it, so I'm considering learning NVIDIA DeepStream too. That library seems to be the best option for making this kind of stuff on the Jetson, except for the fact that nobody uses it. Bye!
Mhm, I don't think it is somehow related, but I see that you found a solution.
In fact the contributor of this project found it, but he also could not reproduce the issue...
I will provide you my results with 2 cams, maybe with 3, on Friday. But I can at least confirm the 24 or 15 fps claim the network gives. In fact one can see that the real fps is way lower for each separate cam.
On Nano with SSD-Mobilenet-v2 (the 90-class COCO model), you will get around ~24 FPS total for the network. With a higher batch size you could get more FPS, but my code is set up for batch_size=1. The other Jetson devices also get higher FPS, of course. Also, retraining the model with only the classes you want will greatly improve the FPS. Most folks do not need the full 90 classes from the COCO model - that is a lot of classes for a detection model. Instead you could pick and choose, like this part of the tutorial: https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-ssd.md
Thanks for your comment, Dusty. What exactly is this batch size and how could that be increased?
The batch size means how many images are processed at once. Since you have N cameras, in theory you could do a batch size of N. It's not easily changeable in my code though, as it requires the pre/post-processing to be set up for batching too, and most users of my library only need batch=1. The TensorRT samples show batching, but I'm not sure those explicitly do SSD-Mobilenet.
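The benefit described here can be sketched with a toy cost model (all numbers are made up; this is not the jetson-inference API): every network invocation carries a fixed overhead, so batching N camera frames into one invocation pays that overhead once instead of N times.

```python
# Toy illustration of batching (hypothetical numbers, not real benchmarks).
# Each network invocation has a fixed launch/sync overhead; batching N
# frames into one invocation amortizes that overhead across cameras.

CALL_OVERHEAD_MS = 10   # assumed fixed cost per network invocation
PER_FRAME_MS = 30       # assumed compute cost per frame

def cost_unbatched(num_frames):
    # batch_size=1: one invocation per camera frame
    return num_frames * (CALL_OVERHEAD_MS + PER_FRAME_MS)

def cost_batched(num_frames):
    # batch_size=num_frames: a single invocation for all cameras
    return CALL_OVERHEAD_MS + num_frames * PER_FRAME_MS

print(cost_unbatched(3))  # 120 ms for three cameras, one at a time
print(cost_batched(3))    # 100 ms for three cameras in one batch
```

With real hardware the gap is usually larger, since the GPU can also parallelize across the images in a batch.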
Also DeepStream can handle batching automatically and has optimized object detection models. The Transfer Learning Toolkit can be used to prune the models and make them faster, and run them with DeepStream.
Hi dusty, thank you for your time. I've tried to implement DeepStream in my project, but very few people seem to be using it and I didn't find courses or guides that teach how to use it; also, the documentation is a bit confusing to me. I've also had problems using nvidia-docker (on Ubuntu). Surely DeepStream seems VERY interesting and with a lot of potential for my projects, but I found it complex to learn. I'd be happy if you could suggest some courses or good guides that teach how DeepStream works and how it can be implemented.
DeepStream has many users; they just tend to be doing production deployment with multiple camera streams. Check out the DeepStream forums and these Python samples:
https://github.com/NVIDIA-AI-IOT/deepstream_python_apps
Hi Dusty, this sounds interesting. Unfortunately I don't currently have the time to do a retraining. Would you by chance have a retrained model (e.g. persons only) for a quick test and assessment?
Got the 20 W power plug. Same issue. Crashes with 3 USB cams.
When you say it crashes, do you mean that the Nano powers off? Or that the app crashes to the desktop? If the latter, what error/exception is thrown?
I don't have the person-only model, but there is one with DeepStream/TLT.
Yes, the videos freeze for a second, then it all goes down and the entire machine reboots. OK, thanks for checking for the model. Would it be possible to lead me to that DeepStream model? Right now the Jetson is going to lose the race against Coral for performance and stability. I will give it at least another week before I give up, since I think your solution has the better future.
Shutting off entirely means it's likely a hardware problem, either power or USB related. Hopefully the previous kernel log is saved and you can check that - otherwise you can capture the kernel log on another machine via the debug UART while it's running.
This page has the PeopleNet model -
https://ngc.nvidia.com/catalog/models/nvidia:tlt_peoplenet
See the section "Instructions to deploy these models with DeepStream".
I was running the serial console as described here: https://www.jetsonhacks.com/2019/04/19/jetson-nano-serial-console/
The situation is absolutely reproducible. It starts, it runs (sometimes it doesn't even reach the image display state), it freezes, it reboots. The crash dump is in the first few lines of this log; the following is just the reboot. This is the log dumped from I suppose it will remain a mystery... :(
It could be a problem with the USB. I will get 3 identical new USB cams tomorrow; let's see what happens then.
Is there any good explanation for this: I'm running your my-detection.py sample with just one USB camera. While the network reports 24 fps, the display frame rate is only 15. From the visible impression I can confirm these 15 fps. EDIT: The camera gives 60 fps if I drop the inference.
You might want to re-arrange how you are plugging them in, trying a different hub, etc., to see if that is related.
There is other processing outside of the network, such as drawing the bounding box overlays, rendering with OpenGL, etc. In particular I think it's the OpenGL synchronization that is slowing it down - I need to dig into this more. I think if you run it headless (i.e. streaming out via RTP) the overall FPS may be higher.
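A toy calculation (all stage timings are made up) shows why the end-to-end display rate can be 15 fps even when the network alone reports 24: when the stages run sequentially, the pipeline rate is set by the sum of all stage times, not by the inference time alone.

```python
# Why the display FPS is lower than the network-reported FPS.
# Hypothetical per-frame timings for a sequential pipeline:

network_ms = 1000 / 24   # inference alone: ~41.7 ms -> reported "24 FPS"
overlay_ms = 5           # drawing bounding box overlays (assumed)
render_ms = 20           # OpenGL rendering / display sync (assumed)

pipeline_ms = network_ms + overlay_ms + render_ms
display_fps = 1000 / pipeline_ms

print(round(1000 / network_ms))  # 24 - what the network reports
print(round(display_fps))        # 15 - what the display actually shows
```

The same arithmetic explains why dropping a stage (e.g. running headless, without rendering) raises the overall frame rate.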
OK, makes sense. I will create a headless version tomorrow, just monitoring the detections and the fps achieved without drawing. My productive approach in the end is also headless.
BTW, from your experience: would one single USB3 hub be capable of taking the load of 3 cams? Well, the hub of course, but the upstream link too? Anyway, I will get three USB cams of one kind tomorrow and will test again then.
Wait. 640 x 480 selected as options. It starts again with 11 fps, then it goes up to 31, then it drops down again... Man, any explanation for this behaviour? If it runs at 30 fps the latency is near 0. Sometimes a trace "gstreamer message qos => v4lsrc0" appears, but this seems to have no impact... BTW: I noticed this toggling fps already on my MacBook... Strange...
Oh, that is weird. From the
If I try something similar in my app I'm getting this:
global /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_gstreamer.cpp (801) open OpenCV | GStreamer warning: cannot find appsink in manual pipeline
[ WARN:0] global /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_gstreamer.cpp (480) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
or alternatively, if I remove the "name=mysink":
OpenCV(4.1.1) /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_images.cpp:253: error: (-5:Bad argument) CAP_IMAGES: can't find starting number (in the name of file): v4l2src device=/dev/video1 ! image/jpeg, width=640, height=480 ! jpegdec ! video/x-raw ! appsink in function 'icvExtractPattern'
That qos (quality of service) message typically appears when the USB camera or bus is not keeping up with providing the data at the expected rate.
Ok. Thanks. This would explain why the fps is toggling. But it would not explain why I can't use the same pipeline in my app.
Hmm, I'm not sure about cv2.VideoCapture, but I see in your bottom error message you have What if you specify
Since I opened all three devices, all came up with the same error. I just copied the wrong one. Let me test your suggestion.
Hmm, no.
[ WARN:0] global /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_gstreamer.cpp (1757) handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module v4l2src0 reported: Internal data stream error.
[ WARN:0] global /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_gstreamer.cpp (886) open OpenCV | GStreamer warning: unable to start pipeline
[ WARN:0] global /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_gstreamer.cpp (480) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
[ WARN:0] global /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_gstreamer.cpp (1757) handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module v4l2src1 reported: Internal data stream error.
[ WARN:0] global /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_gstreamer.cpp (886) open OpenCV | GStreamer warning: unable to start pipeline
[ WARN:0] global /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_gstreamer.cpp (480) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
[ WARN:0] global /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_gstreamer.cpp (1757) handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module v4l2src2 reported: Internal data stream error.
[ WARN:0] global /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_gstreamer.cpp (886) open OpenCV | GStreamer warning: unable to start pipeline
[ WARN:0] global /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_gstreamer.cpp (480) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
Just these two pipelines are working:
I cannot even find that video/x-raw sink in your code... Magic...
It is from this line: https://github.com/dusty-nv/jetson-utils/blob/833fc7998e34d852672277730a11aeed90024959/camera/gstCamera.cpp#L181
Oh sh... sorry, I was there already :/ I didn't find it with GitHub search since it was in your jetson-utils. However, this seems to work well in C++, not in Python, for whatever reason...
Your pipeline works perfectly from the command line, just not from Python.
OK, I'm coming closer. Not w.r.t. the non-working "video/x-raw" sink - that does not even work with a Logitech C920. But the Logitech also showed these "lost frames", and it turns out the inference is the problem. More specifically: running inference and capturing on the same thread is a very bad idea. The inference slows down the capturing, and this in turn leads to lost JPEG frames on decoding. I think you mentioned that somewhere - split capture and inference and put them on different threads. That's my task for tomorrow. Besides that, my cameras are really shitty. I will return them.
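A minimal sketch of that capture/inference split (the camera is mocked by a counter here, and the "inference" just records frame ids - real code would grab from the camera and run the network where the comments indicate). The point is the size-1 queue: the capture thread never blocks, and when inference is still busy the stale frame is dropped so the camera keeps running at full rate.

```python
import queue
import threading

# Size-1 queue: holds only the newest frame for the inference thread.
frame_q = queue.Queue(maxsize=1)
stop = threading.Event()

def capture():
    frame_id = 0
    while not stop.is_set():
        frame = {"id": frame_id}        # real code: grab a camera frame here
        frame_id += 1
        try:
            frame_q.put_nowait(frame)
        except queue.Full:
            # Inference is busy: discard the stale frame, keep the new one.
            try:
                frame_q.get_nowait()
            except queue.Empty:
                pass
            try:
                frame_q.put_nowait(frame)
            except queue.Full:
                pass

def inference(results):
    while len(results) < 10:
        frame = frame_q.get()           # real code: run the detector here
        results.append(frame["id"])
    stop.set()

results = []
t_cap = threading.Thread(target=capture)
t_inf = threading.Thread(target=inference, args=(results,))
t_cap.start(); t_inf.start()
t_inf.join(); t_cap.join()

print(len(results))                     # 10 frames processed
```

Frame ids in `results` come out in increasing order but with gaps - those gaps are exactly the frames dropped while inference was busy, which is what keeps the capture side from stalling.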
Does your C920 support H264? Mine works smoothly with H264.
@dusty-nv I know you have a Logitech C920. Would it be possible for you to open Python3 on your Jetson, enter these two lines, and tell me the result?
Here I'm getting
And then, please do the same from the console:
This works here - just why the hell? :)
I think that cv2.VideoCapture expects the It is similar to what is shown here under If you can use H.264 instead of MJPEG, that would be preferred in my experience.
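Given the earlier "cannot find appsink in manual pipeline" warning, the pipeline string apparently has to end in an appsink element delivering BGR frames for OpenCV's GStreamer backend. A sketch of building such a string (device path, resolution, and the MJPEG caps are just examples, not from this thread's exact setup):

```python
# Build a GStreamer pipeline string for cv2.VideoCapture (illustrative).
# OpenCV's GStreamer backend pulls frames from an appsink element, so the
# pipeline must end in one; BGR is the pixel format OpenCV expects.

def mjpeg_pipeline(device="/dev/video0", width=640, height=480):
    return (
        f"v4l2src device={device} ! "
        f"image/jpeg, width={width}, height={height} ! "
        "jpegdec ! videoconvert ! "
        "video/x-raw, format=BGR ! "
        "appsink"
    )

pipeline = mjpeg_pipeline("/dev/video1")
print(pipeline)

# Usage (requires OpenCV built with GStreamer support):
#   cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)
```

Without the `videoconvert ! video/x-raw, format=BGR` stage before the appsink, the backend may fail to negotiate caps even when the same pipeline plays fine with gst-launch.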
Hmm. Makes sense. Wondering why H.264 is so much more efficient, though. videoconvert is just that slow...
H.264 is much newer and uses temporal compression (i.e. it takes multiple frames into account via motion vectors). MJPEG is essentially a bunch of standalone JPEG images pasted together that don't take multiple frames into account. The MJPEG hardware codecs also seem more glitchy and less reliable in my experience, whereas H.264 works smoothly. videoconvert is CPU-only, which is why it is slow. In my code I manually use CUDA to convert the colorspace, which is why it is just
Thanks for the elaboration. Very interesting. Well, I tried H.264 (I'm used to it in other contexts) and it works quite well. The thing is that my target cams will not be able to support hardware H.264, so I have to stick with MJPEG. I would like to take the opportunity to ask you: do you think it is OK that the Jetson Nano gets into memory trouble once I try to open three 640x480 cams in raw mode at once? I'm wondering - the box has 4 GB, but I cannot open more than 2. The third runs into a memory allocation error. Then (completely OT): do you know how to control the verbosity of the Jetson logs? Especially these?
I'm forwarding my log level to the Jetson core, but these traces are not to be silenced :)
My understanding of the memory error is that it's not related to the Nano's system memory (which it has plenty of for this), but rather to USB buffer memory allocated in Linux's V4L2 UVC device driver: https://developer.ridgerun.com/wiki/index.php?title=Birds_Eye_View/Getting_the_Code/Building_and_Installation_Guide#Patch_to_support_4_USB_cameras
To change the logging level, you can play around with the logging arguments:
--log-file=FILE output destination file (default is stdout)
--log-level=LEVEL message output threshold, one of the following:
* silent
* error
* warning
* success
* info
* verbose (default)
* debug
--verbose enable verbose logging (same as --log-level=verbose)
--debug enable debug logging (same as --log-level=debug)
I'm setting the log level to silent already, to no avail. Thanks for the link.
Are you using detectnet/detectnet.py or your own app? If you are using your own, you need to pass the system CLI args like it is done here:
Otherwise it won't pick up that you set
I see what the problem was. These traces come from
but the log level has been set later. However, I'm now instantiating the inference engine after the log level is set. This remains, even with "silent":
Don't bother. Thanks for your help!
That one must have snuck by as a printf; I just fixed it in commit 66bae1a. You might want to use
Yepp, works. Thanks. Will close this thread now. Thanks again for your extraordinary help.
Last message: this one... jetson.inference -- detectNet loading network using argv command line params
Ah, that one needed to be downgraded to LogDebug() because it is before See commit 5718df1 for the change.
Hi,
I took this sample code and used it with /dev/video, a Logitech C920. I was using a 5V 2.5A Raspberry Pi power supply via micro USB. No fan.
Generally this worked fine at about 24 fps reported. The heat sink got pretty hot, though. The reported AO temperature after the test was about 68 degrees.
Then I plugged in another USB cam (ESP 1080) and just doubled the code.
This is for sure not optimal, but it worked: I got two windows for the two cams. The reported FPS was still 24, but the movements in the displayed images showed me that the 24 fps were distributed equally, so that each cam came to about 12 fps.
But... it crashed after several minutes. The Jetson switched off. Completely.
I first thought it might be an "undervoltage" problem, but there is no trace of this. In two SSH windows I observed the output of dmesg -wH and journalctl. No problem with the power supply was indicated. Just both windows showed this after a while:
Jan 04 10:21:14 jetson kernel: FAN rising trip_level:1 cur_temp:51000 trip_temps[2]:61000
For me it seems that the Jetson is getting warm and tries to switch on the fan :)
In a third SSH window I observed the output of tegrastats. This is what appeared until the sudden death.
I can't see any remarkable overheating either...
I'm absolutely not sure what causes the (reproducible) crashes. I will use a fan for the next attempts, though.
Any further ideas on how to approach the problem?