Opendatacam fails to start if camera feed not available on boot #368

Closed
vsaw opened this issue Jan 28, 2021 · 20 comments · Fixed by #513

@vsaw
Collaborator

vsaw commented Jan 28, 2021

Currently, Opendatacam only tries a few times to connect to Darknet. If that fails, no further attempt is made. When the camera feed is flaky (e.g. connected via network), this means that if the camera is not available during boot, Opendatacam will not start up successfully.

I propose changing the code to retry connecting to Darknet indefinitely, to make startup more robust against camera outages.

@tdurand
Member

tdurand commented Jan 29, 2021

Yes, that would be a good thing!

@AlexHolly

Steps:

  1. Start Docker Container
  2. Connect IP Camera

Log:

[yolo] params: iou loss: ciou (4), iou_norm: 0.07, cls_norm: 1.00, scale_x_y: 1.05
Total BFLOPS 60.137 
avg_outputs = 500162 
 Allocate additional workspace_size = 26.22 MB 
Demo
net.optimized_memory = 0 
mini_batch = 1, batch = 8, time_steps = 1, train = 0 
nms_kind: greedynms (1), beta = 0.600000 
nms_kind: greedynms (1), beta = 0.600000 
nms_kind: greedynms (1), beta = 0.600000 
Loading weights from yolov4.weights...Done! Loaded 162 layers from weights-file 
[tcp @ 0x5606272e03a0] Connection to tcp://192.168.1.1:554?timeout=0 failed: No route to host

(darknet:45): GStreamer-CRITICAL **: 11:04:05.745: 
Trying to dispose element pipeline0, but it is in PAUSED instead of the NULL state.
You need to explicitly set elements to the NULL state before
dropping the final reference, to allow them to clean up.
This problem may also be caused by a refcounting bug in the
application or some element.


 seen 64, trained: 32032 K-images (500 Kilo-batches_64) 
video file: rtsp://secret:secret@192.168.1.1
[ WARN:0] global /var/local/git/opencv/modules/videoio/src/cap_gstreamer.cpp (886) open OpenCV | GStreamer warning: unable to start pipeline

(darknet:45): GStreamer-CRITICAL **: 11:04:05.745: 
Trying to dispose element videoconvert0, but it is in PLAYING instead of the NULL state.
You need to explicitly set elements to the NULL state before
dropping the final reference, to allow them to clean up.
This problem may also be caused by a refcounting bug in the
application or some element.

(darknet:45): GStreamer-CRITICAL **: 11:04:05.746: 
Trying to dispose element appsink0, but it is in READY instead of the NULL state.
You need to explicitly set elements to the NULL state before
dropping the final reference, to allow them to clean up.
This problem may also be caused by a refcounting bug in the
application or some element.


(darknet:45): GStreamer-CRITICAL **: 11:04:05.746: gst_element_post_message: assertion 'GST_IS_ELEMENT (element)' failed
Something went wrong: connect ECONNREFUSED 172.19.0.4:8070
Too much retries, YOLO took more than 3 min to start, likely an error
0


// Manually open http://localhost:8080/ in browser

already started
Something went wrong: connect ECONNREFUSED 172.19.0.4:8070
Too much retries, YOLO took more than 3 min to start, likely an error
0

// Manually open http://localhost:8080/ in browser

already started
Something went wrong: connect ECONNREFUSED 172.19.0.4:8070
Too much retries, YOLO took more than 3 min to start, likely an error
0

@munsterlander
Contributor

munsterlander commented Nov 7, 2021

@vsaw @tdurand So dissecting this, what I noticed is, in Opendatacam.js, in the listenToYOLO function, if self.HTTPRequestListeningToYOLO encounters an error, then Opendatacam.isListeningToYOLO never gets reset to false. As such, in the error function, it just returns and that's it. Meaning, this block of code is never executed:

      setTimeout(() => {
        Logger.log('Retry connect to YOLO');
        self.listenToYOLO(Opendatacam.yolo, urlData);
        Opendatacam.HTTPRequestListeningToYOLOMaxRetries--;
      }, HTTP_REQUEST_LISTEN_TO_YOLO_RETRY_DELAY_MS);

So, I did the following and it works-ish. The bounding boxes now restart and move on the screen, but the video is still frozen. I am looking for the call that gets it moving again.

self.HTTPRequestListeningToYOLO.on('error', (e) => {
  if (Opendatacam.isListeningToYOLO) Opendatacam.isListeningToYOLO = false;
  res.emit('error', e);
});

Edit: I have been going at this all day. It's starting to get blurry now. A call is failing to be made - I think - or there is a timing issue for an update with the WebcamStream or AppStateManagement or something. I will pick this up tomorrow. One thing to note: with the above fix, the rest of the calls get made, the recording is started and persisted in the DB, and the recording button updates its label to Stop Recording (which, when pressed, stops the recording). So I just need to figure out how to get the video restarted, because everything else is working.

@vsaw
Collaborator Author

vsaw commented Nov 7, 2021

This is great news! Do you think it would be possible to have ODC try indefinitely to connect to Darknet?

This piece of code only handles the detections. The video is handled by the MjpegProxy.js. Could be that this also needs fixing.

What happens when you refresh your browser while the recording is running? Does this show the video?

@munsterlander
Contributor

munsterlander commented Nov 7, 2021

Well, I added a ton of debug statements, and what I observed was that ODC never waits the full 3 minutes like it's supposed to. That missing statement short-circuits the process if there was an error. On startup, an error was never generated, so it engages the setTimeout like it's supposed to, and the longest I had to wait was about 45 seconds. In my opinion, it should not try infinitely, but the retry time and max time should be moved to config, so users can set them to whatever they want (I will give that a try real quick). Infinite loops just seem wrong.
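To make the "retry time and max time in config" idea concrete, here is a minimal sketch of a retry loop whose delay and retry budget come from a config object. The names `retryConfig`, `retryDelayMs`, `maxRetries` and `connectWithRetry` are assumptions made for illustration and are not the option names used in ODC or in the PR. Setting `maxRetries` to `Infinity` gives the "try indefinitely" behaviour without a separate code path.

```javascript
// Hypothetical config values; the real option names in ODC may differ.
const retryConfig = {
  retryDelayMs: 30,     // pause between attempts
  maxRetries: Infinity, // never give up; use a finite number to time out
};

// Calls connect(done) until it succeeds or the retry budget is used up.
// `schedule` defaults to setTimeout and is injectable so the loop can be tested.
function connectWithRetry(connect, { retryDelayMs, maxRetries }, onGiveUp, schedule = setTimeout) {
  let attempts = 0;

  function attempt() {
    attempts += 1;
    connect((err) => {
      if (!err) return;                // connected, stop retrying
      if (attempts > maxRetries) {     // initial try plus maxRetries retries all failed
        onGiveUp(err, attempts);
        return;
      }
      schedule(attempt, retryDelayMs); // try again later
    });
  }

  attempt();
}
```

A caller would pass its own connect function, for example `connectWithRetry(connectToDarknet, retryConfig, logFailure)`, where `connectToDarknet` and `logFailure` are placeholders for whatever the application uses.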

Edit: Ok, moving to the config works fine as expected. If you want to go that route, I will make a quick PR. As far as refreshing, it shows that it is recording, the bounding boxes are moving, but now you have the grey screen (no video at all) (EDIT4: I NEEDED TO WAIT LONGER) and I saw this in the console:

JSON_sender: new client 24
already started
Already listening
SSE: All clients disconnected, cannot send update
SSE: Sending update to clients
 JSON-stream sent. 

I had been looking at MjpegProxy.js, but I haven't quite figured out the Redux state manager, how it's binding the URL with the start date, and when that all happens. My initial reaction is that we need to reset that call or remount the component.

Edit2: Ok, so I just did a hack to see if I could force something. Looking at MainPage.js, I noticed that when isListeningToYOLO=false (which we set above), InitializingView is supposed to be shown, and it is not. If that were working, the WebcamStream component would be reset. So I think there is an issue where, when the value gets set, it's not updating Redux. I have minimal experience with React and Redux though, so I may be completely wrong on state management and lifecycle here, but that is where my head is at right now.

Edit3: Alright! Calling http://192.168.1.15:8080/webcam/stream?date=1636300519736 directly resulted in the stream appearing in time with the bounding boxes. I then compared with master: it shows the InitializingView when it's reloading, but development does not. So I think the issue is in MainPage.js - meaning that this.props.isListeningToYOLO is never changing and as such the views aren't switching.

munsterlander added a commit to munsterlander/opendatacam that referenced this issue Nov 7, 2021
Bases on the discussion here:  opendatacam#368, I added the calls to the new config options.
@vsaw
Collaborator Author

vsaw commented Nov 7, 2021

Currently, if ODC fails to connect to Darknet, users would need to restart the ODC Docker container. I believe this is not acceptable, which is why I vote to try indefinitely.

However, I would be okay for the process to time out, if users can restart the ODC to Darknet connection without having to restart the ODC Docker container.

@munsterlander
Contributor

munsterlander commented Nov 7, 2021

Well, with the PR above, you could set the max timeout to something like 86400, which would be 24 hours. Surely that is long enough, right? Also, fundamentally, if we just remove the check for isListeningToYOLO it would be an infinite loop, so we could do that.

@munsterlander
Contributor

munsterlander commented Nov 7, 2021

@vsaw So just to clarify all my edits above and get your take, that single line of code that I mentioned above fixes the camera issue that I think is what a lot of the issues around the loss of camera may be about - if they are using dev branch. If you wait and do a refresh, everything comes back to life like it is supposed to. In master, when you hit record on a video, it goes to the Initializing screen again. That should be set by this.props.isListeningToYOLO. When I update that value in Opendatacam.js, how does it get updated in the Redux store?

Sorry for all the mentions as well, we are just working in a bunch of different threads and I want to make sure there is awareness, as I am somewhat stuck right now other than waiting about 5-10 seconds and manually hitting refresh to bring everything back.

@vsaw
Collaborator Author

vsaw commented Nov 7, 2021

> @vsaw So just to clarify all my edits above and get your take, that single line of code that I mentioned above fixes the camera issue that I think is what a lot of the issues around the loss of camera may be about - if they are using dev branch.

I'm not sure I know what you are referring to. All I've seen is #492 where you make the intervals configurable via config.js

> If you wait and do a refresh, everything comes back to life like it is supposed to.

You need to understand that Darknet is consumed in two different places. Opendatacam.js takes care of the JSON stream for the counting logic. MjpegProxy.js forwards the video stream to the browser. The error in Opendatacam.js caused the "too many retries" error and stopped ODC from connecting to Darknet, which got the UI stuck in the initializing view.

When you see the UI controls but no video, this is an error in MjpegProxy.js, which I have not yet had the chance to properly investigate. This however is only a "cosmetic" error, as ODC only needs the JSON stream to perform the counting logic.

> In master, when you hit record on a video, it goes to the Initializing screen again. That should be set by this.props.isListeningToYOLO. When I update that value in Opendatacam.js, how does it get updated in the Redux store?

Please note that starting a recording behaves differently for files and "live video" sources. Files trigger a restart of Darknet and put ODC back into the initializing view until ODC can connect to Darknet again. We do this to process the file from the beginning instead of wherever it was when you pressed record. For "live video" this is obviously not possible, so we just start the counting immediately without triggering a new "initializing" view. This happens in v3.0.2 as well as in the development branch.

> Sorry for all the mentions as well, we are just working in a bunch of different threads and I want to make sure there is awareness, as I am somewhat stuck right now other than waiting about 5-10 seconds and manually hitting refresh to bring everything back.

As mentioned before I believe this is an issue with MjpegProxy.js and not caused by OpenDataCam.js or the React/Redux states.

@munsterlander
Contributor

munsterlander commented Nov 7, 2021

Ok, I will keep tearing apart MjpegProxy.js. It just seems odd that when I set isListeningToYOLO to false in Opendatacam.js, the loader, which should say "Restarting to process video file", is not being displayed.

Maybe MjpegProxy is throwing a blocking error or something. Everything else works like it's supposed to (the button changes to Stop Recording and the bounding boxes move), and it does so within the time that the loading screen, if it were displayed, would conceal the frozen frame.

munsterlander added a commit to munsterlander/opendatacam that referenced this issue Nov 7, 2021
As documented here: opendatacam#368 (comment) this code resets the value to false so the onError routine works correctly.  The video is still frozen and I am working on resetting MjpegProxy.js or the component.
@munsterlander
Contributor

munsterlander commented Nov 7, 2021

@vsaw

> Files trigger a restart of Darknet and put ODC back into the initializing view until ODC can connect to Darknet again.

I think what I have been trying to say is, that ^ is not happening. It never goes to the initializing view when you hit record. If it did, I think when it switches back to live view, that would be enough to get the video back. Again, I could be completely wrong, but I understand everything you are saying. What I would like to add is the mjpeg proxy url works perfectly, so I don't think there is an issue with that, but again, I could be wrong.

@tdurand
Member

tdurand commented Nov 8, 2021

> When I update that value in Opendatacam.js, how does it get updated in the Redux store?

Just to answer how the Redux store works: the state of the UI is updated each time the server pushes an update via SSE (server-sent events): https://opendatacam.github.io/opendatacam/apidoc/#api-Tracker-Data

The update is pushed here from what I remember: https://github.com/opendatacam/opendatacam/blob/development/server/Opendatacam.js#L368 , which should happen on each frame that we get from YOLO

And then received and merged on the front-end side here: https://github.com/opendatacam/opendatacam/blob/master/statemanagement/app/AppStateManagement.js#L219
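The push-and-merge flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in Opendatacam.js or AppStateManagement.js: `formatSseFrame` and `mergeSseFrame` are hypothetical helper names, and the merge is a plain shallow merge standing in for whatever the real reducer does. The only protocol fact relied on is that an SSE message is sent as a `data:` line followed by a blank line.

```javascript
// Server side: wrap the current app state in one SSE frame ("data: ...\n\n").
function formatSseFrame(appState) {
  return `data: ${JSON.stringify(appState)}\n\n`;
}

// Client side: parse a frame and shallow-merge it into the UI state,
// similar in spirit to a Redux reducer handling an "update" action.
function mergeSseFrame(uiState, frame) {
  const json = frame.replace(/^data: /, '').trim();
  return { ...uiState, ...JSON.parse(json) };
}
```

The point of the sketch is that a server-side change such as flipping isListeningToYOLO only reaches the UI once a frame carrying the new value is actually sent; changing the value without pushing an update leaves the front end on its old state.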

You can use an extension on your browser to debug the values in the redux store : https://github.com/zalmoxisus/redux-devtools-extension

Regarding the webcam stream, it is handled here: https://github.com/opendatacam/opendatacam/blob/development/components/shared/WebcamStream.js#L36 , and effectively as there is the getTime() function in the componentDidMount() react should re-render when the isListeningToYOLO is toggled true and false (as the url is not the same, react is forced to re-render) .. I remember hacking a long time with this to get something a bit reliable..
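As a small illustration of the timestamp trick mentioned above, the stream URL can carry the current time as a query parameter so that each mount produces a different URL. `buildStreamUrl` is a hypothetical helper written for this sketch and is not taken from WebcamStream.js; only the `/webcam/stream?date=...` shape comes from the URL quoted earlier in this thread.

```javascript
// Builds a stream URL that differs on every call unless `now` is fixed,
// so an <img> using it as src is given a new source each time.
function buildStreamUrl(baseUrl, now = Date.now()) {
  return `${baseUrl}/webcam/stream?date=${now}`;
}
```

For example, `buildStreamUrl('http://192.168.1.15:8080', 1636300519736)` yields the same URL that was opened manually in the browser above.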

But as the front end does not hook directly into the darknet mjpeg stream but goes via the proxy on the server side.. maybe there is some stuff to investigate there

@munsterlander
Contributor

munsterlander commented Nov 8, 2021

@tdurand Thank you so much! I suspected this was the flow, but I wasn't exactly certain, so you have definitely filled in some key events. With the addition of the one line of code above, I think it solves the majority of these camera / feed issues. The current issue, though, is that when isListeningToYOLO is changed to false, it should trigger the Initializing View to be shown, which it does not. Based on what you said above, I believe the change I made is not propagating correctly outside of the function. I am going to run some tests watching /tracker/sse to see what happens. I will get back to everyone later today with my results - just trying to find time amongst all of my other work.

Edit: I am so close to having this resolved. Where I set isListeningToYOLO, a call to this.sendUpdateToClients(); was also needed as the front-end was never being notified of the change. Now with those two lines in place, the initializing view is shown perfectly and switches back to live view. The only issue I have, is now the screen is just grey (no video shown) with the UI controls present and the bounding boxes moving. Just need to jumpstart that video and this will be solved.

@munsterlander
Contributor

munsterlander commented Nov 9, 2021

@vsaw @tdurand I finally figured it out and have it fixed. I do have one question though, but first the code changes:

In Opendatacam.js in listenToYOLO() you need to add/replace:

self.HTTPRequestListeningToYOLO.on('error', (e) => {
  if (Opendatacam.isListeningToYOLO) {
    Opendatacam.isListeningToYOLO = false;
    this.sendUpdateToClients();
  }
  res.emit('error', e);
});

In server.js in express.get('/webcam/stream', (req, res) => { it needs to be reverted to what is in master branch:

      const urlData = getURLData(req);
      // Proxy MJPEG stream from darknet to avoid freezing issues
      return new MjpegProxy(`http://${urlData.address}:${config.PORTS.darknet_mjpeg_stream}`).proxyRequest(req, res);

Which brings me to my question. Why in development was it changed to see if an mjpegProxy already existed? What impact to the application is there based on these changes?

The only thing I can see that is different - and I just may have never noticed this before - is in the terminal that is running the application. Prior to hitting record, it's a steady stream of output; after hitting record and the restart, the FPS:18.0 AVG_FPS: 18.0 output seems to come in chunks, BUT everything works as expected. Items are tracked, areas work, and things get persisted to the DB, so it may be ok and just the way it is.

Thoughts on this approach?

@tdurand
Member

tdurand commented Nov 9, 2021

From what I remember, the mjpeg stream was changed to be a single instance so that several clients could use OpenDataCam without the stream freezing on one of them? The changes come from that PR: #304

When you remove this, it will actually re-init the mjpeg proxy each time we do a request to /webcam/stream, and that's maybe why it fixes the problem you had... but this removes the ability to have several clients (several open tabs, for example) connected to OpenDataCam.

From what I understood the problem here is that darknet isn't yet initialized but we already init the mjpeg proxy and this fails, and then we always use that failed instance... we need some logic to be able to reset it no ? something like (see comment below)

self.HTTPRequestListeningToYOLO.on('error', (e) => {
  if (Opendatacam.isListeningToYOLO) {
    Opendatacam.isListeningToYOLO = false;
    this.sendUpdateToClients();
    // HERE reset mjpeg proxy to null so it will create a new instance in the next call to /webcam/stream
  }
  res.emit('error', e);
});
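One way to picture the reset described above, in isolation from Express and Darknet, is a small holder that keeps a single shared proxy and discards it while the upstream is known to be down. This is a sketch under assumptions: `createProxyHolder` and its `get(isUpstreamUp)` method are invented for illustration, and `createProxy` stands in for something like `() => new MjpegProxy(url)`.

```javascript
// createProxy is a hypothetical factory, e.g. () => new MjpegProxy(url).
function createProxyHolder(createProxy) {
  let proxy = null;
  return {
    // Returns the shared proxy, rebuilding it while the upstream is down.
    get(isUpstreamUp) {
      if (!isUpstreamUp) proxy = null;          // drop a possibly broken instance
      if (proxy === null) proxy = createProxy(); // lazily create a fresh one
      return proxy;
    },
  };
}
```

While the upstream is up, every caller receives the same instance, which keeps the multi-client behaviour; an instance created before Darknet was ready is replaced the next time a request arrives during or after an outage.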

@munsterlander
Contributor

munsterlander commented Nov 9, 2021

Ok, so mjpgProxy was not accessible due to scope, but I was able to set it this way, which keeps multiple client connections available:

    express.get('/webcam/stream', (req, res) => {
      // Proxy MJPEG stream from darknet to avoid freezing issues
      if (!Opendatacam.isListeningToYOLO && mjpgProxy !== null) mjpgProxy = null;
      const urlData = getURLData(req);
      if (mjpgProxy == null) {
        mjpgProxy = new MjpegProxy(`http://${urlData.address}:${config.PORTS.darknet_mjpeg_stream}`);
      }
      return mjpgProxy.proxyRequest(req, res);
    });

This has been tested and it works. If acceptable, I would like to update this PR: #492 because that adds user configurable timeouts for the restarting of YOLO so you can do infinite if you wanted. I will be adding another PR for some minor spelling and grammar corrections in the UI if acceptable as well.

@tdurand
Member

tdurand commented Nov 9, 2021

Sounds like a good fix, good job!!

@vsaw
Collaborator Author

vsaw commented Nov 9, 2021

Yes! Thanks for finding the cause! I'll comment on your PR regarding the fix, as I have a few suggestions.

@nagajavisetty

Hi, I am working on opendatacam but it is not detecting the objects. Does anyone know about this issue?

@vsaw vsaw linked a pull request Jan 25, 2022 that will close this issue
@vsaw
Collaborator Author

vsaw commented Jan 25, 2022

Closed via #513
