
Reworking of App #370

Open
SeanKetring opened this issue Jun 28, 2015 · 14 comments

@SeanKetring
Contributor

It was discussed yesterday that we could benefit from having an app which could control the bot and also display a feed from a camera, as well as the environment constructed from the lidar data.

Currently we have the framework of an app from previous years' robots. The task is to take and modify this app to display this new information.

This task is included as an end-of-summer milestone goal.

@SeanKetring SeanKetring modified the milestone: End of summer. Jun 28, 2015
@kvijay1995 kvijay1995 self-assigned this Jul 1, 2015
@kvijay1995
Contributor

I'd like to take on the task of routing the live camera feed to the app.

@kvijay1995
Contributor

An update on this issue: I have successfully streamed video from my laptop's webcam (using Python and OpenCV) to an Android app over the internet, and I believe this can be replicated with the BBB running Python code to stream the video to the Android app.

However, this is still in progress, since there are various smaller issues that need to be resolved, the most important of them being the delay in displaying the video on the app (about a second).

If anyone's done something like this before, I'd appreciate some ideas on how to minimize this delay. I'll refrain from spilling all the technical details here for the sake of conciseness.

@dfarrell07
Member

If anyone's done something like this before, I'd appreciate some ideas on how to minimize this delay. I'll refrain from spilling all the technical details here for the sake of conciseness.

I've done a lot of performance work generally, but nothing specific to video/streaming/Android/etc.

Again, just generally speaking, the first step is to quantitatively describe the sources of the delay. Which steps of the stack are taking a long time? Which resource are they short on (CPU, RAM, HDD/SSD backing store, network throughput, network latency, ...)? Fix the lowest-hanging fruit and re-measure.

Hope that helps. If you give some of the relevant tech details, I might be able to help more.
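As a starting point, per-stage measurement could look roughly like this — a minimal sketch, where the stage names are placeholders and the sleeps stand in for the real capture/encode/send calls:

```python
import time
from contextlib import contextmanager

stats = {}

@contextmanager
def timed(label):
    """Accumulate wall-clock seconds spent in each labeled stage."""
    start = time.time()
    yield
    stats[label] = stats.get(label, 0.0) + (time.time() - start)

# In the real loop these would wrap cap.read(), cv2.imencode(), and
# socket.sendall(); the sleeps here just stand in for those calls.
with timed("capture"):
    time.sleep(0.010)
with timed("encode"):
    time.sleep(0.005)
with timed("send"):
    time.sleep(0.020)

for label, seconds in sorted(stats.items(), key=lambda kv: -kv[1]):
    print("%-8s %6.1f ms" % (label, seconds * 1000.0))
```

Sorting by accumulated time makes the biggest contributor (the lowest-hanging fruit) obvious after each run.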

@AhmedSamara
Member

I'll refrain from spilling all the technical details here for the sake of conciseness.

This is exactly the right place to spill all of the technical details :)

@kvijay1995
Contributor

@dfarrell07 That does help a lot. I'm going to run some tests later at different points in the stack to see which takes the most time.

@AhmedSamara haha fair enough:

So the stack goes like this:

Server side:

  1. Python grabs frames from the laptop's webcam using OpenCV
  2. Compresses the frames to JPEG format
  3. Sets up a TCP socket and waits for a client
  4. Once a client has connected, sends the size of the image in bytes
  5. Waits for client confirmation
  6. Once the client confirms, sends the image data in bytes
  7. Waits for client confirmation, then loops back to step 4

Client side:

  1. Sets up basic stuff
  2. Creates a TCP socket and attempts to connect to the server
  3. Upon connection and receiving the image size, sends confirmation
  4. Reads all of the bytes from the socket's input stream
  5. Sends confirmation to the server
  6. Gets the reference of the ImageView object (ImageView is an Android class to display pictures)
  7. Constructs a bitmap from the data received and sets the ImageView reference to it
  8. Loops back to step 3

I hope that this addresses most of the technical details.
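In code, the server loop above looks roughly like this — a minimal sketch, assuming OpenCV (cv2) and a webcam at index 0; the port number and the 4-byte size header are illustrative choices, not the exact format I'm using:

```python
import socket
import struct
import cv2

cap = cv2.VideoCapture(0)                       # step 1: grab frames via OpenCV
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 5000))
server.listen(1)                                # step 3: wait for a client
conn, addr = server.accept()

try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        ok, jpeg = cv2.imencode(".jpg", frame)  # step 2: JPEG-compress the frame
        if not ok:
            continue
        data = jpeg.tobytes()
        conn.sendall(struct.pack(">I", len(data)))  # step 4: send size in bytes
        conn.recv(1)                                # step 5: wait for confirmation
        conn.sendall(data)                          # step 6: send the image data
        conn.recv(1)                                # step 7: wait, loop to step 4
finally:
    conn.close()
    server.close()
    cap.release()
```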

Now regarding the delay in the stream, I have two suspicions as to what could cause it:

  1. I'm not pausing anywhere when I'm reading the images and sending them over the network. Maybe this is causing overhead on the client side, with the thread trying to update the ImageView too rapidly (see the throttling sketch below).

  2. ImageView may not be the conventional way to stream video, as it's more suited for still images. Unfortunately, Android does not support M-JPEG as a video format, so we might need to switch over to H.264 or something like that.
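For suspicion 1, the simplest thing to try is probably just pacing the send loop — a rough sketch, with 10 fps as an arbitrary target:

```python
import time

TARGET_FPS = 10
frame_interval = 1.0 / TARGET_FPS

last = time.time()
for _ in range(100):                 # bounded here just for demonstration
    # ... grab, encode, and send one frame here ...
    elapsed = time.time() - last
    if elapsed < frame_interval:
        time.sleep(frame_interval - elapsed)   # pace the loop to ~10 fps
    last = time.time()
```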

@AhmedSamara
Member

Is the delay better on different wifi networks?

As I understand it, the delay would probably happen at client step 4: the "receive" step would last for as long as is required to send the entire message each time.

I don't know if the delay is necessarily a thing we can solve in code (at least not without adding an order of magnitude of complexity). IP cameras are a thing, and they're specifically made for people with high real-time requirements for video surveillance. We'd have to recreate whatever they do to actually get zero delay.

Could you go ahead and post your code? (It doesn't matter if it's not done; in-progress projects are perfectly appropriate for GitHub.)

@PaladinEng
Contributor

If you are not committed to the solution you are already using: my team working on the Firefighting Drone Challenge was able to stream video from a webcam, through the BBB, over wi-fi, last semester. We used GStreamer, and were able to tweak it down to only a ~300 ms delay. That took about 40 hours to get up and running. Derek Molloy also lays out another way to do it in his book Exploring BeagleBone.

I didn't work on that part personally, but if you are interested in discussing it, I might be able to bring my teammate in next week. We also did a write-up for the class.
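For anyone who wants to experiment before we get the details, a pipeline along these lines can be prototyped with a single gst-launch command — a sketch only, assuming GStreamer 1.x and a V4L2 webcam at /dev/video0; this is not necessarily the exact pipeline our team used:

```sh
# Serve M-JPEG frames from the webcam over TCP; caps and port are arbitrary.
gst-launch-1.0 v4l2src device=/dev/video0 \
  ! video/x-raw,width=640,height=480,framerate=15/1 \
  ! videoconvert ! jpegenc quality=50 \
  ! multipartmux ! tcpserversink host=0.0.0.0 port=5000
```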

@dfarrell07
Member

@kvijay1995 - Super quickly, it sounds like there are quite a few RTTs (round-trip times) per image frame. One RTT would typically be ~10-20 ms at best (just latency, not counting the throughput delay from the large image).

@kvijay1995
Contributor

@AhmedSamara I haven't tried it on different wifi networks, so I wouldn't know if that's the reason. I'll look into IP cams and try to replicate their methodology. Ahmed, the code is posted at https://github.com/kvijay1995/Video-Client and https://github.com/kvijay1995/Video-Server

@PaladinEng I think this is the right time to be open to ideas, so that would be great. I looked into GStreamer, and it looks like both Android and Linux are supported, so that's a good sign.

Do you happen to know which video codec you guys used (H.264, MJPEG, etc.)? I believe Derek Molloy uses v4l2 (Video4Linux2) in conjunction with avconv to stream video. It would be great if I could talk to your teammate; the only problem is I won't be in for another two weeks, but I'm free anytime after that. Would it be possible to meet him then? In the meantime I can do some more research about this.

@dfarrell07 - Yes that's true; I'll try to eliminate those acknowledgement messages from the client and see if that helps.

Thank you guys for your input! :)

@kennychuang

I am wondering what resolution and framerate you are sending the video at? 1080p/720p at 30 fps is taxing. The question to ask then is: do we actually need this fidelity, or can we use a lower quality? Changing these might help a bit.

One possible way of lowering the latency is to reduce the payload of each packet. Sending big packets may be slower than sending many smaller packets. This can be done by fragmenting/splitting each message into chunks of a certain max size.

A way of removing step four, as pointed out by Ahmed, could be to merge the size of the JPEG and the data together, or just send them without the confirmation, since ordering in TCP is maintained (assuming the same connection). I think that the confirmation in step seven can also be discarded safely, unless the confirmations are there for other precautions.
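On the resolution/quality point, both knobs are available on the capture side — a minimal sketch, assuming the Python/OpenCV server (the property names are per recent cv2 builds, and the values are arbitrary starting points):

```python
import cv2

cap = cv2.VideoCapture(0)
# Ask the camera for a smaller frame (320x240 instead of 640x480).
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 320)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 240)

ok, frame = cap.read()
if ok:
    # Lower the JPEG quality (OpenCV's default is 95) to shrink each frame.
    ok, jpeg = cv2.imencode(".jpg", frame, [int(cv2.IMWRITE_JPEG_QUALITY), 50])
    if ok:
        print("encoded frame: %d bytes" % len(jpeg))
cap.release()
```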

@kvijay1995
Contributor

@kennychuang Right now the resolution I'm sending over is 480p (640×480), and the framerate is not being controlled yet. I am planning to control the frame rate as we move forward, however.

I haven't looked into fragmenting the packets yet, but for reference, each image I'm sending right now is no more than 90,000 bytes. I'll try fragmenting too to see if it helps.

I've already removed all of the confirmation messages as suggested by you, Ahmed, and Daniel. I'll commit the revised code later tonight. Thanks for your input, Kenny.

Here's the updated stack if anyone's interested (a sketch of the framing follows the lists):
Server side:

  1. Sets up a TCP socket and waits until a client connects before moving on
  2. (Once the client has connected) Grabs frames from the laptop's webcam using OpenCV
  3. Compresses the frames to JPEG format
  4. Sends the size of the image in bytes, along with a linefeed as a delimiter, and then sends the image
  5. Loops back to step 2 and repeats until the client disconnects

Client side:

  1. Sets up basic stuff
  2. Creates a TCP socket and attempts to connect to the server
  3. (Upon connection) Receives the image size, linefeed, and full image
  4. Gets the reference of the ImageView object (ImageView is an Android class to display pictures)
  5. Constructs a bitmap from the data received and sets the ImageView reference to it
  6. Loops back to step 3
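The framing itself, in sketch form — the helper names are made up, and the client side is shown in Python for brevity even though the app does it in Java:

```python
# Server side: size + "\n" delimiter + JPEG bytes, with no confirmations.
def send_frame(conn, jpeg_bytes):
    conn.sendall(str(len(jpeg_bytes)).encode("ascii") + b"\n")
    conn.sendall(jpeg_bytes)

# Client side, e.g. with f = client_socket.makefile("rb"):
def recv_frame(f):
    size = int(f.readline().strip())   # read the size up to the linefeed
    data = f.read(size)                # then exactly `size` bytes of image
    if len(data) != size:
        raise IOError("connection closed mid-frame")
    return data
```

Since TCP preserves byte order within a connection, the linefeed delimiter plus an exact-length read is enough to recover frame boundaries without any acknowledgements.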

@SeanKetring
Contributor Author

Can someone post an update for this? It's still an active and ongoing issue in real life, but the GitHub issue is out of date.

@mynameis7 can you describe the issue you were seeing with the app? It looked really good the other night, but I remember it couldn't collect data?

@kvijay1995 are you still having an issue with delay in the video processing?

@kvijay1995
Contributor

Yeah, there's still some delay in the video processing. From my talk with Alwyn, we came to the conclusion that there will always be some delay here, since the BeagleBone hardware is not designed for real-time video processing. We can certainly optimize the video streaming process, but the question is how relevant or useful this would be for our end goals, since it might turn out to be quite time-consuming.

@dfarrell07
Member

I'll try fragmenting too to see if it helps.

I'm not sure I'd recommend that route. Packet-size optimization for perf is a very complex problem; without quite a bit more background/study, I think it's a poor use of time.

we came to the conclusion that there will always be some delay

+1, it's about figuring out which properties you need (latency, jitter, throughput) and how optimized they need to be.

the question is how relevant or useful this would be for our end goals, since it might turn out to be quite time-consuming

Very important question to keep in mind; it does smell a bit like a time sink/rabbit hole.
