Support for v4l2loopback #2232

martinellimarco · 2021-04-03T14:52:54Z

This pull request opens a discussion about v4l2loopback support, which would also make it possible to use scrcpy with OBS on Linux, as many have requested.

The code works and has been tested with a few different phones and computers, but further testing is more than welcome.

How to test

To test it, first load the kernel module with modprobe v4l2loopback, then run scrcpy --v4l2sink /dev/video0 -N and open the loopback device with another program such as VLC, ffplay or OBS (all three were tested).

The video stream can simultaneously be displayed in the scrcpy window, recorded with --record and sent to v4l2loopback with --v4l2sink.

It has not been tested without SCRCPY_LAVF_HAS_NEW_CODEC_PARAMS_API so there may be problems there.

How it works

All the magic is inside v4l2sink.c, which is heavily based on decoder.c and recorder.c.

In practice, it uses v4l2enc from libavdevice as the muxer, in the same way that recorder.c uses mp4 or matroska.

The key difference here is that v4l2loopback and most v4l2 clients expect a raw video frame, not an H.264-encoded frame, so like decoder.c it decodes the AVPacket to an AVFrame that is then re-encoded into an AVPacket with AV_CODEC_ID_RAWVIDEO.
The encoded packet is sent to v4l2enc and streamed to /dev/video0.

The code in v4l2sink.c and recorder.c is very similar, and one might think the two could be merged, but there are subtle differences here and there, and for many reasons I prefer to keep them distinct.

RFC on known limitation

There is a limitation at the moment: it doesn't handle screen rotation. This seems to be a limit of v4l2loopback, where you can't change the resolution as long as anyone is using the device.

In practice, if you start with the phone in portrait, it will work only in portrait; if you start in landscape, it will work only in landscape. To change orientation, you have to close both scrcpy and the receiving application, rotate the phone and restart. Closing only one of them doesn't work.

Unless someone else comes up with an idea to work around this, I see two ways of handling it.

Idea 1: scrcpy can rotate the frame back, so that at least the image will be visible and one can rotate it in OBS or other programs without any problem.

Idea 2: scrcpy can copy the frame centered in a square texture. E.g., suppose that the frame is 720x1280: scrcpy will send to v4l2loopback a 1280x1280 texture with the frame centered and correctly rotated.
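To make Idea 2 concrete, here is a minimal sketch of the geometry involved. The type and function names (frame_size, square_layout, center_in_square) are hypothetical and do not come from the scrcpy sources; the sketch only computes the canvas size and the offsets, it does not copy any pixels.

```c
#include <stdint.h>

/* Hypothetical helper types, not taken from the scrcpy sources. */
struct frame_size {
    uint16_t width;
    uint16_t height;
};

struct square_layout {
    uint16_t side;     /* edge length of the square canvas */
    uint16_t offset_x; /* left margin of the frame inside the canvas */
    uint16_t offset_y; /* top margin of the frame inside the canvas */
};

/* Compute a square canvas large enough for the frame in either orientation,
 * and the offsets that center the frame inside it. */
static struct square_layout
center_in_square(struct frame_size frame) {
    struct square_layout layout;
    layout.side = frame.width > frame.height ? frame.width : frame.height;
    layout.offset_x = (uint16_t) ((layout.side - frame.width) / 2);
    layout.offset_y = (uint16_t) ((layout.side - frame.height) / 2);
    return layout;
}
```

For the 720x1280 example above, this gives a 1280x1280 canvas with the frame placed 280 pixels from the left edge. After a rotation to 1280x720, the canvas size stays the same and only the offsets change, which is what would keep the v4l2loopback resolution constant.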

In your opinion, what is the best solution? I see pros and cons in both of them.

On Linux, waitpid() both waits for the process to terminate and reaps it (closes its handle). On Windows, these actions are separated into WaitForSingleObject() and CloseHandle(). Expose these actions separately, so that it is possible to send a signal to a process while waiting for its termination without a race condition. This makes it possible to wait for server termination normally, but to kill the process without a race condition if it has not terminated after some delay.
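As an illustration of separating "wait" from "reap" on POSIX systems, here is a minimal sketch based on waitid() with the WNOWAIT flag. The helper names (wait_child, close_child, spawn_exiting_child) are hypothetical, and this is not claimed to be the code used in scrcpy.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for waitid() and WNOWAIT */
#endif

#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: wait for a child to terminate, and reap it only if
 * `close` is true. Returns the exit code, or -1 on error or if the child
 * did not exit normally. */
static int
wait_child(pid_t pid, bool close) {
    siginfo_t info;
    int options = WEXITED;
    if (!close) {
        options |= WNOWAIT; /* leave the child in a waitable (zombie) state */
    }
    if (waitid(P_PID, (id_t) pid, &info, options) != 0) {
        return -1;
    }
    return info.si_code == CLD_EXITED ? info.si_status : -1;
}

/* Reap a child that was previously waited for with close == false. */
static void
close_child(pid_t pid) {
    int status;
    waitpid(pid, &status, 0);
}

/* Demo helper: fork a child that exits immediately with the given code. */
static pid_t
spawn_exiting_child(int code) {
    pid_t pid = fork();
    if (pid == 0) {
        _exit(code);
    }
    return pid;
}
```

As long as the child has not been reaped, its pid cannot be reused, so another thread can still call kill(pid, SIGKILL) safely. Only the final close_child() releases the pid.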

Small unsigned integers promote to signed int. As a consequence, if v is a uint8_t, then (v << 24) yields an int, so the left shift is undefined if the MSB is 1. Cast to uint32_t to yield an unsigned value.

Reported by UBSan (meson x -Db_sanitize=undefined):

    runtime error: left shift of 255 by 24 places cannot be represented in type 'int'
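The fix described above can be sketched as follows. The helper name read32be is used here for illustration only and is not claimed to match the helper in the scrcpy sources.

```c
#include <stdint.h>

/* Illustrative helper: read a big-endian 32-bit value from a byte buffer. */
static uint32_t
read32be(const uint8_t *buf) {
    /* Without the casts, buf[0] would be promoted to (signed) int, and
     * shifting a value such as 0xFF left by 24 bits would not be
     * representable in an int (undefined behavior). */
    return ((uint32_t) buf[0] << 24)
         | ((uint32_t) buf[1] << 16)
         | ((uint32_t) buf[2] << 8)
         |  (uint32_t) buf[3];
}
```

With the cast, the shift is performed on an unsigned 32-bit value, so a most significant byte of 0xFF is well defined.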

There were only two frames simultaneously:
- one used by the decoder;
- one used by the renderer.

When the decoder finished decoding a frame, it swapped it with the rendering frame.

Adding a third frame provides several benefits:
- the decoder does not have to wait for the renderer to release the mutex;
- it simplifies the video_buffer API;
- it makes the rendering frame valid until the next call to video_buffer_take_rendering_frame(), which will be useful for swscaling on window resize.
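The three-frame scheme can be illustrated with a simplified model. This is a sketch under assumptions: frames are plain ints instead of AVFrame, and the names (triple_buffer, triple_buffer_offer, triple_buffer_take) are hypothetical rather than the actual video_buffer API.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified model: frames are plain ints instead of AVFrame. */
struct triple_buffer {
    int frames[3];
    int producer_idx; /* slot being written by the decoder */
    int pending_idx;  /* slot holding the latest complete frame */
    int consumer_idx; /* slot currently owned by the renderer */
    bool pending_consumed;
    pthread_mutex_t mutex;
};

static void
triple_buffer_init(struct triple_buffer *tb) {
    tb->frames[0] = tb->frames[1] = tb->frames[2] = 0;
    tb->producer_idx = 0;
    tb->pending_idx = 1;
    tb->consumer_idx = 2;
    tb->pending_consumed = true; /* nothing to consume yet */
    pthread_mutex_init(&tb->mutex, NULL);
}

/* Producer side: publish a frame. Returns true if the previous pending
 * frame was replaced before being consumed (a frame was skipped). */
static bool
triple_buffer_offer(struct triple_buffer *tb, int frame) {
    /* The producer slot is private to the producer: no lock needed here. */
    tb->frames[tb->producer_idx] = frame;

    pthread_mutex_lock(&tb->mutex);
    int tmp = tb->producer_idx;
    tb->producer_idx = tb->pending_idx;
    tb->pending_idx = tmp;
    bool skipped = !tb->pending_consumed;
    tb->pending_consumed = false;
    pthread_mutex_unlock(&tb->mutex);
    return skipped;
}

/* Consumer side: take the latest frame. The returned pointer stays valid
 * until the next call to this function. */
static const int *
triple_buffer_take(struct triple_buffer *tb) {
    pthread_mutex_lock(&tb->mutex);
    if (!tb->pending_consumed) {
        int tmp = tb->consumer_idx;
        tb->consumer_idx = tb->pending_idx;
        tb->pending_idx = tmp;
        tb->pending_consumed = true;
    }
    pthread_mutex_unlock(&tb->mutex);
    return &tb->frames[tb->consumer_idx];
}
```

In this model the mutex only protects index swaps, so neither side holds it while decoding or rendering, and the consumer's slot is never touched by the producer.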

A skipped frame is detected when the producer offers a frame while the current pending frame has not been consumed. However, the producer (in practice the decoder) is not interested in the fact that a frame has been skipped, only the consumer (the renderer) is. Therefore, notify frame skip via a consumer callback. This makes it possible to manage the skipped and rendered frame counts in the same place, and to remove fps_counter from the decoder.
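A minimal sketch of this callback approach follows. All names (frame_callbacks, frame_slot, frame_stats and the functions) are hypothetical and only model the idea; they are not the actual video_buffer interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback table registered by the consumer. */
struct frame_callbacks {
    void (*on_frame_available)(void *userdata);
    void (*on_frame_skipped)(void *userdata);
};

struct frame_slot {
    bool pending; /* true while an offered frame has not been consumed */
    const struct frame_callbacks *cbs;
    void *cbs_userdata;
};

static void
frame_slot_init(struct frame_slot *slot, const struct frame_callbacks *cbs,
                void *cbs_userdata) {
    slot->pending = false;
    slot->cbs = cbs;
    slot->cbs_userdata = cbs_userdata;
}

/* Producer side: the producer does not care whether a frame was skipped,
 * it only forwards the information to the consumer. */
static void
frame_slot_offer(struct frame_slot *slot) {
    bool skipped = slot->pending;
    slot->pending = true;
    if (skipped) {
        slot->cbs->on_frame_skipped(slot->cbs_userdata);
    } else {
        slot->cbs->on_frame_available(slot->cbs_userdata);
    }
}

/* Consumer side: mark the pending frame as consumed. */
static void
frame_slot_consume(struct frame_slot *slot) {
    slot->pending = false;
}

/* Example consumer state: both counters live in the same place. */
struct frame_stats {
    unsigned available;
    unsigned skipped;
};

static void
count_available(void *userdata) {
    ((struct frame_stats *) userdata)->available++;
}

static void
count_skipped(void *userdata) {
    ((struct frame_stats *) userdata)->skipped++;
}
```

In this sketch a skipped frame does not trigger a second "available" notification, on the assumption that the earlier notification is still pending on the consumer side.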

Virtual device is only for keyboard sources, not mouse or touchscreen sources. Here is the value of InputDevice.getDevice(-1).toString():

    Input Device -1: Virtual
      Descriptor: ...
      Generation: 2
      Location: built-in
      Keyboard Type: alphabetic
      Has Vibrator: false
      Has mic: false
      Sources: 0x301 ( keyboard dpad )

InputEvent.getDeviceId() documentation says:

> An id of zero indicates that the event didn't come from a physical
> device and maps to the default keymap.

<https://developer.android.com/reference/android/view/InputEvent#getDeviceId()>

However, injecting events with a device id of 0 causes event.getDevice() to be null on the client-side.

Commit 26529d3 used -1 as a workaround to avoid a NPE on a specific Android TV device. But this is a bug in the device system, which wrongly assumes that input device may not be null.

A similar issue was present in Flutter, but it is now fixed:
- <flutter/flutter#30665>
- <flutter/engine#7986>

On the other hand, using an id of -1 for touchscreen events (which is invalid) causes issues for some apps: <#2125 (comment)>

Therefore, use a device id of 0.

An alternative could be to find an existing device matching the source, like "adb shell input" does. See getInputDeviceId(): <https://android.googlesource.com/platform/frameworks/base.git/+/master/cmds/input/src/com/android/commands/input/Input.java>

But it seems better to indicate that the event didn't come from a physical device, and it would not solve #962 anyway, because an Android TV has no touchscreen.

Refs #962 <#962>
Fixes #2125 <#2125>

BUTTON_PRIMARY must not be set for touch events:

> This button constant is not set in response to simple touches with a
> finger or stylus tip. The user must actually push a button.

<https://developer.android.com/reference/android/view/MotionEvent#BUTTON_PRIMARY>

Fixes #2169 <#2169>

rom1v · 2021-04-03T15:38:43Z

Thank you for your work 👍

Test

I managed to make it work on /dev/video2 (/dev/video0 is my webcam, and I'm not sure what my /dev/video1 is used for):

./run d --v4l2sink /dev/video2 -N

I play it with:

vlc --network-caching=0 v4l2:///dev/video2

However, there is sometimes a huge delay between the device and what is displayed in VLC, especially at the beginning (seconds of delay); it then decreases a bit but remains large. It is a bit better if I limit the size (-m800), but there is still a significant delay.
Do you experience the same issue?

Branch

Contributions should be based on dev, not master (which is basically the latest release): https://github.com/Genymobile/scrcpy/blob/master/BUILD.md#branches

Unfortunately, a lot of refactors/renaming occurred recently on dev, so it will impact your code 😕 (use strdup() directly instead of SDL_strdup(), the mutex helpers have changed, etc.).

Comments

> It has not been tested without SCRCPY_LAVF_HAS_NEW_CODEC_PARAMS_API so there may be problems there.

No worries, at some point I will probably remove support for the old APIs anyway.

> so like decoder.c it decodes the AVPacket to an AVFrame that is then re-encoded into an AVPacket with AV_CODEC_ID_RAWVIDEO.

If the video is also displayed in a window, the stream will be decoded twice. Maybe it could be refactored to use the decoded frames only once (but it could be done later separately).

> There is a limitation at the moment, it doesn't handle screen rotation, but it seems to be a limit of v4l2loopback where you can't change the resolution as long as anyone is using the device.

By default, --lock-video-orientation is set to -1 (unlocked): https://github.com/Genymobile/scrcpy#lock-video-orientation
If v4l2 is enabled, -1 could be forbidden and the default set to 0.
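As a rough sketch of that suggestion, option validation could look like the following. The struct and its field names (sink_options, v4l2_device, orientation_set_by_user) are hypothetical and do not reflect the actual scrcpy option parsing; rejecting an explicit -1 is one possible reading of "forbidden".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option struct; field names are illustrative only. */
struct sink_options {
    const char *v4l2_device;      /* NULL if no V4L2 sink was requested */
    int lock_video_orientation;   /* -1 means unlocked */
    bool orientation_set_by_user; /* true if passed on the command line */
};

/* Apply the V4L2-specific rule: an unlocked orientation is not allowed.
 * Returns false if the user explicitly requested an invalid combination. */
static bool
resolve_orientation(struct sink_options *opts) {
    if (!opts->v4l2_device) {
        return true; /* no V4L2 sink: nothing to enforce */
    }
    if (opts->lock_video_orientation == -1) {
        if (opts->orientation_set_by_user) {
            return false; /* explicit "unlocked" conflicts with V4L2 */
        }
        opts->lock_video_orientation = 0; /* V4L2 default: locked to 0 */
    }
    return true;
}
```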

Bug

Also, at this point this is a detail, but if you pass an existing output device (/dev/video3 for example), ASAN reports a use-after-free:

output

$ ./run d --v4l2sink /dev/video3 -N
INFO: scrcpy 1.17 <https://github.com/Genymobile/scrcpy>
DEBUG: Using SCRCPY_SERVER_PATH: d/server/scrcpy-server
d/server/scrcpy-server: 1 file pushed, 0 skipped. 4.9 MB/s (87873 bytes in 0.017s)
[server] INFO: Device: LGE Nexus 5 (Android 6.0.1)
DEBUG: Starting stream thread
ERROR: Failed to open output device: /dev/video3
ERROR: Could not open v4l2sink
INFO: Finishing v4l2sink...
=================================================================
==2513813==ERROR: AddressSanitizer: heap-use-after-free on address 0x61b0000000a0 at pc 0x55af42b18af7 bp 0x7f16b9ffd0b0 sp 0x7f16b9ffd0a8
READ of size 8 at 0x61b0000000a0 thread T2 (stream)
[server] DEBUG: Using encoder: 'OMX.qcom.video.encoder.avc'
    #0 0x55af42b18af6 in v4l2sink_close ../app/src/v4l2sink.c:219
    #1 0x55af42b12b24 in run_stream ../app/src/stream.c:293
    #2 0x7f16cadf56de  (/usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0+0x8b6de)
    #3 0x7f16cae7beb8  (/usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0+0x111eb8)
    #4 0x7f16ca1e7ea6 in start_thread nptl/pthread_create.c:477
    #5 0x7f16ca304dee in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfddee)

0x61b0000000a0 is located 32 bytes inside of 1504-byte region [0x61b000000080,0x61b000000660)
freed by thread T2 (stream) here:
    #0 0x7f16cca9cb6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
    #1 0x55af42b18476 in v4l2sink_open ../app/src/v4l2sink.c:189
    #2 0x55af42b12050 in run_stream ../app/src/stream.c:231
    #3 0x7f16cadf56de  (/usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0+0x8b6de)

previously allocated by thread T2 (stream) here:
    #0 0x7f16cca9da3c in __interceptor_posix_memalign ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:226
    #1 0x7f16caf32b64 in av_malloc (/usr/lib/x86_64-linux-gnu/libavutil.so.56+0x3cb64)
    #2 0x55af42b12050 in run_stream ../app/src/stream.c:231
    #3 0x7f16cadf56de  (/usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0+0x8b6de)

Thread T2 (stream) created by T0 here:
    #0 0x7f16cca482a2 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:214
    #1 0x7f16cae7bf20  (/usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0+0x111f20)

SUMMARY: AddressSanitizer: heap-use-after-free ../app/src/v4l2sink.c:219 in v4l2sink_close
Shadow bytes around the buggy address:
  0x0c367fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c367fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c367fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c367fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c367fff8000: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c367fff8010: fd fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd
  0x0c367fff8020: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c367fff8030: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c367fff8040: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c367fff8050: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c367fff8060: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==2513813==ABORTING

(you can enable ASAN with meson configure x -Db_sanitize=address)
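The trace suggests that the context is freed on the failure path of v4l2sink_open() and then read again in v4l2sink_close(). One common way to avoid this class of bug is sketched below with a hypothetical stand-in struct (sink, sink_open, sink_close are illustrative names, and malloc()/free() replace the libavformat allocation); it is not the actual v4l2sink.c code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the sink: `ctx` plays the role of the
 * context that is allocated in open and released in close. */
struct sink {
    int *ctx;
};

/* `device_ok` simulates whether the output device could be opened. */
static bool
sink_open(struct sink *sink, bool device_ok) {
    sink->ctx = malloc(sizeof(*sink->ctx));
    if (!sink->ctx) {
        return false;
    }
    if (!device_ok) {
        free(sink->ctx);
        sink->ctx = NULL; /* do not leave a dangling pointer behind */
        return false;
    }
    *sink->ctx = 0;
    return true;
}

static void
sink_close(struct sink *sink) {
    if (!sink->ctx) {
        return; /* open failed or already closed: nothing to release */
    }
    free(sink->ctx);
    sink->ctx = NULL;
}
```

An alternative is for the caller to skip the close call entirely when open has failed; resetting the pointer simply makes close safe in both cases.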

martinellimarco · 2021-04-03T17:53:37Z

Hi, thank you for reviewing this. I wasn't expecting a reply this fast.

I experience the same with VLC, but I have the same problem with a capture card and another webcam, so I think that VLC is not really meant for realtime capture.

With ffplay -i /dev/videoN or OBS I have no lag whatsoever. For comparison, I have the scrcpy window next to ffplay or OBS on my screen, and the output is the same in both windows.

Thank you for pointing out that I had to use the dev branch. I was so excited to work on this that I completely forgot to check the obvious.

It's not a big problem: I did this project in a few hours, so I think I'll manage to rebase it quickly.

About your comments: I agree that the frame is decoded twice, and I don't really like it either. In fact, my first attempt was to share the video buffer to get the decoded frame. That worked very well when the video was displayed in a window, but otherwise some logic had to be changed in screen.c and decoder.c regarding EVENT_NEW_FRAME, and I didn't want to change too much outside my file, especially since this feature is limited to Linux.

I can rework it to share the video buffer and decode the frame only once, if you prefer.

I will follow your suggestion about --lock-video-orientation and set a default of 0.

martinellimarco · 2021-04-03T22:25:31Z

Please ignore this, I made a mess pushing my dev branch to master. I will create a new pull request.

It allows sending the video stream to /dev/videoN, so that it can be captured (like a webcam) by any V4L2-compatible tool. Refs #2232 <#2232> Refs #2233 <#2233> Co-authored-by: Romain Vimont <rom@rom1v.com>

It allows sending the video stream to /dev/videoN, so that it can be captured (like a webcam) by any v4l2-capable tool. PR #2232 <#2232> PR #2233 <#2233> PR #2268 <#2268> Co-authored-by: Romain Vimont <rom@rom1v.com>

rom1v · 2021-04-25T13:04:03Z

Merged into dev (#2268).

rom1v added 30 commits January 4, 2021 08:16

Increase display id range

aa8b571

Some devices use big display id values. Refs #2009 <#2009>

Split command into process and adb

4bd9da4

The process API provides the system-specific implementation, the adb API uses it to expose adb commands.

Move conditional src files in meson.build

cc6f502

Declare all the source files (including the platform-specific ones) at the beginning.

Rename process_simple_wait to process_wait

821c175

Adding "simple" in the function name brings no benefit.

Remove unused struct port_range

1e21519

It had been replaced by struct sc_port_range in scrcpy.h.

Fix compat missing include

037be4a

The header libavformat/version.h was included, but not libavcodec/version.h. As a consequence, the LIBAVCODEC_VERSION_INT definition depended on the caller includes.

Move common structs to coords.h

6385b8c

The size, point and position structs were defined in common.h. Move them to coords.h so that common.h could be used for generic code to be included in all source files.

Group common includes into common.h

59feb2a

Include config.h and compat.h in common.h, and include common.h from all source files.

Define feature test macros in common.h

ab912c2

This enables necessary functions once and for all. As a consequence, include common.h before any other header.

Factorize meson compiler variable initialization

8dbb167

Fix size_t incorrectly assigned to int

94eff0a

The function control_msg_serialize() returns a size_t.

Simplify process_wait()

b8edcf5

The function process_wait() returned a bool (true if the process terminated successfully) and provided the exit code via an output parameter exit_code. But the returned value was always equivalent to exit_code == 0, so just return the exit code instead.

Expose a single process_wait()

6a50231

There were two versions: process_wait() and process_wait_noclose(). Expose a single version with a flag (it was already implemented that way internally).

Fix file_handler process race condition

7afd149

The current process could be waited for both by run_file_handler() and file_handler_stop(). To avoid the race condition, wait for the process without closing it, then close it with the mutex locked.

Kill process with SIGKILL signal

b566700

An "adb push" command is not terminated by SIGTERM.

Improve file handler error message

d8e9ad2

The file handler process being terminated may be either a "push" or an "install" command.

Remove unused custom event

8e83f3e

Remove unused port_range field

ace438e

The port_range is used from "struct server_params", the copy in "struct server" was unused.

Provide strdup() compat

c0dde0f

Make strdup() available on all platforms.

Replace SDL_strdup() by strdup()

30e619d

The functions SDL_malloc(), SDL_free() and SDL_strdup() were used only because strdup() was not available everywhere. Now that it is available, use the native version of these functions.

Wrap SDL thread functions into scrcpy-specific API

f6320c7

The goal is to expose a consistent API for system tools, and pave the way to make the "core" independent of SDL in the future.

Expose thread id

d2689fc

Expose mutex assertions

21d206f

Add a function to assert that the mutex is held (or not).

Add mutex assertions

54f5c42

Assert non-recursive usage of mutexes

c53bd4d

Make use_opengl local

862948b

The flag is used only locally, there is no need to store it in the screen structure.

Log mipmaps error only if mipmaps are enabled

a566635

rom1v and others added 16 commits March 6, 2021 22:58

Make video buffer more generic

441d3fb

Video buffer is a tool between a frame producer and a frame consumer. For now, it is used between a decoder and a renderer, but in the future another instance might be used to swscale decoded frames.

Initialize screen before starting the stream

c50b958

As soon as the stream is started, the video buffer could notify that a new frame is available. In order to pass this event to the screen without a race condition, the screen must be initialized before the stream is started.

Use a callback to notify a new frame

fb9f984

Make the decoder independent of the SDL event mechanism, by making the consumer register a callback on the video_buffer.

Remove screen static initializer

955da3b

Most of the fields are initialized dynamically.

Group screen parameters into a struct

597c54f

The function screen_init_rendering had too many parameters.

Simplify screen initialization

cc48b24

Use a single function to initialize the screen instance.

Factorize frame swap

386f017

Release frame data as soon as possible

eb7e107

During a frame swap, one of the two frames involved can be released.

meson: Do not use full path with mingw tools name

d1789f0

This helps to use mingw toolchains which are not in /usr/bin path. PR #2185 <#2185> Signed-off-by: Romain Vimont <rom@rom1v.com>

Fix encoder parameter suggestion

429fdef

The option is --encoder, not --encoder-name.

Pass scrcpy-noconsole arguments through to scrcpy

dd453ad

PR #2052 <#2052> Signed-off-by: Romain Vimont <rom@rom1v.com>

Export static method to power off screen in Device

fb0bcae

PR #824 <#824> Signed-off-by: Yu-Chen Lin <npes87184@gmail.com> Signed-off-by: Romain Vimont <rom@rom1v.com>

Support power off on close

1d615a0

PR #824 <#824> Signed-off-by: Yu-Chen Lin <npes87184@gmail.com> Signed-off-by: Romain Vimont <rom@rom1v.com>

v4l2loopback support

77447ff

martinellimarco changed the base branch from master to dev April 3, 2021 22:17

martinellimarco closed this Apr 3, 2021

martinellimarco mentioned this pull request Apr 3, 2021

Support for v4l2loopback #2233

Closed

rom1v mentioned this pull request Apr 19, 2021

Add v4l2loopback support #2268

Closed

rom1v added v4l2 webcam labels Apr 19, 2021
