Evaluate Go-FFmpeg Bindings #24

Closed
ericxtang opened this issue May 31, 2017 · 5 comments

@ericxtang
Member

Right now we invoke ffmpeg by shelling out to the command-line executable. This is bad practice. We should use a native Go binding for ffmpeg.

Examples:

Does anyone know which package is better?

@j0sh
Collaborator

j0sh commented Dec 20, 2017

Livepeer Native FFmpeg

Goal: Perform media processing using the native FFmpeg API within LPMS.

Why: Allows shipping LPMS without distributing or requiring additional runtime dependencies (whether executables or shared libraries), taking advantage of Go's ability to produce self-contained executables.

Third Party cgo bindings

Many exist, in various states of upkeep, and with various levels of sugar wrapping the APIs. The nice thing is they generally allow us to write more idiomatic Go while hiding the gritty details of the libav API. The drawback is that those gritty details may be hidden exactly when we need them most. From a cursory inspection of several projects, the API sugar does not seem like an issue for the current goals of LPMS, although some projects are missing a few features that would have to be ported in.

Here are some of the better bindings, and a very quick, incomplete perusal of what would be required to make these production grade.

Custom cgo bindings

These would be bindings we write ourselves.

  • One fewer dependency
  • We can expose to Go exactly the API that we need for Go, minimizing the LPMS contact surface with cgo.
  • Statically linking libav appears easier this way
  • We can bundle exactly what we need: we don't need libavdevice, or libavfilter, and can pin to an exact version of libav.

Threading

The biggest concern with using third party libav-cgo bindings is heavyweight operations disrupting goroutine scheduling. However, we should still be less prone to issues such as [3][4] since the number of encodes can be precisely bounded.

Generally, the fewer times we have to cross the go-cgo boundary, the better. In practice, for a custom cgo implementation, this would mean an entirely self-contained API similar to the following:

int segmentVideo(char *inputStream, char *outputPrefix /* , segmentParams... */)
{
    /* ... read input and write output until EOF ... */
}

int transcodeVideo(char *inputStream, char *outputPrefix /* , codecParams... */)
{
    /* ... read input and write output until EOF ... */
}

Note these APIs would block until the input EOFs. This leads to straightforward semantics for goroutine interaction: each API call would get its own thread without the risk of unexpected thread growth.
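On the Go side, the "precisely bounded" part could be handled with a counting semaphore around the blocking call. A minimal sketch, assuming a pure-Go stand-in for the cgo call: `transcodeVideo` here only sleeps, and `runBounded` and `maxEncodes` are hypothetical names used for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// transcodeVideo is a hypothetical stand-in for a blocking cgo call.
// It only sleeps to simulate work that runs until the input reaches EOF.
func transcodeVideo(input, outputPrefix string) error {
	time.Sleep(10 * time.Millisecond)
	return nil
}

// runBounded starts one goroutine per input but lets at most maxEncodes
// blocking calls run at once. It returns the peak concurrency it observed.
func runBounded(inputs []string, maxEncodes int) int32 {
	sem := make(chan struct{}, maxEncodes) // counting semaphore
	var wg sync.WaitGroup
	var inFlight, peak int32

	for _, in := range inputs {
		wg.Add(1)
		go func(in string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it when the call returns

			// Record the highest number of calls in flight.
			n := atomic.AddInt32(&inFlight, 1)
			for {
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			_ = transcodeVideo(in, "out/"+in) // error ignored in this sketch
			atomic.AddInt32(&inFlight, -1)
		}(in)
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	peak := runBounded([]string{"a", "b", "c", "d", "e", "f"}, 2)
	fmt.Println("peak concurrent encodes:", peak)
}
```

Goroutines waiting on the semaphore are parked by the Go scheduler, so only the calls that hold a slot would occupy a thread inside C.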

Open questions:

Cgo within a goroutine gets its own thread. Does each concurrent cgo thread count against GOMAXPROCS? Could we have trouble squeezing out more concurrent encodes even if a cgo thread is blocked waiting for input from within ffmpeg? If so, we may need to manage media processing outside the Go runtime entirely, using pthreads. Wild idea: do this within Rust, which will statically verify thread safety and expose a C-compatible FFI.

Handling output segments

There are various ways to handle the output segments, but those can be handled entirely within Go. FFmpeg can generate the HLS manifests for us. We can maintain the current method of polling the file system for changes, use file system notifications [1], or feed back information on new segments via message passing.
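For illustration, the polling option combined with message passing might look roughly like this. It is a sketch only: the `*.ts` pattern, the polling interval, and the names `unseen` and `pollSegments` are assumptions, not the current LPMS code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"time"
)

// unseen returns the paths not yet recorded in seen, in lexical order,
// and marks them as seen.
func unseen(paths []string, seen map[string]bool) []string {
	sorted := append([]string(nil), paths...)
	sort.Strings(sorted)
	var fresh []string
	for _, p := range sorted {
		if !seen[p] {
			seen[p] = true
			fresh = append(fresh, p)
		}
	}
	return fresh
}

// pollSegments lists dir every interval and sends newly found segment
// paths on out until done is closed.
func pollSegments(dir string, interval time.Duration, out chan<- string, done <-chan struct{}) {
	seen := make(map[string]bool)
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-done:
			return
		case <-ticker.C:
			paths, _ := filepath.Glob(filepath.Join(dir, "*.ts")) // error ignored in this sketch
			for _, p := range unseen(paths, seen) {
				select {
				case out <- p:
				case <-done:
					return
				}
			}
		}
	}
}

func main() {
	dir, err := os.MkdirTemp("", "segments")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	out := make(chan string)
	done := make(chan struct{})
	go pollSegments(dir, 20*time.Millisecond, out, done)

	for _, name := range []string{"seg_0.ts", "seg_1.ts"} {
		if err := os.WriteFile(filepath.Join(dir, name), []byte("x"), 0644); err != nil {
			panic(err)
		}
		fmt.Println("new segment:", filepath.Base(<-out))
	}
	close(done)
}
```

One caveat with plain polling is that a segment can be picked up while it is still being written, so a real version would need some way to tell that a file is complete.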

Licensing

FFmpeg itself is LGPL, but some of the better codecs are GPL'd or nonfree (e.g., x264 or FDK AAC). Will linking and distributing these codecs have ramifications even though LPMS is MIT licensed? In any case, it is good to be aware. Running a separate ffmpeg executable sidesteps this problem somewhat, although I'm not sure if the current usage qualifies as "intimate communication" [2].

Next Steps

Hands-on evaluation of one or two of the go bindings. Check for these:

  1. Static compilation
  2. How much work would be required to make it compatible with latest FFmpeg
  3. API completeness: can it do everything we need it to do?
  4. Scalability within the Go runtime

Failing that, we'll write custom bindings.

[1] https://github.com/fsnotify/fsnotify

[2] https://www.gnu.org/licenses/gpl-faq.en.html#GPLPlugins

[3] https://www.cockroachlabs.com/blog/the-cost-and-complexity-of-cgo/

[4] https://groups.google.com/forum/#!topic/golang-nuts/8gszDBRZh_4

@dob
Member

dob commented Dec 20, 2017

Interesting comment about potentially using Rust to wrap the concurrent encoding threads. This is outside the scope of this task, but one other consideration related to this is the verifiability of the encoding in the Truebit Virtual Machine.

Truebit uses a flavor of WebAssembly within the VM, so "tasks" written in a language that can target WASM can be verified, including C and Rust. But not Go. I figured that when we got to the point of integration we would essentially have to treat the ffmpeg portion of the encoding job as the verifiable piece, and compile the same version of ffmpeg into WASM. And we may run into challenges with boundaries if we natively embed within Go code. But the idea of keeping a clean boundary, or wrapping the interface to the task in something like Rust, is interesting, because then potentially the same code can be used both for verification and in the node without jumping through hoops.

This is a longer discussion on verification of course, and there are other techniques we can use.

I agree with the next steps. I think in evaluating any of the existing library choices we'd likely have to be comfortable with the fact going in that we'll likely be forking/maintaining the library to keep it up to date for our purposes without necessarily needing to support the full range of the ffmpeg ecosystem. Our use case is pretty limited at the moment.

@ericxtang
Member Author

Related to the threading question, we currently execute the transcoding command with -threads 1 because ffmpeg output becomes nondeterministic if we allow multiple threads. This is a current limitation due to our verification method.
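As a rough illustration of that invocation from Go: only `-i` and `-threads 1` come from the discussion here. The file names and the `transcodeArgs` helper are made up, and the codec and muxer options a real command needs are left out.

```go
package main

import (
	"fmt"
	"os/exec"
)

// transcodeArgs builds the ffmpeg arguments for a single-threaded run.
// A real invocation would add codec, bitrate and muxer options.
func transcodeArgs(input, output string) []string {
	return []string{"-i", input, "-threads", "1", output}
}

func main() {
	cmd := exec.Command("ffmpeg", transcodeArgs("in.ts", "out.ts")...)
	// Print the command line instead of running it, so the sketch does
	// not depend on ffmpeg being installed. cmd.Run() would execute it.
	fmt.Println(cmd.Args)
}
```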

It's great you brought up the relationship between C threads and GOMAXPROCS. But either way, in my mind it's not a huge issue. People have suggested using locks to manage the relationship between goroutines and cgo threads (for example, we could have a lock per stream to make sure we have at most one cgo thread per stream, which would be fine for the live stream case).
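A per-stream lock along those lines could be sketched as follows. The `streamLocks` type and its methods are hypothetical names, and the closure passed to `withStream` stands in for whatever cgo call does the actual work.

```go
package main

import (
	"fmt"
	"sync"
)

// streamLocks hands out one mutex per stream ID.
type streamLocks struct {
	mu    sync.Mutex // guards the map itself
	locks map[string]*sync.Mutex
}

func newStreamLocks() *streamLocks {
	return &streamLocks{locks: make(map[string]*sync.Mutex)}
}

// get returns the mutex for streamID, creating it on first use.
func (s *streamLocks) get(streamID string) *sync.Mutex {
	s.mu.Lock()
	defer s.mu.Unlock()
	l, ok := s.locks[streamID]
	if !ok {
		l = &sync.Mutex{}
		s.locks[streamID] = l
	}
	return l
}

// withStream runs fn while holding the lock for streamID, so at most one
// call per stream is in progress at any time.
func (s *streamLocks) withStream(streamID string, fn func()) {
	l := s.get(streamID)
	l.Lock()
	defer l.Unlock()
	fn()
}

func main() {
	locks := newStreamLocks()
	var wg sync.WaitGroup
	count := 0 // only touched while holding the lock for "stream-1"
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			locks.withStream("stream-1", func() { count++ })
		}()
	}
	wg.Wait()
	fmt.Println("calls completed:", count)
}
```

Different streams get different mutexes, so they do not block each other. This sketch never removes entries from the map, which a long-running node would need to handle.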

About handling the outputs - we are actually currently ignoring the HLS manifest during segmentation. We basically only move the ts segments around and re-construct the manifest at the edge media server. Is there a good reason to keep the original manifest?
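For reference, rebuilding a media playlist from a list of segments is fairly small. The sketch below is not the code the edge media server uses: the `segment` type and `buildPlaylist` are illustrative names, and it emits only the basic HLS tags.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// segment describes one .ts file. The field names are illustrative.
type segment struct {
	URI      string
	Duration float64 // seconds
}

// buildPlaylist renders a live HLS media playlist for segs, where seq is
// the media sequence number of the first segment in the list.
func buildPlaylist(segs []segment, seq int) string {
	// The target duration must be at least as long as every segment,
	// so round each duration up and take the maximum.
	target := 0
	for _, s := range segs {
		if d := int(math.Ceil(s.Duration)); d > target {
			target = d
		}
	}
	var b strings.Builder
	b.WriteString("#EXTM3U\n")
	b.WriteString("#EXT-X-VERSION:3\n") // version 3 allows fractional durations
	fmt.Fprintf(&b, "#EXT-X-TARGETDURATION:%d\n", target)
	fmt.Fprintf(&b, "#EXT-X-MEDIA-SEQUENCE:%d\n", seq)
	for _, s := range segs {
		fmt.Fprintf(&b, "#EXTINF:%.3f,\n%s\n", s.Duration, s.URI)
	}
	return b.String()
}

func main() {
	fmt.Print(buildPlaylist([]segment{
		{URI: "seg_7.ts", Duration: 2.0},
		{URI: "seg_8.ts", Duration: 2.5},
	}, 7))
}
```

The segment durations have to come from somewhere, which is one piece of information the original manifest would otherwise carry.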

Great job with your analysis and looking into a new language. The next steps sound great. If you can keep the code from your experiments, I'd love to be able to try them out on my local machine.

@j0sh
Collaborator

j0sh commented Dec 26, 2017

Thanks for the feedback Doug and Eric.

"we'd likely have to be comfortable with the fact going in that we'll likely be forking/maintaining the library to keep it up to date for our purposes without necessarily needing to support the full range of the ffmpeg ecosystem."

Maintaining our own bindings is looking increasingly likely, although maybe not any of these libs.

"It's great you brought up the relationship between C threads and GOMAXPROCS."

The concern is actually whether we might need more cgo threads than we have cores available (or otherwise more than GOMAXPROCS): for example, one thread for reading input and another for encoding. Some encoding profiles are likely to be faster than realtime, and with a shared decoding/demuxing context we may not want slower profiles to block progress on faster ones. Maybe in that case we could schedule them manually. Or it's a non-issue entirely. I'm not sure yet.

"Is there a good reason to keep the original manifest?"

Not that I can think of right now.

@j0sh
Collaborator

j0sh commented Dec 26, 2017

Sample code for stream copy using ffgopeg

https://gist.github.com/j0sh/ffe816e4bca5dd8be92803c597efd8bd#file-readme-md

This does not (yet) correctly copy all frames; additional work on AVPacket is needed. See below.

Overview of changes so far to the go-ffmpeg bindings:

https://github.com/targodan/ffgopeg/compare/develop...j0sh:livepeer?expand=1

Further work required for the bindings

  • AVPacket bindings to set stream ID, timestamps, etc. There are also duplicate AVPacket declarations in Go's avcodec and avformat bindings, which makes sense in a transcoding context, but the abstraction leaks when stream copying (transmuxing) -- see the use of unsafe [2]
  • AVFrame bindings before we can allocate buffers correctly for scaling.
  • Clean up deprecation warnings: convert from using deprecated ffmpeg accessor functions to struct member access. We won't be using most of these though (if any). The choice becomes whether to:
    • Maintain support for features we don't need. Hopefully upstream the changes, or take over stewardship of the project for the benefit of the Go community.
    • Maintain our own incompatible fork and drop the features entirely. Over time, as the FFmpeg API evolves, this will converge towards the minimal set of features required for Livepeer.
  • Avio handling is missing entirely. For segment muxers (which we are mostly dealing with here), avio is not needed. However, if we were to, say, write to a single MP4 file in the future, then we'd need to bind avio. With custom avio contexts and the associated C callbacks, this would become even trickier (for example, if we wanted to utilize in-memory buffers).
  • Complete wrapping error codes and function return values into Go-friendly return codes, to present a uniform interface for error checking. For example, output functions generally return ints, while input functions return ReturnCode.
  • Need to double check all the functions we are using for correctness. Fixed a leak here: j0sh/ffgopeg@d49a3cb
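The uniform error checking mentioned in the list above could be as simple as funnelling every raw return value through one helper. This is a sketch under assumptions: `codeError` and `check` are hypothetical names rather than ffgopeg's API, and it relies only on the libav convention that negative return values signal failure.

```go
package main

import (
	"errors"
	"fmt"
)

// codeError is a hypothetical error type that keeps the raw return code
// together with the name of the operation that produced it.
type codeError struct {
	Op   string
	Code int
}

func (e *codeError) Error() string {
	return fmt.Sprintf("%s: libav return code %d", e.Op, e.Code)
}

// check converts a raw C-style return value into a Go error.
// Non-negative values mean success; negative values become a *codeError.
func check(op string, rc int) error {
	if rc >= 0 {
		return nil
	}
	return &codeError{Op: op, Code: rc}
}

func main() {
	// -22 is an arbitrary negative value chosen for the demo.
	err := check("write_frame", -22)
	var ce *codeError
	if errors.As(err, &ce) {
		fmt.Println("failed:", ce.Op, ce.Code)
	}
	fmt.Println("read ok:", check("read_frame", 0) == nil)
}
```

With a helper like this, input and output functions could both return `error`, and callers that need the numeric code can still get at it through `errors.As`.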

Ergonomics

The API is somewhat Go-friendly but still a rather literal mapping to FFmpeg, including the need to do cleanup manually for AVFrame, AVPacket and the various contexts. Essentially, knowledge of the FFmpeg API is required in addition to learning the Go API itself. The benefit of such a literal API mapping seems slight, aside from avoiding writing C directly. Even then, there would not be much insulation from C-related errors. In fact, the bindings themselves introduce an additional error surface; see the memory leak that was fixed.

Packaging

There are some flaws with Go packaging, or perhaps more with how ffgopeg is using it.

The structure of packages such as ffgopeg (with multiple interdependent local packages) makes it more difficult to maintain upstream compatibility while keeping in-progress development repositories in another remote location. Notably, dependent package paths are hard-coded as in [1], which, in this case, percolates down to a typechecking error. This makes the release repository at gpkg.in the bottleneck for distributing experimental changes using Go's built-in packaging mechanisms. While we can mitigate this at compile-time by manually fixing up the repository in $GOPATH [3], it will become cumbersome as the project scales.

Given the pace of engagement with this package, and the packaging issues, we'd probably be better off maintaining our own (incompatible) fork, as has been suggested.

Conclusion and next steps

This is one of the better Go bindings. However, the FFmpeg API is both expansive and a moving target. It would take several weeks to get a production-grade integration working and thoroughly tested, in addition to ongoing maintenance effort. Even then, it is unclear what benefit we get from the bindings themselves as opposed to a more focused Go API that exposes precisely the features we need; for example, RtmpToHLS(...) implemented in C.

Note that RtmpToHLS is a function that we'd have to write anyway, whether it's implemented directly in C, in CGo, or as a shell call out to FFmpeg (as is currently done). Hence, implementing the function directly in C would have a quicker turnaround, just because it avoids the intermediate step of fixing the CGo bindings.

[1] https://github.com/j0sh/ffgopeg/blob/livepeer/avcodec/avcodec.go#L21
[2] https://gist.github.com/j0sh/ffe816e4bca5dd8be92803c597efd8bd#file-transmux-go-L79
[3] https://gist.github.com/j0sh/ffe816e4bca5dd8be92803c597efd8bd#file-readme-md

@j0sh j0sh changed the title from "ffmpeg Native Integration" to "Evaluate Go-FFmpeg Bindings" Jan 24, 2018
@j0sh j0sh closed this as completed Jun 20, 2018
3 participants