
HLS video handling/storage/state refactor #151

Merged: 38 commits merged into master on Oct 14, 2020

Conversation

@gabek (Member) commented Sep 16, 2020

I apologize this is so big; it's something like half of the app. If you want to look through it, I'd suggest picking a piece at a time:

  • Storage provider changes (local, s3)
  • HLS writing/handling/cleanup
  • Creating offline state (streamstate)

Also, if you check out this PR and run it with both local and remote (S3) storage, I'd love to know how it works for you. In general it should function pretty much the same as it did before; I'm hoping none of you even notice the changes!

Overview

This change deals with how we get HLS segments and playlists, and what we do with them after they're received. It touches the transcoder -> internal HLS segment writing -> storage -> cleanup flow.

This change will require a lot of testing before anybody will want to run it in their production environments.

How we detect that new HLS files have been written

Previously we had a file monitor that polled the filesystem looking for new HLS chunks and playlist updates. The new approach takes advantage of ffmpeg's ability to push files to an HTTP endpoint.

A new localhost-only HTTP server starts listening, and ffmpeg is told to push all transcoding results there. This way ffmpeg hands its transcoding results directly to us instead of writing them to disk and making us keep tabs on them indirectly. While it sounds and feels a little weird to use HTTP internally to pass results around, it seemed like a good option that gives us more control.
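
As a rough illustration (not the actual Owncast code), here's what telling ffmpeg's HLS muxer to PUT its output to a local HTTP endpoint can look like from Go; the input URL, port, and encoder settings are placeholders:

```go
// Sketch only: launch ffmpeg so it HTTP PUTs HLS playlists and segments
// to a local receiver instead of writing them to disk itself.
package main

import (
	"log"
	"os/exec"
)

func main() {
	cmd := exec.Command("ffmpeg",
		"-i", "rtmp://127.0.0.1/live/stream", // incoming stream (hypothetical)
		"-c:v", "libx264", "-c:a", "aac",
		"-f", "hls",
		"-hls_time", "4",
		"-method", "PUT", // tell the HLS muxer to PUT its output over HTTP
		"http://127.0.0.1:8089/hls/stream.m3u8", // localhost-only receiver (made-up port)
	)
	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}
```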

File writer receiver service

Because, as mentioned above, we get transcoder results via HTTP, we need something to listen for them. This is a new FileWriterReceiverService that listens on localhost only and accepts the transcoder's HTTP PUT requests. It writes these files to disk and then passes the paths to the hlsHandler (below).
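
A minimal sketch of what such a receiver could look like, assuming a callback that hands finished paths to the hlsHandler; beyond the FileWriterReceiverService name itself, the details here are illustrative:

```go
// Sketch of a localhost-only receiver for the transcoder's PUTs.
package core

import (
	"io"
	"net/http"
	"os"
	"path/filepath"
)

type FileWriterReceiverService struct {
	privateHLSPath string
	onFileWritten  func(localPath string) // hands the path to the hlsHandler
}

func (s *FileWriterReceiverService) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPut {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	localPath := filepath.Join(s.privateHLSPath, filepath.Clean(r.URL.Path))
	_ = os.MkdirAll(filepath.Dir(localPath), 0700)

	f, err := os.Create(localPath)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	defer f.Close()
	if _, err := io.Copy(f, r.Body); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}

	s.onFileWritten(localPath)
	w.WriteHeader(http.StatusOK)
}

func (s *FileWriterReceiverService) Start() error {
	// Bind to loopback only so nothing outside this machine can push files.
	return http.ListenAndServe("127.0.0.1:8089", s)
}
```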

Handling HLS updates

There's a new middleman, the hlsHandler, that is given a storage provider, is told about HLS updates, and passes them on to that storage provider. While this seems like a useless middle layer at the moment, it's going to be key to the future recordings (#102) functionality, as the hlsHandler will not only pass live segments to the storage provider but will also be responsible for building the ongoing recording.
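
Here's an illustrative sketch of that middle layer, assuming a simple Save-based storage interface; the method names are hypothetical:

```go
// Sketch of the hlsHandler middleman described above.
package core

type StorageProvider interface {
	Save(localFilePath string, retryCount int) (remotePath string, err error)
}

type HLSHandler struct {
	Storage StorageProvider
}

func (h *HLSHandler) SegmentWritten(localFilePath string) {
	// Today: just hand the segment to the storage provider.
	h.Storage.Save(localFilePath, 0)
	// Later (#102): also append this segment to an ongoing recording.
}

func (h *HLSHandler) VariantPlaylistWritten(localFilePath string) {
	h.Storage.Save(localFilePath, 0)
}
```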

Storage providers

Storage providers are now more standardized. All writing to disk is done to the "private" HLS path, and as a result "local" is now a first-class storage provider that is treated the same as S3. In this case the "Save" task of the local provider simply moves the file from the private HLS path to the public one (under the webroot).
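
A sketch of the local-provider idea under those assumptions, where Save is just a move from the private path to the public one (paths and names are illustrative):

```go
// Sketch of a "local" provider whose Save moves a file from the private
// HLS path into the public webroot.
package storage

import (
	"os"
	"path/filepath"
	"strings"
)

type LocalStorage struct {
	privateHLSPath string // where the transcoder output lands first
	publicHLSPath  string // under the webroot, served to viewers
}

func (l *LocalStorage) Save(localFilePath string, retryCount int) (string, error) {
	rel := strings.TrimPrefix(localFilePath, l.privateHLSPath)
	dest := filepath.Join(l.publicHLSPath, rel)
	if err := os.MkdirAll(filepath.Dir(dest), 0755); err != nil {
		return "", err
	}
	return dest, os.Rename(localFilePath, dest)
}
```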

S3, as our remote storage provider, now monitors for long-running save operations and alerts on them in the console.

Changes to how we reference remotely stored HLS content

Previously, for remote storage providers, we would rewrite each variant's HLS playlist to point to the absolute URLs of the remote segments. Now this is simplified: we only rewrite the master playlist, and we upload the variant playlists to remote storage along with the segments. That means playlist requests never hit the Owncast server, AND all the work of rewriting the playlists every few seconds is no longer needed, because the variant playlists can reference segments by relative URL.
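
As a sketch of the simplified rewrite (not the actual implementation): only the URI lines in the master playlist get pointed at remote storage, and everything else is uploaded untouched:

```go
// Sketch: rewrite only the master playlist so each variant playlist
// reference becomes an absolute remote URL. Variant playlists keep their
// relative segment references and are uploaded as-is.
package storage

import "strings"

func rewriteMasterPlaylist(master string, remoteBaseURL string) string {
	var out []string
	for _, line := range strings.Split(master, "\n") {
		trimmed := strings.TrimSpace(line)
		// In a master playlist, non-tag, non-blank lines are playlist URIs.
		if trimmed != "" && !strings.HasPrefix(trimmed, "#") {
			line = remoteBaseURL + "/" + trimmed
		}
		out = append(out, line)
	}
	return strings.Join(out, "\n")
}
```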

Offline state

There are two scenarios with generating offline content:

  1. A "reset" state where the only content is the offline video clip. This is what happens when you first start the server and after the 5min reset timer fires. It simply passes the offline clip to the transcoder and treats it like a new, short stream.

  2. Appending the offline clip to an existing stream. This is what happens when a live stream ends and the transcoder completes its work. In this case the existing HLS playlist is manually edited and a single segment is appended to the end. This happens for every configured variant's playlist (see the sketch below).
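
A sketch of what the scenario 2 playlist edit amounts to, assuming the playlist doesn't yet contain an #EXT-X-ENDLIST tag; the discontinuity tag and helper name here are my additions, not necessarily what the PR does:

```go
// Sketch: append a single pre-transcoded offline segment to the end of an
// existing variant playlist. The caller must know the clip's exact duration.
package video

import (
	"fmt"
	"os"
)

func appendOfflineSegment(playlistPath string, segmentFile string, duration float64) error {
	f, err := os.OpenFile(playlistPath, os.O_APPEND|os.O_WRONLY, 0644)
	if err != nil {
		return err
	}
	defer f.Close()

	// #EXT-X-DISCONTINUITY tells players the next segment doesn't continue
	// the previous encode, which is true for the offline clip.
	_, err = fmt.Fprintf(f, "#EXT-X-DISCONTINUITY\n#EXTINF:%.3f,\n%s\n", duration, segmentFile)
	return err
}
```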

Change
As a result of scenario 2, I removed the option to specify custom offline content in the config file, because we must know exactly how long the clip is in order to append it to the HLS playlist correctly. The clip is also pre-transcoded to a .ts file, so it can simply be appended without requiring a transcoding step. I could see a future where we allow a little more flexibility here, but right now it's best to be specific.

Cleanup

Previously we relied on a feature of ffmpeg to delete old files on our behalf. Because ffmpeg now runs decoupled from the Owncast instance and talks to it over HTTP, there's no way for it to delete these files anymore. Instead we have a new hlsFilesystemCleanup that reproduces this functionality: it deletes old live segments from disk.
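
A sketch of what such a cleanup job could look like, keeping the newest N segments per variant; the keep-count and glob pattern are assumptions:

```go
// Sketch: keep only the newest maxSegments .ts files in a variant's
// directory and delete the rest, oldest first.
package video

import (
	"os"
	"path/filepath"
	"sort"
	"time"
)

func cleanupOldSegments(variantDir string, maxSegments int) error {
	paths, err := filepath.Glob(filepath.Join(variantDir, "*.ts"))
	if err != nil {
		return err
	}

	type segment struct {
		path string
		mod  time.Time
	}
	var segments []segment
	for _, p := range paths {
		if fi, err := os.Stat(p); err == nil {
			segments = append(segments, segment{p, fi.ModTime()})
		}
	}
	if len(segments) <= maxSegments {
		return nil
	}

	// Sort oldest first, then delete everything beyond the keep window.
	sort.Slice(segments, func(i, j int) bool { return segments[i].mod.Before(segments[j].mod) })
	for _, s := range segments[:len(segments)-maxSegments] {
		if err := os.Remove(s.path); err != nil {
			return err
		}
	}
	return nil
}
```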

Performance Monitoring

The new performanceTimer.go utility measures the average time it takes to create segments and upload them (if using external storage). It's not totally scientific, and it tries to throw out outliers. If the timings run too long, warnings are displayed in the console:

```
WARN[2020-10-06T22:45:14-07:00] slow encoding for variant 0 if this continues you may see buffering or errors. troubleshoot this issue by visiting https://owncast.online/docs/troubleshooting/
WARN[2020-10-06T22:45:14-07:00] Possible slow uploads: average upload S3 save duration 5.0839772650000001 ms troubleshoot this issue by visiting https://owncast.online/docs/troubleshooting/
```
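
The general idea, sketched (the trimming rule and threshold below are made up for illustration, not what performanceTimer.go actually does):

```go
// Sketch: collect durations, drop obvious outliers, and warn when the
// average crosses a threshold.
package utils

import (
	"log"
	"sort"
	"time"
)

type performanceTimer struct {
	durations []time.Duration
}

func (t *performanceTimer) Track(d time.Duration) {
	t.durations = append(t.durations, d)
}

// Average of the middle 80%, discarding the fastest and slowest 10% as outliers.
func (t *performanceTimer) Average() time.Duration {
	if len(t.durations) < 10 {
		return 0
	}
	sorted := append([]time.Duration(nil), t.durations...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	trim := len(sorted) / 10
	kept := sorted[trim : len(sorted)-trim]
	var total time.Duration
	for _, d := range kept {
		total += d
	}
	return total / time.Duration(len(kept))
}

func warnIfSlow(t *performanceTimer, threshold time.Duration) {
	if avg := t.Average(); avg > threshold {
		log.Printf("Possible slow uploads: average S3 save duration %v", avg)
	}
}
```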

@gabek added this to the v0.0.3 milestone Sep 16, 2020
@gabek marked this pull request as draft September 16, 2020
@gabek marked this pull request as ready for review October 7, 2020
@gabek changed the title from "WIP: HLS refactor" to "HLS video handling/storage/state refactor" Oct 7, 2020
@mattdsteele (Contributor) commented

Conceptually this all sounds good. If I've gleaned anything from overhearing colleagues discuss k8s integration patterns, using localhost-only HTTP is pretty common these days.

> Because ffmpeg now runs decoupled from the Owncast instance

Could you describe this a bit more? Is this a change from how it previously worked? I'm just worried about cleaning up orphaned child processes when the main process dies.

I'll review what I can, but maybe it makes more sense for me to just try testing it out on a server, and see how it behaves?

@gabek (Member, Author) commented Oct 9, 2020

> Because ffmpeg now runs decoupled from the Owncast instance
>
> Could you describe this a bit more? Is this a change from how it previously worked? I'm just worried about cleaning up orphaned child processes when the main process dies.

It's only slightly different than it is now, and not at all as far as child processes go; that part is the same. The difference is that while ffmpeg is still the same child process running on the same machine, ffmpeg doesn't know that. Previously ffmpeg knew it was writing files to disk and could clean those files up later. Now ffmpeg doesn't use the filesystem for storing output, since it's pushing the results elsewhere. Conceptually you could extrapolate this to ffmpeg running on a completely different server and pushing the results over the network.

I'd love if you could give it a spin!

@mattdsteele (Contributor) commented

Trying it out on my instance, https://stream.steele.blue/

So far it's working great!

One thing to note: I did have to update my S3 config. Previously I was referencing the endpoint with just the hostname:

```yaml
s3:
  endpoint: us-east-1.linodeobjects.com
```

I had to update it to https://us-east-1.linodeobjects.com so the paths would work out.

From what I can tell this is all properly referenced in the docs, so I think I just had a weirdly-configured setup that happened to let me upload to S3.

@gabek (Member, Author) commented Oct 10, 2020

> From what I can tell this is all properly referenced in the docs, so I think I just had a weirdly-configured setup that happened to let me upload to S3.

Interesting! I can't think of what might have changed around that, but it does point to us wanting to normalize URLs internally when accepting free-form user-supplied strings. We should keep this in mind for 0.0.4, when we start supporting config updates through the admin site; we could probably do some validation around these values.
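
For illustration, normalization along those lines could be as simple as the following sketch; defaulting a bare hostname to https is an assumption:

```go
// Sketch: normalize a user-supplied S3 endpoint so a bare hostname like
// "us-east-1.linodeobjects.com" still resolves to a usable URL.
package main

import (
	"fmt"
	"net/url"
	"strings"
)

func normalizeEndpoint(endpoint string) (string, error) {
	if !strings.Contains(endpoint, "://") {
		endpoint = "https://" + endpoint // assume https when no scheme is given
	}
	u, err := url.Parse(endpoint)
	if err != nil || u.Host == "" {
		return "", fmt.Errorf("invalid S3 endpoint: %q", endpoint)
	}
	return u.String(), nil
}

func main() {
	normalized, _ := normalizeEndpoint("us-east-1.linodeobjects.com")
	fmt.Println(normalized) // https://us-east-1.linodeobjects.com
}
```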

@gabek (Member, Author) commented Oct 13, 2020

I plan to merge this in this week. It's a big change, and I think the next step is to get it in so those working off master can get some testing hours against it. I'll also deploy it to a testing server and run some long-duration streams on it. Between now and then, let me know if you have any major concerns.

@gabek (Member, Author) commented Oct 14, 2020

@mattdsteele @graywolf336 @geekgonecrazy @gingervitis @jeyemwey Merging this in! Let me know if you see anything functioning differently.

@gabek merged commit 6ea9aff into master Oct 14, 2020
@gabek deleted the gek/parse-output-replace-filemonitor branch October 14, 2020
@gabek mentioned this pull request Nov 18, 2020