WIP: Split orchestrator and transcoder #575

Open
wants to merge 7 commits into
base: master
from

j0sh commented Oct 5, 2018

The mechanics of the split are as follows:

Add a Transcoder interface, along with concrete implementations of LocalTranscoder and RemoteTranscoder.

type Transcoder interface {
	Transcode(fname string, profiles []ffmpeg.VideoProfile) ([][]byte, error)
}
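
As a rough illustration of how the two implementations could satisfy this interface, here is a minimal, self-contained sketch. It is not the code from this PR: `VideoProfile` is a stand-in for `ffmpeg.VideoProfile` so the example compiles on its own, the local transcoder returns placeholder bytes instead of invoking ffmpeg, and the `send` field on `RemoteTranscoder` is a hypothetical hook representing whatever transport the remote side uses.

```go
package main

import (
	"errors"
	"fmt"
)

// VideoProfile is a stand-in for ffmpeg.VideoProfile (assumption, for self-containment).
type VideoProfile struct {
	Name string
}

// Transcoder mirrors the interface above, using the stand-in profile type.
type Transcoder interface {
	Transcode(fname string, profiles []VideoProfile) ([][]byte, error)
}

// LocalTranscoder would transcode in-process; this stub returns one
// placeholder payload per requested profile.
type LocalTranscoder struct{}

func (LocalTranscoder) Transcode(fname string, profiles []VideoProfile) ([][]byte, error) {
	out := make([][]byte, len(profiles))
	for i, p := range profiles {
		out[i] = []byte(fname + ":" + p.Name)
	}
	return out, nil
}

// RemoteTranscoder would forward work to another node. The send field is a
// hypothetical hook standing in for the real transport.
type RemoteTranscoder struct {
	send func(fname string, profiles []VideoProfile) ([][]byte, error)
}

func (r RemoteTranscoder) Transcode(fname string, profiles []VideoProfile) ([][]byte, error) {
	if r.send == nil {
		return nil, errors.New("remote transcoder not connected")
	}
	return r.send(fname, profiles)
}

// Compile-time checks that both types satisfy the interface.
var (
	_ Transcoder = LocalTranscoder{}
	_ Transcoder = RemoteTranscoder{}
)

func main() {
	profiles := []VideoProfile{{Name: "240p"}, {Name: "360p"}}
	var t Transcoder = LocalTranscoder{}
	out, err := t.Transcode("seg0.ts", profiles)
	fmt.Println(len(out), err)
}
```

Keeping the orchestrator's call site typed against `Transcoder` means it does not need to know whether a given entry is local or remote.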

Each orchestrator maintains a list of transcoders. The local transcoder can be included in this list. Whether the local transcoder is included depends on whether the orchestrator uses the -transcoder flag (which enables local transcoding) or the -orchestrator flag (which disables local transcoding). Discussion of the flags happened here.

Currently, only the first transcoder in the list is used. To test remote transcoders, run the orchestrator in -orchestrator mode and connect a transcoder by starting another Livepeer node with -orchAddr.

The networking follows the "Orchestrator-Transcoder Network Flow" as described here with only minor changes to the message definitions.

Remote transcoders are asynchronously assigned "tasks" via streaming RPC. Results are POST'd directly via HTTP multipart along with the task ID. The HTTP handler notifies the transcode loop via a channel, which unblocks the pending segment. Remote transcoders have a timeout of 8 seconds before cancellation, which partially mitigates #570.
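
One way to picture the channel notification and timeout is the sketch below. It is a simplified model under assumptions rather than the PR's code: `TaskManager`, `Add`, `Deliver`, and `Wait` are hypothetical names, the streaming RPC and HTTP multipart handling are omitted, and `Deliver` stands in for what the HTTP handler would do once it has parsed a result and its task ID. Only the 8 second timeout value comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// remoteTimeout is the 8 second limit mentioned in the PR description.
const remoteTimeout = 8 * time.Second

// ErrTimeout is returned when no result arrives in time.
var ErrTimeout = errors.New("remote transcoder timed out")

// result carries the transcoded renditions or an error for one task.
type result struct {
	data [][]byte
	err  error
}

// TaskManager tracks pending tasks by ID (hypothetical type for this sketch).
type TaskManager struct {
	mu      sync.Mutex
	nextID  int64
	pending map[int64]chan result
	timeout time.Duration
}

func NewTaskManager(timeout time.Duration) *TaskManager {
	return &TaskManager{pending: make(map[int64]chan result), timeout: timeout}
}

// Add registers a new task and returns its ID and result channel.
func (m *TaskManager) Add() (int64, chan result) {
	m.mu.Lock()
	defer m.mu.Unlock()
	id := m.nextID
	m.nextID++
	ch := make(chan result, 1) // buffered so Deliver never blocks
	m.pending[id] = ch
	return id, ch
}

// Deliver hands a result to the waiting transcode loop. It reports false
// when the task ID is unknown, for example after a timeout.
func (m *TaskManager) Deliver(id int64, r result) bool {
	m.mu.Lock()
	ch, ok := m.pending[id]
	delete(m.pending, id)
	m.mu.Unlock()
	if !ok {
		return false
	}
	ch <- r
	return true
}

// Wait blocks until the result arrives or the timeout elapses.
func (m *TaskManager) Wait(id int64, ch chan result) ([][]byte, error) {
	select {
	case r := <-ch:
		return r.data, r.err
	case <-time.After(m.timeout):
		m.mu.Lock()
		delete(m.pending, id) // late results for this ID are now rejected
		m.mu.Unlock()
		return nil, ErrTimeout
	}
}

func main() {
	m := NewTaskManager(remoteTimeout)
	id, ch := m.Add()
	go func() {
		// Stands in for the HTTP handler receiving the multipart POST.
		m.Deliver(id, result{data: [][]byte{[]byte("rendition")}})
	}()
	data, err := m.Wait(id, ch)
	fmt.Println(len(data), err)
}
```

The buffered channel is a deliberate choice in this sketch: if a result and a timeout race, the sender never blocks on a receiver that has already given up.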

Further TODOs (may be split into separate PRs or done later)

  • Fix existing tests
  • Add new tests
  • Fallbacks in case a transcoder fails
  • Send each profile to a separate transcoder
  • Transcoder rotation, LRU style
  • Reconnect remote transcoder if connection drops
  • Don't reconnect on unrecoverable errors (eg, invalid secret)
  • Graceful shutdown of transcoder-only nodes (exit listening loop)
  • Disable geth connectivity for transcoder-only nodes
  • Use "Orchestrator" rather than "Transcoder" in CLI
  • Capacity limits (eg, N concurrent transcodes)
  • Write transcoder info to orchestrator DB, incl. remote IP
Transcoder
BroadcasterNode NodeType = iota
OrchestratorNode
TranscoderNode

@j0sh

j0sh Oct 5, 2018

Contributor

These changes were because of a type conflict with the Transcoder interface that's also defined in core

Handler: &lp,
// XXX doesn't handle streaming RPC well; split remote transcoder RPC?
//ReadTimeout: HTTPTimeout,
//WriteTimeout: HTTPTimeout,

@j0sh

j0sh Oct 5, 2018

Contributor

Discussion around this issue: #576

@j0sh j0sh added this to Active in Weekly Sprints Oct 8, 2018
