Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Orchestrator Discovery Changes #604

Merged
merged 22 commits into from Dec 10, 2018
Merged

Conversation

angyangie
Copy link
Contributor

@angyangie angyangie commented Nov 26, 2018

O Discovery Changes

CLI Flags

Broadcaster needs to be started with an -orchAddr address:port option in order to specify the orchestrator to use. This is an interim solution while the full negotiation piece is being developed.

If using a standalone transcoder, needs to be started with -standaloneTranscoder, along with the existing -orchAddr and -orchSecret options. This was needed to disambiguate broadcaster mode from standalone transcoder mode, since we opted to not require any more flags for the broadcaster with its use of -orchAddr. Discuss this here: #605

Segment Signing

Segment signature needs to be updated to incorporate the desired profiles from the orchestrator.

Old signature:

sig = broadcaster.Sign(streamId | seqNo | segment.dataHash)

This data was taken from the ethTypes.Segment struct which reflects how the information is processed on-chain during verification. Since we're not updating the on-chain processing right now, there is a new struct core.SegmentMetadata from which the signature is constructed. The old eth/types.Segment struct is left unmodified.

New signature:

type SegmentMetadata struct {
	ManifestID ManifestID
	Seq        int64
	Hash       ethcommon.Hash
	Profiles   []ffmpeg.VideoProfile
	OS         *net.OSInfo
}

sig = broadcaster.sign(manifestId | seqNo | profiles | segment.dataHash)

The ManifestID is used here rather than the StreamID since it is a little bit shorter, and the additional (often incorrect) information carried by the appended rendition string is not needed.

The composition of the signature and the representation of the profile is subject to change once we have flexible transcoding options via non-deterministic verification and price menus, but for now we continue to use the existing []ffmpeg.VideoProfile bytes-representation created by concatenating strings from a lookup table.

Networking

  • Renamed GetTranscoder : TranscoderRequest -> TranscoderInfo to GetOrchestrator : OrchestratorRequest -> OrchestratorInfo

The GetOrchestrator request is the discovery mechanism.

// This request is sent by the broadcaster in `GetOrchestrator` to request
// information on which transcoder to use.
message OrchestratorRequest {

  // Ethereum address of the broadcaster
  bytes address = 1;

  // Broadcaster's signature over its address
  bytes sig   = 2;
}

// The orchestrator sends this in response to `GetTranscoder`, containing the
// transcoder URI, associated credentials authorizing the broadcaster to
// use the transcoder, and miscellaneous data related to the job.
message OrchestratorInfo {

  // URI of the transcoder to use for submitting segments.
  string transcoder  = 1;

  // Signals the authentication method to expect within `credentials`. This
  // field is opaque to the broadcaster, and should be passed to the transcoder.
  string authType    = 2;

  // Credentials to verify the request has been authorized by an orchestrator.
  // This field is opaque to the broadcaster.
  string credentials = 3;

  // Orchestrator returns info about own input object storage, if it wants it to be used.
  repeated OSInfo storage = 32;
}

Note it is likely GetOrchestrator will need to be invoked twice: once during the periodic O discovery, and again before a broadcast starts for more up-to-information such as storage credentials and PM parameters.

In that respect, the OrchestratorInfo.Storage field may seem redundant for O Discovery alone, but otherwise the differences are not enough to warrant a separate RPC type for the two GetOrchestrator invocations.

Broadcaster interface session split

Separate out transient per-session data from the more permanent interface exposing the broadcaster's address and its signing capability. Previously, they were mixed, which led to unclear semantics around the data required for O Discovery.

O Discovery and segment submission only requires the address and its signature as give by GetOrchestrator.

type Broadcaster interface {
	Address() ethcommon.Address
	Sign([]byte) ([]byte, error)
}

// Session-specific state for broadcasters
type BroadcastSession struct {
	Broadcaster      Broadcaster
	ManifestID       core.ManifestID
	Profiles         []ffmpeg.VideoProfile
	OrchestratorInfo *net.OrchestratorInfo
	OrchestratorOS   drivers.OSSession
	BroadcasterOS    drivers.OSSession
}

Orchestrator S3 Path Layout

The locations of the source and transcoded results may differ in their path prefixes. See this writeup for more details on the issue. We generate different prefixes each time the session is initialized (corresponding to the first solution listed).

@angyangie angyangie force-pushed the ar/define-orchestrator-interface branch 4 times, most recently from 131c415 to 1409fa6 Compare November 26, 2018 22:02
@@ -154,6 +155,10 @@ func main() {
n.NodeType = core.Broadcaster
}

// if *orchAddr != "" {
// offchainOrch := n.OrchestratorSelector.NewOffchainOrchestrator(*orchAddr)
Copy link
Contributor Author

@angyangie angyangie Nov 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the way things are now, we'd have to use offchainOrch as an input to StartMediaServer, then gotRTMPStreamHandler, and startBroadcast ... unless we added an address field to NewLivepeerNode in livepeernode.go. What do you think @j0sh ?

@angyangie angyangie force-pushed the ar/define-orchestrator-interface branch 2 times, most recently from e840d2d to 18cb9f0 Compare November 26, 2018 23:14
@j0sh j0sh force-pushed the ar/define-orchestrator-interface branch 5 times, most recently from 09f536b to 5f24456 Compare November 28, 2018 07:03
@angyangie angyangie force-pushed the ar/define-orchestrator-interface branch from 48cb456 to a564e76 Compare November 29, 2018 17:22
@j0sh j0sh force-pushed the ar/define-orchestrator-interface branch 2 times, most recently from 0e9a0c7 to 3496ef1 Compare November 29, 2018 17:53
@angyangie angyangie force-pushed the ar/define-orchestrator-interface branch 2 times, most recently from 4822197 to 52c1131 Compare November 30, 2018 16:46
This was referenced Dec 5, 2018
@angyangie angyangie force-pushed the ar/define-orchestrator-interface branch from 7e2a07b to 3fbcd2b Compare December 5, 2018 15:17
@angyangie angyangie changed the title Implement Offchain OrchestratorSelector interface Orchestrator Discovery Changes Dec 5, 2018
@j0sh j0sh force-pushed the ar/define-orchestrator-interface branch 3 times, most recently from 0f1d85f to 8c05a92 Compare December 5, 2018 22:25
@angyangie angyangie force-pushed the ar/define-orchestrator-interface branch from 8c05a92 to bf1058d Compare December 5, 2018 23:09
@j0sh j0sh force-pushed the ar/define-orchestrator-interface branch from bf1058d to 6b02eed Compare December 5, 2018 23:34
@@ -21,9 +23,33 @@ const (
HashLength = 32
)

type SegmentMetadata struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we leave a comment here explaining why we are tracking ManifestID instead of StreamID? (Already explained in the PR notes, but want to make sure it stays with the code)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should we explain it? The PR notes work for reviewers who understand the previous state of the codebase and need to relate to how it's changed, but it seems a little strange to refer to "previous state" for comments in an updated codebase.

Presumably the PR note is this one:

The ManifestID is used here rather than the StreamID since it is a little bit shorter, and the additional (often incorrect) information carried by the appended rendition string is not needed.

Copy link
Member

@ericxtang ericxtang Dec 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to explain the semantics of what it means to have a ManifestID and multiple VideoProfile in SegmentMetadata. Theoretically, a video segment can ONLY have 1 VideoProfile. It looks like we are only use SegmentMetadata to signal transcoding preference. I don't want someone to misuse this for another purpose (for example, representing metadata for a specific segment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SegmentMetadata does represent metadata for a specific segment, since it also holds the segment Hash and Seq.

In theory the Profiles and OS could also change per-segment as well.

The (only?) reason ManifestID is here is to act as an index to look up the transcoding loop. We could mitigate the need for a ManifestID in the metadata if we had kept the AuthToken.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(The ManifestID is also part of the segment signature, but I don't think it's strictly necessary for Streamflow.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@j0sh and I talked about renaming SegmentMetadata to SegTranscodingMetadata. Can we do that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

README.md Outdated

- `livepeer --rinkeby --orchestrator -orchSecret asdf`

The orchSecret is a shared secret used to authenticate remote transcoders.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like secret can be any arbitrary string? If so, let's put it in the comment.

@@ -187,6 +189,17 @@ func main() {
n.NodeType = core.BroadcasterNode
}

if n.NodeType == core.BroadcasterNode {
if *orchAddr == "" {
glog.Info("No orchestrator specified; transcoding will not happen")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we exit here for now? In the next iteration when we implement O discovery via onchain info, we can take it out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that, but it'd mean stopping support for view-only source streams.

Although I suppose we might also want to figure out how to support view-only mode once we use onchain info.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I see. Ok I'm on board for keeping things this way then. view-only feels useful.

@@ -90,6 +91,31 @@ func TxDataToVideoProfile(txData string) ([]ffmpeg.VideoProfile, error) {
return profiles, nil
}

func BytesToVideoProfile(txData []byte) ([]ffmpeg.VideoProfile, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BytesToVideoProfile and VideoProfileToBytes have different expectation of what "Bytes" should be. For symmetry sake, can we make them the same?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on this? There isn't a VideoProfileToBytes anywhere. However, there is a TxDataToVideoProfile which takes a hex-encoded string.

We can express TxDataToVideoProfile in terms of BytesToVideoProfile by decoding the hex input into bytes, but I think we'll be removing TxDataToVideoProfile entirely once we do a second pass of cleanups -- there's code in the ethclient and DB that still depends on it.

Copy link
Member

@ericxtang ericxtang Dec 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VideoProfileToBytes is in common/videoprofile_ids.go. Just realized they have different semantics (this is just an unfortunate coincidence of private function naming)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, videoProfileToBytes -- caps.

I can rename the videoProfileToBytes function to makeVideoProfileByteMap if that's clearer.

@@ -455,7 +454,7 @@ func SubmitSegment(bcast Broadcaster, seg *stream.HLSSegment, nonce uint64) (*ne
data = []byte(seg.Name)
}

ti := bcast.GetTranscoderInfo()
ti := sess.OrchestratorInfo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename ti to oi?

more generally, can you share the ideas behind using very short variable names? (for example in this case I'm inclined to name this variable at least oInfo if not orchInfo or even longer :) )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more generally, can you share the ideas behind using very short variable names?

Just a particular style for short-lived local variables. The meaning can be clearly seen from the variable declaration, although you are correct here it would've been better with oi.

This is mostly my particular style of programming, but it seems to be shared by the golang community. See https://talks.golang.org/2014/names.slide

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for sharing your general school of thought! I like the content in the link you shared :)

uri, err := url.ParseRequestURI(addr)
if err != nil {
glog.Error("Could not parse orchestrator URI: ", err)
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not follow the pattern of returning two return values (o, err) like we do in other places, e.g. NewLivepeerNode?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that in the next round of updates.

@eladmallel
Copy link
Contributor

Making sure I fully understand this - why do we need B to share their address and signature in OrchestratorRequest?

@angyangie
Copy link
Contributor Author

angyangie commented Dec 10, 2018

@eladmallel we use both in verifyOrchestratorReq to verify the Orchestrator request! But I think you might be alluding to the fact that we can make the Sig from the address, so why store both?

@j0sh
Copy link
Contributor

j0sh commented Dec 10, 2018

why do we need B to share their address and signature in OrchestratorRequest?

The idea is to give O an opportunity to distinguish who's sending the request and authenticate the sender; that can't (currently) be inferred from the TLS connection alone.

@angyangie
Copy link
Contributor Author

@j0sh now that Elad brought it up .. why can't we just store the address and just generate the Sig when we need it?

@j0sh
Copy link
Contributor

j0sh commented Dec 10, 2018

why can't we just store the address and just generate the Sig when we need it?

Perhaps I'm missing something, but that's pretty much what we do now.

@angyangie
Copy link
Contributor Author

angyangie commented Dec 10, 2018

@j0sh I mean, instead of storing both in the protobuf, we could just do sig, err := b.Sign([]byte(fmt.Sprintf("%v", b.Address().Hex()))) in verifyOrchestratorReq and only have to store address in the protobuf. Not a big deal at all, but was just wondering if there was a reason we were storing the sig and the address in the protobuf! I haven't worked with protobufs for a long time, but it seems to me thus far that we use the protobuf to store variables that will be used all over the codebase (and save ourselves the trouble of having to pass these variables through tons of functions before they're actually used anywhere, etc.)

@j0sh
Copy link
Contributor

j0sh commented Dec 10, 2018

we could just do sig, err := b.Sign([]byte(fmt.Sprintf("%v", b.Address().Hex()))) in verifyOrchestratorReq

The orchestrator can't generate a signature on behalf of B; it can only check a sig given an address, so we need to supply both here.

we use the protobuf to store variables that will be used all over the codebase

The only one I can think of is OrchestratorInfo. Are there others?

It's debatable whether we should continue using the net.OrchestratorInfo struct to propagate orchestrator state throughout the broadcast flow, because what we need to transmit on-the-wire (via protobufs) may not necessarily correspond directly to the state that we need to track internally on the node. But for now, there isn't much of a discrepancy, so it's OK until things change.

having to pass these variables through tons of functions before they're actually used anywhere, etc

Can you elaborate on where this is an issue?

@eladmallel
Copy link
Contributor

My question is specifically around what does O gain from being able to authenticate the sender of the initial OrchestratorRequest?

Authentication is obviously key when receiving a request to do work. However, in this simple request for information, I'm not sure I understand the importance of authentication.

Does this clarify the question better? @j0sh @angyangie

@j0sh
Copy link
Contributor

j0sh commented Dec 10, 2018

@eladmallel O can customize the response based on the sender. Preferential pricing or parameterization, etc.

@angyangie angyangie deleted the ar/define-orchestrator-interface branch January 7, 2019 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants