-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add initial API #1
Conversation
Fills out the README file to describe more about TAMS. sem-ver: feature
Adds API definitions from https://github.com/bbc/rd-cloudfit-squirrel-media-store/tree/f701200 Co-authored-by: Andrew Gibb <andrew.gibb@bbc.co.uk> Co-authored-by: Callum Lunn <callum.lunn@bbc.co.uk> Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk> Co-authored-by: James Sandford <james.sandford@bbc.co.uk> Co-authored-by: Robert Wadge <robert.wadge@bbc.co.uk>
Adds an option to allow using the API without any credentials, for testing purposes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the README clear and concise. A couple suggestions inline and otherwise LGTM.
Also adds help text to Make commands
Adds a basic GHA-based CI workflow that renders documentation and uploads it
Also updates CONTRIBUTING.md to note that some parts of the template may not apply.
Applies various suggestions from code review Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk>
ffcd5dd
to
b3c06f1
Compare
Adds a note that the schema for Flows is closely related to the AMWA IS-04 schema. Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk>
2908ae1
to
745efe6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few changes here and there.
As for diagrams, I think the prose we have here is a good starting point. It is concise and enough to get folk interested. We needn't add extra work with diagrams right this second.
We don't have anything ready to go, because we haven't done anything in this context before. But I suspect there will be a diagram or two that seem obvious next week. We should add them then.
README.md
Outdated
A mock API server will start at <http://localhost:4010> | ||
|
||
## Design | ||
The store handles Flows (based on the Flows in the [AMWA NMOS MS-04 model](https://specs.amwa.tv/ms-04/releases/v1.0.0/docs/2.1._Summary_and_Definitions.html)), which exist on an infinite timeline, are immutable and can be grouped by Sources (from the same document). Each grain of media (_e.g._ frame of video or audio samples) in a Flow can be uniquely addressed by a `<flow_id, timestamp>` tuple (or more commonly a flow ID and timerange). Because the store is immutable, a flow ID and timerange can be used to pass media by reference, only referring back to the store to retrieve the actual media essence when it needs to be transformed or rendered to a screen or speaker. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"from the same document", whilst not ambiguous, does bump the reader out of their flow.
Suggest you change bracketed link to a footnote, and add the same footnote link to bloth "flow" and "source". You may want to say "NMOS flow[1]" of "flow[1] (the NMOS term)" or something like that.
In this case, I think adding the oxford comma after immutable in "...infinite timeline, are immutable and can be grouped..." probably helps the reader.
You could break "(or more commonly a flow ID and timerange)" into a new sentence: "A timerange and flow ID refers to a sequence of grains."
Replace the final sentence with:
This unique address for each grain is powerful - since it is guaranteed to refer to a specific frame, or audio samples, it can be safely passed around other tools or programs. At any time the unique address can be exchanged for the media data by an API call. But if that is not needed, media work can be done purely by reference.
And if you write markdown with a line break between sentences, the renderer will group them together into a paragraph without line breaks :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestions here, thanks. I've ended up rewriting this a bit in f69917e: I've moved some of the points around to put Flows and Sources into the same link at the top (trying to avoid footnotes, since then all the links should arguably become footnotes?) and talk about timeranges before timestamps since they're the more useful one anyway. I've taken your words about unique addresses directly.
Yes, for something this long new line = new sentence might have been the way to go 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(trying to avoid footnotes, since then all the links should arguably become footnotes?)
I disagree. A footnote is writing tool. A hyperlink can go anywhere.
api/TimeAddressableMediaStore.yaml
Outdated
# | ||
openapi: 3.1.0 | ||
info: | ||
title: Time Addressable Media Store |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think "Time-Addressable" is more clearly descriptive of what's going on.
This media store type provides HTTP URLs for uploading and downloading media objects in buckets. | ||
|
||
The response will include a PUT URL for each storage location that a client uses to upload the flow segment | ||
object. The client is expected to register the flow segment using the /flows/{flowId}/segments endpoint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we had agreed to change this to the "auto-registration" workflow described here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change is that you POST to get the list of upload URLs (instead of GET). As you say in that doc, POSTing the entire segment creates a bottleneck, and I don't think you can prevent the client sending the body in order to redirect them to the object store - you'd always need the bandwidth to receive at least some of the segment.
This way you still have to register the segments, but the act of requesting storage URLs is allowed to have side effects, like triggering a process that deletes segments that weren't registered
README.md
Outdated
|
||
Grains are grouped into Flow Segments, containing for example one second of content, wrapped in a container format such as MPEG-TS. The store provides a mechanism to upload and register new segments, and an interface to request all the segments covering a particular timerange and their download URLs; an approach inspired by chunked streaming protocols like HTTP Live Streaming. | ||
|
||
Segments may be stored separately from the metadata linking them to a position on the timeline, separating the metadata and data planes. For example our implementation uses an object store (_e.g._ AWS S3) for the segments and avoids proxying them through our servers, taking advantage of the scalability of cloud object storage. Segments are also de-coupled from their point in the timeline by a link between their `<flow_id, timestamp>` tuple and the underlying object ID, so a single segment can appear at multiple points in multiple flows. This allows for copy-on-write semantics: immutability means a new Flow must be created to make changes to existing parts of the timeline, but for unmodified portions of the timeline that is a metadata operation, and the original Flow's segments are re-used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"avoids proxying them through our servers" takes a bit of unpacking. How about "...for the segments, passing S3 URLs to the client to upload directly, taking advantage..."
The final clause, after the colon, is subtly mangled. (What does "that" refer to in "that is a metadata operation?) I suggest
When a new flow contains material that already exists in the store, the new <flow_id, timestamp>
tuple points to the existing object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - I've tweaked this in f69917e as well
Co-authored-by: Andrew Gibb <andrew.gibb@bbc.co.uk>
e8a3c7e
to
f69917e
Compare
|
||
BBC R&D have demonstrated use of the TAMS approach as part of composable, software-defined workflows which can run in the cloud, on-premise or in a hybrid environment. We've built a prototype implementation of this API, along with services for movement and transformation of streams and files, which serve as the media backend for other projects such as our [remote wildlife camera](https://www.bbc.co.uk/rd/blog/2022-04-video-cloud-media-store-ingest-service) work. For more detail on the architecture and motivation see **TODO: Link to blog**. | ||
BBC R&D have demonstrated use of the TAMS approach as part of composable, software-defined workflows which can run in the cloud, on-premise or in a hybrid environment. | ||
We've built a prototype implementation of this API, along with services for movement and transformation of streams and files, which serve as the media backend for other projects such as our [remote wildlife camera](https://www.bbc.co.uk/rd/blog/2022-04-video-cloud-media-store-ingest-service) work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One sentence per line 🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One important typo - the heading. And a couple of sentence-per-lines you missed. Otherwise LGTM
Co-authored-by: Andrew Gibb <andrew.gibb@bbc.co.uk> Co-authored-by: Callum Lunn <callum.lunn@bbc.co.uk> Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk> Co-authored-by: James Sandford <james.sandford@bbc.co.uk> Co-authored-by: Robert Wadge <robert.wadge@bbc.co.uk>
adfd049
to
a94383d
Compare
Details
Pivotal Story
Story URL: https://www.pivotaltracker.com/story/show/185968237
Related PRs
N/A
Links to external test runs/working deployment
N/A
Submitter PR Checks
(tick as appropriate)
Reviewer PR Checks
(tick as appropriate)
Info on Cloudfit PRs