Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial API #1

Merged
merged 18 commits into from
Sep 14, 2023
Merged

Add initial API #1

merged 18 commits into from
Sep 14, 2023

Conversation

samdbmg
Copy link
Member

@samdbmg samdbmg commented Sep 13, 2023

Details

Pivotal Story

Story URL: https://www.pivotaltracker.com/story/show/185968237

Related PRs

N/A

Links to external test runs/working deployment

N/A

Submitter PR Checks

(tick as appropriate)

  • Added bbc/rd-apmm-cloudfit team as a reviewer
  • PR completes task/fixes bug
  • Tests exercise code appropriately
  • New features and API breaks are flagged in commit messages using magic strings
  • Repo maintainer is notified that a release is required
  • Documentation updated (README, Confluence, Docstrings, API spec, Engineering Guide, etc.)
  • Downstream repos have been checked for potential breaks & fixed as needed
  • APIs/UIs/CLIs updated as required
  • PR added to Pivotal story
  • Follow-up stories added to Pivotal
  • Any temporary code/configuration removed (e.g. test deployment environment, temporary commontooling branch)
  • Any pins against pre-releases have been removed

Reviewer PR Checks

(tick as appropriate)

  • PR completes task/fixes bug
  • Tests exercise code appropriately
  • Design makes sense, and fits with our current code base
  • Code is easy to follow
  • PR size is sensible
  • Commit history is sensible and tidy

Info on Cloudfit PRs

samdbmg and others added 5 commits September 13, 2023 15:25
Fills out the README file to describe more about TAMS.

sem-ver: feature
Adds API definitions from
https://github.com/bbc/rd-cloudfit-squirrel-media-store/tree/f701200

Co-authored-by: Andrew Gibb <andrew.gibb@bbc.co.uk>
Co-authored-by: Callum Lunn <callum.lunn@bbc.co.uk>
Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk>
Co-authored-by: James Sandford <james.sandford@bbc.co.uk>
Co-authored-by: Robert Wadge <robert.wadge@bbc.co.uk>
Adds an option to allow using the API without any credentials, for
testing purposes
@samdbmg samdbmg requested a review from a team September 13, 2023 14:36
Copy link
Contributor

@philipnbbc philipnbbc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the README clear and concise. A couple suggestions inline and otherwise LGTM.

api/Dockerfile.multi Outdated Show resolved Hide resolved
api/Dockerfile.multi Outdated Show resolved Hide resolved
api/TimeAddressableMediaStore.yaml Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
api/TimeAddressableMediaStore.yaml Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
samdbmg and others added 7 commits September 14, 2023 09:40
Also adds help text to Make commands
Adds a basic GHA-based CI workflow that renders documentation and
uploads it
Also updates CONTRIBUTING.md to note that some parts of the template may
not apply.
Applies various suggestions from code review

Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk>
README.md Outdated Show resolved Hide resolved
@samdbmg samdbmg mentioned this pull request Sep 14, 2023
18 tasks
Adds a note that the schema for Flows is closely related to the AMWA
IS-04 schema.

Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk>
Copy link
Contributor

@AndrewGibb AndrewGibb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few changes here and there.

As for diagrams, I think the prose we have here is a good starting point. It is concise and enough to get folk interested. We needn't add extra work with diagrams right this second.

We don't have anything ready to go, because we haven't done anything in this context before. But I suspect there will be a diagram or two that seem obvious next week. We should add them then.

README.md Outdated
A mock API server will start at <http://localhost:4010>

## Design
The store handles Flows (based on the Flows in the [AMWA NMOS MS-04 model](https://specs.amwa.tv/ms-04/releases/v1.0.0/docs/2.1._Summary_and_Definitions.html)), which exist on an infinite timeline, are immutable and can be grouped by Sources (from the same document). Each grain of media (_e.g._ frame of video or audio samples) in a Flow can be uniquely addressed by a `<flow_id, timestamp>` tuple (or more commonly a flow ID and timerange). Because the store is immutable, a flow ID and timerange can be used to pass media by reference, only referring back to the store to retrieve the actual media essence when it needs to be transformed or rendered to a screen or speaker.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"from the same document", whilst not ambiguous, does bump the reader out of their flow.
Suggest you change bracketed link to a footnote, and add the same footnote link to bloth "flow" and "source". You may want to say "NMOS flow[1]" of "flow[1] (the NMOS term)" or something like that.

In this case, I think adding the oxford comma after immutable in "...infinite timeline, are immutable and can be grouped..." probably helps the reader.

You could break "(or more commonly a flow ID and timerange)" into a new sentence: "A timerange and flow ID refers to a sequence of grains."

Replace the final sentence with:
This unique address for each grain is powerful - since it is guaranteed to refer to a specific frame, or audio samples, it can be safely passed around other tools or programs. At any time the unique address can be exchanged for the media data by an API call. But if that is not needed, media work can be done purely by reference.

And if you write markdown with a line break between sentences, the renderer will group them together into a paragraph without line breaks :D

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestions here, thanks. I've ended up rewriting this a bit in f69917e: I've moved some of the points around to put Flows and Sources into the same link at the top (trying to avoid footnotes, since then all the links should arguably become footnotes?) and talk about timeranges before timestamps since they're the more useful one anyway. I've taken your words about unique addresses directly.

Yes, for something this long new line = new sentence might have been the way to go 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(trying to avoid footnotes, since then all the links should arguably become footnotes?)
I disagree. A footnote is writing tool. A hyperlink can go anywhere.

README.md Outdated Show resolved Hide resolved
#
openapi: 3.1.0
info:
title: Time Addressable Media Store
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think "Time-Addressable" is more clearly descriptive of what's going on.

This media store type provides HTTP URLs for uploading and downloading media objects in buckets.

The response will include a PUT URL for each storage location that a client uses to upload the flow segment
object. The client is expected to register the flow segment using the /flows/{flowId}/segments endpoint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we had agreed to change this to the "auto-registration" workflow described here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change is that you POST to get the list of upload URLs (instead of GET). As you say in that doc, POSTing the entire segment creates a bottleneck, and I don't think you can prevent the client sending the body in order to redirect them to the object store - you'd always need the bandwidth to receive at least some of the segment.
This way you still have to register the segments, but the act of requesting storage URLs is allowed to have side effects, like triggering a process that deletes segments that weren't registered

README.md Outdated

Grains are grouped into Flow Segments, containing for example one second of content, wrapped in a container format such as MPEG-TS. The store provides a mechanism to upload and register new segments, and an interface to request all the segments covering a particular timerange and their download URLs; an approach inspired by chunked streaming protocols like HTTP Live Streaming.

Segments may be stored separately from the metadata linking them to a position on the timeline, separating the metadata and data planes. For example our implementation uses an object store (_e.g._ AWS S3) for the segments and avoids proxying them through our servers, taking advantage of the scalability of cloud object storage. Segments are also de-coupled from their point in the timeline by a link between their `<flow_id, timestamp>` tuple and the underlying object ID, so a single segment can appear at multiple points in multiple flows. This allows for copy-on-write semantics: immutability means a new Flow must be created to make changes to existing parts of the timeline, but for unmodified portions of the timeline that is a metadata operation, and the original Flow's segments are re-used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"avoids proxying them through our servers" takes a bit of unpacking. How about "...for the segments, passing S3 URLs to the client to upload directly, taking advantage..."

The final clause, after the colon, is subtly mangled. (What does "that" refer to in "that is a metadata operation?) I suggest
When a new flow contains material that already exists in the store, the new <flow_id, timestamp> tuple points to the existing object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - I've tweaked this in f69917e as well

Co-authored-by: Andrew Gibb <andrew.gibb@bbc.co.uk>

BBC R&D have demonstrated use of the TAMS approach as part of composable, software-defined workflows which can run in the cloud, on-premise or in a hybrid environment. We've built a prototype implementation of this API, along with services for movement and transformation of streams and files, which serve as the media backend for other projects such as our [remote wildlife camera](https://www.bbc.co.uk/rd/blog/2022-04-video-cloud-media-store-ingest-service) work. For more detail on the architecture and motivation see **TODO: Link to blog**.
BBC R&D have demonstrated use of the TAMS approach as part of composable, software-defined workflows which can run in the cloud, on-premise or in a hybrid environment.
We've built a prototype implementation of this API, along with services for movement and transformation of streams and files, which serve as the media backend for other projects such as our [remote wildlife camera](https://www.bbc.co.uk/rd/blog/2022-04-video-cloud-media-store-ingest-service) work.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One sentence per line 🎉

Copy link
Contributor

@AndrewGibb AndrewGibb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One important typo - the heading. And a couple of sentence-per-lines you missed. Otherwise LGTM

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
samdbmg and others added 2 commits September 14, 2023 19:05
Co-authored-by: Andrew Gibb <andrew.gibb@bbc.co.uk>
Co-authored-by: Callum Lunn <callum.lunn@bbc.co.uk>
Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk>
Co-authored-by: James Sandford <james.sandford@bbc.co.uk>
Co-authored-by: Robert Wadge <robert.wadge@bbc.co.uk>
@samdbmg samdbmg merged commit 3cb29eb into main Sep 14, 2023
2 checks passed
@samdbmg samdbmg deleted the sammg-add-api branch September 14, 2023 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants