Add initial API #1

samdbmg · 2023-09-13T14:36:15Z

Details

Adds current state of API from Squirrel (specifically https://github.com/bbc/rd-cloudfit-squirrel-media-store/tree/f701200)
Adds a README describing TAMS
Switches build process to use public images, and adds a basic CI process
Adds contributing guidelines, ICLA, licence file etc.

Pivotal Story

Story URL: https://www.pivotaltracker.com/story/show/185968237

Related PRs

N/A

Links to external test runs/working deployment

N/A

Submitter PR Checks

(tick as appropriate)

Reviewer PR Checks

(tick as appropriate)

PR completes task/fixes bug
Tests exercise code appropriately
Design makes sense, and fits with our current code base
Code is easy to follow
PR size is sensible
Commit history is sensible and tidy

Info on Cloudfit PRs

The checks above are guidelines. They don't all have to be ticked, but they should all have been considered.
For more details on how to asses PRs, see: https://github.com/bbc/rd-apmm-docs-ways-of-working/blob/master/workflow/PRChecklist.md

Fills out the README file to describe more about TAMS. sem-ver: feature

Adds API definitions from https://github.com/bbc/rd-cloudfit-squirrel-media-store/tree/f701200 Co-authored-by: Andrew Gibb <andrew.gibb@bbc.co.uk> Co-authored-by: Callum Lunn <callum.lunn@bbc.co.uk> Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk> Co-authored-by: James Sandford <james.sandford@bbc.co.uk> Co-authored-by: Robert Wadge <robert.wadge@bbc.co.uk>

Adds an option to allow using the API without any credentials, for testing purposes

philipnbbc

I found the README clear and concise. A couple suggestions inline and otherwise LGTM.

api/Dockerfile.multi

api/TimeAddressableMediaStore.yaml

README.md

api/TimeAddressableMediaStore.yaml

README.md

Also adds help text to Make commands

Adds a basic GHA-based CI workflow that renders documentation and uploads it

Also updates CONTRIBUTING.md to note that some parts of the template may not apply.

Applies various suggestions from code review Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk>

README.md

Adds a note that the schema for Flows is closely related to the AMWA IS-04 schema. Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk>

AndrewGibb

A few changes here and there.

As for diagrams, I think the prose we have here is a good starting point. It is concise and enough to get folk interested. We needn't add extra work with diagrams right this second.

We don't have anything ready to go, because we haven't done anything in this context before. But I suspect there will be a diagram or two that seem obvious next week. We should add them then.

AndrewGibb · 2023-09-14T09:02:39Z

README.md

+A mock API server will start at <http://localhost:4010>
+
+## Design
+The store handles Flows (based on the Flows in the [AMWA NMOS MS-04 model](https://specs.amwa.tv/ms-04/releases/v1.0.0/docs/2.1._Summary_and_Definitions.html)), which exist on an infinite timeline, are immutable and can be grouped by Sources (from the same document). Each grain of media (_e.g._ frame of video or audio samples) in a Flow can be uniquely addressed by a `<flow_id, timestamp>` tuple (or more commonly a flow ID and timerange). Because the store is immutable, a flow ID and timerange can be used to pass media by reference, only referring back to the store to retrieve the actual media essence when it needs to be transformed or rendered to a screen or speaker. 


"from the same document", whilst not ambiguous, does bump the reader out of their flow.
Suggest you change bracketed link to a footnote, and add the same footnote link to bloth "flow" and "source". You may want to say "NMOS flow[1]" of "flow[1] (the NMOS term)" or something like that.

In this case, I think adding the oxford comma after immutable in "...infinite timeline, are immutable and can be grouped..." probably helps the reader.

You could break "(or more commonly a flow ID and timerange)" into a new sentence: "A timerange and flow ID refers to a sequence of grains."

Replace the final sentence with:
This unique address for each grain is powerful - since it is guaranteed to refer to a specific frame, or audio samples, it can be safely passed around other tools or programs. At any time the unique address can be exchanged for the media data by an API call. But if that is not needed, media work can be done purely by reference.

And if you write markdown with a line break between sentences, the renderer will group them together into a paragraph without line breaks :D

Good suggestions here, thanks. I've ended up rewriting this a bit in f69917e: I've moved some of the points around to put Flows and Sources into the same link at the top (trying to avoid footnotes, since then all the links should arguably become footnotes?) and talk about timeranges before timestamps since they're the more useful one anyway. I've taken your words about unique addresses directly.

Yes, for something this long new line = new sentence might have been the way to go 😄

(trying to avoid footnotes, since then all the links should arguably become footnotes?)
I disagree. A footnote is writing tool. A hyperlink can go anywhere.

README.md

AndrewGibb · 2023-09-14T09:05:54Z

api/TimeAddressableMediaStore.yaml

+#
+openapi: 3.1.0
+info:
+  title: Time Addressable Media Store


I still think "Time-Addressable" is more clearly descriptive of what's going on.

AndrewGibb · 2023-09-14T10:04:31Z

api/TimeAddressableMediaStore.yaml

+        This media store type provides HTTP URLs for uploading and downloading media objects in buckets.
+
+        The response will include a PUT URL for each storage location that a client uses to upload the flow segment
+        object. The client is expected to register the flow segment using the /flows/{flowId}/segments endpoint


I thought we had agreed to change this to the "auto-registration" workflow described here

The change is that you POST to get the list of upload URLs (instead of GET). As you say in that doc, POSTing the entire segment creates a bottleneck, and I don't think you can prevent the client sending the body in order to redirect them to the object store - you'd always need the bandwidth to receive at least some of the segment.
This way you still have to register the segments, but the act of requesting storage URLs is allowed to have side effects, like triggering a process that deletes segments that weren't registered

AndrewGibb · 2023-09-14T10:34:39Z

README.md

+
+Grains are grouped into Flow Segments, containing for example one second of content, wrapped in a container format such as MPEG-TS. The store provides a mechanism to upload and register new segments, and an interface to request all the segments covering a particular timerange and their download URLs; an approach inspired by chunked streaming protocols like HTTP Live Streaming.
+
+Segments may be stored separately from the metadata linking them to a position on the timeline, separating the metadata and data planes. For example our implementation uses an object store (_e.g._ AWS S3) for the segments and avoids proxying them through our servers, taking advantage of the scalability of cloud object storage. Segments are also de-coupled from their point in the timeline by a link between their `<flow_id, timestamp>` tuple and the underlying object ID, so a single segment can appear at multiple points in multiple flows. This allows for copy-on-write semantics: immutability means a new Flow must be created to make changes to existing parts of the timeline, but for unmodified portions of the timeline that is a metadata operation, and the original Flow's segments are re-used.


"avoids proxying them through our servers" takes a bit of unpacking. How about "...for the segments, passing S3 URLs to the client to upload directly, taking advantage..."

The final clause, after the colon, is subtly mangled. (What does "that" refer to in "that is a metadata operation?) I suggest
When a new flow contains material that already exists in the store, the new <flow_id, timestamp> tuple points to the existing object.

Thanks - I've tweaked this in f69917e as well

Co-authored-by: Andrew Gibb <andrew.gibb@bbc.co.uk>

AndrewGibb · 2023-09-14T17:07:42Z

README.md


-BBC R&D have demonstrated use of the TAMS approach as part of composable, software-defined workflows which can run in the cloud, on-premise or in a hybrid environment. We've built a prototype implementation of this API, along with services for movement and transformation of streams and files, which serve as the media backend for other projects such as our [remote wildlife camera](https://www.bbc.co.uk/rd/blog/2022-04-video-cloud-media-store-ingest-service) work. For more detail on the architecture and motivation see **TODO: Link to blog**.
+BBC R&D have demonstrated use of the TAMS approach as part of composable, software-defined workflows which can run in the cloud, on-premise or in a hybrid environment.
+We've built a prototype implementation of this API, along with services for movement and transformation of streams and files, which serve as the media backend for other projects such as our [remote wildlife camera](https://www.bbc.co.uk/rd/blog/2022-04-video-cloud-media-store-ingest-service) work.


One sentence per line 🎉

AndrewGibb

One important typo - the heading. And a couple of sentence-per-lines you missed. Otherwise LGTM

README.md

Co-authored-by: Andrew Gibb <andrew.gibb@bbc.co.uk> Co-authored-by: Callum Lunn <callum.lunn@bbc.co.uk> Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk> Co-authored-by: James Sandford <james.sandford@bbc.co.uk> Co-authored-by: Robert Wadge <robert.wadge@bbc.co.uk>

samdbmg and others added 5 commits September 13, 2023 15:25

Add README describing TAMS

37a3057

Fills out the README file to describe more about TAMS. sem-ver: feature

Use public images for build process

adb019f

Allow no credentials

f81f57f

Adds an option to allow using the API without any credentials, for testing purposes

Add CONTRIBUTING document and ICLA

331875f

samdbmg requested a review from a team September 13, 2023 14:36

philipnbbc approved these changes Sep 13, 2023

View reviewed changes

samdbmg and others added 7 commits September 14, 2023 09:40

Add licence and copyright notice

c3e2aa0

Add top-level Makefile

25f8960

Also adds help text to Make commands

Add CI workflow

fcd3713

Adds a basic GHA-based CI workflow that renders documentation and uploads it

Add PR template

bd844d1

Also updates CONTRIBUTING.md to note that some parts of the template may not apply.

README: Add links to rendered docs

588bec3

Remove redundant oaslint layer from build

00dcc34

Tidy up wording in descriptions

b3c06f1

Applies various suggestions from code review Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk>

samdbmg force-pushed the sammg-add-api branch from ffcd5dd to b3c06f1 Compare September 14, 2023 08:40

samdbmg added 2 commits September 14, 2023 11:12

Add URLs to link back to repo

005de60

Format link for rendered documentation

f4aa6d4

philipnbbc reviewed Sep 14, 2023

View reviewed changes

README.md Outdated Show resolved Hide resolved

samdbmg mentioned this pull request Sep 14, 2023

Improve CI pipeline #2

Merged

18 tasks

Highlight close relationship with IS-04

745efe6

Adds a note that the schema for Flows is closely related to the AMWA IS-04 schema. Co-authored-by: Philip de Nier <philip.denier@bbc.co.uk>

samdbmg force-pushed the sammg-add-api branch from 2908ae1 to 745efe6 Compare September 14, 2023 10:36

AndrewGibb reviewed Sep 14, 2023

View reviewed changes

Break up Design section, remove ambiguity

f69917e

Co-authored-by: Andrew Gibb <andrew.gibb@bbc.co.uk>

samdbmg force-pushed the sammg-add-api branch from e8a3c7e to f69917e Compare September 14, 2023 11:15

samdbmg mentioned this pull request Sep 14, 2023

Rework example with real links #3

Merged

18 tasks

AndrewGibb reviewed Sep 14, 2023

View reviewed changes

AndrewGibb approved these changes Sep 14, 2023

View reviewed changes

README.md Outdated Show resolved Hide resolved

README.md Outdated Show resolved Hide resolved

README.md Outdated Show resolved Hide resolved

samdbmg and others added 2 commits September 14, 2023 19:05

Unify preferred style of name

a94383d

samdbmg force-pushed the sammg-add-api branch from adfd049 to a94383d Compare September 14, 2023 19:34

samdbmg merged commit 3cb29eb into main Sep 14, 2023
2 checks passed

samdbmg deleted the sammg-add-api branch September 14, 2023 19:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add initial API #1

Add initial API #1

samdbmg commented Sep 13, 2023

philipnbbc left a comment

AndrewGibb left a comment

AndrewGibb Sep 14, 2023

samdbmg Sep 14, 2023

AndrewGibb Sep 14, 2023

AndrewGibb Sep 14, 2023

AndrewGibb Sep 14, 2023

samdbmg Sep 14, 2023

AndrewGibb Sep 14, 2023

samdbmg Sep 14, 2023

AndrewGibb Sep 14, 2023

AndrewGibb left a comment


		Grains are grouped into Flow Segments, containing for example one second of content, wrapped in a container format such as MPEG-TS. The store provides a mechanism to upload and register new segments, and an interface to request all the segments covering a particular timerange and their download URLs; an approach inspired by chunked streaming protocols like HTTP Live Streaming.

		Segments may be stored separately from the metadata linking them to a position on the timeline, separating the metadata and data planes. For example our implementation uses an object store (_e.g._ AWS S3) for the segments and avoids proxying them through our servers, taking advantage of the scalability of cloud object storage. Segments are also de-coupled from their point in the timeline by a link between their `<flow_id, timestamp>` tuple and the underlying object ID, so a single segment can appear at multiple points in multiple flows. This allows for copy-on-write semantics: immutability means a new Flow must be created to make changes to existing parts of the timeline, but for unmodified portions of the timeline that is a metadata operation, and the original Flow's segments are re-used.

Add initial API #1

Add initial API #1

Conversation

samdbmg commented Sep 13, 2023

Details

Pivotal Story

Related PRs

Links to external test runs/working deployment

Submitter PR Checks

Reviewer PR Checks

Info on Cloudfit PRs

philipnbbc left a comment

Choose a reason for hiding this comment

AndrewGibb left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AndrewGibb left a comment

Choose a reason for hiding this comment