Requirements when distributing audiobooks #12

Closed
HadrienGardeur opened this issue Aug 19, 2019 · 9 comments

@HadrienGardeur

Context

I’ve recently started a new position as Director of R&D at De Marque, an aggregator and digital distributor connected to major ebook and audiobook retailers.

Since we’re receiving audiobooks from major publishers in Canada, France, Spain and Italy, the audiobook spec is very relevant for us as it might become the standard format that we request from them.

Given the lack of a standard format for distributing audiobooks, it felt relevant to explore what various retailers expect to receive when we deliver them an audiobook and figure out what might be different and/or missing for the proposed spec.

Documentation

Manifest Examples

Apple (XML)
<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://apple.com/itunes/importer" version="music5.3">
    <language>en</language>
    <provider>AppleseedBooks</provider>
    <album>
        <album_type>audiobook</album_type>
        <vendor_id>9781106701657</vendor_id>
        <title>Dracula (Unabridged)</title>
        <original_release_date>2009-01-05</original_release_date>
        <label_name>Apple Publishing Group</label_name>
        <genres>
            <genre code="CLASSICS-00"/>
        </genres>
        <copyright_pline>2008 Apple Publishing Group</copyright_pline>
        <copyright_cline>2008 Apple Publishing Group</copyright_cline>
        <artwork_files>
            <file>
                <file_name>cover.jpg</file_name>
                <size>56723</size>
                <checksum type="md5">58a9947e2e5de47bc3039092964ad3a3</checksum>
            </file>
        </artwork_files>
        <description>Dracula is the seminal gothic horror novel of its time as Bram Stoker introduced the world to the legendary vampire Count Dracula. Published in 1897 and told through a series of diary entries and letters, the story journeys into the dark world of Count Dracula through the eyes of several different narrators. The novel explores many themes, the role of women in Victorian culture, conventional and conservative sexuality, immigration, colonialism, post colonialism and folklore. Irish author Abraham "Bram" Stoker (1847 - 1912) was a writer of novels and short stories. He was also the personal assistant of the actor Henry Irving and the business manager of the Lyceum Theatre in London, which Irving owned.</description>
        <products>
            <product>
                <territory>AU</territory>
                <wholesale_price_tier>3</wholesale_price_tier>
                <sales_start_date>2009-01-05</sales_start_date>
                <cleared_for_sale>true</cleared_for_sale>
            </product>
            <product>
                <territory>GB</territory>
                <wholesale_price_tier>3</wholesale_price_tier>
                <sales_start_date>2009-01-05</sales_start_date>
                <cleared_for_sale>true</cleared_for_sale>
            </product>
        </products>
        <artists>
            <artist>
                <artist_name>Bram Stoker</artist_name>
                <apple_id>2683478</apple_id>
                <roles>
                    <role>Author</role>
                </roles>
                <primary>true</primary>
            </artist>
            <artist>
                <artist_name>Christopher Saul</artist_name>
                <apple_id>301336965</apple_id>
                <roles>
                    <role>Narrator</role>
                </roles>
                <primary>false</primary>
            </artist>
        </artists>
        <tracks>
            <track>
                <type>audiobook</type>
                <vendor_id>9781106701657_1</vendor_id>
                <title>Dracula Track 1 (Unabridged)</title>
                <label_name>Apple Publishing Group</label_name>
                <explicit_content>none</explicit_content>
                <track_number>1</track_number>
                <audio_file>
                    <file_name>9781106701657_1.wav</file_name>
                    <size>172149800</size>
                    <checksum type="md5">2e669877c1913f59c6686a86b4d84d1d</checksum>
                </audio_file>
                <audio_language>en</audio_language>
                <preview_start_index>240</preview_start_index>
                <artists>
                    <artist>
                        <artist_name>Bram Stoker</artist_name>
                        <apple_id>2683478</apple_id>
                        <roles>
                            <role>Author</role>
                        </roles>
                        <primary>true</primary>
                    </artist>
                    <artist>
                        <artist_name>Christopher Saul</artist_name>
                        <apple_id>301336965</apple_id>
                        <roles>
                            <role>Narrator</role>
                        </roles>
                        <primary>false</primary>
                    </artist>
                </artists>
                <chapters>
                    <chapter>
                        <chapter_start_time>00:00:00.000</chapter_start_time>
                        <chapter_title>Chapter 1 - Jonathan Harker’s Journal</chapter_title>
                    </chapter>
                    <chapter>
                        <chapter_start_time>02:00:08.567</chapter_start_time>
                        <chapter_title>Chapter 2 - Jonathan Harker’s Journal Continued</chapter_title>
                    </chapter>
                    <chapter>
                        <chapter_start_time>03:59:40.321</chapter_start_time>
                        <chapter_title>Chapter 3 - Jonathan Harker’s Journal Continued</chapter_title>
                    </chapter>
                    <!-- additional chapters here as needed -->
                </chapters>
            </track>
            <!-- additional tracks here as needed -->
        </tracks>
    </album>
</package>
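
As a rough sketch of how the per-track chapter list in the Apple package could be read as a flat TOC, the snippet below parses a trimmed-down copy of the XML above with Python's standard library. The `to_seconds` and `flat_toc` helpers and the output field names (`title`, `file`, `offset`) are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the <package> element of the Apple example.
NS = {"a": "http://apple.com/itunes/importer"}

# Trimmed-down copy of the Apple package (one track, two chapters).
SAMPLE = """
<package xmlns="http://apple.com/itunes/importer" version="music5.3">
  <album>
    <tracks>
      <track>
        <audio_file><file_name>9781106701657_1.wav</file_name></audio_file>
        <chapters>
          <chapter>
            <chapter_start_time>00:00:00.000</chapter_start_time>
            <chapter_title>Chapter 1</chapter_title>
          </chapter>
          <chapter>
            <chapter_start_time>02:00:08.567</chapter_start_time>
            <chapter_title>Chapter 2</chapter_title>
          </chapter>
        </chapters>
      </track>
    </tracks>
  </album>
</package>
""".strip()


def to_seconds(timestamp: str) -> float:
    """Convert an HH:MM:SS.mmm timestamp into seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def flat_toc(xml_text: str) -> list:
    """Return one flat entry per chapter, tied to the track's audio file."""
    root = ET.fromstring(xml_text)
    entries = []
    for track in root.findall("a:album/a:tracks/a:track", NS):
        file_name = track.findtext("a:audio_file/a:file_name", namespaces=NS)
        for chapter in track.findall("a:chapters/a:chapter", NS):
            start = chapter.findtext("a:chapter_start_time", namespaces=NS)
            entries.append({
                "title": chapter.findtext("a:chapter_title", namespaces=NS),
                "file": file_name,
                "offset": to_seconds(start),
            })
    return entries


for entry in flat_toc(SAMPLE):
    print(entry)
```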
Kobo (JSON)
{
  "manifest_version": 1,
  "file_list": [
    {
      "duration": 15, 
      "media_type": "audio/mpeg", 
      "file_name": "01-somefilename.mp3", 
      "file_order_id": 0
    }, 
    {
      "duration": 60, 
      "media_type": "audio/mpeg", 
      "file_name": "02-anotherfilename.mp3", 
      "file_order_id": 1
    }, 
    {
      "duration": 200, 
      "media_type": "audio/mpeg", 
      "file_name": "doesnt need-to-be-in-filename-order.mp3", 
      "file_order_id": 2
    }, 
    {
      "duration": 30, 
      "media_type": "audio/mpeg", 
      "file_name": "lastchapter.mp3", 
      "file_order_id": 3
    }
  ], 
  "table_of_contents": [
    {
      "title": "Introduction", 
      "file_order_id": 0, 
      "offset": 0
    }, 
    {
      "title": "1. We hear you", 
      "file_order_id": 1, 
      "offset": 0
    }, 
    {
      "title": "2. Another chapter", 
      "file_order_id": 2, 
      "offset": 0
    }, 
    {
      "title": "3. The End", 
      "file_order_id": 3, 
      "offset": 0
    }
  ]
}
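
To illustrate the ID/IDref relationship in the Kobo manifest above, here is a minimal sketch that resolves each TOC entry to its file through `file_order_id` and computes where it starts in the whole book. It assumes that `duration` and `offset` are both expressed in seconds, which the example does not state; `resolve_toc` and the `absolute_start` field are hypothetical names.

```python
# Same shape as the Kobo example; durations are assumed to be seconds.
MANIFEST = {
    "manifest_version": 1,
    "file_list": [
        {"duration": 15, "media_type": "audio/mpeg",
         "file_name": "01-somefilename.mp3", "file_order_id": 0},
        {"duration": 60, "media_type": "audio/mpeg",
         "file_name": "02-anotherfilename.mp3", "file_order_id": 1},
        {"duration": 200, "media_type": "audio/mpeg",
         "file_name": "doesnt need-to-be-in-filename-order.mp3", "file_order_id": 2},
        {"duration": 30, "media_type": "audio/mpeg",
         "file_name": "lastchapter.mp3", "file_order_id": 3},
    ],
    "table_of_contents": [
        {"title": "Introduction", "file_order_id": 0, "offset": 0},
        {"title": "1. We hear you", "file_order_id": 1, "offset": 0},
        {"title": "2. Another chapter", "file_order_id": 2, "offset": 0},
        {"title": "3. The End", "file_order_id": 3, "offset": 0},
    ],
}


def resolve_toc(manifest: dict) -> list:
    """Tie each TOC entry to its file and to a position in the whole book."""
    files = {f["file_order_id"]: f for f in manifest["file_list"]}

    # Reading order comes from file_order_id, not from the file names.
    starts = {}
    elapsed = 0
    for order_id in sorted(files):
        starts[order_id] = elapsed
        elapsed += files[order_id]["duration"]

    resolved = []
    for entry in manifest["table_of_contents"]:
        order_id = entry["file_order_id"]
        resolved.append({
            "title": entry["title"],
            "file_name": files[order_id]["file_name"],
            "absolute_start": starts[order_id] + entry["offset"],
        })
    return resolved


for item in resolve_toc(MANIFEST):
    print(item)
```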

Notes

  • A number of retailers do not have the concept of a manifest and rely on naming conventions or alpha order in a ZIP/folder instead.
  • For those retailers, the TOC is tied to how an audiobook is broken down into various audio resources (for example Audible requires that: "Each file must contain only one chapter or section").
  • Apple requires content producers to concatenate audio resources into a single file; it only allows multiple files if the audiobook is longer than 23 hours.
  • On the other end of the spectrum, Audible requires each audio resource to be no longer than 120 minutes, while Kobo requires them to be 200 MB or less.
  • Apple and Kobo support an explicit TOC, defined directly in their manifest with a flat structure where each entry in the TOC is tied to a resource (indirectly through ID/IDref for Kobo).
  • Across retailers, there seems to be a preference for CBR (Constant Bit Rate) over VBR (Variable Bit Rate) for MP3 and M4A/AAC.
  • Indicating whether an audiobook is abridged or unabridged seems to be important metadata.
  • Same thing for explicit content.
  • Apple can support track/resource-level metadata to indicate contributors, while other retailers seem to pull this information from the files themselves.
  • Supplemental materials are explicitly supported by a number of retailers, including Apple (which can support booklets per track/resource).
  • Covers are expected to be square (Apple) or will be converted to a square (Google or Kobo).
  • Samples can either be provided as separate files (Audible or Kobo) or described through sample-specific metadata (length in % or minutes for Google, timestamp where the sample begins for Apple).
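
To make the conflicting splitting rules in these notes concrete, here is a small sketch that checks a list of audio resources against the three limits mentioned (Audible's 120 minutes per resource, Kobo's 200 MB per resource, Apple's single file unless the book exceeds 23 hours). The `AudioResource` type and `split_problems` function are hypothetical, and the Kobo figure is interpreted as decimal megabytes, which is an assumption.

```python
from dataclasses import dataclass

AUDIBLE_MAX_SECONDS = 120 * 60            # per resource
KOBO_MAX_BYTES = 200 * 1000 * 1000        # per resource, assuming decimal MB
APPLE_MULTI_FILE_MIN_SECONDS = 23 * 3600  # multiple files only above this


@dataclass
class AudioResource:
    name: str
    duration_s: float
    size_bytes: int


def split_problems(resources: list) -> list:
    """Return (resource name, retailer) pairs for every rule that is broken."""
    problems = []
    for res in resources:
        if res.duration_s > AUDIBLE_MAX_SECONDS:
            problems.append((res.name, "Audible"))
        if res.size_bytes > KOBO_MAX_BYTES:
            problems.append((res.name, "Kobo"))

    total = sum(res.duration_s for res in resources)
    if len(resources) > 1 and total <= APPLE_MULTI_FILE_MIN_SECONDS:
        problems.append(("(whole book)", "Apple"))
    return problems


# A single three-hour file suits Apple but is too long for Audible.
single = [AudioResource("book.m4a", 3 * 3600, 150_000_000)]
print(split_problems(single))

# One file per chapter avoids the Audible limit but is not what Apple expects.
chapters = [
    AudioResource("ch1.mp3", 3600, 50_000_000),
    AudioResource("ch2.mp3", 3600, 250_000_000),
]
print(split_problems(chapters))
```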

Closing Remarks

  • This issue is most likely incomplete and/or partially incorrect, so feel free to chime in (cc @wareid, @GarthConboy, @geoffjukes and others).
  • I wish we had started our work on an interchange format with such a document. IMO this is very helpful to fully understand the situation in the market.
  • We should probably add support for indicating if an audiobook is abridged/unabridged or if it contains explicit material directly in the specification instead of a best practice document.
  • Our current support for the TOC (HTML with a nested structure) won't work with any of the retailers listed above. I don't know if we want to re-open that box, but a flat structure in JSON would be better aligned with what's currently requested.
  • Resource-level metadata seems to be useful, if not mandatory. This might be something worth exploring in a best practice document.
  • Requirements for how an audiobook is broken into multiple audio resources are all over the place; it's currently impossible to create something that works for everyone.
  • There's a lack of consistency regarding support for samples and supplemental material. We would probably need to explore things a bit more before anything can be done about them.
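
As an illustration of the gap between a nested TOC and the flat structure that retailers request, here is a minimal sketch that flattens a nested list of entries in document order. The input shape (`title`, `url`, `children`) is hypothetical and is not the HTML TOC defined in the specification; `flatten_toc` is likewise an illustrative name.

```python
def flatten_toc(entries: list, depth: int = 0) -> list:
    """Walk a nested TOC depth-first and return a flat list of entries."""
    flat = []
    for entry in entries:
        flat.append({
            "title": entry["title"],
            "url": entry["url"],
            "depth": depth,  # kept so the nesting is not lost entirely
        })
        flat.extend(flatten_toc(entry.get("children", []), depth + 1))
    return flat


nested = [
    {"title": "Part 1", "url": "part1.mp3", "children": [
        {"title": "Chapter 1", "url": "part1.mp3#t=30"},
        {"title": "Chapter 2", "url": "chapter2.mp3"},
    ]},
    {"title": "Part 2", "url": "part2.mp3"},
]

for row in flatten_toc(nested):
    print(row)
```

Keeping `depth` is a design choice: a retailer that only accepts a flat list can ignore it, while the original hierarchy can still be rebuilt from it.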
@wareid
Contributor

wareid commented Aug 19, 2019

(First off: Congrats on the new job!)

I think the current draft of the spec and the requirements you list above are close. I agree we should include the abridged/unabridged and explicit content options, which I am happy to discuss in the meeting next week; I don't think they'll be contentious.

I have been thinking about the TOC question again. I'm reluctant to open that particular Pandora's box, but I see both sides of the argument and I've talked to people here about it too. Both options are possible and I would be open to discussing the pros/cons in a world of publication manifest. I'll add it to the agenda as well.

Would like to hear options about samples, I think we have covered supplemental material as best we will manage. I don't want to be prescriptive about it in version 1.

@HadrienGardeur
Author

> Would like to hear options about samples [...]

I've seen the following use cases in the various retailer requirements:

  • a separate sample file
  • pointing to the beginning of a sample
  • length of the sample expressed as % or duration

It's worth pointing out that what Apple does (pointing to the beginning of a sample) and what Google does (defining the length of the sample) would work better together than separately.

Separate sample file

This could be handled through a reference in either links or resources that would be properly identified (using a specific rel value).

"resources": [
  {
    "rel": "preview",
    "url": "sample.mp3",
    "encodingFormat": "audio/mpeg"
  }
]

Pointing to the beginning of a sample

This could be handled through a reference in either links or resources and re-use the start rel value or define our own.

"links": [
  {
    "rel": "start",
    "url": "chapter1.mp3#t=78",
    "encodingFormat": "audio/mpeg"
  }
]

Length/duration of a sample

I think this has been suggested by Google before (ping @GarthConboy) as additional metadata, but I can't find where this was suggested (I remember a separate Google presentation, for example).
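
To show how a start position and a length could work together, here is a rough sketch that reads the start from a `#t=` fragment such as the one in `chapter1.mp3#t=78` and combines it with a length given in seconds or as a percentage. Only the plain-seconds form of the fragment is handled, the percentage is taken relative to the duration passed in (an assumption), and `fragment_start` and `preview_window` are hypothetical helper names.

```python
from urllib.parse import urlsplit


def fragment_start(url: str) -> float:
    """Return the start time from a '#t=<seconds>' fragment, or 0.0 if absent."""
    fragment = urlsplit(url).fragment
    for part in fragment.split("&"):
        name, _, value = part.partition("=")
        if name == "t":
            start = value.split(",")[0]  # ignore an optional end time
            return float(start) if start else 0.0
    return 0.0


def preview_window(url, duration_s, length_s=None, length_pct=None):
    """Combine a start fragment with a length to get (start, end) in seconds."""
    if length_s is None and length_pct is None:
        raise ValueError("either length_s or length_pct is required")
    start = fragment_start(url)
    if length_s is None:
        length_s = duration_s * length_pct / 100
    # Never run past the end of the audio.
    end = min(start + length_s, duration_s)
    return start, end


print(preview_window("chapter1.mp3#t=78", 600, length_s=120))
print(preview_window("chapter1.mp3#t=78", 600, length_pct=10))
```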

@GarthConboy
Contributor

The Preview PR has been relocated to the Publishing Manifest document here

@geoffjukes

geoffjukes commented Aug 19, 2019

@HadrienGardeur It is important to bear in mind that the term 'sample' is used to describe 2 different things, with 2 different purposes.

The first (most commonly supplied as a separate file) is a marketing resource, effectively unrelated to the audiobook as it ships. Samples are often created before the audio is available i.e. it is more akin to a movie trailer than a sample of the book. This sample may change over the lifetime of the book, at the whims of the marketing department.

The second is a snippet of audio extracted directly from the book. This is more akin to a preview.

Both of these can be subject to strict contractual obligations from the author or publisher, and both can exist simultaneously.

It is my understanding that Google use the 'sample' (more appropriately a 'preview') to facilitate contextual audio samples based on search results, leveraging the publisher's 'you can play X seconds of this book as a sample'. It is worth emphasizing that not all publisher/author contracts allow for this type of sample (but it is becoming more common).

The term 'sample' has an established meaning in the industry as the marketing resource, and it is already managed via ONIX deliveries (along with the cover image) and I do not think it appropriate to embed it in the distribution package at all.

If we focus on the term preview (and Google's use of it) then I think that it is perfectly appropriate to embed it into the distribution.

@HadrienGardeur
Author

> Both of these can be subject to strict contractual obligations from the author or publisher, and both can exist simultaneously.

@geoffjukes could you explain their respective use cases when they both exist simultaneously?

I get your point about "trailer" (basically replaces the description or a back cover) vs "preview" (which lets me sample what the experience will be) but I would imagine that most of the time they're used the same way by a retailer.

In previous discussions, we've talked about "authored samples" vs "generated samples". My guess is that a retailer should prioritize an "authored sample" over a generated one, but it doesn't hurt to offer at least some controls about the way these "generated samples" are calculated.

> The term 'sample' has an established meaning in the industry as the marketing resource, and it is already managed via ONIX deliveries (along with the cover image) and I do not think it appropriate to embed it in the distribution package at all.

"Authored samples" and covers can definitely be delivered using ONIX, but it's still common to have file naming conventions for covers (this is true for both EPUB and audiobooks).
It's worth discussing whether this is truly redundant with ONIX or if it serves a purpose.

"Generated samples" are usually created based on a default setting per retailer and/or contractual obligations (I've seen a lot of contracts with either 5%/10% or the first chapter, whichever is smaller). If fine-grained controls over the way they're generated (like Apple's or Google's) are useful, then ONIX can't easily fulfill that role.
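
The "percentage or first chapter, whichever is smaller" rule is simple enough to sketch. The function name and parameters below are hypothetical, and real contracts may well define the percentage against something other than total running time.

```python
def generated_sample_length(total_s: float, first_chapter_s: float,
                            percent: float = 10.0) -> float:
    """Length in seconds of a generated sample under a 'whichever is smaller' rule."""
    return min(total_s * percent / 100, first_chapter_s)


# A 10-hour book with a 30-minute first chapter: the chapter is the limit.
print(generated_sample_length(10 * 3600, 30 * 60, percent=10))

# Same book with a 2-hour first chapter and a 5% cap: the percentage is the limit.
print(generated_sample_length(10 * 3600, 2 * 3600, percent=5))
```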

@geoffjukes

> @geoffjukes could you explain their respective use cases when they both exist simultaneously?

Using your terms @HadrienGardeur - Prior to publishing, the "Authored Sample" will be the only one that exists, and is used to promote the book. After publishing, the "Authored sample" may (or may not) be updated any number of times (in practice, it is rarely updated after publication, unless the book is optioned for a movie or similar).

It is true to say that the "Authored Samples" are most often "generated", but it is not true to say that they are always generated, and that is the distinction. I find it useful to think of "Authored Samples" as "highly-abridged versions of the book, used for marketing and promotion only".

@iherman
Member

iherman commented Aug 26, 2019

This issue was discussed in a meeting.

  • No actions or resolutions
Transcript: Hadrien’s issues
Wendy Reid: See Audio issue #12
Wendy Reid: The next thing was to quickly make sure we’ve covered up any additional issues with LPF - Hadrien opened up an issue regarding a minor point of metadata, should we have a field for abridged/unabridged, and “does this contain adult content”
… I’ll write a PR for in the publication manifest, as they are both valid information for audiobooks or publications in general. Another is a mention of preview, which is already in manifest, but I’ll make a point of saying it’s included in audiobooks as well.
… Preview, abridged, and ‘contains explicit material’. Outside of Avneesh’s issue, this is the last to address before the next draft.
Ivan Herman: These will be added to publication manifest?
Wendy Reid: Yes. The preview stuff is already in there.
Ivan Herman: Are we sure those terms don’t exist in schema.org?
Wendy Reid: They do exist in schema.org - so those are the ones we need to use.
… we need to call them out as ones important to publications.
Ivan Herman: One practical thing, I know that Matt (who is out today) has made a pretty big rewrite and I presume he’ll do a PR today or tomorrow. It may be worth waiting for that to get through and done as those issues have already been accepted.
… otherwise we get into merge problems.
Wendy Reid: I’ll wait for that
Garth Conboy: I just wanted to reflect, the other issue that Hadrien raised in #12 - the HTML TOC, as we discussed historically and recently, we do not want to reopen this at this time. His point is that it didn’t map well to audio-ingest platforms. The discussion we had previously..
… was that most support epub, and this is a relic of that, so we want to stay with that HTML agreement, and if this crops up as an implementation issue, it can be re-evaluated, but at the time, we had some producer input as well that got on board with this.
… we are in OK shape until we’re at the implementation stage. If we’re wrong there, we can reopen
Wendy Reid: Any other questions about Audiobook or LPF work?

wareid pushed a commit that referenced this issue Sep 7, 2019
Final updates before TPAC! Including abridged and preview as requested in issue #12.

One minor order change to correct the position of Default Reading Order and Resources in the TOC.
@wareid
Contributor

wareid commented Sep 7, 2019

Just an update here, I've added abridged and preview to the specification.

We are still discussing isFamilyFriendly for publication manifest, as there are some implications for interpretation there. I think it's worth including but not required right away, and we should look at options for how best to include it.

wareid closed this as completed Sep 16, 2019
@iherman
Member

iherman commented Sep 25, 2019

This issue was discussed in a meeting.

  • RESOLVED: Close Audiobooks Issue #12, abridged and preview have been added to the specification, and isFamilyFriendly be moved to Publication Manifest
Transcript:
Wendy Reid: #12
Wendy Reid: there was a suggestion to include 3 things
… abridged value
… preview
… how to include a preview
… and the last one, which will get a new issue
… the isFamilyFriendly flag
… schema doesn’t have a MPAA-type rating
… there isn’t a standard
… the retailers tend to decide
… kobo ask publishers to identify, and then verify
… it depends on jurisdiction
… i think the flag is worth having
… it’s a terrible name, but schema doesn’t have a better one
… I’d like to close this issue, and then move the isFamilyFriendly to pub manifest
Proposed resolution: Close Audiobooks Issue #12, abridged and preview have been added to the specification, and isFamilyFriendly be moved to Publication Manifest (Wendy Reid)
Brady Duga: +1
Charles LaPierre: +1
Wendy Reid: +1
Romain Deltour: +1
Rachel Comerford: when we say move to publication manifest, we’re really just saying that we’re going to open an issue
Wendy Reid: yes, just opening an issue
Garth Conboy: +1 (but don’t think we should do such a thing in pub manifest, as I think it’s intractable)
Resolution #5: Close Audiobooks Issue #12, abridged and preview have been added to the specification, and isFamilyFriendly be moved to Publication Manifest
Rachel Comerford: +1
Dave Cramer: Plus one
