a video can both be a TVClip and a VideoObject, vocab is conflicting (and a few more details) #2017

Closed
Meteor0id opened this Issue Jul 22, 2018 · 10 comments

5 participants
@Meteor0id
Contributor

commented Jul 22, 2018

Say I write an article about some topic. Included in this article are a few fragments of news broadcasts, both TV broadcasts and radio broadcasts.

It is unclear how these should be marked up. I could mark them up as VideoObject and AudioObject, or I could mark them up as TVClip and RadioClip.

A TVClip or RadioClip can have a property "video" or "audio" with, respectively, the itemtype "VideoObject" and "AudioObject", but this makes little semantic sense.

Both the video and audio fragments are presented with a caption (which is accounted for in VideoObject but not in AudioObject).

Furthermore, I cannot mark up encodings for the available sources, as they are already marked up with the itemprop contentUrl:

<source itemprop="contentUrl" src="horse.ogg" type="audio/ogg">
<source itemprop="contentUrl" src="horse.mp3" type="audio/mpeg">
<source itemprop="contentUrl" src="horse.wav" type="audio/wav">

Just venting some steam here:
The Schema.org vocab seems like a total mess to me. I don't like it. But I hope I can assist in making it better, rather than leaving it be. For an organisation trying to mark up their website, things like these are reasons not to mark up any further; the vocab has just not matured enough at this time (in my opinion).

@Meteor0id

Contributor Author

commented Jul 22, 2018

related issue: #1934

@thadguidry

Contributor

commented Jul 22, 2018

@Meteor0id It depends on what you are trying to say about the Clips' content. What data do you have about them or the content that is contained in the Clips? How are they related to your article? How is their content related to your article? Do they simply mention the same subject about which your article is written, or are there more things to say about those Clips?

Most folks just use http://schema.org/associatedMedia on the Article and perhaps https://schema.org/associatedArticle on the MediaObject. But again, it depends completely on what you are trying to say about the Article and any Media related to the Article, and WHAT those relations are. Describe the relations and we can probably help you further.
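
As a rough illustration of that pairing, here is a minimal sketch that builds the markup as JSON-LD from Python dicts. The URLs, headline and clip name are hypothetical placeholders, and JSON-LD is used here only because it is easy to generate; the same properties could be expressed as microdata.

```python
import json

# Hypothetical media object; names and URLs are placeholders for illustration.
video = {
    "@type": "VideoObject",
    "name": "Interview fragment",
    "contentUrl": "https://example.org/media/interview.mp4",
}

# associatedMedia links the article to the media it embeds.
article = {
    "@context": "https://schema.org",
    "@type": "NewsArticle",
    "@id": "https://example.org/articles/topic",
    "headline": "Background on the topic",
    "associatedMedia": [video],
}

# associatedArticle is the reverse link, from the media object to the article.
video["associatedArticle"] = {"@id": article["@id"]}

print(json.dumps(article, indent=2))
```

Whether both directions are needed depends on what a given consumer reads; the thread does not settle that.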

@Meteor0id

Contributor Author

commented Jul 22, 2018

My issue with it is that I need your help to explain this to me. If the vocab made good sense, I would be able to tell what type I need.

I am going to try to describe the details; hopefully it will lead to clarifications in the vocab for this use case.

The video and audio fragments are in some cases a description of a certain topic, summing up the events leading up to the status quo, and introducing the issue at hand.
On other occasions they are news fragments such as an interview with a politician.

The sources of these clips are various news agencies. They are almost always fragments of only a few minutes. They are always related to a certain topic which has been in the news (with impact on society).

The purpose of having these clips on our site is to provide citations/quotations of what exactly a politician has said about this particular matter, or to explain or sum up the issue at hand. It is always informative in nature.

The clips provide background to a news article about a certain topic.

The reason they need to be marked up so well is that we want people to land on our page providing background to the topic whenever the topic becomes the subject of public debate or is mentioned in the news. We must make sure people will find our content easily, and we want them to find the entry point to the subject which fits the knowledge they already have on the topic (meaning someone doing a simple search on the topic will land on an introduction page, while someone searching for the exact words of the prime minister on this topic will land on the video fragment rather than the introduction page). For news content and background content like this, SEO is of utmost importance. We put our faith in structured data, in hopes it will enable us to provide the right entry point to an article for nearly anyone looking for information on this topic.

I am not pleased that schema.org leaves uncertainty about how to mark up certain things, and marking up video and audio is just another example of this. In my mind schema.org is meant to promote a simple and clear structure on how content should be presented and grouped, yet in this case it seems to add confusion, and add a need for extra wrappers.

Because these clips originate from news shows, they are TVClips or RadioClips. They have a publisher (the NewsMediaOrganization). The date on which the fragment was originally broadcast is relevant because it allows one to question in what context the news was reported at that time.

Some videos might even be a compilation of multiple news broadcasts on a certain topic, summing up the topic and showing the extent to which the news item had impacted society at that time.

To provide visitors of our website with the context they need, we add a caption below the video or audio fragment, in which we indicate what broadcast this video contains (for example "CNN live reportage on school shooting in 2017").
The video or audio does not need a name, title, or description at this time, but we could provide one of each through a meta element.

To make sure anyone can use our site, we offer the same video or audio fragment through several source elements, with different encodings. We want to mark up the contentUrl of course, so the video or audio can be indexed; however, this introduces the risk of duplicate markup. (Video formats are webm, mp4, mkv and ogg; audio formats are ogg, mp3 and wav.)

We also include some subtitle tracks on some videos.

We are using a content management system on the site; I should probably mention that.

That sums it up.
For anyone wondering where else we are having difficulties with schema.org vocab, check my issue reports here on GitHub. Some relate to cross-webpage marking of content. Some relate to the difference between social media posts, quotations and citations (a very similar case of confusing vocab, if you ask me).
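
For the several-encodings case described above, one possible pattern is to describe the clip once and list each file as a separate media object under schema.org's `encoding` property, each with its own `contentUrl` and `encodingFormat`. This is an assumption offered as a sketch, not something confirmed in this thread; the clip name, base URL and the `radio_clip` helper are hypothetical, and the file names simply echo the `<source>` elements in the original question.

```python
import json

# File names and MIME types taken from the <source> elements in the question.
SOURCES = [
    ("horse.ogg", "audio/ogg"),
    ("horse.mp3", "audio/mpeg"),
    ("horse.wav", "audio/wav"),
]


def radio_clip(name, base_url, sources):
    """Build JSON-LD for one RadioClip with one AudioObject per encoding.

    Hypothetical helper: the clip is described once, and each file becomes
    an entry under `encoding` instead of a repeated top-level contentUrl.
    """
    return {
        "@context": "https://schema.org",
        "@type": "RadioClip",
        "name": name,
        "encoding": [
            {
                "@type": "AudioObject",
                "contentUrl": base_url + filename,
                "encodingFormat": mime,
            }
            for filename, mime in sources
        ],
    }


clip = radio_clip("Radio news fragment", "https://example.org/media/", SOURCES)
print(json.dumps(clip, indent=2))
```

How a particular search engine treats several encodings of one clip is not addressed here, so this should be checked against that consumer's own documentation.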

@danbri

Contributor

commented Jul 25, 2018

You're right, it could be more usable.

The underlying issue is that with any representational system, as soon as you get beyond a certain trivial scale, you have multiple plausible ways of saying similar things, and none is obviously the best. Our approach has been to make sure that we provide a decent-sized supporting vocabulary, against which the features of specific products, search engines and so on can be defined. Ultimately it is those products, features and applications which get to express their more specific information needs, using Schema.org vocabulary.

At Google we document our use of Schema.org vocabulary in a site that explains more specifically how to use Schema.org in a way that plugs directly into things we've built. Different companies, or open source projects or whatever, very properly might make different data patterns on top of the basic raw descriptive materials that Schema.org supplies. Although there is certainly scope to chip away at the confusingness of schema.org, ultimately we will run into this basic situation: there will always be several possible ways of describing a situation. None is intrinsically the best; it will depend on what problem you're trying to solve, what data, tools and resources you have, and what system, software or applications are involved. Languages like ShEx and SHACL can help to characterize these "data shapes", but they're relatively new.

@thadguidry

Contributor

commented Jul 26, 2018

@Meteor0id Since SEO is your priority, you might want to take advantage of professionals that have been using Schema.org with SEO expertise. I can tell you that your best bet is just a simple "about":"something" wrapped around your video fragments.

I think this issue can be closed and moved over to the mailing list, or you can post questions to Semantic Search Marketing.

@vholland

Contributor

commented Feb 8, 2019

I was about to open an issue for a similar concern about clips.

Perhaps the most expedient thing is to extend the ranges of http://schema.org/video and http://schema.org/audio to include http://schema.org/Clip. If the Clip needs to be related to a longer MediaObject, we can use http://schema.org/isPartOf.
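
A minimal sketch of what that proposal would allow, written as JSON-LD built from Python dicts: an article whose `video` value is a TVClip directly, with `isPartOf` pointing at the longer recording. The `clip_in_article` helper, the headline, the names and the URLs are all hypothetical.

```python
import json


def clip_in_article(headline, clip_name, clip_url, broadcast_url):
    """Build JSON-LD for an Article whose `video` value is a TVClip.

    Hypothetical helper: the clip is linked to the longer broadcast
    recording through isPartOf, as suggested in the comment above.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "video": {
            "@type": "TVClip",
            "name": clip_name,
            "url": clip_url,
            "isPartOf": {
                "@type": "VideoObject",
                "contentUrl": broadcast_url,
            },
        },
    }


doc = clip_in_article(
    "Background on the topic",
    "Prime minister interview fragment",
    "https://example.org/clips/pm-interview",
    "https://example.org/media/full-broadcast.mp4",
)
print(json.dumps(doc, indent=2))
```

The clip carries a plain `url` here because `contentUrl` is defined on MediaObject, not on Clip.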

@danbri

Contributor

commented Mar 13, 2019

Thanks all. Vicki's last suggestion makes sense; is there more to do, or does that address the main concerns here?

@danbri

Contributor

commented Mar 20, 2019

OK, I'll go with implementing that last suggestion at least.

@danbri

Contributor

commented Apr 1, 2019

I have implemented that, plus tweaks to "caption" so that it applies to AudioObject, and also to allow for downloadable media-typed caption formats, since "Text" as the only option was awkwardly vague.

danbri added a commit that referenced this issue Apr 1, 2019

Added Clip to expected values of video, audio properties.
Also amended "caption" to be expected on AudioObject, and
documented a pattern for describing downloadable captions and their
media type using another MediaObject with an encodingFormat.
For #2017
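
The caption pattern described in this commit can be sketched roughly as follows, again as JSON-LD built from a Python dict. The file URLs and the `text/vtt` media type are illustrative assumptions, not values taken from the commit.

```python
import json

# An AudioObject whose caption is a downloadable file instead of plain text.
# URLs and the WebVTT media type are placeholders chosen for illustration.
audio = {
    "@context": "https://schema.org",
    "@type": "AudioObject",
    "contentUrl": "https://example.org/media/horse.mp3",
    "encodingFormat": "audio/mpeg",
    "caption": {
        "@type": "MediaObject",
        "contentUrl": "https://example.org/media/horse.vtt",
        "encodingFormat": "text/vtt",
    },
}

print(json.dumps(audio, indent=2))
```

A short visible caption could still be given as plain text; the nested MediaObject is only needed when the caption is a separate downloadable file.
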
@RichardWallis

Contributor

commented Apr 10, 2019

Implemented in release 3.5
