New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add minimalist sdPublisher and sdDatePublished [and sdLicense] markup for when structured data created elsewhere #1886

Closed
danbri opened this Issue Apr 10, 2018 · 12 comments

Comments

Projects
None yet
5 participants
@danbri
Contributor

danbri commented Apr 10, 2018

Provide a simple markup convention to indicate structured data markup source (specifically publisher, date) when it differs from the main content being marked up, for example when someone is creating structured data that summarizes the structure of content managed and published by others.

Example: a Columbia journalism student can annotate a fact-checking article by PolitiFact on the Web using the ClaimReview markup. Because this markup is not created by PolitiFact, its creator needs to be clearly specified and distinct from the ClaimReview author: the latter is the actual author of the fact check and the former is merely someone we put the fact check into structured data format.

Scope: note that it is not a goal to represent detailed provenance information for each piece of structured data; other more complex / expressive standards and techniques exist for this, e.g. PROV or named graphs. Human inspection may be needed to tell exactly which pieces of data were added by the structured data publisher, versus derived systematically from the original content source.

This construct is proposed based on experience at Google with ClaimReview markup, with general Schema.org handling, and with various collaborative markup projects.

Vocabulary:

  1. New property sdPublisher (on CreativeWork)
    "Indicates the party responsible for generating and publishing the current structured data markup, typically in cases where the structured data is derived automatically from existing published content but published on a different site. For example, student projects and open data initiatives often re-publish existing content with more explicitly structured metadata. The
    [[sdPublisher]] property helps make such practices more explicit."

  2. New property sdDatePublished (on CreativeWork)
    "Indicates the date on which the current structured data was generated / published. Typically used alongside [[sdPublisher]]."

{
  "@context": "http://schema.org",
  "@type": "ClaimReview",
  "datePublished": "2014-07-23",
  "url": "http://www.politifact.com/texas/statements/2014/ 
          jul/23/rick-perry/rick-perry-claim-about- 
          3000-homicides-illegal-immi/",
  "sdPublisher": {
    "@type": "Person",
    "name": "Joe Student",
    "affiliation": {
      "@type": "Organization",
	"name": "School of Journalism"
    }
  },
  "sdDatePublished": "2018-04-04",
  "author": {
    "@type": "Organization",
    "url": "http://www.politifact.com/",
    "sameAs": "https://twitter.com/politifact"
  },
  "claimReviewed": "More than 3,000 homicides were committed by 
                    \"illegal aliens\" over the past six years.",
  "reviewRating": {
    "@type": "Rating",
    "ratingValue": 1,
    "bestRating": 6,
    "alternateName": "True"
  },
  "itemReviewed": {
    "@type": "CreativeWork",
    "author": {
      "@type": "Person",
      "name": "Rich Perry",
      "jobTitle": "Former Governor of Texas"
    },
    "datePublished": "2014-07-17",
    "name": "The St. Petersburg Times interview [...]"
  }
}
</script>
@thadguidry

This comment has been minimized.

thadguidry commented Apr 10, 2018

I like the simplicity of the prefix you chose.
mu = markup , does not ring as clear against "Schema Dot" as does sd = "structured data" : )
bonus meta-metadata properties when we need : )

@cong-yu

This comment has been minimized.

cong-yu commented Apr 10, 2018

Looks good to me!

@cong-yu

This comment has been minimized.

cong-yu commented Apr 13, 2018

@danbri How about we add another field called sdLicense so the markup creator can dictate how the markup can be reused?

@rvguha

This comment has been minimized.

Contributor

rvguha commented Apr 15, 2018

@danbri

This comment has been minimized.

Contributor

danbri commented Apr 18, 2018

I don't see a coherent design yet, but let's try. I think what you're getting at is that parties who are making explicit SD from implicitly/partially organized or natural language content published by others, might want typically to indicate that their contribution to the SD isn't intended to add encumbrances. I can imagine an sdLicense property doing this, but we need to be careful that it doesn't also try to do the vastly harder job of saying which aspects of the SD are to be associated with the sdPublisher vs the original publisher.

e.g. how does this look?

sdLicense: "a license document indicated by an sdPublisher regarding their contribution to the current structured data. Typically used alongside sdPublisher and sdPublicationDate."

@stuartasutton

This comment has been minimized.

stuartasutton commented Apr 18, 2018

Dan, do you see any boundaries to application of these SD properties? E.g., does it go as far as an org, date and license for structured data created by a 3rd party for describing a credential, a bibliographic description?

@danbri danbri changed the title from Add minimalist sdPublisher and sdDatePublished markup for when structured data created elsewhere to Add minimalist sdPublisher and sdDatePublished [and sdLicense] markup for when structured data created elsewhere Apr 18, 2018

@danbri

This comment has been minimized.

Contributor

danbri commented Apr 18, 2018

@stuartasutton thinking about it a bit more, I've revised my view and it is clearer and cleaner for sdLicense to be simply the license for the SD; whatever that means may need case-by-case consideration in some contexts, but it makes things harder to frame it in terms of additional contributions. This might mean that in some cases it may be difficult to say what the sdLicense is, without some investigation, but that's the nature of the territory.

@danbri

This comment has been minimized.

Contributor

danbri commented Apr 18, 2018

I suggest instead that we stick as close as possible to existing 'license' which says:

A license document that applies to this content, typically indicated by URL.

i.e.

A license document that applies to this structured data, typically indicated by URL.

... and attach it to CreativeWork for now, but we should think through having it applicable everywhere.

@rvguha

This comment has been minimized.

Contributor

rvguha commented Apr 19, 2018

@danbri

This comment has been minimized.

Contributor

danbri commented Apr 19, 2018

Implemented in 53f0a45

This is going into pending, we will want to take care to get implementor feedback before progressing it further.

@danbri

This comment has been minimized.

Contributor

danbri commented Apr 19, 2018

@danbri

This comment has been minimized.

Contributor

danbri commented Jul 11, 2018

published in v3.4 pending; in #1891

I'll close this issue. We can open new ones to track status or revisions later.

@danbri danbri closed this Jul 11, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment