Add a ratingExplanation property on Rating (and highlight factcheck / ClaimReview usecase) #2300

Closed
danbri opened this issue Jul 10, 2019 · 7 comments

@danbri (Contributor) commented Jul 10, 2019

It is sometimes useful for publishers of ratings to provide a brief supporting explanation, giving context and background to the rating decision. This arises in particular for fact checking, which uses the ClaimReview type (a subtype of Review) and an associated Rating which is often but not always numeric.

  • a new Property named "ratingExplanation", expected on Rating, with Text value
  • definition: "A short explanation (e.g. one to two sentences) providing background context, facts and information that led to the conclusion expressed in the rating."

This idea was discussed in the fact-checking community during the Tech & Check and Global Fact events in 2019, and is proposed here as something to be experimentally implemented, so it would initially go into the Pending area.

danbri self-assigned this Jul 10, 2019
danbri added a commit that referenced this issue Jul 10, 2019

@danbri (Contributor, Author) commented Jul 10, 2019

draft staged on webschemas for review: https://webschemas.org/ratingExplanation

"A short explanation (e.g. one to two sentences) providing background context and other information that led to the conclusion expressed in the rating. This is particularly applicable to ratings associated with "fact check" markup using ClaimReview."

An example scenario from the Duke Reporters' Lab Squash fact checker, using a PolitiFact fact check:

  • claim text: “In the last two years, ICE officers made 266,000 arrests of aliens with…”
  • rating text: “Inflates the numbers.”
  • so the “context/explanation” (i.e. ratingExplanation) for the rating: “The numbers need context. Yes, during the last two years ICE officials arrested 266,000 people. But ICE notes that each arrest ‘may represent multiple criminal charges and convictions,’ thus duplicating criminal activity and inflating the numbers”

The suggestion is that this new field will give a place for such information to be made machine-readably accessible.
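
As a rough illustration of how the example above might look as markup, here is a minimal Python sketch that builds a ClaimReview as JSON-LD. Only `ratingExplanation` comes from this proposal; carrying the rating text in `alternateName` and the overall shape of the object are assumptions made for illustration, and the claim text is left truncated exactly as quoted above.

```python
import json

# Hypothetical ClaimReview markup for the PolitiFact example above.
# ratingExplanation is the property proposed in this issue; using
# alternateName for the textual rating is an assumption for illustration.
claim_review = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    # Claim text is truncated in the source thread and is kept as-is.
    "claimReviewed": "In the last two years, ICE officers made 266,000 arrests of aliens with…",
    "reviewRating": {
        "@type": "Rating",
        "alternateName": "Inflates the numbers.",
        "ratingExplanation": (
            "The numbers need context. Yes, during the last two years ICE "
            "officials arrested 266,000 people. But ICE notes that each arrest "
            "‘may represent multiple criminal charges and convictions,’ thus "
            "duplicating criminal activity and inflating the numbers"
        ),
    },
}

# Serialize without escaping non-ASCII so the curly quotes stay readable.
markup = json.dumps(claim_review, indent=2, ensure_ascii=False)
print(markup)
```

A consumer could then read the explanation from `reviewRating.ratingExplanation` alongside the rating itself.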

@danbri (Contributor, Author) commented Aug 1, 2019

This is live in today's Schema.org 3.9 release. Let's keep this ticket open for discussion / feedback.

@DukeReportersLab commented Aug 15, 2019

We proposed this idea at the Duke Tech & Check conference, so it's great to see it now open for feedback.

We work with the fact-checking organizations that will be using this field, so we are aware of the demands on the people who will fill out this field. We also are developing apps that use ClaimReview, so we're aware of its value and the importance that it be concise and clear.

We will be doing some initial testing in the Reporters' Lab, using previously published fact-checks, to find the best format for the content of this field. Should the ratingExplanation be limited to a certain number of words? Should it be one sentence? Two sentences? We'll post the results of our tests here.

@cguess commented Aug 16, 2019

@danbri The only suggestion I have at this point is that the documentation indicate whether or not there is a technical limit on the number of characters. The current description says:

"A short explanation (e.g. one to two sentences)"

However, if there's a technical limit it would be good to document that, and to do the same if there is not.

If there is no technical limit (though I imagine that there is, simply due to the database type used internally), we should probably discuss whether specifications should indicate a maximum limit. It will help give marching orders to other implementations of this proposal so they follow suit with Google's version or vice versa.

My vote is on this limit being slightly outrageously large, like 5000 characters or something. This way users can decide on how they want to have it, while also indicating to other organizations what they should at least support technically. With the current language I could easily see a project manager who speaks German saying "oh, one or two sentences, that's like 20 words," capping it arbitrarily, and breaking when they try to scrape an organization that is publishing in English or French.

@sens3 commented Aug 19, 2019

I don't believe schema.org ever imposes length limits, since that's up to the consumers to figure out. Dan, correct me if I'm wrong.

Thus, I'd argue against imposing any length limits. If they were needed, they would be needed for all fields, not just this one; i.e. length limits for JSON parsers, database columns, etc. apply to all markup fields.

On a practical note, for both JSON parsers and DB columns we should be safe if strings are not longer than a couple million characters :)

@danbri (Contributor, Author) commented Aug 19, 2019

On the number of characters: generally we avoid that kind of rigidity in the official Schema.org definitions, although of course all practical consuming applications will need to make some assumptions. I think language along the lines of "one or two sentences" is about where we want to be in terms of specificity. Whether that is the right length is a separate question, and we'll need more deployment experience to be sure.

Character and word limits would vary a fair bit between languages anyway, e.g. you can pack more into a few Chinese characters, and even English to Spanish can inflate the character count by ~25%. (https://www.andiamo.co.uk/resources/expansion-and-contraction-factors/ has some more details.)

What we could say, as an informal community guideline, is "we expect 5000 characters to be very much more than enough". This is more or less what we do at Google for dataset markup:

Textual property recommendations
We recommend limiting all textual properties to 5000 characters or less. Google Dataset Search only uses the first 5000 characters of any textual property. Names and titles are typically a few words or a short sentence.
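
To make the consumer-side reading of this guideline concrete, here is a minimal Python sketch of how a consuming application might keep only the first 5000 characters of every textual value, including `ratingExplanation`. The function name, the constant, and the choice to count Python characters (code points) rather than bytes are all assumptions for illustration; this is not how Google or any other consumer is known to implement it.

```python
from typing import Any

# Assumed consumer-side cap, mirroring the 5000-character guideline quoted above.
MAX_TEXT_CHARS = 5000


def truncate_text_values(node: Any, limit: int = MAX_TEXT_CHARS) -> Any:
    """Return a copy of parsed JSON-LD with every string value cut to `limit` characters."""
    if isinstance(node, str):
        return node[:limit]
    if isinstance(node, dict):
        return {key: truncate_text_values(value, limit) for key, value in node.items()}
    if isinstance(node, list):
        return [truncate_text_values(item, limit) for item in node]
    # Numbers, booleans and None pass through unchanged.
    return node


# Hypothetical over-long explanation, used only to exercise the cap.
rating = {"@type": "Rating", "ratingExplanation": "x" * 6000}
trimmed = truncate_text_values(rating)
print(len(trimmed["ratingExplanation"]))  # 5000
```

Because the cap is applied by the consumer, publishers remain free to write as much as they need, which matches the point made earlier in the thread that Schema.org itself does not impose length limits.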

@RichardWallis (Contributor) commented Jul 2, 2020

Implemented in V3.9
