Information Verification Taxonomies #4

jmatsushita · 2014-08-18T16:14:10Z

There is a Google group called Information Verification Ecosystem. I've added you to it. We had this discussion about quads (#2) with @johngriffin from Atchai when we worked on NewsVerify (demo site) with Internews.

Just doing a brain-dump copy-paste into this issue; let's see later how we split these into different documents and different issues.

Use Cases

These were the long-term use cases we were looking at:

One fact-checker from a single organisation can add his pieces of evidence to an event. They can be viewed and added to by his colleagues before being published.
Fact-checkers from different organisations could choose to publish data only to each other and allow editors to check aggregated fact-checking information before deciding to publish. Organisations could have machine reasoning rules that process publicly available verification metadata to make decisions, or recommend actions to editors.
Publishers or bloggers could opt in by enabling a verification workflow option in their CMS and publish the metadata publicly.
Web platforms could allow the crowd to add their own contribution to the evidence base around an event (a la Andy Carvin or Paul Lewis).
Web platforms could crawl the web to try and identify web posts about similar events (pivoting on something like Freebase?) and display or reason on verification metadata.
We should be able to take care of a workflow such as the one I believe Meedan Checkdesk is envisioning, with the audience providing new evidence (which could be linked with the generic attribute hasVerificationComponent) which would be added, leaving the choice to the original author to change the verification status of the piece.
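
As a rough illustration of the "machine reasoning rules" use case above, here is a minimal sketch of a rule that turns the statuses published by several organisations into a recommendation for an editor. The function name `recommend`, the thresholds and the wording of the recommendations are all assumptions made for this example; only the four status terms come from the taxonomy in this issue.

```python
from collections import Counter

# Status terms taken from the "Verification status" taxonomy in this issue.
STATUSES = {"verified", "corroborated", "unverified", "debunked"}


def recommend(statuses_by_org: dict[str, str]) -> str:
    """Hypothetical rule: map per-organisation statuses to an editor hint."""
    unknown = set(statuses_by_org.values()) - STATUSES
    if unknown:
        raise ValueError(f"unknown status terms: {sorted(unknown)}")

    counts = Counter(statuses_by_org.values())
    # Assumed policy: a single "debunked" is enough to hold publication.
    if counts["debunked"]:
        return "hold: at least one organisation debunked this"
    # Assumed policy: two independent "verified" statuses are enough to publish.
    if counts["verified"] >= 2:
        return "publish: verified by two or more organisations"
    return "review: not enough independent verification yet"


print(recommend({"org-a": "verified", "org-b": "corroborated"}))
```

The rule only recommends; the decision stays with the editor, as in the use case.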

Taxonomies

And these are the design choices we had made at the time on the taxonomy design side of things (from this paragraph):

Verification status

The verification status is an editable taxonomy. It communicates how reliable/trustworthy/verified a piece of news is. The terms in use are:

verified : the news has been verified to a high standard and can be considered a true and accurate account
corroborated : the news has been cross-checked and corroborated but there are doubts or outstanding facts
unverified : the news has not been verified by someone other than the source
debunked : the news has been revealed as false or misleading
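
A minimal sketch of how these four terms could be represented in code. The names `VerificationStatus` and `parse_status` are hypothetical, and since the taxonomy is meant to be editable, a real system would more likely load the terms from configuration than hard-code them as below.

```python
from enum import Enum


class VerificationStatus(Enum):
    """The four status terms listed above (hard-coded for illustration)."""
    VERIFIED = "verified"
    CORROBORATED = "corroborated"
    UNVERIFIED = "unverified"
    DEBUNKED = "debunked"


def parse_status(term: str) -> VerificationStatus:
    """Normalise a user-supplied term; raises ValueError for unknown terms."""
    return VerificationStatus(term.strip().lower())


print(parse_status(" Verified "))
```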

Verification category

There are three verification categories : where, when and what happened

Where : where the event happened, GPS coordinates and a text-based description
When : when the event happened, a single date and time or a to/from date
What happened : is a description of what happened. This description is also used as the teaser for an event.

The platform may develop to support more verification categories or to visualise the data collected under each verification category.
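
The three categories could be sketched as simple record types. This is only an illustration: the class and field names are assumptions, and the validation rules (coordinate ranges, end not before start) are not stated in the original design.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Where:
    latitude: float    # GPS coordinates
    longitude: float
    description: str   # text-based description of the place

    def __post_init__(self) -> None:
        if not (-90 <= self.latitude <= 90 and -180 <= self.longitude <= 180):
            raise ValueError("coordinates out of range")


@dataclass
class When:
    start: datetime
    end: Optional[datetime] = None  # None means a single date and time

    def __post_init__(self) -> None:
        if self.end is not None and self.end < self.start:
            raise ValueError("end precedes start")


@dataclass
class WhatHappened:
    description: str

    @property
    def teaser(self) -> str:
        # The description doubles as the event teaser.
        return self.description


when = When(datetime(2014, 8, 18, 16, 14))
print(when.end is None)
```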

Source(s)

A source is the same as the content type Author. An Author is the author of an original piece of evidence. Journalism traditionally calls these people 'Sources'. An author is not an authenticated user creating content.

Data Model

Event : something that happened.

- Related Content (pieces of content that are presumably linked to the event, including other events and evidences themselves)
- hasVerificationComponents (a generic relation allowing different organisations to add their own verification components or to deal with unclassified verification components like simple comments)
  - hasSource (the link to the presumed original author of the content)
    - Links to a Source
    - hasEvidence : evidence A, evidence B,...
  - hasDate (the presumed date[s] when the content was authored/captured)
    - Links to a Date
    - hasEvidence
  - hasLocation (the presumed location[s] where the content was authored/captured)
    - Links to a Location
    - hasEvidence
  - hasContext (the presumed actions/protagonists/results/causes/... that are associated with the event)
    - Links to Context (can be various types)
    - hasEvidence

Evidence (some data corroborating or invalidating a particular claim)

- Author (the person submitting the evidence)
- Source (the original source or provider of the evidence, including automated platforms)
- Data (the evidence itself as text, multimedia data, a link,...)
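
To make the shape of this model concrete, here is a minimal sketch using plain Python dataclasses. The class names follow the terms above, but the field names, the use of strings for linked values, and the `evidence_for` helper are assumptions made for illustration, not part of the original design.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    author: str  # the person submitting the evidence
    source: str  # original source or provider, including automated platforms
    data: str    # text, a link, or a reference to multimedia data


@dataclass
class VerificationComponent:
    relation: str  # e.g. "hasSource", "hasDate", "hasLocation", "hasContext"
    value: str     # what the relation links to (kept as a string here)
    evidence: list[Evidence] = field(default_factory=list)


@dataclass
class Event:
    description: str
    related_content: list[str] = field(default_factory=list)
    components: list[VerificationComponent] = field(default_factory=list)

    def evidence_for(self, relation: str) -> list[Evidence]:
        """Collect all evidence attached to components of one relation."""
        return [e for c in self.components
                if c.relation == relation
                for e in c.evidence]


photo = Evidence(author="editor-1", source="example photo feed",
                 data="http://example.org/photo/1")
event = Event(
    description="Something happened",
    components=[VerificationComponent("hasDate", "2014-08-18", [photo])],
)
print(len(event.evidence_for("hasDate")))
```

Keeping `relation` as a free string mirrors the intent of a generic relation that organisations can extend with their own component types.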

jmatsushita · 2014-08-18T16:15:28Z

@johngriffin had commented on the data model:

Event

- An event links together a collection of evidence. This evidence will support or deny that something happened, in some place, at some time. These direct properties of the event (what happened, when, where) appear to be inferred (either by machine or by human) from its linked evidence, by ascribing a truth/trust value to each piece of evidence. Correct?
- It should be possible for each verificationComponent to have multiple members. E.g. some evidence may corroborate everything else, but suggest that the date was 2 years ago. In which case there should be two dates linked from hasDate, with respective evidence linked to each?
- Each organisation can assign its own truth value to an event itself (verified / disproved etc.). This leads me to think that the direct properties of an event (what happened, when, where) should be explicit, rather than inferred, so that each organisation's truth value is being assigned to the same assertion: this happened, in this place, at this time.
- Should an event have a verificationComponent called "Source"? Is this not what the author of each piece of evidence is?
- For relating events, perhaps something like SKOS could be used to give some basic meaning to the relationships.
- "Context": for me this term doesn't really describe what it contains, "the presumed actions/protagonists/results/causes/... that are associated with the event". To me at least, context sounds more like the background circumstances, related but not central to the event.
- Context, in the sense that it is currently used, presents some real problems for exchanging this data and ensuring we're all talking about the same thing. There are established vocabularies to represent things like people (foaf), but the facts of an event could be just about anything, e.g. x said y; z bombed x. To represent these "facts", that each need to be checked and verified, in something like RDF we'd need to invent terms that don't exist in established vocabs. While this is not a problem within a single system, it becomes a problem when data is to be exchanged between systems and organisations.
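
Two of the points above (several candidate dates under hasDate, and each organisation assigning its own truth value to the same explicit assertion) can be sketched together. Everything here is hypothetical naming: `Assertion`, `AssessedEvent`, the evidence identifiers and the organisation keys are illustrative assumptions only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Assertion:
    """The explicit claim: this happened, at this time, in this place."""
    what: str
    when: str
    where: str


@dataclass
class AssessedEvent:
    assertion: Assertion
    # Candidate date -> identifiers of the evidence supporting that date.
    date_claims: dict[str, list[str]] = field(default_factory=dict)
    # Organisation -> truth value it assigns to the same assertion.
    status_by_org: dict[str, str] = field(default_factory=dict)

    def add_date_evidence(self, date: str, evidence_id: str) -> None:
        self.date_claims.setdefault(date, []).append(evidence_id)

    def disputed_dates(self) -> list[str]:
        """Dates backed by some evidence that differ from the asserted date."""
        return sorted(d for d in self.date_claims if d != self.assertion.when)


ev = AssessedEvent(Assertion("a protest took place", "2014-08-18", "Cairo"))
ev.add_date_evidence("2014-08-18", "evidence-a")
ev.add_date_evidence("2012-08-18", "evidence-b")  # suggests it was 2 years ago
ev.status_by_org["org-a"] = "verified"
ev.status_by_org["org-b"] = "unverified"
print(ev.disputed_dates())
```

Because the assertion is explicit and immutable, both organisations are rating the same claim even though the evidence disagrees on the date.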

Semantic Modelling

This is quite a nice simple event ontology that could be used as an
alternative to the Freebase one: http://purl.org/NET/c4dm/event.owl# -
it's used in the BBC programmes ontology for example. From a publishing
point of view, there may be more of an advantage to maintaining
compatibility with the schema.org event. Any compatibility only seems
possible if we do explicitly state the date and place properties on an
event though, at least at time of publishing.
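
As a hedged sketch of what stating date and place explicitly might look like at publishing time, the snippet below serialises an event as N-Quads using schema.org terms (`Event`, `description`, `startDate`, `location`). The example IRIs, the helper names, and the idea of one named graph per publishing organisation are assumptions for illustration; it uses only string formatting, not an RDF library.

```python
SCHEMA = "http://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def literal(text: str, datatype: str = "") -> str:
    """Format an N-Quads literal, escaping backslashes, quotes and newlines."""
    escaped = (text.replace("\\", "\\\\").replace('"', '\\"')
                   .replace("\n", "\\n").replace("\r", "\\r"))
    suffix = f"^^<{datatype}>" if datatype else ""
    return f'"{escaped}"{suffix}'


def quad(subject: str, predicate: str, obj: str, graph: str) -> str:
    """Build one N-Quads line; `obj` must already be an <iri> or a literal."""
    return f"<{subject}> <{predicate}> {obj} <{graph}> ."


def event_quads(event_iri: str, graph_iri: str, description: str,
                start_date: str, place_name: str) -> list[str]:
    return [
        quad(event_iri, RDF_TYPE, f"<{SCHEMA}Event>", graph_iri),
        quad(event_iri, SCHEMA + "description", literal(description), graph_iri),
        quad(event_iri, SCHEMA + "startDate", literal(start_date, XSD_DATE), graph_iri),
        quad(event_iri, SCHEMA + "location", literal(place_name), graph_iri),
    ]


lines = event_quads(
    "http://example.org/event/1",      # hypothetical event IRI
    "http://example.org/graph/org-a",  # hypothetical per-organisation graph
    'Witness said "it collapsed"', "2014-08-18", "Cairo",
)
print("\n".join(lines))
```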

For those who are interested in the semantic modelling, here is a (slightly
old, but relevant) paper comparing existing Event ontologies:
http://oai.cwi.nl/oai/asset/14783/14783A.pdf

I don't claim to be an expert in semantic modelling, but I've done some
before and have sketched out some graphs based on what's currently here, so
I could potentially do a first cut when the time comes if there's nobody
else in the group.

I appreciate I've dived into technicalities here, so please feel free to
discuss more generally the requirements first.

jmatsushita added the Provenance label Aug 18, 2014

This was referenced Aug 18, 2014

Conventions for NQuads? #2

Open

Mixing RDF and Neo4J #3

Open
