Additional properties for ScholarlyArticle #1379

Open
Glamdring opened this Issue Sep 25, 2016 · 13 comments

Glamdring commented Sep 25, 2016

ScholarlyArticle doesn't seem to have any properties of its own defined at the moment, but some properties specific to scholarly articles would be good to have:

  • abstract (currently "about" is the closest thing, but not ideal)
  • articleType (research paper, meta-analysis, reproducing article, thesis, survey, commentary, report, lecture, etc.)
  • scienceBranch (computer science, mathematics, chemistry, etc.) - possibly "articleSection" can be used, but isn't an ideal match.
  • peerReview - currently the review property can be used, but it doesn't define additional properties for a review, like: status (accepted, acceptable with revisions, rejected), meetsScientificStandards, clarityOfBackground, significance, studyAndDesignMethods, noveltyOfConclusions, qualityOfPresentation, qualityOfDataAnalysis.

darobin Sep 26, 2016

Contributor

For abstract there is existing discussion (see #276), my preferred option is allowing CreativeWork for description.

The hard bit with article typing is picking a classification — there are many, and one could make the case that they all include crazy cases. The same would impact picking science branches. To me this indicates that both should be open worlds. I would contend that about is not necessarily a bad choice here.

Peer review is a universe on its own. I am not certain that a bag of properties would be sufficient to describe it (though in some cases they could certainly help). A graph of Actions is IMHO more appropriate — hopefully more on that soon!

danbri Sep 30, 2016

Contributor

I'm supportive of the idea of improving our coverage here, and broadly agree with @darobin on the specifics. If there are widely used, established conventions from elsewhere that could guide us (e.g. categories) that would help...

Glamdring Sep 30, 2016

"category" would be okay for the science branches, yes.

As for peer review - it's complicated indeed, but at least a base set of parameters may be a good start?

darobin Sep 30, 2016

Contributor

@Glamdring Maybe it's because I'm too close to the issue, but I don't know if a small set of descriptors for peer review would bring much. To give an example, here is the approach we use: http://api.science.ai/. When peer review is completed, it gives a full audit trail (and at every step it provides you with the next available actions). Obviously, that's a lot!

shaunmcdonald Jan 17, 2017

Hi all, this issue seemed like an appropriate place to comment on an issue we've recently run into with our JSON-LD for type ScholarlyArticle webpages. Please let me know if I should start a new issue.

It seems like the Google Testing Tool is validating them as AMP pages, which requires the "headline" and "image" properties (as opposed to non-AMP pages - the properties are only recommended). I say "seems like" as I'm in the process of confirming with Google, and it may turn out to be something else.

Still, headline is inherited from CreativeWork. This seems odd to me itself since it's "the most generic kind of creative work, including book...", but it makes even less sense on the ScholarlyArticle type, which is purely bibliographic...in that it's a BibEx type.

Am I missing something? I've looked through Issues here without seeing anything. I think the underlying problem is that search engine orgs adopting schema.org tend to be quite literal in their interpretations. For instance, AMP is very newsfeed oriented. They'll latch onto something that serves an immediate purpose and implement it in a way that, however unintentionally, conflates the use case.

I think the ScholarlyArticle type needs a simple "title" or "articleTitle" type, and I think it might be safest if it's inherited from the Article type.

danbri Jan 17, 2017

Contributor

@shaunmcdonald - while we can't handle Google-related details here, your point about the vocabulary is worth considering. Broadly "name" does the job of (bibliographic) "title" across schema.org. I'm not sure there would be value in adding yet another similar property.

Perhaps 'headline' ought to be pushed down onto more specific subtypes for which it is more applicable? Many articles naturally have headlines. Perhaps most scholarly articles do not, but that's ok, the property can be omitted if not applicable. Is the motivating issue here that you are trying to get some specific Google feature to work, or just that Google's Article validator is triggering on ScholarlyArticle too? Don't take the Google Structured Data Testing Tool's red "errors" too literally, they're often just errors in the context of some specific feature.

thadguidry Jan 17, 2017

@danbri NO do not push headline down. We already previously debated that many moons ago. BBC, etc.. :)

darobin Jan 17, 2017

Contributor

We are definitely happy with name for ScholarlyArticle. It seems pretty natural, and it's nice to reuse the same property for all CreativeWorks.

I think the Google tools tend to live in a parallel universe in which everything is a NewsArticle. I wish Google would fix that because it pressures a lot of people into pretending they are publishing news articles, but here isn't the right place to fix that :)
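
As a rough illustration of the point about name made in this comment, a ScholarlyArticle description could carry its bibliographic title in name and leave headline out entirely. The sketch below builds such a JSON-LD document in Python; the function name, the sample title, author and date are all made up for the example, and it is not taken from any participant's actual markup.

```python
import json


def scholarly_article_jsonld(title, authors, date_published):
    """Build a minimal schema.org ScholarlyArticle description as a dict.

    Hypothetical helper for illustration only.
    """
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        # The bibliographic title goes in "name"; "headline" is simply omitted.
        "name": title,
        "author": [{"@type": "Person", "name": a} for a in authors],
        "datePublished": date_published,
    }


# Placeholder values, not a real article.
doc = scholarly_article_jsonld("An Example Title", ["A. Author"], "2017-01-17")
print(json.dumps(doc, indent=2))
```

Whether a given consumer (such as a search engine's testing tool) accepts a document without headline depends on that consumer, not on the vocabulary itself.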

shaunmcdonald Jan 17, 2017

@danbri - thanks for the quick reply. You're right about the Testing Tool validation. In fact, I did map our article titles to the 'name' property, and the Tool is picking them up.

I appreciate thadguidry and darobin's comments, but I'm standing by the point that "headline" doesn't belong on ScholarlyArticle type. If your argument is that it does belong on the CreativeWork super class, then ScholarlyArticle shouldn't be a subclass of CreativeWork. Headlines are not bibliographic; they are marketing tools that can have very little to do with the content of the article.

I'm not even certain why "headline" would need to be pushed down as opposed to removed from ScholarlyArticle. The Article type doesn't appear to inherit "headline". Is that an oversight? Or should I be looking at source?

@darobin - Agree about Google 100%. I would not ask this group to address something they did. I apologize if I implied that in any way.

BTW - thank you all for engaging me on this. I appreciate your time. One way or another, this issue could end up costing my company a great deal of effort.

danbri Jan 17, 2017

Contributor

On that last point - "headline" is a property attached solely to the CreativeWork type. In the logic of schema.org, all ScholarlyArticles are Articles, and all Articles are CreativeWorks, ... so it is applicable to those subtypes too.
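
The inheritance logic described here can be sketched with a toy model: a property declared on a type applies to every subtype below it. The table below is a small hand-picked subset of types and properties chosen for illustration, and the TYPES structure and applicable_properties function are assumptions of this sketch, not how schema.org itself is implemented.

```python
# Toy model: each type names its parent and the properties declared
# directly on it. Only a small illustrative subset is listed.
TYPES = {
    "Thing": {"parent": None, "properties": {"name", "description"}},
    "CreativeWork": {"parent": "Thing", "properties": {"headline", "about", "genre"}},
    "Article": {"parent": "CreativeWork", "properties": {"articleBody", "articleSection"}},
    "ScholarlyArticle": {"parent": "Article", "properties": set()},
}


def applicable_properties(type_name):
    """Collect properties declared on a type and on all of its ancestors."""
    props = set()
    while type_name is not None:
        node = TYPES[type_name]
        props |= node["properties"]
        type_name = node["parent"]
    return props


# "headline" is declared only on CreativeWork, yet it is reachable
# from ScholarlyArticle by walking up the parent chain.
print(sorted(applicable_properties("ScholarlyArticle")))
```

In this model ScholarlyArticle declares nothing itself, which matches the observation that opened the issue, while still picking up everything from Article, CreativeWork and Thing.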

shaunmcdonald Jan 18, 2017

OK. It's just that I didn't see headline at https://schema.org/Article

darobin Jan 18, 2017

Contributor

Indeed it's on http://webschemas.org/Article but not https://schema.org/Article — that looks a lot like a bug.

@shaunmcdonald headline makes sense for a lot of CreativeWork types. Sometimes properties don't make that much sense on a given subclass, for instance it's disputable whether genre makes much sense on a ScholarlyArticle ("it's police-procedural homotopy theory!") or isFamilyFriendly for that matter ("Baby de Sitter space!"). Ontological perfection is not within the reach of a simple class hierarchy with a bag of properties. So the "trick" is to just not use properties that don't make sense to one's given use case.

mfenner Mar 1, 2017

Regarding peer review, and the difficulties describing it: one lightweight approach that also has important use cases would be to add additional dates to the existing dateCreated, datePublished and dateModified:

  • dateSubmitted
  • dateAccepted

These are common metadata for scholarly articles, and often publicly available, as you can for example see in this article.

scienceBranch is a big mess, and I basically agree with @darobin that it should be left open as there is no community standard. category and/or keywords would then be a good fit.

If I had to pick a classification for the sciences to be used in category, and would need to cover all disciplines, I would use the OECD Fields of Sciences. It is a short list of under 50 disciplines that is widely used.
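
To make the date suggestion concrete, here is a hedged sketch of what such markup might look like. dateSubmitted and dateAccepted are the names proposed in this comment and are treated here as hypothetical properties, not existing schema.org terms; the function name, the sample dates and the field label are placeholders, and the label has not been checked against the OECD list. The sketch uses keywords for the discipline, since that property already exists on CreativeWork.

```python
import json
from datetime import date


def article_with_review_dates(name, submitted, accepted, published, fields):
    """Describe an article with the proposed submission/acceptance dates.

    Hypothetical helper; "dateSubmitted" and "dateAccepted" are proposals
    from the discussion, not established schema.org properties.
    """
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "name": name,
        "datePublished": published.isoformat(),
        # Proposed properties (assumption of this sketch).
        "dateSubmitted": submitted.isoformat(),
        "dateAccepted": accepted.isoformat(),
        # Discipline labels carried as a comma-separated keywords string.
        "keywords": ", ".join(fields),
    }


# Placeholder values, not a real article.
example = article_with_review_dates(
    "An Example Title",
    submitted=date(2016, 9, 1),
    accepted=date(2016, 12, 1),
    published=date(2017, 1, 15),
    fields=["Mathematics"],
)
print(json.dumps(example, indent=2))
```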
