doi.org serving JSON-LD documents breaks Codemeta's context file #278

Closed
progval opened this issue Jun 29, 2022 · 6 comments

@progval
Member

progval commented Jun 29, 2022

Hi,

(Ignore this paragraph; see the first comment.) It looks like http://schema.org now redirects to https://schema.org, even with an Accept: application/ld+json header, and neither PyLD nor jsonld.js seems to handle this gracefully.

Applications which hardcode context files (as they should) are not affected, but this causes issues with the JSON-LD playground for example: https://json-ld.org/playground/#startTab=tab-expanded&json-ld=%7B%22%40context%22%3A%22https%3A%2F%2Fdoi.org%2F10.5063%2Fschema%2Fcodemeta-2.0%22%2C%22license%22%3A%22https%3A%2F%2Fspdx.org%2Flicenses%2FGPL-3.0%22%2C%22name%22%3A%22test%20software%22%7D shows:

jsonld.InvalidUrl: Dereferencing a URL did not result in a valid JSON-LD object. Possible causes are an inaccessible URL perhaps due to a same-origin policy (ensure the server uses CORS if you are using client-side JavaScript), too many redirects, a non-JSON response, or more than one HTTP Link Header was provided for a remote context.

I don't know what should be done about this. Any ideas? (Hot-patch the codemeta context? Report the issue to doi.org?)
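
For applications that hardcode context files, as mentioned above, here is a minimal Python sketch of a document loader that answers from a bundled copy instead of dereferencing the DOI. The names `BUNDLED_CONTEXTS` and `bundled_document_loader` are hypothetical, the bundled context is abridged to three terms from the CodeMeta 2.0 context, and the returned dict follows the remote-document shape that PyLD loaders are expected to return (an assumption worth checking against the PyLD version in use).

```python
# URL used as "@context" in the playground example above.
CODEMETA_CONTEXT_URL = "https://doi.org/10.5063/schema/codemeta-2.0"

# Hypothetical bundled copy, abridged to three terms for illustration.
# A real application would ship the complete context file.
BUNDLED_CONTEXTS = {
    CODEMETA_CONTEXT_URL: {
        "@context": {
            "schema": "http://schema.org/",
            "name": {"@id": "schema:name"},
            "license": {"@id": "schema:license", "@type": "@id"},
        }
    }
}


def bundled_document_loader(url, options=None):
    """Return a bundled context for `url` without any network access.

    Raises ValueError for URLs that are not bundled, so that a changed
    redirect or content negotiation cannot silently alter the result.
    """
    if url not in BUNDLED_CONTEXTS:
        raise ValueError(f"No bundled context for {url}")
    return {
        "contentType": "application/ld+json",
        "contextUrl": None,
        "documentUrl": url,
        "document": BUNDLED_CONTEXTS[url],
    }


# With PyLD installed, the loader could be registered like this (not run here):
#   from pyld import jsonld
#   jsonld.set_document_loader(bundled_document_loader)
```

The trade-off is that the bundled copy has to be updated by hand when a new context version is released, but processing no longer depends on what doi.org decides to serve.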

@progval
Member Author

progval commented Jun 29, 2022

Actually, the issue is caused by doi.org now serving its own JSON-LD documents (via a redirect to DataCite).

https://doi.org/10.5063/schema/codemeta-2.0 is supposed to redirect to https://raw.githubusercontent.com/codemeta/codemeta/2.0/codemeta.jsonld, and it still does when used in a browser or with default curl, for example:

$ curl https://doi.org/10.5063/schema/codemeta-2.0 -i -L                                 
HTTP/2 302 
date: Wed, 29 Jun 2022 17:45:32 GMT
content-type: text/html;charset=utf-8
content-length: 227
location: https://raw.githubusercontent.com/codemeta/codemeta/2.0/codemeta.jsonld
vary: Accept
expires: Wed, 29 Jun 2022 17:53:40 GMT
permissions-policy: interest-cohort=(),browsing-topics=()
cf-cache-status: DYNAMIC
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=vO0NDUrcZbBarZCgjLgFIPzu9SDxxRnLVYDwGPuZMBj9U8eT8GysNvppkA5d1lZ%2F049PFV03JxueXHCj75LNAoxeH7lWl9wzbo0DLutpwbM0%2B2UaGiY9ehFoIZhJLC1Xb60b3i0%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
strict-transport-security: max-age=31536000; includeSubDomains; preload
server: cloudflare
cf-ray: 723079f75aafedb7-CDG
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400

HTTP/2 200 
cache-control: max-age=300
content-security-policy: default-src 'none'; style-src 'unsafe-inline'; sandbox
content-type: text/plain; charset=utf-8
etag: "12ecf10e45bf11e8d91f535307d970990c3a0065d292ce04f9c2b62c539414ed"
strict-transport-security: max-age=31536000
x-content-type-options: nosniff
x-frame-options: deny
x-xss-protection: 1; mode=block
x-github-request-id: 3786:13A20:1AF8F93:1CD6A03:62BC8FBC
accept-ranges: bytes
date: Wed, 29 Jun 2022 17:45:32 GMT
via: 1.1 varnish
x-served-by: cache-cdg20775-CDG
x-cache: MISS
x-cache-hits: 0
x-timer: S1656524733.682011,VS0,VE170
vary: Authorization,Accept-Encoding,Origin
access-control-allow-origin: *
x-fastly-request-id: 86e60f5ed7c34ba154835cbaeac631cc5d5e9699
expires: Wed, 29 Jun 2022 17:50:32 GMT
source-age: 0
content-length: 4421

{
  "@context": {
      "type": "@type",
      "id": "@id",
      "schema":"http://schema.org/",
      "codemeta": "https://codemeta.github.io/terms/",
      "Organization": {"@id": "schema:Organization"},
      "Person": {"@id": "schema:Person"},
      "SoftwareSourceCode": {"@id": "schema:SoftwareSourceCode"},
      "SoftwareApplication": {"@id": "schema:SoftwareApplication"},
      "Text": {"@id": "schema:Text"},
      "URL": {"@id": "schema:URL"},
      "address": { "@id": "schema:address"},
      "affiliation": { "@id": "schema:affiliation"},
      "applicationCategory": { "@id": "schema:applicationCategory", "@type": "@id"},
      "applicationSubCategory": { "@id": "schema:applicationSubCategory", "@type": "@id"},
      "citation": { "@id": "schema:citation"},
      "codeRepository": { "@id": "schema:codeRepository", "@type": "@id"},
      "contributor": { "@id": "schema:contributor"},
      "copyrightHolder": { "@id": "schema:copyrightHolder"},
      "copyrightYear": { "@id": "schema:copyrightYear"},
      "creator": { "@id": "schema:creator"},
      "dateCreated": {"@id": "schema:dateCreated", "@type": "schema:Date" },
      "dateModified":  {"@id": "schema:dateModified", "@type": "schema:Date" },
      "datePublished":  {"@id": "schema:datePublished", "@type": "schema:Date" },
      "description": { "@id": "schema:description"},
      "downloadUrl": { "@id": "schema:downloadUrl", "@type": "@id"},
      "email": { "@id": "schema:email"},
      "editor": { "@id": "schema:editor"},
      "encoding": { "@id": "schema:encoding"},
      "familyName": { "@id": "schema:familyName"},
      "fileFormat": { "@id": "schema:fileFormat", "@type": "@id"},
      "fileSize": { "@id": "schema:fileSize"},
      "funder": { "@id": "schema:funder"},
      "givenName": { "@id": "schema:givenName"},
      "hasPart": { "@id": "schema:hasPart" },
      "identifier": { "@id": "schema:identifier", "@type": "@id"},
      "installUrl": { "@id": "schema:installUrl", "@type": "@id"},
      "isAccessibleForFree": { "@id": "schema:isAccessibleForFree"},
      "isPartOf":  { "@id": "schema:isPartOf"},
      "keywords": { "@id": "schema:keywords"},
      "license": { "@id": "schema:license", "@type": "@id"},
      "memoryRequirements": { "@id": "schema:memoryRequirements", "@type": "@id"},
      "name": { "@id": "schema:name"},
      "operatingSystem": { "@id": "schema:operatingSystem"},
      "permissions": { "@id": "schema:permissions"},
      "position": { "@id": "schema:position"},
      "processorRequirements": { "@id": "schema:processorRequirements"},
      "producer": { "@id": "schema:producer"},
      "programmingLanguage": { "@id": "schema:programmingLanguage"},
      "provider": { "@id": "schema:provider"},
      "publisher": { "@id": "schema:publisher"},
      "relatedLink": { "@id": "schema:relatedLink", "@type": "@id"},
      "releaseNotes": { "@id": "schema:releaseNotes", "@type": "@id"},
      "runtimePlatform": { "@id": "schema:runtimePlatform"},
      "sameAs": { "@id": "schema:sameAs", "@type": "@id"},
      "softwareHelp": { "@id": "schema:softwareHelp"},
      "softwareRequirements": { "@id": "schema:softwareRequirements", "@type": "@id"},
      "softwareVersion": { "@id": "schema:softwareVersion"},
      "sponsor": { "@id": "schema:sponsor"},
      "storageRequirements": { "@id": "schema:storageRequirements", "@type": "@id"},
      "supportingData": { "@id": "schema:supportingData"},
      "targetProduct": { "@id": "schema:targetProduct"},
      "url": { "@id": "schema:url", "@type": "@id"},
      "version": { "@id": "schema:version"},
        
      "author": { "@id": "schema:author", "@container": "@list" },
      
      "softwareSuggestions": { "@id": "codemeta:softwareSuggestions", "@type": "@id"},
      "contIntegration": { "@id": "codemeta:contIntegration", "@type": "@id"},
      "buildInstructions": { "@id": "codemeta:buildInstructions", "@type": "@id"},
      "developmentStatus": { "@id": "codemeta:developmentStatus", "@type": "@id"},
      "embargoDate": { "@id":"codemeta:embargoDate", "@type": "schema:Date" },
      "funding": { "@id": "codemeta:funding" },
      "readme": { "@id":"codemeta:readme", "@type": "@id" },
      "issueTracker": { "@id":"codemeta:issueTracker", "@type": "@id" },
      "referencePublication": { "@id": "codemeta:referencePublication", "@type": "@id"},
      "maintainer": { "@id": "codemeta:maintainer" }
  }
}

However, when the Accept: application/ld+json header is set, doi.org now returns a JSON-LD document about the DOI itself:

$ curl https://doi.org/10.5063/schema/codemeta-2.0 -i -H 'Accept: application/ld+json' -L
HTTP/2 302 
date: Wed, 29 Jun 2022 17:46:17 GMT
content-type: text/html;charset=utf-8
content-length: 201
location: https://data.crosscite.org/10.5063%2Fschema%2Fcodemeta-2.0
vary: Accept
expires: Wed, 29 Jun 2022 18:41:01 GMT
permissions-policy: interest-cohort=(),browsing-topics=()
cf-cache-status: DYNAMIC
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=M8N8tl6yOtdgmGuBTVE4I5tkPo%2FHpcnEc2NmbwnV8K9p6G9Xc7C%2FrDXbE8ByOmtlWVLu8yKWoM%2BAxaamfoeXZCGEYMd4veY27G8RxIxxKuU1RMES3yf%2BlfKL7MGshVWnJsgCCo4%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
strict-transport-security: max-age=31536000; includeSubDomains; preload
server: cloudflare
cf-ray: 72307b108d44cd8b-CDG
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400

HTTP/2 200 
date: Wed, 29 Jun 2022 17:46:17 GMT
content-type: application/vnd.schemaorg.ld+json; charset=utf-8
status: 200 OK
cache-control: max-age=0, private, must-revalidate
vary: Accept-Encoding, Origin
etag: W/"ed54ee248f2a10875374e67b215a63ce"
x-runtime: 0.060207
x-request-id: f758c81f-ad1e-4b38-beea-6164e9b76dfd
x-powered-by: Phusion Passenger(R) 6.0.13
server: nginx/1.18.0 + Phusion Passenger(R) 6.0.13

{
  "@context": "http://schema.org",
  "@type": "SoftwareSourceCode",
  "@id": "https://doi.org/10.5063/schema/codemeta-2.0",
  "url": "https://raw.githubusercontent.com/codemeta/codemeta/2.0/codemeta.jsonld",
  "additionalType": "Software",
  "name": "CodeMeta: an exchange schema for software metadata",
  "author": [
    {
      "name": "Matthew B Jones",
      "givenName": "Matthew B",
      "familyName": "Jones",
      "@type": "Person"
    },
    {
      "name": "Carl Boettiger",
      "givenName": "Carl",
      "familyName": "Boettiger",
      "@type": "Person"
    },
    {
      "name": "Abby Cabunoc Mayes",
      "givenName": "Abby Cabunoc",
      "familyName": "Mayes",
      "@type": "Person"
    },
    {
      "name": "Arfon Smith"
    },
    {
      "name": "Peter Slaughter",
      "givenName": "Peter",
      "familyName": "Slaughter",
      "@type": "Person"
    },
    {
      "name": "Kyle Niemeyer",
      "givenName": "Kyle",
      "familyName": "Niemeyer",
      "@type": "Person"
    },
    {
      "name": "Yolanda Gil",
      "givenName": "Yolanda",
      "familyName": "Gil",
      "@type": "Person"
    },
    {
      "name": "Martin Fenner",
      "givenName": "Martin",
      "familyName": "Fenner",
      "@type": "Person"
    },
    {
      "name": "Krzysztof Nowak",
      "givenName": "Krzysztof",
      "familyName": "Nowak",
      "@type": "Person"
    },
    {
      "name": "Mark Hahnel",
      "givenName": "Mark",
      "familyName": "Hahnel",
      "@type": "Person"
    },
    {
      "name": "Luke Coy",
      "givenName": "Luke",
      "familyName": "Coy",
      "@type": "Person"
    },
    {
      "name": "Alice Allen",
      "givenName": "Alice",
      "familyName": "Allen",
      "@type": "Person"
    },
    {
      "name": "Mercè Crosas",
      "givenName": "Mercè",
      "familyName": "Crosas",
      "@type": "Person"
    },
    {
      "name": "Ashley Sands",
      "givenName": "Ashley",
      "familyName": "Sands",
      "@type": "Person"
    },
    {
      "name": "Neil Chue Hong",
      "givenName": "Neil Chue",
      "familyName": "Hong",
      "@type": "Person"
    },
    {
      "name": "Patricia Cruse",
      "givenName": "Patricia",
      "familyName": "Cruse",
      "@type": "Person"
    },
    {
      "name": "Dan Katz",
      "givenName": "Dan",
      "familyName": "Katz",
      "@type": "Person"
    },
    {
      "name": "Carole Goble",
      "givenName": "Carole",
      "familyName": "Goble",
      "@type": "Person"
    }
  ],
  "encodingFormat": "application/ld+json",
  "datePublished": "2017",
  "spatialCoverage": {
    "@type": "Place",
    "geo": {
      "@type": "GeoCoordinates",
      "address": "Santa Barbara, CA, USA"
    }
  },
  "successor_of": {
    "@id": "https://doi.org/10.5063/schema/codemeta-1.0",
    "@type": "CreativeWork"
  },
  "schemaVersion": "http://datacite.org/schema/kernel-4",
  "publisher": {
    "@type": "Organization",
    "name": "KNB Data Repository"
  },
  "funder": {
    "@type": "Organization",
    "name": "National Science Foundation"
  },
  "provider": {
    "@type": "Organization",
    "name": "datacite"
  }
}

(thanks to @puckipedia for figuring this out)
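
To tell the two responses apart programmatically, one option is a small heuristic on the parsed JSON. This is only a sketch: the function name `is_context_document` is hypothetical, and the rule is deliberately stricter than the JSON-LD specification (which only requires a top-level "@context" entry). It treats a document as a context file only if "@context" is its sole top-level key and holds an inline definition rather than a URL string. The two sample documents are abridged from the responses shown above.

```python
def is_context_document(doc):
    """Heuristic: True if `doc` looks like a standalone JSON-LD context file.

    Stricter than the JSON-LD spec on purpose: "@context" must be the only
    top-level key and must be an inline definition (dict or list).
    """
    if not isinstance(doc, dict):
        return False
    context = doc.get("@context")
    return set(doc) == {"@context"} and isinstance(context, (dict, list))


# Abridged from the raw.githubusercontent.com response above.
CONTEXT_RESPONSE = {
    "@context": {
        "schema": "http://schema.org/",
        "name": {"@id": "schema:name"},
    }
}

# Abridged from the data.crosscite.org response above.
METADATA_RESPONSE = {
    "@context": "http://schema.org",
    "@type": "SoftwareSourceCode",
    "@id": "https://doi.org/10.5063/schema/codemeta-2.0",
}
```

Under this heuristic, `is_context_document(CONTEXT_RESPONSE)` is true and `is_context_document(METADATA_RESPONSE)` is false, which is enough to flag the DataCite record as "not the context we asked for".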

@progval closed this as completed Jun 29, 2022
@progval reopened this Jun 29, 2022
@tmorrell
Contributor

That is an issue, and I'm not sure how we would work around it. DataCite used to allow you to have custom content negotiation, but now content negotiation resolves to their metadata. This makes sense in most cases, but not if you want to serve your own JSON-LD.

@progval changed the title from "schema.org HTTPS redirect breaks Codemeta's context files" to "doi.org serving JSON-LD documents breaks Codemeta's context file" Jun 29, 2022
@moranegg
Contributor

Can we close this with the w3id change?

@tmorrell
Contributor

Yup!

It actually looks like DataCite changed their content negotiation again, so the DOI does work now, but w3id is going to be much more sustainable.

@progval
Member Author

progval commented Dec 13, 2023

@tmorrell Did it? I still get their own JSON-LD instead of the Codemeta context:

$ curl https://doi.org/10.5063/schema/codemeta-2.0 -i -H 'Accept: application/ld+json' -L      
HTTP/2 302 
date: Wed, 13 Dec 2023 07:42:54 GMT
content-type: text/html;charset=utf-8
content-length: 201
location: https://data.crosscite.org/10.5063%2Fschema%2Fcodemeta-2.0
vary: Accept
expires: Wed, 13 Dec 2023 08:37:51 GMT
permissions-policy: interest-cohort=(),browsing-topics=()
cf-cache-status: DYNAMIC
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=S%2BchG1aMGc4MYFZSBlEBbLmWfas3B4Tk3sQG5YbAT%2F4bt%2B7z8wPsBTC6nSWSq15HY%2BSiRoJlEX57LdAk33ON%2Fd%2FUUUxADWFO2Tf2iQqpZBmk0tGZUvMtf8vw%2B%2FSrDetTWPf60%2Fk%3D"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
strict-transport-security: max-age=31536000; includeSubDomains; preload
server: cloudflare
cf-ray: 834c92b3481c22b5-CDG
alt-svc: h3=":443"; ma=86400

HTTP/2 200 
date: Wed, 13 Dec 2023 07:43:08 GMT
content-type: application/vnd.schemaorg.ld+json; charset=utf-8
status: 200 OK
cache-control: max-age=0, private, must-revalidate
vary: Accept-Encoding, Origin
etag: W/"ed54ee248f2a10875374e67b215a63ce"
x-runtime: 13.984227
x-request-id: 19b4244a-d68c-458e-aa99-f1502df703d0
x-powered-by: Phusion Passenger(R) 6.0.13
server: nginx/1.18.0 + Phusion Passenger(R) 6.0.13

{
  "@context": "http://schema.org",
  "@type": "SoftwareSourceCode",
  "@id": "https://doi.org/10.5063/schema/codemeta-2.0",
  "url": "https://raw.githubusercontent.com/codemeta/codemeta/2.0/codemeta.jsonld",
...
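
For anyone who wants to repeat this comparison from a script rather than with curl, here is a rough sketch using only the Python standard library. The helper names `build_request` and `final_url` are hypothetical. Note that urllib sends a different User-Agent than curl and no Accept header by default, so the result may not match curl exactly.

```python
import urllib.request

DOI_URL = "https://doi.org/10.5063/schema/codemeta-2.0"


def build_request(url, accept=None):
    """Build a GET request, adding an Accept header only if one is given."""
    headers = {"Accept": accept} if accept else {}
    return urllib.request.Request(url, headers=headers)


def final_url(request, timeout=30):
    """Send the request, follow redirects (urllib's default), return the final URL."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()


# Example usage (needs network access, so it is left commented out):
#   print(final_url(build_request(DOI_URL)))
#   print(final_url(build_request(DOI_URL, accept="application/ld+json")))
```

If the two printed URLs differ (one on raw.githubusercontent.com, the other on data.crosscite.org, as in the transcripts above), content negotiation is still redirecting JSON-LD clients to the metadata record.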

@tmorrell
Contributor

Huh. The playground is working and I thought the content negotiation was too, but now it's not. In any event, w3id will solve this.
