ClinicalTrials.gov website updates #1210

Open
agitter opened this issue Jun 28, 2023 · 8 comments

@agitter
Collaborator

agitter commented Jun 28, 2023

ClinicalTrials.gov updated their website (announcement). I'm not sure whether the Zotero citation infrastructure that Manubot relies on will be stable. The classic website will be retired.

Here's an example of the new and classic sites:

I believe we need to update https://github.com/biopragmatics/bioregistry/ and then the Manubot package to update these URLs. Is that correct @dhimmel?

agitter assigned rando2 and unassigned rando2 Jun 28, 2023
@rando2
Collaborator

rando2 commented Jun 29, 2023

Oh no! I'm definitely interested in keeping an eye on this.

@dhimmel
Collaborator

dhimmel commented Jun 29, 2023

Also noting the URL format used for resolution: https://clinicaltrials.gov/ct2/show/NCT04619628. This URL currently redirects to the classic view, although I imagine it will eventually switch to redirect to the new view if classic is retired.
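
To make the difference between the two URL layouts concrete, here is a minimal Python sketch that extracts the NCT identifier from a classic-style URL and rebuilds it in the `/study/` layout that appears later in this thread. The helper name `to_new_style_url` and the regular expression are illustrative assumptions; neither is part of Manubot or Bioregistry.

```python
import re

# NCT identifiers are "NCT" followed by eight digits.
NCT_PATTERN = re.compile(r"NCT\d{8}")


def to_new_style_url(url: str) -> str:
    """Rewrite a ClinicalTrials.gov URL into the new /study/ layout.

    Hypothetical helper for illustration only. Raises ValueError when
    the URL contains no NCT identifier.
    """
    match = NCT_PATTERN.search(url)
    if match is None:
        raise ValueError(f"no NCT identifier found in {url!r}")
    return f"https://www.clinicaltrials.gov/study/{match.group(0)}"


print(to_new_style_url("https://clinicaltrials.gov/ct2/show/NCT04619628"))
# prints https://www.clinicaltrials.gov/study/NCT04619628
```

Keying on the identifier rather than on the path means the same helper would accept either the `clinicaltrials.gov/ct2/show/` or the `classic.clinicaltrials.gov/ct2/show/` form.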

I think updating Bioregistry is a good idea. I'll make a PR for that.

> I'm not sure whether the Zotero citation infrastructure that Manubot relies on will be stable

This is the bigger worry IMO, so good to keep in mind when we upgrade the bioregistry version in Manubot, and perhaps to test so we can report any bugs to Zotero beforehand.

@cthoyt

cthoyt commented Jun 29, 2023

One solution for upgrading Manubot is to simply generate URLs of the form https://bioregistry.io/<prefix>:<identifier>. Then you never have to update this in Manubot, since it can be handled upstream.
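
As a rough illustration of this suggestion, the sketch below builds a Bioregistry resolver URL from a prefix and an identifier. The function name `bioregistry_url` is hypothetical, and `clinicaltrials` is assumed here to be the Bioregistry prefix for ClinicalTrials.gov records; this is not how Manubot currently generates its URLs.

```python
def bioregistry_url(prefix: str, identifier: str) -> str:
    """Build a resolver URL of the form https://bioregistry.io/<prefix>:<identifier>.

    Hypothetical helper: the redirect target is decided by Bioregistry,
    so the caller never hard-codes the provider's own URL layout.
    """
    return f"https://bioregistry.io/{prefix}:{identifier}"


# "clinicaltrials" is an assumed prefix, used for illustration.
print(bioregistry_url("clinicaltrials", "NCT04619628"))
# prints https://bioregistry.io/clinicaltrials:NCT04619628
```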

@agitter
Collaborator Author

agitter commented Jun 29, 2023

> perhaps to test so we can report any bugs to Zotero beforehand.

Directly testing the classic and new website formats indicates there will be problems. I'm assuming the Zotero translator is used for both.

$ manubot cite https://classic.clinicaltrials.gov/ct2/show/NCT04619628
[
  {
    "id": "12D6rB04F",
    "type": "webpage",
    "abstract": "Safety and Immunogenicity of COVI-VAC, a Live Attenuated Vaccine Against COVID-19 - Full Text View.",
    "language": "en",
    "title": "Safety and Immunogenicity of COVI-VAC, a Live Attenuated Vaccine Against COVID-19 - Full Text View - ClinicalTrials.gov",
    "URL": "https://clinicaltrials.gov/ct2/show/NCT04619628",
    "accessed": {
      "date-parts": [
        [
          "2023",
          6,
          29
        ]
      ]
    },
    "note": "This CSL Item was generated by Manubot v0.5.5 from its persistent identifier (standard_id).\nstandard_id: url:https://classic.clinicaltrials.gov/ct2/show/NCT04619628"
  }
]
$ manubot cite https://www.clinicaltrials.gov/study/NCT04619628
[
  {
    "id": "aHoxDwRa",
    "type": "webpage",
    "title": "CTG Labs - NCBI",
    "URL": "https://www.clinicaltrials.gov/study/NCT04619628",
    "accessed": {
      "date-parts": [
        [
          "2023",
          6,
          29
        ]
      ]
    },
    "note": "This CSL Item was generated by Manubot v0.5.5 from its persistent identifier (standard_id).\nstandard_id: url:https://www.clinicaltrials.gov/study/NCT04619628"
  }
]

This trial ID isn't even the best example because it doesn't set all the metadata that zotero/translators#2153 added support for, like the creators.

cthoyt pushed a commit to biopragmatics/bioregistry that referenced this issue Jun 29, 2023
refs greenelab/covid19-review#1210

Two changes (one per commit) that should be treated as separate (I can
remove either of the changes from this PR based on review).
@agitter
Collaborator Author

agitter commented Jul 2, 2023

I opened a similar issue for the Zotero translator: zotero/translators#3069. I don't know enough JavaScript to make the updates myself.

@agitter
Collaborator Author

agitter commented Jul 13, 2023

Good news on this: Zotero contributors responded to my issue and updated the clinicaltrials.gov.js translator. I believe we would need to update the Zotero translation-server (manubot/manubot#82) before we can test those changes in Manubot.

@dhimmel
Collaborator

dhimmel commented Jul 13, 2023

Nice! I just updated the Manubot translation-server to zotero/translators@28f344cd, which includes zotero/translators@edde701. But I'm still getting the same result as above for manubot cite https://www.clinicaltrials.gov/study/NCT04619628, with the title as "CTG Labs - NCBI". I would expect this to be the actual title now, so I'm kind of confused.

@agitter
Collaborator Author

agitter commented Jul 13, 2023

It seems like your translation-server update worked. I tested another URL, taken from the test case of a recent commit (zotero/translators@aa7d6a2):

$ curl -d 'https://www.nrc.nl/nieuws/2022/12/03/wikipedia-wordt-onbetrouwbaar-alweer-a4150299' -H 'Content-Type: text/plain' https://translate.manubot.org/web
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   589  100   507  100    82    479     77  0:00:01  0:00:01 --:--:--   557
[{"key":"ZTT5BNZK","version":0,"itemType":"newspaperArticle","creators":[{"firstName":"Maxim","lastName":"Februari","creatorType":"author"}],"tags":[],"title":"Column | Wikipedia wordt onbetrouwbaar. Alweer","publicationTitle":"NRC","rights":"Copyright Mediahuis NRC BV","url":"https://www.nrc.nl/nieuws/2022/12/03/wikipedia-wordt-onbetrouwbaar-alweer-a4150299","abstractNote":"Column:Maxim Februari","date":"2022-12-03","language":"nl-NL","libraryCatalog":"www.nrc.nl","accessDate":"2023-07-13T15:42:12Z"}]

At first glance, the output matches that test case.

The output does not match the classic clinical trials URL test case:

$ curl -d 'https://classic.clinicaltrials.gov/ct2/show/NCT04292899' -H 'Content-Type: text/plain' https://translate.manubot.org/web
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   599  100   544  100    55    408     41  0:00:01  0:00:01 --:--:--   449
[{"key":"5KR82YAW","version":0,"itemType":"webpage","creators":[],"tags":[],"title":"Study to Evaluate the Safety and Antiviral Activity of Remdesivir (GS-5734™) in Participants With Severe Coronavirus Disease (COVID-19) - Full Text View - ClinicalTrials.gov","url":"https://clinicaltrials.gov/ct2/show/NCT04292899","abstractNote":"Study to Evaluate the Safety and Antiviral Activity of Remdesivir (GS-5734™) in Participants With Severe Coronavirus Disease (COVID-19) - Full Text View.","language":"en","accessDate":"2023-07-13T15:42:56Z"}]

For example, the creators are empty.
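
The manual comparison above could be scripted. The following Python sketch mirrors the `curl` calls (a `text/plain` POST to the `/web` endpoint) and adds a simple check for responses that look like a generic webpage fallback. The helper names `fetch_items` and `looks_generic` are hypothetical, and the heuristic (item type `webpage` with no creators) is an assumption drawn from the two responses in this comment, not a documented signal from Zotero.

```python
import json
import urllib.request

# Endpoint taken from the curl commands in this thread.
TRANSLATION_SERVER = "https://translate.manubot.org/web"


def fetch_items(url: str, server: str = TRANSLATION_SERVER) -> list:
    """POST a URL as text/plain to the translation-server and parse the JSON reply."""
    request = urllib.request.Request(
        server,
        data=url.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def looks_generic(item: dict) -> bool:
    """Assumed heuristic: a 'webpage' item with no creators suggests that
    no site-specific translator produced the metadata."""
    return item.get("itemType") == "webpage" and not item.get("creators")


# Trimmed copies of the two responses shown above, so no network is needed.
nrc_item = {
    "itemType": "newspaperArticle",
    "creators": [
        {"firstName": "Maxim", "lastName": "Februari", "creatorType": "author"}
    ],
}
classic_item = {"itemType": "webpage", "creators": []}

print(looks_generic(nrc_item))      # False
print(looks_generic(classic_item))  # True
```

Against the live server, `fetch_items("https://classic.clinicaltrials.gov/ct2/show/NCT04292899")` would return a list whose items can be passed to `looks_generic` in the same way.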
