clinicaltrials.gov website changes #3069

Closed
agitter opened this issue Jul 2, 2023 · 3 comments · Fixed by #3076
@agitter
agitter commented Jul 2, 2023

ClinicalTrials.gov updated their website (announcement). This will likely affect clinicaltrials.gov.js. For instance, the classic version of the site now has URL patterns like https://classic.clinicaltrials.gov/ct2/show/NCT04292899, whereas the translator target is "target": "^https://(www\\.)?clinicaltrials\\.gov/ct2/(show|results\\?)", which does not match the classic subdomain. The new version of the site may store metadata differently, or expose different metadata altogether.
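
To illustrate the mismatch, here is a minimal sketch in plain JavaScript (run in Node, not inside the translator framework). The first pattern is the target quoted above. The second, `broadenedTarget`, is a hypothetical variant that also accepts the `classic.` subdomain. It is only an illustration, not the pattern that was eventually merged, and it does not cover URL paths of the new site, which are not given in this issue.

```javascript
// Translator target as quoted in the issue (a JSON string, hence the doubled backslashes).
const currentTarget = new RegExp(
  "^https://(www\\.)?clinicaltrials\\.gov/ct2/(show|results\\?)"
);

// Hypothetical broadened pattern (an assumption for illustration only):
// additionally accept the classic. subdomain.
const broadenedTarget = new RegExp(
  "^https://((www|classic)\\.)?clinicaltrials\\.gov/ct2/(show|results\\?)"
);

// Example URL from the issue, plus the same path on the pre-change host.
const classicUrl = "https://classic.clinicaltrials.gov/ct2/show/NCT04292899";
const oldStyleUrl = "https://clinicaltrials.gov/ct2/show/NCT04292899";

console.log(currentTarget.test(oldStyleUrl));   // true
console.log(currentTarget.test(classicUrl));    // false: the subdomain is not allowed
console.log(broadenedTarget.test(oldStyleUrl)); // true
console.log(broadenedTarget.test(classicUrl));  // true
```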

Here's an example of the new and classic sites:

@zoe-translates
Collaborator

The current implementation uses the API for most of the work, and we should migrate to the new API too.

@adam3smith, may I have your opinion about migration? Based on your changes to the MathSciNet translator, am I correct to say that we want to move the existing translator script (without changing the UUID) to a new name like clinicaltrials.gov (Legacy).js, and create a new translator (with new UUID) that takes over the name clinicaltrials.gov.js?

zoe-translates self-assigned this Jul 12, 2023
@dstillman
Member

dstillman commented Jul 12, 2023

Is there any reason to support the classic domain?

We usually just update the translator for the current version of the site.

@zoe-translates
Collaborator

From the site: https://www.clinicaltrials.gov/data-about-studies/learn-about-api
(emphasis mine)

This is a draft version of the new API
Authentication is not required to access the API at this time but will be required in future.
The classic ClinicalTrials.gov API will remain available for some time.

And from https://www.clinicaltrials.gov/data-about-studies/api-migration

The legacy API will remain available for the foreseeable future, and users will be notified in advance of any changes to that status.

It seems that for API access, the classic domain will be supported "for some time". Meanwhile, the default user-facing web interface switched to the new design, while the old design (for which the translator was written) is being temporarily maintained at the classic domain.

I think I'm going to do the following:

  • We use URL and on-page design elements for detection, so this definitely needs fixing. But my feeling is that it shouldn't require a split into two translators, because we only use these elements sparingly. The actual item metadata are obtained from the API.
  • For API use, we don't have to do much for now; perhaps just change the domain in order to save a redirect for each call.
  • Meanwhile, it's advisable to prepare for the new API. The future authentication requirement is a bit concerning, but we don't have much concrete information about it at the moment.
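
The "change the domain to save a redirect" idea from the second point could look roughly like the sketch below. It assumes, as that point suggests, that requests to the old host are redirected to the classic one. `toClassicHost` is a hypothetical helper name, and the code is plain JavaScript using the standard `URL` class (as available in Node), not actual translator code.

```javascript
// Hypothetical helper (not from the translator): rewrite a ClinicalTrials.gov
// URL so that it points at the classic host directly, avoiding a redirect.
function toClassicHost(urlString) {
  const url = new URL(urlString);
  // Only touch the main site hosts; leave every other host unchanged.
  if (url.hostname === "clinicaltrials.gov" || url.hostname === "www.clinicaltrials.gov") {
    url.hostname = "classic.clinicaltrials.gov";
  }
  return url.href;
}

console.log(toClassicHost("https://clinicaltrials.gov/ct2/show/NCT04292899"));
// https://classic.clinicaltrials.gov/ct2/show/NCT04292899
```

Only the host is rewritten; the path and query string are passed through untouched, so the same helper would work for page URLs and API URLs alike.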

zoe-translates added a commit to zoe-translates/translators that referenced this issue Jul 12, 2023
- Make target identification and the detection of item/search work for
  both the new UI at (www.)clinicaltrials.gov and the old UI at
  classic.clinicaltrials.gov.
- Reduce network traffic significantly by eliminating the request for
  the full document when processing search results. In fact, the URL,
  which contains the NCTId, is all that's necessary for getting the
  results, and it can be scraped from the search-results page.
- Use async requests for the JSON data, which also makes the code less
  nested.
- Update and add test cases, including search page in both old and new
  UIs.
- Eliminate some dead code.
- In the routine processing JSON data, make the code less verbose.
- Other small fixes.

This resolves zotero#3069.
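
As a rough illustration of two points in this commit message (taking the NCT ID from the result URL instead of loading the full document, and using async requests for the JSON data), here is a minimal sketch. It is not the committed code. It assumes NCT IDs have the form "NCT" plus eight digits, consistent with the NCT04292899 example in this issue. `fetchJSON` and `buildApiUrl` are hypothetical, injected functions, so no real API path is assumed, and the `example.invalid` URL is a placeholder.

```javascript
// Assumed ID format: "NCT" followed by eight digits (e.g. NCT04292899).
const NCT_ID = /NCT\d{8}/;

// Return the NCT ID contained in a URL, or null if there is none.
function extractNctId(url) {
  const match = NCT_ID.exec(url);
  return match ? match[0] : null;
}

// Collect the unique NCT IDs from a list of search-result link URLs,
// preserving first-seen order.
function collectNctIds(urls) {
  const ids = new Set();
  for (const url of urls) {
    const id = extractNctId(url);
    if (id) ids.add(id);
  }
  return [...ids];
}

// Fetch the JSON record for each ID, one after another. With async/await
// the loop stays flat instead of nesting one callback per request.
// fetchJSON(url) must return a Promise; buildApiUrl(id) must return a URL string.
async function fetchStudies(urls, fetchJSON, buildApiUrl) {
  const studies = [];
  for (const id of collectNctIds(urls)) {
    studies.push(await fetchJSON(buildApiUrl(id)));
  }
  return studies;
}

// Usage with a stub fetcher, so the sketch runs without network access.
const stubFetch = async (url) => ({ requested: url });
const placeholderApiUrl = (id) => `https://example.invalid/api/${id}`;

fetchStudies(
  [
    "https://classic.clinicaltrials.gov/ct2/show/NCT04292899",
    "https://classic.clinicaltrials.gov/ct2/show/NCT04292899?rank=1",
    "https://classic.clinicaltrials.gov/ct2/show/NCT00000102",
  ],
  stubFetch,
  placeholderApiUrl
).then((studies) => console.log(studies.length)); // logs 2 (duplicate ID is fetched once)
```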
dstillman pushed a commit that referenced this issue Jul 12, 2023
This resolves #3069.