Open, and Linked, FDA data #5

Closed
kerfors opened this issue Jun 4, 2014 · 20 comments

Comments

@kerfors

kerfors commented Jun 4, 2014

Excellent to see that OpenFDA features harmonization on identifiers as annotated strings: http://open.fda.gov/api/reference/ How about http-based URIs as a next step, as part of applying the Linked Data principles? See http://kerfors.blogspot.se/2014/04/openfda.html for links to examples such as LinkedAERS and Linked SPL, and to experts in W3C HCLS and Bio2RDF.

@seanherron

Hi @kerfors, it looks like most of the links on that page are broken. Do you have an example of how this would play out with the API Results?

@seanherron seanherron self-assigned this Jun 6, 2014
@kerfors
Author

kerfors commented Jun 8, 2014

Hi Sean, sorry to see that the @Data2Semantics links on the Linked AERS site are broken. I'll ping @RinkeHoekstra. However, the general idea is to assign an http-based identifier (URI/IRI) to each code. So, besides providing a code as a string, it would be great to also have a URI, preferably one with a resolver service (even though some of the licenses make this hard). Health care data standards initiatives such as HL7 FHIR are now moving in this direction: http://hl7.org/implement/standards/fhir/terminologies-systems.html And we are also pushing clinical research data standards organisations such as CDISC and MedDRA to provide persistent URIs. I propose that you get in contact with experts in linked data and semantic web for health care and life science (W3C HCLS) such as @micheldumontier, @egonwillighagen, Eric Prud'hommeaux and Charlie Mead.
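
To make the idea concrete, here is a minimal sketch of pairing each string code with an http-based identifier. The namespace table, the base URIs, and the code "12345" are all hypothetical placeholders for illustration; they are not identifiers published by FDA, MedDRA, CDISC, or anyone else.

```python
from dataclasses import dataclass
from urllib.parse import quote

# Hypothetical namespace table: placeholder base URIs, not real published identifiers.
EXAMPLE_NAMESPACES = {
    "example-drug-codes": "http://example.org/id/drug/",
    "example-reaction-codes": "http://example.org/id/reaction/",
}


@dataclass(frozen=True)
class CodedValue:
    system: str  # key into EXAMPLE_NAMESPACES
    code: str    # the code as the plain string an API returns today

    @property
    def uri(self) -> str:
        """Return an http-based identifier built from the namespace and the code."""
        return EXAMPLE_NAMESPACES[self.system] + quote(self.code, safe="")


# "12345" is a made-up code used only to show the shape of the result.
value = CodedValue(system="example-reaction-codes", code="12345")
print(value.code, value.uri)
```

The string code stays available for existing clients, while the URI gives linked-data consumers something they can dereference once a resolver exists.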

@RinkeHoekstra

Hi Kerstin and Sean,

Thanks for letting me know, we're working to get Linked AERS up and running again. Keep in mind that this is not a 'live' service. We would most gladly connect to the OpenFDA API instead.

Rinke

@GeekNurse

@kerfors @RinkeHoekstra this site may be helpful to you if you're simply wanting to search through the OpenFDA AERS data: ResearchAE.com - http://www.researchae.com/
Let me know if you have any questions.

@kerfors
Author

kerfors commented Jun 18, 2014

Hi @GeekNurse, I'm pointing colleagues to your great user interface for querying and visualising data using the brilliant openFDA API. My point to @seanherron is really the additional value you would get from having the identifiers as http-based URIs, as a step towards 5-star Open and Linked Data. Check out http://5stardata.info/ and http://www.w3.org/TR/ld-bp/ cc: @BernHyland

@kerfors kerfors changed the title OpenandLinkedFDA Open and Linked FDA data Jun 18, 2014
@kerfors kerfors changed the title Open and Linked FDA data Open, and Linked, FDA data Jun 18, 2014
@kerfors
Author

kerfors commented Nov 20, 2014

Great to see the "Enhancement" label on this issue. Any updates?

@bewest

bewest commented Jan 21, 2015

This may be a crazy tangent, but it's awesome to see FDA talking about LinkedData.
In Richard Chapman's discussion of assurance cases, he talks about iterating, inspecting, and reasoning through recursive documents, and I immediately thought about LinkedData, and something like http://worrydream.com/TenBrighterIdeas/ as an editor/UI for these documents.

For my open source project, I've been asked to prepare a gap analysis, so I've started modeling a suite of documents roughly after the quality regulations themselves. http://process-controls.readthedocs.org/en/latest/index.html

As a thought experiment, are tools such as the openfda API an example of the MDDT regulations? Do the quality controls/regulations apply to tools such as these? If quality systems regulations were to apply to a project such as this, and you had to provide, e.g., a class III submission, would you use LinkedData to link audit reports to the quality controls, so that someone could traverse the data all the way to the regulations and back to the practical issue "at hand"? In theory, GitHub tracks a lot of usage trails, using the API, so reports/templates could automate a lot of the typically cumbersome (and expensive) work of preparing appropriate documentation. If class III controls applied to a project like this, would you, e.g., create rst/markdown output from the test runners to render reports? Would they link to, or be linked from, other reports in some special way? With LinkedData, in theory, someone looking at MDRs/MAUDE could track trending issues all the way back to reports on how things were fixed or handled. Would the regulations themselves also need to be linked data?

As a practical matter, I've chosen sphinx/rst/markdown so I can easily re-theme, re-engrave, re-render, and version documents with as many or as few "links" in them as needed. As an even more practical matter, I would love to figure out a way to cite or link to regulations, and if the project starts submitting MDR reports, how to best interlink between everything.

Lot of questions in there, from philosophical to practical, and perhaps only tangentially related to the thread, sorry for the noise if that's the case.

@westurner

Could there be a JSONLD @context?

  1. Create/generate a JSON-LD @context
  2. Annotate with the relevant @context attributes
    • (EDIT: this would need to be added to the data pipelines)
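
As a rough illustration of the two steps above, the sketch below defines a small JSON-LD @context and attaches it to a record. The vocabulary namespace and the sample record are assumptions made up for this example; the field names are only meant to resemble drug adverse event fields, and nothing here reflects an actual openFDA pipeline.

```python
import json

# Hypothetical vocabulary namespace, chosen only for illustration.
VOCAB = "http://example.org/ns/drug-event#"

# Step 1: a minimal @context. "@vocab" maps every otherwise undefined key
# in the record to VOCAB + key when the document is expanded.
context = {"@vocab": VOCAB}


def annotate(record: dict, ctx: dict) -> dict:
    """Step 2: return a copy of the record with the @context attached."""
    return {"@context": ctx, **record}


# Illustrative record with made-up values.
record = {"safetyreportid": "0000000-0", "serious": "1"}
annotated = annotate(record, context)
print(json.dumps(annotated, indent=2))
```

Existing JSON consumers can ignore the extra "@context" key, while a JSON-LD processor would expand "serious" to the full URI under the vocabulary namespace.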

Docs:

@westurner

I wrote a tool to generate approximate JSON-LD @contexts from (these) ElasticSearch mappings: https://github.com/westurner/elasticsearchjsonld/blob/master/elasticsearchjsonld/elasticsearchjsonld.py

The output JSON-LD @context schemas (.jsonld) are here: https://github.com/westurner/openfda-jsonld-testing/tree/gh-pages/ns

Not sure how to test these

@westurner

I specified the vocabulary prefixes as http://open.fda.gov/ns/${x}# here:
https://github.com/westurner/elasticsearchjsonld/blob/master/scripts/build_openfda_jsonld_contexts.sh
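
For readers who want the general shape of the approach, here is a simplified sketch of turning an Elasticsearch-style mapping into a JSON-LD @context using a vocabulary prefix of that form. This is not the code of the elasticsearchjsonld tool; the function name, the sample mapping, and the "example" namespace segment are hypothetical.

```python
def mapping_to_context(properties: dict, vocab: str) -> dict:
    """Flatten Elasticsearch-style mapping properties into JSON-LD term definitions.

    Every field name, including nested ones, becomes a term whose @id is
    vocab + name. If the same name appears at several nesting levels, the
    first definition is kept.
    """
    terms = {}

    def walk(props: dict) -> None:
        for name, spec in props.items():
            terms.setdefault(name, {"@id": vocab + name})
            nested = spec.get("properties")
            if nested:
                walk(nested)

    walk(properties)
    return {"@context": terms}


# Made-up mapping fragment, shaped like the "properties" part of an ES mapping.
example_mapping = {
    "patient": {
        "properties": {
            "drug": {"properties": {"medicinalproduct": {"type": "string"}}}
        }
    },
    "serious": {"type": "string"},
}

# "example" stands in for the ${x} segment of the prefix pattern.
ctx = mapping_to_context(example_mapping, "http://open.fda.gov/ns/example#")
print(sorted(ctx["@context"]))
```

A generated context like this is only approximate: it names every field, but it says nothing about datatypes or about links to external vocabularies.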

@westurner

To make these more useful, there could be mappings e.g. to URNs/URIs that would need to be manually added to the JSON-LD @contexts (e.g. http://schema.org/docs/meddocs.html )

See "TODO" here for broader health informatics #LinkedData context: https://westurner.github.io/opengov/us/#health

@westurner

http://schema.org/docs/meddocs.html

The schema does provide a way to annotate entities with codes that refer to existing controlled medical vocabularies (such as MeSH, SNOMED, ICD, RxNorm, UMLS, etc) when they are available.
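
A small sketch of what such an annotation could look like, built as a Python dict and serialized as JSON-LD. It uses the schema.org MedicalCode type with its codingSystem and codeValue properties; the choice of condition and of ICD-10 code E11 is just an example and is unrelated to any openFDA record.

```python
import json

# Illustrative entity: a medical condition annotated with a code from a
# controlled vocabulary, using schema.org terms.
condition = {
    "@context": "http://schema.org",
    "@type": "MedicalCondition",
    "name": "Type 2 diabetes mellitus",
    "code": {
        "@type": "MedicalCode",
        "codingSystem": "ICD-10",
        "codeValue": "E11",
    },
}

print(json.dumps(condition, indent=2))
```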

@bewest

bewest commented Apr 8, 2016

To add some more color to this, after reading https://medium.com/@chrishannemann/measure-seventy-five-times-cut-once-further-blood-glucose-meter-testing-9e769a853710, I was inspired to mock up a way for people (citizen scientists) to contribute pair-wise readings from glucometers in order to aid post-market surveillance.

To make this easy, I noticed that openfda published device registrations and listings, and thought this might be a good way to automatically populate a list of meters for users to choose from. However, it's not clear to me how common labeling might be linked directly to attributes of a particular device, or how the device registrations overlap with what people experience in the market.

For example, it appears I can search for, e.g., OTC glucose meters by restricting to regulation_number:862.1345, but that is not quite enough.

  • https://api.fda.gov/device/registrationlisting.json?search=products.openfda.regulation_number:862.1345+AND+products.openfda.device_name:glucose+AND+products.openfda.device_name:%22over+the+counter%22&limit=1000&skip=0&count=registration.name.exact or similar will list vendors that create devices, but there's no way to address or dereference a particular proprietary_name, or to know what attributes might be connected to the product. E.g., which labeling will the user see, is any picture available, which "test strips" might be registered as compatible? Once a user has "selected" a device, is there any addressing/numbering system that allows referring to it precisely in the future (keep track of the k_number, or something else)?
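
For anyone wanting to reproduce the query above, this sketch assembles the same URL from its parts. It only builds the string and makes no network request; the helper names `phrase` and `build_url` are made up for this example and are not part of any openFDA client library.

```python
BASE = "https://api.fda.gov/device/registrationlisting.json"


def phrase(text: str) -> str:
    """Wrap a multi-word value in URL-encoded double quotes, with '+' for spaces."""
    return "%22" + text.replace(" ", "+") + "%22"


def build_url(terms, limit=None, skip=None, count=None) -> str:
    """Join field:value search terms with AND and append the optional parameters."""
    params = [("search", "+AND+".join(terms))]
    if limit is not None:
        params.append(("limit", str(limit)))
    if skip is not None:
        params.append(("skip", str(skip)))
    if count is not None:
        params.append(("count", count))
    return BASE + "?" + "&".join(f"{key}={value}" for key, value in params)


url = build_url(
    [
        "products.openfda.regulation_number:862.1345",
        "products.openfda.device_name:glucose",
        "products.openfda.device_name:" + phrase("over the counter"),
    ],
    limit=1000,
    skip=0,
    count="registration.name.exact",
)
print(url)
```

The query string is assembled by hand rather than with a generic URL encoder so that the '+' separators and ':' field delimiters stay exactly as they appear in the query above.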

Here's a quick hello-world demo using FDA's registration database to provide an easy way to select a glucometer that is sold in the US.

Once chosen though, I'm having trouble finding a unique identifier that would identify only the selected device, and as a nice to have, it'd be lovely to find links to additional media/labeling connected with the approval/device.

HTH see what people can start to do with this very very cool API. It's been wonderful to see more and more data added and start to be able to integrate against this.

@westurner

On Apr 8, 2016 10:25 AM, "Ben West" notifications@github.com wrote:

[...].

Once chosen though, I'm having trouble finding a unique identifier that
would identify only the selected device, and as a nice to have, it'd be
lovely to find links to additional media/labeling connected with the
approval/device.

There may be opportunities for linking with schema.org RDF / JSON-LD:

AFAIU, feedback/suggestions/(PRs) for e.g. schema:MedicalDevice should be
added to (or referenced w/ #.492).

Potential data publishers here:

IIUC, your need implies a need for (RDF(a), JSON-LD) linked data mappings
between { device codes , urls , [?] }.

@westurner

westurner commented Oct 5, 2019 via email

@violetcrestedwren
Contributor

Hi Wes,

I was going through and removing some old issues that no longer seemed relevant. If this is still relevant let me know and the team will take a look.

@westurner

westurner commented Oct 8, 2019

@beardedfinch et al.

We can maximize the utility of this and other FDA datasets by using URIs as identifiers and using or creating RDFS vocabularies with URIs for each "column" of each Dataset.

Elasticsearchjsonld is one way to generate a JSONLD @context (akin to RDFS schema) from an existing elasticsearch mapping schema.

When I search for "FDA linked data", I find mentions of universal device identifiers for use with EHRs: https://www.healthdatamanagement.com/news/fda-sees-benefits-of-linking-universal-device-ids-to-ehrs

There's also the FDA DSCSA pharmaceutical blockchain pilot program where e.g. JSONLD (or any other RDF linked data representation) would be very helpful for data integration with industry and international datasets.

Use cases for linked data for integration with this dataset:

  • Determine whether the variance is due to e.g.: supply chain handling, patient attributes, world region / zip code, air quality, water quality, concomitant treatments and devices, presence of active ingredients or presumed inactive ingredients
  • Build a relative hazard statistic and data visualization, such as a "heat map" (see "ENH: Adverse Event Count / 'Use' Count Heatmap" #49 (comment))
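
As a sketch of the second use case, the ratio named in that heatmap issue (adverse event count divided by 'use' count) could be computed as below once both counts can be joined on a shared identifier. All device names and counts here are invented for illustration, and choosing one device as the baseline is an assumption of this example rather than an established method.

```python
def event_rate(adverse_events: int, uses: int) -> float:
    """Adverse events per use; the 'use' count has to come from another dataset."""
    if uses <= 0:
        raise ValueError("use count must be positive")
    return adverse_events / uses


# Hypothetical (adverse event count, use count) pairs keyed by a shared identifier.
counts = {
    "device-a": (5, 1000),
    "device-b": (2, 100),
}

rates = {name: event_rate(events, uses) for name, (events, uses) in counts.items()}

# Relative hazard against an arbitrarily chosen baseline device.
baseline = rates["device-a"]
relative = {name: rate / baseline for name, rate in rates.items()}
print(relative)
```

The join key is the part linked data would help with: without shared URIs for devices or products, matching event counts to use counts from other sources is manual work.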

You might argue that this issue should be closed because there is currently no FDA effort to publish this or other datasets as linked data. Or, it could be argued that this issue should remain open precisely because there is no other industry effort to enable data integration with linked data.

To whom at FDA should the strong case for linked data presented by e.g. https://5stardata.info and https://lod-cloud.net/ be directed?


7 participants