Investigate + add data from drugbank.ca #61
Comments
Thanks for making this issue @jenniferthompson!
|
They got back to me pretty quickly, and had some questions about data.world that I'm not sure I know the answer to:
Is there any way we can include their license and citation on data.world? I'm pretty sure it will be more characters than are allowed in the "description" on data.world, and I'm not sure where else dataset metadata can be put on data.world (which is pretty surprising...) Alternatively, should we just stick with the public domain data? |
I would suggest reaching out to one of our contacts with data.world inside
the D4D Slack group. I believe @gabriela might be a good first-line person
for this. I'll ping her in the channel now, and we can continue the
conversation there.
On Apr 27, 2017 5:54 PM, "cduvallet" wrote:
They got back to me pretty quickly, and had some questions about
data.world that I'm not sure I know the answer to:
Looks like an interesting project, thanks for reaching out!
I checked out your site and noticed a couple of issues:
1. Data.world looks like a commercial project that requires people have
accounts to download data. It doesn't look like they have a good way to
post the licenses for datasets? Maybe I am not understanding what
data.world is.
2. I don't see a clear indication of the license for the datasets
available through your website, or clear citations to the datasets there?
Your use case looks like a non-commercial use case, so that should be fine
but, when our data is shared it has to be shared both with a citation and
the license we share our data under.
We also have 2 datasets that are public domain and you can do whatever you
want with them, on this page:
https://www.drugbank.ca/releases/latest#open-data
They include DrugBank identifiers, names, and synonyms to permit easy
linking and integration into any type of project.
|
hi all-- first time jumping in here! at the NYC hackathon rn, seems like this issue is pretty recent and would like to start munging something.... guidance? |
@forzavitale I pinged the DrugBank people again to ask if we could just include the license and citation info in the header of the file, since we can't assign it to the file directly via data.world. They haven't gotten back to me about that though. That said, in my opinion it should be fine so you can probably start working on the data. Let's just make sure to check back in with them before we post the data to data.world. Alternatively, you can poke around the public domain data and see if that's enough to get us what we want! |
I think that's a good plan @cduvallet - and at the speed the data.world folks move (read: blazing fast), it's entirely plausible that we might be able to assign a file-specific license by the time we're ready to post it. |
Update, just heard back from the DrugBank people and they said that including the info in the header of the file is fine. Full speed ahead! @forzavitale can you update us on your progress from the hackathon (if you ended up working on this)? |
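For anyone picking this up, here is a minimal sketch of what "license and citation in the header of the file" could look like. It assumes a CSV file with `#`-prefixed comment lines at the top; the header wording, the column names, and the `DB_EXAMPLE_1` identifier are placeholders, and the real license and citation text has to come from DrugBank.

```python
import csv
import io
from itertools import dropwhile

# Placeholder text only: the real license and citation wording must come from DrugBank.
HEADER_LINES = [
    "Source: DrugBank (https://www.drugbank.ca)",
    "License: <paste the license text or link supplied by DrugBank>",
    "Citation: <paste the citation requested by DrugBank>",
]


def write_with_header(rows, fieldnames, header_lines=HEADER_LINES):
    """Return CSV text whose first lines are '#'-prefixed license/citation notes."""
    buf = io.StringIO()
    for line in header_lines:
        buf.write(f"# {line}\n")
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def read_skipping_header(text):
    """Parse the CSV back, skipping only the leading '#' comment lines."""
    lines = dropwhile(lambda ln: ln.startswith("#"), io.StringIO(text))
    return list(csv.DictReader(lines))


# Hypothetical row, for illustration only.
rows = [{"drugbank_id": "DB_EXAMPLE_1", "name": "example drug"}]
text = write_with_header(rows, ["drugbank_id", "name"])
parsed = read_skipping_header(text)
```

If the file is read with pandas instead, `pandas.read_csv(path, comment="#")` should skip the same lines, though that also drops anything after a `#` inside data rows, so it is worth checking against the real data first.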
Fantastic! Thanks so much for following up, @cduvallet! 🎉 |
Is this still a project that needs help? I see the label but comments are fairly old. Been lurking on D4D for a while but interested in working on something. |
Hello! The project has been dormant for a while (hence the old comments), and I'm one of the people trying to get it going again. Any issue with the label status-under-review can be ignored for now: it either can't be tackled yet or may need to be trimmed or reformatted. This is one of the older issues that I thought would be good to work through, because the drugbank.ca materials seem very useful for our current goal of matching drugs to therapeutic uses. |
In PR #83 @proof-by-accident investigated how many of the Medicare drugs can be found in the drugbank.ca data. The results seem similar to matching attempts from other sources: a good number of drugs can be matched easily on the first pass, but about twice as many were not matched, and matching those properly will probably require a non-trivial amount of research. |
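To make the "first pass" idea concrete, here is a rough sketch of exact matching on normalized names and synonyms. It is not the method used in PR #83 (that code is not shown in this thread); the vocabulary rows, the `EXAMPLE-…` identifiers, and the input names are all made up for illustration.

```python
import re


def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def build_lookup(vocabulary):
    """Map every normalized name and synonym to its identifier (first entry wins)."""
    lookup = {}
    for entry in vocabulary:
        for label in [entry["name"], *entry["synonyms"]]:
            lookup.setdefault(normalize(label), entry["id"])
    return lookup


def first_pass_match(drug_names, lookup):
    """Split names into exact (normalized) matches and leftovers for manual review."""
    matched, unmatched = {}, []
    for raw in drug_names:
        drug_id = lookup.get(normalize(raw))
        if drug_id is None:
            unmatched.append(raw)
        else:
            matched[raw] = drug_id
    return matched, unmatched


# Hypothetical vocabulary rows; the identifiers are placeholders, not real DrugBank IDs.
vocabulary = [
    {"id": "EXAMPLE-0001", "name": "Acetaminophen", "synonyms": ["Paracetamol", "APAP"]},
    {"id": "EXAMPLE-0002", "name": "Ibuprofen", "synonyms": []},
]
# Hypothetical input names in the style of a prescriber dataset.
medicare_names = ["ACETAMINOPHEN", "paracetamol ", "Ibuprofen 200 MG Tablet", "Unknown-Drug"]

matched, unmatched = first_pass_match(medicare_names, build_lookup(vocabulary))
```

Names that carry strength or dosage form (like the third input) fall through to `unmatched`, which is the kind of leftover that likely needs the extra research mentioned above.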
I don't have the coding ability to do this, but I am knowledgeable about the domain as an informatics pharmacist and willing to offer some help from that aspect. Pretty sure the answer to this problem is Structured Product Labeling (SPL). It is a document markup standard approved by Health Level Seven (HL7) and adopted by the FDA as a mechanism for exchanging product and facility information. Different datasets use different drug identifiers (brand name, generic name, NDA, NDC, etc.), so it is hard to find the same drug in different datasets. OpenFDA harmonizes drug identifiers, and fields for various pharmacological uses are part of the dataset. Take a look: https://open.fda.gov/drug/label/reference/ |
Oh this seems great! The OpenFDA might be just what we need because you're right, we have been running into the issue where not everything is in one dataset and the names can be inconsistent between datasets. Thanks for this suggestion. |
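A small sketch of how the OpenFDA drug label endpoint could be queried for the harmonized identifiers mentioned above. The field names (`openfda.generic_name`, `openfda.brand_name`, `openfda.product_ndc`, `openfda.pharm_class_epc`, `indications_and_usage`) are my reading of the reference page linked above and should be checked against it; the sample record is invented so the parsing step can be shown without a network call.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.fda.gov/drug/label.json"


def build_label_query(generic_name, limit=1):
    """Build a label search URL filtered on the harmonized generic name."""
    search = f'openfda.generic_name:"{generic_name}"'
    query = urllib.parse.urlencode({"search": search, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_labels(generic_name, limit=1):
    """Call the API and return the decoded JSON (requires network access)."""
    with urllib.request.urlopen(build_label_query(generic_name, limit), timeout=30) as resp:
        return json.load(resp)


def summarize_label(result):
    """Pull out the identifier and use fields relevant to drug-to-use matching."""
    openfda = result.get("openfda", {})
    return {
        "brand_name": openfda.get("brand_name", []),
        "generic_name": openfda.get("generic_name", []),
        "product_ndc": openfda.get("product_ndc", []),
        "pharm_class_epc": openfda.get("pharm_class_epc", []),
        "indications_and_usage": result.get("indications_and_usage", []),
    }


# Invented record shaped like one entry of the API's "results" list.
sample = {
    "openfda": {
        "brand_name": ["EXAMPLE BRAND"],
        "generic_name": ["EXAMPLE GENERIC"],
        "product_ndc": ["0000-0000"],
    },
    "indications_and_usage": ["Example indication text."],
}
summary = summarize_label(sample)
url = build_label_query("ibuprofen")
```

With network access, something like `[summarize_label(r) for r in fetch_labels("ibuprofen")["results"]]` should give one summary per returned label.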
Can I help? |
Is this still active? Can I start this or is this throwaway work? |
@cduvallet made us aware of a site (drugbank.ca) that looks like it has very promising data! We need someone to