Newline char in node description/definition causes dirty lines in node metadata files #116

Closed
nomisto opened this issue Oct 12, 2021 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@nomisto

nomisto commented Oct 12, 2021

Describe the bug
Hello, great work, and thanks for providing so much data! I recently discovered that the metadata files contain "dirty" lines, which may be the result of newline characters in the source text of a node's description. This is not a breaking bug, but I wanted to let you know. So far I've checked version 2.0 (build_11FEB2021) and 3.0 (build_02OCT2021).

To Reproduce
Steps to reproduce the behavior:

  1. Download, for example, https://storage.googleapis.com/pheknowlator/archived_builds/release_v3.0.0/build_02OCT2021/knowledge_graphs/instance_builds/relations_only/owlnets/PheKnowLator_v3.0.0_full_instance_relationsOnly_OWLNETS_NodeLabels.txt
  2. Search for 'http://purl.obolibrary.org/obo/VO_0000247' or go to line 310638 (this is one example; I found a few others)
  3. See error: lines 310639-310643 contain text belonging to the description of VO_0000247, possibly due to newline characters in the source.
NODES	369013	<http://purl.obolibrary.org/obo/VO_0000247>	vaccine efficacy	Vaccine efficacy is an efficacy of a vaccine in induction of protective immune response in vivo or protection against infection of a virulent pathogen. 
Specifically, vaccine efficacy (VE) is the percentage reduction in disease incidence attributable to vaccination, calculated by means of the following equation:
VE(%) = (U - V)/U x 100
where U = the incidence in unvaccinated people and 
V = the incidence in vaccinated people.
Ref: Hadler TC, et al. Immunization in developing countries. In: Vaccines. Editors: Plotkin S, et al. 2008. p1542-71.	None
NODES	671778	<http://purl.obolibrary.org/obo/CHEBI_165329>	Dinor-PGD2	None	(Z)-5-[(1R,2R,5S)-5-hydroxy-2-[(E,3S)-3-hydroxyoct-1-enyl]-3-oxocyclopentyl]pent-3-enoic acid
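Rather than searching by hand, the affected records can be located by flagging every line whose tab-separated field count is off. This is a minimal sketch, not part of PheKnowLator: it assumes each well-formed row has six fields (entity type, integer ID, IRI, label, description, synonym), a layout inferred only from the two sample rows above, and `find_dirty_lines` is a hypothetical helper name.

```python
from typing import Iterable, List, Tuple

# Assumption: a well-formed row has six tab-separated fields
# (entity type, integer ID, IRI, label, description, synonym),
# as inferred from the sample rows in this issue.
EXPECTED_FIELDS = 6


def find_dirty_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line with an unexpected field count."""
    dirty = []
    for line_number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if len(text.split("\t")) != EXPECTED_FIELDS:
            dirty.append((line_number, text))
    return dirty
```

Passing an open file handle for the downloaded NodeLabels file (for example, `find_dirty_lines(open(path, encoding="utf-8"))`) reports both the truncated first row of a broken record and its continuation lines, since neither has six fields.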
@nomisto nomisto added the bug Something isn't working label Oct 12, 2021
@callahantiff
Owner

Thanks for the heads up on this @nomisto! Looking into it now.

@callahantiff callahantiff added this to Needs to be Done in Coding Tasks via automation Oct 12, 2021
callahantiff added a commit that referenced this issue Oct 13, 2021
@callahantiff
Owner

Hi @nomisto. Thanks again for pointing out this bug!

I have found and repaired the error in the codebase and pushed an update to PyPI. I am currently in the process of updating the node_metadata_dict.pkl and XXXX_NodeLabels.txt files for all v2.0.0 (excluding build_10MAY2020), v2.1.0, and v3.0.0 builds.

I am happy to let you know when that processing is complete. I hope to have it done by Friday at the very latest (ideally by tomorrow).
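For readers who hit the same problem in their own export code, one common remedy is to collapse embedded whitespace in each field before writing a tab-delimited row. The sketch below is an illustration only and is not the actual change made in the PheKnowLator codebase; `clean_field` and `format_row` are hypothetical names, and writing missing values as the string `None` is an assumption based on the sample rows in this issue.

```python
import re

# Matches any run of whitespace, including newlines and tabs.
_WHITESPACE_RUN = re.compile(r"\s+")


def clean_field(value) -> str:
    """Collapse newlines, tabs, and repeated spaces into single spaces."""
    if value is None:
        # Assumption: missing values are written as the literal string "None".
        return "None"
    return _WHITESPACE_RUN.sub(" ", str(value)).strip()


def format_row(fields) -> str:
    """Join cleaned fields with tabs so each record stays on one line."""
    return "\t".join(clean_field(field) for field in fields)
```

Cleaning tabs as well as newlines matters here because a stray tab inside a description would shift the later columns just as a newline splits the row.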

@callahantiff
Owner

callahantiff commented Oct 13, 2021

@nomisto - Just to keep you updated on the progress, I have created a list of all of the builds I will be updating and I will check each box once it's complete and ready for use.

Updated Build Metadata

  • release_v2.0.0
    • archived_builds/release_v2.0.0/build_25JAN2021/
    • archived_builds/release_v2.0.0/build_11FEB2021/
  • release_v2.1.0
    • archived_builds/release_v2.1.0/build_01MAY2021/
    • archived_builds/release_v2.1.0/build_01JUN2021/
    • archived_builds/release_v2.1.0/build_06JUL2021/
    • archived_builds/release_v2.1.0/build_01AUG2021/
    • archived_builds/release_v2.1.0/build_01SEP2021/
  • release_v3.0.0
    • archived_builds/release_v3.0.0/build_02OCT2021/
    • current_build

@callahantiff
Owner

@nomisto - everything has been updated. Please feel free to close this issue if everything looks OK to you.

@nomisto
Author

nomisto commented Oct 14, 2021

Thanks, @callahantiff, for taking care of this so quickly. Everything looks good now!

@nomisto nomisto closed this as completed Oct 14, 2021
Coding Tasks automation moved this from Needs to be Done to Completed Oct 14, 2021
callahantiff added a commit that referenced this issue Oct 18, 2021
re-triggering October build due to bugs identified in issues #116 and #118