Edge case results in error being thrown when extracting abstract text in medlinecitation.py #157

Closed
pmartin23 opened this issue Jun 27, 2018 · 1 comment · Fixed by #158

Comments

@pmartin23
Contributor

pmartin23 commented Jun 27, 2018

Hey there,

Firstly - this library has saved me so much time. Thanks for all of your hard work!

Today I noticed I was receiving an error when trying to access the abstract property of this specific PubMed article. On further investigation I saw that, unusually, the AbstractText contains formatting tags, which seem to break line 16 in medlinecitation.py:
return " ".join([at.text for at in self._xml_root.findall('Article/Abstract/AbstractText')])

The issue seems to be fixed by changing the line to the below, which should capture the text of all AbstractText and child elements:
return " ".join(["".join(at.itertext()) for at in self._xml_root.findall('Article/Abstract/AbstractText')])
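
To make the difference between the two lines concrete, here is a minimal sketch using the standard library's `xml.etree.ElementTree`. The XML sample is made up for illustration (it is not the actual PubMed record), and the function names `abstract_with_text` and `abstract_with_itertext` are hypothetical stand-ins for the property in medlinecitation.py. When an AbstractText element starts with a child tag, its `.text` is `None`, so `" ".join(...)` raises a `TypeError`; `itertext()` instead walks the element and all of its descendants.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; the real article from the issue is not reproduced here.
SAMPLE = """
<MedlineCitation>
  <Article>
    <Abstract>
      <AbstractText Label="BACKGROUND"><i>BRCA1</i> variants are common.</AbstractText>
      <AbstractText Label="RESULTS">We found <sup>2</sup> new variants.</AbstractText>
    </Abstract>
  </Article>
</MedlineCitation>
"""

PATH = "Article/Abstract/AbstractText"


def abstract_with_text(root):
    # Original approach: .text only holds the text before the first child
    # element, and is None when the element begins with a child tag.
    return " ".join([at.text for at in root.findall(PATH)])


def abstract_with_itertext(root):
    # Proposed approach: itertext() yields the text of the element and of
    # every nested element, in document order.
    return " ".join("".join(at.itertext()) for at in root.findall(PATH))


root = ET.fromstring(SAMPLE)

try:
    abstract_with_text(root)
except TypeError as exc:
    print("original line fails:", exc)

print(abstract_with_itertext(root))
# BRCA1 variants are common. We found 2 new variants.
```

Note that even when the original line does not raise (an AbstractText that starts with plain text), it would silently drop everything after the first nested tag, so the `itertext()` version is also the safer choice for completeness.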

I'm new to open source contribution but would love to make this my first PR!

@reece
Member

reece commented Sep 25, 2018

Hi @pmartin23 - So sorry for the long delay. I completely missed this comment in eutils. I'm glad the library is useful to you!

I agree with your diagnosis and solution. It would be great to have a PR from you. Please let me know if you need any help in preparing a PR -- see my profile for my email address.

pmartin23 pushed a commit to pmartin23/eutils that referenced this issue Oct 10, 2018
reece pushed a commit that referenced this issue Oct 10, 2018
…abstract text (#158)

* fix error that occurs when AbstractText element contains formatting tags
* add test to cover edge case referenced in #157