Problems parsing .bib from Web of Science #31
Comments
Hi, thanks for your message. This seems to happen because of the multi-line values in this particular .bib file. I'll have to play with it a bit to see what can be improved in bib2df to avoid this behaviour. |
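To illustrate what "multi-line values" means here, below is a minimal, language-agnostic sketch in Python. It is not bib2df's code, nor the merge_bib_lines function linked in this thread. It assumes that every value in the export is wrapped in braces and that escaped braces can be ignored; the function name merge_multiline_values and the sample entry are made up for illustration. The idea is to count braces and only close a logical line once the value's braces are balanced.

```python
def merge_multiline_values(lines):
    """Join physical lines so each brace-balanced field ends up on one line.

    Assumption: every value is wrapped in braces; escaped braces are ignored.
    """
    merged = []
    buffer = []
    depth = 0  # 0 = outside an entry, 1 = inside an entry, 2+ = inside a value
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        buffer.append(line)
        depth += line.count("{") - line.count("}")
        if depth <= 1:
            # Braces of the current value are balanced: emit one logical line.
            merged.append(" ".join(buffer))
            buffer = []
    if buffer:
        # Unbalanced input: keep the remainder rather than dropping it.
        merged.append(" ".join(buffer))
    return merged


# Hypothetical entry, invented for this example.
sample = [
    "@article{ WOS:0001,",
    "Author = {Doe, Jane},",
    "Abstract = {First line of the abstract",
    "   second line of the abstract",
    "   third line.},",
    "Year = {2019},",
    "}",
]

for line in merge_multiline_values(sample):
    print(line)
```

After this step each field occupies a single line, so a parser that expects one key-value pair per line would see the whole abstract instead of only its first line.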
Any news on this issue? I have the same problem. I have downloaded a bib file from Web of Science and anything after a line break (e.g. all of the abstracts) is excluded from the dataframe. I really like your package otherwise, and hope that you are able to resolve this critical problem! |
@ottlngr we ran into the same issue (our code builds on bib2df). Maybe the function here could constitute the basis for a solution (not sure how robust it is): https://github.com/paulcbauer/flex_bib/blob/master/merge_bib_lines.R @jjsantana maybe this helps: https://github.com/paulcbauer/flex_bib#caveats |
@paulcbauer I added a test case that covers this issue. Of course it fails at the moment, but feel free to try integrating your function and see if the test succeeds. |
I added some code (optional argument merge_lines + function to merge
lines). I am not sure whether (and how) it interacts with the
separate_names argument. Also, there may be a nicer way to integrate it
into your functions.
(Sent by email in reply to Philipp Ottolinger's comment of Thu, Jul 2, 2020, 9:59 PM.)
--
-----------------------------------------------------------------------------------------
Dr. Paul C. Bauer
Mannheim Centre for European Social Research
University of Mannheim
Email: mail@paulcbauer.eu
Current research: "Believing and Sharing Information by Fake Sources
<https://osf.io/mrxvc>"
Websites: Homepage <http://www.paulcbauer.eu/>, GoogleScholar
<https://scholar.google.ch/citations?user=zRqPQ_kAAAAJ&hl=en&oi=ao>,
ResearchGate <https://www.researchgate.net/profile/Paul_Bauer4>,
www.tweetingpoliticians.com, SSRN
<http://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=1911340>, Twitter
<https://twitter.com/p_c_bauer>, Github <https://github.com/paulcbauer>
|
Cool, thanks for the effort. I will have a closer look at it. |
Cool thanks. There was some sort of error message but I didn't know how
relevant it is.
(Sent by email in reply to Philipp Ottolinger's comment of Fri, Jul 10, 2020, 11:59 AM.)
|
@paulcbauer 's suggestion to the |
Hi there, I wondered whether there is an update on this issue. I'm unable to import full abstracts from WoS .bib files and cannot get the above solutions to work. Thanks. |
Apologies - I did get @paulcbauer's merge_bib_lines function to work and it solved the issue with import of incomplete abstracts - many thanks. |
Problem I have now is that the merge_bib_lines function does not parse text properly when the character "=" is encountered - any ideas? Thanks |
There should be some regex workaround. I just don't have any time right now
to look into this (hopefully in the next weeks). Sorry!
(Sent by email in reply to Robert Berryr's comment of Thu, May 20, 2021, 11:30 AM.)
|
I downloaded a bib file from Web of Science (savedrecs.zip) and there are multiple issues when reading it. The solution shown in #21 doesn't work here :(
Most of them seem to be related to what you, @ottlngr, mentioned in #21 (key-value pairs not separated by linebreaks):
But other issues seem to arise from a different thing:
[A] single_reference.zip
When reading this bib reference, the following lines of the abstract create new columns (the first word of the line becomes the column title, and the cell text is whatever comes after the "="):
So, the first of those creates a BENEFITS column with a text "451) or non-evidence-based (e.g., relative risks"
Please, let me know if I can be of any help testing/debugging this.
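As a possible starting point for debugging, here is a small sketch in Python (language-agnostic in spirit, and not bib2df's actual parsing code) of how a stricter test for "this line starts a new field" could avoid the spurious columns. It assumes that every genuine field looks like `Key = {value` or `Key = "value`, which seems to hold for Web of Science exports but is not guaranteed for BibTeX in general (bare values such as `year = 2019` would be rejected). The names FIELD_START, starts_field and naive_starts_field are hypothetical, and the abstract line is invented to resemble the BENEFITS case above.

```python
import re

# A field start: a key made of letters/digits/underscores/hyphens, then "=",
# then an opening brace or double quote.
FIELD_START = re.compile(r'^\s*([A-Za-z][\w-]*)\s*=\s*[{"]')


def naive_starts_field(line):
    """Treat any line containing "=" as a key-value pair (the failure mode)."""
    return "=" in line


def starts_field(line):
    """Only accept lines shaped like `Key = {` or `Key = "`."""
    return FIELD_START.match(line) is not None


# Invented lines for illustration only.
field_line = "Abstract = {Patients were informed about screening"
abstract_line = "   benefits (n = 451) or non-evidence-based (e.g., relative risks"

for line in (field_line, abstract_line):
    print(naive_starts_field(line), starts_field(line))
```

With the naive test both lines count as fields, which would explain a column named after the first word of an abstract line. The stricter test accepts only the genuine `Abstract` field, so the second line can be treated as a continuation of the abstract.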