Pubmed import should have quality check multiple times. #4345

Closed
bernhard-kleine opened this issue Sep 15, 2018 · 6 comments

bernhard-kleine commented Sep 15, 2018

When I imported from Medline today (PMID: 30096135), JabRef immediately stopped working. I know that the problem is related to #4323, but even once that problem is resolved, I would argue that anything imported into JabRef should be cleaned up in every possible way. If the author list does not conform to BibTeX, please make it conform. When there is a standard like BibTeX, we should keep to it and not let anything deviate from it.

JabRef 5.0-dev--snapshot--2018-09-13--master--e83680f18
Windows 7 6.1 amd64
Java 1.8.0_181

Steps to reproduce:

  1. Open any library in JabRef.
  2. Import 30096135 (a PLoS article) into the library.
  3. Since the author list after the import does not conform to BibTeX, JabRef stops working. You can only shut down the program. Once you open that library again, it fails again. I worked around this by using JabRef 4.1 to clean the author list.

@bernhard-kleine
Author

@Siedlerchr Would it be much work to run the quality check on every Medline import, and so avoid this tiny error, which arises because the author list contains abbreviated first names without a dot?

Siedlerchr added a commit that referenced this issue Apr 7, 2019
Author names from Medline lack the dot after abbreviated first names,
e.g. Lahiru S instead of Lahiru S.

Fix tests
Fixes #4345

Siedlerchr added a commit that referenced this issue Apr 7, 2019
* Add author normalizer for medline import

Author names from Medline lack the dot after abbreviated first names,
e.g. Lahiru S instead of Lahiru S.

Fix tests
Fixes #4345

* Add changelog entry

@Siedlerchr
Member

I just added the author normalizer to the cleanup operations that run after an import from Medline, so abbreviated names now correctly end with a dot. But I just noticed that for the ID you gave, 30096135, an extra space is inserted after the last name during import.
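
For readers following along, the idea of adding the missing dot can be sketched as below. This is only an illustration under stated assumptions, not JabRef's actual normalizer: the class `InitialDotSketch` and the method `addDotsToInitials` are hypothetical names, authors are assumed to be separated by ` and `, and only single uppercase letters are treated as bare initials.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT JabRef's real author normalizer.
final class InitialDotSketch {

    // Appends a dot to every bare single-letter initial in a BibTeX-style author list.
    // Assumption: authors are separated by " and ", name parts by whitespace.
    static String addDotsToInitials(String authors) {
        List<String> fixedAuthors = new ArrayList<>();
        for (String author : authors.split("\\s+and\\s+")) {
            List<String> fixedTokens = new ArrayList<>();
            for (String token : author.trim().split("\\s+")) {
                // A bare initial is exactly one uppercase letter, e.g. "S" but not "S." or "Sofie".
                boolean bareInitial = token.length() == 1 && Character.isUpperCase(token.charAt(0));
                fixedTokens.add(bareInitial ? token + "." : token);
            }
            fixedAuthors.add(String.join(" ", fixedTokens));
        }
        return String.join(" and ", fixedAuthors);
    }

    public static void main(String[] args) {
        // prints "Nielsen, Sofie K. D. and Garm, Anders"
        System.out.println(addDotsToInitials("Nielsen, Sofie K D and Garm, Anders"));
    }
}
```

Tokens that already carry a dot, such as `L.`, are left untouched, so the sketch can safely be applied to names that are already correct.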

@bernhard-kleine
Author

bernhard-kleine commented Apr 8, 2019

@Siedlerchr: Today I imported PMID 30836949. The first run of the author normalizer works, but this entry needs a second run because the abbreviated first names are wrongly spaced:

Nielsen, Sofie K. D. and Koch, Thomas L. and Hauser, Frank and Garm, Anders and Grimmelikhuijzen, Cornelis J. P.

That is why I suggested running it multiple times on the authors. After a second run it looks like: Nielsen, Sofie K. D. and Koch, Thomas L. and Hauser, Frank and Garm, Anders and Grimmelikhuijzen, Cornelis J. P.

@bernhard-kleine
Author

The double spaces between K. D. and between J. P. do not survive quoting or inline code formatting. Therefore, both versions look the same above, but they are not. I wonder why quotes are edited at all on GitHub?

@Siedlerchr
Member

Try putting them in a multi-line code block: three backticks on the line before the block and three on the line after.

@Siedlerchr
Member

The problem with the double spaces between the author initials should be fixed now as well in the latest master. Refs #4931
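
The "needs a second run" symptom reported above means the cleanup was not idempotent: applying it twice gave a different result than applying it once. A minimal sketch of how such a property could be checked follows. The names `IdempotenceSketch`, `collapseSpaces`, and `isIdempotentOn` are hypothetical and not taken from JabRef, and the whitespace collapse shown is only an assumed stand-in for the real fix.

```java
import java.util.function.UnaryOperator;

// Hypothetical sketch, NOT JabRef's real cleanup code.
final class IdempotenceSketch {

    // Collapses every run of whitespace into a single space and trims the ends.
    static String collapseSpaces(String authors) {
        return authors.trim().replaceAll("\\s+", " ");
    }

    // Returns true if a second run of the cleanup changes nothing for this input.
    static boolean isIdempotentOn(UnaryOperator<String> cleanup, String input) {
        String once = cleanup.apply(input);
        String twice = cleanup.apply(once);
        return once.equals(twice);
    }

    public static void main(String[] args) {
        String imported = "Nielsen, Sofie K.  D. and Grimmelikhuijzen, Cornelis J.  P.";

        // A cleanup that fixes only the first double space needs a second run.
        UnaryOperator<String> partialFix = s -> s.replaceFirst("  ", " ");

        System.out.println(isIdempotentOn(partialFix, imported));
        System.out.println(isIdempotentOn(IdempotenceSketch::collapseSpaces, imported));
    }
}
```

A check of this kind in a test suite would flag any import cleanup that leaves work for a second pass, rather than relying on running the cleanup multiple times.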
