Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EndNote XML importer #3713

Merged
merged 7 commits into from
Feb 16, 2018
Merged

Add EndNote XML importer #3713

merged 7 commits into from
Feb 16, 2018

Conversation

tobiasdiez
Copy link
Member

As wished in the forum.


  • Change in CHANGELOG.md described
  • Tests created for changes
  • Screenshots added (for bigger UI changes)
  • Manually tested changed features in running JabRef
  • Check documentation status (Issue created for outdated help page at help.jabref.org?)

@tobiasdiez tobiasdiez added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Feb 11, 2018
@tobiasdiez
Copy link
Member Author

@Siedlerchr Can you please try to import the test file in Linux. The user reports that JabRef freezes after he confirms the import dialog but I can't reproduce the problem here on Windows. Thanks.

@Siedlerchr
Copy link
Member

Xubuntu 16.04 works fine. No issues.

@Siedlerchr
Copy link
Member

The problem is that in the xml file from the user there are records in which apparently many fields are null and thus result in an NPE.
I fixed some of them.

The root problem I see here is that this thing like Titles etc all are serialized to their own object and not to collections and some of them might be null.
If it were mapped to collections we could just initalize it with empty collections and avoid the null shit.
Maybe this approach helps:
http://blog.bdoughan.com/2012/11/creating-generic-list-wrapper-in-jaxb.html

@Siedlerchr
Copy link
Member

I think I found an idea:
First we generate an XSD file from a concrete xml file and then use that as input for xjc. This should hopefully be able to generate lists from what I read.. Or we convert the dtd to xsd first.

.getRecord().stream()
.map(this::parseRecord)
.collect(Collectors.toList());
.getRecord()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Siedlerchr Thanks for the fix! However, it appears there is still a problem with your eclipse style configuration (alignment at dot).

Copy link
Member

@lenhard lenhard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just have a few comments regarding the code. Not tested it in a running instance.

public static void testImportEntries(Importer importer, String fileName, String fileType) throws IOException, ImportException {
ParserResult parserResult = importer.importDatabase(getPath(fileName), StandardCharsets.UTF_8);
if (parserResult.isInvalid()) {
throw new ImportException(parserResult.getErrorMessage());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not throw an IOException instead? Or maybe put the call to importDatabase into a try-catch that converts the IOException into an ImportException?

Overall, it would be better to throw just one Exception type. Then you don't have to change all tests to throws Exception, which imho is ugly, because it is so generic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the generic Exception the preferred choice for tests anyway? It has the advantage that you not need to change your tests if you decide to change the exception signature of the method under test.

I prefer to keep it that way. The error is not coming directly from the file system and thus it is not an IOException in my opinion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In tests it doesn't matter which exceptions are thrown

Objects.requireNonNull(reader);

try {
JAXBContext context = JAXBContext.newInstance("org.jabref.logic.importer.fileformat.endnote");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constructing a JAXBContext is very expensive. It is so expensive that you notice it even as a human. Please turn this into an instance variable, then the construction takes place only once per importer object. Or you could just store the Unmarshaller, which should be enough as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know! I actually copied this part from the medline importer and will change it there too.

public boolean isRecognizedFormat(BufferedReader reader) throws IOException {
String str;
int i = 0;
while (((str = reader.readLine()) != null) && (i < 50)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a funny way of determining the format :-) Somewhere in the first 50 lines a records tag has to appear? Why up to 50? What if some random garbage comes before the records tag? Would JAXB still be able to parse that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's another thing I just copied from the medline importer. Can jaxb test the file without actually parsing everything? If not I find the "record appears near the beginning" a relative good heuristic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, JAXB will try to parse everything. So you could go with such a heuristic, but could you just improve the code a bit? E.g. no side effect (line reading) in the condition of the while, promoting the magic number 50 to a private final static and so on.

Copy link
Member

@lenhard lenhard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-wise I have nothing to add.

I would say that if the user from the forum has tested the functionality and is happy with it, you can merge.

@Siedlerchr
Copy link
Member

Coday is just complaining about the juntit 5 annotations, but that's not your fault. Otherwise from my side code is good as well.

@Siedlerchr Siedlerchr merged commit bb2b078 into master Feb 16, 2018
@Siedlerchr Siedlerchr deleted the importBookends branch February 16, 2018 15:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants