Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Proper copying of entries #1677

Merged
merged 1 commit into from
Feb 15, 2017
Merged

[WIP] Proper copying of entries #1677

merged 1 commit into from
Feb 15, 2017

Conversation

oscargus
Copy link
Contributor

@oscargus oscargus commented Aug 4, 2016

I'm making a method to properly copy entries from one database to another, including crossref entries and used strings. This is useful in generating a database from an aux file and for generating a database based on an OO/LO document. It could also be useful for generating a subdatabase from selected entries (or copying relevant strings when copying entries).

There are some problems though related to the preamble. In OO/LO the entries may come from several databases and there it is not obvious how to deal with the preamble. However, a possibly solvable problem is to detect which strings are used in the preamble.

In AuxParserTest @stefan-kolb have generated a test file with a string in the preamble. JabRef doesn't parse this in a way such that the string can be resolved automatically. Earlier, the preamble and all strings were copied in AuxParser, but I was hoping to only copy the relevant strings. Any input on how the preamble issue should be solved? One way would simply be to parse the preamble for strings, I guess. Any other suggestions?

  • Change in CHANGELOG.md described
  • Tests created for changes
  • Screenshots added (for bigger UI changes)
  • Manually tested changed features in running JabRef

@koppor
Copy link
Member

koppor commented Aug 10, 2016

Is it possible to also copy the files if the absolute directory of the file differs? The relative path of the source should be resolved against the file path of the current bib file. Including directory creation.

@oscargus
Copy link
Contributor Author

That is indeed another thing which I haven't considered. I was primarily thinking of generating a bib without files (for OO/LO and AUX), but clearly, for proper copying files should be included. Even better if everything can be added to a Zip archive as well.

@koppor
Copy link
Member

koppor commented Sep 13, 2016

@oscargus If the only thing is that the files aren't copied, this should be ready for review, isn't it?

@oscargus
Copy link
Contributor Author

Maybe it is OK to add. It is not really what I aimed for (yet), but the string copying works for AuxParser (I think), so always something.

@koppor
Copy link
Member

koppor commented Sep 14, 2016

Could you point to the concrete file "with a string in the preamble". Are these comment strings? I think, you reference #900?

@oscargus
Copy link
Contributor Author

@koppor
Copy link
Member

koppor commented Sep 15, 2016

@lenhard You recently worked on reading and writing bib databases. Can you comment on parsing @preamble. As far as I understood @oscargus, he wants to know which strings are used at @preamble and thus only copy these fields.

I don't know @preamble. Any pointers to documentation? 😇

@oscargus: Leaving @preamble and file copying aside, this should be ready for review, isn't it? I would like to get this merged to minimize the LOC for PRs. This enables us to quickly review PRs in our spare time. Longer PRs cannot be quickly reviewed.

@lenhard
Copy link
Member

lenhard commented Sep 15, 2016

The preamble is parsed as-is with formatting and everything and set as a String member of BibDataBase. I am unaware of any formatting or syntax restrictions within the preamble.

Honestly, I do not really get what it should be good for, it seems like a relic. Apparently it is part of Bibtex, and, therefore, we parse it, have an editor for it and serialize it. It seems that it is more relevant for programs that compile the bib file. Here are some, not very conclusive references on the preamble: first, second, third.

If you want to decide, which strings are used in a preamble, it would be best to parse the preamble string and look for occurrences of #. Best to keep this simple. If we miss edge cases of preamble usage, it is probably irrelevant, as hardly anybody uses it.

database.insertEntry(entry);
List<BibtexString> usedStrings = (List<BibtexString>) database.getUsedStrings(Arrays.asList(entry));
assertEquals(2, usedStrings.size());
assertTrue((string.getName() == usedStrings.get(0).getName())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we put all strings in a set and just compare two sets?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might indeed be a better solution. The problem at the moment is really that I do not know which order they are added in (or rather, I want the test to be independent of that).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, therefore I asked for a Set. The Set is independent of the order - or did I get your comment wrong?

@oscargus
Copy link
Contributor Author

@koppor yes, the introduced functionality works, although it is not enough to obtain the complete purpose, so it can be reviewed and merged for now.

Personally, I do not really see the need for strings in the preamble, but since we have a test for that it felt wrong to break/remove that test just because I couldn't bother to handle it...

Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments, then it is good to go.

@@ -400,7 +426,7 @@ public BibEntry resolveForStrings(BibEntry entry, boolean inPlace) {
* care not to follow a circular reference pattern.
* If the string is undefined, returns null.
*/
private String resolveString(String label, Set<String> usedIds) {
private String resolveString(String label, Set<String> usedIds, Set<String> allUsedIds) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add Objects.requireNonNull(label);, Objects.requireNonNull(usedIds);, Objects.requireNonNull(allUsedIds);

Please add @paramter allUsedIds containing a short description of the parameter

@@ -342,7 +342,33 @@ public synchronized boolean hasStringLabel(String label) {
*/
public String resolveForStrings(String content) {
Objects.requireNonNull(content, "Content for resolveForStrings must not be null.");
return resolveContent(content, new HashSet<>());
return resolveContent(content, new HashSet<>(), null);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't pass null. Pass an empty set.

database.insertEntry(entry);
List<BibtexString> usedStrings = (List<BibtexString>) database.getUsedStrings(Arrays.asList(entry));
assertEquals(2, usedStrings.size());
assertTrue((string.getName() == usedStrings.get(0).getName())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, therefore I asked for a Set. The Set is independent of the order - or did I get your comment wrong?

@@ -42,6 +45,10 @@ public int getUnresolvedKeysCount() {
return unresolvedKeys.size();
}

public int getInsertedStringsCount() {
return insertedStrings;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add a test case covering that?

Copy link
Member

@tobiasdiez tobiasdiez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general.

@@ -400,7 +426,7 @@ public BibEntry resolveForStrings(BibEntry entry, boolean inPlace) {
* care not to follow a circular reference pattern.
* If the string is undefined, returns null.
*/
private String resolveString(String label, Set<String> usedIds) {
private String resolveString(String label, Set<String> usedIds, Set<String> allUsedIds) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between usedIds and allUsedIds? They seem to have the same purpose.

}

for (String stringId : allUsedIds) {
result.add((BibtexString) bibtexStrings.get(stringId).clone());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you use clone here?

@@ -413,11 +439,14 @@ private String resolveString(String label, Set<String> usedIds) {
}
// If not, log this string's ID now.
usedIds.add(string.getId());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you get the id of the string and above in getUsedStrings you get the string from the id.
Thus I would propose to change Set<String> usedIds to Set<BibString> usedStrings.

public void getUsedStringsSingleStringWithString() {
BibEntry entry = new BibEntry(IdGenerator.next());
entry.setField("author", "#AAA#");
BibtexString string = new BibtexString(IdGenerator.next(), "AAA", "Some other #BBB#");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move the string initalization and adding them to the db to the @setup method.

database.addString(string2);
database.addString(string3);
database.insertEntry(entry);
List<BibtexString> usedStrings = (List<BibtexString>) database.getUsedStrings(Arrays.asList(entry));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create overload of getUsedStrings for one entry and use Collections.singletonList.

@lenhard lenhard merged commit f1838e7 into JabRef:master Feb 15, 2017
@lenhard
Copy link
Member

lenhard commented Feb 15, 2017

As of 1684cdc this is integrated into master now. I had to do it on a separate branch since I have no commit access to the branch here. However, all commits are preserved.

I addressed the issues mentioned above that I agreed with, which were all rather minor quality things.

So this branch can be closed now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants