New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dspace harvester first version #4
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joseph6x nice work. I have posted some comments, please read it and consider addressing some of them before you merge your PR to the main branch. Thanks for the nice work.
<artifactId>org.vivoweb.dspacevivo.model.openapi</artifactId> | ||
<version>0.0.1-SNAPSHOT</version> | ||
</dependency> | ||
</dependencies> | ||
</project> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add new line at the end of file
|
||
public abstract void connect(); | ||
public abstract Iterator<Item> harvestItems(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new line at the end of file is missing
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new line
|
||
|
||
|
||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multiple empty lines
System.out.println("ºººººººººººººº"); | ||
System.out.println(origin.getConf().getProperty("uriPrefix")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use logs, not printing in console.
import java.util.Optional; | ||
import org.vivoweb.dspacevivo.model.Item; | ||
|
||
public class ItemItr implements Iterator<Item> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why do we need this class, but anyway its implementation should be improved. Beginning of hasNext is confusing, and the reason for that is overriding of next() method. List is also iterable, can we use next of that, and not removing elements from the list. Or it is implemented as that for the need of cleaning memory consumption (OAI supports resumption token). However, please consider redesigning of this class.
this.ignoreSSLWarnings = ignoreSSLWarnings; | ||
} | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new line
} | ||
|
||
public Optional<String> getResumptionToken() { | ||
Optional<String> vl = Optional.empty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is v1 abbreviation for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
verb number 1???
</xsl:for-each> | ||
</oai_dc> | ||
</xsl:template> | ||
</xsl:stylesheet> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joseph6x please check my comments. Well done!!!
} | ||
|
||
@Override | ||
public Community next() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I don't like how this is implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To get the data from Dspace multiple OAI requests have to be performed. Each OAI response returns a list (page) of elements e.g. Items, Communities, Collections. In these Iterator classes I am converting the responses from pages to a stream of elements, so the complexity of pagination in the OAI requests are hidden behind the Iterator class. this is the reason why I am using a temporary list to store the results of the current page and in the hasNext method the OAI requests are executed when the temporary list is empty. Please let me know if you have suggestions on how to improve it
while (harvestItemsItr.hasNext()) { | ||
count++; | ||
Item next = harvestItemsItr.next(); | ||
System.out.println("new Item harvested..."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use loggers instead of printing to console?
Hi @chenejac . I have just pushed the last version of the harvester. Please let me know if I can merge the PR. |
Dspace harvester first version