Crossref import with DOI URI behaves incorrectly #9561
Comments
@alanorth The current implementation Line 152 in 823ade9
So the effective Queries called in the search are e.g.
Reproduced on today's sandbox. Searching by title is also possible.
@floriangantner Ah I see! It's hard to imagine this free-text search being useful. Returning more than a single page of results (let alone millions!) is a terrible user experience. Unless there's some way to make that free-text search more useful, I would say that we should make this explicitly use DOIs, because that will return an exact match and is more likely the workflow that submitters will be using (at least at our institute, where our submitters are cataloging a journal article authored by one of our scientists).
I'd blame Line 119 in 1517e8c
The crossrefimport explicitly checks if a DOI is given or not and only searches by query if no DOI is provided: Lines 83 to 87 in 1517e8c
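To make the consequence of that check concrete, here is a minimal sketch of the two request shapes. It is not the DSpace code: `CrossrefRequestSketch`, `buildUrl` and the `looksLikeDoi` predicate are hypothetical names, and the base URL assumes Crossref's public REST API at `https://api.crossref.org/works`. The point is only that any input the DOI check rejects ends up in the free-text branch.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.function.Predicate;

class CrossrefRequestSketch {
    // Assumption: base URL of Crossref's public REST API for works.
    static final String WORKS = "https://api.crossref.org/works";

    // Hypothetical helper: chooses an exact ID lookup when the input is
    // recognized as a DOI, and a free-text search otherwise.
    static String buildUrl(String input, Predicate<String> looksLikeDoi) {
        String encoded = URLEncoder.encode(input, StandardCharsets.UTF_8);
        if (looksLikeDoi.test(input)) {
            return WORKS + "/" + encoded;       // exact match, at most one record
        }
        return WORKS + "?query=" + encoded;     // free-text search, possibly millions of hits
    }

    public static void main(String[] args) {
        // Stand-in for a narrow DOI check that only knows the bare form.
        Predicate<String> narrowCheck = s -> s.startsWith("10.");
        System.out.println(buildUrl("10.1108/CAER-03-2020-0040", narrowCheck));
        System.out.println(buildUrl("https://doi.org/10.1108/CAER-03-2020-0040", narrowCheck));
    }
}
```

With a narrow check like the stand-in above, the bare DOI is looked up as an ID while the URI form of the same DOI is sent as a search term.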
However, only an extremely limited set of prefixes is recognized by DSpace/dspace-api/src/main/java/org/dspace/importer/external/service/DoiCheck.java Line 22 in 1517e8c
so https://doi.org/10.1108/CAER-03-2020-0040 is sent as a query parameter and not as an ID. We ran into the same problem a while ago and added a few more valid prefixes, some of them very legacy:

    private static final List<String> DOI_PREFIXES = Arrays.asList(
        "http://dx.doi.org/",
        "https://dx.doi.org/",
        "http://www-dx.doi.org/",
        "https://www-dx.doi.org/",
        "http://doi.org/",
        "https://doi.org/",
        "dx.doi.org/",
        "www.dx.doi.org/",
        "doi:");
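For illustration, a prefix list like this could be used to reduce any of the URI forms to a bare DOI before the lookup. The sketch below is an assumption about how that might look, not the actual `DoiCheck` implementation: the class name `DoiNormalizerSketch`, the method `toBareDoi` and the loose `10.<registrant>/<suffix>` shape check are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

class DoiNormalizerSketch {
    // Prefix list as quoted in the comment above.
    private static final List<String> DOI_PREFIXES = Arrays.asList(
        "http://dx.doi.org/", "https://dx.doi.org/",
        "http://www-dx.doi.org/", "https://www-dx.doi.org/",
        "http://doi.org/", "https://doi.org/",
        "dx.doi.org/", "www.dx.doi.org/", "doi:");

    // Assumption: a deliberately loose shape check, "10.", 4 to 9 digits, "/", then a suffix.
    private static final Pattern BARE_DOI = Pattern.compile("^10\\.\\d{4,9}/\\S+$");

    // Strips the first matching prefix (case-insensitively) and returns the
    // bare DOI, or empty if what remains does not look like a DOI.
    static Optional<String> toBareDoi(String input) {
        String value = input.trim();
        for (String prefix : DOI_PREFIXES) {
            if (value.regionMatches(true, 0, prefix, 0, prefix.length())) {
                value = value.substring(prefix.length());
                break;
            }
        }
        return BARE_DOI.matcher(value).matches() ? Optional.of(value) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(toBareDoi("https://doi.org/10.1108/CAER-03-2020-0040"));
        System.out.println(toBareDoi("some free text title"));
    }
}
```

Returning an `Optional` keeps the two cases apart: a present value can go to the ID lookup, and an empty one can fall back to the free-text query.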
@hutattedonmyarm Wow, yes, this is very simple and obviously correct. While there doesn't seem to be a single canonical form for DOIs, I think it has become more common to use the [...]. Could you submit a patch with your additions?
Describe the bug
In at least DSpace 7.6.1 and the current DSpace 8.0-SNAPSHOT, importing an item from Crossref using the URI form of its DOI returns millions of results.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
DSpace should show exactly one result for the DOI.
Crossref's API supports retrieving the DOI in various formats, so I'm not sure what is going on. See:
Related work
#9385