Merged
43 commits
a769ea1
OpenAIRE Datacite 4.1 exporter
francescopioscognamiglio May 11, 2018
1d0a72a
Merge branch 'develop' into openaire
abollini May 12, 2018
06ce109
minor version is now a proper long
francescopioscognamiglio May 14, 2018
32e5544
OpenAIRE compliance
juancorr May 24, 2018
d9ef102
Merge pull request #2 from Consorcio-Madrono/openaire
lap82 May 24, 2018
ccf419a
Merge branch 'develop' into openaire
lap82 May 24, 2018
b6743d8
- clean code
francescopioscognamiglio May 24, 2018
4b28306
add "DataCite OpenAIRE" to list of export formats #4257
pdurbin May 29, 2018
7c11bc0
put DataCite OpenAire export in dropdown #4257 #3697
pdurbin May 30, 2018
a51743b
add some JUnit tests #4257 #3697
pdurbin May 30, 2018
272290c
Merge branch 'develop' into openaire #4257 #3697
pdurbin May 30, 2018
5336e67
increase code coverage of openaire exporter #4257 #3697
pdurbin May 31, 2018
0a5ad78
Merged with develop branch
Apr 2, 2019
02aabcf
Openaire export reviewed, more tests added
Apr 10, 2019
103925a
Merge remote-tracking branch 'upstream/develop' into openaire
Apr 10, 2019
fd15d15
Revised metadata (restricted access, language, funder)
Apr 12, 2019
b691b3a
Merge remote-tracking branch 'upstream/develop' into openaire
Apr 12, 2019
3347b50
creator nametype improved using DataCite algorithm
Apr 17, 2019
578bf89
creator nametype improved using DataCite algorithm
Apr 17, 2019
02c6af9
Fix FirstNameTest tests
Apr 17, 2019
32565e3
Minor fixes
Apr 18, 2019
b6c6045
Merge remote-tracking branch 'contributor/openaire' into openaire
Apr 18, 2019
a745e18
Merge remote-tracking branch 'upstream/develop' into openaire
Apr 18, 2019
b6215ef
Minor fixes
Apr 18, 2019
d5571dd
Merge branch 'openaire' of https://github.com/fcadili/dataverse into …
Apr 18, 2019
fdc2328
Merge remote-tracking branch 'contributor/openaire' into openaire
Apr 18, 2019
78c1768
Merge branch 'develop' into openaire #4257
pdurbin Apr 22, 2019
5e9e924
note that some partial renaming of variables was done #4257
pdurbin Apr 22, 2019
03c959b
reference related tests about first name #4257
pdurbin Apr 22, 2019
68c0a93
DataCite algorithm was applied to handle contributors nametype properly
Apr 24, 2019
6b8d524
Merge remote-tracking branch 'upstream/develop' into openaire
Apr 24, 2019
cc1c6f9
Merge branch 'openaire' of https://github.com/4Science/dataverse into…
Apr 24, 2019
0311179
fix Organizational value of contributor's nameType
Apr 24, 2019
3a8247c
change button to "OpenAIRE", update guides #4257
pdurbin Apr 25, 2019
5f0a81a
Added hint file
Apr 26, 2019
8030521
Merge branch 'openaire' of https://github.com/4Science/dataverse into…
Apr 26, 2019
9496718
Added firstName resources to generated war
May 2, 2019
3a79173
Merge remote-tracking branch 'upstream/develop' into openaire
May 2, 2019
bd9a3e3
Adding organization recognition with NPL
May 6, 2019
c214229
Fixing Organizational nameType
May 6, 2019
62d5fc0
Merge remote-tracking branch 'upstream/develop' into openaire
May 6, 2019
4d4ef39
Adding NPL tokenizer to handle organization names with comma
May 7, 2019
5d876aa
Merge remote-tracking branch 'upstream/develop' into openaire
May 7, 2019
3 changes: 3 additions & 0 deletions .gitignore
@@ -2,6 +2,9 @@ nb-configuration.xml
target
infer-out
nbactions.xml
.settings
.classpath
.project
michael-local
GPATH
GTAGS
9 changes: 1 addition & 8 deletions doc/sphinx-guides/source/admin/metadataexport.rst
@@ -7,14 +7,7 @@ Metadata Export
Automatic Exports
-----------------

Publishing a dataset automatically starts a metadata export job, that will run in the background, asynchronously. Once completed, it will make the dataset metadata exported and cached in all the supported formats:

- Dublin Core
- Data Documentation Initiative (DDI)
- DataCite 4
- native JSON (Dataverse-specific)
- OAI_ORE
- Schema.org JSON-LD
Publishing a dataset automatically starts a metadata export job that runs asynchronously in the background. Once the job completes, the dataset metadata is exported and cached in all the supported formats listed under :ref:`Supported Metadata Export Formats <metadata-export-formats>` in the :doc:`/user/dataset-management` section of the User Guide.

A scheduled timer job that runs nightly will attempt to export any published datasets that for whatever reason haven't been exported yet. This timer is activated automatically on the deployment, or restart, of the application. So, again, no need to start or configure it manually. (See the "Application Timers" section of this guide for more information)

2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/api/native-api.rst
@@ -291,7 +291,7 @@ Export Metadata of a Dataset in Various Formats

GET http://$SERVER/api/datasets/export?exporter=ddi&persistentId=$persistentId

.. note:: Supported exporters (export formats) are ``ddi``, ``oai_ddi``, ``dcterms``, ``oai_dc``, ``schema.org`` , ``OAI_ORE`` , ``Datacite`` and ``dataverse_json``.
.. note:: Supported exporters (export formats) are ``ddi``, ``oai_ddi``, ``dcterms``, ``oai_dc``, ``schema.org`` , ``OAI_ORE`` , ``Datacite``, ``oai_datacite`` and ``dataverse_json``.
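Review note: a minimal sketch of how a client might build the export URL for the new ``oai_datacite`` exporter, following the ``GET`` pattern shown above. The class name, the server address, and the persistent ID are placeholders chosen for illustration and are not part of this PR.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class ExportUrlSketch {

    // Builds the export URL from the pattern in the API guide.
    // Query parameter values are URL-encoded so that ':' and '/' in a
    // persistent ID do not break the query string.
    static String exportUrl(String server, String exporter, String persistentId) {
        return server + "/api/datasets/export"
                + "?exporter=" + URLEncoder.encode(exporter, StandardCharsets.UTF_8)
                + "&persistentId=" + URLEncoder.encode(persistentId, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical server and DOI, used only to show the shape of the URL.
        System.out.println(exportUrl("http://localhost:8080", "oai_datacite", "doi:10.5072/FK2/EXAMPLE"));
    }
}
```

The exporter name passed here is the value returned by ``getProviderName()`` in ``OpenAireExporter``.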

Schema.org JSON-LD
^^^^^^^^^^^^^^^^^^
17 changes: 15 additions & 2 deletions doc/sphinx-guides/source/user/dataset-management.rst
@@ -20,7 +20,20 @@ A dataset contains three levels of metadata:

For more details about what Citation and Domain Specific Metadata is supported please see our :ref:`user-appendix`.

Note that once a dataset has been published its metadata may be exported. A button on the dataset page's metadata tab will allow a user to export the metadata of the most recently published version of the dataset. Currently supported export formats are DDI, Dublin Core, Datacite 4, OAI_ORE, Schema.org JSON-LD, and Dataverse's native JSON format.
.. _metadata-export-formats:

Supported Metadata Export Formats
---------------------------------

Once a dataset has been published, its metadata is exported in a variety of formats. A button on the dataset page's metadata tab allows a user to export the metadata of the most recently published version of the dataset. Currently supported export formats are:

- Dublin Core
- DDI (Data Documentation Initiative)
- DataCite 4
- JSON (native Dataverse format)
- OAI_ORE
- OpenAIRE
- Schema.org JSON-LD

Adding a New Dataset
====================
@@ -510,4 +523,4 @@ If you deaccession the most recently published version of the dataset but not al
.. |file-upload-prov-window| image:: ./img/prov1.png
:class: img-responsive
.. |image-file-tree-view| image:: ./img/file-tree-view.png
:class: img-responsive
:class: img-responsive
7 changes: 7 additions & 0 deletions pom.xml
@@ -599,6 +599,12 @@
<artifactId>tika-parsers</artifactId>
<version>1.19</version>
</dependency>
<!-- Named Entity Recognition -->
<dependency>
<groupId>org.apache.opennlp</groupId>
<artifactId>opennlp-tools</artifactId>
<version>1.9.1</version>
</dependency>
</dependencies>
<build>
<!-- <testResources>
@@ -632,6 +638,7 @@
<includes>
<include>**/*.sql</include>
<include>**/*.xml</include>
<include>**/firstNames/*.*</include>
</includes>
</resource>
</resources>
@@ -21,7 +21,7 @@
import edu.harvard.iq.dataverse.engine.command.DataverseRequest;
import edu.harvard.iq.dataverse.engine.command.exception.IllegalCommandException;
import edu.harvard.iq.dataverse.settings.SettingsServiceBean;
import edu.harvard.iq.dataverse.util.FileUtil;;
import edu.harvard.iq.dataverse.util.FileUtil;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
@@ -12,6 +12,7 @@ public class DataFileDTO {
private String id;
private String storageIdentifier;
private String contentType;
private Long filesize;
private String filename;
private String originalFileFormat;
private String originalFormatLabel;
@@ -61,6 +62,14 @@ public String getContentType() {
public void setContentType(String contentType) {
this.contentType = contentType;
}

public Long getFileSize() {
return filesize;
}

public void setFileSize(Long fileSize) {
this.filesize = fileSize;
}

public String getFilename() {
return filename;
@@ -12,8 +12,9 @@
public class DatasetVersionDTO {
String archiveNote;
String deacessionLink;
// FIXME: Change to versionNumberMajor and versionNumberMinor? Some partial renaming of "minor" was done.
Long versionNumber;
String minorVersionNumber;
Long versionMinorNumber;
long id;
VersionState versionState;
String releaseDate;
@@ -36,7 +37,8 @@ public class DatasetVersionDTO {
String availabilityStatus;
String contactForAccess;
String sizeOfCollection;
String studyCompletion;
String studyCompletion;
boolean fileAccessRequest;
String citation;
String license;
boolean inReview;
@@ -172,7 +174,15 @@ public String getStudyCompletion() {
public void setStudyCompletion(String studyCompletion) {
this.studyCompletion = studyCompletion;
}


public boolean isFileAccessRequest() {
return fileAccessRequest;
}

public void setFileAccessRequest(boolean fileAccessRequest) {
this.fileAccessRequest = fileAccessRequest;
}

public String getCitation() {
return citation;
}
@@ -229,12 +239,12 @@ public void setVersionNumber(Long versionNumber) {
this.versionNumber = versionNumber;
}

public String getMinorVersionNumber() {
return minorVersionNumber;
public Long getMinorVersionNumber() {
return versionMinorNumber;
}

public void setMinorVersionNumber(String minorVersionNumber) {
this.minorVersionNumber = minorVersionNumber;
public void setMinorVersionNumber(Long minorVersionNumber) {
this.versionMinorNumber = minorVersionNumber;
}
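Review note: the PR does not state why the minor version moved from ``String`` to ``Long``. One plausible benefit is numeric rather than lexicographic ordering, sketched below. The class and method names are made up for this illustration.

```java
class MinorVersionOrderingSketch {

    // Compares two minor versions held as strings: character-by-character order.
    static int compareAsStrings(String a, String b) {
        return a.compareTo(b);
    }

    // Compares two minor versions held as Longs: numeric order.
    static int compareAsLongs(Long a, Long b) {
        return Long.compare(a, b);
    }

    public static void main(String[] args) {
        // As strings, "10" sorts before "9" because '1' < '9'.
        System.out.println(compareAsStrings("10", "9") < 0);
        // As numbers, 10 sorts after 9.
        System.out.println(compareAsLongs(10L, 9L) > 0);
    }
}
```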

public long getId() {
@@ -320,7 +330,7 @@ public List<FieldDTO> getDatasetFields() {

@Override
public String toString() {
return "DatasetVersionDTO{" + "archiveNote=" + archiveNote + ", deacessionLink=" + deacessionLink + ", versionNumber=" + versionNumber + ", minorVersionNumber=" + minorVersionNumber + ", id=" + id + ", versionState=" + versionState + ", releaseDate=" + releaseDate + ", lastUpdateTime=" + lastUpdateTime + ", createTime=" + createTime + ", archiveTime=" + archiveTime + ", UNF=" + UNF + ", metadataBlocks=" + metadataBlocks + ", fileMetadatas=" + fileMetadatas + '}';
return "DatasetVersionDTO{" + "archiveNote=" + archiveNote + ", deacessionLink=" + deacessionLink + ", versionNumber=" + versionNumber + ", minorVersionNumber=" + versionMinorNumber + ", id=" + id + ", versionState=" + versionState + ", releaseDate=" + releaseDate + ", lastUpdateTime=" + lastUpdateTime + ", createTime=" + createTime + ", archiveTime=" + archiveTime + ", UNF=" + UNF + ", metadataBlocks=" + metadataBlocks + ", fileMetadatas=" + fileMetadatas + '}';
}


22 changes: 20 additions & 2 deletions src/main/java/edu/harvard/iq/dataverse/api/dto/FileDTO.java
@@ -1,9 +1,27 @@
package edu.harvard.iq.dataverse.api.dto;

public class FileDTO {


String label;
boolean restricted;
DataFileDTO dataFile;


public String getLabel() {
return label;
}

public void setLabel(String label) {
this.label = label;
}

public boolean isRestricted() {
return restricted;
}

public void setRestricted(boolean restricted) {
this.restricted = restricted;
}

public DataFileDTO getDataFile() {
return dataFile;
}
@@ -1154,10 +1154,10 @@ private void parseVersionNumber(DatasetVersionDTO dvDTO, String versionNumber) {
int firstIndex = versionNumber.indexOf('.');
if (firstIndex == -1) {
dvDTO.setVersionNumber(Long.parseLong(versionNumber));
dvDTO.setMinorVersionNumber("0");
dvDTO.setMinorVersionNumber(0L);
} else {
dvDTO.setVersionNumber(Long.parseLong(versionNumber.substring(0, firstIndex - 1)));
dvDTO.setMinorVersionNumber(versionNumber.substring(firstIndex + 1));
dvDTO.setMinorVersionNumber(Long.valueOf(versionNumber.substring(firstIndex + 1)));
}
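Review note: a standalone sketch of splitting a "major.minor" string into two longs, written independently of the DTO classes. The class name and the array return type are assumptions made for the example. Because the end index of ``String.substring`` is exclusive, ``substring(0, firstIndex)`` already covers everything before the dot, while ``substring(0, firstIndex - 1)`` as written in the hunk above would drop the last digit of the major number and produce an empty string for an input such as "1.0".

```java
class VersionNumberSketch {

    // Returns {major, minor}. A string without a dot is treated as "major.0".
    static long[] parse(String versionNumber) {
        int dot = versionNumber.indexOf('.');
        if (dot == -1) {
            return new long[] { Long.parseLong(versionNumber), 0L };
        }
        // End index is exclusive: (0, dot) is the text before the dot.
        long major = Long.parseLong(versionNumber.substring(0, dot));
        long minor = Long.parseLong(versionNumber.substring(dot + 1));
        return new long[] { major, minor };
    }

    public static void main(String[] args) {
        long[] v = parse("2.10");
        System.out.println(v[0] + " / " + v[1]);
    }
}
```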


@@ -85,7 +85,7 @@ public static String getDisplayNameFromDiscoFeed(String entityIdToFind, String d
*
* - "Guido|van Rossum"
*
* - "Philip Seymour|Hoffman"
* - "Philip Seymour|Hoffman" (see FirstNameTest.java)
*
* Also, we currently compel all Shibboleth IdPs to send us firstName and
* lastName so the logic to handle null/empty values for firstName and
@@ -316,7 +316,7 @@ private InputStream getCachedExportFormat(Dataset dataset, String formatName) th
try {
dataAccess = DataAccess.getStorageIO(dataset);
} catch (IOException ioex) {
throw new IOException("IO Exception thrown exporting as " + "export_" + formatName + ".cached");
throw new IOException("IO Exception thrown exporting as " + "export_" + formatName + ".cached", ioex);
}

InputStream cachedExportInputStream = null;
@@ -325,7 +325,7 @@ private InputStream getCachedExportFormat(Dataset dataset, String formatName) th
cachedExportInputStream = dataAccess.getAuxFileAsInputStream("export_" + formatName + ".cached");
return cachedExportInputStream;
} catch (IOException ioex) {
throw new IOException("IO Exception thrown exporting as " + "export_" + formatName + ".cached");
throw new IOException("IO Exception thrown exporting as " + "export_" + formatName + ".cached", ioex);
}

}
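Review note: the change above passes the caught exception as the cause of the new ``IOException``. A small sketch of the effect, with an invented low-level failure standing in for the storage error (the class name, method names, and the "disk unavailable" message are illustrative only):

```java
import java.io.IOException;

class ExceptionChainingSketch {

    // Wraps a low-level failure the same way the patched code does,
    // keeping the original exception as the cause.
    static void open(String formatName) throws IOException {
        try {
            throw new IOException("disk unavailable"); // stand-in for the storage failure
        } catch (IOException ioex) {
            throw new IOException("IO Exception thrown exporting as export_" + formatName + ".cached", ioex);
        }
    }

    // Returns the cause attached to the wrapped exception, or null if none was thrown.
    static Throwable causeOf(String formatName) {
        try {
            open(formatName);
            return null;
        } catch (IOException e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        System.out.println(causeOf("oai_datacite").getMessage());
    }
}
```

Without the second constructor argument, ``getCause()`` would return ``null`` and the original stack trace would be lost from the logs.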
@@ -0,0 +1,75 @@
package edu.harvard.iq.dataverse.export;

import java.io.OutputStream;

import javax.json.JsonObject;
import javax.xml.stream.XMLStreamException;

import com.google.auto.service.AutoService;

import edu.harvard.iq.dataverse.DatasetVersion;
import edu.harvard.iq.dataverse.export.openaire.OpenAireExportUtil;
import edu.harvard.iq.dataverse.export.spi.Exporter;
import edu.harvard.iq.dataverse.util.BundleUtil;

@AutoService(Exporter.class)
public class OpenAireExporter implements Exporter {

public OpenAireExporter() {
}

@Override
public String getProviderName() {
return "oai_datacite";
}

@Override
public String getDisplayName() {
return BundleUtil.getStringFromBundle("dataset.exportBtn.itemLabel.dataciteOpenAIRE");
}

@Override
public void exportDataset(DatasetVersion version, JsonObject json, OutputStream outputStream)
throws ExportException {
try {
OpenAireExportUtil.datasetJson2openaire(json, outputStream);
} catch (XMLStreamException xse) {
throw new ExportException("Caught XMLStreamException performing DataCite OpenAIRE export", xse);
}
}

@Override
public Boolean isXMLFormat() {
return true;
}

@Override
public Boolean isHarvestable() {
return true;
}

@Override
public Boolean isAvailableToUsers() {
return true;
}

@Override
public String getXMLNameSpace() throws ExportException {
return OpenAireExportUtil.RESOURCE_NAMESPACE;
}

@Override
public String getXMLSchemaLocation() throws ExportException {
return OpenAireExportUtil.RESOURCE_SCHEMA_LOCATION;
}

@Override
public String getXMLSchemaVersion() throws ExportException {
return OpenAireExportUtil.SCHEMA_VERSION;
}

@Override
public void setParam(String name, Object value) {
// not used
}
}
@@ -0,0 +1,28 @@
package edu.harvard.iq.dataverse.export.openaire;

import org.apache.commons.lang3.StringUtils;

/**
*
* @author francesco.cadili@4science.it
*/
public class Cleanup {

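/**
* Normalizes a name: trims it, puts exactly one space after each comma,
* and collapses runs of spaces into a single space.
*
* @param sentence full name or organization name
* @return the normalized string, or an empty string if the input is blank
*/
public static String normalize(String sentence) {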
if (StringUtils.isBlank(sentence)) {
return "";
}

sentence = sentence.trim()
.replaceAll(", *", ", ")
.replaceAll(" +", " ");

return sentence;
}
}
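Review note: a usage sketch for the normalization steps in ``Cleanup.normalize``. To stay dependency-free, the sketch re-implements the method with a plain null/blank check in place of ``StringUtils.isBlank``, which is an approximation. The class name and the sample inputs are invented for the example.

```java
class CleanupUsageSketch {

    // Stand-in for Cleanup.normalize with the same steps: trim, exactly one
    // space after each comma, then collapse runs of spaces.
    static String normalize(String sentence) {
        if (sentence == null || sentence.trim().isEmpty()) {
            return "";
        }
        return sentence.trim()
                .replaceAll(", *", ", ")
                .replaceAll(" +", " ");
    }

    public static void main(String[] args) {
        System.out.println(normalize("  Smith,John "));
        System.out.println(normalize("Harvard   University,   IQSS"));
        // A space before the comma is left in place by these rules.
        System.out.println(normalize("Smith , John"));
    }
}
```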