10288 Add keywordTermURI metadata in keyword block (#10371)
* New keywordTermURI Metadata in keyword Metadata Block

* update of the keywordVocabularyURI metadata to make it consistent with its name Controlled Vocabulary URL

* fix description and watermark properties

* 10288 adding documentation

* 10288 - adaptation of the Solr schema and dataset exports

* 10288 - Adjustment of typo and sql

* Adaptations for Dataverse 6.2

* 10288 - rollback keywordVocabularyURL to keywordVocabularyURI

* 10288 - removing obsolete SQL script

* 10288 - Label modification to follow Dataverse recommendations

* 10288 - Added valueURI attribute for OpenAire export

* Fix NoResultException on DatasetServiceBean.findDeep (.getSingleResult():L137)

---------

Co-authored-by: Ludovic DANIEL <ludovic.daniel@smile.fr>
Co-authored-by: Jérôme ROUCOU <jerome.roucou@inrae.fr>
3 people committed Jun 14, 2024
1 parent ad58f3e commit 00020e2
Showing 15 changed files with 208 additions and 133 deletions.
2 changes: 2 additions & 0 deletions conf/solr/9.3.0/schema.xml
@@ -326,6 +326,7 @@
<field name="journalVolumeIssue" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="keyword" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="keywordValue" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="keywordTermURI" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="keywordVocabulary" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="keywordVocabularyURI" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="kindOfData" type="text_en" multiValued="true" stored="true" indexed="true"/>
@@ -565,6 +566,7 @@
<copyField source="journalVolumeIssue" dest="_text_" maxChars="3000"/>
<copyField source="keyword" dest="_text_" maxChars="3000"/>
<copyField source="keywordValue" dest="_text_" maxChars="3000"/>
<copyField source="keywordTermURI" dest="_text_" maxChars="3000"/>
<copyField source="keywordVocabulary" dest="_text_" maxChars="3000"/>
<copyField source="keywordVocabularyURI" dest="_text_" maxChars="3000"/>
<copyField source="kindOfData" dest="_text_" maxChars="3000"/>
53 changes: 53 additions & 0 deletions doc/release-notes/10288-add-term_uri-metadata-in-keyword-block.md
@@ -0,0 +1,53 @@
### New keywordTermURI Metadata in keyword Metadata Block

A new metadata field, `keywordTermURI`, has been added to the `keyword` metadata block to facilitate integration with controlled vocabulary services, in particular by making it possible to save a "term" together with its associated URI. For more information, see #10288 and PR #10371.
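
As an illustration of how a term and its URI sit side by side, here is a minimal Python sketch that builds one `keyword` entry in the shape used by `scripts/api/data/dataset-create-new-all-default-fields.json`. The helper name `keyword_entry` and its parameters are hypothetical and exist only for this example; the field names and the sample values come from that JSON file.

```python
import json


def keyword_entry(term, term_uri=None, vocabulary=None):
    """Build one `keyword` compound value (hypothetical helper, for illustration only)."""

    def primitive(type_name, value):
        # Every child field of `keyword` is a single-valued primitive.
        return {
            "typeName": type_name,
            "multiple": False,
            "typeClass": "primitive",
            "value": value,
        }

    entry = {"keywordValue": primitive("keywordValue", term)}
    if term_uri is not None:
        # The new field: the URI that identifies the term itself.
        entry["keywordTermURI"] = primitive("keywordTermURI", term_uri)
    if vocabulary is not None:
        entry["keywordVocabulary"] = primitive("keywordVocabulary", vocabulary)
    return entry


entry = keyword_entry("KeywordTerm1", term_uri="http://keywordTermURI1.org")
print(json.dumps(entry, indent=2))
```

The printed object can be compared against the `keywordTermURI` entries in the sample dataset JSON.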

## Upgrade Instructions

1\. Update the Citation metadata block

- `wget https://github.com/IQSS/dataverse/releases/download/v6.3/citation.tsv`
- `curl http://localhost:8080/api/admin/datasetfield/load -X POST --data-binary @citation.tsv -H "Content-type: text/tab-separated-values"`

2\. Update your Solr `schema.xml` to include the new field.

For details, please see https://guides.dataverse.org/en/latest/admin/metadatacustomization.html#updating-the-solr-schema
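
After editing the schema, a quick sanity check can confirm that both the `<field>` and the `<copyField>` entries for `keywordTermURI` are present. The following is a minimal sketch, not part of Dataverse: the function name `schema_has_field` and the inline `SAMPLE_SCHEMA` are assumptions made for this example, and in practice you would read your own `schema.xml` instead.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for schema.xml (assumption: a real schema has many more entries).
SAMPLE_SCHEMA = """<schema name="sample" version="1.6">
  <field name="keywordValue" type="text_en" multiValued="true" stored="true" indexed="true"/>
  <field name="keywordTermURI" type="text_en" multiValued="true" stored="true" indexed="true"/>
  <copyField source="keywordValue" dest="_text_" maxChars="3000"/>
  <copyField source="keywordTermURI" dest="_text_" maxChars="3000"/>
</schema>"""


def schema_has_field(schema_xml, name):
    """Return True if the schema declares `name` as a field and copies it to `_text_`."""
    root = ET.fromstring(schema_xml)
    has_field = any(f.get("name") == name for f in root.iter("field"))
    has_copy = any(
        c.get("source") == name and c.get("dest") == "_text_"
        for c in root.iter("copyField")
    )
    return has_field and has_copy


print(schema_has_field(SAMPLE_SCHEMA, "keywordTermURI"))
```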


3\. Reindex Solr.

Once `schema.xml` is updated, restart Solr and start a reindex.
For details, see https://guides.dataverse.org/en/latest/admin/solr-search-index.html. The reindex command is:

`curl http://localhost:8080/api/admin/index`


4\. Run ReExportAll to update dataset metadata exports. Follow the instructions in the [Metadata Export section of the Admin Guide](https://guides.dataverse.org/en/latest/admin/metadataexport.html#batch-exports-through-the-api).


## Notes for Dataverse Installation Administrators

### Data migration to the new `keywordTermURI` field

If existing `keywordValue` entries contain URIs, you can migrate them to the new `keywordTermURI` field.
Before migrating, review the affected data with the following database query:

```
SELECT value FROM datasetfieldvalue dfv
INNER JOIN datasetfield df ON df.id = dfv.datasetfield_id
WHERE df.datasetfieldtype_id = (SELECT id FROM datasetfieldtype WHERE name = 'keywordValue')
AND value ILIKE 'http%';
```

To migrate the data, run the following database update:

```
UPDATE datasetfield df
SET datasetfieldtype_id = (SELECT id FROM datasetfieldtype WHERE name = 'keywordTermURI')
FROM datasetfieldvalue dfv
WHERE dfv.datasetfield_id = df.id
AND df.datasetfieldtype_id = (SELECT id FROM datasetfieldtype WHERE name = 'keywordValue')
AND dfv.value ILIKE 'http%';
```

After the update, a ['Reindex in Place'](https://guides.dataverse.org/en/latest/admin/solr-search-index.html#reindex-in-place) is required, and ReExportAll must be run to update the metadata exports of the affected datasets. Follow the directions in the [Admin Guide](http://guides.dataverse.org/en/latest/admin/metadataexport.html#batch-exports-through-the-api).
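
To get a feel for which keyword values the migration queries above would select, the `ILIKE 'http%'` predicate can be mirrored outside the database. This is a rough sketch under stated assumptions: the function name `looks_like_uri` and the sample values are invented for illustration, and the check only approximates the SQL behaviour as a case-insensitive prefix match.

```python
def looks_like_uri(value):
    """Approximate the SQL predicate `value ILIKE 'http%'` (case-insensitive prefix match)."""
    return value.lower().startswith("http")


# Invented sample values, for illustration only.
sample_values = [
    "http://example.org/term/1",
    "HTTPS://example.org/term/2",
    "Oceanography",
    "see http://example.org/term/3",
]

to_migrate = [v for v in sample_values if looks_like_uri(v)]
print(to_migrate)
```

Note that the predicate matches any value that merely starts with "http", whether or not it is a well-formed URI, and skips values where a URI appears later in the text. Reviewing the output of the `SELECT` query before running the `UPDATE` is therefore worthwhile.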
12 changes: 12 additions & 0 deletions scripts/api/data/dataset-create-new-all-default-fields.json
@@ -231,6 +231,12 @@
"typeClass": "primitive",
"value": "KeywordTerm1"
},
"keywordTermURI": {
"typeName": "keywordTermURI",
"multiple": false,
"typeClass": "primitive",
"value": "http://keywordTermURI1.org"
},
"keywordVocabulary": {
"typeName": "keywordVocabulary",
"multiple": false,
@@ -251,6 +257,12 @@
"typeClass": "primitive",
"value": "KeywordTerm2"
},
"keywordTermURI": {
"typeName": "keywordTermURI",
"multiple": false,
"typeClass": "primitive",
"value": "http://keywordTermURI2.org"
},
"keywordVocabulary": {
"typeName": "keywordVocabulary",
"multiple": false,