JSON Schema creator and validator #10109

Merged: 45 commits, Dec 5, 2023

Commits
61abac1
#9464 create json
sekmiller Nov 1, 2023
6ba4ef5
Merge branch 'develop' into 9464-schema-creator-validator
sekmiller Nov 1, 2023
38f09f6
#9464 fix json schema formatting
sekmiller Nov 2, 2023
5ca4cc0
#9464 remove license from required
sekmiller Nov 2, 2023
02a570a
#9464 Add commands, endpoints, IT, etc
sekmiller Nov 8, 2023
42e055f
Merge branch 'develop' into 9464-schema-creator-validator
sekmiller Nov 8, 2023
521e8d2
#9464 delete test dataverse
sekmiller Nov 8, 2023
7c630f7
#9464 add release note
sekmiller Nov 8, 2023
720b3b0
add doc for get schema
sekmiller Nov 8, 2023
7be5347
#9464 fix typo
sekmiller Nov 8, 2023
c553d1b
Add permission note
sekmiller Nov 8, 2023
a080f84
#9464 add doc for validate json
sekmiller Nov 9, 2023
7d38366
#9464 add strings to bundle
sekmiller Nov 9, 2023
7887a05
#9464 simplify commands
sekmiller Nov 9, 2023
437e7cc
#9464 remove unused import
sekmiller Nov 13, 2023
d7fccf7
Merge branch 'develop' into 9464-schema-creator-validator
sekmiller Nov 17, 2023
73593ac
#9464 query by dvo. update IT
sekmiller Nov 17, 2023
33aefff
Merge branch 'develop' into 9464-schema-creator-validator
sekmiller Nov 17, 2023
e4ede35
#9464 fix logger reference
sekmiller Nov 20, 2023
766c9c3
#9464 add base schema as a file
sekmiller Nov 20, 2023
c82faf9
#9464 fix formatting
sekmiller Nov 21, 2023
44a07a3
#9464 more code cleanup
sekmiller Nov 21, 2023
7d687e9
#9464 third time's the charm?
sekmiller Nov 21, 2023
3bc5ef7
Merge branch 'develop' into 9464-schema-creator-validator
sekmiller Nov 21, 2023
e501845
Merge branch 'develop' into 9464-schema-creator-validator
sekmiller Nov 27, 2023
212baf2
#9464 return json object as api response
sekmiller Nov 27, 2023
9367026
#9464 revert harvesting changes made in error
sekmiller Nov 27, 2023
b7a3e78
add dataset JSON Schema to API guide, add test #9464
pdurbin Nov 27, 2023
2d3f7ab
just return the JSON Schema, don't wrap in "data, message" #9464
pdurbin Nov 27, 2023
0a77e2a
tweak docs #9464
pdurbin Nov 27, 2023
7db3629
removing trailing newline #9464
pdurbin Nov 27, 2023
194945b
remove cruft (unused) #9464
pdurbin Nov 28, 2023
c1bd009
format code (no-op) #9464
pdurbin Nov 28, 2023
c4d9b6e
add new endpoints to API changelog #9464
pdurbin Nov 28, 2023
45df764
tweak release note #9464
pdurbin Nov 28, 2023
d8e327d
add "v" to make anchor links meaningful #9464 #10060
pdurbin Nov 28, 2023
866b5ea
Adds -X POST on the docs for validateDatasetJson
jp-tosca Nov 28, 2023
e235257
Merge branch 'develop' into 9464-schema-creator-validator
sekmiller Nov 30, 2023
2c41687
Merge branch 'develop' into 9464-schema-creator-validator
sekmiller Nov 30, 2023
547d71c
#9464 add more detail to validation error message
sekmiller Dec 4, 2023
c9374f3
Merge branch 'develop' into 9464-schema-creator-validator
sekmiller Dec 4, 2023
7697157
#9464 handle single errors
sekmiller Dec 4, 2023
e3bff3c
Merge branch 'develop' into 9464-schema-creator-validator
sekmiller Dec 5, 2023
c54a85f
#9464 add caveats to release note.
sekmiller Dec 5, 2023
2379828
Update native-api.rst
sekmiller Dec 5, 2023
3 changes: 3 additions & 0 deletions doc/release-notes/9464-json-validation.md
@@ -0,0 +1,3 @@
Functionality has been added to help validate dataset JSON prior to dataset creation. There are two new API endpoints in this release. The first takes in a collection alias and returns a custom dataset schema based on the required fields of the collection. The second takes in a collection alias and a dataset JSON file and automatically validates the JSON file against the custom schema for the collection. In this release, functionality is limited to JSON format validation and checking for required elements. Future releases will address field types, controlled vocabulary, etc. (Issue #9464 and #9465)

For documentation, see the API changelog: http://preview.guides.gdcc.io/en/develop/api/changelog.html
122 changes: 122 additions & 0 deletions doc/sphinx-guides/source/_static/api/dataset-schema.json
@@ -0,0 +1,122 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"$defs": {
"field": {
"type": "object",
"required": ["typeClass", "multiple", "typeName"],
"properties": {
"value": {
"anyOf": [
{
"type": "array"
},
{
"type": "string"
},
{
"$ref": "#/$defs/field"
}
]
},
"typeClass": {
"type": "string"
},
"multiple": {
"type": "boolean"
},
"typeName": {
"type": "string"
}
}
}
},
"type": "object",
"properties": {
"datasetVersion": {
"type": "object",
"properties": {
"license": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"uri": {
"type": "string",
"format": "uri"
}
},
"required": ["name", "uri"]
},
"metadataBlocks": {
"type": "object",
"properties": {
"citation": {
"type": "object",
"properties": {
"fields": {
"type": "array",
"items": {
"$ref": "#/$defs/field"
},
"minItems": 5,
"allOf": [
{
"contains": {
"properties": {
"typeName": {
"const": "title"
}
}
}
},
{
"contains": {
"properties": {
"typeName": {
"const": "author"
}
}
}
},
{
"contains": {
"properties": {
"typeName": {
"const": "datasetContact"
}
}
}
},
{
"contains": {
"properties": {
"typeName": {
"const": "dsDescription"
}
}
}
},
{
"contains": {
"properties": {
"typeName": {
"const": "subject"
}
}
}
}
]
}
},
"required": ["fields"]
}
},
"required": ["citation"]
}
},
"required": ["metadataBlocks"]
}
},
"required": ["datasetVersion"]
}
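
For orientation, here is a minimal dataset JSON that satisfies this base schema. This is a sketch only: the values are placeholders, and a real create-dataset payload would carry full compound values for author, datasetContact, and dsDescription.

{
  "datasetVersion": {
    "metadataBlocks": {
      "citation": {
        "fields": [
          { "typeName": "title", "typeClass": "primitive", "multiple": false, "value": "Example Dataset" },
          { "typeName": "author", "typeClass": "compound", "multiple": true, "value": [] },
          { "typeName": "datasetContact", "typeClass": "compound", "multiple": true, "value": [] },
          { "typeName": "dsDescription", "typeClass": "compound", "multiple": true, "value": [] },
          { "typeName": "subject", "typeClass": "controlledVocabulary", "multiple": true, "value": ["Other"] }
        ]
      }
    }
  }
}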
13 changes: 9 additions & 4 deletions doc/sphinx-guides/source/api/changelog.rst
@@ -5,15 +5,20 @@ API Changelog
:local:
:depth: 1

6.1
---
v6.1
----

New
~~~
- **/api/dataverses/{id}/datasetSchema**: See :ref:`get-dataset-json-schema`.
- **/api/dataverses/{id}/validateDatasetJson**: See :ref:`validate-dataset-json`.

Changes
~~~~~~~
- **/api/datasets/{id}/versions/{versionId}/citation**: This endpoint now accepts a new optional boolean query parameter "includeDeaccessioned", which, if enabled, causes the endpoint to consider deaccessioned versions when searching for versions to obtain the citation. See :ref:`get-citation`.

6.0
---
v6.0
----

Changes
~~~~~~~
50 changes: 50 additions & 0 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -505,6 +505,56 @@ The fully expanded example above (without environment variables) looks like this

.. note:: Previous endpoints ``$SERVER/api/dataverses/$id/metadatablocks/:isRoot`` and ``POST https://$SERVER/api/dataverses/$id/metadatablocks/:isRoot?key=$apiKey`` are deprecated, but supported.

.. _get-dataset-json-schema:

Retrieve a Dataset JSON Schema for a Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Retrieves a JSON Schema customized for a given collection, which you can use to validate a dataset JSON file before creating the dataset. This
first version of the schema only includes required elements and fields. In the future, we plan to improve the schema by adding controlled
vocabulary and more robust dataset field format testing:

.. code-block:: bash

export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export ID=root

curl -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/dataverses/$ID/datasetSchema"

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" "https://demo.dataverse.org/api/dataverses/root/datasetSchema"

Note: you must have "Add Dataset" permission in the given collection to invoke this endpoint.

While it is recommended to download a copy of the JSON Schema from the collection (as above) to account for any fields that have been marked as required, you can also download a minimal :download:`dataset-schema.json <../_static/api/dataset-schema.json>` to get a sense of the schema when no customizations have been made.
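
If you want to check a file locally before calling the validation endpoint described below, the downloaded schema can be applied with any general-purpose JSON Schema validator. Below is a minimal sketch using the everit-org ``org.everit.json.schema`` library; this library choice is an assumption, and any equivalent validator works the same way:

.. code-block:: java

    import java.nio.file.Files;
    import java.nio.file.Paths;

    import org.everit.json.schema.ValidationException;
    import org.everit.json.schema.loader.SchemaLoader;
    import org.json.JSONObject;

    public class LocalSchemaCheck {
        public static void main(String[] args) throws Exception {
            // The schema downloaded from /api/dataverses/$ID/datasetSchema and
            // the dataset JSON you intend to submit.
            JSONObject schema = new JSONObject(Files.readString(Paths.get("dataset-schema.json")));
            JSONObject dataset = new JSONObject(Files.readString(Paths.get("dataset.json")));
            try {
                SchemaLoader.load(schema).validate(dataset); // throws on failure
                System.out.println("dataset.json conforms to the schema");
            } catch (ValidationException ex) {
                // Print the top-level message plus any nested per-field messages.
                ex.getAllMessages().forEach(System.out::println);
            }
        }
    }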

.. _validate-dataset-json:

Validate Dataset JSON File for a Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Validates a dataset JSON file against the custom schema for a given collection, prior to creating the dataset. The validation only tests for
JSON formatting and the presence of required elements:

.. code-block:: bash

export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export ID=root

curl -H "X-Dataverse-key:$API_TOKEN" -X POST "$SERVER_URL/api/dataverses/$ID/validateDatasetJson" -H 'Content-type:application/json' --upload-file dataset.json

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST "https://demo.dataverse.org/api/dataverses/root/validateDatasetJson" -H 'Content-type:application/json' --upload-file dataset.json

Note: you must have "Add Dataset" permission in the given collection to invoke this endpoint.

.. _create-dataset-command:

DataverseFieldTypeInputLevel.java
@@ -30,8 +30,9 @@
@NamedQuery(name = "DataverseFieldTypeInputLevel.findByDataverseIdDatasetFieldTypeId",
query = "select f from DataverseFieldTypeInputLevel f where f.dataverse.id = :dataverseId and f.datasetFieldType.id = :datasetFieldTypeId"),
@NamedQuery(name = "DataverseFieldTypeInputLevel.findByDataverseIdAndDatasetFieldTypeIdList",
query = "select f from DataverseFieldTypeInputLevel f where f.dataverse.id = :dataverseId and f.datasetFieldType.id in :datasetFieldIdList")

query = "select f from DataverseFieldTypeInputLevel f where f.dataverse.id = :dataverseId and f.datasetFieldType.id in :datasetFieldIdList"),
@NamedQuery(name = "DataverseFieldTypeInputLevel.findRequiredByDataverseId",
query = "select f from DataverseFieldTypeInputLevel f where f.dataverse.id = :dataverseId and f.required = 'true' ")
})
@Table(name="DataverseFieldTypeInputLevel"
, uniqueConstraints={
Expand Down
DataverseFieldTypeInputLevelServiceBean.java
@@ -88,6 +88,16 @@ public DataverseFieldTypeInputLevel findByDataverseIdDatasetFieldTypeId(Long dat
return null;
}
}

public List<DataverseFieldTypeInputLevel> findRequiredByDataverseId(Long dataverseId) {
    // getResultList() returns an empty list when nothing matches (it never
    // throws NoResultException), so no try/catch is needed here.
    Query query = em.createNamedQuery("DataverseFieldTypeInputLevel.findRequiredByDataverseId", DataverseFieldTypeInputLevel.class);
    query.setParameter("dataverseId", dataverseId);
    return query.getResultList();
}

public void delete(DataverseFieldTypeInputLevel dataverseFieldTypeInputLevel) {
em.remove(em.merge(dataverseFieldTypeInputLevel));
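
For context on how the new query feeds the endpoints above: the schema builder needs the set of fields a collection marks as required so it can fold them into the base schema. A hypothetical sketch of a caller follows; the helper below is invented for illustration, and the PR's actual command classes may differ.

    // Hypothetical helper, not the PR's actual implementation: collect the
    // names of the fields a collection marks as required via the new query.
    private Set<String> requiredFieldNames(Dataverse dataverse,
            DataverseFieldTypeInputLevelServiceBean inputLevels) {
        Set<String> names = new HashSet<>();
        for (DataverseFieldTypeInputLevel level : inputLevels.findRequiredByDataverseId(dataverse.getId())) {
            names.add(level.getDatasetFieldType().getName());
        }
        return names;
    }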