# About

This notebook is for testing the Dataverse API. See [_about_dataverseTest.md](./_about_dataverseTest.md) for information about configuring and running this notebook.

In [None]:
import _installer_dataverseTest # run the _installer_dataverseTest.py script
%load_ext autoreload
%autoreload all
# we need the 'autoreload' above if we are actively making changes to the worker.py module and want to reload any changes to the module without restarting the notebook kernel
# NOTE: if we make changes to the worker script we need to rerun this code block for the notebook to use the new edits

from _worker_dataverseTest import Worker
objWorker = Worker("dataverseTest") # initialize our Worker object; we should only need to call this once for the notebook session (working with 'demo' configuration)

## Create a Dataverse Collection

### Configuration

Using the Dataverse starter object `DATAVERSE_COLLECTION_START` in our configuration file, we will create a new collection through the API (see https://guides.dataverse.org/en/5.13/api/native-api.html#create-a-dataverse-collection). The API documentation instructs users to create a separate JSON file for use with the endpoint, but we do not need to do that: since we added the JSON to our main configuration file, we can simply pass that object in the `json` parameter of our request. We will place this collection under the root 'parent' collection.
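
The request described above might look roughly like the sketch below. The worker's actual `DvCreateCollection` implementation is not shown in this notebook, so the function names, the sample domain, the token, and the sample collection values are all assumptions for illustration.

```python
def build_create_collection_request(domain, parent_alias, collection, api_token):
    """Return keyword arguments for requests.request() that create a collection."""
    return {
        "method": "POST",  # must be POST; a GET would not create anything
        "url": "%s/api/dataverses/%s" % (domain.rstrip("/"), parent_alias),
        "headers": {"X-Dataverse-Key": api_token},
        "json": collection,  # the config object is sent directly as the JSON body
    }


def send(request_kwargs):
    """Send a prepared request (needs the third-party 'requests' package)."""
    import requests  # imported here so the builder above works without it

    return requests.request(**request_kwargs)


# Hypothetical stand-in for the contents of _config_dataverseTest.json
config = {
    "DATAVERSE_COLLECTION_START": {
        "name": "Dataverse Test",
        "alias": "dataversetest",
        "dataverseContacts": [{"contactEmail": "someone@example.org"}],
    }
}

kwargs = build_create_collection_request(
    "https://demo.dataverse.example",  # assumed domain
    "root",                            # parent collection alias
    config["DATAVERSE_COLLECTION_START"],
    "xxxx-xxxx",                       # placeholder API token
)
print(kwargs["method"], kwargs["url"])
```

Keeping the request building separate from the sending makes it easy to inspect the URL and payload before anything is posted to the server.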

### Retrieving our collection info

Since we already have our starter collection information defined in our main `_config_dataverseTest.json` file, there is no need to save the collection information sent back from the creation of our collection. We can always use the `DvViewCollection` method in our worker script to retrieve the collection information as long as we at least know our collection alias. 
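
Because only the alias is needed, the endpoints for viewing, listing, and deleting a collection can all be derived from it. The sketch below assumes a hypothetical helper `collection_endpoints` and a made-up domain and alias; it is not the worker's actual code.

```python
def collection_endpoints(domain, alias):
    """Map each collection action to its (HTTP method, URL) pair."""
    base = "%s/api/dataverses/%s" % (domain.rstrip("/"), alias)
    return {
        "view": ("GET", base),                    # collection information
        "contents": ("GET", base + "/contents"),  # datasets and sub-collections
        "delete": ("DELETE", base),               # remove the collection
    }


# Assumed domain and alias, for illustration only
endpoints = collection_endpoints("https://demo.dataverse.example", "dataversetest")
for action, (method, url) in endpoints.items():
    print(action, method, url)
```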

### Issue

Note: If you use a GET request instead of a POST request to the API endpoint, the action may appear to succeed, but the response simply returns the information for the main parent collection and does NOT create a new collection for you.

In [None]:
objWorker.DvCreateCollection()  # initialize a new collection

In [None]:
objWorker.DvViewCollection()  # view information on our dataverse collection

In [None]:
# objWorker.DvDeleteCollection()  # delete our dataverse collection

In [None]:
objWorker.DvViewCollectionContents()  # view dataverse collection contents

## Create a dataset

The JSON file at https://guides.dataverse.org/en/5.13/_downloads/4e04c8120d51efab20e480c6427f139c/dataset-create-new-all-default-fields.json, referenced in https://guides.dataverse.org/en/5.13/api/native-api.html#create-a-dataset-in-a-dataverse-collection, will be our dataset template. We simply add this JSON object to our `_config_dataverseTest.json` file under the `DATAVERSE_DATASET` constant.
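
A minimal sketch of how the template might sit in the configuration and be read back is shown below. The inline JSON is a heavily trimmed placeholder rather than the full template, and the sample values are assumptions; in the notebook the text would come from `_config_dataverseTest.json` instead of a string.

```python
import json

# Trimmed, hypothetical stand-in for the contents of _config_dataverseTest.json
CONFIG_TEXT = """
{
  "DATAVERSE_COLLECTION_START": {"alias": "dataversetest"},
  "DATAVERSE_DATASET": {
    "datasetVersion": {
      "metadataBlocks": {
        "citation": {
          "displayName": "Citation Metadata",
          "fields": [
            {"typeName": "title", "multiple": false,
             "typeClass": "primitive", "value": "Sample dataset"}
          ]
        }
      }
    }
  }
}
"""

config = json.loads(CONFIG_TEXT)
dataset_template = config["DATAVERSE_DATASET"]

# List the metadata fields present in the citation block of the template
citation_fields = dataset_template["datasetVersion"]["metadataBlocks"]["citation"]["fields"]
field_names = [field["typeName"] for field in citation_fields]
print(field_names)
```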


In [None]:
objWorker.DvCreateDataset()  # create a dataset

## Packaging the Dataverse API handlers

At this point we want to set up a package/module that we can load from GitHub and use in our notebook, since all of our dataset notebooks will be using the same code. Having the core code in an imported module will allow for ease of use by other users, without the need to add the core functions into their notebook code. It simply makes things cleaner.

### Package files

This is an example of packaging on GitHub (which is what I want).
https://github.com/ceddlyburge/python_world/tree/master

https://packaging.python.org/en/latest/tutorials/packaging-projects/

### Using imported packages

https://github.com/wax911/plugin-architecture/blob/master/plugins/advanced-plugin/main.py#L14



## Adding files to the dataset

See https://guides.dataverse.org/en/5.13/api/native-api.html#add-file-api

Before we upload any files to our dataset, we need to make sure we have saved the dataset persistentId to our notebook configuration. We will assume each curation notebook applies to one dataset, so we only need to track one persistentId per notebook.
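
As a sketch of how the saved persistentId might feed into the add-file request, consider the following. The helper names, the sample domain, the DOI, and the token are all hypothetical; only the endpoint shape and the `jsonData` form field follow the API guide linked above.

```python
import json


def build_add_file_request(domain, persistent_id, api_token, description=""):
    """Return keyword arguments for requests.request(), minus the file itself."""
    return {
        "method": "POST",
        "url": "%s/api/datasets/:persistentId/add" % domain.rstrip("/"),
        "params": {"persistentId": persistent_id},  # sent as a query parameter
        "headers": {"X-Dataverse-Key": api_token},
        # file metadata travels as a JSON string in the 'jsonData' form field
        "data": {"jsonData": json.dumps({"description": description})},
    }


def upload_file(request_kwargs, path):
    """Attach the file and send (needs the third-party 'requests' package)."""
    import requests  # imported here so the builder above works without it

    with open(path, "rb") as handle:
        return requests.request(files={"file": handle}, **request_kwargs)


# Placeholder values; the persistentId would come from the notebook configuration
kwargs = build_add_file_request(
    "https://demo.dataverse.example",
    "doi:10.5072/FK2/EXAMPLE",
    "xxxx-xxxx",
    description="First upload",
)
print(kwargs["url"], kwargs["params"])
```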



## Issue UNCDVSUP-38 (submitted on 8/17)

I’m trying to use the JSON from https://guides.dataverse.org/en/5.13/_downloads/4e04c8120d51efab20e480c6427f139c/dataset-create-new-all-default-fields.json to create a new dataset in http://demo-dataverse.rdmc.unc.edu. However, I am receiving an error that makes it seem that the JSON properties are incorrectly defined. Below is the response information (the same error message appears in https://github.com/IQSS/dataverse.harvard.edu/issues/172):

json= {'status': 'ERROR', 'message': 'Error parsing Json: incorrect multiple   for field productionPlace'}
headers= {'Date': 'Sat, 17 Aug 2024 16:09:07 GMT', 'Server': 'Apache/2.4.37 (Rocky Linux) OpenSSL/1.1.1k', 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Methods': 'PUT, GET, POST, DELETE, OPTIONS', 'Access-Control-Allow-Headers': 'Accept, Content-Type, X-Dataverse-Key, Range', 'Access-Control-Expose-Headers': 'Accept-Ranges, Content-Range, Content-Encoding', 'Content-Type': 'application/json;charset=UTF-8', 'Content-Length': '97', 'Connection': 'close'}
response status= 400

The JSON in question seems to be:

{
  "typeName": "productionPlace",
  "multiple": false,
  "typeClass": "primitive",
  "value": "ProductionPlace"
},

The release notes for 5.13 state: 

Edit the following line to your schema.xml (to indicate that productionPlace is now multiValued='true"):

So I can’t tell if the UNC Dataverse schema simply needs updating or something else is going on. If I set "multiple": true, in the JSON then the response is:

json= {'status': 'ERROR', 'message': 'Error parsing Json: Invalid values submitted for productionPlace. It should be an array of values.'}

…but I do not know how to format the JSON for multiple values.

My Python method for creating the dataset is using POST so that should not be the issue.

# method of the Worker class; the module needs `import requests` at the top
def DvCreateDataset(self):
    print("start DvCreateDataset")
    # POST to the datasets endpoint of our starter collection
    strApiEndpoint = '%s/api/dataverses/%s/datasets' % (self.strDATAVERSE_DOMAIN, self._config["DATAVERSE_COLLECTION_START"]["alias"])
    print('making request: %s' % strApiEndpoint)
    objHeaders = {
        "Content-Type": "application/json",
        "X-Dataverse-Key": self.strDATAVERSE_API_TOKEN
    }
    # the dataset template from the configuration is sent as the JSON body
    r = requests.request("POST", strApiEndpoint, json=self._config["DATAVERSE_DATASET"], headers=objHeaders)
    self.printResponseInfo(r)
    print("end DvCreateDataset")
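
The second error message ("It should be an array of values") suggests that a multi-valued primitive field expects `value` to be a JSON array of strings rather than a single string. A minimal sketch of that reshaping follows. The helper name `as_multiple` is hypothetical, and whether the demo server then accepts the dataset is untested here, since that depends on the installation's version and schema.

```python
import copy


def as_multiple(field):
    """Return a copy of a primitive metadata field formatted as multi-valued."""
    updated = copy.deepcopy(field)  # leave the original template untouched
    updated["multiple"] = True
    if not isinstance(updated["value"], list):
        updated["value"] = [updated["value"]]  # wrap the single value in a list
    return updated


# The field exactly as it appears in the 5.13 template
production_place = {
    "typeName": "productionPlace",
    "multiple": False,
    "typeClass": "primitive",
    "value": "ProductionPlace",
}

print(as_multiple(production_place))
```

The helper could be applied to the matching entry in the `fields` list of the dataset template before calling `DvCreateDataset`.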