# POST an EAMENA dataset on Zenodo

Post an EAMENA dataset on Zenodo from an EAMENA GeoJSON URL using the Zenodo API. See also the documentation in the GitHub [README](https://github.com/eamena-project/eamena-arches-dev/blob/main/data/bibref/README.md) file.

---

Zenodo documentation: https://developers.zenodo.org/#quickstart-upload

## Functions and Libraries

In [2]:
!rm eamena-functions -R
!git clone https://github.com/eamena-project/eamena-functions.git
%cd /content/eamena-functions/zenodo
import zenodo as zn

import os
import requests
import json
import zipfile
import pandas as pd
import numpy as np
!pip install sickle
import sickle

# needed to export as JSON
class NpEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

rm: cannot remove 'eamena-functions': No such file or directory
Cloning into 'eamena-functions'...
remote: Enumerating objects: 226, done.
remote: Counting objects: 100% (16/16), done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 226 (delta 6), reused 13 (delta 4), pack-reused 210
Receiving objects: 100% (226/226), 48.15 KiB | 1.30 MiB/s, done.
Resolving deltas: 100% (129/129), done.
/content/eamena-functions/zenodo
Collecting sickle
  Downloading Sickle-0.7.0-py3-none-any.whl (12 kB)
Installing collected packages: sickle
Successfully installed sickle-0.7.0


### Query EAMENA DB

Query the database API with the `GEOJSON_URL` search URL. The response is parsed into GeoJSON data (`data`).

In [3]:
# GEOJSON_URL = r"https://database.eamena.org/api/search/export_results?paging-filter=1&tiles=true&format=geojson&reportlink=false&precision=6&total=326&language=en&advanced-search=%5B%7B%22op%22%3A%22and%22%2C%2234cfea78-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22~%22%2C%22lang%22%3A%22en%22%2C%22val%22%3A%22Sistan%22%7D%2C%2234cfea87-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22%22%2C%22val%22%3A%22%22%7D%7D%2C%7B%22op%22%3A%22or%22%2C%2234cfea69-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22%22%2C%22val%22%3A%22%22%7D%2C%2234cfea73-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22%22%2C%22val%22%3A%22%22%7D%2C%2234cfea43-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22%22%2C%22val%22%3A%224ed99706-2d90-449a-9a70-700fc5326fb1%22%7D%2C%2234cfea5d-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22%22%2C%22val%22%3A%22%22%7D%2C%2234cfea95-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22~%22%2C%22lang%22%3A%22en%22%2C%22val%22%3A%22%22%7D%7D%5D&resource-type-filter=%5B%7B%22graphid%22%3A%2234cfe98e-c2c0-11ea-9026-02e7594ce0a0%22%2C%22name%22%3A%22Heritage%20Place%22%2C%22inverted%22%3Afalse%7D%5D&map-filter=%7B%22type%22%3A%22FeatureCollection%22%2C%22features%22%3A%5B%7B%22id%22%3A%22e84886109295dcb2d515f9ab792832bf%22%2C%22type%22%3A%22Feature%22%2C%22properties%22%3A%7B%22buffer%22%3A%7B%22width%22%3A10%2C%22unit%22%3A%22m%22%7D%2C%22inverted%22%3Afalse%7D%2C%22geometry%22%3A%7B%22coordinates%22%3A%5B%5B%5B61.5629662657594%2C31.341070427554456%5D%2C%5B61.39269902363566%2C31.226740239181964%5D%2C%5B61.52316353383432%2C30.977760218239922%5D%2C%5B61.773036239808164%2C30.92940344148805%5D%2C%5B61.89244443558445%2C31.037461248216815%5D%2C%5B61.933352798951745%2C31.22484931983834%5D%2C%5B61.5629662657594%2C31.341070427554456%5D%5D%5D%2C%22type%22%3A%22Polygon%22%7D%7D%5D%7D"
GEOJSON_URL = "https://database.eamena.org/api/search/export_results?paging-filter=1&tiles=true&format=geojson&reportlink=false&precision=6&total=307&advanced-search=%5B%7B%22op%22%3A%22and%22%2C%2234cfea78-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22~%22%2C%22lang%22%3A%22en%22%2C%22val%22%3A%22Sistan%22%7D%2C%2234cfea87-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22%22%2C%22val%22%3A%22e6e6abc5-3470-45c0-880e-8b29959672d2%22%7D%7D%2C%7B%22op%22%3A%22or%22%2C%2234cfea69-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22%22%2C%22val%22%3A%22%22%7D%2C%2234cfea73-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22%22%2C%22val%22%3A%22%22%7D%2C%2234cfea43-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22%22%2C%22val%22%3A%224ed99706-2d90-449a-9a70-700fc5326fb1%22%7D%2C%2234cfea5d-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22%22%2C%22val%22%3A%22%22%7D%2C%2234cfea95-c2c0-11ea-9026-02e7594ce0a0%22%3A%7B%22op%22%3A%22~%22%2C%22lang%22%3A%22en%22%2C%22val%22%3A%22%22%7D%7D%5D&resource-type-filter=%5B%7B%22graphid%22%3A%2234cfe98e-c2c0-11ea-9026-02e7594ce0a0%22%2C%22name%22%3A%22Heritage%20Place%22%2C%22inverted%22%3Afalse%7D%5D&map-filter=%7B%22type%22%3A%22FeatureCollection%22%2C%22features%22%3A%5B%7B%22id%22%3A%22ae42a8fbd96c8f995719a2688f2fad87%22%2C%22type%22%3A%22Feature%22%2C%22properties%22%3A%7B%22buffer%22%3A%7B%22width%22%3A0%2C%22unit%22%3A%22m%22%7D%2C%22inverted%22%3Afalse%7D%2C%22geometry%22%3A%7B%22coordinates%22%3A%5B%5B%5B61.6012854423829%2C31.200317996554716%5D%2C%5B61.43021147281084%2C31.09323453208181%5D%2C%5B61.59265954771092%2C31.014768151044933%5D%2C%5B61.759781654852475%2C30.9118316755916%5D%2C%5B62.03615110465293%2C31.065294359669124%5D%2C%5B61.76357322781999%2C31.32515741066436%5D%2C%5B61.73211151793541%2C31.294427967038885%5D%2C%5B61.68254540589061%2C31.26085644276789%5D%2C%5B61.6012854423829%2C31.200317996554716%5D%5D%5D%2C%22type%22%3A%22Polygon%22%7D%7D%5D%7D"
resp = requests.get(GEOJSON_URL)
data = resp.json()
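Before building metadata from the response, it can help to confirm that the query returned what was expected. The sketch below is a minimal check, assuming the export is a GeoJSON `FeatureCollection`; the helper name `summarise_geojson` and the hand-made `sample` are hypothetical and only stand in for the real `data`.

```python
from collections import Counter


def summarise_geojson(data):
    """Return the number of features and a tally of their geometry types."""
    if data.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    features = data.get("features", [])
    # A feature may carry a null geometry, hence the `or {}` fallback
    geometry_types = Counter(
        (feature.get("geometry") or {}).get("type", "None") for feature in features
    )
    return len(features), dict(geometry_types)


# Hand-made sample standing in for the EAMENA response (not real data)
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {},
         "geometry": {"type": "Point", "coordinates": [61.6, 31.2]}},
        {"type": "Feature", "properties": {},
         "geometry": {"type": "Polygon", "coordinates": [[[61.6, 31.2], [61.4, 31.1], [61.6, 31.0], [61.6, 31.2]]]}},
    ],
}

n_features, geometry_types = summarise_geojson(sample)
print(n_features, geometry_types)
```

In the notebook, `summarise_geojson(data)` could be called right after `resp.json()`; calling `resp.raise_for_status()` first would surface HTTP errors before the JSON is parsed.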

### Generate metadata from the GeoJSON data

Except for *free text* fields, metadata is collected from the GeoJSON file (see [eamena-functions](https://github.com/eamena-project/eamena-functions/blob/main/zenodo/zenodo.py)).

---

See the [Zenodo metadata](https://developers.zenodo.org/#depositions) list (under the 'Deposit metadata' chapter).

In [22]:
TITLE = "Once again .. This is the title of my Zenodo deposit" # free text
DESCRIPTION = "Once again .. This is the description of my Zenodo deposit" # free text

metadata = {
     'metadata': {
         'title': TITLE,
         'description': DESCRIPTION,
         'upload_type': 'dataset',
         'license': 'cc-by',
         'subjects': [{"term": "Cultural property", "identifier": "https://id.loc.gov/authorities/subjects/sh97000183.html", "scheme": "url"}],
         'method': 'EAMENA data entry methodology',
         'creators': [{'name': "EAMENA database",
                       'affiliation': "University of Oxford, University of Southampton"}],
         'contributors': zn.zenodo_contributors(data),
         'keywords': zn.zenodo_keywords(data),
         'dates': zn.zenodo_dates(data),
         'related_identifiers': zn.zenodo_related_identifiers(),
     }
 }

print(json.dumps(metadata, indent=4))

{
    "metadata": {
        "title": "Once again .. This is the title of my Zenodo deposit",
        "description": "Once again .. This is the description of my Zenodo deposit",
        "upload_type": "dataset",
        "license": "cc-by",
        "subjects": [
            {
                "term": "Cultural property",
                "identifier": "https://id.loc.gov/authorities/subjects/sh97000183.html",
                "scheme": "url"
            }
        ],
        "method": "EAMENA data entry methodology",
        "creators": [
            {
                "name": "EAMENA database",
                "affiliation": "University of Oxford, University of Southampton"
            }
        ],
        "contributors": [
            {
                "name": "Bijan Rouhani",
                "type": "DataCollector"
            },
            {
                "name": "Danlei Zhou",
                "type": "DataCollector"
            },
            {
                "name": "Yasaman Nabati

ℹ️ see the [list of Zenodo metadata](https://developers.zenodo.org/#depositions)  

ℹ️ There is an issue when uploading grants: `'grants': [{'id': '051z6e826::4178'}]`

ℹ️ There is an issue when uploading `'communities': "[{'identifier': 'eamena'}]"` on https://sandbox.zenodo.org/
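Since some metadata values are typed by hand and others are derived from the GeoJSON, a quick local check can catch an empty field before anything is sent to Zenodo. This is a minimal sketch: the helper `missing_metadata_fields` and the `REQUIRED_FIELDS` tuple are hypothetical, and the list of fields is an assumption based on the keys used in this notebook, not an exhaustive statement of what Zenodo requires.

```python
# Fields assumed to be mandatory for this workflow (an assumption, see lead-in)
REQUIRED_FIELDS = ("title", "description", "upload_type", "creators")


def missing_metadata_fields(metadata):
    """Return the names of required fields that are absent or empty."""
    inner = metadata.get("metadata", {})
    return [field for field in REQUIRED_FIELDS if not inner.get(field)]


# Example payload with an empty creators list (illustrative values only)
example = {
    "metadata": {
        "title": "A title",
        "description": "A description",
        "upload_type": "dataset",
        "creators": [],
    }
}

print(missing_metadata_fields(example))
```

An empty list means every checked field is filled in; anything else names the fields to fix before the deposit is updated.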

### Data output

Write JSON and ZIP locally

In [5]:
# JSON file name and ZIP file
FILENAME = "d_filename"
json_file_name = FILENAME + ".geojson"
zip_file_name = FILENAME + ".zip"

# Round-trip through NpEncoder so that any NumPy values become plain Python types
json_string = json.dumps(data, cls = NpEncoder)
json_string = json.loads(json_string)
# Create the JSON file and write the data to it
with open(json_file_name, 'w') as json_file:
    json.dump(json_string, json_file, indent=4)
    print(json_file_name + " has been exported in " + os.getcwd())

# Create a ZIP file and add the JSON file to it
with zipfile.ZipFile(zip_file_name, "w", zipfile.ZIP_DEFLATED) as zipf:
    zipf.write(json_file_name)
    print(zip_file_name + " has been exported in " + os.getcwd())

d_filename.geojson has been exported in /content/eamena-functions/zenodo
d_filename.zip has been exported in /content/eamena-functions/zenodo


ℹ️ Further data can be created and files added into the ZIP
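As an illustration of adding further files to the archive, the sketch below writes several files into one ZIP. The helper name `zip_files` and the extra `README.txt` are hypothetical; a temporary directory is used so the example does not touch the notebook's working directory.

```python
import os
import tempfile
import zipfile


def zip_files(zip_path, file_paths):
    """Write every file in file_paths to zip_path and return the archive listing."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for path in file_paths:
            # arcname keeps only the file name, not the directory structure
            zipf.write(path, arcname=os.path.basename(path))
    with zipfile.ZipFile(zip_path) as zipf:
        return zipf.namelist()


with tempfile.TemporaryDirectory() as tmp:
    geojson_path = os.path.join(tmp, "d_filename.geojson")
    readme_path = os.path.join(tmp, "README.txt")  # hypothetical extra file
    with open(geojson_path, "w") as fp:
        fp.write('{"type": "FeatureCollection", "features": []}')
    with open(readme_path, "w") as fp:
        fp.write("Short description of the dataset.")
    names = zip_files(os.path.join(tmp, "d_filename.zip"), [geojson_path, readme_path])

print(names)
```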

## Zenodo

### Create an empty bucket

Paste your `ACCESS_TOKEN` (tokens are managed in the Zenodo [application settings](https://zenodo.org/account/settings/applications/tokens))

In [10]:
ACCESS_TOKEN = 'T3FFez8WYNJGV73l61v7ooQwGk9kRE2tJ8sL3XV9QbxeYHI8cyYuLxzZJ9KY'
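A token pasted into a cell is visible to anyone who can read the notebook. One possible alternative, sketched here, reads the token from an environment variable and falls back to a hidden prompt. The function `load_token` and the variable name `ZENODO_ACCESS_TOKEN` are assumptions made for this example, not part of the EAMENA functions.

```python
import os
from getpass import getpass


def load_token(env_var="ZENODO_ACCESS_TOKEN"):
    """Return the token from the environment, or prompt for it without echoing."""
    token = os.environ.get(env_var)
    if not token:
        # getpass hides the typed value, so it is not stored in the cell output
        token = getpass("Zenodo access token: ")
    return token


# Usage in the notebook (prompts only if the environment variable is unset):
# ACCESS_TOKEN = load_token()
```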

Choose one of the Zenodo deposit endpoints:
* Zenodo (`ZENODO_URL = 'https://zenodo.org/api/deposit/depositions'`)
* Zenodo sandbox for tests (`ZENODO_URL = 'https://sandbox.zenodo.org/api/deposit/depositions'`)

In [7]:
ZENODO_URL = 'https://sandbox.zenodo.org/api/deposit/depositions'

Create the bucket

In [11]:
params = {'access_token': ACCESS_TOKEN}
r = requests.post(ZENODO_URL,
                   params=params,
                   json={})
r.status_code
r.json()
# collect the deposition id
deposition_id = r.json()['id']
print("The deposition_id is: " + str(deposition_id))

The deposition_id is: 47


### Add data

In [12]:
bucket_url = r.json()["links"]["bucket"]
with open(zip_file_name, "rb") as fp:
    r = requests.put(
        "%s/%s" % (bucket_url, zip_file_name),
        data = fp,
        params = params,
    )
r.json()

{'created': '2023-11-17T14:30:28.693945+00:00',
 'updated': '2023-11-17T14:30:28.882536+00:00',
 'version_id': 'dfe25aef-ec15-4a59-b666-706c2e8d3bec',
 'key': 'd_filename.zip',
 'size': 49712,
 'mimetype': 'application/zip',
 'checksum': 'md5:53c46128d11a42297ed3f2d82c986a05',
 'is_head': True,
 'delete_marker': False,
 'links': {'self': 'https://sandbox.zenodo.org/api/files/cf07a324-8b5b-4382-a213-36bfe1a1bc6e/d_filename.zip',
  'version': 'https://sandbox.zenodo.org/api/files/cf07a324-8b5b-4382-a213-36bfe1a1bc6e/d_filename.zip?versionId=dfe25aef-ec15-4a59-b666-706c2e8d3bec',
  'uploads': 'https://sandbox.zenodo.org/api/files/cf07a324-8b5b-4382-a213-36bfe1a1bc6e/d_filename.zip?uploads'}}
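The upload response includes a `checksum` field of the form `md5:<hex digest>`, which makes it possible to verify that the file stored by Zenodo matches the local one. A minimal sketch of that comparison follows; the helper names `md5_of_file` and `checksum_matches` are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, checksum_field):
    """Compare a local file with a checksum string such as 'md5:53c4...'."""
    algorithm, _, expected = checksum_field.partition(":")
    if algorithm != "md5":
        raise ValueError("unsupported checksum algorithm: " + algorithm)
    return md5_of_file(path) == expected


# Usage in the notebook, right after the upload:
# checksum_matches(zip_file_name, r.json()["checksum"])
```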

### Add metadata

In [23]:
r = requests.put('%s/%s' % (ZENODO_URL, deposition_id),
                  params = {'access_token': ACCESS_TOKEN},
                  data = json.dumps(metadata))
r.status_code
# 200

200

### Publish

In [24]:
r = requests.post('%s/%s/actions/publish' % (ZENODO_URL, deposition_id),
                      params={'access_token': ACCESS_TOKEN} )
r.status_code
# 202

202
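Each step returns a different success code, so checking `r.status_code` by eye is easy to get wrong. The sketch below raises an error when a step does not return the expected code. The helper `check_status` and the `EXPECTED_STATUS` table are hypothetical; the values 200 and 202 match the outputs shown in this notebook, while 201 for the creation step is taken from the Zenodo documentation and should be treated as an assumption here.

```python
from http import HTTPStatus

# Expected success code per step (201 for creation is an assumption, see lead-in)
EXPECTED_STATUS = {"create": 201, "metadata": 200, "publish": 202}


def check_status(step, status_code):
    """Return True if status_code is the expected one for step, else raise."""
    expected = EXPECTED_STATUS[step]
    if status_code != expected:
        try:
            reason = HTTPStatus(status_code).phrase
        except ValueError:
            reason = "unknown status"
        raise RuntimeError(
            "{}: expected {}, got {} ({})".format(step, expected, status_code, reason)
        )
    return True


# Usage in the notebook, after the publish request:
# check_status("publish", r.status_code)
```

A gateway timeout (504) on publish would then stop the notebook with an explicit message instead of passing silently to the next cell.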

### Check

Have a look at the last deposit (`r.json()[0]`)

In [25]:
r = requests.get(ZENODO_URL,
                  params={'access_token': ACCESS_TOKEN})
r.status_code
# 200
r.json()[0]

{'created': '2023-11-17T14:35:56.950600+00:00',
 'modified': '2023-11-17T14:35:57.134933+00:00',
 'id': 47,
 'conceptrecid': '46',
 'doi': '10.5072/zenodo.47',
 'conceptdoi': '10.5072/zenodo.46',
 'doi_url': 'https://doi.org/10.5072/zenodo.47',
 'metadata': {'title': 'Once again .. This is the title of my Zenodo deposit',
  'doi': '10.5072/zenodo.47',
  'publication_date': '2023-11-17',
  'description': 'Once again .. This is the description of my Zenodo deposit',
  'access_right': 'open',
  'creators': [{'name': 'EAMENA database',
    'affiliation': 'University of Oxford, University of Southampton'}],
  'contributors': [{'name': 'Bijan Rouhani',
    'affiliation': None,
    'type': 'DataCollector'},
   {'name': 'Danlei Zhou', 'affiliation': None, 'type': 'DataCollector'},
   {'name': 'Yasaman Nabati Mazloumi',
    'affiliation': None,
    'type': 'DataCollector'}],
  'keywords': ['EAMENA',
   'MaREA',
   'Iran (Islamic Republic of)',
   'Afghanistan',
   'Islamic (Iran)',
   'Contempo

The same information can be retrieved from `r.json()[0]['links']['latest_draft']`. The cell below displays a link to the record page:

In [30]:
from IPython.display import display, HTML, Markdown
html_link = r.json()[0]['links']['html']
url_markdown = "[See the latest record]({})".format(html_link)
display(Markdown(url_markdown))

[See the latest record](https://sandbox.zenodo.org/records/47)

ℹ️ The Zenodo link to the record is stored in `r.json()[0]['links']['html']`

In [27]:
# show badge
from IPython.display import Markdown
badge = r.json()[0]['links']['badge']
Markdown("![]({})".format(badge))

![](https://sandbox.zenodo.org/badge/doi/10.5072%2Fzenodo.47.svg)

ℹ️ Its DOI URL is stored in `r.json()[0]['doi_url']`

In [31]:
r.json()[0]['doi_url']

'https://doi.org/10.5072/zenodo.47'
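The checks above rely on the newest deposit being first in the list, i.e. `r.json()[0]`. When an account holds several deposits, selecting the entry by its `id` is less fragile. A minimal sketch, with a hypothetical helper `find_deposition` and a hand-made list standing in for `r.json()`:

```python
def find_deposition(depositions, deposition_id):
    """Return the deposition whose 'id' equals deposition_id, or None."""
    for deposition in depositions:
        if deposition.get("id") == deposition_id:
            return deposition
    return None


# Hand-made list standing in for r.json() (illustrative values only)
depositions = [
    {"id": 48, "doi_url": "https://doi.org/10.5072/zenodo.48"},
    {"id": 47, "doi_url": "https://doi.org/10.5072/zenodo.47"},
]

record = find_deposition(depositions, 47)
print(record["doi_url"])
```

In the notebook this would be called as `find_deposition(r.json(), deposition_id)`, reusing the `deposition_id` collected when the bucket was created.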