# 6 - Registering a Data file (as URL)

Demonstrates how to register a new Data file by pointing to a URL rather than uploading the file.

In [None]:
import requests
import json
import string

import getpass

In [None]:
base_url = 'https://sandbox2.fairdomhub.org'

Set up the headers and authenticate just as in the earlier steps.

In [None]:
headers = {"Content-type": "application/vnd.api+json",
           "Accept": "application/vnd.api+json",
           "Accept-Charset": "ISO-8859-1"}

session = requests.Session()
session.headers.update(headers)
session.auth = (input('Username: '), getpass.getpass('Password: '))

Define the ID of the project that will contain the Data file.

In [None]:
containing_project_id = 8

The general data_file JSON structure is built as a Python dictionary. In this case, the policy grants download access to anybody.

The license is set to [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/). The list of available licenses can be found in the SEEK [API Overview](https://docs.seek4science.org/tech/api/index.html).

In [None]:
data_file = {}
data_file['data'] = {}
data_file['data']['type'] = 'data_files'
data_file['data']['attributes'] = {}
data_file['data']['attributes']['title'] = 'my lovely datafile'
data_file['data']['attributes']['license'] = 'CC-BY-4.0'
data_file['data']['attributes']['policy'] = {'access':'download'}
data_file['data']['relationships'] = {}
data_file['data']['relationships']['projects'] = {}
data_file['data']['relationships']['projects']['data'] = [{'id' : containing_project_id, 'type' : 'projects'}]
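The same structure can also be written as one nested literal, which makes the shape of the JSON easier to see. The sketch below wraps it in a function; the name `build_data_file` and its parameters are hypothetical and not part of the SEEK API, and only the keys already used above are included.

```python
def build_data_file(title, project_id, license_id='CC-BY-4.0', access='download'):
    """Return a payload with the same keys as the dictionary built step by step above.

    Hypothetical helper for illustration; not part of SEEK or requests.
    """
    return {
        'data': {
            'type': 'data_files',
            'attributes': {
                'title': title,
                'license': license_id,
                'policy': {'access': access},
            },
            'relationships': {
                'projects': {
                    'data': [{'id': project_id, 'type': 'projects'}],
                },
            },
        }
    }


# Same values as used in this notebook
example_payload = build_data_file('my lovely datafile', 8)
```

With these arguments `example_payload` should compare equal to `data_file` as it stands at this point, before the content blob is added.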

We describe a Content Blob just as it was described in an earlier step. In this case we use the url key and point to an available file (not a webpage): a copy of the FAIRDOM logo in our GitHub repository. A custom filename has also been provided; without it, the filename would be determined from the link.

In [None]:
remote_blob = {'url' : 'https://github.com/seek4science/seek/raw/master/app/assets/images/logos/fairdom-logo.png', 'original_filename':'logo.png'}
data_file['data']['attributes']['content_blobs'] = [remote_blob]
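To get a feel for what a filename taken from the link might look like, the sketch below extracts the last path segment of a URL on the client side. This is only an illustration: how SEEK actually derives the name on the server is not shown in this notebook, and both function names are hypothetical.

```python
import posixpath
from urllib.parse import unquote, urlparse


def filename_from_url(url):
    """Return the last path segment of a URL (illustration only, not SEEK's logic)."""
    path = urlparse(url).path
    return unquote(posixpath.basename(path))


def make_remote_blob(url, original_filename=None):
    """Build a content blob dictionary; the filename key is added only when given."""
    blob = {'url': url}
    if original_filename is not None:
        blob['original_filename'] = original_filename
    return blob


logo_url = 'https://github.com/seek4science/seek/raw/master/app/assets/images/logos/fairdom-logo.png'
guessed_name = filename_from_url(logo_url)
```

Here `guessed_name` is the trailing segment of the logo URL, whereas the blob above overrides it with `logo.png`.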

The data file and content blob are registered in one step with a POST to the data_files root.

In [None]:
r = session.post(base_url + '/data_files', json=data_file)
r.raise_for_status()
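If the POST fails, `raise_for_status()` reports only the status code. The sketch below turns an error body into readable lines, assuming the server answers with a JSON:API style document containing a top-level `errors` list whose entries may carry `title` and `detail`. Whether SEEK always responds in that shape is an assumption, and `describe_errors` is a hypothetical helper.

```python
def describe_errors(body):
    """Return one readable line per entry in a JSON:API style 'errors' list.

    Assumes `body` is the parsed JSON of an error response; a missing
    'errors' key yields an empty list.
    """
    lines = []
    for err in body.get('errors', []):
        title = err.get('title', 'Error')
        detail = err.get('detail', '')
        lines.append(f'{title}: {detail}' if detail else title)
    return lines
```

One way to use it is to check `r.ok` before calling `r.raise_for_status()` and, when it is false, print `r.status_code` together with `describe_errors(r.json())`.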

The resulting JSON should provide the details about the created Data file. Note that in the content blob details, the size and content_type have automatically been determined by inspecting the URL (a HEAD request is used).

In [None]:
populated_data_file = r.json()
populated_data_file
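To see the kind of information a HEAD request exposes, the sketch below reads the `Content-Type` and `Content-Length` response headers of a URL using only the standard library. It is not SEEK's implementation; `summarize_headers` and `head_info` are hypothetical names, and some servers omit one or both headers.

```python
import urllib.request


def summarize_headers(headers):
    """Extract the content type and size from a mapping of response headers."""
    content_type = headers.get('Content-Type')
    length = headers.get('Content-Length')
    return {
        # Drop parameters such as "; charset=utf-8" from the content type
        'content_type': content_type.split(';')[0].strip() if content_type else None,
        # Only convert the length when it is a plain non-negative integer
        'size': int(length) if length is not None and str(length).isdigit() else None,
    }


def head_info(url, timeout=10):
    """Send a HEAD request to `url` and summarise the response headers."""
    request = urllib.request.Request(url, method='HEAD')
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return summarize_headers(response.headers)
```

Calling `head_info` with the logo URL used above needs network access; the result can then be compared with the content blob details in `populated_data_file`.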

Check content by downloading the image with the link provided.

In [None]:
from IPython.display import Image

# Build the download URL from the link of the first content blob
image_url = populated_data_file['data']['attributes']['content_blobs'][0]['link'] + '/download'
Image(url=image_url)

Delete the data file.  
Response should be `200` - Success!

In [None]:
session.delete(base_url + populated_data_file['data']['links']['self'])

# Exercise 6



* Find a URL to a resource online, for example a picture on a website or a raw file on GitHub. Update the data file to use that URL.