# Copernicus Open Access Hub [1]

Here we give a general outline of how to download data from the Copernicus Open Access Hub using an API.

The data can also be accessed from: https://scihub.copernicus.eu/dhus/#/home

The hub provides data from the following missions:

- Sentinel-1   :   two identical radar imaging satellites in the same orbit, providing an all-weather, day-and-night supply of images of Earth’s surface [2]
- Sentinel-2   :   two identical satellites in the same orbit, imaging land and coastal areas at high spatial resolution in the optical domain [3]
- Sentinel-3   :   two satellites (Sentinel-3A and Sentinel-3B), measuring sea surface topography, sea and land surface temperature, and ocean and land surface colour [4]
- Sentinel-5P  :   the Sentinel-5 Precursor mission, which monitors the atmosphere; its intended uses are air quality, ozone and UV radiation, and climate monitoring and forecasting measurements [5]


A full introduction can be found here: https://scihub.copernicus.eu/userguide/WebHome

## First we will focus on downloading the data

To do this we will use OData, a data access protocol built on HTTP and REST. We can communicate with it through an Application Programming Interface (API).

In [207]:
import requests

#Enter your login details here; an account can be made here: https://www.copernicus.eu/en

user = ''
password = ''


#The base URL is:
service_root_uri = 'https://scihub.copernicus.eu/dhus/odata/v1'
#The first addition to the URL can be any of these resource paths
resource_paths = ['/Products','/DeletedProducts','/Collections']
#We will use the Products resource to construct our URI (uniform resource identifier)
uri = service_root_uri + resource_paths[0]

#Requesting this URI as it is returns a table of metadata about the individual products, which is not very useful for our purposes

## Making Queries

To narrow down the search we can use several query options. To start the query string we must place the character '?' at the end of our URI.

In the next few cells we will see how the different options work.

In [208]:
#Making queries
#To start the query string we must add a question mark
uri += '?'
# The query options are listed below
query_options = ['$format=',        # HTTP response type, e.g. xml or json
                 '$filter=',        # An expression or function that must return true for the entry to be returned
                 '$orderby=',       # Which values to order the output data by
                 '$select=',        # Request only certain properties of the entries
                 '$skip=',          # Number of records to skip
                 '$top=',           # Maximum number of records to return
                 '$count=',         # Request a count of matching resources
                 '$inlinecount=',   # Request the count together with the data
                 '$expand='         # Include related resources
                 ]
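The options above are joined with `&` after a single `?`. A minimal sketch of assembling such a URI from a dictionary, using only the standard library, could look like the following. The names `build_uri` and `SAFE_CHARS` are hypothetical helpers introduced here for illustration, and the choice of characters to leave unescaped is an assumption, not something prescribed by the hub.

```python
from urllib.parse import quote

SERVICE_ROOT = "https://scihub.copernicus.eu/dhus/odata/v1"
# Characters left unescaped so the OData syntax stays readable (an illustrative choice)
SAFE_CHARS = "$',()/:*"


def build_uri(resource_path, options=None):
    """Return SERVICE_ROOT + resource_path, followed by '?name=value&...' if options are given."""
    uri = SERVICE_ROOT + resource_path
    if not options:
        return uri
    # Percent-encode each value (spaces become %20), then join the pairs with '&'
    pairs = [f"{name}={quote(str(value), safe=SAFE_CHARS)}" for name, value in options.items()]
    return uri + "?" + "&".join(pairs)


example = build_uri("/Products", {"$format": "json", "$orderby": "IngestionDate desc", "$top": 5})
print(example)
```

Building the string in one place avoids forgetting an `&` between options, which is easy to do when the URI is extended cell by cell.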

## Format

This option specifies the output format of the data.

We will use the JSON format; once decoded in Python it is a dictionary containing nested lists and dictionaries.


In [209]:
formats = [
            'json',
            'atom',
            'xml',
            'application/metalink4%2Bxml',
            'text/csv'
          ]

uri += '$format=json'

When combining several conditions inside one filter, separate them with and; separate the different query options from each other with &.

## Filter

The filter option selects the subset of entries that satisfy the expression specified in the query.

For a full explanation please see the manual: https://scihub.copernicus.eu/twiki/do/view/SciHubUserGuide/ODataAPI#URI_Components

The details can be found under "OData System Query $filter".

**Comparison operators**:

- **lt** (\<): less than
- **le** (≤): less than or equal to
- **gt** (\>): greater than
- **ge** (≥): greater than or equal to
- **eq** (=): equal
- **ne** (≠): not equal



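**Date properties**:
- IngestionDate
- CreationDate
- ContentDate/Start and ContentDate/End

**Date built-in functions**, which can be applied to the properties above:
- **year**
- **month**
- **day**
- **hour**
- **minute**
- **second**

Date literals are written with the **datetime** prefix, for example datetime'2015-02-01T00:00:00.000'.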

**Example**:

https://scihub.copernicus.eu/dhus/odata/v1/Products?$filter=**year(IngestionDate) eq 2017 and month(IngestionDate) eq 12**
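A filter like the one in the example can be generated from plain numbers. The sketch below is only illustrative; `ingested_in` is a hypothetical helper name, and the default property is the IngestionDate used in the example.

```python
def ingested_in(year, month=None, date_property="IngestionDate"):
    """Build a $filter expression matching a calendar year, optionally narrowed to one month."""
    clauses = [f"year({date_property}) eq {int(year)}"]
    if month is not None:
        if not 1 <= int(month) <= 12:
            raise ValueError("month must be between 1 and 12")
        clauses.append(f"month({date_property}) eq {int(month)}")
    # Conditions inside one filter are combined with 'and'
    return " and ".join(clauses)


december_2017 = ingested_in(2017, 12)
print(december_2017)
```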


**Substring Built-in functions:**
- substringof: returns true if the first parameter string value occurs at any position in the second, for example to find records whose names contain a particular string
- endswith: returns true if the first parameter string value ends with the second parameter string value, otherwise it returns false
- startswith: returns true if the first parameter string value starts with the second parameter string value, otherwise it returns false

**Examples:**

This OData URI lists the products having SLC in the file name 

https://scihub.copernicus.eu/dhus/odata/v1/Products?**$filter=substringof('SLC',Name)**

This URI selects all the S1 products

https://scihub.copernicus.eu/dhus/odata/v1/Products?**$filter=startswith(Name,'S1')**
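Note that the argument order differs between the two examples: substringof takes the search string first, while startswith takes the property first. A small sketch that keeps this straight is shown below. The Python function names are hypothetical wrappers written for this tutorial, and the quote-doubling in `odata_string` follows the usual OData convention for string literals.

```python
def odata_string(text):
    """Quote a Python string as an OData string literal (embedded single quotes are doubled)."""
    return "'" + text.replace("'", "''") + "'"


def substringof(needle, prop):
    # The search string comes first, the property second
    return f"substringof({odata_string(needle)},{prop})"


def startswith(prop, prefix):
    # The property comes first, the prefix second
    return f"startswith({prop},{odata_string(prefix)})"


def endswith(prop, suffix):
    # Same argument order as startswith
    return f"endswith({prop},{odata_string(suffix)})"


print(substringof("SLC", "Name"))
print(startswith("Name", "S1"))
```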


**Products archiving status** (the Online property):
- **true** (the product is online)
- **false** (the product is archived)


For example, to see only Sentinel-1 (S1) products ingested after 1 February 2015:

In [210]:
uri += "&$filter=IngestionDate gt datetime'2015-02-01T00:00:00.000'"
uri += " and startswith(Name,'S1')"
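The filter in the cell above can also be built from Python values instead of being typed by hand. This is a sketch under the assumption that millisecond precision is enough; `odata_datetime` and `all_of` are hypothetical helper names.

```python
from datetime import datetime


def odata_datetime(moment):
    """Format a datetime as an OData literal such as datetime'2015-02-01T00:00:00.000'."""
    milliseconds = moment.microsecond // 1000
    return "datetime'" + moment.strftime("%Y-%m-%dT%H:%M:%S") + f".{milliseconds:03d}'"


def all_of(*clauses):
    """Combine filter clauses so that every one of them must hold."""
    return " and ".join(clauses)


filter_expression = all_of(
    f"IngestionDate gt {odata_datetime(datetime(2015, 2, 1))}",
    "startswith(Name,'S1')",
)
print("&$filter=" + filter_expression)
```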

## Orderby

Order objects in ascending (asc) or descending order (desc).

We will order by IngestionDate descending

In [211]:
uri += '&$orderby=IngestionDate desc'

## Select

Restrict the properties returned for each entry, using a comma-separated list of selection clauses.

Each selection clause may be a Property name, a Navigation Property name, or the "*" character.

We only want the values of Id, Name, ContentType and ContentDate to be returned.

In [212]:
uri+= '&$select=Id,Name,ContentType,ContentDate'

## Skip and top

top returns at most the first n entries, where n is the integer provided.

skip skips the first n entries, where n is the integer provided.

When both are given, skip is applied first, so the cell below asks for entries 11 to 110.

In [213]:
uri+= '&$skip=10'
uri += '&$top=100'
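Combining skip and top makes it possible to walk through a long result list page by page. The sketch below only computes the skip/top pairs; it assumes the total number of matches is already known (for example from the inline count), and `page_windows` is a hypothetical name.

```python
def page_windows(total, page_size=100):
    """Yield (skip, top) pairs that together cover `total` records in pages of `page_size`."""
    for skip in range(0, total, page_size):
        # The last page may be shorter than page_size
        yield skip, min(page_size, total - skip)


for skip, top in page_windows(250, 100):
    print(f"&$skip={skip}&$top={top}")
```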

## Count and inlinecount

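count returns only the number of matching resources, without the entries themselves (in OData v2 services it is typically written as a path segment, for example /Products/$count).

inlinecount=allpages includes the total number of matching entries in the response, together with the data.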

In [214]:
uri+= '&$inlinecount=allpages'

## Expand

To include related resources inline in our results, we pass the name of a Navigation Property to expand.

In [215]:
uri += '&$expand=Nodes'

In [216]:
print(uri)
#To retrieve the data we need to send our login details with the request; an account can be made on https://www.copernicus.eu/en
data = requests.get(uri, auth=(user, password))
#.text gives the response body decoded to a string, .content gives the raw bytes, and .json() parses the body into Python objects
print(data.text[:1000],'........')
#However, this raw output is not very readable

https://scihub.copernicus.eu/dhus/odata/v1/Products?$format=json&$filter=IngestionDate gt datetime'2015-02-01T00:00:00.000' and startswith(Name,'S1')&$orderby=IngestionDate desc&$select=Id,Name,ContentType,ContentDate&$skip=10&$top=100&$inlinecount=allpages&$expand=Nodes
{"d":{"__count":"7096567","results":[{"__metadata":{"id":"https://scihub.copernicus.eu/dhus/odata/v1/Products('7642dfbb-975f-4745-94f5-70f11aee6013')","uri":"https://scihub.copernicus.eu/dhus/odata/v1/Products('7642dfbb-975f-4745-94f5-70f11aee6013')","type":"DHuS.Product","content_type":"application/octet-stream","media_src":"https://scihub.copernicus.eu/dhus/odata/v1/Products('7642dfbb-975f-4745-94f5-70f11aee6013')/$value","edit_media":"https://scihub.copernicus.eu/dhus/odata/v1/Products('7642dfbb-975f-4745-94f5-70f11aee6013')/$value"},"Id":"7642dfbb-975f-4745-94f5-70f11aee6013","Name":"S1B_IW_GRDH_1SDV_20210929T030608_20210929T030635_028907_037328_8F38","ContentType":"application/octet-stream","ContentDate":{"__metad

In [218]:
# The product entries are nested under the wrapper keys 'd' and 'results'
data.json()['d']['results']
# What we are left with is a list of products

# If we now access the first entry, we can see which properties are available for this product
data.json()['d']['results'][0].keys()

dict_keys(['__metadata', 'Id', 'Name', 'ContentType', 'ContentDate'])
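To show how the nested structure can be unpacked without a live request, the sketch below parses a hand-trimmed sample shaped like the response printed above. Only the Id and Name fields are kept in the sample, and the variable names are illustrative.

```python
import json

# Hand-trimmed sample shaped like the response printed above (only two fields kept)
sample_text = """
{"d": {"__count": "7096567",
       "results": [
         {"Id": "7642dfbb-975f-4745-94f5-70f11aee6013",
          "Name": "S1B_IW_GRDH_1SDV_20210929T030608_20210929T030635_028907_037328_8F38"}
       ]}}
"""

payload = json.loads(sample_text)
# The inline count arrives as a string, so convert it before doing arithmetic with it
total_matches = int(payload["d"]["__count"])
# Map each product Id to its Name
names_by_id = {entry["Id"]: entry["Name"] for entry in payload["d"]["results"]}
print(total_matches, names_by_id)
```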

In [219]:
#To get one product
data.json()['d']['results'][2]

{'__metadata': {'id': "https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')",
  'uri': "https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')",
  'type': 'DHuS.Product',
  'content_type': 'application/octet-stream',
  'media_src': "https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/$value",
  'edit_media': "https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/$value"},
 'Id': '263ace8d-680a-475e-88dc-b68e76d1159c',
 'Name': 'S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6',
 'ContentType': 'application/octet-stream',
 'ContentDate': {'__metadata': {'type': 'DHuS.TimeRange'},
  'Start': '/Date(1632902683435)/',
  'End': '/Date(1632902715834)/'}}
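The Start and End values above are not directly readable. Assuming they follow the usual OData JSON convention of milliseconds since 1970-01-01 UTC, they can be converted as sketched below; `parse_odata_date` is a hypothetical helper name.

```python
import re
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_odata_date(value):
    """Convert '/Date(<milliseconds since 1970-01-01 UTC>)/' to a timezone-aware datetime."""
    match = re.fullmatch(r"/Date\((-?\d+)\)/", value)
    if match is None:
        raise ValueError(f"not an OData JSON date: {value!r}")
    # Integer milliseconds are added as a timedelta to avoid floating point rounding
    return EPOCH + timedelta(milliseconds=int(match.group(1)))


start = parse_odata_date("/Date(1632902683435)/")
print(start.isoformat())
```

For the product above this prints 2021-09-29T08:04:43.435000+00:00, which agrees with the 20210929T080443 timestamp in the product name.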

## Download

If we know a product's UUID (its Id), we can explore and download its contents. Node listings are returned in XML format by default. We can get an outline of the files inside the product's top-level node with:

\<ServiceRootUri\>/Products('\<Id\>')/Nodes('\<ProductName\>.SAFE')/Nodes

We can download the manifest file using:

\<ServiceRootUri\>/Products('\<Id\>')/Nodes('\<ProductName\>.SAFE')/Nodes('manifest.safe')/$value

And the quick look file:

\<ServiceRootUri\>/Products('\<Id\>')/Nodes('\<ProductName\>.SAFE')/Nodes('preview')/Nodes('quick-look.png')/$value
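These paths all follow the same pattern, so they can be generated rather than typed. The sketch below uses a hypothetical helper, `node_uri`, together with the Id and name of the product printed earlier.

```python
SERVICE_ROOT = "https://scihub.copernicus.eu/dhus/odata/v1"


def node_uri(product_id, *node_names, value=False):
    """Build <ServiceRootUri>/Products('<Id>')/Nodes('<a>')/Nodes('<b>')..., optionally ending in /$value."""
    uri = f"{SERVICE_ROOT}/Products('{product_id}')"
    for name in node_names:
        uri += f"/Nodes('{name}')"
    # /$value asks for the raw content instead of the metadata entry
    return uri + "/$value" if value else uri


product_id = "263ace8d-680a-475e-88dc-b68e76d1159c"
safe_name = "S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6.SAFE"

child_listing_uri = node_uri(product_id, safe_name) + "/Nodes"
manifest_uri = node_uri(product_id, safe_name, "manifest.safe", value=True)
print(child_listing_uri)
print(manifest_uri)
```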

In [220]:
# An example
uri = "https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/Nodes"

first_level_nodes = requests.get(uri, auth=(user, password))


In [221]:
#Here we are retrieving XML; we can retrieve JSON instead by placing ?$format=json at the end of the query

from bs4 import BeautifulSoup

fln = BeautifulSoup(first_level_nodes.content, 'xml')

fln

<?xml version="1.0" encoding="utf-8"?>
<feed xml:base="https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/" xmlns="http://www.w3.org/2005/Atom" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"><id>https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/Nodes</id><title type="text">Nodes</title><updated>2021-09-29T09:57:15.113Z</updated><author><name/></author><link href="Nodes" rel="self" title="Nodes"/><entry><id>https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/Nodes('S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6.SAFE')</id><title type="text">S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6.SAFE</title><updated>2021-09-29T09:57:15.131Z</updated><category scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" term="DHuS.Node"/

In [222]:
#To find specific elements within this document

fln.find_all('Id')
#This gives us the Id of the first node, which we can now use to get the XML schema of the node

[<d:Id>S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6.SAFE</d:Id>]
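The same lookup can be done with the standard library if BeautifulSoup is not available. The sketch below runs on a hand-written, heavily trimmed stand-in for the feed; the namespace URIs are copied from the output above, but the exact placement of the Id element inside the entry is an assumption, which is why the search uses `iter` and does not depend on nesting depth.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the feed shown above
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "d": "http://schemas.microsoft.com/ado/2007/08/dataservices",
    "m": "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata",
}

# Hand-written, heavily trimmed stand-in for the real feed (layout is an assumption)
sample_feed = """
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <title type="text">Nodes</title>
  <entry>
    <title type="text">S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6.SAFE</title>
    <m:properties>
      <d:Id>S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6.SAFE</d:Id>
    </m:properties>
  </entry>
</feed>
"""

root = ET.fromstring(sample_feed)
# iter() walks the whole tree, so the Id elements are found at any depth
node_ids = [element.text for element in root.iter(f"{{{NS['d']}}}Id")]
# The entry titles need the Atom namespace, because it is the feed's default namespace
titles = [entry.findtext("atom:title", namespaces=NS) for entry in root.findall("atom:entry", NS)]
print(node_ids)
```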

In [232]:
#Here we use the JSON format again, as I find it easier to work with; I also expanded on Nodes
uri = "https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/Nodes('S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6.SAFE')/Nodes?$format=json&$expand=Nodes"

xml_scheme = requests.get(uri, auth=(user, password))

xml_scheme.json()['d']['results'][0]

{'__metadata': {'id': "https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/Nodes('S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6.SAFE')/Nodes('support')",
  'uri': "https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/Nodes('S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6.SAFE')/Nodes('support')",
  'type': 'DHuS.Node',
  'content_type': 'application/octet-stream',
  'media_src': "https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/Nodes('S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6.SAFE')/Nodes('support')/$value",
  'edit_media': "https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/Nodes('S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6.SAFE')/Nodes('support')/$value"},
 'Id': 'support',
 'Name': 'support',
 'ContentType': 'Item',
 'ContentLength'

In [224]:
# If we now want the manifest we would do:
# If you figure out how to change the format of this query, please tell me
uri = "https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/Nodes('S1B_IW_RAW__0SDV_20210929T080443_20210929T080515_028910_037342_30C6.SAFE')/Nodes('manifest.safe')/$value"

manifest = requests.get(uri, auth=(user, password))


from bs4 import BeautifulSoup

mf = BeautifulSoup(manifest.content, 'xml')

mf.find_all()

[<xfdu:XFDU version="esa/safe/sentinel-1.0/sentinel-1/sar/level-0/standard/iwdp" xmlns:s1="http://www.esa.int/safe/sentinel-1.0/sentinel-1" xmlns:s1sar="http://www.esa.int/safe/sentinel-1.0/sentinel-1/sar" xmlns:xfdu="urn:ccsds:schema:xfdu:1">
 <informationPackageMap>
 <xfdu:contentUnit dmdID="acquisitionPeriod platform generalProductInformation" pdiID="processing" textInfo="Sentinel-1 Level 0 Package" unitType="SAFE Archive Information Package">
 <xfdu:contentUnit repID="measurementAnnotationSchema" textInfo="Annotation for Measurement Data 1" unitType="Metadata Unit">
 <dataObjectPointer dataObjectID="measurementAnnotData1"/>
 </xfdu:contentUnit>
 <xfdu:contentUnit repID="measurementAnnotationSchema" textInfo="Annotation for Measurement Data 2" unitType="Metadata Unit">
 <dataObjectPointer dataObjectID="measurementAnnotData2"/>
 </xfdu:contentUnit>
 <xfdu:contentUnit dmdID="measurementOrbitReference measurementFrameSet measurementQualityInformation measurement1IndexAnnotation measure

Finally, we will download a whole product (this takes a while).

In [234]:
uri = "https://scihub.copernicus.eu/dhus/odata/v1/Products('263ace8d-680a-475e-88dc-b68e76d1159c')/$value"

product = requests.get(uri, auth=(user, password))

In [None]:
product
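The cell above holds the whole product in memory. For large products it may be preferable to write the response to disk in chunks. The sketch below shows only the writing step and is exercised offline with fake chunks; `save_stream` is a hypothetical helper, and "product.zip" in the comment is a placeholder file name.

```python
import os
import tempfile


def save_stream(chunks, path):
    """Write an iterable of byte chunks to `path` and return the number of bytes written."""
    written = 0
    with open(path, "wb") as handle:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                handle.write(chunk)
                written += len(chunk)
    return written


# With requests, the chunks would come from a streamed response, for example:
#   with requests.get(uri, auth=(user, password), stream=True) as response:
#       response.raise_for_status()
#       save_stream(response.iter_content(chunk_size=1024 * 1024), "product.zip")

# Offline demonstration with fake chunks in place of a real download
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
demo_size = save_stream([b"abc", b"", b"defg"], demo_path)
print(demo_size)
```

Passing `stream=True` makes requests fetch the body lazily, so only one chunk at a time needs to fit in memory.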

# Sources:

- [1]: https://scihub.copernicus.eu/
- [2]: https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-1
- [3]: https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-2
- [4]: https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-3
- [5]: https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-5p