# Sentinel data download

    Author: Ben Ross
    Approx date of code: 25 June 2020
    Date of reinvention: 28 April 2021

Description: This script is intended to download raster data from Sentinel.

#### Research 
1. To get data from Sentinel, try these: https://github.com/shakasom/rs-python-tutorials & https://pypi.org/project/sentinelsat/ & https://sentinelsat.readthedocs.io/en/stable/api.html#quickstart & https://github.com/sentinelsat/sentinelsat
2. To get data from NASA Worldview, try this: https://pypi.org/project/worldview-dl/
3. To get data from Landsat, try this: https://pypi.org/project/landsatxplore/ (generally low resolution)
4. To use Google Earth Engine from Python, try this: https://developers.google.com/earth-engine/python_install

There are two pathways by which we can obtain satellite data. The first is the traditional method of downloading directly from the source (note that a separate approach must be used to collect from each instrument). The second, more modern pathway is to use Google Earth Engine to search for and analyse data directly in the cloud, without downloading anything. This saves a great deal of memory and bandwidth and keeps file handling to a minimum, which matters because file handling is the root cause of many unrepeatable analyses.

In [1]:
from sentinelsat.sentinel import SentinelAPI, read_geojson, geojson_to_wkt
from datetime import date
import pandas
import geopandas
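
The cells that follow rely on a `products_gdf` GeoDataFrame that is not defined in any of the cells shown here. A minimal sketch of how it might be built with sentinelsat's `query()` and `to_geodataframe()` is given below. The helper name `query_products`, the footprint file, the date range and the `platformname` filter are all assumptions for illustration, not the original query.

```python
def query_products(api, footprint_wkt, start, end, platformname="Sentinel-2"):
    """Search the catalogue and return the hits as a GeoDataFrame.

    `api` is expected to be a sentinelsat.SentinelAPI instance. The search
    filters used here are illustrative assumptions.
    """
    products = api.query(
        footprint_wkt,
        date=(start, end),
        platformname=platformname,
    )
    # One row per product, indexed by product UUID, including 'ingestiondate'.
    return api.to_geodataframe(products)


# Hypothetical usage (needs valid credentials and network access):
# api = SentinelAPI('user', 'password', 'https://scihub.copernicus.eu/dhus')
# footprint = geojson_to_wkt(read_geojson('area_of_interest.geojson'))
# products_gdf = query_products(api, footprint, date(2018, 1, 1), date(2020, 12, 31))
```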

List filtering to boil down the master list of 4-hourly images to approximately one per quarter: the earliest-ingested image in each selected month. The months selected in the code below are March, June, September and December, which targets one month in each season. You can always change these in the code below to get a different range or season.

In [None]:
# Extract the month and year of ingestion for every product.
products_gdf['Month'] = products_gdf['ingestiondate'].dt.month
products_gdf['Year'] = products_gdf['ingestiondate'].dt.year
grouped = products_gdf.groupby(['Year','Month'])['ingestiondate'].idxmin().reset_index()
grouped = grouped.rename(columns={'Year':'Group_Year','Month':'Group_Month','ingestiondate':'group_ingestiondate'})

#grouped

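# Enter selected months (3 = March, 6 = June, 9 = September, 12 = December)
selected_month = [3, 6, 9, 12]
grouped = grouped[grouped['Group_Month'].isin(selected_month)]
# Attach the full product record; 'index' holds the product UUID after reset_index().
grouped = grouped.merge(products_gdf.reset_index(), left_on='group_ingestiondate', right_on='index', how='left')
# Display the result without the helper columns (grouped itself is left unchanged).
grouped.drop(columns=['Group_Year', 'Group_Month', 'group_ingestiondate'])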

#grouped.to_csv(r'C:\Users\rossb1\DATA\script_outputs\raster-data-collection\grouped.csv', header=True)
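
To make the selection rule explicit, here is the same logic on plain Python objects: keep the earliest-ingested product in each (year, month) group, then keep only the groups whose month is in the selected list. The function name `earliest_per_month` and the sample IDs and dates are made up for illustration; this is a sketch for checking expectations on a handful of records, not a replacement for the pandas cell.

```python
from datetime import datetime


def earliest_per_month(ingestion_dates, selected_months):
    """Return {(year, month): product_id} for the earliest product per month.

    `ingestion_dates` maps a product ID to its ingestion datetime.
    """
    earliest = {}
    for product_id, when in ingestion_dates.items():
        key = (when.year, when.month)
        # Keep the product with the smallest ingestion date in each group.
        if key not in earliest or when < ingestion_dates[earliest[key]]:
            earliest[key] = product_id
    return {k: v for k, v in sorted(earliest.items()) if k[1] in selected_months}


# Made-up sample records.
sample = {
    "a": datetime(2020, 3, 2, 8, 0),
    "b": datetime(2020, 3, 1, 8, 0),
    "c": datetime(2020, 4, 5, 8, 0),
    "d": datetime(2020, 12, 9, 8, 0),
}
print(earliest_per_month(sample, [3, 6, 9, 12]))
# {(2020, 3): 'b', (2020, 12): 'd'}
```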

## Bulk download
This code downloads every item in the list above. For items that are not online, it triggers a retrieval from the long term archive (LTA). Once retrieved, those items can be downloaded for up to 72 hours, after which they become unavailable again.

In [9]:
api = SentinelAPI('benross', 'password', 'https://scihub.copernicus.eu/dhus')
print('Downloading these items from the grouped list above. Selected months are {min} & {max}.'.format(min=grouped['Month'].min(), max=grouped['Month'].max()))
for index in grouped['index']:
    print(index)
    try:
        api.download(index)
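    except Exception as err:
        # Catch ordinary errors only, so Ctrl+C still interrupts the loop.
        print('Download failed ({}). Moving on to the next item.'.format(err))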

Downloading these items from the grouped list above. Selected months are 3 & 12.
3fd52032-f9d7-4b46-b4c6-0b4fb0a1b9af


Downloading: 100%|██████████| 828M/828M [02:30<00:00, 5.50MB/s] 
MD5 checksumming: 100%|██████████| 828M/828M [00:03<00:00, 214MB/s] 


d4188474-9b32-4b7f-b14a-054b299bb2ca


Product d4188474-9b32-4b7f-b14a-054b299bb2ca is not online. Triggering retrieval from long term archive.


41ff0108-e93a-4b42-a10c-65fe20ea9cdb


Product 41ff0108-e93a-4b42-a10c-65fe20ea9cdb is not online. Triggering retrieval from long term archive.


ec15a6a2-8f57-4845-9daa-a7c9abd10ea6


Product ec15a6a2-8f57-4845-9daa-a7c9abd10ea6 is not online. Triggering retrieval from long term archive.


2ccb65a0-40f2-4754-b623-546cf77fe5b9


Downloading: 100%|██████████| 796M/796M [03:30<00:00, 3.77MB/s]  
MD5 checksumming: 100%|██████████| 796M/796M [00:04<00:00, 191MB/s] 


126fd2fa-659d-460e-aee6-7ea3b579ce18


Product 126fd2fa-659d-460e-aee6-7ea3b579ce18 is not online. Triggering retrieval from long term archive.


3d2be71e-d48f-4c76-a4cd-17fc0eb845a0


Product 3d2be71e-d48f-4c76-a4cd-17fc0eb845a0 is not online. Triggering retrieval from long term archive.


2ccc0982-c545-477c-b42c-0d7649fa77dd


Product 2ccc0982-c545-477c-b42c-0d7649fa77dd is not online. Triggering retrieval from long term archive.


04a2261d-d250-4554-b13e-0cac3de7259f


Product 04a2261d-d250-4554-b13e-0cac3de7259f is not online. Triggering retrieval from long term archive.


82ccae78-8b33-46b7-9337-4cca2e2de0da


Product 82ccae78-8b33-46b7-9337-4cca2e2de0da is not online. Triggering retrieval from long term archive.


06c73a2a-ebac-4a68-a6ad-e07b24e94e4b


Product 06c73a2a-ebac-4a68-a6ad-e07b24e94e4b is not online. Triggering retrieval from long term archive.


4be1c3d2-4019-4ac0-b479-efad2bd68ee9


Downloading: 100%|██████████| 810M/810M [02:14<00:00, 6.03MB/s] 
MD5 checksumming: 100%|██████████| 810M/810M [00:02<00:00, 287MB/s] 


af74b254-e34d-4298-beae-fd0573322ef5


Downloading: 100%|██████████| 806M/806M [01:47<00:00, 7.48MB/s] 
MD5 checksumming: 100%|██████████| 806M/806M [00:02<00:00, 296MB/s] 


1f8c6a2b-af04-4fbe-b126-9740d12eff91


Downloading: 100%|██████████| 825M/825M [02:11<00:00, 6.30MB/s] 
MD5 checksumming: 100%|██████████| 825M/825M [00:02<00:00, 362MB/s] 


a789a299-fa52-4a3b-abb1-6dfddfb3af17


Downloading: 100%|██████████| 808M/808M [01:53<00:00, 7.13MB/s] 
MD5 checksumming: 100%|██████████| 808M/808M [00:02<00:00, 291MB/s] 
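
Most of the products in the log above were offline. One way to see this before starting a bulk run is to check the `Online` flag that `get_product_odata()` returns for each product (the flag is visible in the metadata output elsewhere in this notebook). The sketch below assumes only that `api` exposes `get_product_odata()`; the helper name `split_by_availability` is hypothetical.

```python
def split_by_availability(api, product_ids):
    """Partition product IDs into (online, offline) lists."""
    online, offline = [], []
    for product_id in product_ids:
        info = api.get_product_odata(product_id)
        # 'Online' is False for products held in the long term archive.
        (online if info["Online"] else offline).append(product_id)
    return online, offline


# Hypothetical usage with the IDs selected above:
# online, offline = split_by_availability(api, grouped['index'])
# print(len(online), 'ready to download;', len(offline), 'need LTA retrieval')
```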


## Single download
Use this for rasters selected by hand. Calling download() on an offline product triggers its retrieval from the long term archive (LTA).

Given a list of offline and online products, download_all() will download online products, while concurrently triggering the retrieval of offline products from the LTA. Offline products that become online while downloading will be added to the download queue. download_all() terminates when the download queue is empty, even if not all products were retrieved from the LTA. We suggest repeatedly calling download_all() to download all products, either manually or using a third-party library, e.g. tenacity.

<code>from sentinelsat import SentinelAPI
import tenacity
api = SentinelAPI('user', 'password')
@tenacity.retry(stop=tenacity.stop_after_attempt(3), wait=tenacity.wait_fixed(3600))
def download_all(*args, **kwargs):
    return api.download_all(*args, **kwargs)
downloaded, triggered, failed = download_all(<product_ids>)</code>
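
If adding tenacity as a dependency is not an option, the same idea can be written as a plain loop. This sketch assumes `download_all()` returns three collections keyed by product ID (downloaded, triggered, failed), as in the snippet above; the helper name `download_until_done`, the number of rounds and the wait time are illustrative choices.

```python
import time


def download_until_done(api, product_ids, max_rounds=3, wait_seconds=3600, sleep=time.sleep):
    """Call api.download_all() repeatedly until nothing is left or rounds run out.

    Returns (downloaded, remaining): a dict of downloaded products and a list
    of product IDs that were still not downloaded after the last round.
    """
    remaining = list(product_ids)
    downloaded = {}
    for round_number in range(max_rounds):
        if not remaining:
            break
        done, triggered, failed = api.download_all(remaining)
        downloaded.update(done)
        # Retry both products awaiting LTA retrieval and those that failed.
        remaining = list(triggered) + list(failed)
        if remaining and round_number < max_rounds - 1:
            sleep(wait_seconds)  # give the archive time to restore products
    return downloaded, remaining
```

The `sleep` parameter exists only so the wait can be replaced when trying the function out; in normal use the default `time.sleep` applies.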

In [7]:
# api = SentinelAPI('benross', 'password', 'https://scihub.copernicus.eu/dhus')
api = SentinelAPI('benross', 'password', 'https://scihub.copernicus.eu/apihub')
api.download('1d5ac0b4-bc95-46cc-a177-17cb505dc1ce')

Downloading:  26%|██████████████████████████████████▌                                                                                                    | 208M/810M [00:32<01:35, 6.30MB/s]


KeyboardInterrupt: 

In [5]:
api = SentinelAPI('benross', 'password', 'https://scihub.copernicus.eu/dhus')
api.download('0f7630f5-adc1-4ae7-b4e6-2bb6b6eaf120')

Product 0f7630f5-adc1-4ae7-b4e6-2bb6b6eaf120 is not online. Triggering retrieval from long term archive.


{'id': '0f7630f5-adc1-4ae7-b4e6-2bb6b6eaf120',
 'title': 'S2B_MSIL1C_20180624T000239_N0206_R030_T56JLR_20180624T012635',
 'size': 822617873,
 'md5': '64D8C5A7A25EC0C754B1B5EEF370A724',
 'date': datetime.datetime(2018, 6, 24, 0, 2, 39, 24000),
 'footprint': 'POLYGON((150.9983071749483 -26.205453480792766,152.09706101688445 -26.216552345730634,152.0892072651098 -27.207807639529957,150.9809079062571 -27.196222268228084,150.9983071749483 -26.205453480792766))',
 'url': "https://scihub.copernicus.eu/dhus/odata/v1/Products('0f7630f5-adc1-4ae7-b4e6-2bb6b6eaf120')/$value",
 'Online': False,
 'Creation Date': datetime.datetime(2018, 6, 24, 10, 53, 20, 2000),
 'Ingestion Date': datetime.datetime(2018, 6, 24, 10, 48, 31, 358000),
 'path': '.\\S2B_MSIL1C_20180624T000239_N0206_R030_T56JLR_20180624T012635.zip',
 'downloaded_bytes': 0}

In [5]:
api.download('d52cb82d-b3fb-4f3f-85bf-17620a3eb76a')

Downloading: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 811M/811M [02:36<00:00, 5.19MB/s]
MD5 checksumming: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 811M/811M [00:01<00:00, 406MB/s]


{'id': 'd52cb82d-b3fb-4f3f-85bf-17620a3eb76a',
 'title': 'S2A_MSIL1C_20200608T000251_N0209_R030_T56JLQ_20200608T013101',
 'size': 811368897,
 'md5': '578DFAAD5340D3DF477A8DDEC89FB35D',
 'date': datetime.datetime(2020, 6, 8, 0, 2, 51, 24000),
 'footprint': 'POLYGON((150.98249434480272 -27.107979523970712,152.08992336364332 -27.11952122632138,152.08169329891487 -28.11064693156211,150.96426167914376 -28.09861104481092,150.98249434480272 -27.107979523970712))',
 'url': "https://scihub.copernicus.eu/apihub/odata/v1/Products('d52cb82d-b3fb-4f3f-85bf-17620a3eb76a')/$value",
 'Online': True,
 'Creation Date': datetime.datetime(2020, 6, 8, 8, 21, 32, 879000),
 'Ingestion Date': datetime.datetime(2020, 6, 8, 8, 21, 13, 695000),
 'path': '.\\S2A_MSIL1C_20200608T000251_N0209_R030_T56JLQ_20200608T013101.zip',
 'downloaded_bytes': 811368897}

## Metadata if required

If you are interested in the metadata for each raster you are collecting, use the following code block.

In [30]:
api = SentinelAPI('benross', 'password', 'https://scihub.copernicus.eu/dhus')
api.get_product_odata('75c6596d-c7b9-49e9-a75c-fb5f0e8be6cd', full=True)

{'id': '75c6596d-c7b9-49e9-a75c-fb5f0e8be6cd',
 'title': 'S2B_MSIL2A_20200914T001109_N0214_R073_T56JKR_20200914T021322',
 'size': 1148790859,
 'md5': '7C491D8C40E8BD6CE66AF086056D7B8A',
 'date': datetime.datetime(2020, 9, 14, 0, 11, 9, 24000),
 'footprint': 'POLYGON((151.08150930269144 -27.083033655662746,151.06049297691044 -27.168479359676883,151.0534235411505 -27.19714730798293,149.97211026226412 -27.178056091346864,149.9981760849971 -26.188049972733253,151.09614563513634 -26.20678235710146,151.08150930269144 -27.083033655662746))',
 'url': "https://scihub.copernicus.eu/dhus/odata/v1/Products('75c6596d-c7b9-49e9-a75c-fb5f0e8be6cd')/$value",
 'Online': True,
 'Creation Date': datetime.datetime(2020, 9, 14, 5, 38, 2, 795000),
 'Ingestion Date': datetime.datetime(2020, 9, 14, 5, 37, 18, 193000),
 'Aot retrieval accuracy': 0.0,
 'Cloud cover percentage': 0.200839,
 'Cloud shadow percentage': 0.221176,
 'Dark features percentage': 1.652487,
 'Date': datetime.datetime(2020, 9, 14, 0, 11, 9