## In this notebook we will explore the data offered by the Copernicus Sentinel-2 satellites

More detailed information can be found here: https://scihub.copernicus.eu/userguide/

The technical guide can be found here: https://sentinels.copernicus.eu/web/sentinel/technical-guides/sentinel-2-msi

And the user guide here: https://sentinel.esa.int/web/sentinel/user-guides/sentinel-2-msi

### The data that we can access from Sentinel-2 is classified as

- Level-1C: provides orthorectified Top-Of-Atmosphere (TOA) reflectance, with sub-pixel multispectral registration. Cloud and land/water masks are included in the product. 

- Level-2A: provides orthorectified Bottom-Of-Atmosphere (BOA) reflectance, with sub-pixel multispectral registration. A Scene Classification map (cloud, cloud shadows, vegetation, soils/deserts, water, snow, etc.) is included in the product.

These products are processed by the Sentinel-2 ground segment.

All identifiers for the query can be found here: https://scihub.copernicus.eu/twiki/do/view/SciHubUserGuide/FullTextSearch?redirectedfrom=SciHubUserGuide.3FullTextSearch

The naming conventions are described there as well.
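
As an illustration of the naming convention, here is a minimal sketch that splits a product name into its fields. It assumes the compact naming format used by the products in this notebook (mission, product level, sensing start time, processing baseline, relative orbit, tile, product discriminator). `parse_product_name` is a hypothetical helper and not part of sentinelsat.

```python
from datetime import datetime


def parse_product_name(name):
    """Split a compact Sentinel-2 product name into its fields (hypothetical helper)."""
    # Drop the .SAFE or .zip extension if present
    stem = name.rsplit('.', 1)[0] if name.endswith(('.SAFE', '.zip')) else name
    mission, level, sensing, baseline, orbit, tile, discriminator = stem.split('_')
    return {
        'mission': mission,                      # S2A or S2B
        'product_level': level,                  # MSIL1C or MSIL2A
        'sensing_start': datetime.strptime(sensing, '%Y%m%dT%H%M%S'),
        'processing_baseline': baseline,         # e.g. N0301
        'relative_orbit': orbit,                 # e.g. R027
        'tile': tile,                            # e.g. T11RNQ
        'product_discriminator': discriminator,  # distinguishes products of the same datatake
    }


info = parse_product_name('S2A_MSIL1C_20210801T182921_N0301_R027_T11RNQ_20210801T220817.SAFE')
print(info['mission'], info['product_level'], info['tile'], info['sensing_start'])
```

The same fields explain the wildcard used in the queries in this notebook: `S2A_MSIL1C_202108*` matches Sentinel-2A Level-1C products whose sensing time starts in August 2021.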

In [2]:
# Here we will retrieve one example of each offered product level and display its quick look image
from sentinelsat import SentinelAPI
user = ''
password = ''

api = SentinelAPI(user, password, 'https://apihub.copernicus.eu/apihub')

In [2]:
# Level 1C
        # TODO: This wont work i have to select all available databases, just give basic opening instructions, or are all datasets in here
# This query looks for products from the Sentinel-2A satellite (S2A) at level 1C (MSIL1C); we add a start date so that fewer products are queried, followed by the wildcard *
products = api.query(platformname = 'Sentinel-2',
                     filename='S2A_MSIL1C_202108*')

# Sort by size, then download the first dataset that is not in the Long Term Archive (archived products must be requested about 30 min in advance)
df = api.to_dataframe(products)
try:
    df = df.sort_values(['size'], ascending=[True])
except KeyError:
    # Fall back to the unsorted table if the size column is missing
    print('Size not specified')

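for i in range(len(df)):
    # Check whether the product is online (not in the Long Term Archive)
    if api.is_online(df.index[i]):
        api.download(df.index[i])
        print('The File Name is {}'.format(df.iloc[i]['filename']))
        # Stop after the first successful download
        break
else:
    # The loop finished without a break, so nothing was downloaded
    print('No online product found')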

Querying products: 100%|██████████| 185642/185642 [14:59<00:00, 206.39product/s]
Downloading S2A_MSIL1C_20210801T182921_N0301_R027_T11RNQ_20210801T220817.zip: 100%|██████████| 10.5M/10.5M [00:25<00:00, 416kB/s]
                                                             

The File Name is S2A_MSIL1C_20210801T182921_N0301_R027_T11RNQ_20210801T220817.SAFE
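
The progress bar above shows the wildcard query enumerating 185642 products over roughly 15 minutes. A narrower query may return much faster. The sketch below only assembles keyword arguments and does not contact the server. `build_query` is a hypothetical helper. `producttype`, `date` and `limit` are assumed to be accepted by `api.query` in the installed sentinelsat version, so check them against its documentation.

```python
def build_query(level, start, end, limit=20):
    """Assemble keyword arguments for a narrow Sentinel-2 query (hypothetical helper)."""
    # Product type identifiers for the two levels discussed in this notebook
    product_types = {'1C': 'S2MSI1C', '2A': 'S2MSI2A'}
    return {
        'platformname': 'Sentinel-2',
        'producttype': product_types[level],
        'date': (start, end),   # sensing period as YYYYMMDD strings
        'limit': limit,         # cap the number of returned products
    }


query_kwargs = build_query('1C', '20210801', '20210808')
print(query_kwargs)

# With an authenticated SentinelAPI instance this could be used as:
# products = api.query(**query_kwargs)
```

Restricting the date range and capping the result count keeps the product table small, at the cost of not seeing every product of the month.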




In [4]:
# We will now give basic instructions and further reference to open the data

import zipfile

with zipfile.ZipFile('S2A_MSIL1C_20210801T182921_N0301_R027_T11RNQ_20210801T220817.SAFE'.replace('.SAFE','.zip')) as z:
    z.extractall()


In [28]:
# This returned Top-Of-Atmosphere reflectances in cartographic geometry

# These are 100x100 km2 ortho-images in UTM/WGS84 projection
# They are stored in jp2 (jpeg-2000) format 

# Loading the image data
import os

import numpy as np
import rasterio

arrs = []
jp2s = []

# Collect the paths of all files directly inside IMG_DATA
for root, dirs, files in os.walk('./S2A_MSIL1C_20210801T182921_N0301_R027_T11RNQ_20210801T220817.SAFE/GRANULE/L1C_T11RNQ_A031913_20210801T184255/IMG_DATA'):
    for name in files:
        jp2s.append(os.path.join(root, name))
    # Break so that subdirectories are not walked
    break

# Open the files with rasterio in read-only mode (the default);
# 'r+' would open them for updating, which is not needed here
for jp2 in jp2s:
    arrs.append(rasterio.open(jp2))
    # Only the first file is opened; remove this break to open all of them
    break

# Array of open rasterio datasets, not of pixel values
data = np.array(arrs)

# To go further, call dir() on the elements of data or check the rasterio documentation for more options

None
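
The bands of a Sentinel-2 product do not share one resolution, so they cannot all be stacked into a single array without resampling. The sketch below groups band files by native resolution using only their file names. The example file names are an assumption, built from the tile ID and sensing time in the product name, and `band_id` and `group_by_resolution` are hypothetical helpers.

```python
import os

# Native spatial resolution of each Sentinel-2 MSI band in metres
NATIVE_RESOLUTION_M = {
    'B01': 60, 'B02': 10, 'B03': 10, 'B04': 10, 'B05': 20, 'B06': 20, 'B07': 20,
    'B08': 10, 'B8A': 20, 'B09': 60, 'B10': 60, 'B11': 20, 'B12': 20,
}


def band_id(path):
    """Return the last underscore-separated part of the file name, e.g. 'B02' or 'TCI'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem.rsplit('_', 1)[-1]


def group_by_resolution(paths):
    """Map each native resolution to the bands found in paths."""
    groups = {}
    for path in paths:
        band = band_id(path)
        if band in NATIVE_RESOLUTION_M:  # skips non-band files such as the TCI image
            groups.setdefault(NATIVE_RESOLUTION_M[band], []).append(band)
    return groups


# Assumed example file names, following the pattern <tile>_<sensing time>_<band>.jp2
example_paths = [
    'T11RNQ_20210801T182921_B02.jp2',
    'T11RNQ_20210801T182921_B05.jp2',
    'T11RNQ_20210801T182921_B08.jp2',
    'T11RNQ_20210801T182921_B8A.jp2',
    'T11RNQ_20210801T182921_B01.jp2',
    'T11RNQ_20210801T182921_TCI.jp2',
]
groups = group_by_resolution(example_paths)
print(groups)
```

Bands within one group have the same pixel grid, so after reading each of them with rasterio (for example `src.read(1)` on an open dataset) they can be stacked with `np.stack`.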


In [4]:
# Level 2A

# This query looks for products from the Sentinel-2B satellite (S2B) at level 2A (MSIL2A) sensed in August 2021; the rest of the name is left to the wildcard *
products = api.query(platformname = 'Sentinel-2',
                     filename='S2B_MSIL2A_202108*')

# Sort by size, then download the first dataset that is not in the Long Term Archive (archived products must be requested about 30 min in advance)
df = api.to_dataframe(products)
try:
    df = df.sort_values(['size'], ascending=[True])
except KeyError:
    # Fall back to the unsorted table if the size column is missing
    print('Size not specified')

for i in range(len(df)):
    # Check whether the product is online (not in the Long Term Archive)
    if api.is_online(df.index[i]):
        api.download(df.index[i])
        print('The File Name is {}'.format(df.iloc[i]['filename']))
        # Stop after the first successful download
        break
else:
    # The loop finished without a break, so nothing was downloaded
    print('No online product found')

Querying products: 100%|██████████| 196695/196695 [15:47<00:00, 207.47product/s]
Downloading S2B_MSIL2A_20210804T144729_N0301_R139_T20PPB_20210804T170445.zip: 100%|██████████| 1.07G/1.07G [34:32<00:00, 518kB/s]


The File Name is S2B_MSIL2A_20210804T144729_N0301_R139_T20PPB_20210804T170445.SAFE
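
The download above was 1.07 GB even though the products were sorted by ascending size. One possible explanation, assuming the `size` column holds strings such as `'1.07 GB'`, is that the sort is alphabetical rather than numeric. The sketch below shows the difference. `size_to_bytes` is a hypothetical helper, and the binary unit factors are an assumption that does not affect the ordering.

```python
# Assumed unit factors; the exact base (1000 or 1024) does not change the ordering
UNITS = {'B': 1, 'KB': 1024, 'MB': 1024 ** 2, 'GB': 1024 ** 3, 'TB': 1024 ** 4}


def size_to_bytes(size):
    """Convert a string such as '10.5 MB' to a number of bytes (hypothetical helper)."""
    value, unit = size.split()
    return float(value) * UNITS[unit.upper()]


sizes = ['810.94 MB', '1.07 GB', '10.5 MB']

# Alphabetical order puts the 1.07 GB product first
alphabetical = sorted(sizes)
# Numeric order puts the 10.5 MB product first
numeric = sorted(sizes, key=size_to_bytes)

print(alphabetical)
print(numeric)
```

On the product table, the same helper could be applied with `df.assign(size_bytes=df['size'].map(size_to_bytes)).sort_values('size_bytes')`.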


In [6]:
# We will now give basic instructions and further reference to open the data

import zipfile

with zipfile.ZipFile('S2B_MSIL2A_20210804T144729_N0301_R139_T20PPB_20210804T170445.SAFE'.replace('.SAFE','.zip')) as z:
    z.extractall()

To process this data, follow the instructions given for the Level-1C product above: these are also jp2 files, only further processed.

Do note that the folder structure within IMG_DATA is different, so the file paths have to be collected differently.
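
In Level-2A products the bands under IMG_DATA are split into the resolution subfolders R10m, R20m and R60m, so a walk that stops at the top directory finds no images. The sketch below collects the files per subfolder and demonstrates it on a throwaway directory tree with empty placeholder files. `list_l2a_bands` is a hypothetical helper and the placeholder file names are assumed examples of the Level-2A naming pattern.

```python
import os
import tempfile


def list_l2a_bands(img_data_dir):
    """Map each subfolder of IMG_DATA (e.g. R10m) to its sorted .jp2 file names."""
    bands = {}
    for root, dirs, files in os.walk(img_data_dir):
        jp2s = sorted(name for name in files if name.endswith('.jp2'))
        if jp2s:
            bands[os.path.basename(root)] = jp2s
    return bands


# Placeholder layout imitating part of an L2A IMG_DATA folder (assumed file names)
fake_layout = {
    'R10m': ['T20PPB_20210804T144729_B04_10m.jp2'],
    'R20m': ['T20PPB_20210804T144729_B05_20m.jp2',
             'T20PPB_20210804T144729_SCL_20m.jp2'],
}

with tempfile.TemporaryDirectory() as tmp:
    for folder, names in fake_layout.items():
        os.makedirs(os.path.join(tmp, folder))
        for name in names:
            # Create an empty file; only the names matter for this demonstration
            open(os.path.join(tmp, folder, name), 'w').close()
    layout = list_l2a_bands(tmp)

print(layout)
```

For the product downloaded above, the function would be pointed at the IMG_DATA folder inside the extracted .SAFE directory. The SCL file in the example stands for the Scene Classification map mentioned at the top of this notebook.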