# Downloading data from Sentinel hub
The first part of the project is a notebook that downloads satellite images taken by the Sentinel satellites.  
The data is available both online and through the Sentinel API.  
  
[Sentinel-2](https://apihub.copernicus.eu/apihub) - available after registering with Sentinel hub (free)  
[Sentinel-5](https://s5phub.copernicus.eu/dhus) - available without registration (free)  
    
This notebook provides code to download both `Sentinel-5` and `Sentinel-2` satellite images.
   
Let's start by importing the necessary libraries. Keep in mind that `rasterio` depends on `gdal`.

### Sentinel 5

In [1]:
from sentinelsat import SentinelAPI, read_geojson, geojson_to_wkt
from datetime import date
import glob
import zipfile
import shutil
import os
import rasterio
from rasterio.plot import show
from rasterio.merge import merge

In [9]:
# S5 hub is free for everyone with guest credentials
api = SentinelAPI('s5pguest', 's5pguest', 'https://s5phub.copernicus.eu/dhus')
# set area of interest == Lithuania
footprint = geojson_to_wkt(read_geojson('mapLT.geojson'))
# set dates that you want to search for
init_day = date(2023, 2, 1)
end_day = date(2023, 3, 1)
# set the initial orbit that you want to follow
init_day_orbit = 27453
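
The search footprint is passed to the API as a WKT string built from a GeoJSON file. To show roughly what that conversion does, here is a minimal stand-in written with the standard library only. `polygon_to_wkt` is a hypothetical helper, not part of `sentinelsat`, and the triangle coordinates are made up rather than taken from `mapLT.geojson`. The exact formatting produced by `geojson_to_wkt` may differ.

```python
import json

def polygon_to_wkt(geometry):
    """Convert a GeoJSON Polygon geometry (a dict) to a WKT string."""
    if geometry["type"] != "Polygon":
        raise ValueError("this sketch handles Polygon geometries only")
    rings = []
    for ring in geometry["coordinates"]:
        # GeoJSON positions are [lon, lat]; WKT writes them as "lon lat"
        points = ", ".join(f"{lon} {lat}" for lon, lat, *_ in ring)
        rings.append("(" + points + ")")
    return "POLYGON(" + ", ".join(rings) + ")"

# hypothetical triangle, NOT the real outline stored in mapLT.geojson
example = json.loads(
    '{"type": "Polygon", "coordinates": '
    '[[[21.0, 54.0], [26.0, 54.0], [24.0, 56.0], [21.0, 54.0]]]}'
)
wkt = polygon_to_wkt(example)
print(wkt)
```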

In [10]:
def orbit_nr_counter(first_day, last_day, initial_orbit_nr):
    '''
    Given a time interval and the orbit number of the first day, return a list of expected orbit numbers.
    Input:
        first_day : first day of the search interval
        last_day : last day of the search interval
        initial_orbit_nr : orbit number on the first day
    Output: list of orbit numbers, one per day
    '''
    n = (last_day - first_day).days  # number of days to cover
    increment = 14  # whole orbits per day; one extra orbit accumulates every fifth day
    orbit_numbers = [initial_orbit_nr]  # day 0 is the initial orbit; later days are offset from it
    count = 0  # extra orbits accumulated so far
    # generate the orbit_numbers list to use in the API search
    for i in range(1, n + 1):
        if i % 5 == 0:
            count += 1
        # apply the cumulative count on every day, including each fifth day
        orbit_numbers.append((i * increment) + count + initial_orbit_nr)

    return orbit_numbers
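
The counter assumes 14 orbits per day plus one extra orbit every fifth day. One way to read that rule is as a running total, so the offset for day `i` is `14 * i + i // 5`. The sketch below uses that closed form. `expected_orbit` is a hypothetical helper introduced only for illustration, and the rule itself is this notebook's approximation rather than an official orbit model.

```python
ORBITS_PER_DAY = 14  # whole orbits per day, as assumed in the notebook
EXTRA_EVERY = 5      # one additional orbit accumulates every fifth day

def expected_orbit(day_index, initial_orbit_nr):
    """Orbit number expected `day_index` days after the initial orbit."""
    return initial_orbit_nr + ORBITS_PER_DAY * day_index + day_index // EXTRA_EVERY

# orbit numbers for the first eleven days, starting from the notebook's initial orbit
orbits = [expected_orbit(i, 27453) for i in range(0, 11)]
print(orbits)
```

As a sanity check, day index 27 gives `27453 + 14 * 27 + 5 = 27836`, which is the orbit number in the first file name of the download log.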

In [11]:
# first let's get the orbit numbers that we will be looking for
orbit_numbers = orbit_nr_counter(init_day, end_day, init_day_orbit)
# then let's search for products available in the sentinel api, given the footprint(area)
products = api.query(footprint, 
                    date = (init_day, end_day),
                    platformname = 'Sentinel-5',
                    producttype = 'L2__NO2___')
# return a df to check data easily
products_df = api.to_dataframe(products)
# filter the necessary orbits and return a list of ID's
products_list = products_df[products_df.orbitnumber.isin(orbit_numbers)].index

In [15]:
# checking if product is online and downloading if so
for raster in products_list:
    if api.is_online(raster):
        api.download(raster, directory_path='sentinel_data/sentinel5/no2 files', checksum=False)
    else:
        print("This tile is not available: " + str(raster))

Downloading S5P_OFFL_L2__NO2____20230226T092559_20230226T110729_27836_03_020400_20230228T012659.nc: 100%|██████████| 599M/599M [05:12<00:00, 1.92MB/s] 
Downloading S5P_OFFL_L2__NO2____20230225T094454_20230225T112625_27822_03_020400_20230227T015410.nc: 100%|██████████| 605M/605M [05:05<00:00, 1.98MB/s] 
Downloading S5P_OFFL_L2__NO2____20230223T084116_20230223T102246_27793_03_020400_20230225T004111.nc: 100%|██████████| 577M/577M [04:56<00:00, 1.95MB/s] 
Downloading S5P_OFFL_L2__NO2____20230222T090012_20230222T104142_27779_03_020400_20230224T011420.nc: 100%|██████████| 601M/601M [04:05<00:00, 2.44MB/s] 
Downloading S5P_OFFL_L2__NO2____20230221T091907_20230221T110038_27765_03_020400_20230223T013507.nc: 100%|██████████| 599M/599M [03:48<00:00, 2.63MB/s] 
Downloading S5P_OFFL_L2__NO2____20230220T093803_20230220T111934_27751_03_020400_20230222T015319.nc: 100%|██████████| 603M/603M [03:57<00:00, 2.54MB/s] 
Downloading S5P_OFFL_L2__NO2____20230218T083425_20230218T101555_27722_03_020400_20230220
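
The downloaded file names carry the product type, the sensing start and end times, and the orbit number, so the orbit filter can be checked after the download. The sketch below parses those fields with a regular expression. The pattern is inferred only from the file names in the log above, so treat it as an assumption. `parse_s5p_name` is a hypothetical helper.

```python
import re
from datetime import datetime

# Pattern inferred from the file names shown in the download log (an assumption)
S5P_NAME = re.compile(
    r"^S5P_(?P<stream>[A-Z]{4})_(?P<product>.{10})_"
    r"(?P<start>\d{8}T\d{6})_(?P<end>\d{8}T\d{6})_"
    r"(?P<orbit>\d{5})_"
)

def parse_s5p_name(filename):
    """Extract product type, sensing start time and orbit number from a file name."""
    match = S5P_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognised file name: {filename}")
    return {
        "product": match.group("product"),
        "start": datetime.strptime(match.group("start"), "%Y%m%dT%H%M%S"),
        "orbit": int(match.group("orbit")),
    }

info = parse_s5p_name(
    "S5P_OFFL_L2__NO2____20230226T092559_20230226T110729_27836_03_020400_20230228T012659.nc"
)
print(info["product"], info["orbit"], info["start"].date())
```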

### Sentinel 2

As in the previous example, we use the Sentinel API to retrieve Sentinel-2 data. Instead of collecting scanlines, here we search for raster tiles that cover the country map, which is provided in `geojson` format.

In [2]:
# days for which you want to fetch the data

begin_date = date(2022, 8, 3)
end_date = date(2022, 10, 10)
# your credentials to connect to sentinel hub (replace the placeholders; do not commit real credentials)
api = SentinelAPI('#user', '#password', 'https://apihub.copernicus.eu/apihub')
# set area of interest == Lithuania
footprint = geojson_to_wkt(read_geojson('mapLT.geojson'))
# define a list of raster tiles that cover the country
LT_tiles = ['34VDH','34UDG','34VEH','34UEG',
            '34UEF','34VFH','34UFG','34UFF',
            '34UFE','35VLC','35ULB','35ULA',
            '35ULV','35VMC','35UMB','35UMA']

In [12]:
# find all tiles with less than 25% cloud coverage
# L2A products provide bottom of atmosphere reflectances in cartographic geometry.

products = api.query(footprint, 
                    date = (begin_date, end_date),
                    platformname = 'Sentinel-2',
                    processinglevel = 'Level-2A',	
                    cloudcoverpercentage = (0, 25))

In [16]:
# convert to Pandas DataFrame
products_df = api.to_dataframe(products)
# sort so that the least cloudy, most recently ingested product comes first
products_df = products_df.sort_values(['cloudcoverpercentage', 'ingestiondate'], ascending=[True, False])
# keep only the first (best) product for each tile that covers the country
products_df = products_df[products_df['tileid'].isin(LT_tiles)].drop_duplicates(subset='tileid', keep='first')
print("should be 16 tiles to cover Lithuania, now we have:", len(products_df))

should be 16 tiles to cover Lithuania, now we have: 77
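
The selection step keeps one product per tile: the least cloudy one, with ties going to the most recent ingestion. The same logic can be sketched without pandas, which also makes it easy to list tiles that have no product at all. The records below are hypothetical, not real search results, and `best_per_tile` is an illustrative helper.

```python
from datetime import datetime

# hypothetical search results: (product id, tile id, cloud cover %, ingestion date)
records = [
    ("a", "34VDH", 12.0, datetime(2022, 8, 5)),
    ("b", "34VDH", 3.5, datetime(2022, 9, 1)),
    ("c", "34UDG", 3.5, datetime(2022, 8, 20)),
    ("d", "34UDG", 3.5, datetime(2022, 9, 30)),
    ("e", "31UFT", 0.0, datetime(2022, 9, 2)),  # not one of the wanted tiles
]
wanted_tiles = ["34VDH", "34UDG", "34VEH"]

def best_per_tile(records, wanted_tiles):
    """Return {tile id: product id} with the best product for each wanted tile."""
    best = {}
    for pid, tile, cloud, ingested in records:
        if tile not in wanted_tiles:
            continue
        current = best.get(tile)
        # lower cloud cover wins; on a tie the more recently ingested product wins
        if (current is None or cloud < current[0]
                or (cloud == current[0] and ingested > current[1])):
            best[tile] = (cloud, ingested, pid)
    return {tile: entry[2] for tile, entry in best.items()}

selection = best_per_tile(records, wanted_tiles)
# tiles that the search did not cover at all
missing = sorted(set(wanted_tiles) - set(selection))
print(selection, missing)
```

If the number of selected tiles differs from the expected count, printing the missing tiles in this way can help locate the gap.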


In [85]:
# check if tile is online and download if so
# data will be downloaded in .zip format
# keep in mind that a raster might not contain a full image; you can check the exact image
# and download it manually @ https://scihub.copernicus.eu/dhus/#/home
for tile in products_df.index:
    if api.is_online(tile):
        api.download(tile, directory_path='sentinel_data/sentinel2/s2 tiles', checksum=False)
    else:
        print("This tile is not available: " + str(tile))
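
Because the Sentinel-2 products arrive as `.zip` archives, a likely next step is to extract them. Here is a minimal sketch using `glob`, `os` and `zipfile`, all of which are already imported at the top of the notebook. `unzip_all` is a hypothetical helper, and the demo runs in a throwaway temporary directory rather than in the notebook's `sentinel_data` folders.

```python
import glob
import os
import tempfile
import zipfile

def unzip_all(zip_dir, out_dir):
    """Extract every .zip archive found in zip_dir into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    extracted = []
    for path in sorted(glob.glob(os.path.join(zip_dir, "*.zip"))):
        if not zipfile.is_zipfile(path):
            # an interrupted download can leave a broken archive behind
            print("Skipping invalid archive:", path)
            continue
        with zipfile.ZipFile(path) as archive:
            archive.extractall(out_dir)
        extracted.append(os.path.basename(path))
    return extracted

# demo in a temporary directory with one tiny archive
with tempfile.TemporaryDirectory() as tmp:
    with zipfile.ZipFile(os.path.join(tmp, "demo.zip"), "w") as archive:
        archive.writestr("hello.txt", "hi")
    extracted = unzip_all(tmp, os.path.join(tmp, "unzipped"))
    found = os.listdir(os.path.join(tmp, "unzipped"))
print(extracted, found)
```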

In [19]:
## if you are not looking for a single "good" tile, you can download all tiles from the database
## this is what you should do if you want to build a completely cloud-free s2 image from several images
# api.download_all(products_df.index, directory_path='sentinel_data/sentinel2/s2 tiles', checksum=False)