<a href="https://colab.research.google.com/github/ihawryluk/GP_nowcasting/blob/main/Sentinel2_download_change_cloud_for_faulty_images.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Fixing low-quality images from the first batch of Sentinel-2 downloads

The first batch was downloaded by taking the median of the 3 least cloudy images from 2017-2020. Some of those images are too bright, some are too dark, and some contain clouds. I manually picked these out of the originally downloaded data and will attempt to fix them. There are about 400 such images.
Here I re-download the faulty images as the median of the 1, 2 and 4 least cloudy images.
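To make the "median of the k least cloudy images" idea concrete, here is a minimal NumPy sketch. It is not how FireHR or Earth Engine implement it; the function name `median_of_least_cloudy` and the toy data are made up for illustration, and I assume each image comes with a single cloudiness score.

```python
import numpy as np

def median_of_least_cloudy(images, cloud_scores, k):
    """Pixel-wise median of the k images with the lowest cloud score.

    images: array of shape (n_images, rows, cols, bands)
    cloud_scores: one cloudiness value per image (lower = clearer)
    """
    images = np.asarray(images, dtype=np.float32)
    order = np.argsort(cloud_scores)[:k]      # indices of the k clearest images
    return np.median(images[order], axis=0)   # median across the selected images

# Toy stack: four 2x2 single-band "images" with constant values 10, 20, 30, 40
stack = np.stack([np.full((2, 2, 1), v) for v in (10, 20, 30, 40)])
scores = [5.0, 50.0, 1.0, 20.0]               # clearest first: image 2, 0, 3, 1

composite_k1 = median_of_least_cloudy(stack, scores, k=1)
composite_k2 = median_of_least_cloudy(stack, scores, k=2)
composite_k4 = median_of_least_cloudy(stack, scores, k=4)
```

With `k=1` the composite is simply the single clearest image. Larger `k` averages out leftover clouds but mixes acquisition dates, which is probably why different images respond best to different values.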



---

Edit: I obtained more satisfactory results for ~90 images using the median of the 1 and 2 least cloudy images. About 300 images still have issues, such as being cropped, too bright or too dark. These need to be looked into in more detail.
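One possible way to pre-screen the remaining images before checking them by eye is to flag them by mean pixel value. This is only a sketch: `brightness_flag` and both thresholds are hypothetical, not values used anywhere in this notebook, and since the PNGs are scaled by their own maximum the thresholds would need tuning on real data.

```python
import numpy as np

def brightness_flag(im_uint8, dark_thresh=40, bright_thresh=200):
    """Return 'dark', 'bright' or 'ok' based on the mean value of an 8-bit image.

    The thresholds are illustrative guesses and would need tuning.
    """
    mean = float(np.asarray(im_uint8).mean())
    if mean < dark_thresh:
        return 'dark'
    if mean > bright_thresh:
        return 'bright'
    return 'ok'

# Toy 224x224 RGB images standing in for downloaded PNGs
dark_im = np.zeros((224, 224, 3), dtype=np.uint8)
bright_im = np.full((224, 224, 3), 255, dtype=np.uint8)
mid_im = np.full((224, 224, 3), 128, dtype=np.uint8)

flags = [brightness_flag(im) for im in (dark_im, bright_im, mid_im)]
```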

In [1]:
#!pip install numpy --upgrade  # this is needed for the dependencies -- 1) Run this line 2) Restart runtime 3) Comment out and run the rest

In [4]:
# Install the library
!pip -q install FireHR==0.1.2
# don't worry about the torch error

ERROR: torchtext 0.10.0 has requirement torch==1.9.0, but you'll have torch 1.7.1 which is incompatible.


In [6]:
# Authenticate to use Google Earth Engine API
import ee
ee.Authenticate()

Successfully saved authorization token.


In [5]:
import pandas as pd
from google.colab import files
import numpy as np
import matplotlib.pyplot as plt
from banet.data import open_tif
from pathlib import Path
from FireHR.data import *
from PIL import Image
from time import time


## Manually upload CNN/malaria_df.csv dataframe and a list of faulty images


In [13]:
from google.colab import files
# Two upload dialogs, one per file: the malaria dataframe and the list of faulty images
uploaded = files.upload()
uploaded = files.upload()


Saving faulty_images_list.csv to faulty_images_list.csv


In [14]:
# malaria_df = pd.read_csv('malaria_df.csv')
malaria_df = pd.read_csv('malaria_df_sentinel(5).csv')
faulty = pd.read_csv('faulty_images_list.csv')
print(malaria_df.tail())
print(faulty.tail())
print(type(faulty))
print(faulty.columns)

       country  ...                new_filename
3109  Zimbabwe  ...  sent2_-16.6613_29.6643.png
3110  Zimbabwe  ...  sent2_-16.9454_29.6912.png
3111  Zimbabwe  ...  sent2_-18.3369_29.9142.png
3112  Zimbabwe  ...  sent2_-16.9522_29.5913.png
3113  Zimbabwe  ...  sent2_-18.2664_30.9808.png

[5 rows x 9 columns]
                    filename
437  sent2_9.6026_43.336.png
438  sent2_9.63333_-5.25.png
439   sent2_9.6416_-1.05.png
440   sent2_9.6444_-1.02.png
441  sent2_9.95222_43.37.png
<class 'pandas.core.frame.DataFrame'>
Index(['filename'], dtype='object')


## Function for downloading a single Sentinel-2 image by coordinates



In [28]:
failed = []
failed_shape = []
%mkdir images

In [29]:
def get_long_and_lat_by_filename(filename):
  # Look up the row for this image once and return its (longitude, latitude)
  row = malaria_df[malaria_df['new_filename'] == filename].iloc[0]
  return (row['longitude'], row['latitude'])
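The lookup above raises an `IndexError` if a filename from the faulty list is missing from the dataframe. A more defensive variant could look like the sketch below; `lookup_long_lat` and the toy dataframe are hypothetical, only the column names come from `malaria_df`.

```python
import pandas as pd

def lookup_long_lat(df, filename):
    """Return (longitude, latitude) for filename, or None if it is not in df."""
    rows = df.loc[df['new_filename'] == filename, ['longitude', 'latitude']]
    if rows.empty:
        return None
    long, lat = rows.iloc[0]          # first matching row
    return float(long), float(lat)

# Toy stand-in for malaria_df (made-up rows)
toy_df = pd.DataFrame({
    'new_filename': ['sent2_a.png', 'sent2_b.png'],
    'longitude': [29.6643, 43.336],
    'latitude': [-16.6613, 9.6026],
})

found = lookup_long_lat(toy_df, 'sent2_b.png')
missing = lookup_long_lat(toy_df, 'not_in_table.png')
```

Returning `None` lets the download loop skip and log unknown filenames instead of stopping halfway through.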

In [30]:
def download_tiff(lat, long, pad, savename, cloud):

  %rm data/*.tif
  bottom = lat - pad
  top = lat + pad
  left = long - pad
  right = long + pad

  path_save   = Path('data')
  products    = ["COPERNICUS/S2"]  # Product id in google earth engine
  bands       = ['B4', 'B3', 'B2'] # Red, Green, Blue

  R = RegionST(name         = 'Sentinel-2_malaria', 
              bbox         = [left,bottom,right,top], 
              scale_meters = 10, 
              time_start   = '2017-01-01', 
              time_end     = '2020-01-01')

  # Download time series
  # download_data_ts(R, products, bands, path_save)

  time_window = R.times[0], R.times[-1]

  # Download median composite of the `cloud` least cloudy images within the time_window
  # `cloud` is one of the arguments of the function
  try:
    download_data(R, time_window, products, bands, path_save, 
                  use_least_cloudy=cloud, show_progress=False)
  except Exception as e:
    print('Could not download image', savename, '-', e)
    failed.append(savename)
    return  # nothing was downloaded, so there is nothing to convert

  try:
    convert_to_png_and_save(savename)
  except Exception as e:
    print('File', savename, 'downloaded but could not be converted to png or saved -', e)



In [31]:
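def convert_to_png_and_save(savename):
  # Stack the red, green and blue band files into one (band, row, col) array
  try:
    im = np.concatenate([open_tif(f'data/download.{b}.tif').read() for b in ['B4', 'B3', 'B2']])
  except Exception:
    print('Could not concat the tiffs')
    raise  # let the caller report the failed conversion
  im = im.transpose(1,2,0)  # (row, col, band)
  # Centre-crop to 224x224; images smaller than that are recorded in failed_shape
  size = 224
  pad_x = im.shape[0] - size
  pad_y = im.shape[1] - size
  if pad_x < 0 or pad_y < 0:
    print('Problems with shape of image', savename)
    failed_shape.append(savename)
  # max(..., 0) keeps a negative offset from slicing from the end of the array
  im = im[max(pad_x, 0)//2:, max(pad_y, 0)//2:, :]
  im = im[0:size, 0:size, :]
  im = im.astype(np.float32)
  peak = im.max()
  if peak > 0:
    im = im / peak  # scale to [0, 1]; skipped for all-zero images to avoid dividing by zero
  im_int = (im * 255).astype(np.uint8)
  ims = Image.fromarray(im_int).convert('RGB')
  ims.save('images/' + savename)
  # files.download('images/' + savename)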
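The crop and scaling steps can be checked in isolation, without any downloaded files. The sketch below uses two helper names of my own (`center_crop`, `to_uint8`) and a synthetic array; it assumes the intended output is a centred 224x224 window scaled to 8 bits.

```python
import numpy as np

def center_crop(im, size=224):
    """Return the centred size x size window of a (rows, cols, bands) array."""
    rows, cols = im.shape[:2]
    if rows < size or cols < size:
        raise ValueError(f'image {rows}x{cols} is smaller than {size}x{size}')
    top = (rows - size) // 2
    left = (cols - size) // 2
    return im[top:top + size, left:left + size]

def to_uint8(im):
    """Scale an array by its maximum into the 0..255 range."""
    im = im.astype(np.float32)
    peak = im.max()
    if peak <= 0:                      # all-zero image: avoid dividing by zero
        return np.zeros(im.shape, dtype=np.uint8)
    return (im / peak * 255).astype(np.uint8)

# Synthetic 245x245 RGB array standing in for a stacked download
demo = np.arange(245 * 245 * 3).reshape(245, 245, 3)
cropped = center_crop(demo)
scaled = to_uint8(cropped)
```

Raising on undersized input makes cropped or truncated downloads fail loudly instead of producing a thin strip of pixels.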

In [33]:
def download_from_faulty_df(cloud):
  for i in range(len(faulty.index)):
    if i % 10 == 0:
      print(i)
    filename = faulty.loc[i,'filename']
    long, lat = get_long_and_lat_by_filename(filename)
    long = round(long,5)
    lat = round(lat,5)
    pad = 0.011
    savename = filename
    download_tiff(lat, long, pad, savename, cloud = cloud)
  #   malaria_df.loc[i,'new_filename'] =  savename
  # malaria_df.to_csv('malaria_df_sentinel.csv', index=False)
  # files.download('malaria_df_sentinel.csv')
  print('Images from downloaded')
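A rough check of what `pad = 0.011` means in pixels: the box is 0.022 degrees on each side, and at 10 m per pixel that is about 245 pixels north-south, narrowing east-west away from the equator. The sketch below assumes a spherical-Earth figure of roughly 111,320 m per degree of latitude; both helper names are hypothetical, and the size of the file Earth Engine actually exports may differ.

```python
import math

def bbox_from_center(lat, long, pad):
    """[left, bottom, right, top] box of half-width pad degrees around a point."""
    return [long - pad, lat - pad, long + pad, lat + pad]

def approx_size_pixels(lat, pad, scale_meters=10):
    """Approximate (width, height) of the box in pixels at the given latitude."""
    meters_per_degree = 111_320                  # approx. length of 1 degree of latitude
    height_m = 2 * pad * meters_per_degree
    width_m = height_m * math.cos(math.radians(lat))  # longitude degrees shrink with latitude
    return round(width_m / scale_meters), round(height_m / scale_meters)

box = bbox_from_center(lat=9.6026, long=43.336, pad=0.011)
size_equator = approx_size_pixels(lat=0.0, pad=0.011)
size_south = approx_size_pixels(lat=-18.0, pad=0.011)
```

Under these assumptions the box stays a little larger than 224 pixels in both directions for the latitudes in this dataset, so a 224x224 crop should normally fit.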

In [34]:
def zip_and_download():
  !zip -r /content/images.zip /content/images
  files.download("/content/images.zip")

In [37]:
def clean():
  !rm *.zip
  !rm images/*.png
  !rm data/*.tif
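The shell `rm` calls print an error when nothing matches the pattern (as in the `rm: cannot remove 'data/*.tif'` message in the outputs). A pure-Python alternative that stays quiet in that case could look like this sketch; `remove_matching` is a hypothetical helper, demonstrated here on a temporary folder rather than the real `data/` and `images/` directories.

```python
import tempfile
from pathlib import Path

def remove_matching(folder, pattern):
    """Delete files in folder that match pattern; return how many were removed."""
    removed = 0
    for f in Path(folder).glob(pattern):
        if f.is_file():
            f.unlink()
            removed += 1
    return removed

# Demonstration on a throwaway folder with two .tif files and one .png
with tempfile.TemporaryDirectory() as tmp:
    for name in ('download.B4.tif', 'download.B3.tif', 'keep.png'):
        (Path(tmp) / name).write_bytes(b'')
    first_pass = remove_matching(tmp, '*.tif')
    second_pass = remove_matching(tmp, '*.tif')   # nothing left, no error
    png_survives = (Path(tmp) / 'keep.png').exists()
```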

## Iterate over all rows in the dataframe

In [35]:
print('Altogether we have', len(faulty.index), 'locations in the faulty images dataframe')

Altogether we have 442 locations in the faulty images dataframe


In [36]:
download_from_faulty_df(2)
zip_and_download()

0
rm: cannot remove 'data/*.tif': No such file or directory
10
20
30
40
50
60
70
80
90
100
110
120
130
140
150
160
170
180
190
200
210
220
230
240
250
260
270
280
290
300
310
320
330
340
350
360
370
380
390
400
410
420
430
440
Images from downloaded
  adding: content/images/ (stored 0%)
  adding: content/images/sent2_-4.42108_15.36884.png (deflated 0%)
  adding: content/images/sent2_6.3851_2.4026.png (deflated 0%)
  adding: content/images/sent2_3.717_8.833.png (deflated 0%)
  adding: content/images/sent2_9.0236_-1.02.png (deflated 0%)
  adding: content/images/sent2_9.233_-5.734.png (deflated 0%)
  adding: content/images/sent2_6.3609_2.4228.png (deflated 0%)
  adding: content/images/sent2_6.453_3.396.png (deflated 0%)
  adding: content/images/sent2_-7.72236_23.68859.png (deflated 0%)
  adding: content/images/sent2_7.48525_-7.47344.png (deflated 0%)
  adding: content/images/sent2_11.5166_-8.2833.png (deflated 0%)
  adding: content/images/sent2_-17.814_25.149.png (deflated 0%)
  adding: content/images/sent2_4.5447_9.3305.png (deflated 0%)
  adding: content/images/sent2_-15.865_34.966.png (deflated 0%)
  adding: content/images/sent2_3.551_8.849.png (deflated 0%)
  adding: content/images/sent2_14.46963_39.298


In [40]:
clean()

In [41]:
download_from_faulty_df(1)
zip_and_download()

0
rm: cannot remove 'data/*.tif': No such file or directory
10
20
30
40
50
60
70
80
90
100
110
120
130
140
150
160
170
180
190
200
210
220
230
240
250
260
270
280
290
300
310
320
330
340
350
360
370
380
390
400
410
420
430
440
Images from downloaded
  adding: content/images/ (stored 0%)
  adding: content/images/sent2_-4.42108_15.36884.png (deflated 0%)
  adding: content/images/sent2_6.3851_2.4026.png (deflated 0%)
  adding: content/images/sent2_3.717_8.833.png (deflated 0%)
  adding: content/images/sent2_9.0236_-1.02.png (deflated 0%)
  adding: content/images/sent2_9.233_-5.734.png (deflated 0%)
  adding: content/images/sent2_6.3609_2.4228.png (deflated 0%)
  adding: content/images/sent2_6.453_3.396.png (deflated 0%)
  adding: content/images/sent2_-7.72236_23.68859.png (deflated 0%)
  adding: content/images/sent2_7.48525_-7.47344.png (deflated 0%)
  adding: content/images/sent2_11.5166_-8.2833.png (deflated 0%)
  adding: content/images/sent2_-17.814_25.149.png (deflated 0%)
  adding: content/images/sent2_4.5447_9.3305.png (deflated 0%)
  adding: content/images/sent2_-15.865_34.966.png (deflated 0%)
  adding: content/images/sent2_3.551_8.849.png (deflated 0%)
  adding: content/images/sent2_14.46963_39.298


In [42]:
clean()

In [43]:
download_from_faulty_df(4)
zip_and_download()

0
rm: cannot remove 'data/*.tif': No such file or directory
10
20
30
40
50
60
70
80
90
100
110
120
130
140
150
160
170
180
190
200
210
220
230
240
250
260
270
280
290
300
310
320
330
340
350
360
370
380
390
400
410
420
430
440
Images from downloaded
  adding: content/images/ (stored 0%)
  adding: content/images/sent2_-4.42108_15.36884.png (deflated 0%)
  adding: content/images/sent2_6.3851_2.4026.png (deflated 0%)
  adding: content/images/sent2_3.717_8.833.png (deflated 0%)
  adding: content/images/sent2_9.0236_-1.02.png (deflated 0%)
  adding: content/images/sent2_9.233_-5.734.png (deflated 0%)
  adding: content/images/sent2_6.3609_2.4228.png (deflated 0%)
  adding: content/images/sent2_6.453_3.396.png (deflated 0%)
  adding: content/images/sent2_-7.72236_23.68859.png (deflated 0%)
  adding: content/images/sent2_7.48525_-7.47344.png (deflated 0%)
  adding: content/images/sent2_11.5166_-8.2833.png (deflated 0%)
  adding: content/images/sent2_-17.814_25.149.png (deflated 0%)
  adding: c
