## Accessing Sentinel-1 RTC data with the Planetary Computer STAC API

The [Sentinel 1 RTC](https://planetarycomputer.microsoft.com/dataset/sentinel-1-rtc) product in this collection is a radiometrically terrain corrected product derived from the [Sentinel-1 Ground Range Detected (GRD)](https://planetarycomputer.microsoft.com/dataset/sentinel-1-grd) Level-1 products produced by the European Space Agency.

### Environment setup

Running this notebook requires an API key.

* The [Planetary Computer Hub](https://planetarycomputer.microsoft.com/compute) is pre-configured to use your API key.
* To use your API key locally, set the environment variable `PC_SDK_SUBSCRIPTION_KEY` or call `planetary_computer.settings.set_subscription_key(<YOUR API Key>)`.

See [when an account is needed](https://planetarycomputer.microsoft.com/docs/concepts/sas/#when-an-account-is-needed) for more, and [request an account](http://planetarycomputer.microsoft.com/account/request) if needed.
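
For a local run, it can help to check up front whether the key is visible in the environment. The following is a minimal sketch, not part of the original notebook: the helper name `read_subscription_key` is hypothetical, while the environment variable name and the `set_subscription_key` call mentioned in the message are the ones listed above.

```python
import os


def read_subscription_key(env_var="PC_SDK_SUBSCRIPTION_KEY"):
    """Return the API key stored in `env_var`, or None if it is unset or blank."""
    key = os.environ.get(env_var, "").strip()
    return key or None


key = read_subscription_key()
if key is None:
    # Hypothetical reminder text; adjust to your own setup.
    print(
        "PC_SDK_SUBSCRIPTION_KEY is not set. Set it in your shell, or call "
        "planetary_computer.settings.set_subscription_key(...) in the notebook."
    )
else:
    print("API key found in the environment.")
```

The check only reads the environment, so it is safe to run before any Planetary Computer package is imported.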

In [1]:
# Suppress warnings
import warnings
warnings.filterwarnings('ignore')

# Visualization
import ipyleaflet
import matplotlib.pyplot as plt
from IPython.display import Image
import seaborn as sns

# Data Science
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Feature Engineering
from sklearn.model_selection import train_test_split

# Machine Learning
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import r2_score


# Planetary Computer Tools
import pystac
import pystac_client
import odc
from pystac_client import Client
from pystac.extensions.eo import EOExtension as eo
from odc.stac import stac_load
import planetary_computer as pc

# Please pass your API key here
# pc.settings.set_subscription_key('********************')

# Others
import requests
import rich.table
from itertools import cycle
from tqdm import tqdm
tqdm.pandas()
from tqdm.notebook import tqdm_notebook
tqdm_notebook.pandas()

In [2]:
# Most of these imports were already made in the previous cell.
# `planetary_computer` is imported again under its full name because
# the catalog cell below refers to it that way.
import planetary_computer

In [3]:
import pandas as pd

# read the CSV file
df = pd.read_csv('Crop_Yield_Data_challenge_2.csv')

# print the contents of the CSV file
print(df.head())

   District   Latitude   Longitude  \
0  Chau_Phu  10.510542  105.248554   
1  Chau_Phu  10.509150  105.265098   
2  Chau_Phu  10.467721  105.192464   
3  Chau_Phu  10.494453  105.241281   
4  Chau_Phu  10.535058  105.252744   

  Season(SA = Summer Autumn, WS = Winter Spring)  \
0                                             SA   
1                                             SA   
2                                             SA   
3                                             SA   
4                                             SA   

  Rice Crop Intensity(D=Double, T=Triple) Date of Harvest  Field size (ha)  \
0                                       T      15-07-2022             3.40   
1                                       T      15-07-2022             2.43   
2                                       D      15-07-2022             1.95   
3                                       T      15-07-2022             4.30   
4                                       D      14-07-2022           

### Data access

The datasets hosted by the Planetary Computer are available from [Azure Blob Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/). We'll use [pystac-client](https://pystac-client.readthedocs.io/) to search the Planetary Computer's [STAC API](https://planetarycomputer.microsoft.com/api/stac/v1/docs) for the subset of the data that we care about, and then we'll load the data directly from Azure Blob Storage. We'll specify a `modifier` so that we can access the data stored in the Planetary Computer's private Blob Storage Containers. See [Reading from the STAC API](https://planetarycomputer.microsoft.com/docs/quickstarts/reading-stac/) and [Using tokens for data access](https://planetarycomputer.microsoft.com/docs/concepts/sas/) for more.

In [4]:
catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)

### Choose an area and time of interest

We'll search for assets acquired over a single field location in the Chau_Phu district (the first row of the CSV above) between January 1 and August 31, 2022. You can use the [Planetary Computer Explorer](https://planetarycomputer.microsoft.com/explore?c=-79.6735%2C9.0461&z=9.91&ae=0&v=2&d=sentinel-1-rtc&s=false%3A%3A100%3A%3Atrue&m=Most+recent+-+VV%2C+VH&r=VV%2C+VH+False-color+composite) to find other areas of interest.

In [5]:
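# TEST EXAMPLE: a degenerate bounding box (a single point) at the first field in the CSV
bbox = [105.248554, 10.510542, 105.248554, 10.510542]
search = catalog.search(
    collections=["sentinel-1-rtc"], bbox=bbox, datetime="2022-01-01/2022-08-31"
)
items = search.item_collection()
print(f"Found {len(items)} items")
# Print the acquisition date of every item that was found
for found_item in items:
    print(found_item.properties["datetime"])
# Keep the first item for the preview below (the loop variable has a
# different name so that it does not overwrite `item`)
item = items[0]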


Found 29 items
2022-08-26T11:12:00.463421Z
2022-08-25T22:46:14.709082Z
2022-08-14T11:12:00.071377Z
2022-08-02T11:11:59.338449Z
2022-07-21T11:11:58.504828Z
2022-07-09T11:11:57.770925Z
2022-06-27T11:11:57.147850Z
2022-06-26T22:46:10.363160Z
2022-06-15T11:11:56.125679Z
2022-06-03T11:11:55.667800Z
2022-06-02T22:46:08.840031Z
2022-05-22T11:11:54.509303Z
2022-05-21T22:46:07.705465Z
2022-05-10T11:11:53.723854Z
2022-04-27T22:46:06.296981Z
2022-04-16T11:11:52.470632Z
2022-04-15T22:46:05.644861Z
2022-04-04T11:11:52.234391Z
2022-04-03T22:46:05.477503Z
2022-03-23T11:11:52.064532Z
2022-03-22T22:46:05.287764Z
2022-03-11T11:11:51.691272Z
2022-02-26T22:46:04.969244Z
2022-02-15T11:11:51.767747Z
2022-02-03T11:11:51.699083Z
2022-02-02T22:46:04.929540Z
2022-01-22T11:11:52.377922Z
2022-01-21T22:46:05.657153Z
2022-01-09T22:46:06.347730Z
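
The search above passes a bounding box whose two corners coincide, so it matches every scene that covers that single point. If a small window around the field is preferred instead, the box can be built with a helper such as the one sketched below. The function name `point_bbox` and the 0.001 degree buffer (roughly 100 m in latitude) are assumptions made for illustration; the coordinate order is the `[min lon, min lat, max lon, max lat]` order already used in the cell above.

```python
def point_bbox(longitude, latitude, buffer_deg=0.0):
    """Return [min_lon, min_lat, max_lon, max_lat] around a point.

    With buffer_deg=0.0 the box collapses to the point itself, which is
    what the test example above uses.
    """
    if buffer_deg < 0:
        raise ValueError("buffer_deg must be non-negative")
    return [
        longitude - buffer_deg,
        latitude - buffer_deg,
        longitude + buffer_deg,
        latitude + buffer_deg,
    ]


# Same point as the test example
point = point_bbox(105.248554, 10.510542)

# Hypothetical small window around the same point
window = point_bbox(105.248554, 10.510542, buffer_deg=0.001)
print(point)
print(window)
```

Either list can be passed as the `bbox` argument of `catalog.search`.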


The `rendered_preview` asset lets us quickly visualize the data. For Sentinel-1 RTC, this produces a false-color composite from a combination of the VV and VH bands.

In [7]:
Image(url=item.assets["rendered_preview"].href)

In [8]:
import csv
import math
from datetime import datetime

SEASON_COLUMN = 'Season(SA = Summer Autumn, WS = Winter Spring)'

def get_vv_vh(longitude, latitude, season, date, csv_file):
    """Return two lists (VV, VH) with one median value per valid scene."""
    # The search window always ends on the harvest date (given as dd-mm-yyyy).
    end_date = datetime.strptime(date, "%d-%m-%Y").strftime("%Y-%m-%d")

    if season == "SA":
        # Default start of the Summer-Autumn window.
        start_date = "2022-04-01"
        # Look for rows at the same latitude that belong to the other season.
        # The last such row decides the start: its harvest date is used when
        # it is later than the default start.
        with open(csv_file, 'r') as file:
            for row in csv.DictReader(file):
                # Compare as numbers: the CSV text need not equal str(latitude).
                same_latitude = math.isclose(float(row['Latitude']), latitude, abs_tol=1e-9)
                if same_latitude and row[SEASON_COLUMN] != season:
                    previous_harvest = datetime.strptime(
                        row['Date of Harvest'], "%d-%m-%Y"
                    ).strftime("%Y-%m-%d")
                    # ISO-formatted dates compare correctly as strings.
                    start_date = max("2022-04-01", previous_harvest)
    elif season == "WS":
        # Fixed start of the Winter-Spring window.
        start_date = "2021-11-01"
    else:
        raise ValueError(f"Unknown season code: {season!r}")

    datetime1 = start_date + "/" + end_date

    # A degenerate bounding box: both corners are the field location.
    bbox = [longitude, latitude, longitude, latitude]
    search = catalog.search(
        collections=["sentinel-1-rtc"], bbox=bbox, datetime=datetime1
    )
    items = search.item_collection()

    vv_list = []
    vh_list = []
    bands_of_interest = ['vh', 'vv']
    for item in items:
        data = stac_load([item], bands=bands_of_interest, patch_url=pc.sign, bbox=bbox).isel(time=0)
        # Skip scenes where either band holds the nodata value -32768.
        if data['vh'].values[0][0] != -32768.0 and data['vv'].values[0][0] != -32768.0:
            vv_list.append(np.median(data["vv"].astype("float64")))
            vh_list.append(np.median(data["vh"].astype("float64")))

    # Note the order: VV first, then VH.
    return vv_list, vh_list
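
The date-window rules inside `get_vv_vh` are easier to check when they are isolated from the CSV lookup and the STAC search. The sketch below restates them as a pure function under that assumption. The name `search_window` and the example harvest dates passed as `previous_harvest` are hypothetical; the fixed start dates (2022-04-01 for SA, 2021-11-01 for WS) and the dd-mm-yyyy input format come from the function above.

```python
from datetime import date, datetime


def search_window(season, harvest_date, previous_harvest=None):
    """Build the 'start/end' datetime string used for the STAC search.

    Dates are given as dd-mm-yyyy strings, as in the CSV.
    """
    end = datetime.strptime(harvest_date, "%d-%m-%Y").date()
    if season == "WS":
        start = date(2021, 11, 1)
    elif season == "SA":
        start = date(2022, 4, 1)
        if previous_harvest is not None:
            previous = datetime.strptime(previous_harvest, "%d-%m-%Y").date()
            # Use the earlier harvest only when it is later than the default start.
            start = max(start, previous)
    else:
        raise ValueError(f"Unknown season code: {season!r}")
    return f"{start.isoformat()}/{end.isoformat()}"


print(search_window("SA", "15-07-2022"))                # 2022-04-01/2022-07-15
print(search_window("SA", "15-07-2022", "10-04-2022"))  # 2022-04-10/2022-07-15
print(search_window("WS", "20-03-2022"))                # 2021-11-01/2022-03-20
```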

In [9]:
new_columns = df.progress_apply(lambda x: get_vv_vh(x['Longitude'], x['Latitude'],x['Season(SA = Summer Autumn, WS = Winter Spring)'], x['Date of Harvest'], 'Crop_Yield_Data_challenge_2.csv'), axis=1)

  0%|          | 0/557 [00:00<?, ?it/s]
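
Each scene contributes a value only when both bands are valid at the field location. The filtering step in `get_vv_vh` can be illustrated on plain lists, as in the sketch below. The value -32768 is the nodata value that the function checks for; the helper name `valid_pairs` and the backscatter numbers are invented for illustration.

```python
NODATA = -32768.0  # value that get_vv_vh treats as "no data"


def valid_pairs(vv_values, vh_values, nodata=NODATA):
    """Keep only acquisitions where both VV and VH hold valid data."""
    return [
        (vv, vh)
        for vv, vh in zip(vv_values, vh_values)
        if vv != nodata and vh != nodata
    ]


# Invented per-scene samples; two of the four scenes have a nodata band.
vv_series = [0.21, NODATA, 0.18, 0.25]
vh_series = [0.04, 0.05, NODATA, 0.06]

pairs = valid_pairs(vv_series, vh_series)
print(pairs)  # [(0.21, 0.04), (0.25, 0.06)]
```

Dropping a scene when either band is invalid keeps the VV and VH lists the same length, which the later VV/VH ratio relies on.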

In [10]:
import statistics

# get_vv_vh returns (vv_list, vh_list) for each row, in that order
vv = [x[0] for x in new_columns]
vh = [x[1] for x in new_columns]

vh_vv_data = pd.DataFrame(list(zip(vv, vh)), columns=["vv_list", "vh_list"])

def find_attributes(dataframe):
    """Summarise each row's VV and VH series into scalar features.

    Returns a dict of equally long lists, keyed by feature name.
    statistics.stdev needs at least two values per series.
    """
    summaries = {
        "min": min,
        "max": max,
        "range": lambda values: max(values) - min(values),
        "mean": statistics.mean,
        "std": statistics.stdev,
    }
    features = {}
    for name, summarise in summaries.items():
        for band in ("vh", "vv"):
            features[f"{name}_{band}"] = [summarise(i) for i in dataframe[f"{band}_list"]]

    mean_vv = features["mean_vv"]
    mean_vh = features["mean_vh"]
    # Ratio of mean VV to mean VH
    features["ratio_vv_vh"] = [v / h for v, h in zip(mean_vv, mean_vh)]
    # Radar vegetation index computed from the mean backscatter values
    features["rvi"] = [np.sqrt(1 - v / (v + h)) * 4 * (h / (v + h)) for v, h in zip(mean_vv, mean_vh)]
    return features

# Compute the features once and write them out; the column names become the CSV header
new = pd.DataFrame(find_attributes(vh_vv_data))
print(new)
new.to_csv("new_crop_data.csv", index=False)


'''
# Concatenate the two dataframes horizontally
new_df = pd.concat([df, vh_vv_data], axis=1)

# Save the new dataframe to a new CSV file
new_df.to_csv('new.csv', index=False)'''

       min_vh    min_vv    max_vh    max_vv  range_vh  range_vv   mean_vh  \
0    0.005246  0.024129  0.059441  0.357296  0.054196  0.333167  0.026118   
1    0.016737  0.053835  0.136669  0.448553  0.119932  0.394718  0.051274   
2    0.010781  0.016936  0.114186  0.424133  0.103404  0.407197  0.030350   
3    0.003393  0.027307  0.070942  0.335494  0.067549  0.308188  0.026225   
4    0.004080  0.021431  0.089368  0.226502  0.085289  0.205071  0.028807   
..        ...       ...       ...       ...       ...       ...       ...   
552  0.002246  0.007758  0.092458  0.447422  0.090212  0.439664  0.031105   
553  0.003157  0.023660  0.080506  0.255528  0.077349  0.231868  0.025837   
554  0.002891  0.008751  0.155461  0.387613  0.152571  0.378862  0.037837   
555  0.001930  0.019068  0.097946  0.488400  0.096017  0.469332  0.029845   
556  0.002429  0.013906  0.196655  0.639950  0.194226  0.626043  0.042741   

      mean_vv    std_vh    std_vv  ratio_vv_vh       rvi  
0    0.132912  0
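
To make the `ratio_vv_vh` and `rvi` columns concrete, the sketch below applies the same expressions as the cell above to one small, invented pair of series. The function name `summarize` and the input numbers are assumptions for illustration, and the `rvi` expression is copied from the notebook's code rather than taken from an external definition.

```python
import math
import statistics


def summarize(vv_values, vh_values):
    """Compute the mean-based features for one field's VV and VH series."""
    mean_vv = statistics.mean(vv_values)
    mean_vh = statistics.mean(vh_values)
    total = mean_vv + mean_vh
    return {
        "mean_vv": mean_vv,
        "mean_vh": mean_vh,
        "ratio_vv_vh": mean_vv / mean_vh,
        # Same expression as the `rvi` column in the cell above
        "rvi": math.sqrt(1 - mean_vv / total) * 4 * (mean_vh / total),
    }


# Invented values: mean VV is about 0.3 and mean VH is about 0.1
features = summarize([0.2, 0.4], [0.05, 0.15])
print(features)
```

With these inputs the ratio is about 3 and the index is about 0.5, because `1 - 0.3 / 0.4 = 0.25`, its square root is 0.5, and `4 * 0.1 / 0.4 = 1`.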

"\n# Concatenate the two dataframes horizontally\nnew_df = pd.concat([df, vh_vv_data], axis=1)\n\n# Save the new dataframe to a new CSV file\nnew_df.to_csv('new.csv', index=False)"