### Time Series 

Goal:
- Plot the changes of methane concentration over a period of time for a given area (ROI)

Variables to be used: 
- **Methane (CH4) Concentration Data**: Concentration data collected by the Sentinel-5P satellite and hosted by Microsoft Planetary Computer
- **Date period**: The period specified by the user (minimum one day)

---

### Setting up environment

In [None]:
import matplotlib.pyplot as plt
import netCDF4 as nc4
import xarray as xr
import fsspec
import numpy as np
import planetary_computer
import pystac_client
import geopandas as gpd
import pandas as pd

In [None]:
# Initialize PySTAC client for data query
planetary_computer.set_subscription_key("c27669c4bdec434d804e2bd738cb16fc")
catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)

---

### Determining appropriate data to display

Characteristics of the plot:
1. Each data point is the average concentration over the ROI for one interval
2. The ROI is constant across all intervals

---

To get the average concentration over the ROI for each interval, we need to:
1. Select all relevant datasets within the interval
2. Calculate the average concentration for each dataset
3. Calculate the overall average for the interval by averaging the concentration values from all datasets
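
The three steps above can be sketched with NumPy on small made-up arrays. This is a minimal illustration under stated assumptions, not the notebook's final implementation: the arrays `granule_a` and `granule_b` and the helper names `dataset_mean` and `pooled_mean` are hypothetical, and NaN stands in for pixels without a valid retrieval.

```python
import numpy as np

def dataset_mean(values):
    """Step 2: mean concentration of one dataset, ignoring NaN (missing) pixels."""
    return float(np.nanmean(values))

def pooled_mean(datasets):
    """Step 3: mean over every valid pixel from all datasets in the interval."""
    all_values = np.concatenate([np.ravel(d) for d in datasets])
    return float(np.nanmean(all_values))

# Hypothetical methane values for two datasets in one interval (illustration only)
granule_a = np.array([1800.0, 1820.0, np.nan, 1840.0])
granule_b = np.array([1900.0, np.nan])

per_dataset = [dataset_mean(g) for g in (granule_a, granule_b)]
interval_mean = pooled_mean([granule_a, granule_b])

# Averaging the per-dataset means weights each dataset equally instead
mean_of_means = float(np.mean(per_dataset))
```

With these values the pooled mean and the mean of the per-dataset means differ, because the datasets contain different numbers of valid pixels. Which of the two is appropriate depends on whether each pixel or each dataset should carry equal weight.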

---

#### Query data

For the following code, we will query data with the following properties:

1. Bounding box (bbox): [112.70505, -44.52755, 154.38241, -11.29524] (Australia)
2. Collection: Sentinel-5P Level 2 (`sentinel-5p-l2-netcdf`)
3. Date range: 01/08/2023 - 01/09/2023
4. S5P processing mode: Offline (`OFFL`)
5. S5P product name: `ch4` (methane)

In [None]:
aus_bbox = [112.70505, -44.52755, 154.38241, -11.29524]

search = catalog.search(
    collections="sentinel-5p-l2-netcdf",
    bbox = aus_bbox,
    datetime="2023-08-01/2023-09-01",
    query={"s5p:processing_mode": {"eq": "OFFL"}, "s5p:product_name": {"eq": "ch4"}},
)
items = search.item_collection()

print(len(items))

#### Select all relevant datasets within an interval 

In [None]:
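# Collect the signed URL of the CH4 asset for every item
item_links = [item.assets["ch4"].href for item in items]

# Open each remote file as a file-like object that xarray can read
open_files = fsspec.open_files(item_links)
f = [file.open() for file in open_files]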


In [None]:
ds = xr.open_mfdataset(f, group="PRODUCT", engine="h5netcdf", concat_dim='t', combine='nested') 
ds

#### Calculate average concentration per dataset

In [None]:
ds["methane_mixing_ratio_bias_corrected"].plot()

In [None]:
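# Flatten coordinates and values from the combined dataset for plotting
# (this loads the data into memory)
lon = ds["longitude"].values.ravel()
lat = ds["latitude"].values.ravel()
combined_methane_data = ds["methane_mixing_ratio_bias_corrected"].values.ravel()

# Assumption: clip the color scale to the 2nd-98th percentile of valid values
vmin, vmax = np.nanpercentile(combined_methane_data, [2, 98])

# Create a scatterplot with color mapping
plt.figure(figsize=(15, 15))
plt.scatter(
    lon,
    lat,
    c=combined_methane_data,
    cmap="viridis",
    vmin=vmin,
    vmax=vmax,
    marker=".",
    s=1,
)

plt.title("Methane Concentration Scatterplot")
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.grid(True)
plt.show()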


#### Calculate total average concentration 
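
One possible way to compute the average over the ROI is to mask the pixels by bounding box before averaging, since a swath returned by the query may extend beyond the ROI. The sketch below is an assumption about how this step could look, not the notebook's confirmed method: `roi_mean` is a hypothetical helper, and the small `lat`, `lon`, and `ch4` arrays are made-up stand-ins for flattened arrays such as `ds["latitude"].values.ravel()`.

```python
import numpy as np

def roi_mean(lat, lon, values, bbox):
    """Mean of `values` for pixels inside bbox = [min_lon, min_lat, max_lon, max_lat]."""
    min_lon, min_lat, max_lon, max_lat = bbox
    inside = (lon >= min_lon) & (lon <= max_lon) & (lat >= min_lat) & (lat <= max_lat)
    # Keep only pixels that are inside the ROI and have a valid (non-NaN) value
    selected = values[inside & ~np.isnan(values)]
    if selected.size == 0:
        return float("nan")  # no valid pixel in the ROI
    return float(selected.mean())

aus_bbox = [112.70505, -44.52755, 154.38241, -11.29524]

# Hypothetical pixels: one valid inside, one NaN inside, two outside the ROI
lat = np.array([-25.0, -30.0, 10.0, -20.0])
lon = np.array([135.0, 140.0, 135.0, 100.0])
ch4 = np.array([1850.0, np.nan, 1900.0, 1700.0])

roi_average = roi_mean(lat, lon, ch4, aus_bbox)
```

Only the first pixel is both inside the bounding box and valid, so it alone determines the result here.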

### 1. Set up variables

|Date Period
---

**Data type**: String

**Format**: "dd/mm/yyyy"

In [None]:
date_start = ""
date_end = ""
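
The query earlier in this notebook passes the date range as a `"YYYY-MM-DD/YYYY-MM-DD"` string, while `date_start` and `date_end` use "dd/mm/yyyy". A minimal sketch of the conversion, using only the standard library, could look like this. The helper name `to_stac_datetime` is hypothetical, and treating a single day (start equal to end) as valid is an assumption based on the "minimum one day" requirement.

```python
from datetime import datetime

def to_stac_datetime(date_start, date_end):
    """Convert two "dd/mm/yyyy" strings into a "YYYY-MM-DD/YYYY-MM-DD" range."""
    start = datetime.strptime(date_start, "%d/%m/%Y").date()
    end = datetime.strptime(date_end, "%d/%m/%Y").date()
    if end < start:
        raise ValueError("date_end must not be earlier than date_start")
    return f"{start.isoformat()}/{end.isoformat()}"

# Same period as the example query
stac_range = to_stac_datetime("01/08/2023", "01/09/2023")
```

The resulting string can then be passed as the `datetime` argument of `catalog.search`.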

|Methane Concentration Data
---



In [None]:
# Raw strings keep the Windows backslash from being read as an escape sequence
nc_file1 = r"Download Results\S5P_OFFL_L2__CH4____20230403T063304_20230403T081434_28345_03_020500_20230404T225423.nc"
nc_file2 = r"Download Results\S5P_OFFL_L2__CH4____20230514T034638_20230514T052808_28925_03_020500_20230515T195331.nc"
file_header = nc4.Dataset(nc_file1, mode='r') # Create a file header containing the nc_file's metadata in read mode
file_header.groups["PRODUCT"]

f1 = fsspec.open(nc_file1).open()
f2 = fsspec.open(nc_file2).open()
ds1 = xr.open_dataset(f1, group="PRODUCT", engine="h5netcdf")
ds2 = xr.open_dataset(f2, group="PRODUCT", engine="h5netcdf")

ds1
ds2

### 2. Plot data time series
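
As a rough sketch of this step, the code below pools the valid values of all datasets that fall on the same day and plots one point per day. Everything here is illustrative: `daily_means` and `plot_time_series` are hypothetical helpers, the input values are made up, a one-day interval is assumed, and the "ppb" axis label assumes the unit of the methane variable.

```python
from collections import defaultdict
from datetime import date

def daily_means(datasets):
    """Pool valid values per day and return sorted (day, mean) pairs.

    `datasets` is an iterable of (day, values) pairs, one per dataset.
    """
    pooled = defaultdict(list)
    for day, values in datasets:
        pooled[day].extend(v for v in values if v == v)  # v == v is False for NaN
    return [(day, sum(vals) / len(vals)) for day, vals in sorted(pooled.items()) if vals]

def plot_time_series(series):
    """Line plot of (day, mean) pairs; matplotlib is imported only when plotting."""
    import matplotlib.pyplot as plt

    days, means = zip(*series)
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(days, means, marker="o")
    ax.set_xlabel("Date")
    ax.set_ylabel("Mean CH4 in ROI (ppb)")  # unit is an assumption
    ax.set_title("Methane concentration time series")
    fig.autofmt_xdate()
    return ax

# Hypothetical ROI values for three datasets on two days (illustration only)
datasets = [
    (date(2023, 8, 1), [1800.0, 1820.0]),
    (date(2023, 8, 1), [1840.0, float("nan")]),
    (date(2023, 8, 2), [1900.0]),
]
series = daily_means(datasets)
```

Calling `plot_time_series(series)` in a notebook cell draws the resulting line plot.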