The sampling locations over Vietnam are selected by dividing Vietnam into 40k grid cells (built from the 10k grid), then filtering and keeping only the cells that contain the sampling locations listed in csv/sample.csv.

The program below will:
- Merge the 10k grid into a 40k grid,
- Filter and keep only the cells that contain points from csv/sample.csv,
- Save the result to the file grid_40km_with_points_1.gpkg,
- For each 40k grid cell, determine the dates on which Sentinel-1 data is available. Later, when processing the 1km NSIDC soil moisture (sm) values of the points 
  inside the 40k grid cells, only the sm values whose dates have Sentinel-1 data will be kept.

  The output is one JSON file per grid cell, listing the dates with Sentinel-1 data, in the directory: s1_dates_per_grid
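
The later date-filtering step is not part of this notebook, but it could look roughly like the sketch below. It assumes a per-cell record with the same keys the notebook writes (`grid_id`, `ascending`, `descending`); the helper name `filter_sm_by_s1_dates`, the column names `date` and `sm`, and all sample values are hypothetical.

```python
import pandas as pd

def filter_sm_by_s1_dates(sm_df, s1_record, date_col="date"):
    """Keep only the rows of sm_df whose date appears in the Sentinel-1 record."""
    # Union of ascending and descending acquisition dates for this grid cell
    s1_dates = set(s1_record.get("ascending", [])) | set(s1_record.get("descending", []))
    # Normalise the soil-moisture dates to the same "YYYY-MM-DD" string format
    sm_dates = pd.to_datetime(sm_df[date_col]).dt.strftime("%Y-%m-%d")
    return sm_df[sm_dates.isin(s1_dates)].reset_index(drop=True)

# Hypothetical record, shaped like one s1_dates_<grid_id>.json file
s1_record = {"grid_id": 12, "ascending": ["2021-01-03"], "descending": ["2021-01-05"]}

# Hypothetical soil-moisture samples for points inside that cell
sm_df = pd.DataFrame({
    "date": ["2021-01-03", "2021-01-04", "2021-01-05"],
    "sm": [0.21, 0.25, 0.19],
})

filtered = filter_sm_by_s1_dates(sm_df, s1_record)
print(filtered)
```

In practice the record would be read with `json.load` from the matching file in s1_dates_per_grid; ascending and descending passes are merged here, which is a simplification that may not suit every analysis.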

### Merge 10k grid into 40k grid, then filter 40k grid cells and retain grid cells containing points in csv/sample.csv

In [None]:
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

# Load 10k grid
grid = gpd.read_file("grid/Grid_10K/grid_10km.gpkg").to_crs("EPSG:4326")

# Create 'row' and 'col' if not available in the grid
if 'row' not in grid.columns or 'col' not in grid.columns:
    grid['centroid_x'] = grid.centroid.x.round(4)
    grid['centroid_y'] = grid.centroid.y.round(4)
    grid['row'] = grid['centroid_y'].rank(method='dense').astype(int)
    grid['col'] = grid['centroid_x'].rank(method='dense').astype(int)

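# Merge 10k grid into 40k grid by grouping into 4x4 blocks (16 cells of 10 km each)
# Shift 'row'/'col' to start at 0 so that every block spans exactly 4 rows and 4 columns
row_block = (grid['row'] - grid['row'].min()) // 4
col_block = (grid['col'] - grid['col'].min()) // 4
grid['group_id'] = row_block.astype(int).astype(str) + '_' + col_block.astype(int).astype(str)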

# Dissolve by group_id
merged_grid = grid.dissolve(by='group_id', as_index=False)

# Keep only geometry and group_id properties
merged_grid = merged_grid[['group_id', 'geometry']]

# Assign new sequential IDs starting from 1 for simplicity 
merged_grid = merged_grid.reset_index(drop=True)
merged_grid['id'] = range(1, len(merged_grid) + 1)

# Filter the 40k grid: keep only the cells containing points from csv/sample.csv
# Load points from CSV and create GeoDataFrame
points_df = pd.read_csv("csv/sample.csv")
geometry = [Point(xy) for xy in zip(points_df['lon'], points_df['lat'])]
points_gdf = gpd.GeoDataFrame(points_df, geometry=geometry, crs="EPSG:4326")

# Filter merged grid to keep only those containing points
joined = gpd.sjoin(merged_grid, points_gdf, how="inner", predicate="contains")
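# .copy() avoids pandas' SettingWithCopyWarning when adding a column below
selected_grid = merged_grid[merged_grid['group_id'].isin(joined['group_id'])].copy()
# Duplicate 'id' as 'grid_id', the column name expected by the next step
selected_grid['grid_id'] = selected_grid['id']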

# Save the selected grid to a new GeoPackage file
selected_grid.to_file("grid/grid_40km_with_points_1.gpkg", driver="GPKG")
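
A 40 km cell spans 4 x 4 cells of 10 km, so each merged group should hold 16 cells away from the grid edges. The sketch below checks that block arithmetic on a small synthetic grid, using only pandas; the 8 x 8 layout, the coordinates, and the `BLOCK` constant are illustrative assumptions, not values from the real grid file.

```python
import pandas as pd

# Synthetic 8 x 8 grid of 10 km cell centroids (coordinates in km, hypothetical)
cells = pd.DataFrame(
    [(x, y) for y in range(5, 85, 10) for x in range(5, 85, 10)],
    columns=["centroid_x", "centroid_y"],
)

# Dense ranks give 1-based row/col indices, as in the notebook
cells["row"] = cells["centroid_y"].rank(method="dense").astype(int)
cells["col"] = cells["centroid_x"].rank(method="dense").astype(int)

BLOCK = 4  # 4 x 4 cells of 10 km make one 40 km cell

# Shift the indices to start at 0 so that every block spans exactly BLOCK rows/cols
row_block = (cells["row"] - cells["row"].min()) // BLOCK
col_block = (cells["col"] - cells["col"].min()) // BLOCK
cells["group_id"] = row_block.astype(str) + "_" + col_block.astype(str)

# Number of 10 km cells that fall in each merged group
sizes = cells.groupby("group_id").size()
print(sizes)
```

Without the shift to a 0-based index, the first block along each axis would contain only three rows or columns, which is why the minimum is subtracted before the integer division.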

### Get Sentinel-1 dates for each filtered 40k grid cell

In [None]:
import ee
import geopandas as gpd
import json
import os
import time

# Initialize Earth Engine (run ee.Authenticate() once beforehand if no credentials are stored)
ee.Initialize()

# Get Sentinel-1 dates for each grid cell for 2021 and 2022
# filterDate() treats the end date as exclusive, so 2023-01-01 is needed to include 2022-12-31
start_date = "2021-01-01"
end_date = "2023-01-01"

# INPUT : filtered 40k grid file 
grid_file = "grid/grid_40km_with_points_1.gpkg"  # Must contain a 'grid_id' column
output_dir = "s1_dates_per_grid"
os.makedirs(output_dir, exist_ok=True)

# === Load grid geometries ===
grid_gdf = gpd.read_file(grid_file).to_crs("EPSG:4326")

# Function to get Sentinel-1 dates for a given geometry and date range
def get_s1_dates(geom, start_date, end_date, orbit_pass):
    ee_geom = ee.Geometry(geom.__geo_interface__)
    s1 = ee.ImageCollection("COPERNICUS/S1_GRD") \
        .filterDate(start_date, end_date) \
        .filterBounds(ee_geom) \
        .filter(ee.Filter.eq("instrumentMode", "IW")) \
        .filter(ee.Filter.eq("orbitProperties_pass", orbit_pass)) \
        .select(["VV", "VH"])
    
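    # Format the timestamps server-side and fetch them in a single request,
    # instead of issuing one getInfo() call per image
    timestamps = s1.aggregate_array("system:time_start")
    date_strings = timestamps.map(lambda t: ee.Date(t).format("YYYY-MM-dd")).distinct()
    # Return Sentinel-1 dates as a sorted list of unique date strings
    return sorted(date_strings.getInfo())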

# Loop through each grid cell and retrieve Sentinel-1 dates
for _, row in grid_gdf.iterrows():
    grid_id = row["grid_id"]
    geom = row["geometry"]
    # Check if dates already exist for this grid_id
    if os.path.exists(os.path.join(output_dir, f"s1_dates_{grid_id}.json")):
        print(f"Already retrieved {grid_id} data")
        continue
    try:
        ascending_dates = get_s1_dates(geom, start_date, end_date, "ASCENDING")
        descending_dates = get_s1_dates(geom, start_date, end_date, "DESCENDING")

        if not ascending_dates and not descending_dates:
            print(f"No Sentinel-1 data for {grid_id}")
            continue
        
        # Save the dates to a JSON file
        out_data = {
            "grid_id": grid_id,
            "ascending": ascending_dates,
            "descending": descending_dates
        }

        out_path = os.path.join(output_dir, f"s1_dates_{grid_id}.json")
        with open(out_path, "w") as f:
            json.dump(out_data, f)

        print(f"Saved dates for {grid_id}: {len(ascending_dates)} ASC, {len(descending_dates)} DESC")

        time.sleep(5)

    except Exception as e:
        print(f"Failed for {grid_id}: {e}")


Already retrieved 12 data
Already retrieved 13 data
Already retrieved 14 data
Already retrieved 15 data
Already retrieved 16 data
Already retrieved 17 data
Already retrieved 18 data
Already retrieved 19 data
Already retrieved 24 data
Already retrieved 25 data
Already retrieved 26 data
Already retrieved 27 data
Already retrieved 28 data
Already retrieved 29 data
Already retrieved 30 data
Already retrieved 31 data
Already retrieved 32 data
Already retrieved 33 data
Already retrieved 42 data
Already retrieved 43 data
Already retrieved 44 data
Already retrieved 45 data
Already retrieved 46 data
Already retrieved 47 data
Already retrieved 48 data
Already retrieved 49 data
Already retrieved 50 data
Already retrieved 51 data
Already retrieved 52 data
Already retrieved 53 data
Already retrieved 61 data
Already retrieved 62 data
Already retrieved 63 data
Already retrieved 64 data
Already retrieved 65 data
Already retrieved 66 data
Already retrieved 67 data
Already retrieved 68 data
Already retr