# Arctic Sea Ice Area & Extent (1978–2023)

In [4]:
import xarray as xr
import numpy as np
import pandas as pd
from pathlib import Path

We use NSIDC Climate Data Record (CDR, v4) daily data to generate monthly averages of **Sea Ice Area (SIA)** and **Sea Ice Extent (SIE)**.  

- **SIA (million km²):** sum over grid cells of cell area × ice concentration (as a fraction).  
- **SIE (million km²):** total area of grid cells with ≥15% ice concentration.  

Monthly aggregation reduces day-to-day weather noise and aligns with other climate datasets for long-term Arctic analysis.
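
To make the two definitions concrete, here is a minimal sketch on a made-up 2×3 concentration grid. The grid values and the uniform 625 km² cell area are hypothetical; the CDR file used in this notebook already contains the finished SIA and SIE time series, so nothing like this needs to be run on it.

```python
import numpy as np

# Hypothetical 2x3 grid of sea ice concentration, as a fraction between 0 and 1
concentration = np.array([[0.00, 0.10, 0.50],
                          [0.90, 1.00, 0.14]])

# Hypothetical uniform cell area (25 km x 25 km); real grids vary cell by cell
cell_area_km2 = np.full(concentration.shape, 625.0)

# SIA: cell area weighted by ice fraction, summed over all cells
# (assumption: no 15% threshold is applied to SIA in this sketch)
sia_km2 = float(np.sum(cell_area_km2 * concentration))

# SIE: full area of every cell with at least 15% concentration
ice_covered = concentration >= 0.15
sie_km2 = float(np.sum(cell_area_km2[ice_covered]))

print(f"SIA = {sia_km2:.1f} km², SIE = {sie_km2:.1f} km²")
```

In this toy grid, three cells pass the 15% threshold and count in full toward SIE, while SIA counts every cell only in proportion to its ice fraction.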

### Load and Prepare Dataset
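We open the NetCDF file with `xarray`, inspect its structure, and convert the **time** coordinate from the file's no-leap calendar (which `xarray` decodes as `object`-dtype cftime values) into standard pandas datetimes, so that grouping by year and month is straightforward.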

In [5]:
# File Path
BASE_DIR = Path().resolve().parents[1]

# Load the Arctic daily SIA/SIE NetCDF file (no-leap calendar)
arctic_path = BASE_DIR / "data" / "raw" / "NSIDC_CDR_daily_v4_SIA_SIE_197901_202312_no_leap.nc"
ds = xr.open_dataset(arctic_path, decode_times=True)
print(ds)

# Convert CFTime to pandas datetime
time_values = pd.to_datetime(ds["time"].values.astype("datetime64[ns]"))

<xarray.Dataset> Size: 726kB
Dimensions:  (time: 16493)
Coordinates:
  * time     (time) object 132kB 1978-10-25 00:00:00 ... 2023-12-31 00:00:00
Data variables:
    CDR_SIA  (time) float32 66kB ...
    BT_SIA   (time) float32 66kB ...
    NT_SIA   (time) float32 66kB ...
    CDR_SIE  (time) float64 132kB ...
    BT_SIE   (time) float64 132kB ...
    NT_SIE   (time) float64 132kB ...
Attributes:
    Description:  Arctic sea ice area (SIA) and sea ice extent (SIE) from the...
    Units:        million square km
    Timestamp:    22:24 UTC Wed 2024-05-22
    Data source:  NOAA/NSIDC Climate Data Record of Passive Microwave Sea Ice...
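
As a quick sanity check on the no-leap calendar, the length of the `time` dimension (16493) can be reproduced with pandas alone. This is a sketch that assumes the record is a gap-free daily series from 1978-10-25 to 2023-12-31 with every 29 February omitted; the variable names are illustrative.

```python
import pandas as pd

# Daily range on the standard calendar covering the dataset's time span
standard_days = pd.date_range("1978-10-25", "2023-12-31", freq="D")

# A no-leap calendar has no 29 February, so drop those dates
is_feb29 = (standard_days.month == 2) & (standard_days.day == 29)
noleap_days = standard_days[~is_feb29]

print(len(standard_days), int(is_feb29.sum()), len(noleap_days))
# → 16504 11 16493
```

The count matches the dataset, which suggests there are no missing days on the time axis and that February means in leap years are averaged over 28 daily values.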


### Aggregate and Save Monthly Means
From daily records, we compute **monthly averages** of SIA and SIE, then export the processed dataset (`arctic_sia_sie_monthly.csv`).  
This file will be used in later notebooks for climatologies, anomaly detection, and forecasting models.

In [6]:
# Extract variables
sia = ds["CDR_SIA"].values  # million km²
sie = ds["CDR_SIE"].values  # million km²

# Create DataFrame
df_sia_sie = pd.DataFrame({
    "date": time_values,
    "sia_million_km2": sia,
    "sie_million_km2": sie
})

# Drop missing values
df_sia_sie = df_sia_sie.dropna(subset=["sia_million_km2", "sie_million_km2"])

# Add year and month
df_sia_sie["year"] = df_sia_sie["date"].dt.year
df_sia_sie["month"] = df_sia_sie["date"].dt.month


# Compute monthly means of SIA and SIE
monthly_sia_sie = df_sia_sie.groupby(["year", "month"], as_index=False)[
    ["sia_million_km2", "sie_million_km2"]
].mean()

# Save monthly mean dataset
monthly_out_path = BASE_DIR / "data" / "pre_processed" / "arctic_sia_sie_monthly.csv"
monthly_sia_sie.to_csv(monthly_out_path, index=False)

print("Monthly data saved to:", monthly_out_path)
print(monthly_sia_sie.head())

Monthly data saved to: D:\Msc Data and Computational Science\Summer\Projects in Maths Modelling\Github\project-acm40960-ss\data\pre_processed\arctic_sia_sie_monthly.csv
   year  month  sia_million_km2  sie_million_km2
0  1978     10         9.524144        10.153839
1  1978     11        10.811478        11.506771
2  1978     12        12.834232        13.668629
3  1979      1        14.615531        15.543609
4  1979      2        15.492461        16.448393
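
Because the record begins on 1978-10-25, the October 1978 mean rests on at most seven daily values. One possible way to make such partial months visible is to store a day count next to each mean, along with a month-start date column for plotting. The sketch below uses a small made-up daily frame; the values and the `n_days` column name are illustrative and not part of the saved CSV.

```python
import pandas as pd

# Hypothetical daily values for illustration only (not real SIA data)
daily = pd.DataFrame({
    "date": pd.to_datetime([
        "1978-10-25", "1978-10-26",
        "1978-11-01", "1978-11-02", "1978-11-03",
    ]),
    "sia_million_km2": [9.0, 9.2, 10.0, 10.2, 10.4],
})
daily["year"] = daily["date"].dt.year
daily["month"] = daily["date"].dt.month

# Monthly mean plus the number of daily values behind each mean
monthly = daily.groupby(["year", "month"], as_index=False).agg(
    sia_million_km2=("sia_million_km2", "mean"),
    n_days=("sia_million_km2", "count"),
)

# Month-start timestamp built from the year and month columns
monthly["date"] = pd.to_datetime(monthly[["year", "month"]].assign(day=1))

print(monthly)
```

Later notebooks could then filter on `n_days` before computing climatologies or anomalies, if partial months turn out to matter.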
