# 00: Python Fundamentals Reference

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Austfi/xsnowForPatrol/blob/main/notebooks/00_python_fundamentals_reference.ipynb)

This notebook provides a reference for Python fundamentals needed for working with xsnow, including NumPy, Pandas, and Xarray basics.

## What You'll Learn

- NumPy arrays and basic operations
- Pandas DataFrames and Series
- Xarray DataArrays and Datasets
- Essential Python concepts for scientific computing

> **Note**: This is an optional reference notebook. The main tutorial notebooks assume basic Python knowledge but don't require deep NumPy/Pandas/Xarray expertise.

## Installation (For Colab Users)

If you're using Google Colab, run the cell below to install dependencies.

In [1]:
%pip install -q numpy pandas xarray matplotlib


Note: you may need to restart the kernel to use updated packages.


In [2]:
import numpy as np
import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt

%matplotlib inline


## Part 1: NumPy Basics

NumPy provides N-dimensional arrays and fast mathematical operations on them. The image below illustrates how 1D, 2D, and 3D arrays are structured.

<img src="https://nustat.github.io/DataScience_Intro_python/Datasets/numpy_image.png" width="720" style="display:inline;">
<em>Image source: <a href="https://nustat.github.io/DataScience_Intro_python/">nuStat Data Science Intro Python</a>.</em>


In [8]:
# Create 1D NumPy arrays
numpy_1d = np.array([1, 2, 3, 4, 5])
print(f"\n1D array: {numpy_1d}\n")

# Create 2D NumPy arrays
numpy_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(f"\n2D array:\n{numpy_2d}\n")

# Create 3D NumPy arrays
numpy_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(f"\n3D array:\n{numpy_3d}\n")





1D array: [1 2 3 4 5]


2D array:
[[1 2 3]
 [4 5 6]]


3D array:
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
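
The cell above only creates arrays. As a minimal sketch of the "mathematical operations" side, the block below inspects an array's shape and applies element-wise arithmetic and a reduction along one axis. The temperature values and variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical layer temperatures in °C: 2 profiles x 3 layers
temps = np.array([[-5.0, -3.0, -1.0],
                  [-4.0, -2.0, 0.0]])

print(temps.ndim)   # number of axes
print(temps.shape)  # length of each axis
print(temps.dtype)  # element type

# Arithmetic is applied element-wise, without an explicit loop
temps_k = temps + 273.15

# Reductions take an axis: axis=1 averages across the layers of each profile
profile_mean = temps.mean(axis=1)

# Slicing: every profile, first layer only
first_layer = temps[:, 0]
print(profile_mean, first_layer)
```

Note that `axis=1` refers to a position, not a name. Keeping track of which axis means what becomes harder as arrays gain dimensions, which is the problem Xarray's named dimensions address.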



## Part 2: Pandas Basics
 
Pandas revolves around two main data structures: the Series (one-dimensional) and the DataFrame (two-dimensional table).

<img src="https://storage.googleapis.com/lds-media/images/series-and-dataframe.width-1200.png" width="720" style="display:inline;">
<em>Image source: LDS Media.</em>

In [None]:
# Create a Pandas DataFrame
data = {
    'time': pd.date_range('2024-01-01', periods=5, freq='D'), 
    'temperature': [-5, -3, -1, 0, -2],
    'snow_height': [1.2, 1.3, 1.4, 1.5, 1.6]
}
df = pd.DataFrame(data)
df



| | time | temperature | snow_height |
|---|---|---|---|
| 0 | 2024-01-01 | -5 | 1.2 |
| 1 | 2024-01-02 | -3 | 1.3 |
| 2 | 2024-01-03 | -1 | 1.4 |
| 3 | 2024-01-04 | 0 | 1.5 |
| 4 | 2024-01-05 | -2 | 1.6 |
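
To connect the two structures, here is a short sketch showing that each DataFrame column is a Series, followed by a boolean filter and a column mean. It rebuilds the same DataFrame so it runs on its own. The `-2` threshold and the names `indexed`, `cold_days`, and `mean_height` are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.date_range("2024-01-01", periods=5, freq="D"),
    "temperature": [-5, -3, -1, 0, -2],
    "snow_height": [1.2, 1.3, 1.4, 1.5, 1.6],
})

# Selecting a single column returns a Series
temperature = df["temperature"]
print(type(temperature).__name__)

# Use the time column as the index so rows are labeled by date
indexed = df.set_index("time")

# Boolean mask: keep only the days colder than -2
cold_days = indexed[indexed["temperature"] < -2]
print(cold_days)

# Column-wise reduction
mean_height = indexed["snow_height"].mean()
print(mean_height)
```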


## Part 3: Xarray Basics

Xarray provides labeled multi-dimensional arrays, forming the foundation of xsnow.

<img src="https://api.wandb.ai/files/capecape/images/projects/37231064/40059d9c.png" width="720" style="display:inline;">
<em>Image source: Xarray Documentation.</em>

For a fuller walkthrough, see the "Xarray in 45 minutes" tutorial from the xarray documentation: https://tutorial.xarray.dev/overview/xarray-in-45-min

Working through it is a good way to understand the data model that xsnow builds on.

In [24]:
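# Load an example dataset from the xarray tutorial data
# (downloaded on first use; requires an internet connection and the pooch package)
# Evaluating ds on the last line of a cell displays a summary of its
# dimensions, coordinates, data variables, and attributes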
ds = xr.tutorial.open_dataset('air_temperature')
ds

In [38]:
# A Dataset is a dict-like container of DataArrays that share dimensions
# Dot notation accesses the 'air' DataArray; ds['air'] is equivalent
ds.air



In [39]:
# Assign the DataArray to a variable and read its name
da = ds.air
da.name


'air'

In [42]:
# Dimensions correspond to the axes of the data array
da.dims




('time', 'lat', 'lon')
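
Named dimensions let operations refer to an axis by name rather than by position. The sketch below shows this with a small synthetic DataArray (hypothetical values, so no download is needed) that uses the same dimension names as `air`.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the tutorial data: 2 time steps, 3 lats, 4 lons
values = np.arange(24, dtype=float).reshape(2, 3, 4)
da_small = xr.DataArray(values, dims=("time", "lat", "lon"), name="air")

# Reduce over the dimension by name instead of axis=0
time_mean = da_small.mean(dim="time")

print(da_small.dims)   # dimensions before the reduction
print(time_mean.dims)  # 'time' has been averaged away
```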

In [44]:
# Coordinates are the labels (tick values) along each dimension
da.coords

Coordinates:
  * lat      (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
  * lon      (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
  * time     (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
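
Coordinates are what make label-based selection possible. As a hedged sketch, the block below builds a small synthetic DataArray (its values and coordinate ranges are illustrative, not the tutorial data) and selects from it with `.sel` (by label) and `.isel` (by integer position).

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic DataArray with labeled coordinates (hypothetical values)
da_small = xr.DataArray(
    np.arange(12, dtype=float).reshape(2, 3, 2),
    dims=("time", "lat", "lon"),
    coords={
        "time": pd.date_range("2013-01-01", periods=2, freq="D"),
        "lat": [75.0, 72.5, 70.0],
        "lon": [200.0, 202.5],
    },
    name="air",
)

# Select by exact coordinate labels; the result keeps the 'time' dimension
point = da_small.sel(lat=72.5, lon=200.0)

# Select the closest available label when there is no exact match
nearest = da_small.sel(lat=71.0, method="nearest")

# Select by integer position, as in NumPy indexing
first_step = da_small.isel(time=0)

print(point.values)
print(float(nearest["lat"]))
print(first_step.dims)
```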

In [45]:
# Attributes are metadata associated with the data array
da.attrs

{'long_name': '4xDaily Air temperature at sigma level 995',
 'units': 'degK',
 'precision': 2,
 'GRIB_id': 11,
 'GRIB_name': 'TMP',
 'var_desc': 'Air temperature',
 'dataset': 'NMC Reanalysis',
 'level_desc': 'Surface',
 'statistic': 'Individual Obs',
 'parent_stat': 'Other',
 'actual_range': array([185.16, 322.1 ], dtype=float32)}

In [48]:
# .data returns the underlying array of values, without dimension names or coordinates
da.data



array([[[241.2 , 242.5 , 243.5 , ..., 232.8 , 235.5 , 238.6 ],
        [243.8 , 244.5 , 244.7 , ..., 232.8 , 235.3 , 239.3 ],
        [250.  , 249.8 , 248.89, ..., 233.2 , 236.39, 241.7 ],
        ...,
        [296.6 , 296.2 , 296.4 , ..., 295.4 , 295.1 , 294.7 ],
        [295.9 , 296.2 , 296.79, ..., 295.9 , 295.9 , 295.2 ],
        [296.29, 296.79, 297.1 , ..., 296.9 , 296.79, 296.6 ]],

       [[242.1 , 242.7 , 243.1 , ..., 232.  , 233.6 , 235.8 ],
        [243.6 , 244.1 , 244.2 , ..., 231.  , 232.5 , 235.7 ],
        [253.2 , 252.89, 252.1 , ..., 230.8 , 233.39, 238.5 ],
        ...,
        [296.4 , 295.9 , 296.2 , ..., 295.4 , 295.1 , 294.79],
        [296.2 , 296.7 , 296.79, ..., 295.6 , 295.5 , 295.1 ],
        [296.29, 297.2 , 297.4 , ..., 296.4 , 296.4 , 296.6 ]],

       [[242.3 , 242.2 , 242.3 , ..., 234.3 , 236.1 , 238.7 ],
        [244.6 , 244.39, 244.  , ..., 230.3 , 232.  , 235.7 ],
        [256.2 , 255.5 , 254.2 , ..., 231.2 , 233.2 , 238.2 ],
        ...,
        [295

## Summary

✅ **What we covered:**

1. **NumPy**: N-dimensional arrays and mathematical operations
2. **Pandas**: DataFrames and Series for tabular data
3. **Xarray**: Labeled multi-dimensional arrays

## Next Steps

Return to **01_introduction_and_loading_data.ipynb** to start learning xsnow.