# Numerical Weather Prediction Exploratory Data Analysis
This notebook is a starting point for exploring the Numerical Weather Prediction (NWP) data. Complete the setup outlined in the README before running it.

## Getting the data
To explore the data, you first need to download it locally. The NWP data comes in two parts: surface data and atmosphere data. For initial experiments we will use only one day from each. There are currently a couple of years' worth of data, and OCF is adding more every day. Download the following files:
- Atmosphere: 1 January 2022. [link](https://huggingface.co/datasets/openclimatefix/era5-reanalysis/blob/main/data/atmosphere/2022/01/20220101.zarr.zip).
- Surface: 1 January 2022. [link](https://huggingface.co/datasets/openclimatefix/era5-reanalysis/blob/main/data/surface/2022/01/20220101.zarr.zip).

**It is important that you read the "Note on version control for data" section of the README linked [here](https://github.com/WAT-ai/open-climate-fix-project/blob/main/README.md#note-on-version-control-for-data).**


## Unzipping the data
The data files are in zarr format and come zipped, so you must unzip them first. You can use the [unzip.py](https://github.com/WAT-ai/open-climate-fix-project/blob/2c2b70e42a78a051050ff331eee56a5dd7f3c1f0/utils/unzip.py) script or any other method you like; just make sure to save the data in the `data/` folder in your working directory. Instructions for using the unzip.py script are linked [here](https://github.com/WAT-ai/open-climate-fix-project/blob/main/README.md#utilsunzippy).
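
If you would rather not use the script, the unzip step can be sketched with Python's standard library alone. This is a minimal sketch and not the project's unzip.py: the helper name `unzip_zarr` and its `out_root` parameter are hypothetical, and it assumes the archive holds the contents of the zarr store directly rather than wrapping them in a top-level folder, so inspect the extracted directory before relying on it.

```python
import zipfile
from pathlib import Path


def unzip_zarr(zip_path, out_root="data"):
    """Extract a zipped zarr store into out_root and return the store path.

    Hypothetical helper for illustration; not part of the repository.
    """
    zip_path = Path(zip_path)
    # Path.stem drops only the last suffix: "20220101.zarr.zip" -> "20220101.zarr"
    target = Path(out_root) / zip_path.stem
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: archive members are the store's own files and folders.
        archive.extractall(target)
    return target


# Example call (the download location is an assumption):
# store = unzip_zarr("downloads/20220101.zarr.zip", out_root="data")
```

The returned path is the directory you would then point `DATA_PATH` at, provided the archive layout matches the assumption above.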

Ideally we would inspect data across the whole year to see how the seasons affect the NWP data, but the data is so large that it is difficult to work with multiple days at once.

In [4]:
import xarray as xr

In [7]:
DATA_PATH = 'ABSOLUTE PATH TO DATA'
# here is an example: 'C:/Users/areel/watai/watai_repo/data/nwp/atmosphere/2022/01'

In [8]:
dataset = xr.open_dataset(DATA_PATH, engine='zarr', chunks='auto')



In [9]:
dataset

The output repeats the same dask array summary for every entry:

| | Array | Chunk |
|---|---|---|
| Bytes | 3.43 GiB | 73.27 MiB |
| Shape | (24, 37, 721, 1440) | (1, 37, 721, 720) |
| Count | 49 Tasks | 48 Chunks |
| Type | float32 | numpy.ndarray |

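The size figures in the output follow directly from the shapes, which makes it easier to see why even a single day is heavy. The sketch below redoes that arithmetic in plain Python; the shape, chunk shape, and float32 type are copied from the output, and the variable names are only illustrative.

```python
import math

shape = (24, 37, 721, 1440)   # full array shape, from the output
chunks = (1, 37, 721, 720)    # chunk shape, from the output
itemsize = 4                  # float32 uses 4 bytes per value

# Total bytes = number of values * bytes per value
array_bytes = math.prod(shape) * itemsize
chunk_bytes = math.prod(chunks) * itemsize

# Number of chunks = product over axes of ceil(axis length / chunk length)
n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))

print(f"array: {array_bytes / 2**30:.2f} GiB")   # array: 3.43 GiB
print(f"chunk: {chunk_bytes / 2**20:.2f} MiB")   # chunk: 73.27 MiB
print(f"chunks: {n_chunks}")                     # chunks: 48
```

Because the dataset is opened with `chunks='auto'`, the arrays are lazy dask arrays, so a 3.43 GiB array is only read one 73.27 MiB chunk at a time when a computation actually needs it.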