## Single Shot Detector for ExtremeWeather Data

### Preprocessing the Data
The [ExtremeWeather](https://extremeweatherdataset.github.io/) data set consists of global weather data at 25 km resolution from 1979 to 2005.
The data is organized by year and contains 4 images per day. Each image has 768x1152 pixels across 16 channels for different weather variables. 
In addition, each day has up to 15 bounding boxes surrounding extreme weather events, each classified as a Tropical Depression, Tropical Cyclone, Extratropical Cyclone, or Atmospheric River.
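
To make the four event types usable as training targets, they need integer labels. The snippet below is a minimal sketch of one possible mapping; the names `CLASS_NAMES` and `label_of` are hypothetical, and the numbering (with index 0 reserved for background, a common convention in SSD-style detectors) is an assumption rather than the encoding used in the data set files.

```python
# Hypothetical label map: index 0 is reserved for "background" (assumption),
# followed by the four event types named in the text.
CLASS_NAMES = [
    "Background",
    "Tropical Depression",
    "Tropical Cyclone",
    "Extratropical Cyclone",
    "Atmospheric River",
]

NAME_TO_LABEL = {name: index for index, name in enumerate(CLASS_NAMES)}


def label_of(name):
    """Return the integer label for an event type, or raise on unknown names."""
    try:
        return NAME_TO_LABEL[name]
    except KeyError:
        raise ValueError(f"unknown event type: {name!r}") from None
```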

The goal is to correctly identify and classify extreme weather events. 
The first challenge is the sheer size of the data set.
Each year, even when compressed, takes 64 GB to store.
As a "small" data set, we train on 1979 and 1981 and test on 1984.
We preprocess each year by keeping only the month of July and rescaling each image to 300x300 pixels.
In addition, we hand-pick 3 of the 16 channels for 'visual explainability' of the bounding boxes.
Altogether, this reduces the compressed data for each year from 64 GB to 450 MB: 124 images of 300x300 pixels (31 days with 4 images per day) and their bounding boxes.
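
The reduction steps described above can be sketched as follows. This is only an illustration, not the project's actual `preprocess` module: the function names, the channel indices `(0, 1, 2)`, the `(channels, height, width)` array layout, the `(xmin, ymin, xmax, ymax)` box layout, and the use of nearest-neighbour resampling are all assumptions.

```python
import numpy as np

SRC_H, SRC_W = 768, 1152  # original image size stated above
DST = 300                 # target side length used in this notebook


def downsample(image, channels=(0, 1, 2), size=DST):
    """Keep the chosen channels and resize to size x size (nearest neighbour).

    `image` is assumed to be a (channels, height, width) array.
    """
    _, h, w = image.shape
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return image[list(channels)][:, rows[:, None], cols[None, :]]


def rescale_boxes(boxes, src_hw=(SRC_H, SRC_W), size=DST):
    """Map (xmin, ymin, xmax, ymax) pixel boxes into the resized image."""
    h, w = src_hw
    scale = np.array([size / w, size / h, size / w, size / h])
    return np.asarray(boxes, dtype=float) * scale


# One July: 31 days with 4 images per day
images_per_july = 31 * 4
```

Because the original images are not square, the horizontal and vertical scale factors differ, so the bounding boxes have to be rescaled per axis together with the images.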




### Training the SSD Model
Instead of building the SSD model from scratch, we start from a model pretrained on the COCO data set.

In [None]:
## Import libraries
import torch
import gc

In [None]:
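gc.collect()  # free memory left over from earlier runs
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# Load NVIDIA's pretrained SSD from torch.hub and move it to the selected device
ssd_model = torch.hub.load('NVIDIA/DeepLearningExamples:torchhub', 'nvidia_ssd')
ssd_model = ssd_model.to(device)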

In [None]:
from preprocess import read_data
year = '1979'
images, boxes = read_data(year)
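
Before training, it can help to confirm that the loaded data matches the preprocessing described earlier. The helper below is a hypothetical sanity check, not part of the `preprocess` module: it assumes `images` is array-like with shape `(N, 3, 300, 300)` and that `boxes` holds, for each image, rows of `(xmin, ymin, xmax, ymax)` in pixels. The actual return format of `read_data` may differ.

```python
import numpy as np


def check_year(images, boxes, size=300, channels=3):
    """Return a list of problems found; an empty list means the data looks sane.

    Assumes `images` has shape (N, channels, size, size) and `boxes` holds,
    per image, rows of (xmin, ymin, xmax, ymax) in pixel coordinates.
    """
    problems = []
    images = np.asarray(images)
    if images.shape[1:] != (channels, size, size):
        problems.append(f"unexpected image shape {images.shape}")
    if len(boxes) != len(images):
        problems.append(f"{len(images)} images but {len(boxes)} box lists")
    for i, image_boxes in enumerate(boxes):
        arr = np.asarray(image_boxes, dtype=float).reshape(-1, 4)
        if ((arr < 0) | (arr > size)).any():
            problems.append(f"image {i}: box outside the image")
        if (arr[:, 0] >= arr[:, 2]).any() or (arr[:, 1] >= arr[:, 3]).any():
            problems.append(f"image {i}: box with non-positive width or height")
    return problems
```

Under those assumptions, `check_year(images, boxes)` should return an empty list for a correctly preprocessed year.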