## Project: Forecasting El Niño

1. **Understand El Niño:**
   - Begin by reading about El Niño to gain a deep understanding of what it is, how it affects the Pacific equatorial ocean, and its impact on the local economy. This foundational knowledge will help you make informed decisions about feature selection and model building.
   
   ![Weather Data](../images/monthly-sst-lanina-normal-elnino.png)

2. **Data Exploration:**
   - Start by exploring the provided dataset, which includes monthly sea surface temperature (SST) data. Understand the structure of the data, the temporal and spatial resolutions, and any additional variables if available.

3. **Characterize El Niño:**
   - Define and create a binary target variable that marks El Niño events in the dataset. El Niño is typically associated with higher-than-normal sea surface temperatures in the central and eastern tropical Pacific Ocean. You can use historical data and domain knowledge to label time periods as either "El Niño" or "Non-El Niño" based on specific criteria, such as SST anomaly thresholds.

4. **Data Preprocessing:**
   - Prepare the dataset for machine learning by cleaning, transforming, and encoding the data. Ensure that the dataset is in a format suitable for classification, with features and a target variable (the El Niño label).

5. **Feature Engineering:**
   - Select relevant features that may influence El Niño events. These features could include sea surface temperatures, ocean current patterns, atmospheric pressure, and other climatic variables. You may also consider lagged variables, seasonal patterns, and spatial features.

6. **Model Selection:**
   - Choose an appropriate machine learning algorithm for classification. Common classifiers for binary classification tasks like this one include Logistic Regression, Random Forest, Support Vector Machine (SVM), Gradient Boosting, and Neural Networks.
   - Experiment with different algorithms and hyperparameters to find the one that provides the best predictive performance.

7. **Train and Validate the Model:**
   - Split the dataset into training and validation sets to assess the model's performance. Because the observations form a time series, prefer chronological splits or time-series cross-validation over random shuffling, and evaluate accuracy, precision, recall, and F1-score.

8. **Model Interpretation:**
   - Interpret the model's results to understand which features are most influential in predicting El Niño events. This insight can provide valuable information about the drivers of El Niño and its predictability.

9. **Predict El Niño Events:**
   - Once your model is trained and validated, you can use it to predict El Niño events for future time periods. Input the relevant features, and the model will output a prediction of whether El Niño is likely to occur.

10. **Evaluate Predictability:**
    - Assess the predictability of El Niño based on the model's performance. Calculate relevant evaluation metrics and consider the model's accuracy in predicting El Niño events.

11. **Continuous Improvement:**
    - Continue to refine and improve your model based on new data and insights. Monitoring and updating the model is essential for maintaining its predictive accuracy.
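
Step 3 can be made concrete with a simplified ONI-style rule. The sketch below assumes a 1-D monthly SST series that starts in January and covers whole years (for example, an average over a Niño box). It removes the mean seasonal cycle, smooths with a 3-month running mean, and flags months at or above a threshold. The function name `label_el_nino`, the 0.5 °C threshold, and the synthetic series are illustrative assumptions; the operational NOAA definition additionally requires the threshold to persist over several consecutive seasons.

```python
import numpy as np

def label_el_nino(sst, threshold=0.5):
    """Return (smoothed_anomaly, labels) for a monthly SST series.

    Assumes `sst` is 1-D, starts in January, and covers whole years.
    """
    sst = np.asarray(sst, dtype=float)
    if sst.ndim != 1 or sst.size % 12 != 0:
        raise ValueError("expected a 1-D series covering whole years")

    # Monthly climatology: mean of each calendar month across all years.
    by_year = sst.reshape(-1, 12)
    climatology = by_year.mean(axis=0)
    anomaly = (by_year - climatology).ravel()

    # Centered 3-month running mean; the two edge months stay NaN.
    smoothed = np.full_like(anomaly, np.nan)
    smoothed[1:-1] = (anomaly[:-2] + anomaly[1:-1] + anomaly[2:]) / 3.0

    # 1 = El Niño-like month, 0 = otherwise (edge months default to 0).
    labels = np.zeros(anomaly.shape, dtype=int)
    valid = ~np.isnan(smoothed)
    labels[valid] = smoothed[valid] >= threshold
    return smoothed, labels

# Synthetic example: a seasonal cycle plus a warm pulse during year 3.
months = np.arange(5 * 12)
sst = 27.0 + 1.0 * np.sin(2 * np.pi * months / 12)
sst[24:36] += 2.0

smoothed, labels = label_el_nino(sst)
print(labels.reshape(-1, 12))  # one row per year
```

Only the warm year is flagged in this toy series, because the pulse is the only departure from the repeating seasonal cycle.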

In summary, building a machine learning model to predict El Niño involves data exploration, feature engineering, model selection, and evaluation. Additionally, understanding the implications of El Niño on the local economy and the predictability of El Niño events is crucial for building a relevant and impactful predictive model.
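
Lagged features (step 5) and the train/validation split (step 7) are easy to get subtly wrong with time series. The following sketch assumes the predictor is a single monthly index and the target is the label `lead` months ahead. It builds a lag matrix and splits it chronologically, so every validation month is later than every training month. `make_lagged`, `chronological_split`, `n_lags`, and `lead` are hypothetical names chosen for illustration.

```python
import numpy as np

def make_lagged(series, labels, n_lags=3, lead=1):
    """Build features [x[t-n_lags+1], ..., x[t]] with target labels[t+lead]."""
    series = np.asarray(series, dtype=float)
    labels = np.asarray(labels)
    rows, targets = [], []
    for t in range(n_lags - 1, len(series) - lead):
        rows.append(series[t - n_lags + 1 : t + 1])
        targets.append(labels[t + lead])
    return np.array(rows), np.array(targets)

def chronological_split(X, y, train_fraction=0.8):
    """Keep time order: the first part trains, the remainder validates."""
    cut = int(len(X) * train_fraction)
    return X[:cut], X[cut:], y[:cut], y[cut:]

# Toy data: 10 months of an index and matching 0/1 labels.
index = np.arange(10, dtype=float)
labels = (index >= 6).astype(int)

X, y = make_lagged(index, labels, n_lags=3, lead=1)
X_train, X_val, y_train, y_val = chronological_split(X, y)
print(X.shape, y.shape)
print(X_train.shape, X_val.shape)
```

The same idea extends to gridded inputs by stacking lagged fields instead of lagged scalars.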


In [1]:
# Data retrieval

In [None]:
import cdsapi

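# Credentials: cdsapi.Client() reads the URL and key from ~/.cdsapirc
# (or the CDSAPI_URL / CDSAPI_KEY environment variables).
# Never hard-code a real key in a shared notebook.
api_key = 'Your-API-key'  # placeholder; not used by the call below

# Create a CDS API client session
c = cdsapi.Client()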

# Specify the dataset and request parameters
request = {
    'product_type': 'reanalysis',
    'format': 'netcdf',
    'variable': 'sea_surface_temperature',
    'area': [20, -100, -20, 80],  # [latitude_north, longitude_west, latitude_south, longitude_east]
    'year': list(range(1990, 2022)),  # Request data from 1990 through 2021 (the end of range() is exclusive)
    'month': list(range(1, 13)),  # Request data for all months (1 to 12)
}

# Make the API request and download the data
c.retrieve('reanalysis-era5-single-levels', request, 'download.nc')

# Close the API session
c.close()


2023-11-16 14:59:15,267 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels
2023-11-16 15:01:23,122 INFO Retrying now...
2023-11-16 15:03:30,970 INFO Retrying now...
2023-11-16 15:06:31,021 INFO Retrying now...
2023-11-16 15:09:31,071 INFO Retrying now...
2023-11-16 15:12:31,117 INFO Retrying now...
2023-11-16 15:15:31,159 INFO Retrying now...
2023-11-16 15:18:31,200 INFO Retrying now...
2023-11-16 15:21:31,247 INFO Retrying now...
2023-11-16 15:24:31,292 INFO Retrying now...
2023-11-16 15:27:31,336 INFO Retrying now...
2023-11-16 15:30:31,396 INFO Retrying now...
2023-11-16 15:33:31,481 INFO Retrying now...
2023-11-16 15:36:31,541 INFO Retrying now...
2023-11-16 15:39:31,594 INFO Retrying now...
2023-11-16 15:42:31,665 INFO Retrying now...
2023-11-16 15:45:31,709 INFO Retrying now...
2023-11-16 15:48:31,744 INFO Retrying now...
2023-11-16 15:51:31,777 INFO Retrying now...
2023-11-16 15:54:31,814 INFO Retrying now...
2023-11-16 15:57

## Model (Neural Network)

**Why Use a Neural Network (NN) for El Niño Prediction?**

- **Complex SST Patterns:** El Niño events involve intricate sea surface temperature (SST) patterns. NNs can learn such relationships directly from gridded data instead of relying on hand-crafted rules.

- **Automatic Feature Discovery:** NNs can find which SST features matter most for El Niño prediction, even when we're uncertain which ones are crucial.

- **Non-Linear Relationships:** El Niño's behavior is not linear; it involves many factors interacting in non-obvious ways. NNs can capture these intricate, non-linear connections.

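- **Temporal and Spatial Data Handling:** Architectures such as convolutional and recurrent networks are designed to exploit the spatial (location-based) and temporal (time-based) structure in SST data, which is essential for predicting El Niño. The simple dense model below does not do this; it flattens each grid and treats every cell as an independent input.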

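- **Generalization:** Once trained on historical data, an NN can be applied to new SST fields to forecast future El Niño events. Because the record is short (a few hundred monthly samples) relative to the number of grid-cell inputs, this only holds if the model is regularized and checked on held-out years.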

- **Data Integration:** NNs can seamlessly integrate various data sources, like atmospheric and oceanic data, enhancing El Niño predictions with multi-modal information.

In summary, NNs can untangle complex SST patterns, discover important features, and handle non-linear relationships, which makes them a reasonable candidate for predicting El Niño events.


In [None]:
# Import necessary libraries
import numpy as np
import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split

# Load your SST data and labels (1 for El Niño, 0 for non-El Niño)
# Replace 'X' and 'y' with your data and labels
X = ...  # SST data in the shape (samples, 40, 180)
y = ...  # Labels (1 for El Niño, 0 for non-El Niño)

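# Split the data into training and testing sets
# Note: a random split mixes neighbouring months between the two sets;
# for a forecasting-style evaluation, consider a chronological split instead.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)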

# Define the neural network model
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(40, 180)),  # Flatten the input grid
    keras.layers.Dense(128, activation='relu'),   # Fully connected layer with 128 units and ReLU activation
    keras.layers.Dense(1, activation='sigmoid')   # Output layer with 1 unit and sigmoid activation (binary classification)
])

# Compile the model
model.compile(optimizer='adam',
              loss='binary_crossentropy',  # Binary cross-entropy loss for binary classification
              metrics=['accuracy'])

# Train the model on the training data
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_test, y_test))

# Evaluate the model on the test data
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_accuracy}")
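
Accuracy alone can be misleading here because El Niño months are a minority of the record, so a model that always predicts "non-El Niño" can still score well. As a minimal sketch, the helper below computes precision, recall, and F1 from predicted probabilities. In the notebook these would come from `model.predict(X_test)`; the arrays used here are made up for illustration, and `binary_metrics` and the 0.5 cutoff are assumptions.

```python
import numpy as np

def binary_metrics(y_true, y_prob, cutoff=0.5):
    """Precision, recall, F1, and accuracy for 0/1 labels and probabilities."""
    y_true = np.asarray(y_true).astype(int).ravel()
    y_pred = (np.asarray(y_prob, dtype=float).ravel() >= cutoff).astype(int)

    tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # true positives
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # false alarms
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # missed events

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = float(np.mean(y_pred == y_true))
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}

# Made-up labels and probabilities: 3 events in 10 months.
y_true_demo = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
y_prob_demo = [0.1, 0.2, 0.1, 0.6, 0.3, 0.2, 0.9, 0.4, 0.8, 0.1]

scores = binary_metrics(y_true_demo, y_prob_demo)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the accuracy is noticeably higher than the recall, which is the gap that the extra metrics are meant to expose.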


## Create a CDS API Key
To access data from the Copernicus Climate Data Store, you'll need an API key. You can obtain one by registering on the CDS website ([CDS Website](https://cds.climate.copernicus.eu/)).

## Install the cdsapi Library
Ensure you have the cdsapi Python library installed. You can install it using pip:

In [1]:
!pip install cdsapi

## Import the cdsapi Library
Import the cdsapi library at the beginning of your Python script or Jupyter Notebook.

In [2]:
import cdsapi

## Define your CDS API Key
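Keep the API key obtained from the CDS website at hand. Note that `cdsapi.Client()` does not read this variable; it takes its credentials from the `~/.cdsapirc` file or the `CDSAPI_URL` and `CDSAPI_KEY` environment variables, unless `url` and `key` are passed explicitly.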

In [3]:
api_key = 'Your-API-key'

## Create a CDS API Client Session
Initialize a client session with the CDS API.

In [4]:
c = cdsapi.Client()
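
Before queuing a long request, it can help to confirm that credentials are actually discoverable. The sketch below is a hypothetical pre-flight helper, `find_cds_credentials`, which mimics the lookup that `cdsapi.Client()` performs when called without arguments: the `CDSAPI_URL` and `CDSAPI_KEY` environment variables first, then `url:` and `key:` lines in `~/.cdsapirc`. It is an illustration only and not part of the cdsapi library.

```python
import os
from pathlib import Path

def find_cds_credentials(rc_path=None, environ=None):
    """Return {'url': ..., 'key': ...} from the environment or an rc file.

    Settings that cannot be found are returned as None.
    """
    environ = os.environ if environ is None else environ
    rc_path = Path(rc_path) if rc_path else Path.home() / ".cdsapirc"

    creds = {"url": environ.get("CDSAPI_URL"),
             "key": environ.get("CDSAPI_KEY")}

    # Fall back to simple "name: value" lines in the rc file.
    if (creds["url"] is None or creds["key"] is None) and rc_path.is_file():
        for line in rc_path.read_text().splitlines():
            # Split on the first colon only, so "https://..." stays intact.
            name, sep, value = line.partition(":")
            name = name.strip()
            if sep and name in creds and creds[name] is None:
                creds[name] = value.strip()
    return creds

creds = find_cds_credentials()
missing = [name for name, value in creds.items() if value is None]
print("missing settings:", missing if missing else "none")
```

If anything is reported as missing, the client would be expected to fail at construction time, which is cheaper to discover here than after a long wait.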

## Specify Request Parameters
Define the dataset you want to retrieve and specify the request parameters. In this example, the request asks for sea surface temperature (SST) data over a specific geographic area and time range, in NetCDF format.

In [5]:
request = {
    'product_type': 'reanalysis',
    'format': 'netcdf',
    'variable': 'sea_surface_temperature',
    'area': [20, -100, -20, 80],  # [latitude_north, longitude_west, latitude_south, longitude_east]
    'year': list(range(1990, 2024)),  # Request data from 1990 through 2023 (the end of range() is exclusive)
    'month': list(range(1, 13)),  # Request data for all months (1 to 12)
}
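
The `area` entry is order-sensitive: the CDS API expects `[north, west, south, east]`, and swapping the two latitudes is an easy mistake. A small helper, sketched here under the hypothetical name `make_area`, can validate the bounds before they go into the request. It assumes longitudes are given in the -180 to 180 convention.

```python
def make_area(north, west, south, east):
    """Return [north, west, south, east] after basic sanity checks."""
    if not (-90 <= south < north <= 90):
        raise ValueError("need -90 <= south < north <= 90")
    for name, lon in (("west", west), ("east", east)):
        if not (-180 <= lon <= 180):
            raise ValueError(f"{name} longitude must lie in [-180, 180]")
    return [north, west, south, east]

# Keyword arguments make the intended ordering explicit at the call site.
area = make_area(north=20, west=-100, south=-20, east=80)
print(area)
```

The returned list can be assigned directly to the `'area'` key of the request dictionary.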

## Make the API Request and Download Data
Use the `c.retrieve` method to send the request to the CDS API and download the data. In this example, it retrieves ERA5 reanalysis data for sea surface temperature.

In [6]:
c.retrieve('reanalysis-era5-single-levels', request, 'download.nc')

This line of code will download the data in NetCDF format and save it as 'download.nc' in your current working directory.
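
Once the file is on disk, the gridded SST usually needs to be reduced to a regional index before labeling or modelling. The sketch below assumes the field has already been read into NumPy arrays (for example with xarray or netCDF4, not shown) as `sst[time, lat, lon]`, with longitudes in the -180 to 180 convention. It averages the field over a latitude/longitude box with cosine-latitude weights. The function name `box_mean` and the synthetic grid are illustrative; the demo uses the Niño 3.4 limits (5°S to 5°N, 170°W to 120°W), and whatever box is chosen has to lie inside the `area` that was requested.

```python
import numpy as np

def box_mean(sst, lat, lon, lat_range, lon_range):
    """Cosine-latitude weighted mean of sst[time, lat, lon] over a box."""
    sst = np.asarray(sst, dtype=float)
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)

    lat_mask = (lat >= lat_range[0]) & (lat <= lat_range[1])
    lon_mask = (lon >= lon_range[0]) & (lon <= lon_range[1])
    if not lat_mask.any() or not lon_mask.any():
        raise ValueError("box does not overlap the grid")

    box = sst[:, lat_mask][:, :, lon_mask]

    # Grid cells shrink towards the poles, so weight rows by cos(latitude).
    weights = np.cos(np.deg2rad(lat[lat_mask]))[None, :, None]
    weights = np.broadcast_to(weights, box.shape)

    # Skip NaN cells (for example land points marked as missing).
    valid = ~np.isnan(box)
    weighted_sum = np.where(valid, box * weights, 0.0).sum(axis=(1, 2))
    weight_total = (valid * weights).sum(axis=(1, 2))
    return weighted_sum / weight_total

# Synthetic 1-degree grid with three time steps; the middle one is warmer.
lat = np.arange(-20.0, 21.0, 1.0)
lon = np.arange(-180.0, 180.0, 1.0)
sst = np.full((3, lat.size, lon.size), 27.0)
sst[1] += 1.0

nino34 = box_mean(sst, lat, lon, lat_range=(-5, 5), lon_range=(-170, -120))
print(nino34)
```

The resulting 1-D series has one value per time step and can feed a thresholding or feature-building step.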

## Close the API Session
After retrieving the data, it's good practice to close the API session.

In [7]:
c.close()

These steps outline how to use the cdsapi library to access and download climate data from the Copernicus Climate Data Store. Remember to replace the example API key with your actual API key obtained from the CDS website.