AI-Downscale is a research project aimed at developing a machine learning model to downscale coarse-resolution Global Climate Model (GCM) outputs to high-resolution regional climate projections over Europe. By integrating physical constraints and enhancing model interpretability, the project addresses the limitations of traditional downscaling methods. The developed Convolutional Neural Network (CNN) model incorporates domain knowledge and ensures generalizability across different regions and climate scenarios. This repository contains all the code, documentation, and resources needed to replicate the study and apply the model to your own climate data analysis projects.
- Abstract
- Features
- Installation
- Data Acquisition and Preparation
- Running the Code
- Project Structure
- Contributing
- License
- Acknowledgments
- Contact
- High-Resolution Downscaling: Converts coarse GCM outputs to high-resolution climate projections.
- Physical Constraints: Incorporates physical laws into the machine learning model to ensure physical consistency (a toy illustration follows this list).
- Model Interpretability: Utilizes SHAP and LIME for model explanation and transparency.
- Generalizability: Designed to work across different regions and future climate scenarios.
- Open Source: All code and resources are available for use and adaptation under the MIT License.
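To make the physical-constraints idea above concrete, here is a toy NumPy sketch of a loss that adds a consistency penalty to the usual data-fit term. The non-negativity constraint on precipitation and the weight `lambda_phys` are illustrative assumptions only; the project's actual constrained loss is implemented in `src/models/custom_loss.py`.

```python
import numpy as np

def physics_informed_loss(y_true, y_pred, lambda_phys=0.1):
    """Toy loss: data-fit MSE plus a penalty for physically impossible values."""
    mse = np.mean((y_true - y_pred) ** 2)
    # Assumed constraint for illustration: downscaled precipitation must be non-negative.
    phys_penalty = np.mean(np.maximum(0.0, -y_pred) ** 2)
    return mse + lambda_phys * phys_penalty

# Dummy fields shaped (batch, lat, lon); some predictions are slightly negative.
y_true = np.random.rand(4, 32, 32)
y_pred = np.random.rand(4, 32, 32) - 0.1
print(physics_informed_loss(y_true, y_pred))
```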
- Operating System: Linux, macOS, or Windows
- Python: Version 3.7 or higher
- Git: Version control system
- Anaconda or Miniconda (Recommended): For environment management
- Disk Space: At least 50 GB free space for data storage and processing
- Memory: 16 GB RAM or higher recommended for processing large datasets
Open a terminal and run:
git clone https://github.com/Rezaian/AI-Downscale.git
cd AI-Downscale
Using Anaconda/Miniconda:
conda create -n ai-downscale python=3.8
conda activate ai-downscale
Alternatively, using `venv`:
python -m venv ai-downscale
source ai-downscale/bin/activate # On Windows use `ai-downscale\Scripts\activate`
Install the required Python packages:
pip install -r requirements.txt
Due to the large size of climate datasets, data is not included in the repository. You will need to download the data from the following sources:
- CMIP6 GCM Data: Obtain low-resolution GCM outputs.
  - Access via ESGF nodes: ESGF Data Portal (see the search sketch below)
  - Variables: Near-surface air temperature (`tas`), precipitation (`pr`), etc.
- ERA5 Reanalysis Data: Obtain high-resolution observational data.
  - Access via the Copernicus Climate Data Store: ERA5 Data
  - Variables: 2m temperature, total precipitation, etc.
- Additional Data:
  - Topography: Elevation data from ETOPO1
  - Land-Sea Mask: From ERA5 or other reliable sources
Use the `data/scripts/data_acquisition.py` script to guide you through the process. Modify the script according to your specific needs and data access protocols.
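The `data_acquisition.py` script is the authoritative starting point. Purely as an illustration of what the ESGF search step could look like, the sketch below uses the `esgf-pyclient` package; the index node URL, facet values, and variable are assumptions for demonstration, not the project's actual query.

```python
from pyesgf.search import SearchConnection  # pip install esgf-pyclient

# Illustrative search: monthly near-surface air temperature (tas) from CMIP6.
conn = SearchConnection("https://esgf-node.llnl.gov/esg-search", distrib=True)
ctx = conn.new_context(
    project="CMIP6",
    experiment_id="historical",
    variable_id="tas",
    frequency="mon",
)
print(f"{ctx.hit_count} matching datasets")

# List a few download URLs from the first matching dataset.
files = ctx.search()[0].file_context().search()
for f in list(files)[:3]:
    print(f.download_url)
```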
Note: You may need to register and agree to data usage terms.
Use the Copernicus Climate Data Store API (`cdsapi`) to download ERA5 data; instructions are available on the CDS website, and a minimal sketch follows below.
Download topography and land-sea mask data as required.
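For the ERA5 step, a minimal `cdsapi` sketch is shown below. The variables, period, and European bounding box are illustrative assumptions; adapt them to your needs. A registered CDS account with an API key stored in `~/.cdsapirc` is required.

```python
import cdsapi  # pip install cdsapi; needs ~/.cdsapirc with your CDS API key

c = cdsapi.Client()

# Illustrative request: hourly 2m temperature and total precipitation for
# January 2000 over a rough European domain (North, West, South, East).
c.retrieve(
    "reanalysis-era5-single-levels",
    {
        "product_type": "reanalysis",
        "variable": ["2m_temperature", "total_precipitation"],
        "year": "2000",
        "month": "01",
        "day": [f"{d:02d}" for d in range(1, 32)],
        "time": [f"{h:02d}:00" for h in range(24)],
        "area": [72, -15, 30, 45],
        "format": "netcdf",  # on the newer CDS infrastructure this key may be "data_format"
    },
    "data/raw/era5/era5_jan2000.nc",
)
```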
Once you have downloaded the data, preprocess it using the provided scripts.
python data/scripts/data_preprocessing.py --config configs/data_config.yaml
This script will:
- Regrid CMIP6 data to match the resolution of the ERA5 data (see the sketch after this list).
- Normalize and standardize datasets.
- Handle missing values.
- Generate additional features like topography and time variables.
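To make the regridding and normalization steps concrete, here is a minimal xarray sketch. The file names follow the repository layout, the variable name `tas` follows the examples above, and bilinear `interp_like` is only one assumed way to regrid; `data/scripts/data_preprocessing.py` remains the authoritative implementation.

```python
import xarray as xr

# Hypothetical inputs following the repository layout.
cmip6 = xr.open_dataset("data/raw/cmip6/tas_example.nc")
era5 = xr.open_dataset("data/raw/era5/era5_jan2000.nc")

# Regrid coarse CMIP6 temperature onto the ERA5 grid (bilinear interpolation).
tas_hr = cmip6["tas"].interp_like(era5, method="linear")

# Standardize to zero mean / unit variance for training.
tas_norm = (tas_hr - tas_hr.mean()) / tas_hr.std()

# Simple missing-value handling: fill remaining gaps with the (standardized) mean.
tas_norm = tas_norm.fillna(0.0)

tas_norm.to_netcdf("data/processed/tas_regridded.nc")
```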
- Configure Training Parameters: Edit the `configs/train_config.yaml` file to set your training parameters, such as batch size, number of epochs, learning rate, and data paths (an illustrative example follows this list).
- Start Training: Run the training script:
  python scripts/train_model.py --config configs/train_config.yaml
  Options:
  - Use `--config` to specify a different configuration file if needed.
- Monitor Training: Training progress, including loss and metric values, will be displayed in the console. Model checkpoints will be saved to the path specified in the configuration file.
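The exact keys in `configs/train_config.yaml` are defined by the repository. Purely to illustrate the kinds of settings mentioned above (batch size, epochs, learning rate, data paths), a hypothetical config could be read like this:

```python
import yaml  # PyYAML

# Hypothetical example of train_config.yaml contents; the real keys may differ.
example_config = """
batch_size: 32
epochs: 100
learning_rate: 0.001
data:
  train: data/processed/train.nc
  val: data/processed/val.nc
checkpoint_dir: checkpoints/
"""

config = yaml.safe_load(example_config)
print(config["batch_size"], config["learning_rate"], config["data"]["train"])
```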
- Configure Evaluation Parameters: Edit the `configs/evaluate_config.yaml` file to set the evaluation parameters and data paths.
- Run Evaluation:
  python scripts/evaluate_model.py --config configs/evaluate_config.yaml
  The script will output evaluation metrics such as MSE, MAE, and R² (their definitions are sketched after this list).
- Visualize Results: Use the Jupyter notebook `notebooks/model_evaluation.ipynb` to generate detailed visualizations and analyses:
  jupyter notebook notebooks/model_evaluation.ipynb
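For reference, the three reported metrics have standard definitions; the standalone NumPy sketch below mirrors them and is independent of the repository's own implementation.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Standard definitions of the reported metrics: MSE, MAE, and R²."""
    y_true, y_pred = np.ravel(y_true), np.ravel(y_pred)
    mse = np.mean((y_true - y_pred) ** 2)
    mae = np.mean(np.abs(y_true - y_pred))
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {"MSE": mse, "MAE": mae, "R2": 1.0 - ss_res / ss_tot}

print(evaluate(np.array([1.0, 2.0, 3.0]), np.array([1.1, 1.9, 3.2])))
```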
- Run the Interpretation Notebook: Open the `notebooks/model_evaluation.ipynb` notebook and execute the cells related to model interpretability.
- Analyze SHAP Values: The notebook will guide you through computing SHAP values to understand feature importance and model decision-making processes.
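As a rough, standalone illustration of the SHAP workflow the notebook walks through, the sketch below applies `shap.KernelExplainer` to a placeholder prediction function with a small background sample. The function and array shapes are assumptions; the notebook may well use an explainer better suited to a CNN (e.g., a gradient-based one).

```python
import numpy as np
import shap  # pip install shap

# Placeholder prediction function standing in for the trained downscaling model.
def predict(X):
    return X.sum(axis=1)

# Small background sample and a few instances to explain (flattened predictors).
background = np.random.rand(20, 8)
X_explain = np.random.rand(5, 8)

explainer = shap.KernelExplainer(predict, background)
shap_values = explainer.shap_values(X_explain)

# Per-feature contribution summary.
shap.summary_plot(shap_values, X_explain)
```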
AI-Downscale/
├── README.md
├── LICENSE
├── requirements.txt
├── configs/
│ ├── data_config.yaml
│ ├── train_config.yaml
│ └── evaluate_config.yaml
├── data/
│ ├── raw/
│ │ ├── cmip6/
│ │ └── era5/
│ ├── processed/
│ └── scripts/
│ ├── data_acquisition.py
│ └── data_preprocessing.py
├── notebooks/
│ ├── data_exploration.ipynb
│ └── model_evaluation.ipynb
├── src/
│ ├── models/
│ │ ├── downscaling_model.py
│ │ └── custom_loss.py
│ ├── utils/
│ │ ├── data_preprocessing.py
│ │ ├── feature_engineering.py
│ │ └── interpretability.py
│ └── main.py
├── scripts/
│ ├── train_model.py
│ └── evaluate_model.py
└── docs/
├── references.md
└── images/
- configs/: Configuration files for data processing, training, and evaluation.
- data/: Data storage and processing scripts.
- notebooks/: Jupyter notebooks for data exploration and model evaluation.
- src/: Source code for models and utilities.
- scripts/: Command-line scripts for training and evaluation.
- docs/: Documentation and references.
Contributions are welcome! Please follow these steps:
- Fork the Repository
- Create a Feature Branch:
  git checkout -b feature/your-feature-name
- Commit Your Changes:
  git commit -am 'Add new feature'
- Push to the Branch:
  git push origin feature/your-feature-name
- Open a Pull Request
This project is licensed under the MIT License. See the LICENSE file for details.
- Data Providers:
- CMIP6 Project: For providing GCM outputs.
- Copernicus Climate Change Service: For providing ERA5 reanalysis data.
- Research Inspiration: Based on the project proposal "AI-Downscale: Machine Learning Approaches for High-Resolution Regional Climate Projections" by [Principal Investigator's Name].
For questions or assistance, please contact:
- Email: Rezaian@ut.ac.ir
- GitHub: Rezaian
Note: Ensure compliance with data usage agreements and licenses when downloading and using climate datasets. The user is responsible for obtaining necessary permissions and complying with all applicable laws and regulations.