Forest-Fire-Research/noah

NOAH: Now Observation Assemble Horizon

Paper Link: https://doi.org/10.3390/rs18030466

TLDR: NOAH is a GenAI dataset that covers 8,742,469 km² of non-overlapping land area in Canada at 815 distinct locations, with 5 modalities at 30 m spatial resolution, where each sample covers 40,000 km² in each modality.

Earth observation and Remote Sensing (RS) data are widely used in various applications, including natural disaster modeling and prediction. Currently, there are two main types of satellites used in RS: geostationary and polar orbiting. However, the coverage of geostationary satellites is limited to a smaller region. Additionally, images from polar orbiting satellites are discontinuous, which limits their effectiveness for real-time disaster modeling, especially in rapidly evolving situations like wildfires. To address these limitations, we introduce Now Observation Assemble Horizon (NOAH), a multi-modal, sensor fusion dataset that combines Ground-Based Sensor (GBS) data from weather stations with topography, vegetation (land cover, biomass, and crown cover), and fuel types. NOAH is collated from publicly available Canadian data from Environment and Climate Change Canada (ECCC), the Spatialized Canadian National Forest Inventory (SCANFI), and Landsat 8, which are well-maintained, documented, and reliable. Models trained on NOAH can produce real-time data for disaster modeling in remote locations, complementing field instruments, and can support Generative Artificial Intelligence (GenAI) applications. The baseline model uses a UNet backbone with Feature-wise Linear Modulation (FiLM) injection of GBS data.

Each image in NOAH is 100 MB+, and there are 234K+ images, totaling 20 TB+ of data. Hence, a mini version of NOAH is available on Hugging Face. Access to the full dataset can be provided upon request; note that this requires shipping a physical hard drive. The code for the research is available on GitHub.
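As a rough consistency check of the figures above, the sketch below multiplies the quoted per-image size by the quoted image count. Both numbers are lower bounds in the text (note the "+" suffix), so the result is only a lower-bound estimate, and decimal units (1 TB = 1,000,000 MB) are an assumption.

```python
# Figures quoted in the README, both stated as lower bounds ("+" suffix).
IMAGE_SIZE_MB = 100
IMAGE_COUNT = 234_000

# Assumption: decimal units, 1 TB = 1,000,000 MB.
total_tb = IMAGE_SIZE_MB * IMAGE_COUNT / 1_000_000

print(f"estimated lower bound: {total_tb:.1f} TB")  # estimated lower bound: 23.4 TB
```

The estimate is in line with the "20 TB+" figure and illustrates why the full dataset is shipped on a physical drive rather than downloaded.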

This repository contains the following code:

  • Collation of NOAH and NOAH mini datasets
  • Code for data preprocessing
  • Code for data splitting
  • Code for data visualization
  • Code for data modelling with UNet + FiLM

Data

The dataset covers 8,742,469 km² of non-overlapping land area in Canada at 815 distinct locations. Each sample covers 40,000 km². The spatial resolution is 30 m. The coverage is shown in the figure below.

[Figure: coverage]
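To get a feel for these numbers, the sketch below converts the quoted sample area and resolution into an approximate raster size, and compares the summed sample footprints with the quoted unique area. It assumes each sample is a square tile, which the text does not state; the constant names are illustrative only.

```python
import math

# Figures quoted in the README.
SAMPLE_AREA_KM2 = 40_000
RESOLUTION_M = 30
N_LOCATIONS = 815
UNIQUE_AREA_KM2 = 8_742_469

# Assumption: each sample is a square tile.
side_km = math.sqrt(SAMPLE_AREA_KM2)            # 200 km per side
pixels_per_side = side_km * 1000 / RESOLUTION_M  # about 6,667 pixels
pixels_per_layer = pixels_per_side ** 2          # about 44.4 million pixels

# Sum of all sample footprints versus the reported unique land area.
summed_area_km2 = N_LOCATIONS * SAMPLE_AREA_KM2
ratio = summed_area_km2 / UNIQUE_AREA_KM2

print(f"tile side: {side_km:.0f} km, about {pixels_per_side:.0f} px")
print(f"pixels per layer: about {pixels_per_layer / 1e6:.1f} million")
print(f"summed footprints / unique area: {ratio:.2f}")
```

The summed footprints (32,600,000 km²) exceed the quoted 8,742,469 km², which suggests that neighbouring samples overlap and that the headline figure counts each area once. That reading is an inference from the arithmetic, not something the text states.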

Data Sources

Name                   Provider      Link
Topography             SCANFI        Source
Land Cover of Canada   NRCan         Source
Biomass                SCANFI        Source
Crown Cover            SCANFI        Source
Fuel Types             NRCan         Source
Landsat 8              NASA & USGS   Source
Weather Station        ECCC          Source

Data Sample

A sample of the data with its different modalities is shown in the figure below.

[Figure: Data Modality]

Data Repository

A sample of the smaller version of NOAH has been uploaded to Hugging Face.

Hugging Face NOAH mini

Model Architecture

To benchmark the results, a UNet + FiLM approach was used to account for the multi-modal dataset. The architecture is shown in the figure below.

[Figure: Model Architecture]
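For readers unfamiliar with FiLM, the sketch below shows the generic operation in NumPy: a conditioning vector (here standing in for weather-station readings) is projected to a per-channel scale and shift, which are then applied to an image feature map. This is the textbook formulation, not the repository's implementation; the function name, parameter names, and tensor sizes are all hypothetical, and the notebooks may inject FiLM at different stages or generate the scale and shift differently.

```python
import numpy as np

def film(features, conditioning, w_gamma, b_gamma, w_beta, b_beta):
    """Generic FiLM: scale and shift each channel of a feature map.

    features:     (C, H, W) feature map, e.g. from one UNet stage
    conditioning: (D,) vector, e.g. ground-based sensor readings
    w_*, b_*:     weights (C, D) and biases (C,) of two linear maps D -> C
    """
    gamma = w_gamma @ conditioning + b_gamma  # per-channel scale, shape (C,)
    beta = w_beta @ conditioning + b_beta     # per-channel shift, shape (C,)
    # Broadcast the per-channel values over the spatial dimensions.
    return gamma[:, None, None] * features + beta[:, None, None]

# Hypothetical sizes, chosen only for illustration.
C, H, W, D = 4, 8, 8, 3
rng = np.random.default_rng(0)
features = rng.normal(size=(C, H, W))
gbs = rng.normal(size=D)

out = film(
    features, gbs,
    w_gamma=rng.normal(size=(C, D)), b_gamma=np.ones(C),
    w_beta=rng.normal(size=(C, D)), b_beta=np.zeros(C),
)
print(out.shape)  # (4, 8, 8)
```

With zero weights, a unit scale bias, and a zero shift bias, the layer reduces to the identity, which is a common way to initialise this kind of conditioning.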

Code Initialization

The following dependencies are needed to run the code:

  • Python 3.10.12
  • Jupyter Notebooks
  • Docker
  • Docker Compose

Ideally, use the Docker deployment to run the code, as all the dependencies are preinstalled.

The code is available as notebooks.
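Before choosing between the two setups, a quick check of the local machine can help. The sketch below is a hypothetical helper, not part of the repository: it compares the running interpreter with the Python 3.10 series listed above and looks for `jupyter` and `docker` executables on the PATH. Docker Compose is not checked separately, since it is typically invoked as a `docker` subcommand.

```python
import shutil
import sys

# The README lists Python 3.10.12; only the minor series is compared here.
LISTED_PYTHON = (3, 10)

def check_environment():
    """Return a mapping of dependency name to whether it appears available."""
    return {
        "python 3.10": sys.version_info[:2] == LISTED_PYTHON,
        "jupyter": shutil.which("jupyter") is not None,
        "docker": shutil.which("docker") is not None,
    }

for name, ok in check_environment().items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

A "missing" result for Python only means the interpreter is from a different minor series than the one listed; whether other versions work is not stated in the text.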

Docker Deployment

  • Ensure port 8899 is open on your host machine.
  • Ensure the .env values are configured according to your system requirements.

git clone https://github.com/Forest-Fire-Research/noah.git
cd noah
docker compose up -d

Open the notebook at localhost:8899 to run the code.
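If the notebook does not come up, one common cause is that port 8899 is already taken on the host. The sketch below is a hypothetical helper, not part of the repository, and is meant to be run before `docker compose up`: it tries to bind the port and reports whether that succeeds. The default host of 0.0.0.0 is an assumption about how the port is published.

```python
import socket

def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permissions error.
            return False
    return True

if port_is_free(8899):
    print("port 8899 is free")
else:
    print("port 8899 is in use; stop the other service first")
```

Once the container is running, the same check is expected to report the port as in use, since Docker itself holds it.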

Local Code Execution

You need Python installed and a running Jupyter Notebook environment.

git clone https://github.com/Forest-Fire-Research/noah.git
cd noah
pip install -r requirements.txt

Once the requirements are installed, you can run the code in the notebooks.

Citation

@article{noah2026RemoteSensing,
  title={NOAH: A Multi-Modal and Sensor Fusion Dataset for Generative Artificial Intelligence in Remote Sensing},
  author={Abdul Mutakabbir and Chung-Horng Lung and Marzia Zaman and Darshana Upadhyay and Koreen Millard and Thambirajah Ravichandran and Richard Purcell},
  journal={Remote Sensing},
  year={2026},
  doi={10.3390/rs18030466}
}

Acknowledgments

This research is part of ongoing collaborative work between Carleton University, the University of Waterloo, and Dalhousie University, with industry partners Cistel Technology and Hegyi Geomatics International Inc. Additional support was received from Research Computing Services at Carleton University.
