---WORK IN PROGRESS---

EO4AI

Earth Observation preprocessing tools for AI and machine learning applications

This project provides easy-to-use tools for preprocessing datasets for image segmentation tasks in Earth Observation (EO). We hope to lower the barrier to entry for data scientists in EO by reducing the time spent reformatting datasets. EO datasets are frequently characterised by very large images, high bit-depths, non-standard label formats, pixel values in Digital Number, varied naming conventions, and other dataset-specific peculiarities, all of which slow down the development of AI applications.

This package aims to provide users with pre-prepared datasets that are immediately ready for AI and deep learning applications. The processed datasets are all:

Normalised to reflectance values
Resampled to the same resolution
Split into smaller images for quicker read times
Transformed into one-hot encoded masks
Organised into a simple directory tree structure
Documented with useful metadata and the command needed for replication
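
Two of the steps listed above, splitting scenes into smaller images and one-hot encoding the masks, can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the code used in this package: the function names `split_into_tiles` and `one_hot_encode`, the tile size, and the choice to drop partial edge tiles are all hypothetical.

```python
import numpy as np


def split_into_tiles(image, tile_size):
    """Split an (H, W, C) array into non-overlapping square tiles.

    Assumption for this sketch: edge pixels that do not fill a whole
    tile are simply dropped.
    """
    height, width = image.shape[:2]
    tiles = []
    for row in range(0, height - tile_size + 1, tile_size):
        for col in range(0, width - tile_size + 1, tile_size):
            tiles.append(image[row:row + tile_size, col:col + tile_size])
    return tiles


def one_hot_encode(label_map, num_classes):
    """Turn an (H, W) integer class map into an (H, W, num_classes) mask."""
    # Indexing the identity matrix with the class map picks one row per pixel.
    return np.eye(num_classes, dtype=np.uint8)[label_map]


# Toy 4x4 scene with 2 bands and a 3-class label map (made-up values).
scene = np.arange(32, dtype=np.float32).reshape(4, 4, 2)
labels = np.array([[0, 0, 1, 1],
                   [0, 2, 1, 1],
                   [2, 2, 0, 0],
                   [2, 2, 0, 1]])

image_tiles = split_into_tiles(scene, tile_size=2)
mask_tiles = split_into_tiles(one_hot_encode(labels, num_classes=3), tile_size=2)
```

Tiling the one-hot mask with the same function as the image keeps each image tile aligned with its mask tile.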

Cloud Masking datasets

Landsat 8: Biome (USGS, 2016)

96 manually annotated Landsat 8 scenes (~8k-by-8k pixels) from 8 different terrain types (biomes). Data are provided at 30 m resolution for all bands.
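
For Landsat 8 scenes such as these, normalising Digital Numbers to reflectance usually follows the USGS rescaling formula: reflectance = (M * DN + A) / sin(sun elevation), where M and A are the `REFLECTANCE_MULT_BAND_x` and `REFLECTANCE_ADD_BAND_x` values from the scene's MTL metadata file. The sketch below shows that formula only; the function name `dn_to_toa_reflectance` is hypothetical, the example coefficients are illustrative, and whether this package applies the sun-angle correction is not stated in this README.

```python
import math

import numpy as np


def dn_to_toa_reflectance(dn, mult, add, sun_elevation_deg):
    """Convert Landsat 8 Digital Numbers to top-of-atmosphere reflectance.

    mult and add are the per-band rescaling coefficients from the MTL file;
    sun_elevation_deg is the scene's SUN_ELEVATION value in degrees.
    """
    dn = np.asarray(dn, dtype=np.float32)
    # Linear rescaling, then correction for the solar elevation angle.
    return (mult * dn + add) / math.sin(math.radians(sun_elevation_deg))


# Illustrative values only: real coefficients must be read from each scene's metadata.
digital_numbers = np.array([[5000, 10000],
                            [20000, 30000]])
reflectance = dn_to_toa_reflectance(
    digital_numbers, mult=2.0e-5, add=-0.1, sun_elevation_deg=30.0
)
```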

Landsat 8: SPARCS (USGS, 2016)

80 manually annotated cropped Landsat 8 scenes (1k-by-1k pixels). Data are provided at 30 m resolution but do not include the sharper panchromatic band.

Landsat 7: Irish (USGS, 2016)

206 manually annotated Landsat 7 scenes from a diverse range of latitudes. Data are provided at the nominal Landsat 7 resolution of 30 m.

Sentinel-2: ALCD (Baetens et al., 2018)

38 Sentinel-2 scenes annotated through an "active learning" system. Data are provided at native band resolutions (10 m to 60 m). The dataset includes only the masks, not the parent scenes, so we provide a download tool to retrieve the relevant scenes from the Copernicus Open Access Hub, for which a username and password are needed.
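
Because Sentinel-2 bands arrive at different native resolutions, they have to be brought onto one grid before they can be stacked into a single array. The following is a minimal sketch of that idea using nearest-neighbour upsampling. The function names, the 10 m target, and the choice of nearest-neighbour interpolation are assumptions made for illustration; the package may use a different target resolution or resampling method.

```python
import numpy as np


def upsample_nearest(band, factor):
    """Upsample a 2-D band by an integer factor by repeating each pixel."""
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)


def stack_to_common_grid(bands, resolutions_m, target_m=10):
    """Resample bands of differing pixel sizes to one grid and stack them.

    Assumes every native resolution is an integer multiple of target_m,
    as is the case for Sentinel-2's 10 m, 20 m and 60 m bands.
    """
    resampled = []
    for band, resolution in zip(bands, resolutions_m):
        factor, remainder = divmod(resolution, target_m)
        if remainder or factor < 1:
            raise ValueError(
                f"{resolution} m is not an integer multiple of {target_m} m"
            )
        resampled.append(upsample_nearest(band, factor))
    return np.stack(resampled, axis=-1)


# Toy bands covering the same 60 m x 60 m footprint (made-up values).
band_10m = np.arange(36, dtype=np.float32).reshape(6, 6)
band_20m = np.arange(9, dtype=np.float32).reshape(3, 3)
band_60m = np.array([[7.0]], dtype=np.float32)

cube = stack_to_common_grid([band_10m, band_20m, band_60m], [10, 20, 60])
```

Here `cube` has one layer per input band, all on the 10 m grid, so a single pixel index refers to the same ground location in every band.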

Sentinel-2: IRIS (Francis et al., 2020)

513 subscenes from Sentinel-2. Each image and mask pair is 1022 pixels across.

Sentinel-2: KappaZeta (Domnich et al., 2021)

4403 subscenes from 155 Sentinel-2 products. Each image and mask pair is 512 pixels across at 10 m/pixel resolution.

Credits and Contributions

Please use these tools freely in your work. Give this repository an acknowledgement and always credit and cite the datasets' creators, who have put a huge amount of work into these labelled datasets!

If you have a dataset that you think would be a good fit, or would like to contribute to the repository, please post an issue, send a PR, or just get in touch!
