Cimice.Net - Datasets and code

This repository publishes the data collected within the regional project PSR 2014-2020 Op. 16.1.01 - Go Pei-Agri Focus Area 4B Pr. "Cimice.Net" (during the field seasons 2020 and 2021) and its continuation, supported by the Emilia-Romagna Producers Organizations with the coordination of Ri.Nova (during the field season 2022).

Further details about these data are available in the following paper:

Chiara Forresi, Enrico Gallinucci, Matteo Golfarelli, Lara Maistrello, Michele Preti, Giacomo Vaccari. A Data Platform for Real-Time Monitoring and Analysis of the Brown Marmorated Stink Bug in Northern Italy. Submitted for publication at Ecological Informatics.

This repository has been published following the FAIR principles.

Findability is ensured by the assignment of a globally unique and persistent identifier, generated by Zenodo and associated with the release of this repository.
Accessibility is enabled via an HTTPS connection to the GitHub repository.
Interoperability is ensured by the usage of standard languages and formats for code, data, and metadata.
Reusability is enabled by the ample documentation and the publication under the MIT license.

Project structure

datasets/   -- where datasets are stored
outputs/    -- where graphical results are stored
src/        -- source code

Datasets

The datasets are described by metadata that follow the Dublin Core standard. Each data file is associated with a .json metadata file in the same folder. Large files that do not fit within this repository are available at the link.
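As an illustration of how the metadata could be read alongside a data file, here is a minimal Python sketch. It assumes that the metadata file shares the data file's name with a .json extension, and that its keys are Dublin Core element names, possibly with a prefix such as "dc:"; neither convention is confirmed by this README. The file name in the usage note is hypothetical.

```python
import json
from pathlib import Path

# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def sidecar_path(data_file: Path) -> Path:
    """Return the assumed metadata path: same folder, same name, .json suffix."""
    return data_file.with_suffix(".json")


def load_metadata(data_file: Path) -> dict:
    """Load the JSON metadata assumed to sit next to data_file."""
    with sidecar_path(data_file).open(encoding="utf-8") as handle:
        return json.load(handle)


def dublin_core_fields(metadata: dict) -> dict:
    """Keep only the entries whose key matches a Dublin Core element name.

    Assumption: keys may carry a prefix such as "dc:" and may differ in case.
    """
    selected = {}
    for key, value in metadata.items():
        name = key.split(":")[-1].lower()
        if name in DUBLIN_CORE_ELEMENTS:
            selected[name] = value
    return selected
```

For a hypothetical file datasets/cube/example.csv, calling dublin_core_fields(load_metadata(Path("datasets/cube/example.csv"))) would return the Dublin Core entries stored in datasets/cube/example.json.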

In particular, the datasets folder contains the following:

CASE: the data collected using the CASE application.
Environment registry: a mockup of the environment data (which are not currently publishable) from Consorzio CER (Canale Emiliano-Romagnolo), showing which information is accessed by the code.
Satellite images: the rasters of satellite images collected from the ESA (European Space Agency); due to size limitations, this repository contains only the metadata, while the files are available at this link.
Weather: the weather data collected from ARPAE (the Emilia-Romagna Regional Agency for Prevention, Environment, and Energy); due to size limitations, this repository contains only the metadata, while the files are available at this link.
Cube: the multidimensional cube generated from the previous datasets; the dimension of the traps is also available as a shapefile.

Code

The code is structured in three main parts:

Multidimensional cube generation. This part takes as input the data located in the CASE, environment registry, satellite images, and weather datasets, generates the multidimensional cube, and saves it in the datasets/cube folder. All the code is in the folder src/main/scala/it/unibo/big; to run it, execute the src/main/scala/it/unibo/big/GenerateCube.scala class. This code must be executed on a machine with Spark (or on a Spark cluster), using the versions specified in the build.gradle file and a Java 8 JDK that supports TLSv1.3.
Analytical processes. This part takes as input the data located in the datasets/cube folder and generates the graphs that are saved in the outputs folder. All the code is in the folder src/main/python; to run it, you need to:
- build the docker container: docker build -t graph-container src/main/python
- run the docker container, using the same image name given to docker build:
  - WINDOWS:
    - docker run -v %cd%/src/main/python:/app -v %cd%/datasets/cube:/app/datasets/cube -v %cd%/outputs/graphs:/app/graphs graph-container
  - LINUX:
    - docker run -v $(pwd)/src/main/python:/app -v $(pwd)/datasets:/app/datasets -v $(pwd)/outputs/graphs:/app/graphs graph-container
Traps shapefile generation. This part takes as input the data located in the datasets/cube folder and generates the shapefile dataset that is saved in the datasets/cube/shapefile folder. The code is in the folder src/main/bash; to run it, you need to:
- build the docker container: docker build -t ogr2ogr-container src/main/bash
- run the docker container:
  - WINDOWS:
    - docker run -v %cd%/datasets/cube:/app/input -v %cd%/datasets/cube:/app/output ogr2ogr-container
  - LINUX
    - docker run -v $(pwd)/datasets/cube:/app/input -v $(pwd)/datasets/cube:/app/output ogr2ogr-container
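The Windows and Linux commands above differ mainly in how the current directory is written (%cd% versus $(pwd)). As a hedged alternative, the following Python sketch builds the same kind of docker run command on either platform from the repository root. The function names and the SHAPEFILE_MOUNTS constant are illustrative and not part of this repository; the mounts mirror the shapefile generation command above.

```python
import subprocess
from pathlib import Path

# Mounts of the shapefile generation step: (path relative to the repository
# root, path inside the container), copied from the commands above.
SHAPEFILE_MOUNTS = [
    ("datasets/cube", "/app/input"),
    ("datasets/cube", "/app/output"),
]


def docker_run_command(image: str, mounts: list[tuple[str, str]], root: Path) -> list[str]:
    """Build a `docker run` argument list with one -v flag per mount.

    `root` should be the absolute path of the repository root, since Docker
    bind mounts expect absolute host paths.
    """
    command = ["docker", "run"]
    for host_relative, container_path in mounts:
        host_path = root / host_relative
        command += ["-v", f"{host_path}:{container_path}"]
    command.append(image)
    return command


def run_container(image: str, mounts: list[tuple[str, str]]) -> None:
    """Run the container from the current directory, failing on a non-zero exit."""
    subprocess.run(docker_run_command(image, mounts, Path.cwd()), check=True)
```

Calling run_container("ogr2ogr-container", SHAPEFILE_MOUNTS) from the repository root, after the image has been built, would then be equivalent to the commands listed above. The same helper works for the analytical processes by passing that step's image name and its three mounts.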
