Repo structure #3

peterdesmet · 2019-02-15T09:43:47Z

├── README.md
├── LICENCE
├── .gitignore
│
├── data
│   │
│   ├── raw
│   │   ├── modelling_species.csv
│   │   └── (do not include GBIF data in repo)
│   │
│   ├── interim (.gitignore)
│   │   └── occ with assigned coordinates + grid => S3?
│   │
│   └── processed
│       ├── cube_europe.csv
│       ├── cube_belgium.csv
│       └── cube_belgium_baseline.csv
│
└── src
    │
    ├── belgium
    │   ├── download.Rmd: define filters, trigger download
    │   ├── create_db.Rmd: create sqlite, fill with data, filter on issues
    │   ├── assign_grid.Rmd: assign coordinates, assign grid (chunk based)
    │   └── aggregate.Rmd: (filter on taxa), agg for alien, agg for baseline
    │
    └── europe
        ├── download.Rmd
        ├── create_db.Rmd
        ├── assign_grid.Rmd
        └── aggregate.Rmd
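
To try this layout locally, a minimal shell sketch that scaffolds it could look like the following. `REPO_ROOT` is a made-up variable that defaults to a throwaway temporary directory, and the files it creates are empty placeholders, not the actual notebooks. The `.gitignore` pattern is an assumption based on the `(.gitignore)` note next to `interim`.

```shell
#!/bin/sh
# Scaffold the proposed layout. REPO_ROOT is a hypothetical variable; it
# defaults to a temporary directory so nothing real is touched.
REPO_ROOT="${REPO_ROOT:-$(mktemp -d)}"

# Data folders: raw inputs, ignored intermediate files, final cubes.
mkdir -p "$REPO_ROOT/data/raw" "$REPO_ROOT/data/interim" "$REPO_ROOT/data/processed"

# One pipeline folder per region, each with the same four notebooks.
for region in belgium europe; do
  mkdir -p "$REPO_ROOT/src/$region"
  for step in download create_db assign_grid aggregate; do
    touch "$REPO_ROOT/src/$region/$step.Rmd"
  done
done

# Top-level files from the tree.
touch "$REPO_ROOT/README.md" "$REPO_ROOT/LICENCE"

# Keep intermediate data out of version control (assumed pattern).
printf 'data/interim/\n' > "$REPO_ROOT/.gitignore"

echo "Scaffolded layout in $REPO_ROOT"
```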

damianooldoni · 2019-02-15T15:11:37Z

We should add a notebook for filtering after download (e.g. absences, coordinate issues), or just add the filtering at the beginning of assign_coord.ipynb.

damianooldoni · 2019-02-21T09:12:35Z

@peterdesmet: until now we put the file listing all triggered downloads (gbif_downloads.tsv) in ./data/output. However, ./data/output no longer exists: it was replaced by ./data/processed, the standard folder from the data science cookiecutter template, which contains:

The final, canonical data sets for modeling.

I don't like mixing the cubes with this file of all GBIF downloads.
So, should we put it in ./data/processed anyway, or maybe use the folder ./references, which contains

Data dictionaries, manuals, and all other explanatory materials.

This is not completely correct either, but it may be a better solution. The pipeline for triggering a download is ready in the branch https://github.com/trias-project/occ-processing/tree/add_download_pipeline_belgium. This is the only doubt I have before opening a PR.

peterdesmet · 2019-02-21T09:18:36Z

I would put it in data/interim.
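
Since data/interim is marked as ignored in the proposed tree, committing gbif_downloads.tsv there would presumably need an exception in .gitignore. A minimal sketch, assuming the folder is ignored through a pattern in the top-level .gitignore (the actual file in the repo may differ). `DEMO_DIR` and `occurrences.csv` are hypothetical names used only for the demonstration.

```shell
#!/bin/sh
# Demo in a throwaway directory; DEMO_DIR is a hypothetical name.
DEMO_DIR="$(mktemp -d)"
cd "$DEMO_DIR" || exit 1

# Ignore the contents of data/interim, but re-include the downloads list.
# The pattern must be "data/interim/*" rather than "data/interim/":
# git cannot re-include a file whose parent directory is itself excluded.
cat > .gitignore <<'EOF'
data/interim/*
!data/interim/gbif_downloads.tsv
EOF

# Optional check, only run when git is installed and can init a repo here.
if command -v git >/dev/null 2>&1 && git init -q . 2>/dev/null; then
  if git check-ignore -q data/interim/occurrences.csv; then
    echo "occurrences.csv: ignored"
  fi
  if ! git check-ignore -q data/interim/gbif_downloads.tsv; then
    echo "gbif_downloads.tsv: not ignored"
  fi
fi
```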

damianooldoni mentioned this issue Feb 18, 2019

From occurrences and AreaOfOccupancy to an emerging species decision rule at year level trias-project/indicators#49

Closed

damianooldoni added a commit that referenced this issue Feb 21, 2019

Add gbif_downloads.tsv and library(here)

b5e45db

Copied from my local version in trias-project/indicators/data/output. See #3 for possible other location in repo.
