## A Simple Model Workflow


In [1]:
# Run this cell if you want to follow along 
options(warn = -1)
suppressMessages(library(neotoma2))
suppressMessages(library(sf))
suppressMessages(library(geojsonsf))
suppressMessages(library(dplyr))
suppressMessages(library(ggplot2))
suppressMessages(library(leaflet))

### Goals

1. Geographic search for sites
2. Collect datasets
3. Filter for time/space/etc.
4. Get full download
5. Analyze & plot


## Search for Sites

### `get_sites()`

* Site names: `sitename = "%Lait%"` (`%` acts as a wildcard)
* Location: `loc=c()`
* Altitude: `altmin`, `altmax`
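
Where the `%` goes in a site name pattern changes which names match. The sketch below imitates that matching locally, assuming `%` behaves like a SQL `LIKE` wildcard (any run of characters). `like_match()` is a hypothetical helper written for this illustration, not part of `neotoma2`, and the second site name is only a stand-in, not a query result.

```r
# Hypothetical helper: emulate SQL LIKE-style matching, where "%" matches any run of characters.
# Assumes the pattern contains no regular-expression metacharacters other than "%".
like_match <- function(x, pattern) {
  regex <- paste0("^", gsub("%", ".*", pattern, fixed = TRUE), "$")
  grepl(regex, x)
}

site_names <- c("Lac du Lait", "Laitaure")

# Wildcards on both sides: "Lait" may appear anywhere in the name.
both_sides <- like_match(site_names, "%Lait%")

# Trailing wildcard only: the name has to start with "Lait".
prefix_only <- like_match(site_names, "Lait%")
```

Under that assumption, only the pattern with a leading `%` can find a name such as "Lac du Lait", which is why the query in the next cell uses `"%Lait%"`.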

In [2]:
laitSites <- neotoma2::get_sites(sitename = "%Lait%")
laitSites

 siteid    sitename      lat    long altitude
   3220 Lac du Lait 45.31417 6.81528     2190

In [3]:
neotoma2::plotLeaflet(laitSites)

### Location `loc=c()`

In [13]:
czGeoJson <- '{"type": "Polygon",
        "coordinates": [[
            [12.40, 50.14],[14.10, 48.64],[16.95, 48.66],
            [18.91, 49.61],[15.24, 50.99],[12.40, 50.14]]]}'
czGeoJson <- geojsonsf::geojson_sf(czGeoJson)
cz_sites <- neotoma2::get_sites(loc = czGeoJson)
neotoma2::plotLeaflet(cz_sites)

In [8]:
czWKT <- 'POLYGON ((12.4 50.14, 
                         14.1 48.64, 
                         16.95 48.66, 
                         18.91 49.61,
                         15.24 50.99,
                         12.4 50.14))'
cz_sites <- neotoma2::get_sites(loc = czWKT)
neotoma2::plotLeaflet(cz_sites)

In [11]:
czBbox <- c(12.4, 48.64, 18.91, 50.99)
cz_sites <- neotoma2::get_sites(loc = czBbox)
neotoma2::plotLeaflet(cz_sites)
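
The four numbers in the bounding box are the extremes of the polygon's coordinates, in the order xmin, ymin, xmax, ymax. As a minimal sketch, they can be derived from the polygon with `sf` rather than typed by hand. The names `cz_poly` and `cz_bbox_check` are introduced here only for illustration.

```r
suppressMessages(library(sf))

# The same polygon as the WKT string used earlier, written on one line.
cz_poly <- sf::st_as_sfc(
  "POLYGON ((12.4 50.14, 14.1 48.64, 16.95 48.66, 18.91 49.61, 15.24 50.99, 12.4 50.14))",
  crs = 4326
)

# st_bbox() returns xmin, ymin, xmax, ymax; as.numeric() drops the names and class.
cz_bbox_check <- as.numeric(sf::st_bbox(cz_poly))
cz_bbox_check
```

Because the rectangle encloses the whole polygon, a bounding-box search may return sites that the polygon search leaves out.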

In [14]:
neotoma2::plotLeaflet(cz_sites) %>%
  leaflet::addPolygons(map = .,
                       data = czGeoJson,
                       color = "green")

## Helper Functions

### `summary()`

In [15]:
neotoma2::summary(cz_sites) %>%
  DT::datatable(data = ., rownames = FALSE)

## Search for Datasets

### `get_datasets()`

* Dataset type: `datasettype = "Diatom surface sample"`
* Location: `loc=c()`
* Altitude: `altmin`, `altmax`

### `datasets()`

In [16]:
cz_datasets <- neotoma2::get_datasets(cz_sites, all_data = TRUE, verbose = FALSE)
datasets(cz_datasets) %>% 
  as.data.frame() %>% 
  DT::datatable(data = .)

In [17]:
datasets(cz_sites) %>% 
  as.data.frame() %>% 
  DT::datatable(data = .)

## Helper Functions

### `filter()`

In [None]:
cz_pollen <- cz_datasets %>% 
  neotoma2::filter(datasettype == "pollen")
neotoma2::summary(cz_pollen) %>% DT::datatable(data = .)

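**Remember** that the order in which packages are loaded makes a difference. If another package's `filter()` masks the one from `neotoma2`, you will see an error like this:
```text
Error in UseMethod("filter"): 
  no applicable method for 'filter' applied to an object of class "sites"
```

This error means that a different package's `filter()` (for example, the one from `dplyr`) ran instead of the `neotoma2` version. Calling `neotoma2::filter()` explicitly, as in the cell above, avoids the problem.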
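
To see which package a bare `filter()` call resolves to in the current session, a quick check along these lines can help. This is only a diagnostic sketch: it loads `dplyr` alone, so the answer in your own session depends on which packages you attached and in what order. `filter_owner` is a name introduced for this illustration.

```r
suppressMessages(library(dplyr))

# Namespace of the function that a bare `filter` currently points to.
filter_owner <- environmentName(environment(filter))
filter_owner

# Every attached package that defines `filter`, in search-path order.
find("filter")
```

A package-qualified call such as `neotoma2::filter()` skips this lookup entirely, so it behaves the same regardless of load order.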

## Pulling the Data

### `get_downloads()`

* Done after the preliminary filtering

In [18]:
## This line is commented out because we've already run it for you.
## cz_dl <- cz_pollen %>% get_downloads(all_data = TRUE)
cz_dl <- readRDS('data/czDownload.RDS')

In [19]:
allSamp <- samples(cz_dl)
head(allSamp, n = 2)

| | age | agetype | ageolder | ageyounger | chronologyid | chronologyname | units | value | context | element | ⋯ | area | sitenotes | description | elev | collunitid | database | datasettype | age_range_old | age_range_young | datasetnotes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<int>` | `<chr>` | `<int>` | `<int>` | `<int>` | `<chr>` | `<chr>` | `<int>` | `<chr>` | `<chr>` | ⋯ | `<int>` | `<chr>` | `<chr>` | `<int>` | `<int>` | `<chr>` | `<chr>` | `<int>` | `<int>` | `<chr>` |
| 1 | -49 | Calibrated radiocarbon years BP | | | 12777 | Clam | grains/tablet | 12542 | | concentration | ⋯ | | | Peat bog in a sandstone valley. Physiography: Valley. Surrounding vegetation: Spruce forest. | 696 | 16190 | European Pollen Database | pollen | 9078 | -49 | Data contributed by PALYCZ via Kunes Petr. |
| 2 | -49 | Calibrated radiocarbon years BP | | | 12777 | Clam | ml | 1 | | volume | ⋯ | | | Peat bog in a sandstone valley. Physiography: Valley. Surrounding vegetation: Spruce forest. | 696 | 16190 | European Pollen Database | pollen | 9078 | -49 | Data contributed by PALYCZ via Kunes Petr. |


In [20]:
names(allSamp)

## Extracting taxa

### `taxa()`

- Returns:
    * unique taxa
    * number of sites
    * number of samples
    
- `taxonid` is in `samples()` too, which makes it possible to build harmonization tables.

In [21]:
neotomatx <- neotoma2::taxa(cz_dl) %>% 
  unique()

DT::datatable(data = head(neotomatx, n = 10), rownames = FALSE)
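
Since `taxonid` appears in both `taxa()` and `samples()`, a harmonization table can be a simple lookup joined on that column. The sketch below uses small invented data frames in place of real downloads: every value, the `harmonized` column, and the choice of `siteid`, `variablename`, and `value` as the columns kept are assumptions for illustration.

```r
suppressMessages(library(dplyr))

# Toy stand-in for a few columns of samples(); all values are invented.
toy_samples <- data.frame(
  siteid = c(1, 1, 1, 2),
  taxonid = c(101, 102, 103, 101),
  variablename = c("Pinus sylvestris", "Pinus undiff.", "Betula", "Pinus sylvestris"),
  value = c(10, 5, 7, 3)
)

# Hypothetical harmonization table: one row per taxonid, mapped to a coarser name.
harmonization <- data.frame(
  taxonid = c(101, 102, 103),
  harmonized = c("Pinus", "Pinus", "Betula")
)

# Attach the harmonized name, then sum values within each site and harmonized taxon.
harmonized_counts <- toy_samples %>%
  left_join(harmonization, by = "taxonid") %>%
  group_by(siteid, harmonized) %>%
  summarise(value = sum(value), .groups = "drop")

harmonized_counts
```

Keying the lookup on `taxonid` rather than on the taxon name keeps the join stable if names are spelled or formatted differently between records.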