# sampley exemplar: Stage 3
Before going through this exemplar, please consult the Introduction to sampley exemplars (```intro.ipynb```).
<br>This exemplar illustrates the classes that can be made in Stage 3, their class methods, their attributes, and their methods. 

## Setup

### Import the package

In [1]:
from sampley import *

### Set the input folder
To run this exemplar, download the mock data files, put them in a folder, and set the path to the folder below.

In [2]:
input_folder = './input/'

### Set the output folder
To run this exemplar, make a folder to save the outputs in and set the path to the folder below.

In [3]:
output_folder = './output/'
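If the output folder does not exist yet, it can be created from Python rather than by hand. The following is a minimal sketch using only the standard library; the helper name `ensure_folder` is ours and is not part of sampley.

```python
from pathlib import Path

def ensure_folder(path):
    """Create the folder (and any missing parent folders) if it does not exist yet."""
    folder = Path(path)
    folder.mkdir(parents=True, exist_ok=True)
    return folder

# './output/' matches the value assigned to output_folder above
ensure_folder('./output/')
```

Calling the helper again on an existing folder does nothing, so the cell can be re-run safely.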

### Make DataPoints, Sections, and delimiters (Stages 1 and 2)
Before making any objects in Stage 3, we must first complete Stages 1 and 2. For this exemplar, we make one ```DataPoints``` object and one ```Sections``` object below and then use them to make all the subsequent delimiters. See the Stage 1 and Stage 2 exemplars and the User Manual for more details on Stages 1 and 2.

In [4]:
u_sightings = DataPoints.from_file(
    filepath=input_folder+'sightings.gpkg',
    crs_working='EPSG:32619',
    datetime_col='datetime',
    tz_input='UTC-05:00',
    tz_working='UTC-05:00'
)

Opening file...
Success: file successfully input.
Reprojecting CRS...
Note: reprojection to CRS 'EPSG:32619' not necessary as already in CRS 'EPSG:32619'.
Parsing datetimes...
Success: the column 'datetime' successfully reformatted to datetimes.
Success: the timezone of column 'datetime' successfully set to 'UTC-05:00'.
Converting timezone...
Note: conversion of column 'datetime' to timezone 'UTC-05:00' is not necessary as it is already in timezone 'UTC-05:00'.
Success: datapoint IDs successfully generated.


In [5]:
u_sections = Sections.from_file(
    filepath=input_folder+'sections.gpkg',
    crs_working='EPSG:32619',
    datetime_col='datetime_beg',
    tz_input='UTC-05:00',
    tz_working='UTC-05:00'
)

Opening file...
Success: file successfully input.
Reprojecting CRS...
Success: reprojected to CRS 'EPSG:32619'
Parsing datetimes...
Success: the column 'datetime_beg' successfully reformatted to datetimes.
Success: the timezone of column 'datetime_beg' successfully set to 'UTC-05:00'.
Converting timezone...
Note: conversion of column 'datetime_beg' to timezone 'UTC-05:00' is not necessary as it is already in timezone 'UTC-05:00'.
Note: column 'datetime_beg' renamed to 'datetime'.
Success: section IDs successfully generated.


In [6]:
u_periods = Periods.delimit(
    extent=u_sections,
    unit='day',
    num=8)
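To make the effect of ```unit='day'``` and ```num=8``` concrete, the sketch below splits a date range into consecutive 8-day periods with the standard library. It is an illustration only, not sampley's implementation: the function `make_periods` is hypothetical, timezones are ignored, and the begin and end values are copied from the ```periods_extent``` parameter printed later in this exemplar. The midpoint rule (begin plus half the period, minus one second) is an assumption chosen because it reproduces the ```datetime_mid``` values in the samples output.

```python
from datetime import datetime, timedelta

def make_periods(beg, end, num_days):
    """Split [beg, end] into consecutive periods of num_days days.

    Returns a list of (period_beg, period_mid, period_end) tuples.
    """
    step = timedelta(days=num_days)
    one_second = timedelta(seconds=1)
    periods = []
    period_beg = beg
    while period_beg <= end:
        period_end = period_beg + step - one_second
        period_mid = period_beg + step / 2 - one_second  # assumed midpoint rule
        periods.append((period_beg, period_mid, period_end))
        period_beg += step
    return periods

periods = make_periods(
    beg=datetime(2019, 1, 25, 0, 0, 0),
    end=datetime(2019, 3, 5, 23, 59, 59),
    num_days=8)

for period_beg, period_mid, period_end in periods:
    print(period_beg, period_mid, period_end)
```

With these inputs the range divides into five periods, the first running from 2019-01-25 00:00:00 to 2019-02-01 23:59:59.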

In [7]:
u_cells = Cells.delimit(
    extent=u_sections,
    var='hexagonal',
    side=5000,
    buffer=2000)
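As a rough guide to what ```side=5000``` means for a hexagonal cell, the sketch below computes the vertices, width, and area of a regular hexagon with 5000 m sides. The orientation (a vertex pointing up) is inferred from the polygon coordinates printed later in this exemplar rather than from the sampley documentation, and `hexagon_vertices` is a hypothetical helper. The centre used is the centroid of cell ```c013-h5000m``` from that output.

```python
import math

def hexagon_vertices(cx, cy, side):
    """Return the six vertices of a regular hexagon centred on (cx, cy),
    starting at the top vertex and going clockwise."""
    vertices = []
    for k in range(6):
        angle = math.radians(90 - 60 * k)
        vertices.append((cx + side * math.cos(angle), cy + side * math.sin(angle)))
    return vertices

side = 5000  # metres, as passed to Cells.delimit above
vertices = hexagon_vertices(602018.668, 4694798.087, side)
width = math.sqrt(3) * side            # distance between opposite edges
area = 1.5 * math.sqrt(3) * side ** 2  # area of one cell in square metres

print(vertices[0], vertices[1])
print(round(width, 3), round(area))
```

The width, about 8660.254 m, equals the spacing between the x coordinates of neighbouring centroids in the samples output (for example, 610678.922 minus 602018.668).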

In [8]:
u_segments = Segments.delimit(
    sections=u_sections,
    var='simple',
    target=10000,
    randomise=True)

In [9]:
u_presences = Presences.delimit(
    datapoints=u_sightings,
    presence_col='individuals')
u_presences.thin(
    sp_threshold=10000,
    tm_threshold=5,
    tm_unit='day')

Conducting spatiotemporal thinning...
Thinning complete.
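The idea behind spatiotemporal thinning can be sketched as follows. This is one simple greedy variant written for illustration, under the assumption that a point is dropped only when it is both closer than the spatial threshold and closer than the temporal threshold to a point already kept. sampley's own procedure may differ (for example, in how it chooses which points to keep). The function `thin` and the points are made up.

```python
import math
from datetime import date

def thin(points, sp_threshold, tm_threshold_days):
    """Greedy thinning: keep a point only if no already-kept point is both
    closer than sp_threshold and fewer than tm_threshold_days days away."""
    kept = []
    for pid, x, y, day in points:
        clashes = any(
            math.hypot(x - kx, y - ky) < sp_threshold
            and abs((day - kday).days) < tm_threshold_days
            for _, kx, ky, kday in kept)
        if not clashes:
            kept.append((pid, x, y, day))
    return kept

# Made-up points: (id, x in metres, y in metres, date)
points = [
    ('a', 0, 0, date(2019, 1, 25)),
    ('b', 3000, 4000, date(2019, 1, 26)),  # 5000 m and 1 day from 'a'
    ('c', 3000, 4000, date(2019, 2, 5)),   # 5000 m but 11 days from 'a'
    ('d', 20000, 0, date(2019, 1, 25)),    # same day but 20000 m from 'a'
]

kept = thin(points, sp_threshold=10000, tm_threshold_days=5)
print([pid for pid, *_ in kept])  # ['a', 'c', 'd']
```

Only point `b` is removed, because it is the only one that falls within both thresholds of a kept point.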


In [10]:
u_absencelines = AbsenceLines.delimit(
    sections=u_sections,
    presences=u_presences,
    sp_threshold=10000,
    tm_threshold=5,
    tm_unit='day',
)

Note: absence lines to be generated with a temporal threshold of 5 day(s).


In [11]:
u_absences = Absences.delimit(
    absencelines=u_absencelines,
    var='along',
    target=20,
    dfls=None)
u_absences.thin(
    sp_threshold=10000,
    tm_threshold=5,
    tm_unit='day',
    target=9)

Conducting spatiotemporal thinning...
Thinning complete.


## Samples - grid approach

### Make a ```Samples``` object via the grid approach...
If using the grid approach, we can make a ```Samples``` object from a ```DataPoints``` object with the ```Samples.grid()``` class method.
<br>We can also make a ```Samples``` object with measures of survey effort from a ```Sections``` object with the ```Samples.grid_se()``` class method.
<br>Additionally, we can merge multiple ```Samples``` objects into a single new ```Samples``` object with the ```Samples.merge()``` class method.

#### ...from a ```DataPoints``` object

In [12]:
u_samples_sightings = Samples.grid(
    datapoints=u_sightings,
    cells=u_cells,
    periods=u_periods,
    cols={'individuals': 'sum'})
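Conceptually, ```cols={'individuals': 'sum'}``` asks for the ```individuals``` values of all datapoints that fall in the same cell and period to be added together. The sketch below shows that aggregation in plain Python on made-up records that are assumed to have already been assigned a cell and a period; the IDs and counts are hypothetical, and this is not sampley's code.

```python
from collections import defaultdict

# Made-up datapoints, each already assigned to a cell and a period
datapoints = [
    {'cell_id': 'c014', 'period_id': 'p1', 'individuals': 1},
    {'cell_id': 'c014', 'period_id': 'p1', 'individuals': 3},
    {'cell_id': 'c015', 'period_id': 'p1', 'individuals': 2},
    {'cell_id': 'c014', 'period_id': 'p2', 'individuals': 5},
]

# Sum 'individuals' per (cell, period) combination
totals = defaultdict(int)
for dp in datapoints:
    totals[(dp['cell_id'], dp['period_id'])] += dp['individuals']

for (cell_id, period_id), individuals in sorted(totals.items()):
    print(cell_id, period_id, individuals)
```

Each (cell, period) combination that contains at least one datapoint becomes one sample.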

#### ...from a ```Sections``` object

In [13]:
u_samples_effort = Samples.grid_se(
    sections=u_sections,
    cells=u_cells,
    periods=u_periods,
    esw=None)

#### ...by merging multiple ```Samples``` objects

In [14]:
u_samples = Samples.merge(
    sightings=u_samples_sightings,
    effort=u_samples_effort)


Note: samples generated with the grid approach
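In the ```samples``` attribute printed further down, some rows have a value for ```se_length``` but none for ```individuals```. That is consistent with the merge keeping every sample that appears in either input. The sketch below illustrates that reading with plain dictionaries; treating the merge as a union of keys is our assumption, and the values are made up.

```python
# Made-up per-sample values keyed by (cell_id, period_id)
sightings = {('c014', 'p1'): 4, ('c015', 'p1'): 2}
effort = {('c013', 'p1'): 4000.3, ('c014', 'p1'): 8662.1, ('c015', 'p1'): 8661.5}

merged = []
for key in sorted(set(sightings) | set(effort)):
    merged.append({
        'cell_id': key[0],
        'period_id': key[1],
        'individuals': sightings.get(key),  # None if the sample has no sightings
        'se_length': effort.get(key),       # None if the sample has no effort
    })

for row in merged:
    print(row)
```

Here the sample for `c013` keeps its effort value and has no value for `individuals`.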


### Access a ```Samples``` object's attributes
A ```Samples``` object, regardless of how it was made, will have three attributes (```name```, ```parameters```, and ```samples```) that we can access as follows.

In [15]:
u_samples.name

'samples-sightings+effort-x-cells-h5000m-x-periods-8d'

In [16]:
u_samples.parameters

{'names': 'samples-datapoints-sightings-x-cells-h5000m-x-periods-8d+samples-sections-sections-x-cells-h5000m-x-periods-8d',
 'approach': 'grid',
 'resampled': 'datapoints; effort',
 'datapoints_name': 'datapoints-sightings; nan',
 'datapoints_filepath': './input/sightings.gpkg; nan',
 'datapoints_crs': 'EPSG:32619; nan',
 'datapoints_tz': 'UTC-05:00; nan',
 'datapoints_data_cols': 'individuals; nan',
 'cells_name': 'cells-h5000m',
 'cells_crs': 'EPSG:32619',
 'cells_extent': '493765.49253164633, 4689798.086839909, 671300.7003074563, 4759798.086839909',
 'cells_extent_source': 'Sections - sections-sections',
 'cells_var': 'hexagonal',
 'cells_side': '5000',
 'cells_unit': 'metre',
 'cells_buffer': '2000',
 'periods_name': 'periods-8d',
 'periods_tz': 'UTC-05:00',
 'periods_extent': '2019-01-25 00:00:00-2019-03-05 23:59:59',
 'periods_extent_source': 'Sections - sections-sections',
 'periods_number': '8',
 'periods_unit': 'day',
 'cols': "{'individuals': 'sum'}; nan",
 'sections_name': '

In [17]:
u_samples.samples

Unnamed: 0,cell_id,polygon,centroid,period_id,datetime_beg,datetime_mid,datetime_end,individuals,se_length
0,c013-h5000m,"POLYGON ((602018.668 4699798.087, 606348.795 4...",POINT (602018.668 4694798.087),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-05 23:59:59-05:00,2019-02-09 23:59:59-05:00,,4000.286022
1,c014-h5000m,"POLYGON ((610678.922 4699798.087, 615009.049 4...",POINT (610678.922 4694798.087),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-05 23:59:59-05:00,2019-02-09 23:59:59-05:00,4.0,8662.075856
2,c015-h5000m,"POLYGON ((619339.176 4699798.087, 623669.303 4...",POINT (619339.176 4694798.087),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-05 23:59:59-05:00,2019-02-09 23:59:59-05:00,2.0,8661.517653
3,c016-h5000m,"POLYGON ((627999.43 4699798.087, 632329.557 46...",POINT (627999.43 4694798.087),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-05 23:59:59-05:00,2019-02-09 23:59:59-05:00,,8577.463371
4,c017-h5000m,"POLYGON ((636659.684 4699798.087, 640989.811 4...",POINT (636659.684 4694798.087),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-05 23:59:59-05:00,2019-02-09 23:59:59-05:00,,8234.234607
...,...,...,...,...,...,...,...,...,...
105,c147-h5000m,"POLYGON ((554387.271 4752298.087, 558717.398 4...",POINT (554387.271 4747298.087),p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-28 23:59:59-05:00,2019-02-01 23:59:59-05:00,2.0,1543.406690
106,c148-h5000m,"POLYGON ((563047.525 4752298.087, 567377.652 4...",POINT (563047.525 4747298.087),p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-28 23:59:59-05:00,2019-02-01 23:59:59-05:00,,1704.820261
107,c149-h5000m,"POLYGON ((571707.779 4752298.087, 576037.906 4...",POINT (571707.779 4747298.087),p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-28 23:59:59-05:00,2019-02-01 23:59:59-05:00,,1869.663454
108,c150-h5000m,"POLYGON ((580368.033 4752298.087, 584698.16 47...",POINT (580368.033 4747298.087),p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-28 23:59:59-05:00,2019-02-01 23:59:59-05:00,,724.066435


### Save a ```Samples``` object
```Samples``` objects have an inbuilt ```save``` method to save the samples as a CSV or GPKG.

In [18]:
u_samples.save(
    folder=output_folder,
    filetype='gpkg'
)

## Samples - segment approach

### Make a ```Samples``` object via the segment approach...
If using the segment approach, we can make a ```Samples``` object from a ```DataPoints``` object with the ```Samples.segment()``` class method.
<br>We can also make a ```Samples``` object with measures of survey effort from a ```Segments``` object with the ```Samples.segment_se()``` class method.
<br>Additionally, we can merge multiple ```Samples``` objects into a single new ```Samples``` object with the ```Samples.merge()``` class method.

#### ...from a ```DataPoints``` object

In [19]:
u_samples_sightings = Samples.segment(
    datapoints=u_sightings,
    segments=u_segments,
    cols={'individuals': 'sum'},
    how='midpoint')

#### ...from a ```Segments``` object

In [20]:
u_samples_effort = Samples.segment_se(
    segments=u_segments,
    esw=2000)
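In the ```samples``` output further down, each 10000 m segment has an ```se_area``` of 4.0e+07. That is consistent with the area being the segment length multiplied by twice the effective strip width (one strip on each side of the line). The relationship is inferred from the printed values rather than taken from the sampley documentation, and `surveyed_area` is a hypothetical helper.

```python
def surveyed_area(se_length, esw):
    """Area of a strip of the given length that extends esw to each side of the line."""
    return se_length * 2 * esw

area = surveyed_area(se_length=10000.0, esw=2000.0)
print(area)  # 40000000.0
```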

#### ...by merging multiple ```Samples``` objects

In [21]:
u_samples = Samples.merge(
    sightings=u_samples_sightings,
    effort=u_samples_effort)


Note: samples generated with the segment approach


### Access a ```Samples``` object's attributes
A ```Samples``` object, regardless of how it was made, will have three attributes (```name```, ```parameters```, and ```samples```) that we can access as follows.

In [22]:
u_samples.name

'samples-sightings+effort-x-segments-s10000m'

In [23]:
u_samples.parameters

{'names': 'samples-datapoints-sightings-x-segments-s10000m+samples-sections-sections-x-segments-s10000m',
 'approach': 'segment',
 'resampled': 'datapoints; effort',
 'datapoints_name': 'datapoints-sightings; nan',
 'datapoints_filepath': './input/sightings.gpkg; nan',
 'datapoints_crs': 'EPSG:32619; nan',
 'datapoints_tz': 'UTC-05:00; nan',
 'datapoints_data_cols': 'individuals; nan',
 'segments_name': 'segments-s10000m',
 'sections_name': 'sections-sections',
 'segments_crs': 'EPSG:32619',
 'segments_var': 'simple',
 'segments_randomise': 'True',
 'segments_target': '10000',
 'segments_unit': 'metre',
 'cols': "{'individuals': 'sum'}; nan",
 'effort_esw': 'nan; 2000.0',
 'effort_audf': 'nan; None',
 'effort_euc-geo': 'nan; euclidean'}

In [24]:
u_samples.samples

Unnamed: 0,segment_id,line,midpoint,date,section_id,dfbsec_beg,dfbsec_end,individuals,se_length,se_area
0,s01-s10000m,"LINESTRING (580092.757 4742883.408, 579997.135...",POINT (575093.311 4742845.962),2019-01-25,s1,0.000000,10000.000000,1.0,10000.0,4.000000e+07
1,s02-s10000m,"LINESTRING (570094.222 4742829.916, 569917.081...",POINT (565094.73 4742799.725),2019-01-25,s1,10000.000000,20000.000000,,10000.0,4.000000e+07
2,s03-s10000m,"LINESTRING (560095.148 4742773.163, 559864.339...",POINT (555095.521 4742749.119),2019-01-25,s1,20000.000000,30000.000000,2.0,10000.0,4.000000e+07
3,s04-s10000m,"LINESTRING (550095.667 4742710.935, 549838.842...",POINT (545095.787 4742676.383),2019-01-25,s1,30000.000000,40000.000000,5.0,10000.0,4.000000e+07
4,s05-s10000m,"LINESTRING (540095.882 4742645.456, 539704.822...",POINT (535096.832 4742648.569),2019-01-25,s1,40000.000000,50000.000000,,10000.0,4.000000e+07
...,...,...,...,...,...,...,...,...,...,...
68,s69-s10000m,"LINESTRING (652338.76 4697682.02, 652197.878 4...",POINT (647339.987 4697611.472),2019-02-05,s4,274105.324868,284105.324868,,10000.0,4.000000e+07
69,s70-s10000m,"LINESTRING (642341.15 4697526.723, 641994.147 ...",POINT (637342.194 4697445.552),2019-02-05,s4,284105.324868,294105.324868,,10000.0,4.000000e+07
70,s71-s10000m,"LINESTRING (632343.125 4697349.07, 632326.134 ...",POINT (627344.065 4697254.461),2019-02-05,s4,294105.324868,304105.324868,,10000.0,4.000000e+07
71,s72-s10000m,"LINESTRING (622345.629 4697147.973, 622000.67 ...",POINT (617346.364 4697062.225),2019-02-05,s4,304105.324868,314105.324868,2.0,10000.0,4.000000e+07


### Save a ```Samples``` object
```Samples``` objects have an inbuilt ```save``` method to save the samples as a CSV or GPKG.

In [25]:
u_samples.save(
    folder=output_folder,
    filetype='gpkg'
)

## Samples - point approach

### Make a ```Samples``` object via the point approach...
If using the point approach, we can make a ```Samples``` object from a ```DataPoints``` object, a ```Presences``` object, and an ```Absences``` object with the ```Samples.point()``` class method.

#### ...from a ```DataPoints``` object

In [26]:
u_samples = Samples.point(
    datapoints=u_sightings,
    presences=u_presences,
    absences=u_absences,
    cols=['individuals'])
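The resulting table, shown further down, stacks presences and absences and marks them in a ```p-a``` column (1 for presence, 0 for absence), with data columns such as ```individuals``` filled only for presences. The sketch below builds the same structure in plain Python. It is an illustration, not sampley's code, and it uses three rows taken from that output.

```python
# (point_id, individuals) for presences; absences carry no data values
presences = [('p01', 1.0), ('p03', 5.0)]
absences = ['a02']

samples = []
for point_id, individuals in presences:
    samples.append({'point_id': point_id, 'p-a': 1, 'individuals': individuals})
for point_id in absences:
    samples.append({'point_id': point_id, 'p-a': 0, 'individuals': None})

for row in samples:
    print(row)
```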

### Access a ```Samples``` object's attributes
A ```Samples``` object, regardless of how it was made, will have three attributes (```name```, ```parameters```, and ```samples```) that we can access as follows.

In [27]:
u_samples.name

'samples-presences-sightings-+-absences-a-10000m-5day'

In [28]:
u_samples.parameters

{'approach': 'point',
 'resampled': 'datapoints',
 'presences_name': 'presences-sightings',
 'presences_crs': 'EPSG:32619',
 'presences_sp_threshold': 10000,
 'presences_tm_threshold': 5,
 'presences_tm_unit': 'day',
 'absences_name': 'absences-a-10000m-5day',
 'absences_var': 'along',
 'absences_target': 20,
 'absences_crs': 'EPSG:32619',
 'absences_sp_threshold': 10000,
 'absences_tm_threshold': 5,
 'absences_tm_unit': 'day'}

In [29]:
u_samples.samples

Unnamed: 0,point_id,point,date,p-a,individuals,datapoint_id
0,p01,POINT (579166.78 4742872.701),2019-01-25,1,1.0,d01
1,p03,POINT (548599.876 4742700.214),2019-01-25,1,5.0,d03
2,p04,POINT (520909.741 4714855.058),2019-02-02,1,1.0,d04
3,p05,POINT (532548.249 4714899.835),2019-02-02,1,2.0,d05
4,p06,POINT (512817.407 4705582.465),2019-02-02,1,1.0,d06
5,p08,POINT (654449.136 4716189.584),2019-02-05,1,5.0,d08
6,p10,POINT (643532.681 4716066.52),2019-02-05,1,1.0,d10
7,p11,POINT (629124.489 4706545.106),2019-02-05,1,3.0,d11
8,p13,POINT (611976.857 4696974.111),2019-02-05,1,4.0,d13
9,a02,POINT (504588.106 4742514.228),2019-01-25,0,,


### Save a ```Samples``` object
```Samples``` objects have an inbuilt ```save``` method to save the samples as a CSV or GPKG.

In [30]:
u_samples.save(
    folder=output_folder,
    filetype='gpkg'
)