# sampley exemplar: Stage 3
Before going through this exemplar, please consult the Introduction to sampley exemplars (```intro.ipynb```).
<br>This exemplar illustrates the classes that can be made in Stage 3, their class methods, their attributes, and their methods. 

## Setup

### Import the package

In [1]:
from sampley import *

### Set the input folder
To run this exemplar, download the mock data files, put them in a folder, and set the path to the folder below.

In [2]:
input_folder = './input/'

### Set the output folder
To run this exemplar, make a folder to save the outputs in and set the path to the folder below.

In [3]:
output_folder = './output/'

### Make DataPoints, Sections, and delimiters (Stages 1 and 2)
Before making any objects in Stage 3, we first have to complete Stages 1 and 2. For this exemplar, we make one ```DataPoints``` object and one ```Sections``` object below and then use them to make all the subsequent delimiters. See the Stage 1 and 2 exemplars and the User Manual for more details on Stages 1 and 2.

In [4]:
u_sightings = DataPoints.from_file(
    filepath=input_folder+'sightings.gpkg',
    crs_working='EPSG:32619',
    datetime_col='datetime',
    tz_input='UTC-05:00'
)

Success: file opened.
Success: reprojected to CRS "EPSG:32619"
Success: column "datetime" reformatted to datetimes.
Success: timezone of column "datetime" set to "UTC-05:00".
Success: datapoint IDs generated.


In [5]:
u_sections = Sections.from_file(
    filepath=input_folder+'sections.gpkg',
    crs_working='EPSG:32619',
    datetime_col='datetime_beg',
    tz_input='UTC-05:00'
)

Success: file opened.
Success: reprojected to CRS "EPSG:32619"
Success: column "datetime_beg" reformatted to datetimes.
Success: timezone of column "datetime_beg" set to "UTC-05:00".
Note: column "datetime_beg" renamed to "datetime".
Success: section IDs generated.


In [6]:
u_periods = Periods.delimit(
    extent=u_sections,
    unit='day',
    num=8)
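
As a rough illustration of what delimiting 8-day periods amounts to, the sketch below assigns a date to a period using only the standard library. It assumes that periods are counted in consecutive blocks from the first date of the extent and that a period ends on its last day. The helper `period_bounds` is hypothetical and not part of sampley.

```python
from datetime import date, timedelta

def period_bounds(day, extent_start, num=8):
    """Return (begin, end) of the num-day period that contains `day`.

    Assumption: periods are consecutive blocks of `num` days counted
    from `extent_start`, and `end` is the last day of the block.
    """
    index = (day - extent_start).days // num      # which block the day falls in
    begin = extent_start + timedelta(days=index * num)
    end = begin + timedelta(days=num - 1)
    return begin, end

begin, end = period_bounds(date(2019, 2, 5), extent_start=date(2019, 1, 25))
print(begin, end)
```

Under these assumptions, a sighting on 2019-02-05 falls in the period that begins on 2019-02-02, which agrees with the `date_beg` and `date_end` values printed for the grid samples later in this exemplar.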

In [7]:
u_cells = Cells.delimit(
    extent=u_sections,
    var='hexagonal',
    side=5000,
    buffer=2000)
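
To give a feel for the geometry behind ```var='hexagonal'``` and ```side=5000```, the sketch below computes the vertices of one regular hexagon around a centre point. The pointy-top orientation and the clockwise vertex order are inferred from the polygon coordinates printed later in this exemplar and should be read as assumptions. The helper `hexagon_vertices` is hypothetical and not part of sampley.

```python
import math

def hexagon_vertices(cx, cy, side):
    """Vertices of a pointy-top regular hexagon, clockwise from the top.

    In a regular hexagon the centre-to-vertex distance equals the
    side length, so `side` is used as the radius.
    """
    vertices = []
    for k in range(6):
        angle = math.radians(90 - 60 * k)   # start at the top, step clockwise
        vertices.append((cx + side * math.cos(angle),
                         cy + side * math.sin(angle)))
    return vertices

for x, y in hexagon_vertices(0.0, 0.0, 5000):
    print(round(x, 3), round(y, 3))
```

With this layout, the centres of horizontally adjacent hexagons are `side * sqrt(3)` apart, roughly 8660 m for a side of 5000 m.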

In [8]:
u_segments = Segments.delimit(
    sections=u_sections,
    var='simple',
    target=10000,
    rand=True)

In [9]:
u_presences = Presences.delimit(
    datapoints=u_sightings,
    presence_col='individuals')
u_presences.thin(
    sp_threshold=10000,
    tm_threshold=5,
    tm_unit='day')

In [10]:
u_presencezones = PresenceZones.delimit(
    sections=u_sections,
    presences=u_presences,
    sp_threshold=10000,
    tm_threshold=5,
    tm_unit='day',
)

In [11]:
u_absences = Absences.delimit(
    sections=u_sections,
    presencezones=u_presencezones,
    var='along',
    target=20,
    dfls=None)
u_absences.thin(
    sp_threshold=10000,
    tm_threshold=5,
    tm_unit='day',
    target=9)

## Samples - grid approach

### Make a ```Samples``` object via the grid approach...
If using the grid approach, we can make a ```Samples``` object from a ```DataPoints``` object with the ```Samples.grid()``` class method.
<br>We can also make a ```Samples``` object with measures of survey effort from a ```Sections``` object with the ```Samples.grid_se()``` class method.
<br>Additionally, we can merge multiple ```Samples``` objects into a single new ```Samples``` object with the ```Samples.merge()``` class method.

#### ...from a ```DataPoints``` object

In [12]:
u_samples_sightings = Samples.grid(
    datapoints=u_sightings,
    cells=u_cells,
    periods=u_periods,
    cols={'individuals': 'sum'})
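
The ```cols={'individuals': 'sum'}``` argument asks for the ```individuals``` column to be summed within each cell-period combination. The sketch below shows that kind of aggregation in plain Python. It is not sampley's implementation, and the rows and the helper `sum_per_cell_period` are hypothetical.

```python
from collections import defaultdict

# Hypothetical datapoints that have already been matched to a cell and a period.
datapoints = [
    {'cell_id': 'c014', 'period_id': 'p1', 'individuals': 1},
    {'cell_id': 'c014', 'period_id': 'p1', 'individuals': 3},
    {'cell_id': 'c015', 'period_id': 'p1', 'individuals': 2},
]

def sum_per_cell_period(rows, col):
    """Sum `col` over all rows that share the same (cell_id, period_id)."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row['cell_id'], row['period_id'])] += row[col]
    return dict(totals)

print(sum_per_cell_period(datapoints, 'individuals'))
```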

#### ...from a ```Sections``` object

In [13]:
u_samples_effort = Samples.grid_se(
    sections=u_sections,
    cells=u_cells,
    periods=u_periods,
    esw=None)

#### ...by merging multiple ```Samples``` objects

In [14]:
u_samples = Samples.merge(
    sightings=u_samples_sightings,
    effort=u_samples_effort)


Note: samples generated with the grid approach
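
Conceptually, merging combines the columns of several sample tables that share the same cell-period keys. The sketch below shows one way to do this in plain Python, keeping every key that occurs in either table and marking missing values with ```None```. Whether sampley keeps keys that occur in only one input is an assumption here, and `merge_samples` is a hypothetical helper.

```python
def merge_samples(sightings, effort):
    """Combine two {key: {column: value}} tables over the union of their keys."""
    columns = sorted({c for table in (sightings, effort)
                      for row in table.values() for c in row})
    merged = {}
    for key in sorted(set(sightings) | set(effort)):
        row = {**sightings.get(key, {}), **effort.get(key, {})}
        merged[key] = {c: row.get(c) for c in columns}   # None marks a missing value
    return merged

# Hypothetical inputs keyed by (cell_id, period_id).
sightings = {('c014', 'p1'): {'individuals': 4}}
effort = {('c013', 'p1'): {'se_length': 4000.3},
          ('c014', 'p1'): {'se_length': 8662.1}}

for key, row in merge_samples(sightings, effort).items():
    print(key, row)
```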


### Access a ```Samples``` object's attributes
A ```Samples``` object, regardless of how it was made, will have three attributes (```name```, ```parameters```, and ```samples```) that we can access as follows.

In [15]:
u_samples.name

'samples-sightings+effort-x-cells-h5000m-x-periods-8d'

In [16]:
u_samples.parameters

{'name': 'samples-sightings+effort-x-cells-h5000m-x-periods-8d',
 'names': 'samples-datapoints-sightings-x-cells-h5000m-x-periods-8d+samples-sections-sections-x-cells-h5000m-x-periods-8d',
 'approach': 'grid',
 'resampled': 'datapoints; effort',
 'datapoints_name': 'datapoints-sightings; nan',
 'datapoints_filepath': './input/sightings.gpkg; nan',
 'datapoints_crs': 'EPSG:32619; nan',
 'datapoints_tz': 'UTC-05:00; nan',
 'datapoints_data_cols': 'individuals; nan',
 'cells_name': 'cells-h5000m',
 'cells_crs': 'EPSG:32619',
 'cells_extent': '493765.49253164633, 4689798.086839909, 671300.7003074563, 4759798.086839909',
 'cells_extent_source': 'Sections - sections-sections',
 'cells_var': 'hexagonal',
 'cells_side': '5000',
 'cells_unit': 'metre',
 'cells_buffer': '2000',
 'periods_name': 'periods-8d',
 'periods_tz': 'UTC-05:00',
 'periods_extent': '2019-01-25-2019-03-05',
 'periods_extent_source': 'Sections - sections-sections',
 'periods_number': '8',
 'periods_unit': 'day',
 'cols': "{'

In [17]:
u_samples.samples

Unnamed: 0,cell_id,polygon,centroid,period_id,date_beg,date_mid,date_end,individuals,se_length
0,c013-h5000m,"POLYGON ((602018.668 4699798.087, 606348.795 4...",POINT (602018.668 4694798.087),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,,4000.286022
1,c014-h5000m,"POLYGON ((610678.922 4699798.087, 615009.049 4...",POINT (610678.922 4694798.087),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,4.0,8662.075856
2,c015-h5000m,"POLYGON ((619339.176 4699798.087, 623669.303 4...",POINT (619339.176 4694798.087),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,2.0,8661.517653
3,c016-h5000m,"POLYGON ((627999.43 4699798.087, 632329.557 46...",POINT (627999.43 4694798.087),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,,8577.463371
4,c017-h5000m,"POLYGON ((636659.684 4699798.087, 640989.811 4...",POINT (636659.684 4694798.087),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,,8234.234607
...,...,...,...,...,...,...,...,...,...
105,c147-h5000m,"POLYGON ((554387.271 4752298.087, 558717.398 4...",POINT (554387.271 4747298.087),p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-29 00:00:00-05:00,2019-02-01 00:00:00-05:00,2.0,1543.406690
106,c148-h5000m,"POLYGON ((563047.525 4752298.087, 567377.652 4...",POINT (563047.525 4747298.087),p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-29 00:00:00-05:00,2019-02-01 00:00:00-05:00,,1704.820261
107,c149-h5000m,"POLYGON ((571707.779 4752298.087, 576037.906 4...",POINT (571707.779 4747298.087),p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-29 00:00:00-05:00,2019-02-01 00:00:00-05:00,,1869.663454
108,c150-h5000m,"POLYGON ((580368.033 4752298.087, 584698.16 47...",POINT (580368.033 4747298.087),p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-29 00:00:00-05:00,2019-02-01 00:00:00-05:00,,724.066435


### Modify a ```Samples``` object
Before we save our ```Samples``` object, there are a few things that we might want to modify.

#### Reproject
We can reproject our ```Samples``` object to a CRS of our choosing with the ```Samples.reproject()``` method, which takes a single argument: the target CRS.
<br>In our example below, we reproject our ```Samples``` object to EPSG:4326 and then print it to see that the ```polygon``` and ```centroid``` columns are now in lat-lon coordinates.

In [18]:
u_samples.reproject(crs_target='EPSG:4326')
u_samples.samples

Success: additional geometry column "centroid" reprojected to CRS "EPSG:4326"
Success: reprojected to CRS "EPSG:4326"


Unnamed: 0,cell_id,polygon,centroid,period_id,date_beg,date_mid,date_end,individuals,se_length
0,c013-h5000m,"POLYGON ((-67.7595 42.4438, -67.70733 42.42071...",POINT (-67.76039 42.39878),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,,4000.286022
1,c014-h5000m,"POLYGON ((-67.65423 42.44261, -67.60209 42.419...",POINT (-67.65519 42.39759),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,4.0,8662.075856
2,c015-h5000m,"POLYGON ((-67.54896 42.44133, -67.49686 42.418...",POINT (-67.54999 42.39631),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,2.0,8661.517653
3,c016-h5000m,"POLYGON ((-67.44369 42.43995, -67.39164 42.416...",POINT (-67.44481 42.39493),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,,8577.463371
4,c017-h5000m,"POLYGON ((-67.33844 42.43847, -67.28643 42.415...",POINT (-67.33963 42.39346),p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,,8234.234607
...,...,...,...,...,...,...,...,...,...
105,c147-h5000m,"POLYGON ((-68.33359 42.92137, -68.2808 42.8985...",POINT (-68.33407 42.87634),p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-29 00:00:00-05:00,2019-02-01 00:00:00-05:00,2.0,1543.406690
106,c148-h5000m,"POLYGON ((-68.22748 42.9207, -68.17473 42.8978...",POINT (-68.22805 42.87568),p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-29 00:00:00-05:00,2019-02-01 00:00:00-05:00,,1704.820261
107,c149-h5000m,"POLYGON ((-68.12138 42.91993, -68.06867 42.897...",POINT (-68.12202 42.87491),p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-29 00:00:00-05:00,2019-02-01 00:00:00-05:00,,1869.663454
108,c150-h5000m,"POLYGON ((-68.01529 42.91907, -67.96262 42.896...",POINT (-68.016 42.87405),p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-29 00:00:00-05:00,2019-02-01 00:00:00-05:00,,724.066435


#### Extract coordinates
The coordinates of the ```centroid``` column are now in lat-lon, but they are still stored as ```shapely.geometry``` objects, which may complicate things later if we want to, for example, get values for environmental variables from a satellite dataset at the coordinates of each centroid. So, we can use the ```Samples.coords()``` method to make two new columns: ```centroid_lon``` and ```centroid_lat```.
<br>We then print the samples to see these new columns.

In [19]:
u_samples.coords()
u_samples.samples

Unnamed: 0,cell_id,polygon,centroid,centroid_lon,centroid_lat,period_id,date_beg,date_mid,date_end,individuals,se_length
0,c013-h5000m,"POLYGON ((-67.7595 42.4438, -67.70733 42.42071...",POINT (-67.76039 42.39878),-67.760392,42.398780,p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,,4000.286022
1,c014-h5000m,"POLYGON ((-67.65423 42.44261, -67.60209 42.419...",POINT (-67.65519 42.39759),-67.655190,42.397594,p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,4.0,8662.075856
2,c015-h5000m,"POLYGON ((-67.54896 42.44133, -67.49686 42.418...",POINT (-67.54999 42.39631),-67.549995,42.396312,p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,2.0,8661.517653
3,c016-h5000m,"POLYGON ((-67.44369 42.43995, -67.39164 42.416...",POINT (-67.44481 42.39493),-67.444806,42.394933,p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,,8577.463371
4,c017-h5000m,"POLYGON ((-67.33844 42.43847, -67.28643 42.415...",POINT (-67.33963 42.39346),-67.339626,42.393457,p2019-02-02-8d,2019-02-02 00:00:00-05:00,2019-02-06 00:00:00-05:00,2019-02-09 00:00:00-05:00,,8234.234607
...,...,...,...,...,...,...,...,...,...,...,...
105,c147-h5000m,"POLYGON ((-68.33359 42.92137, -68.2808 42.8985...",POINT (-68.33407 42.87634),-68.334075,42.876343,p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-29 00:00:00-05:00,2019-02-01 00:00:00-05:00,2.0,1543.406690
106,c148-h5000m,"POLYGON ((-68.22748 42.9207, -68.17473 42.8978...",POINT (-68.22805 42.87568),-68.228046,42.875677,p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-29 00:00:00-05:00,2019-02-01 00:00:00-05:00,,1704.820261
107,c149-h5000m,"POLYGON ((-68.12138 42.91993, -68.06867 42.897...",POINT (-68.12202 42.87491),-68.122022,42.874913,p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-29 00:00:00-05:00,2019-02-01 00:00:00-05:00,,1869.663454
108,c150-h5000m,"POLYGON ((-68.01529 42.91907, -67.96262 42.896...",POINT (-68.016 42.87405),-68.016001,42.874051,p2019-01-25-8d,2019-01-25 00:00:00-05:00,2019-01-29 00:00:00-05:00,2019-02-01 00:00:00-05:00,,724.066435
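
To illustrate what extracting coordinates amounts to, the sketch below pulls the longitude and latitude out of a WKT point string like those printed in the ```centroid``` column. It is not meant to mirror how ```Samples.coords()``` works internally, and `point_lon_lat` is a hypothetical helper that only handles plain decimal 2D points.

```python
import re

# Matches text such as 'POINT (-67.76039 42.39878)'.
POINT_PATTERN = re.compile(
    r'POINT \(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)')

def point_lon_lat(wkt):
    """Return (lon, lat) as floats from a 2D WKT point string."""
    match = POINT_PATTERN.fullmatch(wkt.strip())
    if match is None:
        raise ValueError(f'not a 2D WKT point: {wkt!r}')
    return float(match.group(1)), float(match.group(2))

print(point_lon_lat('POINT (-67.76039 42.39878)'))
```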


### Save a ```Samples``` object
```Samples``` objects have an inbuilt ```save``` method to save the samples as a CSV or GPKG. The name of the saved file will be the name of the ```Samples``` object.

In [20]:
print(f"The saved file will be called '{u_samples.name}.gpkg'")
u_samples.save(
    folder=output_folder,
    filetype='gpkg'
)

The saved file will be called 'samples-sightings+effort-x-cells-h5000m-x-periods-8d.gpkg'


## Samples - segment approach

### Make a ```Samples``` object via the segment approach...
If using the segment approach, we can make a ```Samples``` object from a ```DataPoints``` object with the ```Samples.segment()``` class method.
<br>We can also make a ```Samples``` object with measures of survey effort from a ```Segments``` object with the ```Samples.segment_se()``` class method.
<br>Additionally, we can merge multiple ```Samples``` objects into a single new ```Samples``` object with the ```Samples.merge()``` class method.

#### ...from a ```DataPoints``` object

In [21]:
u_samples_sightings = Samples.segment(
    datapoints=u_sightings,
    segments=u_segments,
    cols={'individuals': 'sum'},
    how='midpoint')

#### ...from a ```Segments``` object

In [22]:
u_samples_effort = Samples.segment_se(
    segments=u_segments,
    esw=2000)
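
The values printed for the segment samples in this exemplar are consistent with ```se_area``` being the segment length multiplied by twice the ```esw``` value, that is, a strip extending ```esw``` metres to either side of the line. The sketch below spells out that arithmetic. The relationship is inferred from the printed values rather than taken from sampley's code, and `strip_area` is a hypothetical helper.

```python
def strip_area(length, esw):
    """Area of a strip of the given length that extends `esw` to each side."""
    return length * 2 * esw   # total strip width is twice the half-width

print(strip_area(10000.0, 2000))               # a full 10000 m segment
print(round(strip_area(4.642260, 2000), 2))    # a short leftover segment
```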

#### ...by merging multiple ```Samples``` objects

In [23]:
u_samples = Samples.merge(
    sightings=u_samples_sightings,
    effort=u_samples_effort)


Note: samples generated with the segment approach


### Access a ```Samples``` object's attributes
A ```Samples``` object, regardless of how it was made, will have three attributes (```name```, ```parameters```, and ```samples```) that we can access as follows.

In [24]:
u_samples.name

'samples-sightings+effort-x-segments-s10000m'

In [25]:
u_samples.parameters

{'name': 'samples-sightings+effort-x-segments-s10000m',
 'names': 'samples-datapoints-sightings-x-segments-s10000m+samples-sections-sections-x-segments-s10000m',
 'approach': 'segment',
 'resampled': 'datapoints; effort',
 'datapoints_name': 'datapoints-sightings; nan',
 'datapoints_filepath': './input/sightings.gpkg; nan',
 'datapoints_crs': 'EPSG:32619; nan',
 'datapoints_tz': 'UTC-05:00; nan',
 'datapoints_data_cols': 'individuals; nan',
 'segments_name': 'segments-s10000m',
 'sections_name': 'sections-sections',
 'segments_crs': 'EPSG:32619',
 'segments_var': 'simple',
 'segments_rand': 'True',
 'segments_target': '10000',
 'segments_unit': 'metre',
 'cols': "{'individuals': 'sum'}; nan",
 'effort_esw': 'nan; 2000.0',
 'effort_audf': 'nan; None',
 'effort_euc-geo': 'nan; euclidean'}

In [26]:
u_samples.samples

Unnamed: 0,segment_id,line,midpoint,date,section_id,dfbsec_beg,dfbsec_end,individuals,se_length,se_area
0,s01-s10000m,"LINESTRING (580092.757 4742883.408, 579997.135...",POINT (575093.311 4742845.962),2019-01-25,s1,0.0,10000.000000,1.0,10000.000000,4.000000e+07
1,s02-s10000m,"LINESTRING (570094.222 4742829.916, 569917.081...",POINT (565094.73 4742799.725),2019-01-25,s1,10000.0,20000.000000,,10000.000000,4.000000e+07
2,s03-s10000m,"LINESTRING (560095.148 4742773.163, 559864.339...",POINT (555095.521 4742749.119),2019-01-25,s1,20000.0,30000.000000,2.0,10000.000000,4.000000e+07
3,s04-s10000m,"LINESTRING (550095.667 4742710.935, 549838.842...",POINT (545095.787 4742676.383),2019-01-25,s1,30000.0,40000.000000,5.0,10000.000000,4.000000e+07
4,s05-s10000m,"LINESTRING (540095.882 4742645.456, 540091.24 ...",POINT (540093.561 4742645.442),2019-01-25,s1,40000.0,40004.642260,,4.642260,1.856904e+04
...,...,...,...,...,...,...,...,...,...,...
68,s69-s10000m,"LINESTRING (646445.51 4697592.665, 646369.277 ...",POINT (641446.66 4697508.557),2019-02-05,s4,280000.0,290000.000000,,10000.000000,4.000000e+07
69,s70-s10000m,"LINESTRING (636447.691 4697428.025, 636124.889...",POINT (631448.609 4697332.185),2019-02-05,s4,290000.0,300000.000000,,10000.000000,4.000000e+07
70,s71-s10000m,"LINESTRING (626450.303 4697221.055, 626265.927...",POINT (621451.09 4697132.369),2019-02-05,s4,300000.0,310000.000000,2.0,10000.000000,4.000000e+07
71,s72-s10000m,"LINESTRING (616451.814 4697047.259, 616228.106...",POINT (611453.214 4696946.859),2019-02-05,s4,310000.0,320000.000000,4.0,10000.000000,4.000000e+07


### Modify a ```Samples``` object
Before we save our ```Samples``` object, there are a few things that we might want to modify.

#### Reproject
We can reproject our ```Samples``` object to a CRS of our choosing with the ```Samples.reproject()``` method, which takes a single argument: the target CRS.
<br>In our example below, we reproject our ```Samples``` object to EPSG:4326 and then print it to see that the ```line``` and ```midpoint``` columns are now in lat-lon coordinates.

In [27]:
u_samples.reproject(crs_target='EPSG:4326')
u_samples.samples

Success: additional geometry column "midpoint" reprojected to CRS "EPSG:4326"
Success: reprojected to CRS "EPSG:4326"


Unnamed: 0,segment_id,line,midpoint,date,section_id,dfbsec_beg,dfbsec_end,individuals,se_length,se_area
0,s01-s10000m,"LINESTRING (-68.02 42.83433, -68.02117 42.8343...",POINT (-68.08117 42.8345),2019-01-25,s1,0.0,10000.000000,1.0,10000.000000,4.000000e+07
1,s02-s10000m,"LINESTRING (-68.14233 42.83483, -68.1445 42.83...",POINT (-68.2035 42.835),2019-01-25,s1,10000.0,20000.000000,,10000.000000,4.000000e+07
2,s03-s10000m,"LINESTRING (-68.26468 42.83517, -68.2675 42.83...",POINT (-68.32585 42.83533),2019-01-25,s1,20000.0,30000.000000,2.0,10000.000000,4.000000e+07
3,s04-s10000m,"LINESTRING (-68.38703 42.83533, -68.39017 42.8...",POINT (-68.44821 42.83533),2019-01-25,s1,30000.0,40000.000000,5.0,10000.000000,4.000000e+07
4,s05-s10000m,"LINESTRING (-68.50938 42.83533, -68.50944 42.8...",POINT (-68.50941 42.83533),2019-01-25,s1,40000.0,40004.642260,,4.642260,1.856904e+04
...,...,...,...,...,...,...,...,...,...,...
68,s69-s10000m,"LINESTRING (-67.22007 42.41683, -67.221 42.416...",POINT (-67.28082 42.417),2019-02-05,s4,280000.0,290000.000000,,10000.000000,4.000000e+07
69,s70-s10000m,"LINESTRING (-67.34158 42.41717, -67.3455 42.41...",POINT (-67.40234 42.41717),2019-02-05,s4,290000.0,300000.000000,,10000.000000,4.000000e+07
70,s71-s10000m,"LINESTRING (-67.46309 42.417, -67.46533 42.417...",POINT (-67.52385 42.417),2019-02-05,s4,300000.0,310000.000000,2.0,10000.000000,4.000000e+07
71,s72-s10000m,"LINESTRING (-67.58461 42.417, -67.58733 42.417...",POINT (-67.64537 42.41683),2019-02-05,s4,310000.0,320000.000000,4.0,10000.000000,4.000000e+07


#### Extract coordinates
The coordinates of the ```midpoint``` column are now in lat-lon, but they are still stored as ```shapely.geometry``` objects, which may complicate things later if we want to, for example, get values for environmental variables from a satellite dataset at the coordinates of each midpoint. So, we can use the ```Samples.coords()``` method to make two new columns: ```midpoint_lon``` and ```midpoint_lat```.
<br>We then print the samples to see these new columns.

In [28]:
u_samples.coords()
u_samples.samples

Unnamed: 0,segment_id,line,midpoint,midpoint_lon,midpoint_lat,date,section_id,dfbsec_beg,dfbsec_end,individuals,se_length,se_area
0,s01-s10000m,"LINESTRING (-68.02 42.83433, -68.02117 42.8343...",POINT (-68.08117 42.8345),-68.081169,42.83450,2019-01-25,s1,0.0,10000.000000,1.0,10000.000000,4.000000e+07
1,s02-s10000m,"LINESTRING (-68.14233 42.83483, -68.1445 42.83...",POINT (-68.2035 42.835),-68.203503,42.83500,2019-01-25,s1,10000.0,20000.000000,,10000.000000,4.000000e+07
2,s03-s10000m,"LINESTRING (-68.26468 42.83517, -68.2675 42.83...",POINT (-68.32585 42.83533),-68.325849,42.83533,2019-01-25,s1,20000.0,30000.000000,2.0,10000.000000,4.000000e+07
3,s04-s10000m,"LINESTRING (-68.38703 42.83533, -68.39017 42.8...",POINT (-68.44821 42.83533),-68.448206,42.83533,2019-01-25,s1,30000.0,40000.000000,5.0,10000.000000,4.000000e+07
4,s05-s10000m,"LINESTRING (-68.50938 42.83533, -68.50944 42.8...",POINT (-68.50941 42.83533),-68.509413,42.83533,2019-01-25,s1,40000.0,40004.642260,,4.642260,1.856904e+04
...,...,...,...,...,...,...,...,...,...,...,...,...
68,s69-s10000m,"LINESTRING (-67.22007 42.41683, -67.221 42.416...",POINT (-67.28082 42.417),-67.280824,42.41700,2019-02-05,s4,280000.0,290000.000000,,10000.000000,4.000000e+07
69,s70-s10000m,"LINESTRING (-67.34158 42.41717, -67.3455 42.41...",POINT (-67.40234 42.41717),-67.402335,42.41717,2019-02-05,s4,290000.0,300000.000000,,10000.000000,4.000000e+07
70,s71-s10000m,"LINESTRING (-67.46309 42.417, -67.46533 42.417...",POINT (-67.52385 42.417),-67.523850,42.41700,2019-02-05,s4,300000.0,310000.000000,2.0,10000.000000,4.000000e+07
71,s72-s10000m,"LINESTRING (-67.58461 42.417, -67.58733 42.417...",POINT (-67.64537 42.41683),-67.645368,42.41683,2019-02-05,s4,310000.0,320000.000000,4.0,10000.000000,4.000000e+07


### Save a ```Samples``` object
```Samples``` objects have an inbuilt ```save``` method to save the samples as a CSV or GPKG. The name of the saved file will be the name of the ```Samples``` object.

In [29]:
print(f"The saved file will be called '{u_samples.name}.gpkg'")
u_samples.save(
    folder=output_folder,
    filetype='gpkg'
)

The saved file will be called 'samples-sightings+effort-x-segments-s10000m.gpkg'


## Samples - point approach

### Make a ```Samples``` object via the point approach...
If using the point approach, we can make a ```Samples``` object from a ```DataPoints``` object with the ```Samples.point()``` class method.

#### ...from a ```DataPoints``` object

In [30]:
u_samples = Samples.point(
    datapoints=u_sightings,
    presences=u_presences,
    absences=u_absences,
    cols=['individuals'])
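
In the point approach, the resulting samples stack presences and absences into a single table with a ```p-a``` column that is 1 for presences and 0 for absences. The sketch below builds such a table in plain Python. The input rows and the helper `presence_absence_rows` are hypothetical and only illustrate the layout, not sampley's implementation.

```python
def presence_absence_rows(presences, absences, col):
    """Stack presences (p-a = 1) and absences (p-a = 0) into one list of rows."""
    rows = []
    for p in presences:
        rows.append({'point_id': p['point_id'], 'p-a': 1, col: p[col]})
    for a in absences:
        # Absences carry no data value, so the column is left empty.
        rows.append({'point_id': a['point_id'], 'p-a': 0, col: None})
    return rows

# Hypothetical inputs.
presences = [{'point_id': 'p01', 'individuals': 1},
             {'point_id': 'p03', 'individuals': 5}]
absences = [{'point_id': 'a02'}]

for row in presence_absence_rows(presences, absences, 'individuals'):
    print(row)
```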

### Access a ```Samples``` object's attributes
A ```Samples``` object, regardless of how it was made, will have three attributes (```name```, ```parameters```, and ```samples```) that we can access as follows.

In [31]:
u_samples.name

'samples-presences-sightings-+-absences-as-10000m-5day'

In [32]:
u_samples.parameters

{'approach': 'point',
 'resampled': 'datapoints',
 'presences_name': 'presences-sightings',
 'presences_crs': 'EPSG:32619',
 'presences_sp_threshold': 10000,
 'presences_tm_threshold': 5,
 'presences_tm_unit': 'day',
 'absences_name': 'absences-as-10000m-5day',
 'absences_var': 'along',
 'absences_target': 20,
 'presencezones_crs': 'EPSG:32619',
 'presencezones_sp_threshold': 10000,
 'presencezones_tm_threshold': 5,
 'presencezones_tm_unit': 'day',
 'absences_sp_threshold': 10000,
 'absences_tm_threshold': 5,
 'absences_tm_unit': 'day'}

In [33]:
u_samples.samples

Unnamed: 0,point_id,point,date,datapoint_id,p-a,individuals
0,p01,POINT (579166.78 4742872.701),2019-01-25,d01,1,1.0
1,p03,POINT (548599.876 4742700.214),2019-01-25,d03,1,5.0
2,p04,POINT (520909.741 4714855.058),2019-02-02,d04,1,1.0
3,p05,POINT (532548.249 4714899.835),2019-02-02,d05,1,2.0
4,p07,POINT (504710.41 4705553.392),2019-02-02,d07,1,3.0
5,p08,POINT (654449.136 4716189.584),2019-02-05,d08,1,5.0
6,p10,POINT (643532.681 4716066.52),2019-02-05,d10,1,1.0
7,p11,POINT (629124.489 4706545.106),2019-02-05,d11,1,3.0
8,p12,POINT (620560.818 4697116.949),2019-02-05,d12,1,2.0
9,a02,POINT (523082.116 4742605.174),2019-01-25,,0,


### Modify a ```Samples``` object
Before we save our ```Samples``` object, there are a few things that we might want to modify.

#### Reproject
We can reproject our ```Samples``` object to a CRS of our choosing with the ```Samples.reproject()``` method, which takes a single argument: the target CRS.
<br>In our example below, we reproject our ```Samples``` object to EPSG:4326 and then print it to see that the ```point``` column is now in lat-lon coordinates.

In [34]:
u_samples.reproject(crs_target='EPSG:4326')
u_samples.samples

Success: reprojected to CRS "EPSG:4326"


Unnamed: 0,point_id,point,date,datapoint_id,p-a,individuals
0,p01,POINT (-68.03133 42.83433),2019-01-25,d01,1,1.0
1,p03,POINT (-68.40533 42.83533),2019-01-25,d03,1,5.0
2,p04,POINT (-68.74517 42.58583),2019-02-02,d04,1,1.0
3,p05,POINT (-68.60333 42.58583),2019-02-02,d05,1,2.0
4,p07,POINT (-68.94267 42.50233),2019-02-02,d07,1,3.0
5,p08,POINT (-67.11783 42.58267),2019-02-05,d08,1,5.0
6,p10,POINT (-67.25083 42.58367),2019-02-05,d10,1,1.0
7,p11,POINT (-67.4285 42.5005),2019-02-05,d11,1,3.0
8,p12,POINT (-67.53467 42.417),2019-02-05,d12,1,2.0
9,a02,POINT (-68.71756 42.83567),2019-01-25,,0,


#### Extract coordinates
The coordinates of the ```point``` column are now in lat-lon, but they are still stored as ```shapely.geometry``` objects, which may complicate things later if we want to, for example, get values for environmental variables from a satellite dataset at the coordinates of each point. So, we can use the ```Samples.coords()``` method to make two new columns: ```point_lon``` and ```point_lat```.
<br>We then print the samples to see these new columns.

In [35]:
u_samples.coords()
u_samples.samples

Unnamed: 0,point_id,point,point_lon,point_lat,date,datapoint_id,p-a,individuals
0,p01,POINT (-68.03133 42.83433),-68.03133,42.83433,2019-01-25,d01,1,1.0
1,p03,POINT (-68.40533 42.83533),-68.40533,42.83533,2019-01-25,d03,1,5.0
2,p04,POINT (-68.74517 42.58583),-68.74517,42.58583,2019-02-02,d04,1,1.0
3,p05,POINT (-68.60333 42.58583),-68.60333,42.58583,2019-02-02,d05,1,2.0
4,p07,POINT (-68.94267 42.50233),-68.94267,42.50233,2019-02-02,d07,1,3.0
5,p08,POINT (-67.11783 42.58267),-67.11783,42.58267,2019-02-05,d08,1,5.0
6,p10,POINT (-67.25083 42.58367),-67.25083,42.58367,2019-02-05,d10,1,1.0
7,p11,POINT (-67.4285 42.5005),-67.4285,42.5005,2019-02-05,d11,1,3.0
8,p12,POINT (-67.53467 42.417),-67.53467,42.417,2019-02-05,d12,1,2.0
9,a02,POINT (-68.71756 42.83567),-68.717564,42.83567,2019-01-25,,0,


### Save a ```Samples``` object
```Samples``` objects have an inbuilt ```save``` method to save the samples as a CSV or GPKG. The name of the saved file will be the name of the ```Samples``` object.

In [36]:
print(f"The saved file will be called '{u_samples.name}.gpkg'")
u_samples.save(
    folder=output_folder,
    filetype='gpkg'
)

The saved file will be called 'samples-presences-sightings-+-absences-as-10000m-5day.gpkg'
