# 03 - Spatial buffer bird occurrences


### Description
This notebook summarizes the land/crop cover around each bird occurrence, using a circular spatial buffer drawn around each occurrence point.

### Inputs

- eBird occurrences produced by Notebook 01: `eBird_sample.csv`

### Outputs

- eBird data with new columns containing the pixel counts for each land use/land cover class (including crops) and the buffer area in hectares: `eBird_sel.csv`

## 1. Read eBird data (occurrences, taxonomy)
- eBird data

In [None]:
# import modules
import pandas as pd
import numpy as np
from pandas import DataFrame
import geopandas as gpd
from pyproj import CRS  # only CRS is used in this notebook
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
import warnings
warnings.filterwarnings('ignore')

Read the eBird occurrence data:

In [None]:
# eBird selected occurrence data
eBird_sample = pd.read_csv('../process_data/eBird_sample.csv', low_memory=False)
eBird_sample["OBSERVATION DATE"] = pd.to_datetime(eBird_sample["OBSERVATION DATE"])
eBird_sample.head()

## 2. Map selected records

Plot the eBird records on a map. Each unique point location is identified by its sampling event identifier.

In [None]:
# create a point GeoDataFrame from eBird, keeping the first row of each sampling event identifier

eb = eBird_sample.groupby(['SAMPLING EVENT IDENTIFIER']).first()

crs = CRS('EPSG:4326')
points = gpd.GeoDataFrame(
    eb, geometry=gpd.points_from_xy(eb['LONGITUDE'], eb['LATITUDE']), crs=crs)

points['geometry'].explore()

In [None]:
# do a quick preview of the point table
eb

In [None]:
# reproject to EPSG:5070 (NAD83 / Conus Albers), whose coordinates are in metres
points = points.to_crs("EPSG:5070")
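
The reprojection matters because a buffer distance is expressed in the units of the coordinate reference system. In EPSG:4326 those units are degrees, and the ground length of a degree of longitude shrinks with latitude. The sketch below illustrates this with a spherical Earth approximation; the mean radius and the example latitude of 40° N are assumptions chosen for illustration, not values taken from the data.

```python
import math

def metres_per_degree(lat_deg, radius_m=6_371_008.8):
    """Approximate ground length of one degree of latitude and of longitude.

    Uses a spherical Earth (assumed mean radius), so values are approximate.
    """
    per_deg_lat = math.pi * radius_m / 180.0
    per_deg_lon = per_deg_lat * math.cos(math.radians(lat_deg))
    return per_deg_lat, per_deg_lon

# hypothetical latitude, for illustration only
lat_m, lon_m = metres_per_degree(40.0)
print(f"1 degree of latitude  ~ {lat_m / 1000:.1f} km")
print(f"1 degree of longitude ~ {lon_m / 1000:.1f} km at 40 N")
```

Because the two lengths differ, a buffer drawn in degrees would not be a circle of constant radius on the ground, whereas a projected CRS in metres lets the radius be given directly as `1000`.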

We create a buffer with a 1 km radius around each point location. This distance is intended to cover the average range of movement of the species in the region while feeding.

In [None]:
# create a buffer with a 1 km radius (copy first so that `points` keeps its point geometry)
points_buf = points.copy()
points_buf['geometry'] = points_buf.geometry.buffer(1000)
points_buf["OBSERVATION DATE"] = points_buf["OBSERVATION DATE"].astype('string')
points_buf.dtypes
points_buf.explore()
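
As a rough sanity check on the 1 km buffers, the expected area and pixel count can be worked out by hand. This is a minimal sketch that assumes 30 m pixels (inferred from the raster file name) and assumes the buffer is approximated with the default of 16 segments per quarter circle, so the polygon area is slightly smaller than that of a true circle.

```python
import math

BUFFER_RADIUS_M = 1000   # radius used in this notebook
PIXEL_SIZE_M = 30        # assumed pixel edge length (from the raster file name)
QUAD_SEGS = 16           # assumed default number of segments per quarter circle

# area of an ideal circle
circle_area_m2 = math.pi * BUFFER_RADIUS_M ** 2
circle_area_ha = circle_area_m2 / 10_000

# area of the regular polygon that approximates the circle
n_sides = 4 * QUAD_SEGS
polygon_area_m2 = 0.5 * n_sides * BUFFER_RADIUS_M ** 2 * math.sin(2 * math.pi / n_sides)
polygon_area_ha = polygon_area_m2 / 10_000

# number of whole-pixel equivalents that fit in the ideal circle
expected_pixels = circle_area_m2 / PIXEL_SIZE_M ** 2

print(f"circle:  {circle_area_ha:.2f} ha")
print(f"polygon: {polygon_area_ha:.2f} ha")
print(f"pixels:  {expected_pixels:.0f}")
```

Buffer areas derived from pixel counts should land close to these figures; large departures may indicate buffers that extend beyond the raster extent.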

## 3. Determine land or crop cover for each point

This step counts the pixels of each land cover class that fall inside the buffer of each bird occurrence point. Percentages of land cover can be derived from these counts.

In [None]:
# import additional modules
import rasterio
import rasterstats
from rasterio.plot import show
from rasterstats import zonal_stats

# define input crop cover file
crop_cover = '../process_data/2022_30m_cdls_clip.tif'
src = rasterio.open(crop_cover)

In [None]:
# plot the buffers over the crop cover raster
fig, ax = plt.subplots(1,1)
show(src, ax = ax)
points_buf.plot(ax = ax)

In [None]:
# count the pixels of each land cover class inside every buffer
# (the buffers are polygons, so no geometry type argument is passed)
stats = zonal_stats(points_buf, crop_cover, categorical=True)

In [None]:
# quick preview to check if everything is in place
stats[0]
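
With `categorical=True`, each entry of `stats` is a dictionary that maps a raster class code to the number of pixels of that class inside the buffer. The idea can be sketched in plain Python; the small grid, the class codes, and the mask below are hypothetical and only illustrate the counting, not how `rasterstats` is implemented.

```python
from collections import Counter

# hypothetical 4 x 4 raster of land cover class codes
raster = [
    [1,   1,   5,   5],
    [1,   1,   5,   5],
    [1,   176, 176, 5],
    [176, 176, 176, 5],
]

# hypothetical mask: True where the pixel falls inside the buffer
inside = [
    [False, True, True, False],
    [True,  True, True, True],
    [True,  True, True, True],
    [False, True, True, False],
]

def categorical_counts(values, mask):
    """Count the pixels of each class among the masked-in cells."""
    counter = Counter(
        value
        for value_row, mask_row in zip(values, mask)
        for value, keep in zip(value_row, mask_row)
        if keep
    )
    return dict(counter)

counts = categorical_counts(raster, inside)
print(counts)
```

The resulting dictionary has the same shape as one element of `stats`: class code as key, pixel count as value.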

In [None]:
# add the stats for each buffer to the table 

points_buf['stats_cover'] = stats

In [None]:
# function to calculate the total area in hectares from the pixel counts:
# each 30 m x 30 m pixel covers 900 m2, and 1 ha = 10**4 m2

def sum_dict(d):
    return sum(d.values())*900/10**4

In [None]:
# add a column with the total area of the buffer in hectares, based on the number of pixels
points_buf['area_buff'] = points_buf['stats_cover'].apply(sum_dict)
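
The same conversion can be applied per class to get the area and the share of each land cover inside a buffer. The following is a minimal sketch, assuming 30 m pixels; the helper names `class_areas_ha` and `class_shares` and the example counts are hypothetical and not part of the notebook.

```python
PIXEL_AREA_M2 = 30 * 30   # assumes 30 m x 30 m pixels
M2_PER_HA = 10_000

def class_areas_ha(counts):
    """Convert a {class code: pixel count} dict to {class code: hectares}."""
    return {code: n * PIXEL_AREA_M2 / M2_PER_HA for code, n in counts.items()}

def class_shares(counts):
    """Convert pixel counts to fractions of the buffer (empty dict if no pixels)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {code: n / total for code, n in counts.items()}

# hypothetical pixel counts for one buffer
example_counts = {1: 2000, 5: 1000, 176: 491}

areas = class_areas_ha(example_counts)
shares = class_shares(example_counts)
for code in example_counts:
    print(f"class {code}: {areas[code]:.2f} ha, {shares[code]:.1%} of buffer")
```

Applied to the `stats_cover` column, functions like these would give one area and one percentage per class for every buffer.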

In [None]:
# save to csv
out_file = '../process_data/points_buf_stats.csv'
df1 = pd.DataFrame(points_buf[['LOCALITY', 'LOCALITY ID','geometry','stats_cover', 'area_buff']])
df1.to_csv(out_file) 

In [None]:
# do a quick preview of the point table
df1

## 4. Combine crop cover stats with eBird data

Join the values calculated for each buffer back to the initial bird occurrence table.

In [None]:
# merge bird occurrences with crop cover stats at the buffer level
bird_data = pd.merge(eBird_sample, df1, left_on=['SAMPLING EVENT IDENTIFIER'], right_on=['SAMPLING EVENT IDENTIFIER'])
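
Since several observations can share one sampling event, this is a many-to-one join. The sketch below shows one way such a join could be checked, using small hypothetical tables; the `how="left"` and `validate="many_to_one"` arguments and the dropping of the duplicated `LOCALITY` column are illustrative choices, not what the cell above does.

```python
import pandas as pd

# hypothetical observations: two species recorded in sampling event S1
obs = pd.DataFrame({
    "SAMPLING EVENT IDENTIFIER": ["S1", "S1", "S2"],
    "COMMON NAME": ["Species A", "Species B", "Species A"],
    "LOCALITY": ["Site 1", "Site 1", "Site 2"],
})

# hypothetical buffer table indexed by sampling event, one row per event
buffers = pd.DataFrame(
    {"LOCALITY": ["Site 1", "Site 2"], "area_buff": [314.1, 313.8]},
    index=pd.Index(["S1", "S2"], name="SAMPLING EVENT IDENTIFIER"),
)

merged = pd.merge(
    obs,
    # drop columns already present in `obs` to avoid _x/_y suffixes,
    # and turn the index into an ordinary key column
    buffers.drop(columns=["LOCALITY"]).reset_index(),
    on="SAMPLING EVENT IDENTIFIER",
    how="left",               # keep every observation
    validate="many_to_one",   # raise if a sampling event appears twice in `buffers`
)
print(merged)
```

With these settings the merged table keeps one row per observation, and pandas raises an error if the buffer table unexpectedly contains duplicate sampling events.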


In [None]:
bird_data

In [None]:
# save merged data to csv
out_file = '../process_data/eBird_sel.csv'
bird_data.to_csv(out_file)