# CCFRP data: convert grid-level data to DwC

Draws on code from an earlier conversion draft: CCFRP_conversion.ipynb

### Resources
- https://dwc.tdwg.org/terms/
- https://tools.gbif.org/dwca-validator/extension.do?id=dwc:Occurrence
- https://www.gbif.org/data-quality-requirements-occurrences

### Preprocessing
CCFRP data were originally shared as an Excel file (CPUE_IDcell_Summary_Tables_2019.xls) with multiple sheets. Each sheet was saved as a .csv:
1. CPUE.per.IDcell_2019 --> Grid-level_CPUE.csv
2. Counts.per.IDcell_2019 --> Grid-level_Counts.csv

This could be automated if desired.
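
One way the export could be automated is sketched below. It assumes the sheets have already been read into a dict of DataFrames, for example with `pd.read_excel(..., sheet_name=None)`, which needs an Excel engine such as xlrd for .xls files. The function name `export_sheets` and the toy data are hypothetical and only illustrate the sheet-to-file mapping listed above.

```python
from pathlib import Path
import tempfile

import pandas as pd

# Sheet name -> output file name, as listed above
SHEET_TO_CSV = {
    'CPUE.per.IDcell_2019': 'Grid-level_CPUE.csv',
    'Counts.per.IDcell_2019': 'Grid-level_Counts.csv',
}

def export_sheets(sheets, out_dir, sheet_to_csv=SHEET_TO_CSV):
    """Write each mapped sheet (a DataFrame) to its .csv and return the paths written."""
    out_dir = Path(out_dir)
    written = []
    for sheet_name, csv_name in sheet_to_csv.items():
        target = out_dir / csv_name
        sheets[sheet_name].to_csv(target, index=False)
        written.append(target)
    return written

# With the real workbook (requires an Excel engine such as xlrd):
# sheets = pd.read_excel('CPUE_IDcell_Summary_Tables_2019.xls', sheet_name=None)

# Toy stand-in so the sketch runs without the workbook
demo_sheets = {name: pd.DataFrame({'ID.Cell.per.Trip': ['AIM09181901']}) for name in SHEET_TO_CSV}
demo_dir = tempfile.mkdtemp()
written_paths = export_sheets(demo_sheets, demo_dir)
```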


In [1]:
## Imports

import pandas as pd
import numpy as np
import random

from datetime import datetime # for handling dates

In [2]:
## Ensure my general functions for the MPA data integration project can be imported, and import them

import sys
sys.path.insert(0, "C:\\Users\\dianalg\\PycharmProjects\\PythonScripts\\MPA data integration")

import WoRMS # functions for querying WoRMS REST API

### Load data

In [3]:
## Load grid-level count data

path = 'C:\\Users\\dianalg\\Documents\\Work\\MBARI\\MPA Data Integration\\CCFRP\\'
filename = 'Grid-level_Counts.csv'
data = pd.read_csv(path+filename)

data.head()

Unnamed: 0,ID.Cell.per.Trip,Date,Area,Site,Year,Total.Angler.Hours,Grid.Cell.ID,Lat Center Point,Lon Center Point,Lat 1,...,Vermilion Rockfish,White Croaker,White Seabass,Widow Rockfish,Wolf Eel,Yelloweye Rockfish,Yellowfin Croaker,Yellowtail Jack,Yellowtail Rockfish,Total
0,AIM09181901,9/18/2019,Anacapa Island,MPA,2019,5.75,AI01,34.0215,-119.3668,34.0189,...,0,0,0,0,0,0,0,0,0,123
1,AIM09191901,9/19/2019,Anacapa Island,MPA,2019,6.033333,AI01,34.0215,-119.3668,34.0189,...,0,0,0,0,0,0,0,0,0,121
2,AIM10251701,10/25/2017,Anacapa Island,MPA,2017,7.329722,AI01,34.0215,-119.3668,34.0189,...,0,0,0,0,0,0,0,0,0,161
3,AIM10291801,10/29/2018,Anacapa Island,MPA,2018,4.416667,AI01,34.0215,-119.3668,34.0189,...,0,0,0,0,0,0,0,0,0,33
4,AIM10181802,10/18/2018,Anacapa Island,MPA,2018,4.916667,AI02,34.0204,-119.3723,34.022,...,0,0,0,0,0,0,0,0,0,59


In [4]:
## Load scientific names

path = 'C:\\Users\\dianalg\\PycharmProjects\\PythonScripts\\MPA data integration\\CCFRP\\'
filename = 'CCFRP_common_to_scientific.csv'
species = pd.read_csv(path+filename)

species.head()

Unnamed: 0,common_names,scientific_names
0,Bigmouth Sole,Hippoglossina stomata
1,Longfin Sanddab,Citharichthys xanthostigma
2,Pacific Halibut,Hippoglossus stenolepis
3,Pelagic Stingray,Pteroplatytrygon violacea
4,Northern Anchovy,Engraulis mordax


In [5]:
## Load CPUE data

path = 'C:\\Users\\dianalg\\Documents\\Work\\MBARI\\MPA Data Integration\\CCFRP\\'
filename = 'Grid-level_CPUE.csv'
cpue = pd.read_csv(path+filename)

cpue.head()

Unnamed: 0,ID.Cell.per.Trip,Date,Area,Site,Year,Total.Angler.Hours,Grid.Cell.ID,Lat Center Point,Lon Center Point,Lat 1,...,Vermilion Rockfish,White Croaker,White Seabass,Widow Rockfish,Wolf Eel,Yelloweye Rockfish,Yellowfin Croaker,Yellowtail Jack,Yellowtail Rockfish,Total
0,AIM09181901,9/18/2019,Anacapa Island,MPA,2019,5.75,AI01,34.0215,-119.3668,34.0189,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,21.391304
1,AIM09191901,9/19/2019,Anacapa Island,MPA,2019,6.033333,AI01,34.0215,-119.3668,34.0189,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,20.055249
2,AIM10251701,10/25/2017,Anacapa Island,MPA,2017,7.329722,AI01,34.0215,-119.3668,34.0189,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,21.965362
3,AIM10291801,10/29/2018,Anacapa Island,MPA,2018,4.416667,AI01,34.0215,-119.3668,34.0189,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.471698
4,AIM10181802,10/18/2018,Anacapa Island,MPA,2018,4.916667,AI02,34.0204,-119.3723,34.022,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,12.0


### Convert from wide to long format

In [6]:
## Drop Total column

data.drop('Total', axis=1, inplace=True)

In [7]:
## Melt data

data_long = pd.melt(data, id_vars=data.columns[0:17].tolist(), var_name='species_common_name', value_name='count')
data_long.head()

Unnamed: 0,ID.Cell.per.Trip,Date,Area,Site,Year,Total.Angler.Hours,Grid.Cell.ID,Lat Center Point,Lon Center Point,Lat 1,Lon 1,Lat 2,Lon 2,Lat 3,Lon 3,Lat 4,Lon 4,species_common_name,count
0,AIM09181901,9/18/2019,Anacapa Island,MPA,2019,5.75,AI01,34.0215,-119.3668,34.0189,-119.3689,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,Barred Sand Bass,0
1,AIM09191901,9/19/2019,Anacapa Island,MPA,2019,6.033333,AI01,34.0215,-119.3668,34.0189,-119.3689,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,Barred Sand Bass,0
2,AIM10251701,10/25/2017,Anacapa Island,MPA,2017,7.329722,AI01,34.0215,-119.3668,34.0189,-119.3689,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,Barred Sand Bass,0
3,AIM10291801,10/29/2018,Anacapa Island,MPA,2018,4.416667,AI01,34.0215,-119.3668,34.0189,-119.3689,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,Barred Sand Bass,0
4,AIM10181802,10/18/2018,Anacapa Island,MPA,2018,4.916667,AI02,34.0204,-119.3723,34.022,-119.3757,34.0232,-119.3705,34.0189,-119.369,34.0176,-119.3742,Barred Sand Bass,0
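
The melt above selects the identifier columns by position (`data.columns[0:17]`), which depends on the column order of the input file. A minimal sketch on a toy table, naming the identifier columns explicitly and checking the resulting row count, might look like this. The toy values and the reduced set of columns are illustrative only.

```python
import pandas as pd

# Toy wide table: two id columns plus two species columns (values are illustrative)
wide = pd.DataFrame({
    'ID.Cell.per.Trip': ['AIM09181901', 'AIM09191901'],
    'Grid.Cell.ID': ['AI01', 'AI01'],
    'Barred Sand Bass': [0, 0],
    'Vermilion Rockfish': [2, 5],
})

id_cols = ['ID.Cell.per.Trip', 'Grid.Cell.ID']   # named explicitly instead of by position
species_cols = [c for c in wide.columns if c not in id_cols]

long = pd.melt(wide, id_vars=id_cols, var_name='species_common_name', value_name='count')

# Each input row should appear once per species column
assert len(long) == len(wide) * len(species_cols)
```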


### Join to obtain scientific names

In [8]:
## Merge

data_long = data_long.merge(species, how='left', left_on='species_common_name', right_on='common_names')
data_long.head()

Unnamed: 0,ID.Cell.per.Trip,Date,Area,Site,Year,Total.Angler.Hours,Grid.Cell.ID,Lat Center Point,Lon Center Point,Lat 1,...,Lat 2,Lon 2,Lat 3,Lon 3,Lat 4,Lon 4,species_common_name,count,common_names,scientific_names
0,AIM09181901,9/18/2019,Anacapa Island,MPA,2019,5.75,AI01,34.0215,-119.3668,34.0189,...,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,Barred Sand Bass,0,Barred Sand Bass,Paralabrax nebulifer
1,AIM09191901,9/19/2019,Anacapa Island,MPA,2019,6.033333,AI01,34.0215,-119.3668,34.0189,...,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,Barred Sand Bass,0,Barred Sand Bass,Paralabrax nebulifer
2,AIM10251701,10/25/2017,Anacapa Island,MPA,2017,7.329722,AI01,34.0215,-119.3668,34.0189,...,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,Barred Sand Bass,0,Barred Sand Bass,Paralabrax nebulifer
3,AIM10291801,10/29/2018,Anacapa Island,MPA,2018,4.416667,AI01,34.0215,-119.3668,34.0189,...,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,Barred Sand Bass,0,Barred Sand Bass,Paralabrax nebulifer
4,AIM10181802,10/18/2018,Anacapa Island,MPA,2018,4.916667,AI02,34.0204,-119.3723,34.022,...,34.0232,-119.3705,34.0189,-119.369,34.0176,-119.3742,Barred Sand Bass,0,Barred Sand Bass,Paralabrax nebulifer


In [9]:
## Drop unnecessary columns

data_long.drop(['species_common_name', 'common_names'], axis=1, inplace=True)
data_long.head()

Unnamed: 0,ID.Cell.per.Trip,Date,Area,Site,Year,Total.Angler.Hours,Grid.Cell.ID,Lat Center Point,Lon Center Point,Lat 1,Lon 1,Lat 2,Lon 2,Lat 3,Lon 3,Lat 4,Lon 4,count,scientific_names
0,AIM09181901,9/18/2019,Anacapa Island,MPA,2019,5.75,AI01,34.0215,-119.3668,34.0189,-119.3689,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,0,Paralabrax nebulifer
1,AIM09191901,9/19/2019,Anacapa Island,MPA,2019,6.033333,AI01,34.0215,-119.3668,34.0189,-119.3689,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,0,Paralabrax nebulifer
2,AIM10251701,10/25/2017,Anacapa Island,MPA,2017,7.329722,AI01,34.0215,-119.3668,34.0189,-119.3689,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,0,Paralabrax nebulifer
3,AIM10291801,10/29/2018,Anacapa Island,MPA,2018,4.416667,AI01,34.0215,-119.3668,34.0189,-119.3689,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,0,Paralabrax nebulifer
4,AIM10181802,10/18/2018,Anacapa Island,MPA,2018,4.916667,AI02,34.0204,-119.3723,34.022,-119.3757,34.0232,-119.3705,34.0189,-119.369,34.0176,-119.3742,0,Paralabrax nebulifer
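
Because the merge is a left join, any common name missing from the lookup table ends up with a NaN scientific name. A quick check like the one below, run on the merged frame before the name columns are dropped, would list those names. This is a sketch on toy rows; 'Unidentified Fish' is a made-up example of an unmatched name.

```python
import pandas as pd

long = pd.DataFrame({
    'species_common_name': ['Barred Sand Bass', 'Unidentified Fish'],
    'count': [0, 1],
})
lookup = pd.DataFrame({
    'common_names': ['Barred Sand Bass'],
    'scientific_names': ['Paralabrax nebulifer'],
})

# indicator=True adds a _merge column recording where each row found a match
merged = long.merge(lookup, how='left', left_on='species_common_name',
                    right_on='common_names', indicator=True)

# Rows present only on the left side had no match in the lookup table
unmatched = merged.loc[merged['_merge'] == 'left_only', 'species_common_name'].unique().tolist()
```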


### Assemble count data

In [10]:
### Build eventID and put it in a new data frame

eventID = data_long['ID.Cell.per.Trip']
converted = pd.DataFrame({'eventID':eventID})
converted.head()

Unnamed: 0,eventID
0,AIM09181901
1,AIM09191901
2,AIM10251701
3,AIM10291801
4,AIM10181802


In [11]:
## Format dates and add eventDate

eventDate = [datetime.strptime(dt, '%m/%d/%Y').date().isoformat() for dt in data_long['Date']]
converted['eventDate'] = eventDate
converted.head()

Unnamed: 0,eventID,eventDate
0,AIM09181901,2019-09-18
1,AIM09191901,2019-09-19
2,AIM10251701,2017-10-25
3,AIM10291801,2018-10-29
4,AIM10181802,2018-10-18
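
The same conversion can also be done without a Python-level loop. A minimal sketch using `pd.to_datetime` on a toy Series (the variable names are illustrative):

```python
import pandas as pd

dates = pd.Series(['9/18/2019', '10/25/2017'])

# Parse month/day/year strings, then format as ISO 8601 dates
event_date = pd.to_datetime(dates, format='%m/%d/%Y').dt.strftime('%Y-%m-%d')
```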


In [12]:
## Add locality and habitat

converted['locality'] = data_long['Area']
converted['habitat'] = data_long['Site']

# Change MPA and REF to something more interpretable
habitat_dict = {
    'REF':'fished area',
    'MPA':'marine protected area'
}
converted['habitat'] = converted['habitat'].replace(habitat_dict) # assign back rather than relying on an in-place change to a column
converted.head()

Unnamed: 0,eventID,eventDate,locality,habitat
0,AIM09181901,2019-09-18,Anacapa Island,marine protected area
1,AIM09191901,2019-09-19,Anacapa Island,marine protected area
2,AIM10251701,2017-10-25,Anacapa Island,marine protected area
3,AIM10291801,2018-10-29,Anacapa Island,marine protected area
4,AIM10181802,2018-10-18,Anacapa Island,marine protected area


In [13]:
%%time

## Add the bounding box of the grid cell as footprintWKT

bb = ['POLYGON ((' + str(data_long['Lat 3'].iloc[i]) + ' ' + str(data_long['Lon 3'].iloc[i]) + ', ' + \
        str(data_long['Lat 1'].iloc[i]) + ' ' + str(data_long['Lon 1'].iloc[i]) + ', ' + \
        str(data_long['Lat 2'].iloc[i]) + ' ' + str(data_long['Lon 2'].iloc[i]) + ', ' + \
        str(data_long['Lat 4'].iloc[i]) + ' ' + str(data_long['Lon 4'].iloc[i]) + ', ' + \
        str(data_long['Lat 3'].iloc[i]) + ' ' + str(data_long['Lon 3'].iloc[i]) + '))' for i in range(data_long.shape[0])]
converted['footprintWKT'] = bb
converted.head()

Wall time: 16.8 s


Unnamed: 0,eventID,eventDate,locality,habitat,footprintWKT
0,AIM09181901,2019-09-18,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689..."
1,AIM09191901,2019-09-19,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689..."
2,AIM10251701,2017-10-25,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689..."
3,AIM10291801,2018-10-29,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689..."
4,AIM10181802,2018-10-18,Anacapa Island,marine protected area,"POLYGON ((34.0189 -119.369, 34.022 -119.3757, ..."
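
Two things are worth checking in the strings above. WKT conventionally lists x before y, that is longitude before latitude, and a polygon ring should visit the corners in order around the perimeter. In the two cells shown earlier, corners 1 to 4 appear to trace the perimeter, whereas the order 3, 1, 2, 4 would cross the cell diagonally. The sketch below builds the strings column-wise with longitude first. The default corner order is an assumption based on those two cells only and should be verified against the full dataset; `footprint_wkt` is a hypothetical helper name.

```python
import pandas as pd

def footprint_wkt(df, corner_order=(1, 2, 3, 4)):
    """Build 'POLYGON ((lon lat, ...))' strings, closing the ring on the first corner."""
    ring = list(corner_order) + [corner_order[0]]
    points = [df[f'Lon {k}'].astype(str) + ' ' + df[f'Lat {k}'].astype(str) for k in ring]
    body = points[0]
    for point in points[1:]:
        body = body + ', ' + point
    return 'POLYGON ((' + body + '))'

# Corner coordinates of grid cell AI01, copied from the table above
cells = pd.DataFrame({
    'Lat 1': [34.0189], 'Lon 1': [-119.3689],
    'Lat 2': [34.0233], 'Lon 2': [-119.37],
    'Lat 3': [34.0242], 'Lon 3': [-119.3647],
    'Lat 4': [34.0197], 'Lon 4': [-119.3636],
})
wkt = footprint_wkt(cells)
```

Working on whole columns instead of indexing with `.iloc[i]` row by row should also shorten the run time considerably.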


In [14]:
## Add decimal latitude and decimal longitude

converted['decimalLatitude'] = data_long['Lat Center Point']
converted['decimalLongitude'] = data_long['Lon Center Point']
converted.head()

Unnamed: 0,eventID,eventDate,locality,habitat,footprintWKT,decimalLatitude,decimalLongitude
0,AIM09181901,2019-09-18,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668
1,AIM09191901,2019-09-19,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668
2,AIM10251701,2017-10-25,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668
3,AIM10291801,2018-10-29,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668
4,AIM10181802,2018-10-18,Anacapa Island,marine protected area,"POLYGON ((34.0189 -119.369, 34.022 -119.3757, ...",34.0204,-119.3723


In [15]:
## Add coordinateUncertaintyInMeters

converted['coordinateUncertaintyInMeters'] = 354 # Change to 361 if a GPS error of +- 5 m is added
converted.head()

Unnamed: 0,eventID,eventDate,locality,habitat,footprintWKT,decimalLatitude,decimalLongitude,coordinateUncertaintyInMeters
0,AIM09181901,2019-09-18,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354
1,AIM09191901,2019-09-19,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354
2,AIM10251701,2017-10-25,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354
3,AIM10291801,2018-10-29,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354
4,AIM10181802,2018-10-18,Anacapa Island,marine protected area,"POLYGON ((34.0189 -119.369, 34.022 -119.3757, ...",34.0204,-119.3723,354
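
The value 354 is consistent with the distance from the center of a square cell to one of its corners. Assuming 500 m x 500 m grid cells (an assumption; the cell size is not stated in this notebook), the half-diagonal is about 353.6 m, and padding each half-side by 5 m of GPS error gives about 360.6 m. The arithmetic can be sketched as follows, with `half_diagonal_m` as a hypothetical helper:

```python
import math

def half_diagonal_m(side_m, gps_error_m=0.0):
    """Center-to-corner distance of a square cell, optionally padding each half-side by a GPS error."""
    half_side = side_m / 2 + gps_error_m
    return math.hypot(half_side, half_side)

cell_side_m = 500  # assumed cell size, not taken from the data

uncertainty = math.ceil(half_diagonal_m(cell_side_m))           # 354
uncertainty_gps = math.ceil(half_diagonal_m(cell_side_m, 5.0))  # 361
```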


#### Access WoRMS to add species information

In [16]:
## Get unique scientific names, removing NaNs

sci_names = data_long['scientific_names'].dropna().unique()

In [17]:
## Replace NaN values in scientific_names with Teleostei

data_long.loc[data_long['scientific_names'].isnull(), 'scientific_names'] = 'Teleostei' # only overwrite the name column, not the whole row

In [18]:
%%time

## Call run_get_worms_from_scientific_name

name_id_dict, name_name_dict, name_taxid_dict = WoRMS.run_get_worms_from_scientific_name(sci_names)

Url didn't work for Chromis punctipinnus checking:  Chromis
Url didn't work for Sebastes serranoides or flavidus checking:  Sebastes
Wall time: 59.5 s
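
The two messages above suggest that when a full name is not found, the lookup is retried with the genus alone. The WoRMS module itself is not shown in this notebook, so the following is only a sketch of that fallback pattern. `lookup_with_genus_fallback` and `fake_lookup` are hypothetical, and a real version would query the WoRMS REST API rather than a dict.

```python
def lookup_with_genus_fallback(name, lookup):
    """Return (matched_name, record); retry with the first word (genus) if the full name fails."""
    record = lookup(name)
    if record is not None:
        return name, record
    genus = name.split()[0]
    return genus, lookup(genus)

# Stand-in for a taxonomic service. 282059 is the ID shown for Paralabrax nebulifer
# in this notebook; the Sebastes value is a placeholder, not a real AphiaID.
fake_records = {'Paralabrax nebulifer': 282059, 'Sebastes': -1}
fake_lookup = fake_records.get

matched = {name: lookup_with_genus_fallback(name, fake_lookup)
           for name in ['Paralabrax nebulifer', 'Sebastes serranoides or flavidus']}
```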


In [19]:
## Add scientific name-related columns

converted['scientificName'] = data_long['scientific_names']

# Map names to identifiers; assign the result back instead of replacing in place on a column
converted['scientificNameID'] = data_long['scientific_names'].replace(name_id_dict)

converted['taxonID'] = data_long['scientific_names'].replace(name_taxid_dict)
converted.head()

Unnamed: 0,eventID,eventDate,locality,habitat,footprintWKT,decimalLatitude,decimalLongitude,coordinateUncertaintyInMeters,scientificName,scientificNameID,taxonID
0,AIM09181901,2019-09-18,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059
1,AIM09191901,2019-09-19,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059
2,AIM10251701,2017-10-25,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059
3,AIM10291801,2018-10-29,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059
4,AIM10181802,2018-10-18,Anacapa Island,marine protected area,"POLYGON ((34.0189 -119.369, 34.022 -119.3757, ...",34.0204,-119.3723,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059


In [20]:
## Create occurrenceRemarks to handle Sebastes serranoides or flavidus species name

occurrenceRemarks = ['Sebastes serranoides or Sebastes flavidus' if name == 'Sebastes serranoides or flavidus' else np.nan for name in converted['scientificName']]

In [21]:
## Replace scientificName using name_name_dict

converted['scientificName'] = converted['scientificName'].replace(name_name_dict)

In [22]:
## Add final name-related columns

converted['nameAccordingTo'] = 'WoRMS'
converted['occurrenceStatus'] = 'present'
converted['basisOfRecord'] = 'HumanObservation'
converted['occurrenceRemarks'] = occurrenceRemarks

converted.head()

Unnamed: 0,eventID,eventDate,locality,habitat,footprintWKT,decimalLatitude,decimalLongitude,coordinateUncertaintyInMeters,scientificName,scientificNameID,taxonID,nameAccordingTo,occurrenceStatus,basisOfRecord,occurrenceRemarks
0,AIM09181901,2019-09-18,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,present,HumanObservation,
1,AIM09191901,2019-09-19,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,present,HumanObservation,
2,AIM10251701,2017-10-25,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,present,HumanObservation,
3,AIM10291801,2018-10-29,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,present,HumanObservation,
4,AIM10181802,2018-10-18,Anacapa Island,marine protected area,"POLYGON ((34.0189 -119.369, 34.022 -119.3757, ...",34.0204,-119.3723,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,present,HumanObservation,


#### Add count data

In [23]:
## Add count data

converted['individualCount'] = data_long['count']
converted.head()

Unnamed: 0,eventID,eventDate,locality,habitat,footprintWKT,decimalLatitude,decimalLongitude,coordinateUncertaintyInMeters,scientificName,scientificNameID,taxonID,nameAccordingTo,occurrenceStatus,basisOfRecord,occurrenceRemarks,individualCount
0,AIM09181901,2019-09-18,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,present,HumanObservation,,0
1,AIM09191901,2019-09-19,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,present,HumanObservation,,0
2,AIM10251701,2017-10-25,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,present,HumanObservation,,0
3,AIM10291801,2018-10-29,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,present,HumanObservation,,0
4,AIM10181802,2018-10-18,Anacapa Island,marine protected area,"POLYGON ((34.0189 -119.369, 34.022 -119.3757, ...",34.0204,-119.3723,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,present,HumanObservation,,0


In [24]:
## Update occurrenceStatus based on count

converted.loc[converted['individualCount'] == 0, ['occurrenceStatus']] = 'absent'

#### Add CPUE data

In [25]:
## Perform initial processing steps and convert to long-form

# Drop species 'Total'
cpue.drop('Total', axis=1, inplace=True)

# Melt data, taking the identifier columns from cpue itself
cpue_long = pd.melt(cpue, id_vars=cpue.columns[0:17].tolist(), var_name='species_common_name', value_name='cpue')
cpue_long.head()

Unnamed: 0,ID.Cell.per.Trip,Date,Area,Site,Year,Total.Angler.Hours,Grid.Cell.ID,Lat Center Point,Lon Center Point,Lat 1,Lon 1,Lat 2,Lon 2,Lat 3,Lon 3,Lat 4,Lon 4,species_common_name,cpue
0,AIM09181901,9/18/2019,Anacapa Island,MPA,2019,5.75,AI01,34.0215,-119.3668,34.0189,-119.3689,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,Barred Sand Bass,0.0
1,AIM09191901,9/19/2019,Anacapa Island,MPA,2019,6.033333,AI01,34.0215,-119.3668,34.0189,-119.3689,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,Barred Sand Bass,0.0
2,AIM10251701,10/25/2017,Anacapa Island,MPA,2017,7.329722,AI01,34.0215,-119.3668,34.0189,-119.3689,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,Barred Sand Bass,0.0
3,AIM10291801,10/29/2018,Anacapa Island,MPA,2018,4.416667,AI01,34.0215,-119.3668,34.0189,-119.3689,34.0233,-119.37,34.0242,-119.3647,34.0197,-119.3636,Barred Sand Bass,0.0
4,AIM10181802,10/18/2018,Anacapa Island,MPA,2018,4.916667,AI02,34.0204,-119.3723,34.022,-119.3757,34.0232,-119.3705,34.0189,-119.369,34.0176,-119.3742,Barred Sand Bass,0.0


In [26]:
## Add CPUE data to converted

converted['measurementType'] = 'catch per unit effort'
converted['measurementValue'] = cpue_long['cpue']
converted['measurementUnit'] = 'number of fish per angler hour'

converted.head()

Unnamed: 0,eventID,eventDate,locality,habitat,footprintWKT,decimalLatitude,decimalLongitude,coordinateUncertaintyInMeters,scientificName,scientificNameID,taxonID,nameAccordingTo,occurrenceStatus,basisOfRecord,occurrenceRemarks,individualCount,measurementType,measurementValue,measurementUnit
0,AIM09181901,2019-09-18,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,absent,HumanObservation,,0,catch per unit effort,0.0,number of fish per angler hour
1,AIM09191901,2019-09-19,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,absent,HumanObservation,,0,catch per unit effort,0.0,number of fish per angler hour
2,AIM10251701,2017-10-25,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,absent,HumanObservation,,0,catch per unit effort,0.0,number of fish per angler hour
3,AIM10291801,2018-10-29,Anacapa Island,marine protected area,"POLYGON ((34.0242 -119.3647, 34.0189 -119.3689...",34.0215,-119.3668,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,absent,HumanObservation,,0,catch per unit effort,0.0,number of fish per angler hour
4,AIM10181802,2018-10-18,Anacapa Island,marine protected area,"POLYGON ((34.0189 -119.369, 34.022 -119.3757, ...",34.0204,-119.3723,354,Paralabrax nebulifer,urn:lsid:marinespecies.org:taxname:282059,282059,WoRMS,absent,HumanObservation,,0,catch per unit effort,0.0,number of fish per angler hour
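
The assignment of `cpue_long['cpue']` relies on the CPUE table having the same row order as the count table. The Total columns shown earlier suggest that CPUE is the count divided by Total.Angler.Hours (for example, 123 / 5.75 ≈ 21.391304), so a key-based merge followed by a recomputation is one way to check alignment. The sketch below uses toy rows copied from those Total columns; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

keys = ['ID.Cell.per.Trip', 'species_common_name']

counts_long = pd.DataFrame({
    'ID.Cell.per.Trip': ['AIM09181901', 'AIM09191901'],
    'species_common_name': ['Total', 'Total'],
    'Total.Angler.Hours': [5.75, 6.033333],
    'count': [123, 121],
})
cpue_long = pd.DataFrame({
    'ID.Cell.per.Trip': ['AIM09191901', 'AIM09181901'],   # deliberately in a different order
    'species_common_name': ['Total', 'Total'],
    'cpue': [20.055249, 21.391304],
})

# Join on keys rather than row position; validate guards against duplicate keys
check = counts_long.merge(cpue_long, on=keys, how='left', validate='one_to_one')

# Recompute CPUE from counts and effort, then compare with the supplied values
recomputed = check['count'] / check['Total.Angler.Hours']
aligned = np.allclose(recomputed, check['cpue'], rtol=1e-5)
```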


### Save

In [27]:
## Save

converted.to_csv('CCFRP_grid-level_converted.csv', index=False, na_rep='NaN')

### Remaining questions:

1. Why are the center and corner coordinates different within grid cells? **It's an accident - Rachel sent updated data which are now incorporated**
2. Are coordinates GPS estimates? If so, update coordinateUncertaintyInMeters. **Nope, they're not.**