# 07 - Calculate daily food intake by birds

#### Description

The goal of this notebook is to calculate the Daily Food Intake (DFI) by birds.

It implements the following steps:

- determine the Daily Energy Expenditure (DEE) for each bird species
- determine the Daily Food Intake (DFI)

#### Inputs

- table with crop fraction for which birds provide service, created in step 06: `eBird_area_data_percent-step3.csv`
- table moisture and energy content of invertebrates, compiled from Crocker et al. (2002)
- table with assimilation efficiency of birds, compiled from Crocker et al. (2002)
   

#### Outputs

- table with DFI and consumption at the nesting/growing season

## 1. Calculate bird Daily Food Intake (DFI)

To calculate DFI, we will use the following expression (Crocker et al, 2002):
```
Daily foor intake (g) = DEE (kj) / Energy in Food (kj/g) x (1 - Moisture) x Assimilation Efficiency
```
where DEE is the Daily Energy Expenditure, which is calculated by the following:
```
Log(DEE) = Log a + b × (log Body weight).
loga = 1.0220
b = 0.6745
```


### 1.1 Prepare tables

The determination of DFI is done in steps:

- load data from csv files, to save time (avoid repeating steps 1 to 4)
- calculate DEE
- calculate assimilation efficiency
- prepare tables, to gain efficiency
    - explode crop data, to create simple tables for temporary and for permanent crops.

In [1]:
# import modules
import pandas as pd
import numpy as np
from pandas import DataFrame
import geopandas as gpd
from pyproj import Proj, CRS,transform
import matplotlib.pyplot as plt

Read data with the fraction that each crop represents in the buffer area:

In [2]:
# read table of occurrences containing crop fraction in the buffer

bird_data_name = pd.read_csv('../process_data/eBird_area_data_percent-step3.csv', low_memory = False)

In [3]:
# increase the numebr of columns that display
pd.set_option('display.max_columns', None)

In [4]:
# function to calculate DEE, uses values for All Birds in DEFRA

def calc_dee(mass):
    intercept = 1.0220
    coef = 0.6745
    logDEE = intercept + (coef * np.log10(mass))
    dee = np.power(logDEE, 10)
    return dee
    

Add Daily Energy Expenditure (DEE) for each species to the occurrence table:

In [5]:
# add DEE to the bird occurrence table
bird_data_name['DEE'] = bird_data_name.apply(lambda x: calc_dee(x['Mass']), axis=1)

We will read a table containing the moisture percentage, energy and fresh weight of invertebrates. This table was extracted from Crocker et al. (2002).

In [6]:
# load energy and moisture table 
energy_moist = pd.read_csv('../external_data/DEFRA/energy_moisture.csv')
energy_moist.head()

Unnamed: 0,ID,pest_taxa,eppocode_order,Energy_food_kj_g,moisture_perc,fresh_weight_kj_g,ref_source
0,1,Coleoptera,1COLEO,21.9,70.5,6.4605,Arthropods
1,2,Lepidoptera,1LEPIO,21.7,79.4,4.4702,Caterpillars
2,3,Isoptera,1ISOPO,21.9,70.5,6.4605,Arthropods
3,4,Orthoptera,1ORTHO,21.9,70.5,6.4605,Arthropods
4,5,Odonata,1ODONO,21.9,70.5,6.4605,Arthropods


We will read a table containing the assimilation efficiency of several bird orders. This table was extracted from Crocker et al. (2002).

In [7]:
# load assimilation table
assim_eff = pd.read_csv('../external_data/DEFRA/assimilation_efficienc.csv')
assim_eff.head()

Unnamed: 0,order,assim_eff,note
0,Struthioniformes,0.76,median
1,Gruiformes,0.34,
2,Ralliformes,0.76,median
3,Charadriiformes,0.69,
4,Lariformes,0.79,


Combine both tables, in order to combinate of energy and moisture content of invertebrates with assimilation efficiency of birds.

In [8]:
# merge energy table with assimilation table
energy_food = pd.merge(energy_moist, assim_eff, how='cross')

# calculate energy in food
energy_food['energy_food'] = energy_food['fresh_weight_kj_g'] * energy_food['assim_eff']
energy_food

Unnamed: 0,ID,pest_taxa,eppocode_order,Energy_food_kj_g,moisture_perc,fresh_weight_kj_g,ref_source,order,assim_eff,note,energy_food
0,1,Coleoptera,1COLEO,21.9,70.5,6.4605,Arthropods,Struthioniformes,0.76,median,4.909980
1,1,Coleoptera,1COLEO,21.9,70.5,6.4605,Arthropods,Gruiformes,0.34,,2.196570
2,1,Coleoptera,1COLEO,21.9,70.5,6.4605,Arthropods,Ralliformes,0.76,median,4.909980
3,1,Coleoptera,1COLEO,21.9,70.5,6.4605,Arthropods,Charadriiformes,0.69,,4.457745
4,1,Coleoptera,1COLEO,21.9,70.5,6.4605,Arthropods,Lariformes,0.79,,5.103795
...,...,...,...,...,...,...,...,...,...,...,...
677,34,Acarida,1ACARO,21.9,70.5,6.4605,Arthropods,Opisthocomiformes,0.76,median,4.909980
678,34,Acarida,1ACARO,21.9,70.5,6.4605,Arthropods,Trochiliformes,0.76,median,4.909980
679,34,Acarida,1ACARO,21.9,70.5,6.4605,Arthropods,Coliiformes,0.76,median,4.909980
680,34,Acarida,1ACARO,21.9,70.5,6.4605,Arthropods,Piciformes,0.64,,4.134720


In [9]:
# rename columns, to facilitate merge
energy_food.rename(columns={"eppocode_order": "eppo_pest", "order": "Order2"}, inplace=True)

In [10]:
%%time
# ensure that columns of crop_pest are lists. Can take 1 min to run.

import ast

bird_data_name['consum_t_ord'] = bird_data_name['consum_t_ord'].apply(ast.literal_eval)
bird_data_name['consum_p_ord'] = bird_data_name['consum_p_ord'].apply(ast.literal_eval) 

CPU times: user 2min 6s, sys: 1.87 s, total: 2min 7s
Wall time: 2min 8s


In [11]:
bird_data_name.columns



Index(['Unnamed: 0.2', 'Unnamed: 0.1', 'Unnamed: 0',
       'GLOBAL UNIQUE IDENTIFIER', 'LAST EDITED DATE', 'TAXONOMIC ORDER_x',
       'CATEGORY', 'TAXON CONCEPT ID_x', 'COMMON NAME', 'SCIENTIFIC NAME_x',
       ...
       'fract_c_e', 'fract_t_e', 'fract_p_e', 'consum_t', 'consum_t_ord',
       'consum_p', 'consum_p_ord', 'keep_t', 'keep_p', 'DEE'],
      dtype='object', length=136)

Next steps are to prepare table structure so that calculations of DFI can be done. This needs to be done separately for temporary and permanent crops. 

In [12]:
# explode the dataframe to create new rows, create table for temporary crops

table_df_t = bird_data_name[['GLOBAL UNIQUE IDENTIFIER', 'Order2', 'Avibase.ID2', 'SAMPLING EVENT IDENTIFIER', \
                             'SCIENTIFIC NAME_x', 'consum_t_ord', 'DEE', 'Annual_crops', 'Permanent_crops', \
                             'Proportion_invertebrates_diet']]
table_df_t = table_df_t.explode('consum_t_ord')


In [13]:
table_df_t

Unnamed: 0,GLOBAL UNIQUE IDENTIFIER,Order2,Avibase.ID2,SAMPLING EVENT IDENTIFIER,SCIENTIFIC NAME_x,consum_t_ord,DEE,Annual_crops,Permanent_crops,Proportion_invertebrates_diet
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,"[AVESA, 1COLEO]",194.086559,1.0,1.0,1.0
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,"[AVESA, 1LEPIO]",194.086559,1.0,1.0,1.0
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,"[AVESA, 1COLEO]",194.086559,1.0,1.0,1.0
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,"[AVESA, 1LEPIO]",194.086559,1.0,1.0,1.0
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,"[AVESA, 1LEPIO]",194.086559,1.0,1.0,1.0
...,...,...,...,...,...,...,...,...,...,...
78937,URN:CornellLabOfOrnithology:EBIRD:OBS394851198,Caprimulgiformes,AVIBASE-9690689D,S29187642,Chordeiles acutipennis,,2194.704516,1.0,1.0,1.0
78938,URN:CornellLabOfOrnithology:EBIRD:OBS394868806,Caprimulgiformes,AVIBASE-9690689D,S29188777,Chordeiles acutipennis,,2194.704516,1.0,1.0,1.0
78939,URN:CornellLabOfOrnithology:EBIRD:OBS395381323,Caprimulgiformes,AVIBASE-9690689D,S29222010,Chordeiles acutipennis,,2194.704516,1.0,1.0,1.0
78940,URN:CornellLabOfOrnithology:EBIRD:OBS320781051,Caprimulgiformes,AVIBASE-9690689D,S23504887,Chordeiles acutipennis,,2194.704516,1.0,1.0,1.0


In [14]:
# funtion to split crops and pests
def split_values(pair):
    if type(pair) != list: # wrong case, returns nan
        a = np.nan
        b = np.nan
    else:
        a = pair[0]
        b = pair[1]
    return a, b

In [15]:
%%time
# split crops and pests in two columns, temporary crops. Can take 40 sec to run
table_df_t['eppo_crop'], table_df_t['eppo_pest'] = \
zip(*table_df_t.apply( lambda x: split_values(x['consum_t_ord']), axis=1))

CPU times: user 1min 6s, sys: 3.69 s, total: 1min 10s
Wall time: 1min 10s


In [16]:
# remove old column, to save memory space
table_df_t.drop(columns=['consum_t_ord'], inplace=True)
table_df_t

Unnamed: 0,GLOBAL UNIQUE IDENTIFIER,Order2,Avibase.ID2,SAMPLING EVENT IDENTIFIER,SCIENTIFIC NAME_x,DEE,Annual_crops,Permanent_crops,Proportion_invertebrates_diet,eppo_crop,eppo_pest
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,AVESA,1COLEO
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,AVESA,1LEPIO
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,AVESA,1COLEO
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,AVESA,1LEPIO
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,AVESA,1LEPIO
...,...,...,...,...,...,...,...,...,...,...,...
78937,URN:CornellLabOfOrnithology:EBIRD:OBS394851198,Caprimulgiformes,AVIBASE-9690689D,S29187642,Chordeiles acutipennis,2194.704516,1.0,1.0,1.0,,
78938,URN:CornellLabOfOrnithology:EBIRD:OBS394868806,Caprimulgiformes,AVIBASE-9690689D,S29188777,Chordeiles acutipennis,2194.704516,1.0,1.0,1.0,,
78939,URN:CornellLabOfOrnithology:EBIRD:OBS395381323,Caprimulgiformes,AVIBASE-9690689D,S29222010,Chordeiles acutipennis,2194.704516,1.0,1.0,1.0,,
78940,URN:CornellLabOfOrnithology:EBIRD:OBS320781051,Caprimulgiformes,AVIBASE-9690689D,S23504887,Chordeiles acutipennis,2194.704516,1.0,1.0,1.0,,


In [17]:
# explode the dataframe to create new rows, create table for permanent crops

table_df_p = bird_data_name[['GLOBAL UNIQUE IDENTIFIER', 'Order2', 'Avibase.ID2', 'SAMPLING EVENT IDENTIFIER', \
                             'SCIENTIFIC NAME_x', 'consum_p_ord', 'DEE', 'Annual_crops', 'Permanent_crops', \
                             'Proportion_invertebrates_diet']]
table_df_p = table_df_p.explode('consum_p_ord')


In [18]:
%%time
# split crops and pests in two columns, permanent crops
table_df_p['eppo_crop'], table_df_p['eppo_pest'] = \
zip(*table_df_p.apply( lambda x: split_values(x['consum_p_ord']), axis=1))

CPU times: user 41.6 s, sys: 1.87 s, total: 43.5 s
Wall time: 43.5 s


In [19]:
# remove old column, to save memory space
table_df_p.drop(columns=['consum_p_ord'], inplace=True)
table_df_p

Unnamed: 0,GLOBAL UNIQUE IDENTIFIER,Order2,Avibase.ID2,SAMPLING EVENT IDENTIFIER,SCIENTIFIC NAME_x,DEE,Annual_crops,Permanent_crops,Proportion_invertebrates_diet,eppo_crop,eppo_pest
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,VITSS,1COLEO
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,VITSS,1LEPIO
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,VITSS,1LEPIO
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,VITSS,1DIPTO
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,VITSS,1HEMIO
...,...,...,...,...,...,...,...,...,...,...,...
78937,URN:CornellLabOfOrnithology:EBIRD:OBS394851198,Caprimulgiformes,AVIBASE-9690689D,S29187642,Chordeiles acutipennis,2194.704516,1.0,1.0,1.0,,
78938,URN:CornellLabOfOrnithology:EBIRD:OBS394868806,Caprimulgiformes,AVIBASE-9690689D,S29188777,Chordeiles acutipennis,2194.704516,1.0,1.0,1.0,,
78939,URN:CornellLabOfOrnithology:EBIRD:OBS395381323,Caprimulgiformes,AVIBASE-9690689D,S29222010,Chordeiles acutipennis,2194.704516,1.0,1.0,1.0,,
78940,URN:CornellLabOfOrnithology:EBIRD:OBS320781051,Caprimulgiformes,AVIBASE-9690689D,S23504887,Chordeiles acutipennis,2194.704516,1.0,1.0,1.0,,


### 1.2. Calculate DFI

The DFI is the ratio between the DEE and energy in food, in this case, pests. We will calculate that for temporary and permanent crops. In the end, both values will combined, weighted by the proportion that each type of crop represents in the point buffer.

In [20]:
# merge tables with energy_food

table_df_t = pd.merge(table_df_t, energy_food, on=['Order2','eppo_pest'])
table_df_p = pd.merge(table_df_p, energy_food, on=['Order2','eppo_pest'])

In [21]:
# calculate DFI
table_df_t['DFI'] = table_df_t['DEE'] / table_df_t['energy_food']
table_df_p['DFI'] = table_df_p['DEE'] / table_df_p['energy_food']

In [22]:
table_df_p

Unnamed: 0,GLOBAL UNIQUE IDENTIFIER,Order2,Avibase.ID2,SAMPLING EVENT IDENTIFIER,SCIENTIFIC NAME_x,DEE,Annual_crops,Permanent_crops,Proportion_invertebrates_diet,eppo_crop,eppo_pest,ID,pest_taxa,Energy_food_kj_g,moisture_perc,fresh_weight_kj_g,ref_source,assim_eff,note,energy_food,DFI
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,VITSS,1COLEO,1,Coleoptera,21.9,70.5,6.4605,Arthropods,0.76,,4.909980,39.528992
1,URN:CornellLabOfOrnithology:EBIRD:OBS1446126023,Passeriformes,AVIBASE-151C2B3F,S111628640,Aphelocoma californica,4631.432930,0.0,1.0,0.4,VITSS,1COLEO,1,Coleoptera,21.9,70.5,6.4605,Arthropods,0.76,,4.909980,943.269205
2,URN:CornellLabOfOrnithology:EBIRD:OBS1446126013,Passeriformes,AVIBASE-69544B59,S111628640,Corvus brachyrhynchos,30783.241737,0.0,1.0,0.4,VITSS,1COLEO,1,Coleoptera,21.9,70.5,6.4605,Arthropods,0.76,,4.909980,6269.524873
3,URN:CornellLabOfOrnithology:EBIRD:OBS1446126017,Passeriformes,AVIBASE-58E25701,S111628640,Pipilo maculatus,1647.591120,0.0,1.0,0.4,VITSS,1COLEO,1,Coleoptera,21.9,70.5,6.4605,Arthropods,0.76,,4.909980,335.559640
4,URN:CornellLabOfOrnithology:EBIRD:OBS1449752065,Passeriformes,AVIBASE-603194D3,S111982653,Thryomanes bewickii,194.086559,1.0,1.0,1.0,VITSS,1COLEO,1,Coleoptera,21.9,70.5,6.4605,Arthropods,0.76,,4.909980,39.528992
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10939720,URN:CornellLabOfOrnithology:EBIRD:OBS390190243,Gruiformes,AVIBASE-2CC21529,S28870893,Porzana carolina,3896.376028,1.0,1.0,0.4,PRNPS,1RHABO,31,Rhabditida,19.3,84.6,2.9722,Soil invertebrates,0.34,,1.010548,3855.706041
10939721,URN:CornellLabOfOrnithology:EBIRD:OBS390190243,Gruiformes,AVIBASE-2CC21529,S28870893,Porzana carolina,3896.376028,1.0,1.0,0.4,CIDSS,1RHABO,31,Rhabditida,19.3,84.6,2.9722,Soil invertebrates,0.34,,1.010548,3855.706041
10939722,URN:CornellLabOfOrnithology:EBIRD:OBS390190243,Gruiformes,AVIBASE-2CC21529,S28870893,Porzana carolina,3896.376028,1.0,1.0,0.4,CIDSS,1RHABO,31,Rhabditida,19.3,84.6,2.9722,Soil invertebrates,0.34,,1.010548,3855.706041
10939723,URN:CornellLabOfOrnithology:EBIRD:OBS390190243,Gruiformes,AVIBASE-2CC21529,S28870893,Porzana carolina,3896.376028,1.0,1.0,0.4,CIDSS,1RHABO,31,Rhabditida,19.3,84.6,2.9722,Soil invertebrates,0.34,,1.010548,3855.706041


In [23]:
# add column with crop type

table_df_t['crop_type'] = 'temp'
table_df_p['crop_type'] = 'perm'

In [24]:
table_df_p.columns

Index(['GLOBAL UNIQUE IDENTIFIER', 'Order2', 'Avibase.ID2',
       'SAMPLING EVENT IDENTIFIER', 'SCIENTIFIC NAME_x', 'DEE', 'Annual_crops',
       'Permanent_crops', 'Proportion_invertebrates_diet', 'eppo_crop',
       'eppo_pest', 'ID', 'pest_taxa', 'Energy_food_kj_g', 'moisture_perc',
       'fresh_weight_kj_g', 'ref_source', 'assim_eff', 'note', 'energy_food',
       'DFI', 'crop_type'],
      dtype='object')

In [25]:
#concatenate the two tables
table_DFI = pd.concat([table_df_t, table_df_p])

In [26]:
# drop columns not needed anymore
table_DFI.drop(columns=['ID', 'Energy_food_kj_g', 'moisture_perc', 'fresh_weight_kj_g', 'assim_eff', 'energy_food'], inplace=True)


In [27]:
table_DFI = table_DFI.drop_duplicates()

In [28]:
table_DFI

Unnamed: 0,GLOBAL UNIQUE IDENTIFIER,Order2,Avibase.ID2,SAMPLING EVENT IDENTIFIER,SCIENTIFIC NAME_x,DEE,Annual_crops,Permanent_crops,Proportion_invertebrates_diet,eppo_crop,eppo_pest,pest_taxa,ref_source,note,DFI,crop_type
0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,AVESA,1COLEO,Coleoptera,Arthropods,,39.528992,temp
5,URN:CornellLabOfOrnithology:EBIRD:OBS1446126023,Passeriformes,AVIBASE-151C2B3F,S111628640,Aphelocoma californica,4631.432930,0.0,1.0,0.4,AVESA,1COLEO,Coleoptera,Arthropods,,943.269205,temp
10,URN:CornellLabOfOrnithology:EBIRD:OBS1446126013,Passeriformes,AVIBASE-69544B59,S111628640,Corvus brachyrhynchos,30783.241737,0.0,1.0,0.4,AVESA,1COLEO,Coleoptera,Arthropods,,6269.524873,temp
15,URN:CornellLabOfOrnithology:EBIRD:OBS1446126017,Passeriformes,AVIBASE-58E25701,S111628640,Pipilo maculatus,1647.591120,0.0,1.0,0.4,AVESA,1COLEO,Coleoptera,Arthropods,,335.559640,temp
20,URN:CornellLabOfOrnithology:EBIRD:OBS1449752065,Passeriformes,AVIBASE-603194D3,S111982653,Thryomanes bewickii,194.086559,1.0,1.0,1.0,AVESA,1COLEO,Coleoptera,Arthropods,,39.528992,temp
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10939715,URN:CornellLabOfOrnithology:EBIRD:OBS745674065,Gruiformes,AVIBASE-2CC21529,S55455800,Porzana carolina,3896.376028,1.0,1.0,0.4,OLVEU,1RHABO,Rhabditida,Soil invertebrates,,3855.706041,perm
10939717,URN:CornellLabOfOrnithology:EBIRD:OBS390190243,Gruiformes,AVIBASE-2CC21529,S28870893,Porzana carolina,3896.376028,1.0,1.0,0.4,PRNSS,1RHABO,Rhabditida,Soil invertebrates,,3855.706041,perm
10939718,URN:CornellLabOfOrnithology:EBIRD:OBS390190243,Gruiformes,AVIBASE-2CC21529,S28870893,Porzana carolina,3896.376028,1.0,1.0,0.4,PRNPS,1RHABO,Rhabditida,Soil invertebrates,,3855.706041,perm
10939719,URN:CornellLabOfOrnithology:EBIRD:OBS390190243,Gruiformes,AVIBASE-2CC21529,S28870893,Porzana carolina,3896.376028,1.0,1.0,0.4,VITSS,1RHABO,Rhabditida,Soil invertebrates,,3855.706041,perm


In [29]:
table_DFI.to_csv('../process_data/table_DFI_1.csv')

### 1.3. Determine how many pests each species eats, to determine fractions

Calculate number of references, i.e., how many pests can each species eat, per species list. Serves to determine the fraction of 
the diet that applies to each pest.

In [30]:
# read table that resulted from 1.1 and 1.2
table_DFI = pd.read_csv('../process_data/table_DFI_1.csv', low_memory=False)

In [31]:
# select columns to use for the analysis

df = table_DFI[['SCIENTIFIC NAME_x', 'Avibase.ID2', 'SAMPLING EVENT IDENTIFIER', 'eppo_pest']]

In [32]:
df2 = df.drop_duplicates()

In [33]:
# create a table with the count number of pests consumed by each bird species in each list
df2['count_eat_ref'] = df2.groupby(['SCIENTIFIC NAME_x', 'SAMPLING EVENT IDENTIFIER'])['eppo_pest'].transform('count')

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df2['count_eat_ref'] = df2.groupby(['SCIENTIFIC NAME_x', 'SAMPLING EVENT IDENTIFIER'])['eppo_pest'].transform('count')


In [34]:
# remove duplicates 
df3 = df2[['Avibase.ID2', 'SAMPLING EVENT IDENTIFIER', 'count_eat_ref']].drop_duplicates()


In [35]:
# merge with DFI table
table_DFI = pd.merge(table_DFI, df3, left_on=['Avibase.ID2', 'SAMPLING EVENT IDENTIFIER'], right_on=['Avibase.ID2', 'SAMPLING EVENT IDENTIFIER'])

In [36]:
# calculate fraction of consumption per pest, by dividing by the count of preyed pests
# Also considers the proportion of the diet
table_DFI['frac_DFI'] = table_DFI['DFI'] / table_DFI['count_eat_ref'] * table_DFI['Proportion_invertebrates_diet']

In [37]:
# calculate total consumption per season, by multiplying by the number of days of the season (01st Apr and 30th Jun)
table_DFI['consum_season'] = table_DFI['frac_DFI'] * 91

In [38]:
table_DFI

Unnamed: 0.1,Unnamed: 0,GLOBAL UNIQUE IDENTIFIER,Order2,Avibase.ID2,SAMPLING EVENT IDENTIFIER,SCIENTIFIC NAME_x,DEE,Annual_crops,Permanent_crops,Proportion_invertebrates_diet,eppo_crop,eppo_pest,pest_taxa,ref_source,note,DFI,crop_type,count_eat_ref,frac_DFI,consum_season
0,0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,AVESA,1COLEO,Coleoptera,Arthropods,,39.528992,temp,4,9.882248,899.284564
1,4137116,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,AVESA,1LEPIO,Lepidoptera,Caterpillars,,57.128775,temp,4,14.282194,1299.679640
2,9806702,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,AVESA,1DIPTO,Diptera,Arthropods,,39.528992,temp,4,9.882248,899.284564
3,12395944,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,AVESA,1HEMIO,Hemiptera,Arthropods,,39.528992,temp,4,9.882248,899.284564
4,0,URN:CornellLabOfOrnithology:EBIRD:OBS1446126022,Passeriformes,AVIBASE-603194D3,S111628640,Thryomanes bewickii,194.086559,1.0,1.0,1.0,VITSS,1COLEO,Coleoptera,Arthropods,,39.528992,perm,4,9.882248,899.284564
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3192436,10905600,URN:CornellLabOfOrnithology:EBIRD:OBS408081443,Strigiformes,AVIBASE-89F8B6F5,S29963429,Megascops kennicottii,8565.995447,0.0,1.0,1.0,OLVEU,1DIPTO,Diptera,Arthropods,,1721.951770,perm,4,430.487943,39174.402775
3192437,10908027,URN:CornellLabOfOrnithology:EBIRD:OBS408081443,Strigiformes,AVIBASE-89F8B6F5,S29963429,Megascops kennicottii,8565.995447,0.0,1.0,1.0,OLVEU,1HEMIO,Hemiptera,Arthropods,,1721.951770,perm,4,430.487943,39174.402775
3192438,10904448,URN:CornellLabOfOrnithology:EBIRD:OBS152618537,Strigiformes,AVIBASE-89F8B6F5,S10730732,Megascops kennicottii,8565.995447,0.0,1.0,1.0,PIAVE,1LEPIO,Lepidoptera,Caterpillars,,2488.629013,perm,3,829.543004,75488.413382
3192439,10905661,URN:CornellLabOfOrnithology:EBIRD:OBS152618537,Strigiformes,AVIBASE-89F8B6F5,S10730732,Megascops kennicottii,8565.995447,0.0,1.0,1.0,PIAVE,1HYMEO,Hymenoptera,Arthropods,,1721.951770,perm,3,573.983923,52232.537033


In [39]:
table_DFI.to_csv('../process_data/table_DFI_2.csv')