In [1]:
import pandas as pd
import numpy as np

# Analysis of Zealandia Nesting Data

In [2]:
df = pd.read_csv("data/zealandia_nesting_data.csv")

### Nest success metrics

We'll start by calculating, for each nesting year, the number of pairs and the total number of offspring those pairs produced. For example, in the 2022/2023 year, 41 pairs nested at least once, and together they produced 68 offspring (viable fledgelings).

In [3]:
nesting_outcomes = (
    df.groupby("nsbsid")
        .agg({"nsoffspring": "sum", "nsbpid": "nunique"})
        .rename(columns={"nsoffspring": "number_of_offspring", "nsbpid": "number_of_pairs"})
        .sort_index()
)

nesting_outcomes.loc["2022/2023"]

number_of_offspring    68
number_of_pairs        41
Name: 2022/2023, dtype: int64
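To make the aggregation concrete, here is a minimal sketch on invented toy data. It assumes (inferred from how the columns are used above, not from any data dictionary) that each row is one nesting attempt, that `nsbsid` labels the season, and that `nsbpid` identifies a pair. The pair IDs and counts are made up for illustration.

```python
import pandas as pd

# Toy data, invented for illustration: one row per nesting attempt (assumption).
toy = pd.DataFrame({
    "nsbsid": ["2021/2022", "2021/2022", "2021/2022", "2022/2023", "2022/2023"],
    "nsbpid": ["A", "A", "B", "A", "C"],
    "nsoffspring": [2, 0, 1, 3, 0],
})

toy_outcomes = (
    toy.groupby("nsbsid")
    # "sum" adds offspring over all attempts; "nunique" counts each pair once,
    # even if it nested more than once in the season.
    .agg({"nsoffspring": "sum", "nsbpid": "nunique"})
    .rename(columns={"nsoffspring": "number_of_offspring", "nsbpid": "number_of_pairs"})
)
print(toy_outcomes)
```

Pair A nests twice in the toy 2021/2022 season but is counted as a single pair, which is why `nunique` is used rather than a row count.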

We can calculate some additional metrics. "Offspring per pair" is the number of offspring divided by the number of pairs, and "proportion with 1+ offspring" is the proportion of pairs in each year that produced at least one fledgeling in that year. For example, in the 2021/2022 nesting year, 82% of pairs produced at least one viable fledgeling.

In [4]:
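nesting_outcomes["offspring_per_pair"] = (
    nesting_outcomes["number_of_offspring"]
    / nesting_outcomes["number_of_pairs"]
).round(2)

nesting_outcomes["proportion_with_1+_offspring"] = (
    # Total offspring per (season, pair), summed over all nesting attempts
    df.groupby(["nsbsid", "nsbpid"])
        ["nsoffspring"].sum()
        # True where the pair produced at least one offspring
        .gt(0)
        # The mean of a boolean Series is the proportion of True values
        .groupby("nsbsid")
        .mean()
        .round(2)
)

nesting_outcomes.loc["2021/2022", "proportion_with_1+_offspring"]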

0.82
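The order of operations in that chain matters: offspring are summed per pair before testing for success. The sketch below, on invented toy data (same column-meaning assumptions as the real data: one row per nesting attempt), contrasts the per-pair proportion with a naive per-row proportion.

```python
import pandas as pd

# Toy data, invented for illustration: pair A fails once, then succeeds.
toy = pd.DataFrame({
    "nsbsid": ["2021/2022"] * 4,
    "nsbpid": ["A", "A", "B", "C"],
    "nsoffspring": [0, 2, 0, 1],
})

# Step 1: total offspring per (season, pair).
per_pair = toy.groupby(["nsbsid", "nsbpid"])["nsoffspring"].sum()
# Step 2: True where the pair produced at least one offspring.
successful = per_pair.gt(0)
# Step 3: proportion of successful pairs in each season.
toy_proportion = successful.groupby("nsbsid").mean().round(2)

# Naive alternative: proportion of successful *attempts*, not pairs.
naive = toy["nsoffspring"].gt(0).mean()

print(toy_proportion)
print(naive)
```

Here two of three pairs succeed, while only two of four attempts do, so the two figures differ whenever a pair nests more than once.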

### Returning birds

Another possible metric is the proportion of birds that form a pair in one year and return to form a pair (with the same or a different partner) in the subsequent year. For example, 75% of the birds that nested in the 2020/2021 season were also recorded as nesting in the 2021/2022 season.

In [5]:
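def prev_year(nsbsid):
    # Shift a season label back by one year, e.g. "2021/2022" -> "2020/2021"
    return "/".join(str(int(year) - 1) for year in nsbsid.split("/"))

# Label each row with the preceding season, so that grouping on "prev_nsbsid"
# lists the birds that nested in the season *after* that label
df["prev_nsbsid"] = df["nsbsid"].apply(prev_year)

# Every (season, bird) combination that nested in that season, either sex
year = pd.concat([(
        df.groupby(["nsbsid", bird_sex])
            .size()
            .rename("year")
    ) for bird_sex in ["nsmalename", "nsfemalename"]], axis=0)

# Every (season, bird) combination that nested in the following season
subsequent_year = pd.concat([(
        df.groupby(["prev_nsbsid", bird_sex])
            .size()
            .rename("subsequent_year")
    ) for bird_sex in ["nsmalename", "nsfemalename"]], axis=0)

# Align on (season, bird): True where the bird nested in that column's season
present_in_year = pd.concat([year, subsequent_year], axis=1).notna()
# Keep only birds that actually nested in the base season
present_in_year = present_in_year.loc[present_in_year["year"]]

nesting_outcomes["proportion_returning_next_year"] = (
    present_in_year.groupby(present_in_year.index.get_level_values(0))
        ["subsequent_year"].mean()
        .round(2)
        # The latest season has no following season in the data, so its 0 is
        # not meaningful. Note that this also blanks any genuine 0.
        .replace({0: np.nan})
)

nesting_outcomes.loc["2020/2021", "proportion_returning_next_year"]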


0.75
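As a cross-check on the logic, the same idea can be sketched with plain sets on invented toy data. This is an alternative formulation, not the method used above. The bird names are made up, and the sketch assumes that adjacent seasons in sorted order are consecutive (the `prev_year` approach above does not need that assumption).

```python
import pandas as pd

# Toy data, invented for illustration: one row per nesting attempt (assumption).
toy = pd.DataFrame({
    "nsbsid": ["2020/2021", "2020/2021", "2021/2022", "2021/2022"],
    "nsmalename": ["Tane", "Rua", "Tane", "Kiwi"],
    "nsfemalename": ["Hine", "Mere", "Mere", "Hine"],
})

def birds_by_season(frame):
    """Return a Series mapping each season to the set of birds that nested in it."""
    long = frame.melt(
        id_vars="nsbsid",
        value_vars=["nsmalename", "nsfemalename"],
        value_name="bird",
    ).dropna(subset=["bird"])
    return long.groupby("nsbsid")["bird"].apply(set)

birds = birds_by_season(toy)
seasons = sorted(birds.index)

# For each season with a successor, the share of its birds seen again next season.
returning = {
    this: round(len(birds[this] & birds[nxt]) / len(birds[this]), 2)
    for this, nxt in zip(seasons, seasons[1:])
}
print(returning)
```

In the toy data, Tane, Hine and Mere nest in both seasons while Rua does not, so three of the four 2020/2021 birds return. The final season gets no entry at all, rather than a 0 that has to be blanked afterwards.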

### Final table

Here's the final table of data:

In [6]:
nesting_outcomes

nsbsid,number_of_offspring,number_of_pairs,offspring_per_pair,proportion_with_1+_offspring,proportion_returning_next_year
2014/2015,44,30,1.47,0.8,0.69
2015/2016,43,33,1.3,0.76,0.67
2016/2017,53,37,1.43,0.78,0.72
2017/2018,55,34,1.62,0.94,0.68
2018/2019,58,32,1.81,0.84,0.68
2019/2020,42,31,1.35,0.77,0.63
2020/2021,56,31,1.81,0.77,0.75
2021/2022,77,40,1.92,0.82,0.71
2022/2023,68,41,1.66,0.78,
