# Download data

The Xeno-Canto website provides an online form for querying its audio database and downloading the results as a CSV file. The file contains the IDs of the recordings, but not the audio itself.

The database was queried for recordings of goshawks as the foreground species, with a quality rating of A. The result set has 186 entries (https://xeno-canto.org/set/8005).

In [1]:
import pandas as pd

# download the query results
url = 'https://xeno-canto.org/explore/csv?query=accipiter%20gentilis%20q:A'
df_original = pd.read_csv(url)

# save the original results to file
df_original.to_csv('./data/Accipiter_gentilis_xeno_canto_20220921.csv')
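
The query string in the URL above is the search expression `accipiter gentilis q:A` with its spaces percent-encoded. As a minimal sketch, the same URL can be assembled from a species name and a quality rating with the standard library. `build_query_url` is a hypothetical helper introduced here for illustration; the endpoint is copied from the cell above, and no other query parameters are assumed.

```python
from urllib.parse import quote

# Endpoint copied from the download cell; everything else here is illustrative.
CSV_ENDPOINT = "https://xeno-canto.org/explore/csv?query="


def build_query_url(species, quality):
    """Return the CSV export URL for a species name and a quality rating.

    Hypothetical helper: it only reproduces the URL pattern used above.
    """
    query = f"{species} q:{quality}"
    # Percent-encode spaces as %20 but keep the ':' of the q:A tag readable.
    return CSV_ENDPOINT + quote(query, safe=":")


url = build_query_url("accipiter gentilis", "A")
print(url)
```

Building the URL this way keeps the search terms visible in the notebook, which makes it easier to rerun the download for another species or rating.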

## Augment the data
The downloaded data was augmented with columns for month, stage in the breeding cycle (table 1), sex and life stage (adult/juvenile/fledgling).

In [35]:
import datetime

# add columns with the month and the corresponding stage in the breeding cycle, as in table 1
# work on a copy so that df_original keeps the unmodified query results
df = df_original.copy()

def get_breeding_stage(date):
    breeding_stages = {
        9:'Non-breeding', 10:'Non-breeding', 11:'Non-breeding', 12:'Non-breeding', 1:'Non-breeding',
        2:'Territory-building, courtship', 3:'Territory-building, courtship',
        4:'Incubation and nesting', 5:'Incubation and nesting', 6:'Incubation and nesting',
        7:'Fledging', 8:'Fledging'
    }
    month = get_month(date)
    if month:
        return breeding_stages[month]
    else:
        return None

def get_month(date):
    # return None for missing values and for dates that do not match YYYY-MM-DD
    try:
        return datetime.datetime.strptime(date, "%Y-%m-%d").month
    except (TypeError, ValueError):
        return None

df['month'] = df['Date'].apply(get_month)
df['breeding stage'] = df['Date'].apply(get_breeding_stage)
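
The month-to-stage lookup can be checked in isolation before it is applied to the whole table. The sketch below restates the mapping grouped by stage, inverts it, and runs it on a few date strings. The names `STAGE_MONTHS`, `MONTH_TO_STAGE`, `month_of` and `stage_of` are introduced here for illustration only, and the value `'2019-00-00'` is a made-up example of a date that cannot be parsed, not a row from the data set.

```python
from datetime import datetime

# Months per breeding stage, transcribed from the dictionary in the cell above.
STAGE_MONTHS = {
    "Non-breeding": [9, 10, 11, 12, 1],
    "Territory-building, courtship": [2, 3],
    "Incubation and nesting": [4, 5, 6],
    "Fledging": [7, 8],
}

# Invert the table into a month -> stage lookup.
MONTH_TO_STAGE = {
    month: stage for stage, months in STAGE_MONTHS.items() for month in months
}

# Every calendar month should be assigned to exactly one stage.
assert sorted(MONTH_TO_STAGE) == list(range(1, 13))


def month_of(date_string):
    """Return the month of a 'YYYY-MM-DD' string, or None if it cannot be parsed."""
    try:
        return datetime.strptime(date_string, "%Y-%m-%d").month
    except (TypeError, ValueError):
        return None


def stage_of(date_string):
    """Return the breeding stage for a date string, or None for unusable dates."""
    return MONTH_TO_STAGE.get(month_of(date_string))


# Two dates from the table below, one made-up unparseable date, one missing value.
examples = ["2007-07-08", "2010-06-10", "2019-00-00", None]
stages = [stage_of(d) for d in examples]
print(stages)
```

Grouping the months by stage makes the table easier to compare with table 1, and the coverage assertion catches a month that is missing or listed twice.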

In [39]:
# add columns with life stage and sex
def get_sex(songtype):
    # check 'female' first: the substring 'male' also occurs in 'female'
    if 'female' in songtype:
        return 'female'
    elif 'male' in songtype:
        return 'male'
    else:
        return None

def get_life_stage(songtype):
    if 'adult' in songtype:
        return 'adult'
    elif 'juvenile' in songtype:
        return 'juvenile'
    elif 'fledgling' in songtype:
        return 'fledgling'
    else:
        return None

df['sex'] = df['Songtype'].apply(get_sex)
df['life_stage'] = df['Songtype'].apply(get_life_stage)
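
Label extraction from the free-text `Songtype` field is sensitive to substring matches: a plain `'male' in songtype` test is also true for any string containing `'female'`. One way around this, sketched here under the assumption that labels appear as whole words, is to match on word boundaries with a regular expression. `extract_label` is a hypothetical helper, and the last two example strings are invented for illustration rather than taken from the data.

```python
import re

SEXES = ("female", "male")
LIFE_STAGES = ("adult", "juvenile", "fledgling")


def extract_label(songtype, labels):
    """Return the first label that occurs as a whole word in songtype, else None."""
    if not isinstance(songtype, str):
        # Missing values are read by pandas as NaN (a float), not as a string.
        return None
    for label in labels:
        # \b anchors the match to word boundaries, so 'male' does not match 'female'.
        if re.search(rf"\b{re.escape(label)}\b", songtype):
            return label
    return None


examples = [
    "adult, call, sex uncertain",             # from the table below
    "begging call, juvenile, sex uncertain",  # from the table below
    "adult, female, call",                    # invented example
    "male, song",                             # invented example
]
sexes = [extract_label(s, SEXES) for s in examples]
life_stages = [extract_label(s, LIFE_STAGES) for s in examples]
print(sexes)
print(life_stages)
```

With word-boundary matching the order of the labels no longer matters, and non-string values are handled without raising an error.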

In [40]:
df.head()

| Unnamed: 0 | Common name | Scientific name | Subspecies | Recordist | Date | Time | Location | Country | Latitude | Longitude | Elevation | Songtype | Remarks | Back_latin | Catalogue number | License | breeding stage | month | sex | life_stage |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Northern Goshawk | Accipiter gentilis | | Jack Berteau | 2007-07-08 | 14:30 | Arrondissement de Poitiers (near Chauvigny), ... | France | 46.5832 | 0.7056 | 140 | adult, call, sex uncertain | Mixed forest.\n\nbird-seen:no\n\nplayback-used:no | | 742843 | //creativecommons.org/licenses/by-nc-sa/4.0/ | Fledging | 7.0 | | adult |
| 1 | Northern Goshawk | Accipiter gentilis | | Beatrix Saadi-Varchmin | 2010-06-10 | 08:27 | forest S, Hagenheim (near Hofstetten), Landsb... | Germany | 47.993 | 10.9495 | 700 | begging call, juvenile, sex uncertain | at least three young Accipiter gentilis, "Ästl... | Phylloscopus collybita,Fringilla coelebs,Turdu... | 738553 | //creativecommons.org/licenses/by-nc-sa/4.0/ | Incubation and nesting | 6.0 | | juvenile |
| 2 | Northern Goshawk | Accipiter gentilis | | Sławomir Karpicki-Ignatowski | 2022-07-20 | 19:00 | Poznań, Poznań County, Greater Poland Voivodeship | Poland | 52.3848 | 17.0485 | 80 | call, life stage uncertain | \n\nbird-seen:yes\n\nplayback-used:no | | 738380 | //creativecommons.org/licenses/by-nc-sa/4.0/ | Fledging | 7.0 | | |
| 3 | Northern Goshawk | Accipiter gentilis | | B Whyte | 2022-06-02 | 19:30 | Clunas, Highland Council, Scotland | United Kingdom | 57.4782 | -3.876 | 260 | adult, alarm call, begging call, sex uncertain | \n\nbird-seen:yes\n\nplayback-used:no | | 728885 | //creativecommons.org/licenses/by-nc-sa/4.0/ | Incubation and nesting | 6.0 | | adult |
| 4 | Northern Goshawk | Accipiter gentilis | | Antonio Xeira | 2018-05-14 | 08:00 | Cockley Cley Woods, Norfolk, England | United Kingdom | 52.6205 | 0.6458 | 50 | adult, call, sex uncertain | Editing: High-pass filter, some amplification.... | | 715009 | //creativecommons.org/licenses/by-nc-nd/4.0/ | Incubation and nesting | 5.0 | | adult |
