## Project extensions

Convert your findings into a research report examining the relationship between El Niño events and word usage levels for different species of fish.

In [11]:
import pandas as pd 
import matplotlib.pyplot as plt

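# Load the word-frequency data (columns used below: word, year, frequency)
df = pd.read_csv('animal-word-trends-peru.csv')

# Count the distinct words (one per species) in the dataset
species_nb = df['word'].nunique()
print(f'-------------REPORT-------------\n* Number of species : {species_nb}')

# First and last year covered by the data
period_min = df['year'].min()
period_max = df['year'].max()
print(f'* The period begins in {period_min} \n* Ending in {period_max}')

# Mean frequency over all words and years; .2f keeps two decimals for any magnitude
global_frequency_avg = df['frequency'].mean()
print(f'* Frequency of fish cited (average) : {global_frequency_avg:.2f} (per million)')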

# Words whose frequency exceeds the overall average in at least one year
above_avg_words = df.query('frequency > @global_frequency_avg')['word'].unique()
above_avg = ', '.join(above_avg_words)
print(f'* Fish cited more often than average : {above_avg}')

# For each El Niño year, find the word with the highest frequency
events = [1965, 1973, 1983, 1987]
most_cited_during_event = []
for event in events:
    year_max = df.query('year == @event').groupby('word')['frequency'].max()
    top_word = year_max.idxmax()
    if top_word not in most_cited_during_event:
        most_cited_during_event.append(top_word)
result_most_cited = ', '.join(most_cited_during_event)
print(f'* Most cited fish during El Niño events : {result_most_cited}')

-------------REPORT-------------
* Number of species : 5
* The period begins in 1960 
* Ending in 1990
* Frequency of fish cited (average) : 0.73 (per million)
* Fish cited more often than average : mackerel, sardine, hake
* Most cited fish during El Niño events : sardine, mackerel
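
The report above lists which words peak in event years, but it does not compare event years with the rest of the period. One possible next step, shown here only as a rough sketch, is to compute each word's mean frequency in event years and in all other years. The helper name `compare_event_years` and the output column names are hypothetical. The sketch assumes the same `word`, `year`, and `frequency` columns and the same four event years used in the cell above.

```python
import pandas as pd

# Event years copied from the report cell above
EVENT_YEARS = [1965, 1973, 1983, 1987]


def compare_event_years(df, event_years=EVENT_YEARS):
    """Hypothetical helper: mean frequency per word in event vs. other years."""
    in_event = df['year'].isin(event_years)

    # Per-word mean frequency inside and outside the event years
    event_mean = df[in_event].groupby('word')['frequency'].mean()
    other_mean = df[~in_event].groupby('word')['frequency'].mean()

    # Building the frame from two Series aligns them on the word index
    table = pd.DataFrame({'event_mean': event_mean, 'other_mean': other_mean})

    # Positive values mean the word is used more often in event years
    table['difference'] = table['event_mean'] - table['other_mean']
    return table.sort_values('difference', ascending=False)
```

Calling `compare_event_years(df)` on the DataFrame loaded earlier would return one row per word, sorted so that the words with the largest increase in event years come first. With only four event years, any difference is descriptive rather than evidence of a causal link.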
