# Sardine's Relationship with the Ocean's Temperature

In [26]:
# Import required packages
import numpy as np
import pandas as pd
import plotly.express as px
from sklearn.linear_model import LinearRegression
from scipy import stats
from plotly.subplots import make_subplots
import plotly.graph_objects as go

Changes in ocean temperature were originally thought to be the main factor behind the decline of the Pacific sardine population. Many organisms, not just fish, are very sensitive to changes in their environment, especially the temperature they live in. A common example is rising global temperatures, which are linked to declining polar bear populations. In other cases, a temperature change may have no effect on a population, or may even benefit a species. We therefore explore whether changes in ocean temperature had any effect on the sardine population. First, we visualize ocean temperature over the years:

In [27]:
# Load the yearly ocean temperature data and plot it as a line chart
bottlecast_grouped = pd.read_csv("data/bottlecast_grouped.csv")
fig = px.line(bottlecast_grouped, x='Year', y='Ocean degC', title='Ocean Temperature Over Years')
fig.show()

As expected, global temperatures have been on the rise, and this has affected ocean temperatures as well. Based on this, we predict that the rise in temperature over the years would actually BENEFIT the sardine population. A [study](https://link.springer.com/article/10.1007/s10641-016-0473-1) conducted by researchers from the National Marine Fisheries Service (also known as NOAA Fisheries) found that Pacific sardine tend to prefer ocean temperatures of around 11°C to 19°C. With that range in mind, one striking detail in our visualization is that it was only after the 1990s that ocean temperatures exceeded 11 degrees Celsius and reached the comfort zone for the Pacific sardine. This is very close to the time when the sardine population began to recover. Next, we plot sardine larvae counts against ocean temperature to see whether the population was affected by temperature in any way.
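Before that, the threshold observation can be checked numerically rather than read off the plot. The sketch below is only an illustration: the helper name `first_year_at_or_above` and the toy values are hypothetical, and it assumes a table with one row per year using the `Year` and `Ocean degC` column names from the cell above.

```python
import pandas as pd

def first_year_at_or_above(df, threshold, year_col="Year", temp_col="Ocean degC"):
    """Return the earliest year whose temperature is >= threshold, or None if there is none."""
    warm_years = df.loc[df[temp_col] >= threshold, year_col]
    return int(warm_years.min()) if not warm_years.empty else None

# Made-up values for illustration only; NOT taken from the real dataset.
toy = pd.DataFrame({
    "Year": [1988, 1989, 1990, 1991, 1992],
    "Ocean degC": [10.2, 10.6, 10.9, 11.3, 11.1],
})

first_warm_year = first_year_at_or_above(toy, 11.0)
print(first_warm_year)
```

On the real data, the equivalent call would be `first_year_at_or_above(bottlecast_grouped, 11.0)`, assuming `bottlecast_grouped` holds one temperature value per year.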

In [31]:
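# Load sardine larvae counts paired with ocean temperature
bottlecast_sardine = pd.read_csv('data/bottlecast_sardine.csv')
# Scale counts down by a factor of 4 and round to whole numbers
bottlecast_sardine['Count'] = bottlecast_sardine['Count'].div(4).round(0)
bottlecast_sardine = bottlecast_sardine.rename(columns={'Ocean degC': 'Ocean Degrees (Celsius)'})
x_data = bottlecast_sardine['Ocean Degrees (Celsius)']
y_data = bottlecast_sardine['Count']

# Fit an exponential curve by fitting a straight line to log(Count):
# log(y) = a*x + b  =>  y = exp(b) * exp(a*x)
# Note: np.log requires every count to be positive
log_y_data = np.log(y_data)

curve_fit = np.polyfit(x_data, log_y_data, 1)  # returns [a, b]

# Evaluate the fitted curve from 8.5 up to (but not including) 12 degrees
x_val = np.arange(8.5, 12, .1)
y_val = np.exp(curve_fit[1]) * np.exp(curve_fit[0] * x_val)

fig = px.scatter(bottlecast_sardine, x='Ocean Degrees (Celsius)', y='Count', title='Ocean Temperature vs Sardine Larvae')
fig.add_traces(go.Scatter(x=x_val, y=y_val, name='Regression Fit'))
fig.show()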

Visually, our model is imperfect and fits the data only loosely. While it may not be possible to accurately predict the number of sardine larvae from ocean temperature alone, we can see a positive correlation between ocean temperature and sardine larvae count. This is consistent with the NOAA Fisheries study, as the larvae count begins to increase significantly at around 11 degrees Celsius and above.
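The strength of that correlation can also be expressed as a number instead of being judged by eye. The following is a minimal sketch, not the analysis used above: it swaps `np.polyfit` for `scipy.stats.linregress` so that the correlation coefficient and p-value come back alongside the slope. The helper name `log_linear_fit` and the synthetic data are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

def log_linear_fit(temps, counts):
    """Fit log(count) = slope * temp + intercept and report the fit quality."""
    temps = np.asarray(temps, dtype=float)
    counts = np.asarray(counts, dtype=float)
    keep = counts > 0  # log is undefined for zero counts, so drop them
    result = stats.linregress(temps[keep], np.log(counts[keep]))
    return result.slope, result.intercept, result.rvalue, result.pvalue

# Synthetic data for illustration only; NOT the real larvae counts.
rng = np.random.default_rng(0)
toy_temps = np.linspace(8.5, 12.0, 30)
toy_counts = np.exp(0.8 * toy_temps - 4.0 + rng.normal(0, 0.5, toy_temps.size))

slope, intercept, r, p = log_linear_fit(toy_temps, toy_counts)
print(f"slope={slope:.2f}, r={r:.2f}, R^2={r**2:.2f}, p={p:.3g}")
```

With the notebook's variables, the call would be `log_linear_fit(x_data, y_data)`. A positive slope with `r` close to 1 would support the trend described above, while a small `r**2` would confirm that temperature alone explains little of the variation in larvae counts.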