# Sardine's Relationship with the Ocean's Temperature

In [108]:
import numpy as np
import pandas as pd
import plotly.express as px
from sklearn.linear_model import LinearRegression
from scipy import stats
from plotly.subplots import make_subplots
import plotly.graph_objects as go

The change in ocean temperature was originally thought to be the main factor behind the decline of the Pacific sardine population. Many organisms, not just fish, are very sensitive to changes in their environment, especially the temperature they live in. A common example is rising global temperature, which has led to a decline in the polar bear population. In other cases, a temperature change may have no effect on a population, or may even benefit a species. Thus, we are going to explore whether the changes in our ocean's temperature had any effect on the sardine population. First, we are going to visualize ocean temperature over the years:

In [116]:
bottlecast_grouped = pd.read_csv("data/bottlecast_grouped.csv")
fig = px.line(bottlecast_grouped, x='Year', y = 'Ocean degC', title = 'Ocean Temperature Over Years')
fig.show()

As expected, global temperatures have been on the rise, and this has affected ocean temperatures as well. Using this information, we are going to predict that the rise in temperature over the years would actually benefit the sardine population. A [study](https://link.springer.com/article/10.1007/s10641-016-0473-1) conducted by researchers from the National Marine Fisheries Service (also known as NOAA Fisheries) found that Pacific sardine tend to prefer ocean temperatures of around 11C-19C. With that in mind, I want to note a striking detail from our visualization: it was only after the 1990s that ocean temperatures exceeded 11 degrees Celsius and reached the comfort zone for the Pacific sardine population. This is very close to the time when the sardine population began to recover. Thus, let's visualize the data to see whether the sardine population was affected by temperature in any way:

In [120]:
bottlecast_sardine = pd.read_csv("data/bottlecast_sardine.csv")

# Fit a linear model; scikit-learn expects a 2-D feature matrix.
X = bottlecast_sardine['Ocean degC'].values.reshape(-1, 1)
Y = bottlecast_sardine['Sardine Larvae lbs'].values.reshape(-1, 1)
linear_regressor = LinearRegression()
linear_regressor.fit(X, Y)
Y_pred = linear_regressor.predict(X)
# Flatten back to 1-D arrays for scipy.stats.pearsonr.
Y = np.array(Y).reshape(-1,)
X = np.array(X).reshape(-1,)

fig = px.scatter(bottlecast_sardine, x='Ocean degC', y='Sardine Larvae lbs', trendline="ols", title = 'Ocean Temperature vs Sardine Larvae')
fig.show()
print("Pearson Correlation:", stats.pearsonr(X, Y))

Pearson Correlation: (0.29245297721295976, 0.033584728187699565)


Here we see a Pearson correlation coefficient of 0.2924, which implies a weak positive correlation between ocean temperature and sardine larvae. However, the p-value of 0.0336 is below the conventional 0.05 threshold, so the result is statistically significant, implying that there is a positive linear correlation between ocean temperature and sardine larvae. This model does pose a problem for ocean temperatures below roughly 9 degrees Celsius, where the fitted line would predict a negative amount of sardine larvae. A negative population cannot occur in the real world. Thus, it may be beneficial to change our model to check for an exponential relationship instead of a linear one.
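To make the negative-prediction issue concrete, here is a minimal sketch that fits a line and solves for the temperature at which the prediction crosses zero. It uses synthetic stand-in data, not the values in `bottlecast_sardine.csv`, and the names `temp_c`, `larvae`, `predict`, and `zero_crossing` are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (NOT the notebook's values): larvae rise with temperature.
rng = np.random.default_rng(0)
temp_c = rng.uniform(9.0, 12.0, size=50)
larvae = 40.0 * (temp_c - 9.0) + rng.normal(0.0, 30.0, size=50)

# Ordinary least-squares line: larvae ~ intercept + slope * temp_c
fit = stats.linregress(temp_c, larvae)

def predict(t):
    """Prediction of the fitted line at temperature t."""
    return fit.intercept + fit.slope * t

# The line crosses zero where intercept + slope * t = 0.
zero_crossing = -fit.intercept / fit.slope

print("r =", fit.rvalue, "p =", fit.pvalue)
print("fitted line crosses zero at", zero_crossing, "degC")
print("prediction 1 degC below the crossing:", predict(zero_crossing - 1.0))
```

With a positive slope, every temperature below `zero_crossing` yields a negative prediction, which is the limitation of the linear model noted above.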

In [123]:
import plotly.graph_objects as go
from sklearn.linear_model import LinearRegression

x_data = bottlecast_sardine['Ocean degC']
y_data = bottlecast_sardine['Sardine Larvae lbs']

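# Fit log(y) = a*x + b, which corresponds to y = exp(b) * exp(a*x).
# Note: np.log requires strictly positive larvae values.
log_y_data = np.log(y_data)

curve_fit = np.polyfit(x_data, log_y_data, 1)  # [slope a, intercept b]

# Evaluate the fitted exponential curve on a temperature grid from 8.5 to 12 degC.
x_val = np.arange(8.5, 12, .1)
y_val = np.exp(curve_fit[1]) * np.exp(curve_fit[0] * x_val)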

fig = px.scatter(bottlecast_sardine, x='Ocean degC', y='Sardine Larvae lbs', title = 'Ocean Temperature vs Sardine Larvae')
fig.add_traces(go.Scatter(x=x_val, y=y_val, name='Regression Fit'))
fig.show()

This model may be more realistic in that it cannot produce a negative sardine larvae value. However, visually we see that the model is imperfect and leaves a lot of error. While it may not be possible to accurately predict the amount of sardine larvae from ocean temperature alone, we can conclude that the two variables have a statistically significant positive correlation.
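The visual impression of a poor fit could be backed with numbers. The sketch below is one possible way to do that, assuming a log-linear fit like the one above. It again uses synthetic stand-in data rather than the notebook's file, and `r2_log` and `rmse` are illustrative names, not quantities computed in this notebook.

```python
import numpy as np

# Synthetic stand-in data (NOT the notebook's values); multiplicative noise
# keeps every larvae value strictly positive so the logarithm is defined.
rng = np.random.default_rng(1)
temp_c = rng.uniform(8.5, 12.0, size=50)
larvae = 5.0 * np.exp(0.4 * temp_c) * rng.lognormal(0.0, 0.5, size=50)

# Fit log(larvae) = slope * temp_c + intercept, then map back to the original scale.
slope, intercept = np.polyfit(temp_c, np.log(larvae), 1)
predicted = np.exp(intercept + slope * temp_c)

# R^2 on the log scale, where the least-squares fit was actually performed.
log_resid = np.log(larvae) - (intercept + slope * temp_c)
ss_res = np.sum(log_resid ** 2)
ss_tot = np.sum((np.log(larvae) - np.log(larvae).mean()) ** 2)
r2_log = 1.0 - ss_res / ss_tot

# Root-mean-square error on the original scale, in the same units as larvae.
rmse = np.sqrt(np.mean((larvae - predicted) ** 2))

print("R^2 (log scale):", r2_log)
print("RMSE (original scale):", rmse)
```

A low `r2_log` or a large `rmse` relative to typical larvae values would support the conclusion that temperature alone is a weak predictor.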