# Homework: Predicting NVIDIA Stock Price with Least Squares Fitting

In this Jupyter notebook, you will analyze NVIDIA's stock price data and determine the line of best fit. This exercise will help you apply linear regression techniques to predict future stock prices based on historical data.

Specifically, your task is to:

- **Fit a line of best fit** to NVIDIA's stock price data.
- **Predict the stock's value on January 1st, 2025**, assuming the current trend continues.
- **Critically analyse the reliability of this prediction**, considering any caveats or potential reasons why the current trend might not continue.

By the end of this notebook, you should have a better understanding of least squares fitting, its real-world applications, and the limitations of extrapolating stock data.

In [1]:
# Import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

In [None]:
# Step 1: Load the NVIDIA stock data from a CSV file
# The CSV file should have columns: Date, Open, High, Low, Close, Volume
df = pd.read_csv("./nvidia_stock_data.csv")

# Convert the 'Date' column to datetime format for better plotting
df['Date'] = pd.to_datetime(df['Date'])

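# Convert each date to an integer ordinal (a day count) so it can serve as the numeric x-value in the fit
df['Date_ordinal'] = df['Date'].map(pd.Timestamp.toordinal)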
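
To make the ordinal conversion concrete, here is a minimal sketch on a few made-up dates (not the NVIDIA data). `Timestamp.toordinal` returns the proleptic Gregorian ordinal, which counts days with 0001-01-01 as day 1, so the difference between two ordinals is simply the gap in days. The names `dates`, `ordinals`, and `gaps` are illustrative only.

```python
import pandas as pd

# Hypothetical dates for illustration only (not taken from the CSV file)
dates = pd.to_datetime(pd.Series(["2024-01-01", "2024-01-02", "2024-02-01"]))

# Same conversion as used for the 'Date_ordinal' column
ordinals = dates.map(pd.Timestamp.toordinal)

# Differences between consecutive ordinals are gaps measured in days
gaps = ordinals.diff().dropna().astype(int).tolist()

print(ordinals.tolist())
print(gaps)
```

Because the ordinals are large integers with a small spread, the fitted intercept will typically be a large number far from any actual price; this is expected and not a sign of an error.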

In [None]:
# Step 2: Plot just the data (scatter plot of Date vs Close prices)
plt.figure(figsize=(10, 6))
# Plot against the datetime column so the x-axis shows readable dates rather than raw ordinals
plt.scatter(df['Date'], df['Close'], color='royalblue', s=100, label="NVIDIA Stock Data", edgecolor='black')
plt.xlabel("Date", fontsize=14)
plt.ylabel("Closing Price [USD]", fontsize=14)
plt.xticks(rotation=45)
plt.title("NVIDIA Stock Closing Prices", fontsize=16)

# Display the plot
plt.show()

In [None]:
# Step 3: Fit a line of best fit using the least squares method

# Use x = df['Date_ordinal'] and y = df['Close']
slope = FIXME  # code up the formula for m from the lecture handout
intercept = FIXME  # code up the formula for c from the lecture handout
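
The handout's formulas are not reproduced in this notebook. As a rough sketch, assuming the handout uses the standard closed-form least squares estimates (its notation may differ), the snippet below applies them in mean-centred form to a small made-up dataset and cross-checks the result against `np.polyfit`. The arrays `x_toy` and `y_toy` and all other names here are hypothetical, not the NVIDIA data.

```python
import numpy as np

# Hypothetical toy data scattered around y = 2x + 1
x_toy = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_toy = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Standard closed-form least squares estimates (mean-centred form):
#   m = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)**2)
#   c = y_mean - m * x_mean
x_mean, y_mean = x_toy.mean(), y_toy.mean()
m_toy = np.sum((x_toy - x_mean) * (y_toy - y_mean)) / np.sum((x_toy - x_mean) ** 2)
c_toy = y_mean - m_toy * x_mean

# Cross-check against NumPy's degree-1 polynomial fit
m_ref, c_ref = np.polyfit(x_toy, y_toy, 1)

print(m_toy, c_toy)
print(np.allclose([m_toy, c_toy], [m_ref, c_ref]))
```

The same kind of cross-check against `np.polyfit` can be used on your own `slope` and `intercept`. The mean-centred form tends to be numerically better behaved when the x-values are large, as date ordinals are.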

In [None]:
# Uncomment the line below if you got stuck calculating slope and intercept in the previous cell.
#slope, intercept = np.polyfit(df['Date_ordinal'], df['Close'], 1)

# Generate the y-values for the best-fit line based on the slope and intercept
best_fit_line = slope * df['Date_ordinal'] + intercept

# Visualise data points as scatter plot (use the datetime column so the x-axis matches the best-fit line below)
plt.scatter(df['Date'], df['Close'], color='royalblue', s=100, label="NVIDIA Stock Data", edgecolor='black')
# Plot the best-fit line on the same scatter plot
plt.plot(df['Date'], best_fit_line, color='coral', linewidth=3, label=f"Best Fit Line (y = {slope:.2f}x + {intercept:.2f})")

plt.xlabel("Date", fontsize=14)
plt.ylabel("Closing Price [USD]", fontsize=14)
plt.xticks(rotation=45)
plt.title("NVIDIA Stock Closing Prices", fontsize=16)

# Add a legend
plt.legend()

# Display the plot
plt.show()
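
The second task, predicting the value on January 1st, 2025, is not coded above. A minimal sketch of the extrapolation step follows, using synthetic prices rather than the NVIDIA data; the names `demo`, `slope_demo`, `intercept_demo`, `target_ordinal`, and `predicted_close` are illustrative assumptions. In the notebook itself, you would evaluate your own `slope` and `intercept` at the target ordinal instead.

```python
import numpy as np
import pandas as pd

# Synthetic, perfectly linear prices: 100 USD on 2024-01-01, rising 0.5 USD per day
demo = pd.DataFrame({"Date": pd.date_range("2024-01-01", periods=10, freq="D")})
demo["Date_ordinal"] = demo["Date"].map(pd.Timestamp.toordinal)
demo["Close"] = 100.0 + 0.5 * np.arange(10)

# Fit a straight line to the synthetic data
slope_demo, intercept_demo = np.polyfit(demo["Date_ordinal"], demo["Close"], 1)

# Convert the target date with the same ordinal mapping used for the x-values
target_ordinal = pd.Timestamp("2025-01-01").toordinal()

# Evaluate the fitted line at the target date
predicted_close = slope_demo * target_ordinal + intercept_demo

# How far beyond the last observation the prediction reaches
days_beyond_data = target_ordinal - demo["Date_ordinal"].max()

print(f"Predicted close on 2025-01-01: {predicted_close:.2f} USD")
print(f"Extrapolating {days_beyond_data} days beyond the last data point")
```

The further the target date lies beyond the last observation, the less weight a straight-line extrapolation deserves, which is worth keeping in mind for the critical analysis.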