# Task 2 — Predict Future Stock Prices (Short-Term)

This notebook uses `yfinance` to fetch historical data and trains a **Linear Regression** model to predict the **next day's closing price**.

**How to run (locally):**
1. Install requirements: `pip install yfinance scikit-learn matplotlib pandas`
2. Choose a ticker by setting `TICKER` in the Parameters cell (e.g., `AAPL`).
3. Run all cells.

**Skills practiced:** time series handling, regression modeling, API data fetching, and plotting predictions vs. real data.


In [None]:
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
import yfinance as yf


In [None]:
# Parameters
TICKER = 'AAPL'  # change to e.g., 'TSLA', 'MSFT'
PERIOD = '3y'    # last 3 years
INTERVAL = '1d'


In [None]:
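# Fetch data
data = yf.download(TICKER, period=PERIOD, interval=INTERVAL)
# Some yfinance versions return MultiIndex columns (field, ticker) even for
# a single ticker; flatten them so that data['Close'] is a plain Series.
if isinstance(data.columns, pd.MultiIndex):
    data.columns = data.columns.get_level_values(0)
data = data.dropna()
data.head()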

In [None]:
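# Create supervised learning target: next day's Close
df = data[['Open','High','Low','Close','Volume']].copy()
df['Close_next'] = df['Close'].shift(-1)  # row t holds the close of day t+1
df = df.dropna()  # the last row has no next-day close

# Features are today's Open/High/Low/Volume; today's Close is not included.
X = df[['Open','High','Low','Volume']]
y = df['Close_next']

# shuffle=False keeps the split chronological: train on the first 80%,
# test on the most recent 20%, so no future rows leak into training.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

model = LinearRegression()
model.fit(X_train, y_train)
preds = model.predict(X_test)

# Evaluate on the held-out (most recent) period
mae = mean_absolute_error(y_test, preds)
r2 = r2_score(y_test, preds)
print('MAE:', mae)
print('R^2:', r2)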
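
An MAE is easier to interpret next to a simple reference point. One common sanity check is a persistence baseline that predicts tomorrow's close to be equal to today's close. The sketch below is not part of the original notebook: it uses a synthetic random walk in place of downloaded prices so that it runs offline, and the names `frame`, `baseline_preds`, and `baseline_mae` are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error

# Synthetic random-walk closes stand in for downloaded data (assumption).
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 500).cumsum(), name="Close")

# Same target construction as the notebook: tomorrow's close.
frame = pd.DataFrame({"Close": close, "Close_next": close.shift(-1)}).dropna()

# Chronological split: the last ~20% of rows form the test set.
split = int(len(frame) * 0.8)
test = frame.iloc[split:]

# Persistence baseline: tomorrow's close is predicted to equal today's close.
baseline_preds = test["Close"]
baseline_mae = mean_absolute_error(test["Close_next"], baseline_preds)
print("Baseline MAE:", baseline_mae)
```

With real data, the same comparison can be made by computing `mean_absolute_error(y_test, df.loc[y_test.index, 'Close'])`. If the regression model's MAE is not clearly lower than this baseline, it has probably learned little beyond "tomorrow looks like today".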

In [None]:
# Plot actual vs predicted closing prices (test set)
plt.figure()
# Use the test-set dates on the x-axis so the period is readable
plt.plot(y_test.index, y_test.values, label='Actual')
plt.plot(y_test.index, preds, label='Predicted')
plt.title(f'{TICKER}: Actual vs Predicted Next-Day Close')
plt.xlabel('Date')
plt.ylabel('Price')
plt.legend()
plt.show()

## Notes & Next Steps
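- Try a **RandomForestRegressor** or **GradientBoostingRegressor**, which can capture nonlinear relationships that linear regression cannot; whether they perform better on this data needs to be checked on the test set.
- Add lag features (e.g., daily returns over the previous 3 days) and technical indicators.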
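
Both suggestions can be sketched together as follows. This is a minimal illustration rather than a tuned model: it runs on a synthetic random walk instead of downloaded prices, and the feature names (`ret_lag0`, `sma_5`), the 5-day moving average as the example indicator, and the forest settings are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic random-walk closes (illustrative stand-in for df['Close']).
rng = np.random.default_rng(42)
close = pd.Series(100 + rng.normal(0, 1, 600).cumsum(), name="Close")

feats = pd.DataFrame({"Close": close})

# Daily returns for the three most recent days known at today's close.
ret = close.pct_change()
for lag in range(3):
    feats[f"ret_lag{lag}"] = ret.shift(lag)

# One example technical indicator: 5-day simple moving average.
feats["sma_5"] = close.rolling(5).mean()

# Target as in the notebook: next day's close.
feats["Close_next"] = close.shift(-1)
feats = feats.dropna()  # drops warm-up rows and the final row

X = feats.drop(columns="Close_next")
y = feats["Close_next"]

# Chronological split, so the test period is strictly later than training.
split = int(len(feats) * 0.8)
X_train, X_test = X.iloc[:split], X.iloc[split:]
y_train, y_test = y.iloc[:split], y.iloc[split:]

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)
rf_preds = rf.predict(X_test)
print("Random forest MAE:", mean_absolute_error(y_test, rf_preds))
```

One caveat with this setup: tree ensembles cannot predict values outside the range of targets seen during training, so a price that trends beyond its training range will be predicted poorly. Predicting the next-day return and converting it back to a price is one way to work around this.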
