# Topic 1 – Basic Stock Price Time Series (S&P 500)

**Level:** Easy  
**Goal:** Forecast the daily closing price of one S&P 500 company as a univariate time series.

## Dataset
- **Source:** S&P 500 stock data – Kaggle
- **Link:** https://www.kaggle.com/datasets/camnugent/sandp500

## Download Instructions
1. Open https://www.kaggle.com/datasets/camnugent/sandp500
2. Log in to Kaggle
3. Click "Download"
4. Extract the ZIP into the `data/` folder
5. Use `all_stocks_5yr.csv` (the notebook reads it from `data/all_stocks_5yr.csv`)
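Steps 4 and 5 can also be scripted. The sketch below is one possible way to do it with the standard library only. The helper name `extract_dataset` and the archive name `archive.zip` are assumptions made for illustration (the file Kaggle gives you may be named differently); only the `data/` folder and `all_stocks_5yr.csv` come from the steps above.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, dest_dir="data", member="all_stocks_5yr.csv"):
    """Extract `member` from the downloaded archive into `dest_dir`.

    Hypothetical helper: returns the path to the extracted CSV.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        # Fail early with a clear message if the archive is not the expected one
        if member not in zf.namelist():
            raise FileNotFoundError(f"{member} not found in {zip_path}")
        zf.extract(member, path=dest)
    return dest / member


# Example (assumed archive name; adjust to the file you downloaded):
# csv_path = extract_dataset("archive.zip")
```

Extracting only the one CSV keeps `data/` free of the other files in the archive.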


## Installation

Install required packages:


In [None]:
# Install required packages (uncomment if needed)
# !pip install pandas numpy matplotlib seaborn statsmodels scikit-learn


## Data Loading

Load the S&P 500 stock data and select a company for analysis.


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

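# Set plotting style (the 'seaborn-v0_8-*' style names require matplotlib >= 3.6,
# so fall back to the default style when the name is not available)
style = "seaborn-v0_8-darkgrid"
plt.style.use(style if style in plt.style.available else "default")
sns.set_palette("husl")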

# Load the dataset
df = pd.read_csv("data/all_stocks_5yr.csv")

# Display basic information
print(f"Dataset shape: {df.shape}")
print(f"\nColumns: {df.columns.tolist()}")
print("\nFirst few rows:")
df.head()


In [None]:
# Check available companies
print("Available companies:")
print(df["Name"].unique()[:20])  # Show first 20 companies
print(f"\nTotal number of companies: {df['Name'].nunique()}")


In [None]:
# Select a company (e.g., AAPL, MSFT, GOOGL)
company_name = "AAPL"  # Change this to select a different company

# Filter data for selected company
stock_data = df[df["Name"] == company_name].copy()

# Convert date column to datetime and set as index
stock_data["date"] = pd.to_datetime(stock_data["date"])
stock_data = stock_data.set_index("date").sort_index()

# Display basic information
print(f"Data for {company_name}:")
print(f"Date range: {stock_data.index.min().date()} to {stock_data.index.max().date()}")
print(f"Number of observations: {len(stock_data)}")
print("\nFirst few rows:")
stock_data.head()
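Because `company_name` is meant to be changed by hand, a mistyped ticker would silently produce an empty `stock_data`. The sketch below wraps the same filter, date conversion, and sort in a function that fails loudly instead. The name `select_company` and the tiny demo frame are hypothetical, and the demo values are made up rather than real prices.

```python
import pandas as pd


def select_company(df, ticker):
    """Return one company's rows indexed by date, sorted chronologically.

    Hypothetical helper; assumes `df` has the "Name" and "date" columns
    used in the cells above.
    """
    if ticker not in set(df["Name"]):
        raise ValueError(f"Unknown ticker: {ticker!r}")
    out = df[df["Name"] == ticker].copy()
    out["date"] = pd.to_datetime(out["date"])
    return out.set_index("date").sort_index()


# Tiny synthetic frame (made-up values) to show the behaviour
demo = pd.DataFrame({
    "date": ["2020-01-03", "2020-01-02", "2020-01-02"],
    "close": [11.0, 10.0, 20.0],
    "Name": ["AAA", "AAA", "BBB"],
})
demo_stock = select_company(demo, "AAA")
print(demo_stock)
```

With the real data, `stock_data = select_company(df, company_name)` would replace the filtering lines in the cell above.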


In [None]:
# Check for missing values
print("Missing values:")
print(stock_data.isnull().sum())

# Basic statistics
print(f"\nBasic statistics for {company_name}:")
stock_data.describe()
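If the check above reports missing values, they need handling before any model is fitted. One common option, shown here as an assumption and not as something this notebook prescribes, is to forward-fill price columns so each gap takes the last known value. This uses only past information, so it does not leak future prices into earlier rows. The helper name `fill_missing_prices` and the demo values are hypothetical.

```python
import numpy as np
import pandas as pd


def fill_missing_prices(data, price_cols=("open", "high", "low", "close")):
    """Forward-fill gaps in the price columns, then drop rows still missing.

    Rows can remain missing only at the very start of the series,
    where there is no earlier value to carry forward.
    """
    cols = [c for c in price_cols if c in data.columns]
    out = data.copy()
    out[cols] = out[cols].ffill()
    return out.dropna(subset=cols)


# Synthetic example (made-up values): one leading gap and one interior gap
demo_prices = pd.DataFrame(
    {"close": [np.nan, 10.0, np.nan, 12.0]},
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-06"]),
)
filled = fill_missing_prices(demo_prices)
print(filled)
```

The leading row is dropped because nothing precedes it, and the interior gap is filled with the previous close. Whether forward-filling is appropriate depends on how many gaps there are; for a handful of rows, dropping them is also a reasonable choice.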


In [None]:
# Plot the close price over time
plt.figure(figsize=(12, 6))
plt.plot(stock_data.index, stock_data["close"], linewidth=1.5)
plt.title(f"{company_name} Stock Price (Close) Over Time", fontsize=14, fontweight='bold')
plt.xlabel("Date", fontsize=12)
plt.ylabel("Close Price ($)", fontsize=12)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
