# Stock Analysis and Visualization Script

This script fetches historical stock data, processes it, and visualizes key metrics.

### Imports and Settings

In [2]:
import zoneinfo

import yfinance as yf
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime, timedelta

# Pandas display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

# Store yfinance's timezone cache in a custom directory
yf.set_tz_cache_location("custom_cache")

### Configurable Parameters

In [3]:
# Stock ticker symbol
TICKER = "^GSPC"  # S&P 500 Index

DURATION_DAYS = 366
END_DATE = datetime.now().astimezone(zoneinfo.ZoneInfo("Europe/Paris"))
START_DATE = END_DATE - timedelta(days=DURATION_DAYS)

In [4]:
print("=" * 70)
print("S&P 500 DATA RETRIEVAL")
print("=" * 70)
print(f"Ticker: {TICKER}")
print(f"Period: {START_DATE.strftime('%Y-%m-%d')} to {END_DATE.strftime('%Y-%m-%d')}")
print(f"Duration: {DURATION_DAYS} days")
print("=" * 70)

S&P 500 DATA RETRIEVAL
Ticker: ^GSPC
Period: 2024-11-11 to 2025-11-12
Duration: 366 days


### Fetching the Data from Yahoo Finance

In [5]:
sp500 = yf.download(TICKER, start=START_DATE, end=END_DATE, interval='1d', progress=False)


In [6]:
sp500.head(10)

Price,Close,High,Low,Open,Volume
Ticker,^GSPC,^GSPC,^GSPC,^GSPC,^GSPC
Date,,,,,
2024-11-11,6001.350098,6017.310059,5986.689941,6008.859863,4333000000
2024-11-12,5983.990234,6009.919922,5960.080078,6003.600098,4243400000
2024-11-13,5985.379883,6008.189941,5965.910156,5985.75,4220180000
2024-11-14,5949.169922,5993.879883,5942.279785,5989.680176,4184570000
2024-11-15,5870.620117,5915.319824,5853.009766,5912.790039,4590960000
2024-11-18,5893.620117,5908.120117,5865.950195,5874.169922,3983860000
2024-11-19,5916.97998,5923.509766,5855.290039,5870.049805,4036940000
2024-11-20,5917.109863,5920.669922,5860.560059,5914.339844,3772620000
2024-11-21,5948.709961,5963.319824,5887.259766,5940.580078,4230120000
2024-11-22,5969.339844,5972.899902,5944.359863,5944.359863,4141420000


### Understanding the S&P 500 Data Columns

When we download the S&P 500 data from Yahoo Finance, we get the following columns:

- **Open** : The price at which the index opened at the beginning of the trading day.
- **High** : The highest price reached during the trading day.
- **Low** : The lowest price reached during the trading day.
- **Close** : The price at which the index closed at the end of the trading day.
- **Volume** : The total number of shares/contracts traded during the day.
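- **Date** : The trading date. It is the index of the DataFrame rather than a regular column.

> Note: The download above has no `Adj Close` column. Recent yfinance releases appear to default to `auto_adjust=True`, in which case `Close` is already the adjusted price.  
> Markets are closed on weekends and on some holidays during the year, which is why we won't have 365 rows.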


In [7]:
sp500.shape[0] # 252

252
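
As a rough sanity check on the row count, the sketch below counts the weekdays in the requested window with NumPy's `busday_count`. The dates are copied from the period printed earlier, and the end date is treated as exclusive, which is an assumption about how the download handled it. Market holidays are not modelled here; they would account for the remaining gap between weekdays and rows.

```python
import numpy as np

# Window printed earlier in the notebook (end date treated as exclusive)
start, end = "2024-11-11", "2025-11-12"

# Total calendar days in the window
calendar_days = int((np.datetime64(end) - np.datetime64(start)).astype(int))

# Monday-to-Friday days only; exchange holidays are NOT removed
weekdays = int(np.busday_count(start, end))

print(f"Calendar days: {calendar_days}")
print(f"Weekdays:      {weekdays}")
```

Weekends alone remove about a hundred days from the window; exchange holidays remove the rest.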

### Close vs Adjusted Close

- **Close**: The raw closing price of the asset at the end of the trading day.
- **Adjusted Close (Adj Close)**: The closing price **adjusted for dividends and stock splits**, reflecting the "true" value for investors who reinvest dividends.

Why does Adjusted Close matter?

For individual stocks:
- Dividends and splits affect the nominal closing price.
- Using `Close` alone can misrepresent actual returns.
- `Adj Close` corrects for these events, giving the real return over time.
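
To make the effect concrete, here is a minimal sketch with a hypothetical stock that has a 2-for-1 split on its third day. The prices and the adjustment factors are invented for illustration; real data providers compute the factors from the actual corporate actions.

```python
import pandas as pd

# Hypothetical closes: the 2-for-1 split takes effect on the third day
close = pd.Series([100.0, 102.0, 51.0, 52.0])

# Factor that restates pre-split prices on the post-split basis
adjustment = pd.Series([0.5, 0.5, 1.0, 1.0])
adj_close = close * adjustment

raw_returns = close.pct_change()
adj_returns = adj_close.pct_change()

# The raw series shows a -50% "loss" on the split day,
# while the adjusted series shows the real (flat) return
print(pd.DataFrame({"raw": raw_returns, "adjusted": adj_returns}))
```

On the days without a corporate action, both series give the same return; only the split day differs.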

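For the S&P 500 Index:

- `^GSPC` is a price index: stock splits of its constituents are absorbed by the index divisor, and dividends are not included in the index level (the total-return version of the index covers those).
- There is therefore nothing for Yahoo Finance to adjust, so **Close = Adjusted Close** for `^GSPC`.
- To be consistent with future analyses, we will rename `Close` to `AdjClose`: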

In [8]:
sp500.rename(columns={"Close": "AdjClose"}, inplace=True)

In [9]:
sp500.columns = sp500.columns.get_level_values(0)
sp500.columns.name = None
sp500.columns

Index(['AdjClose', 'High', 'Low', 'Open', 'Volume'], dtype='object')

### Daily Returns

To analyze the S&P 500, we often compute **daily returns**, which measure the relative change in price from one day to the next.

Formula

The **simple daily return** is calculated as:
$
R_t = \frac{P_t}{P_{t-1}} - 1
$
Where:

- \(R_t\) : daily return at day \(t\)  
- \(P_t\) : `AdjClose` price at day \(t\)  
- \(P_{t-1}\) : `AdjClose` price at the previous day

Why this formula?

- It represents the **percentage change** in price from one day to the next.  
- It is the basis for many statistical analyses such as **volatility**, **Sharpe ratio**, and technical indicators like **RSI**.
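
As a worked example of the formula, the snippet below applies it to the first two closing values shown in the downloaded data (2024-11-11 and 2024-11-12). The variable names are chosen for this illustration only.

```python
# Closing levels taken from the first two rows of the downloaded data
p_prev = 6001.350098  # close on 2024-11-11
p_curr = 5983.990234  # close on 2024-11-12

# Simple daily return: R_t = P_t / P_{t-1} - 1
daily_return = p_curr / p_prev - 1

print(round(daily_return, 6))  # -0.002893, i.e. about -0.29%
```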


In [10]:
sp500_length = sp500.shape[0]

# The first day has no previous close, so its return stays at 0.0 as a placeholder
sp500_return_per_day = [0.0] * sp500_length

def calculate_return(pt, pt_minus_one):
    """Simple return between two consecutive prices."""
    return (pt / pt_minus_one) - 1

adj_close = sp500["AdjClose"]
for i in range(sp500_length - 1):
    sp500_return_per_day[i + 1] = calculate_return(adj_close.iloc[i + 1], adj_close.iloc[i])

sp500["DailyReturn"] = sp500_return_per_day

In [11]:
sp500.head()

Date,AdjClose,High,Low,Open,Volume,DailyReturn
2024-11-11,6001.350098,6017.310059,5986.689941,6008.859863,4333000000,0.0
2024-11-12,5983.990234,6009.919922,5960.080078,6003.600098,4243400000,-0.002893
2024-11-13,5985.379883,6008.189941,5965.910156,5985.75,4220180000,0.000232
2024-11-14,5949.169922,5993.879883,5942.279785,5989.680176,4184570000,-0.00605
2024-11-15,5870.620117,5915.319824,5853.009766,5912.790039,4590960000,-0.013203


In [12]:
sp500_pct_change = sp500["AdjClose"].pct_change()
sp500_manual = sp500["DailyReturn"][1:]
sp500_pandas = sp500_pct_change[1:]

print(f"Vectors are the same : {sp500_manual.equals(sp500_pandas)}")

Vectors are the same : True


### Derived metrics using Daily Return

Daily Return calculation allows us to:

- Compute statistical metrics such as mean return, volatility, and cumulative return.
- Quantify risk-adjusted performance using metrics like the Sharpe ratio and maximum drawdown.
- Serve as a foundation for further technical or quantitative analysis.
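
Cumulative return and maximum drawdown are mentioned in the list above but not computed in this notebook. A minimal sketch on hypothetical returns (the four values are invented, not S&P 500 data) could look like this:

```python
import pandas as pd

# Hypothetical daily returns, for illustration only
returns = pd.Series([0.10, -0.20, 0.05, 0.10])

# Value of 1 unit invested at the start, compounded day by day
wealth = (1 + returns).cumprod()
cumulative_return = wealth.iloc[-1] - 1

# Drawdown: distance of the current value from the highest value reached so far
running_peak = wealth.cummax()
drawdown = wealth / running_peak - 1
max_drawdown = drawdown.min()

print(f"Cumulative return: {cumulative_return:.4f}")
print(f"Maximum drawdown:  {max_drawdown:.4f}")
```

The same lines should work on `sp500["DailyReturn"]`, since the placeholder 0.0 on the first day leaves the compounded product unchanged.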


In [13]:
# Mean Return
daily_return_sum = 0
for i in range(sp500_length):
    daily_return_sum+=sp500.iloc[i].DailyReturn
mean_return = daily_return_sum / sp500_length
mean_return = round(mean_return, 6)

mean_return

0.000594

In [14]:
sp500_pandas_mean_return = round(sp500.DailyReturn.mean(), 6)
print(sp500_pandas_mean_return)
print(f"Mean Daily Return are the same : {mean_return == sp500_pandas_mean_return}")

0.000594
Mean Daily Return are the same : True


In [15]:
# Standard deviation of the Daily Return 

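def compute_standard_deviation(data, dof=1):
    # Keep the mean at full precision; rounding it here would bias the variance
    mean_data = sum(data) / len(data)
    squared_diffs = [(x - mean_data) ** 2 for x in data]
    # Sample variance: divide by n - dof (Bessel's correction when dof=1)
    variance = sum(squared_diffs) / (len(squared_diffs) - dof)
    return variance ** 0.5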

sp500_std = round(compute_standard_deviation(sp500.DailyReturn.tolist()), 6)
sp500_std

0.011803
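
The mean was cross-checked against pandas earlier; the same can be done for the standard deviation. The sketch below uses a small hypothetical data set and a helper named `sample_std` (introduced here for illustration) to show that a manual sample standard deviation matches `Series.std()`, which uses `ddof=1` by default.

```python
import pandas as pd

def sample_std(data, dof=1):
    """Sample standard deviation with n - dof in the denominator."""
    mean = sum(data) / len(data)
    variance = sum((x - mean) ** 2 for x in data) / (len(data) - dof)
    return variance ** 0.5

# Hypothetical returns, for illustration only
data = [0.010, -0.020, 0.015, 0.000, 0.005]

manual = sample_std(data)
with_pandas = pd.Series(data).std()  # ddof=1 by default

print(f"Standard deviations are the same : {abs(manual - with_pandas) < 1e-12}")
```

Note that `numpy.std` defaults to `ddof=0`, so it would give a slightly smaller value unless `ddof=1` is passed explicitly.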

Now that we have the mean daily return and the daily volatility, we can compute the annualized return and the annualized volatility.

The **annualized (yearly) return** represents the average growth rate of an investment over one year, assuming daily compounding of returns.


$
R_{\text{annual}} = \left( \prod_{t=1}^{N} (1 + r_t) \right)^{\frac{252}{N}} - 1
$

Where:  
- \( r_t \) = daily return on day \( t \)  
- \( N \) = number of trading days in your dataset  
- 252 = standard number of trading days in a year  

The **annualized volatility** measures how much daily returns fluctuate over one year.  
It is calculated by scaling the daily standard deviation by the square root of the number of trading days.

$
\sigma_{\text{annual}} = \sigma_{\text{daily}} \times \sqrt{252}
$

Where:  
- \( \sigma_{\text{daily}} \) = standard deviation of daily returns  
- \( \sqrt{252} \) = square root of the number of trading days in a year
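
The \(\sqrt{252}\) factor can be motivated with a short derivation. It is a sketch under two simplifying assumptions that are not stated in the text above: daily returns are uncorrelated with the same variance, and the yearly return is approximated by the sum of the daily returns.

$
\operatorname{Var}\left( \sum_{t=1}^{252} r_t \right) = \sum_{t=1}^{252} \operatorname{Var}(r_t) = 252 \, \sigma_{\text{daily}}^2
\quad \Longrightarrow \quad
\sigma_{\text{annual}} = \sqrt{252 \, \sigma_{\text{daily}}^2} = \sigma_{\text{daily}} \times \sqrt{252}
$

Variances add up linearly over time, so the standard deviation grows with the square root of the number of days.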


In [69]:
N = len(sp500['DailyReturn']) - 1
daily_returns = sp500['DailyReturn'].iloc[1:]

annualized_return = ( (1 + daily_returns).prod() ) ** (252 / N) - 1

print(f'Exact annualized return for 252 trading days: {round(annualized_return*100, 2)}%')

Exact annualized return for 252 trading days: 14.22%


In [70]:
annualized_volatility = sp500_std * np.sqrt(252)
print(f'Exact annualized volatility for 252 trading days is {round(annualized_volatility*100, 6)}%')

Exact annualized volatility for 252 trading days is 18.736682%


In [82]:
# Using Empyrical to check results

import empyrical as emp

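# Note: this series still contains the placeholder 0.0 of the first day,
# so empyrical sees 252 observations instead of the 251 used in the manual computation
daily_returns = sp500.DailyReturn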
annual_return = emp.annual_return(daily_returns)
annual_volatility = emp.annual_volatility(daily_returns)

print(f'Annualized return (with empyrical) is {round(annual_return*100, 2)}%')
print(f'Annualized volatility (with empyrical) is {round(annual_volatility*100, 2)}%')

Annualized return (with empyrical) is 14.16%
Annualized volatility (with empyrical) is 18.74%
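
The small gap between 14.22% and 14.16% is consistent with the placeholder 0.0 on the first day: it leaves the compounded growth unchanged but raises the observation count from 251 to 252, which lowers the annualization exponent. The sketch below reproduces the effect on hypothetical constant returns; `annualized_return` is a helper written for this illustration, not an empyrical function.

```python
import pandas as pd

def annualized_return(returns, periods_per_year=252):
    """Compound the returns, then rescale the growth to one year."""
    growth = (1 + returns).prod()
    return growth ** (periods_per_year / len(returns)) - 1

# Hypothetical series: 251 real observations of +0.1% each
real = pd.Series([0.001] * 251)

# Same series with a leading 0.0 placeholder, as in sp500["DailyReturn"]
padded = pd.concat([pd.Series([0.0]), real], ignore_index=True)

without_placeholder = annualized_return(real)    # exponent 252 / 251
with_placeholder = annualized_return(padded)     # exponent 252 / 252

print(f"Without placeholder: {without_placeholder:.4%}")
print(f"With placeholder:    {with_placeholder:.4%}")
```

When the index has risen over the period, the padded series gives the slightly lower figure.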


## Results and Interpretation

- **Mean Daily Return:** 0.000594  
  The average daily return of the S&P 500 over the past year, about **0.0594%** per trading day.  
  The index gained slightly on an average day, although individual days vary widely around this mean.

- **Daily Volatility:** 0.011803  
  The standard deviation of daily returns, about **1.18%** per day.  
  A typical daily move therefore lies within roughly ±1.18% of the mean (one standard deviation).

- **Annualized Return:** 14.22% (14.16% with `empyrical`)  
  The yearly growth rate obtained by compounding the daily returns and scaling to 252 trading days.  
  Over the period studied, the index rose by about **14%**.

- **Annualized Volatility:** 18.74%  
  The daily standard deviation scaled by \(\sqrt{252}\).  
  One standard deviation of yearly returns is about **±18.74%** around the average, assuming daily returns are independent.


### Sharpe Ratio

The **Sharpe Ratio** measures the excess return of an investment relative to the risk taken.  
It helps investors understand how much return they are earning per unit of risk.

Formula

$
S = \frac{R_{\text{annual}} - R_f}{\sigma_{\text{annual}}}
$

Where:  
- \(R_{\text{annual}}\) = annualized return of the investment  
- \(\sigma_{\text{annual}}\) = annualized volatility (standard deviation of returns)  
- \(R_f\) = annual risk-free rate (e.g., a US Treasury yield; the calculation below uses 5%)

---

Steps to Calculate

1. Choose the **risk-free rate** \(R_f\).  
2. Use the **annualized return** and **annualized volatility** already calculated.  
3. Apply the formula:

$
S = \frac{R_{\text{annual}} - R_f}{\sigma_{\text{annual}}}
$

---

Interpretation

- **Sharpe > 1** → Good risk-adjusted return  
- **Sharpe ≈ 0** → Return similar to risk-free rate  
- **Sharpe < 0** → Performance worse than risk-free rate  

The higher the Sharpe ratio, the better the investment’s return per unit of risk.


In [83]:
risk_free_rate = 0.05

sharpe_ratio = (annual_return - risk_free_rate) / annual_volatility

print(f'Sharpe Ratio for S&P 500 using US T Bonds is {sharpe_ratio}')

Sharpe Ratio for S&P 500 using US T Bonds is 0.4886684897841193


In [84]:
# Using Empyrical to check results

sharpe_ratio = emp.sharpe_ratio(daily_returns, risk_free=risk_free_rate/252)

print(f'Sharpe Ratio (with Empyrical) for S&P 500 using US T Bonds is {sharpe_ratio}')

Sharpe Ratio (with Empyrical) for S&P 500 using US T Bonds is 0.532584605656548


Calculation Notes

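- The manual calculation and `empyrical` give different values (≈ 0.49 vs ≈ 0.53), mainly because they annualize differently:
  - The manual version takes the compounded (geometric) annualized return, subtracts the annual risk-free rate, and divides by the annualized volatility.  
  - `empyrical.sharpe_ratio` takes the arithmetic mean of the daily excess returns, divides it by their standard deviation, and multiplies the result by \(\sqrt{252}\).  
  - Rounding and the exact number of observations have only a minor effect on the final value.  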
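
The two conventions can be compared side by side without `empyrical`. The sketch below uses hypothetical daily returns and two helper functions named for this illustration: `sharpe_geometric` mirrors the manual formula of this notebook, and `sharpe_arithmetic` follows the mean-over-standard-deviation convention.

```python
import numpy as np
import pandas as pd

def sharpe_geometric(returns, rf_annual, periods=252):
    """(compounded annual return - annual risk-free rate) / annualized volatility."""
    ann_return = (1 + returns).prod() ** (periods / len(returns)) - 1
    ann_vol = returns.std(ddof=1) * np.sqrt(periods)
    return (ann_return - rf_annual) / ann_vol

def sharpe_arithmetic(returns, rf_annual, periods=252):
    """Mean daily excess return / its standard deviation, scaled by sqrt(periods)."""
    excess = returns - rf_annual / periods
    return excess.mean() / excess.std(ddof=1) * np.sqrt(periods)

# Hypothetical daily returns, for illustration only
returns = pd.Series([0.012, -0.008, 0.005, -0.011, 0.009, 0.003, -0.004, 0.007] * 30)

geo = sharpe_geometric(returns, rf_annual=0.05)
ari = sharpe_arithmetic(returns, rf_annual=0.05)

print(f"Geometric convention:  {geo:.4f}")
print(f"Arithmetic convention: {ari:.4f}")
```

Neither convention is wrong; what matters is using the same one when comparing assets.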

---

## Interpretation

For the S&P 500 (with an annual risk-free rate of 0.05):  

- **Sharpe Ratio ≈ 0.5** (0.49 manually, 0.53 with `empyrical`)  
- For each unit of risk (one point of annualized volatility), the S&P 500 generated about **0.5 units of excess return** over the risk-free investment.  
- As a rule of thumb:
  - < 1 : moderate risk-adjusted performance  
  - 1–2 : good performance  
  - > 2 : excellent performance  

A value around 0.5 is often cited as typical for broad stock indices; here it reflects a moderate reward for the risk taken over this one-year window.
