Financial Time-Series Analysis
Scenario: You are a quantitative analyst given a CSV file with daily stock price data for a tech company.

Your Task: Perform a basic financial analysis. Load the data, calculate the daily return (the fractional change in closing price from one day to the next), and compute the 30-day rolling volatility of those returns. Then identify "high volume" trading days, defined as days where the trading volume is more than 1.5 standard deviations above the average volume. The final DataFrame should contain the original data plus new columns for daily_return, 30_day_volatility, and a boolean is_high_volume flag.

In [None]:
import numpy as np
import pandas as pd

stock_data = pd.read_csv('stock_data.csv')
# Normalize the column names, convert 'date' to datetime, and set it as the index
stock_data.columns = stock_data.columns.str.lower().str.replace(' ', '_')
stock_data['date'] = pd.to_datetime(stock_data['date'])
stock_data = stock_data.set_index('date')

print(stock_data.info())

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 546 entries, 2024-01-02 to 2025-06-30
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   open    546 non-null    float64
 1   high    546 non-null    float64
 2   low     546 non-null    float64
 3   close   546 non-null    float64
 4   volume  546 non-null    int64  
dtypes: float64(4), int64(1)
memory usage: 25.6 KB
None
              open    high     low  close    volume
date                                               
2025-06-21  157.75  158.25  156.50  157.0  24000000
2025-06-22  156.75  158.00  156.00  157.5  26000000
2025-06-23  157.25  158.75  156.75  158.0  30000000
2025-06-24  157.75  158.25  156.50  157.0  28000000
2025-06-25  156.75  158.00  156.00  157.5  27000000
2025-06-26  157.25  158.75  156.75  158.0  30000000
2025-06-27  157.75  158.25  156.50  157.0  24000000
2025-06-28  156.75  158.00  156.00  157.5  26000000
2025-06-29  157.25  158.75  156.75  158.0 
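
The loading steps above can be wrapped in a small helper. This is a minimal sketch, not the notebook's own code: the function name load_prices, the inline CSV text, and its header names are hypothetical stand-ins for stock_data.csv. The added sort_index() call is an assumption about what is useful here, since shift() and rolling() both depend on rows being in chronological order.

```python
import io
import pandas as pd

# Hypothetical CSV contents; the real stock_data.csv header may differ.
csv_text = """Date,Open,High,Low,Close,Volume
2024-01-03,151.0,153.0,150.5,152.5,27000000
2024-01-02,150.0,152.0,149.0,151.0,25000000
2024-01-04,152.5,153.5,151.0,152.0,26000000
"""

def load_prices(source):
    """Read daily prices, normalize column names, and index by date."""
    df = pd.read_csv(source)
    df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')
    df['date'] = pd.to_datetime(df['date'])
    # Sorting guards against files that are not in chronological order.
    return df.set_index('date').sort_index()

prices = load_prices(io.StringIO(csv_text))
print(prices.index.is_monotonic_increasing, prices.index.is_unique)
```

Checking that the index is sorted and free of duplicate dates before computing returns is cheap and catches the most common problems with daily price files.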

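The daily return is the change in closing price relative to the previous day's close: (close - previous close) / previous close. The result is a fraction, so 0.01 means 1%. The first row is NaN because it has no previous close.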

In [23]:
stock_data['daily_return'] = (stock_data['close'] - stock_data['close'].shift(1)) / stock_data['close'].shift(1)
print(stock_data['daily_return'][:10])

date
2024-01-02         NaN
2024-01-03    0.008237
2024-01-04   -0.003268
2024-01-05   -0.016393
2024-01-06    0.010000
2024-01-07    0.021452
2024-01-08    0.008078
2024-01-09   -0.004808
2024-01-10    0.001610
2024-01-11    0.008039
Name: daily_return, dtype: float64
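
The same quantity is available through the built-in pandas method Series.pct_change(). The sketch below compares the two on a short, hypothetical price series (the values are made up for illustration and are not taken from stock_data.csv).

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, used only for illustration.
close = pd.Series(
    [100.0, 101.0, 99.0, 102.0],
    index=pd.date_range('2024-01-02', periods=4, freq='D'),
    name='close',
)

# Explicit formula, as in the cell above.
manual = (close - close.shift(1)) / close.shift(1)
# Built-in equivalent.
builtin = close.pct_change()

# Both leave the first row as NaN because it has no previous close.
print(manual.isna().iloc[0], builtin.isna().iloc[0])
print(np.allclose(manual.iloc[1:], builtin.iloc[1:]))
```

The explicit formula makes the definition visible, while pct_change() is shorter. On data without missing prices the two should agree up to floating-point rounding.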


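Calculate the rolling standard deviation of daily returns over a 30-day window (the current day and the 29 days before it) as a measure of how volatile the price was. The first 30 rows have no value: the first daily return is itself NaN, so the first window that contains 30 valid returns ends on the 31st row.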

In [24]:
stock_data['30_day_volatility'] = stock_data['daily_return'].rolling(30).std()
print(stock_data['30_day_volatility'].sample(10))

date
2025-04-15    0.004559
2024-01-13         NaN
2024-12-01    0.004559
2024-02-11    0.004548
2024-03-19    0.004559
2025-03-12    0.004559
2024-04-17    0.004559
2024-10-26    0.004559
2025-03-27    0.004559
2024-11-13    0.004559
Name: 30_day_volatility, dtype: float64
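
To see how many leading rows come out as NaN, the rolling calculation can be run on a short series. This is a sketch under stated assumptions: the prices are a seeded random walk invented for illustration, not the contents of stock_data.csv, and the names window, returns, and volatility are local to this example.

```python
import numpy as np
import pandas as pd

window = 30
# Hypothetical prices from a seeded random walk; not the notebook's data.
rng = np.random.default_rng(0)
close = pd.Series(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, 60))),
    index=pd.date_range('2024-01-02', periods=60, freq='D'),
)
returns = close.pct_change()
# rolling(window) requires `window` non-NaN values by default (min_periods).
volatility = returns.rolling(window).std()

# Row 0 has no return, so the first full window of 30 returns ends at row 30.
print(volatility.isna().sum())          # number of leading NaN rows
print(volatility.first_valid_index())   # first date with a volatility value
```

Rolling.std() uses the sample standard deviation (ddof=1) by default, which matches Series.std().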


Flag high-volume days: days whose trading volume exceeds the mean volume by more than 1.5 standard deviations.

In [None]:
stock_data['is_high_volume'] = (stock_data['volume'] - stock_data['volume'].mean()) > stock_data['volume'].std() * 1.5
print('Volume records for high-volume days: ')
print(stock_data.loc[stock_data['is_high_volume'], 'volume'])
print('Mean volume is: ', stock_data['volume'].mean(), ' And max volume is: ', stock_data['volume'].max())

Volume records for high-volume days: 
date
2024-01-05    31000000
2024-01-07    35000000
2024-01-08    33000000
2024-01-14    32000000
2024-01-18    31000000
Name: volume, dtype: int64
Mean volume is:  27523809.523809522  And max volume is:  35000000
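
The same rule can be written as a z-score comparison, which keeps the threshold in one named variable. This is a minimal sketch: the volume figures are hypothetical, and threshold_sd and z_score are names introduced here for illustration.

```python
import pandas as pd

# Hypothetical volumes; the last value is an obvious spike.
volume = pd.Series(
    [25_000_000, 26_000_000, 24_000_000, 27_000_000, 25_000_000, 40_000_000],
    index=pd.date_range('2024-01-02', periods=6, freq='D'),
    name='volume',
)

threshold_sd = 1.5
# Distance from the mean, measured in sample standard deviations (ddof=1).
z_score = (volume - volume.mean()) / volume.std()
is_high_volume = z_score > threshold_sd

# Same rule in the form used in the cell above, for comparison.
original_form = (volume - volume.mean()) > volume.std() * threshold_sd

print(volume[is_high_volume])
```

Both forms use the mean and standard deviation of the whole sample, so whether a given day is flagged depends on every other day in the file, including later ones.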


Conclusion: The final DataFrame contains the original price and volume columns plus daily_return, 30_day_volatility, and a boolean is_high_volume flag. Five days, all in January 2024, exceed the high-volume threshold. These columns provide key metrics for further financial modeling.