In [1]:
import pandas as pd

META = '/kaggle/input/meta-stock-data-2025/META stocks.csv'
df = pd.read_csv(filepath_or_buffer=META, parse_dates=['Date'])
df['year'] = df['Date'].dt.year
df.head()

Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume,year
0,2012-05-18,42.049999,45.0,38.0,38.23,38.084515,573576400,2012
1,2012-05-21,36.529999,36.66,33.0,34.029999,33.900501,168192700,2012
2,2012-05-22,32.610001,33.59,30.940001,31.0,30.882032,101786600,2012
3,2012-05-23,31.370001,32.5,31.360001,32.0,31.878227,73600000,2012
4,2012-05-24,32.950001,33.209999,31.77,33.029999,32.904308,50237200,2012


Let's start by looking at the price/volume correlations.

In [2]:
df[['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume']].corr()

Unnamed: 0,Open,High,Low,Close,Adj Close,Volume
Open,1.0,0.999824,0.999806,0.999603,0.999604,-0.381216
High,0.999824,1.0,0.999781,0.999817,0.999815,-0.378256
Low,0.999806,0.999781,1.0,0.999833,0.999832,-0.384853
Close,0.999603,0.999817,0.999833,1.0,0.999998,-0.381852
Adj Close,0.999604,0.999815,0.999832,0.999998,1.0,-0.381578
Volume,-0.381216,-0.378256,-0.384853,-0.381852,-0.381578,1.0


What do we see? No pair of price columns is perfectly correlated, so no column is an exact copy of another, although Close and Adj Close come very close at 0.999998. Prices and volume are negatively correlated, at roughly -0.38. The latter isn't surprising, as share volume tends to decline as the share price rises. Let's look at the price time series.

In [3]:
from plotly import express

express.scatter(data_frame=df, x='Date', y='Adj Close', color='year')

What do we see? Prices appear to have risen fairly steadily through late 2021, then fallen substantially, then recovered.
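
Because prices trend so strongly, a correlation computed on price levels, like the one in the table earlier, can be dominated by that shared long-run trend. As a rough cross-check, the sketch below correlates daily percentage changes instead. The helper name `change_correlation` and the synthetic `demo` frame are assumptions made for illustration and are not part of the original notebook; only the column names `Adj Close` and `Volume` come from the data above.

```python
import numpy as np
import pandas as pd


def change_correlation(frame: pd.DataFrame) -> float:
    """Correlate daily price returns with daily relative volume changes."""
    changes = pd.DataFrame({
        'return': frame['Adj Close'].pct_change(),
        'volume_change': frame['Volume'].pct_change(),
    }).dropna()  # the first row has no previous day to compare against
    return changes['return'].corr(changes['volume_change'])


# Synthetic stand-in data (hypothetical, NOT the META file):
# a random-walk price and a volume that scales inversely with price plus noise.
rng = np.random.default_rng(0)
n = 500
price = 40 * np.exp(np.cumsum(rng.normal(0.001, 0.02, n)))
volume = 1e9 / price * np.exp(rng.normal(0.0, 0.3, n))
demo = pd.DataFrame({'Adj Close': price, 'Volume': volume})

level_corr = demo['Adj Close'].corr(demo['Volume'])
change_corr = change_correlation(demo)
print(f'levels: {level_corr:.3f}, daily changes: {change_corr:.3f}')
```

On the notebook's data this would be called as `change_correlation(df)`. A value much closer to zero than the level correlation would suggest that the negative relationship is mostly a long-run effect rather than a day-to-day one.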

In [4]:
express.scatter(data_frame=df, x='Date', y='Adj Close', color='year', log_y=True)

If instead we plot the price on a log-scaled axis, where equal percentage moves take up equal vertical distance, we see that there was also a lot of volatility during 2012-2014.

Let's take a look at the volume time series. Volume is not serially correlated in the same way prices are, so we expect it to look much noisier from day to day than prices do.

In [5]:
express.scatter(data_frame=df, x='Date', y='Volume', color='year', log_y=True)
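
The serial-correlation claim can be checked numerically with lag-1 autocorrelation, using pandas' `Series.autocorr`. The sketch below reports it for both the levels and the daily percentage changes of each column. The helper name `lag1_autocorr` and the synthetic `demo` frame are assumptions for illustration only; the demo data is a random-walk price with independently drawn volume, not the META file.

```python
import numpy as np
import pandas as pd


def lag1_autocorr(frame: pd.DataFrame, columns=('Adj Close', 'Volume')) -> pd.DataFrame:
    """Lag-1 autocorrelation of each column's levels and daily changes."""
    records = []
    for name in columns:
        series = frame[name]
        records.append({
            'level': series.autocorr(lag=1),
            'daily_change': series.pct_change().autocorr(lag=1),
        })
    return pd.DataFrame(records, index=list(columns))


# Synthetic stand-in data (hypothetical): random-walk price, independent volume.
rng = np.random.default_rng(0)
n = 500
demo = pd.DataFrame({
    'Adj Close': 40 * np.exp(np.cumsum(rng.normal(0.0, 0.02, n))),
    'Volume': 5e7 * np.exp(rng.normal(0.0, 0.5, n)),
})
print(lag1_autocorr(demo))
```

On the notebook's data, `lag1_autocorr(df)` would show how close each series is to its own previous-day value. A level autocorrelation near 1 for price and a noticeably lower one for volume would be consistent with the noisier volume plot.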

If we squint, we can sort of see that volume has, on average, declined over time. Let's try plotting the price and volume together.

In [6]:
express.scatter(data_frame=df, x='Adj Close', y='Volume', color='year', log_x=True, log_y=True)

Price and volume both span such wide ranges that we need log scales on both axes, but when we use them we can see that volume has declined as price has risen over time.
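
The visual impression from the log-log plot can be summarized by fitting a straight line to log volume against log price. The sketch below does this with `numpy.polyfit`. The helper name `loglog_slope` and the small `demo` frame are assumptions for illustration, not part of the original notebook, and an ordinary least-squares line is only one simple way to summarize the relationship.

```python
import numpy as np
import pandas as pd


def loglog_slope(frame: pd.DataFrame, x: str = 'Adj Close', y: str = 'Volume'):
    """Least-squares slope and intercept of log10(y) against log10(x)."""
    # Logs are only defined for positive values, so drop anything else.
    positive = frame[(frame[x] > 0) & (frame[y] > 0)]
    log_x = np.log10(positive[x].to_numpy())
    log_y = np.log10(positive[y].to_numpy())
    slope, intercept = np.polyfit(log_x, log_y, deg=1)
    return slope, intercept


# Hypothetical data in which volume is exactly inversely proportional to price.
demo = pd.DataFrame({'Adj Close': [10.0, 20.0, 40.0, 80.0]})
demo['Volume'] = 1e9 / demo['Adj Close']

slope, intercept = loglog_slope(demo)
print(f'slope: {slope:.2f}, intercept: {intercept:.2f}')
```

On the notebook's data this would be called as `loglog_slope(df)`. A negative slope means volume falls as price rises, and a slope near -1 would mean that price times share volume stays roughly constant.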