In [1]:
import pandas as pd

BTC = '/kaggle/input/bitcoin-usd-btc-usd/BTC-USD new.csv'
# the adjusted close values are always the same as the Close values so we can drop them
df = pd.read_csv(filepath_or_buffer=BTC, parse_dates=['Date'], thousands='.').drop(columns=['Adj Close'])
# parsing with thousands='.' drops the dots, leaving prices scaled up by 1,000,000; divide to get quotes in USD
df[['Open', 'High', 'Low', 'Close']] = 1.0/1000000.0 * df[['Open', 'High', 'Low', 'Close']]
df['month'] = df['Date'].dt.month
df['Daily'] = df['Close'].diff()
df['abs Daily'] = df['Daily'].abs()
df.head()

Unnamed: 0,Date,Open,High,Low,Close,Volume,month,Daily,abs Daily
0,2023-02-16,24307.349609,25134.117188,23602.523438,23623.474609,39316664596,2,,
1,2023-02-17,23621.283203,24924.041016,23460.755859,24565.601563,41358451255,2,942.126954,942.126954
2,2023-02-18,24565.296875,24798.835938,24468.373047,24641.277344,19625427158,2,75.675781,75.675781
3,2023-02-19,24640.027344,25093.054688,24327.642578,24327.642578,25555105670,2,-313.634766,313.634766
4,2023-02-20,24336.623047,25020.458984,23927.910156,24829.148438,28987376573,2,501.50586,501.50586


In [2]:
from plotly.express import line
line(data_frame=df, x='Date', y=['Open', 'High', 'Low', 'Close'])

What jumps out at us from this plot?
* BTC has roughly doubled during the period of interest
* BTC was in a trading range through roughly mid-October, then broke out
* We can see more daily volatility during the last few weeks compared to earlier in the period of interest
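
One way to put numbers on these observations is sketched below. The helper name `summarize_moves` and the 30-day window are assumptions made for illustration, and the demo frame is synthetic, not the BTC data.

```python
import numpy as np
import pandas as pd


def summarize_moves(frame: pd.DataFrame, window: int = 30) -> dict:
    """Return the overall price ratio and early vs. late volatility of daily changes."""
    close = frame['Close']
    daily = close.diff()
    return {
        # last close divided by first close; about 2.0 means the price doubled
        'price_ratio': close.iloc[-1] / close.iloc[0],
        # standard deviation of the first and last `window` daily changes
        'early_vol': daily.iloc[1:window + 1].std(),
        'late_vol': daily.iloc[-window:].std(),
    }


# synthetic stand-in: flat and quiet for 200 days, then rising and noisier
rng = np.random.default_rng(0)
steps = np.concatenate([rng.normal(0, 100, 200), rng.normal(150, 400, 100)])
demo = pd.DataFrame({
    'Date': pd.date_range('2023-01-01', periods=300),
    'Close': 20000 + steps.cumsum(),
})
summary = summarize_moves(demo)
print(summary)
```

Calling `summarize_moves(df)` on the frame from the first cell should give the corresponding figures for BTC.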

Let's add a trendline to check whether the change in trend shows up numerically.

In [3]:
from plotly.express import scatter
for trendline in ['ols', 'lowess']:
    scatter(data_frame=df, x='Date', y='Close', trendline=trendline).show()

In the upper scatter plot we see that since December 3 or so BTC has been trading above the OLS line almost every day. In the lower scatter plot we see how the LOWESS trendline bent upward in late October.
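
To read the same signal without the plot, one can fit a straight line by least squares and count how often recent closes sit above it. This is a minimal sketch using `numpy.polyfit`, which is not necessarily how plotly computes its trendline; `share_above_ols` and the 14-day window are hypothetical choices, and the demo data is synthetic.

```python
import numpy as np
import pandas as pd


def share_above_ols(frame: pd.DataFrame, last_n: int = 14) -> float:
    """Fraction of the last `last_n` closes that lie above a least-squares line."""
    # use the day offset from the first date as the regressor
    x = (frame['Date'] - frame['Date'].min()).dt.days.to_numpy(dtype=float)
    y = frame['Close'].to_numpy(dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    residual = y - (slope * x + intercept)
    return float((residual[-last_n:] > 0).mean())


# synthetic stand-in: flat for 250 days, then a steady climb for 50 days
close = np.concatenate([np.full(250, 25000.0), 25000.0 + 300.0 * np.arange(1, 51)])
demo = pd.DataFrame({'Date': pd.date_range('2023-02-16', periods=300), 'Close': close})
print(share_above_ols(demo))
```

A value near 1.0 says the recent closes sit almost entirely above the fitted line, which is the pattern described for the upper plot.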

In [4]:
from plotly.express import histogram
histogram(data_frame=df, x='Close')

Here our trading range and breakout show themselves again: the left mode is the early trading range, while the right mode is the new trading range.

In [5]:
histogram(data_frame=df, x='Daily', marginal='box', nbins=len(df)//5)

In a free market we expect daily price changes to be more or less Gaussian. We do see something here that looks Gaussian, with some interesting skew. Does our daily change distribution lean more positive or more negative?

In [6]:
(df['Daily'] == df['abs Daily']).sum(), (df['Daily'] == -df['abs Daily']).sum(), len(df),

(181, 184, 366)

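We have roughly as many days with gains (181) as with losses (184). The one remaining row is the first day, which has no previous close to difference against.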
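
The equality test above works, but it would count a zero change as both a gain and a loss, and it drops the leading `NaN` without saying so. A slightly more explicit sketch, with the hypothetical helper `count_moves` and a tiny made-up price series, might look like this:

```python
import numpy as np
import pandas as pd


def count_moves(daily: pd.Series) -> dict:
    """Count up, down, and flat days, ignoring missing values."""
    signs = np.sign(daily.dropna())
    return {
        'up': int((signs > 0).sum()),
        'down': int((signs < 0).sum()),
        'flat': int((signs == 0).sum()),
        # positive skew means a longer right tail (a few large gains)
        'skew': float(daily.skew()),
    }


# made-up closes, only to exercise the helper
close = pd.Series([100.0, 101.0, 100.5, 100.5, 104.0, 103.0])
moves = count_moves(close.diff())
print(moves)
```

Passing `df['Daily']` to `count_moves` would answer the gain-versus-loss question and report the sample skewness in one step.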

Let's look at the relationships between price, volume, and time.

In [7]:
from plotly.express import imshow
imshow(img=df[['month', 'Close', 'Volume']].corr())

We generally expect price and volume to be weakly correlated or uncorrelated, and that is in fact what we see.

In [8]:
scatter(data_frame=df, x='Volume', y='Daily', color='month')

If we look at the relationship between volume and daily volatility, we see that daily changes (that is, the one-day differences in the close) tend to be large in magnitude when volume is high. This is what we would expect to see in the absence of large amounts of algorithmic trading: daily price changes are a function of open interest, which is finite.

In [9]:
scatter(data_frame=df, x='Volume', y='abs Daily', color='month', trendline='ols')

If we look at the relationship between trading volume and absolute daily change, we see again that they are weakly correlated; the R² of our OLS trendline is roughly 0.375.
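
For a one-variable OLS fit with an intercept, R² equals the square of the Pearson correlation, so the trendline figure can be cross-checked without plotly. The sketch below uses synthetic data and a hypothetical helper `r_squared`; it is not the notebook's actual computation.

```python
import numpy as np
import pandas as pd


def r_squared(x: pd.Series, y: pd.Series) -> float:
    """R^2 of a simple linear regression of y on x, via the Pearson correlation."""
    pair = pd.concat([x, y], axis=1).dropna()
    r = pair.iloc[:, 0].corr(pair.iloc[:, 1])
    return float(r ** 2)


# synthetic stand-in for volume and absolute daily change
rng = np.random.default_rng(1)
volume = pd.Series(rng.uniform(1e10, 5e10, 365), name='Volume')
abs_daily = pd.Series(2e-8 * volume + rng.normal(0, 300, 365), name='abs Daily').abs()
print(r_squared(volume, abs_daily))

# an R^2 of 0.375 corresponds to a correlation of about 0.61
print(round(0.375 ** 0.5, 3))
```

On the real data the call would be `r_squared(df['Volume'], df['abs Daily'])`.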

In [10]:
scatter(data_frame=df, y='Volume', x='Close', color='month')

This graph is a harder-to-read version of the insight we had above. Prices tend to cluster within months because, for much of the period of interest, prices moved in fairly discrete trading ranges. We also see here that volume is effectively independent: it does not depend on the month or the price.
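
The clustering by month can also be checked in a table rather than a scatter plot. A minimal sketch, assuming a frame with `month`, `Close`, and `Volume` columns like the one built in the first cell; the demo values are invented.

```python
import pandas as pd

# invented values, only to show the shape of the output
demo = pd.DataFrame({
    'month': [2, 2, 2, 3, 3, 3],
    'Close': [23600.0, 24500.0, 24600.0, 27100.0, 28000.0, 28400.0],
    'Volume': [39e9, 41e9, 20e9, 25e9, 29e9, 33e9],
})

# per-month price range and median volume
by_month = demo.groupby('month').agg(
    close_min=('Close', 'min'),
    close_max=('Close', 'max'),
    volume_median=('Volume', 'median'),
)
print(by_month)
```

Months whose `close_min` to `close_max` ranges barely overlap correspond to the discrete trading ranges, while similar `volume_median` values across months would support the reading that volume does not track the month.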