In [None]:
import pandas as pd
import plotly.express as px
import polars as pl

pd.options.plotting.backend = "plotly"

df = pd.read_csv('block-times-oct-27-2023.zip', index_col=0)

In [None]:
min_time = df.min(axis=1)
max_time = df.max(axis=1)
diff = max_time - min_time
fig = px.line(diff)
fig.update_layout(
    xaxis_title='Block',
    yaxis_title='Timestamp delta(s)',
    showlegend=False,
)

fig.show()


One of the initial blocks shows a significant delta of about 400s, but it wasn't repeated, so it is probably safe to
ignore for now. The delta appears to be somewhat consistently larger between blocks ~800K and ~1.2M,
although this may be a bit misleading because the graph is so dense.
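
One way to check whether a dense line plot is misleading is to summarize the deltas in fixed-size bins of blocks. The sketch below is only an illustration on made-up data: `diff_demo`, `bin_size`, and `summary` are hypothetical names, and the values are synthetic rather than taken from the block-times file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `diff`: one timestamp delta (seconds) per block.
rng = np.random.default_rng(0)
diff_demo = pd.Series(rng.integers(0, 5, size=10_000).astype(float))
diff_demo.iloc[4_321] = 400.0  # a single outlier block

# Group blocks into bins of 1,000 and keep the median and maximum of each bin.
bin_size = 1_000
bins = diff_demo.index // bin_size
summary = diff_demo.groupby(bins).agg(["median", "max"])

print(summary)
```

In a summary like this, a single outlier only shows up in the `max` column of its bin, while a sustained slowdown would also raise the `median`.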

Let's see if we can zoom in on that 800K to 1.2M block range.

In [None]:
slow_area = px.line(diff[800_000:1_200_000])
slow_area.update_layout(
    xaxis_title='Block',
    yaxis_title='Timestamp delta(s)',
    showlegend=False,
)

slow_area.show()

This still isn't clear enough, so we'll zoom in further to that high 200s point.

In [None]:
zoomed_slow_area = px.line(diff[979_200:979_400])
zoomed_slow_area.update_layout(
    xaxis_title='Block',
    yaxis_title='Timestamp delta(s)',
    showlegend=False,
)

zoomed_slow_area.show()


Zooming in to a range of 200 blocks gives us a better idea. It looks like there was an issue where ~30 consecutive
blocks took a bit longer.
If we look at the timestamps per node perhaps we can better understand what's going on.

In [None]:
node_times = px.line(df[979270:979300].astype(dtype="datetime64[s]"))

node_times.update_layout(
    xaxis_title='Block',
    yaxis_title='Time (UTC)',
    legend=dict(
        orientation="h",
        yanchor="top",
        y=4,
        xanchor="center",
        x=0.5
    )
)

node_times.show()

Looking at that output it appears that for whatever reason the IdeasBeyondBorders node was being delayed noticeably more
than the other nodes.
Hint: you can mouse over the graph to see the exact values and node name.

Let's spot check one other hotspot.

In [None]:
zoomed_slow_tail = px.line(diff[1_168_900:1_169_000])
zoomed_slow_tail.update_layout(
    xaxis_title='Block',
    yaxis_title='Timestamp delta(s)',
    showlegend=False,
)

zoomed_slow_tail.show()


In [None]:
node_times_2 = px.line(df[1_168_940:1_168_950].astype(dtype="datetime64[s]"))

node_times_2.update_layout(
    xaxis_title='Block',
    yaxis_title='Time (UTC)',
    legend=dict(
        orientation="h",
        yanchor="top",
        y=4,
        xanchor="center",
        x=0.5
    )
)

node_times_2.show()

It looks like the LongNowFoundation node was delayed in this instance. Based
on just these two examples, it seems that we can't claim that one node is
consistently delayed more than the others.
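
Instead of spot checking individual hotspots, one could count how often each node reports the latest timestamp for a block. This is a minimal sketch on made-up data; `times_demo` and the node names `node_a`, `node_b`, and `node_c` are hypothetical and do not come from the dataset.

```python
import pandas as pd

# Hypothetical timestamps (Unix seconds) for four blocks and three made-up nodes.
times_demo = pd.DataFrame(
    {
        "node_a": [100, 110, 120, 130],
        "node_b": [101, 111, 125, 131],
        "node_c": [103, 110, 121, 130],
    },
    index=[1, 2, 3, 4],
)

# Name of the node that reported the latest timestamp for each block.
# If several nodes tie for the latest value, idxmax returns the first column.
latest_node = times_demo.idxmax(axis=1)

# How often each node was the latest one.
latest_counts = latest_node.value_counts()
print(latest_counts)
```

On the real data the same two calls could be applied to a slice of `df`, keeping in mind that ties are attributed to the first matching column.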

What we can do is plot the delta per node based on the median or mode of the timestamps.
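
As a small illustration of the idea, the deviation from the per-block median can be computed by subtracting the row-wise median from every column. The sketch uses pandas only and made-up values; `times_demo` and the node names are hypothetical.

```python
import pandas as pd

# Hypothetical timestamps (Unix seconds) for three blocks and three made-up nodes.
times_demo = pd.DataFrame(
    {
        "node_a": [100, 110, 120],
        "node_b": [101, 111, 125],
        "node_c": [103, 110, 121],
    }
)

# Median timestamp of each block across all nodes.
row_median = times_demo.median(axis=1)

# Deviation of each node from that median; axis=0 aligns the median with the rows.
deviation_demo = times_demo.sub(row_median, axis=0)
print(deviation_demo)
```

A positive value means the node reported a later timestamp than the median node for that block, and a negative value means an earlier one.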

In [None]:
# Switched to polars for the deviation because pandas was slow here
# (the median itself is still computed with pandas).
# The result is converted back to pandas because I need to learn how to plot with polars.
median = df.median(axis=1)
deviation = pl.from_pandas(df) - pl.from_pandas(median)
pandas_deviation = deviation.to_pandas()

In [None]:
deviations = px.line(pandas_deviation[800_000:1_200_000])
deviations.update_layout(
    xaxis_title='Block',
    yaxis_title='Timestamp delta(s)',
    legend=dict(
        orientation="h",
        yanchor="top",
        y=4,
        xanchor="center",
        x=0.5
    ),
)

deviations.show()


It can be a bit hard to tell the nodes apart in this plot. Since there are only 10 nodes, we can plot each of them individually.

In [None]:
for node in df.columns:
    node_deviation = px.line(pandas_deviation[800_000:1_200_000][node])
    node_deviation.update_layout(
        xaxis_title='Block',
        yaxis_title='Timestamp delta(s)',
        legend=dict(
            orientation="h",
            yanchor="top",
            y=4,
            xanchor="center",
            x=0.5
        ),
    )
    node_deviation.show()

It's important to look at the scale on the y axis. Based on the graphs it appears that nodes for BlockDaemon, Ideas Beyond Borders, and The Long Now Foundation were all facing delays during that time. The other nodes were mostly within 1 second of each other.
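
Because each plot has its own y-axis scale, a per-node summary number can make the comparison easier than reading the axes. The following is only a sketch under assumptions: `deviation_demo` is a made-up deviation table with hypothetical node names, standing in for a slice of `pandas_deviation`.

```python
import pandas as pd

# Hypothetical deviations from the median (seconds) for three made-up nodes.
deviation_demo = pd.DataFrame(
    {
        "node_a": [0.0, -0.5, 0.2, 0.0],
        "node_b": [0.0, 12.0, 0.0, -3.0],
        "node_c": [1.0, 0.0, 0.0, 0.5],
    }
)

# Largest absolute deviation per node, sorted from worst to best.
worst = deviation_demo.abs().max().sort_values(ascending=False)
print(worst)
```

The maximum is sensitive to single outliers; a high quantile of the absolute deviation would be a more robust choice for the same comparison.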

Another interesting thing we can look at is the distribution of the deltas.


In [None]:
histogram = px.histogram(diff)
histogram.update_layout(
    yaxis_title='Number of Blocks',
    xaxis_title='Timestamp delta(s)',
    showlegend=False,
)
histogram.show()

This is a bit dense, but one can see that a significant number of blocks have a timestamp delta smaller than 10 seconds.

Perhaps this data is better shown as quantiles.

In [None]:
quantiles = diff.quantile([0.99, 0.95, 0.9, 0.75, 0.50, 0.25])
quantiles

Looking here, we can see that for 99% of blocks the timestamps are within 17 seconds of each other, for 95% within 7 seconds, etc.
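
To make the reading of a quantile concrete: the 0.99 quantile is the value at or below which roughly 99% of the deltas fall. The sketch below shows this on a made-up series; `diff_demo` is hypothetical and simply holds the values 1 to 100.

```python
import pandas as pd

# Hypothetical deltas: the values 1.0, 2.0, ..., 100.0 (seconds).
diff_demo = pd.Series(range(1, 101), dtype=float)

# Quantiles with pandas' default linear interpolation.
q = diff_demo.quantile([0.99, 0.50])

# Share of blocks whose delta is at or below the 0.99 quantile.
share_below_q99 = (diff_demo <= q[0.99]).mean()

# The reverse question: which share of blocks stays within 10 seconds?
share_within_10 = (diff_demo <= 10).mean()

print(q)
print(share_below_q99, share_within_10)
```

The same comparison against a fixed threshold could be applied to `diff` to ask what fraction of blocks stays within a chosen number of seconds.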

However, this data keeps the most delayed node in the set. How do these values look if we remove the most delayed
node?