Exploring some interactive plots for the ChrisCo company website pages.


In [None]:
!pip install hvplot

1. First, let's create an interactive visualisation showing line plots of all the pages on the site.

In [None]:
import holoviews as hv
import pandas as pd
import hvplot.pandas

data = pd.read_csv('https://tinyurl.com/ChrisCoDV/Pages/DailyHits.csv', index_col=0)
data.index = pd.to_datetime(data.index)
print(data.head())

plot = data.hvplot.line(frame_height=500, frame_width=500, xlabel='Date', ylabel='Hits', title='All Pages')
hv.extension('bokeh')
plot

2. Next, create an interactive visualisation showing just the two high-volume pages as line plots.

In [None]:
selected = ['001', '015']

plot = data[selected].hvplot.line(frame_height=500,
                                  frame_width=500, xlabel='Date', ylabel='Hits',
                                  title='High Volume Pages')
hv.extension('bokeh')
plot
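
The two page labels above are typed in by hand. As a rough sketch of how the highest-volume pages could be picked programmatically, the example below ranks pages by total hits. It runs on a small made-up table, so `demo` and the page names `p1` to `p3` are illustrative assumptions, not the ChrisCo data.

```python
import pandas as pd

# Synthetic daily hits for three hypothetical pages (not the ChrisCo data)
demo = pd.DataFrame({
    'p1': [900, 950, 1000],
    'p2': [20, 30, 25],
    'p3': [1200, 1100, 1300],
})

# Total hits per page, then the labels of the two largest totals
top_two = demo.sum().nlargest(2).index.tolist()
print(top_two)
```

The same `sum().nlargest(2)` call should work on `data`, assuming each column holds one page's daily hits.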

3. Now, let's create an interactive visualisation showing superimposed histograms of the hit distributions of the high-volume pages, with customised bin sizes.

In [None]:
x_min = 900
x_max = 1600
bin_width = 10
n_bins = (x_max - x_min) // bin_width
print(str(n_bins) + ' bins')
# n_bins bins need n_bins + 1 edges, running from x_min to x_max inclusive
bins = [x_min + i * bin_width for i in range(n_bins + 1)]

selected = ['001', '015']
plot = data[selected].hvplot.hist(
    frame_height=500, frame_width=500,
    xlabel='Hits', ylabel='Frequency',
    title='High Volume Pages',
    alpha=0.5, muted_alpha=0, muted_fill_alpha=0, muted_line_alpha=0,
    tools=['pan', 'box_zoom', 'wheel_zoom', 'undo', 'redo', 'hover', 'save', 'reset'],
    bins=bins
)
hv.extension('bokeh')
plot
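
When bins are given as a list of edges, n bins need n + 1 edges. The sketch below checks that relationship for the same range and bin width. It uses NumPy, which this notebook does not otherwise import, so treat it as an optional sanity check rather than part of the plotting code.

```python
import numpy as np

x_min, x_max, bin_width = 900, 1600, 10

# n bins of equal width need n + 1 edges, from x_min to x_max inclusive
n_bins = (x_max - x_min) // bin_width
edges = np.linspace(x_min, x_max, n_bins + 1)

print(n_bins, 'bins,', len(edges), 'edges')
print(edges[:3], edges[-3:])
```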

4. Create an interactive visualisation displaying a heatmap of the time-series correlations between all pages. Using the zoom feature, we can find strong positive and negative correlations.

In [None]:
plot = data.corr().hvplot.heatmap(
    frame_height=500, frame_width=500,
    title='Page correlations',
    rot=90, cmap='coolwarm'
).opts(invert_yaxis=True, clim=(-1, 1))
hv.extension('bokeh')
plot

The three most closely correlated pages, in terms of page hits, are 048, 155 and 156.
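
Reading the strongest correlations off a heatmap by eye can be cross-checked numerically. The sketch below ranks every pair of columns by correlation. It runs on synthetic data, so `demo` and the column names `A`, `B` and `C` are made up for illustration and the numbers say nothing about the ChrisCo pages.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the hits table (NOT the ChrisCo data):
# columns 'A' and 'B' share a common signal, 'C' is independent noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=200)
demo = pd.DataFrame({
    'A': signal + 0.1 * rng.normal(size=200),
    'B': signal + 0.1 * rng.normal(size=200),
    'C': rng.normal(size=200),
})

corr = demo.corr()

# Keep only the upper triangle (k=1 excludes the diagonal) so each pair
# appears once, then flatten to a Series indexed by (column, column).
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
pairs = upper.stack().dropna().sort_values(ascending=False)

print(pairs.head())
```

Applied to `data.corr()`, the top of `pairs` would list the most positively correlated page pairs. Sorting in ascending order would surface the most negative ones.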

5. Now, let's create an interactive visualisation containing three scatter plots, one for each pair of these pages.

In [None]:

xlimits = (20, 230)
ylimits = (20, 120)
plot = data.hvplot.scatter(
    frame_height=300, frame_width=300,
    x='048', y='155', title='048 vs 155',
    xlim=xlimits, ylim=ylimits, size=10
) + \
data.hvplot.scatter(
    frame_height=300, frame_width=300,
    x='155', y='156', title='155 vs 156',
    xlim=xlimits, ylim=ylimits, size=10
) + \
data.hvplot.scatter(
    frame_height=300, frame_width=300,
    x='048', y='156', title='048 vs 156',
    xlim=xlimits, ylim=ylimits, size=10
)
hv.extension('bokeh')
plot



6. Create an interactive visualisation of pages 048, 155 and 156 as three line subplots.


In [None]:
selected = ['048', '155', '156']
plot = data[selected].hvplot.line(
    frame_height=300, frame_width=300,
    xlabel='Date', ylabel='Hits',
    subplots=True)

hv.extension('bokeh')
plot

7. Finally, let's generate a bubble plot using the summary data to depict annual hits vs revenue, where the bubble sizes are determined by viewing time.

In [None]:

# Read the daily hits and the per-page metrics from CSV files into pandas DataFrames
data = pd.read_csv('https://tinyurl.com/ChrisCoDV/Pages/DailyHits.csv', index_col=0)
exit_rate = pd.read_csv('https://tinyurl.com/ChrisCoDV/Pages/PageExitRate.csv', index_col=0)
revenue = pd.read_csv('https://tinyurl.com/ChrisCoDV/Pages/PageRevenue.csv', index_col=0)
size = pd.read_csv('https://tinyurl.com/ChrisCoDV/Pages/PageSize.csv', index_col=0)
speed = pd.read_csv('https://tinyurl.com/ChrisCoDV/Pages/PageSpeed.csv', index_col=0)
viewing_time = pd.read_csv('https://tinyurl.com/ChrisCoDV/Pages/PageViewingTime.csv', index_col=0)


summary_data = pd.DataFrame(index=data.columns)
summary_data['View Time'] = viewing_time.values
summary_data['Hits'] = data.sum().values
summary_data['Revenue'] = revenue.values
summary_data['Speed'] = speed.values
summary_data['Size'] = size.values
summary_data['Exit Rate'] = exit_rate.values
print(summary_data.describe())
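
The `.values` assignments above copy numbers by position, so they rely on each CSV listing the pages in the same order as the columns of `data`. The sketch below uses made-up values (`summary` and `metric` are hypothetical names) to show how positional assignment and label-aligned assignment differ when the order does not match.

```python
import pandas as pd

# Hypothetical per-page metric whose rows are in a different order
# from the summary table's index (synthetic values, not ChrisCo data).
summary = pd.DataFrame(index=['001', '002', '003'])
metric = pd.Series([30.0, 10.0, 20.0], index=['003', '001', '002'])

summary['by_position'] = metric.values  # copies in row order, ignores labels
summary['by_label'] = metric            # aligns on the index labels

print(summary)
```

If the page labels in the CSV files match the column labels of `data`, assigning the Series directly is a safer choice. That is an assumption about the files and worth checking before switching.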

In [None]:
summary_data['BubbleSize'] = summary_data['View Time'] * 0.2

plot = summary_data.hvplot.scatter(
    frame_height=400, frame_width=500,
    title='Annual Hits vs Revenue (vs View Time)',
    xlabel='Annual Hits', ylabel='Revenue',
    alpha=0.5, padding=0.1, hover_cols='all',
    tools=['pan', 'box_zoom', 'wheel_zoom', 'undo', 'redo', 'hover', 'save', 'reset'],
    x='Hits', y='Revenue', size='BubbleSize'
)
hv.extension('bokeh')
plot