# Timeseries Animation

<img src="https://github.com/lewis-morris/progplot/blob/master/examples/deathsbycountrywithflag.gif?raw=true" alt="Example" style="float: left; margin-right: 10px;" />


# **progplot & BarWriter**

### Timeseries animations can be made using a mix of matplotlib and opencv to produce interesting and memorable visual representations of data. 

> ### I've been working on a package that can help with this. It was originally just for my own use, but I think some of you might benefit from it.
> ### It's still in the early stages, so you might run into some issues. If you do, let me know on GitHub.

In [None]:
import pandas as pd
import seaborn as sns
import numpy as np

# Download

* ## [Find on PyPI][1]

* ## [Find on GitHub][2]

[1]: https://pypi.org/project/progplot/
[2]: https://github.com/lewis-morris/progplot

In [None]:
!pip install progplot

In [None]:
#import barwriter
from progplot import BarWriter

In [None]:
#create the barwriter object
bw = BarWriter()

# Set Data
-------------------------------

## BarWriter has 3 stages to data input.

1. The data is grouped by the timeseries column you set, using either sum or mean (optional if you have already done this step in your dataframe).

> This is needed because this data has multiple towns/cities under each state that we need to aggregate (we are going to sum them).

2. The data is resampled, by either sum or mean, to ensure no values drop out while running the animation (optional if you have already done this step in your dataframe).

> You might also want to resample weekly or yearly, but we are going to leave it as daily. We still need to tell the system, because if there are some missing dates it will fill in the blanks.

3. The data is aggregated using cumsum or rolling mean (optional if you have already done this step in your dataframe).

> We are going to choose none because the data has already been cumulatively summed.
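
To get a feel for what the three stages mean, here is a minimal sketch in plain pandas on toy data. It only illustrates the idea (group, resample, then aggregate); it is not BarWriter's internal code, and the column names and values are made up for the example.

```python
import pandas as pd

# Toy data: two towns in state "A", one town in state "B", and 2020-01-03 missing.
raw = pd.DataFrame({
    "state": ["A", "A", "B", "A", "A", "B"],
    "date": pd.to_datetime(["2020-01-01", "2020-01-01", "2020-01-01",
                            "2020-01-02", "2020-01-02", "2020-01-04"]),
    "deaths": [1, 2, 5, 3, 4, 9],
})

# Stage 1: aggregate duplicate (date, state) rows, here by sum.
grouped = raw.groupby(["date", "state"])["deaths"].sum().unstack("state")

# Stage 2: resample to a daily grid so the missing date appears as a row.
daily = grouped.resample("1D").sum()

# Stage 3: optional output aggregation, here a cumulative sum.
cumulative = daily.cumsum()
print(cumulative)
```

Note that the toy values here are daily counts, which is why stage 3 uses a cumulative sum. The Kaggle data used below is already cumulative, so that stage is skipped.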

## BarWriter has 2 stages prior to video rendering.

1. Set the chart output details (and check visually)
2. Set the video output details


In [None]:
df = pd.read_csv("/kaggle/input/corona-virus-report/usa_county_wise.csv")
df["Date"] = pd.to_datetime(df["Date"])
df

### Looking at our data we need to make some decisions. I want to show the TOTAL DEATHS PER STATE

1. (groupby_agg) - There are multiple instances of the same region per datetime, so we will need to sum the group.

2. (resample / resample_agg) - It's in a daily format so we need to resample as such. We could also downsample to a week or a month if we wished. In case of missing dates, resample EVERY TIME, even if you are keeping the frequency the same (i.e. days).

3. (output_agg) - The data has already been cumulatively summed so we don't need to do this, but we could run cumsum or a rolling average.
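
If you want to check these two points on your own data before deciding, something like the sketch below works in plain pandas. The helper name `missing_dates` and the toy rows are my own for illustration and are not part of progplot.

```python
import pandas as pd

def missing_dates(dates, freq="D"):
    """Return the dates absent from `dates` on a regular grid between its min and max."""
    dates = pd.DatetimeIndex(pd.to_datetime(dates)).normalize()
    full = pd.date_range(dates.min(), dates.max(), freq=freq)
    return full.difference(dates)

# Toy rows: the same state appears twice on 2020-03-04, and 2020-03-03 is missing.
toy = pd.DataFrame({
    "Province_State": ["Alabama"] * 4,
    "Date": pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-04", "2020-03-04"]),
    "Deaths": [0, 1, 2, 3],
})

# True means several rows share a (state, date) pair, so groupby_agg matters.
has_duplicates = bool(toy.duplicated(["Province_State", "Date"]).any())

# A non-empty result means resampling will have blanks to fill.
gaps = missing_dates(toy["Date"])
print(has_duplicates, list(gaps.strftime("%Y-%m-%d")))
```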

In [None]:
help(bw.set_data)

In [None]:
bw.set_data(data=df, category_col="Province_State", timeseries_col="Date", value_col="Deaths", groupby_agg="sum", resample="1d",resample_agg="sum", output_agg=None)

In [None]:
help(bw.set_display_settings)

# Video Settings (display settings)
-----------------------------------------

## Next we need to define the output settings of the video file to be created

1) (fps) The fps can be left default but you're free to change this.

2) (time_in_seconds) This is the length you want your file to be once rendered. If there are MORE frames (fps x seconds) than there are UNIQUE DATES, BarWriter automatically smooths the transition between dates so playback is not juddery.

3) (video_file_name) This can be x.mp4 for MP4V codec. 

4) (fourcccodecname) It is possible to change the fourcc codec if you are having video generation issues. This is NOT ADVISED, as the default works fine on Kaggle, Colab and my local machine.
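
The smoothing mentioned in point 2 can be pictured as interpolating between dates. The sketch below uses simple linear interpolation with NumPy. That is an assumption made for illustration, not necessarily how BarWriter does it, and `interpolate_frames` and the fps value are hypothetical.

```python
import numpy as np

def interpolate_frames(values, n_frames):
    """Linearly interpolate per-date values onto n_frames evenly spaced frames."""
    values = np.asarray(values, dtype=float)
    date_positions = np.arange(len(values))
    frame_positions = np.linspace(0, len(values) - 1, n_frames)
    return np.interp(frame_positions, date_positions, values)

fps, time_in_seconds = 30, 2      # hypothetical settings: 60 frames in total
per_date = [0, 10, 40]            # one value per unique date
frames = interpolate_frames(per_date, fps * time_in_seconds)
print(len(frames), frames[0], frames[-1])
```

With only 3 dates but 60 frames, each frame moves the bar a small step towards the next date's value rather than jumping.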

In [None]:
bw.set_display_settings(time_in_seconds=45, video_file_name = "deathsbystate.mp4")

# Chart options 

-----------------------------------

## The most important step to get right. 

### There are a lot of options here so I suggest you have a play about with what you might like.

### Default options work fine, but for a more customized approach try limiting the values, adding a title and formatting the text.

### The docstring should explain as well as possible the options but please go to the end of this kernel for more examples. 

In [None]:
help(bw.set_chart_options)

### :default:

The default options will work fine, but customization is suggested.

In [None]:
bw.set_chart_options(use_data_labels=None)

# Testing
________________

## Finally view a test chart 

### I've chosen frame 100, but you can leave blank for a random position in the data.

In [None]:
bw.test_chart(100)

## There is a bit too much going on here, so we need to customize this a little

### : custom :

In [None]:
# We've set the format of the ticks, the palette, the title, the date format and more

bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="Pastel1", 
                     title="Top 15 States by Total Deaths from <mindatetime> to <currentdatetime>",dateformat="%Y-%d-%m", 
                     y_label="State", 
                     use_top_x=30, display_top_x=15,
                     border_size=2, border_colour=(0.3,0.3,0.3),
                     font_scale=1.3,
                     use_data_labels="end")
bw.test_chart(100)
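
The `x_tick_format` and `dateformat` values above look like a standard Python format template and a `strftime` pattern. Assuming that is how they are applied (the docstring from `help(bw.set_chart_options)` is the place to confirm), you can preview them in plain Python before rendering:

```python
from datetime import datetime

value = 1234567.891
print("{:,.0f}".format(value))    # thousands separators, no decimals
print("{:,.2f}".format(value))    # thousands separators, two decimals

day = datetime(2020, 4, 5)        # 5 April 2020
print(day.strftime("%Y-%d-%m"))   # year-day-month
print(day.strftime("%Y-%m-%d"))   # year-month-day
```

Note that `"%Y-%d-%m"` puts the day before the month, so pick whichever order you want your viewers to read.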



## Looks good - we can always check what the palettes look like with the below

In [None]:
from progplot import palettes
palettes()

## Changing the palette is easy

In [None]:
bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="bone", ### <-------------- Just change this value
                     title="Top 15 States by Total Deaths from <mindatetime> to <currentdatetime>",dateformat="%Y-%d-%m", 
                     y_label="State", 
                     use_top_x=30, display_top_x=15,
                     border_size=2, border_colour=(0.12,0.12,0.12),
                     font_scale=1.3,
                     use_data_labels="end")
bw.test_chart()

## Video Render Test.

### Before a full render it's best to test a sample first to make sure it renders correctly, as some users may not have the correct codecs etc.

### Let's limit the frames to 10 just to test.

In [None]:
help(bw.write_video)

In [None]:
bw.write_video(limit_frames=10)
bw.show_video()

# Rendering

## Video output was successful, so let's do a full render.

## There is a lot happening to make the animation as smooth as possible, so rendering CAN take a while depending on settings.

#### ( It runs at nearly double the speed on my local machine compared to on Kaggle - around 6fps )
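
For a rough idea of how long a full render might take, you can divide the number of frames by the render speed. This is only back-of-the-envelope arithmetic: the video fps of 30 is an assumed value (check your own settings), and `estimated_render_seconds` is a hypothetical helper, not part of progplot.

```python
def estimated_render_seconds(time_in_seconds, video_fps, render_fps):
    """Estimate wall-clock render time from the number of frames to draw."""
    total_frames = time_in_seconds * video_fps
    return total_frames / render_fps

# Assumed numbers: a 45 second video at 30 fps, drawn at about 6 frames per second.
seconds = estimated_render_seconds(time_in_seconds=45, video_fps=30, render_fps=6)
print(f"{seconds / 60:.1f} minutes")
```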


In [None]:
bw.write_video()
bw.show_video()

## If we want a GIF version of the output it's as simple as

In [None]:
bw.create_gif()

In [None]:
bw.show_gif()

# Output

## By default an HTML element is output to display in Jupyter, but all renders produce an output file that can be collected and used elsewhere.

## Files are stored in the current working directory IF the file name was set without a path, e.g. "outputfile.mp4". Alternatively you can use a full path such as "/home/myname/outputs/thisvideo.mp4".
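
If you are not sure where a file will end up, this small sketch shows how a bare file name resolves against the working directory using `pathlib`. The helper `resolve_output` is my own name for illustration and not a progplot function.

```python
from pathlib import Path

def resolve_output(video_file_name):
    """Return the absolute path a file name refers to (relative names use the working directory)."""
    path = Path(video_file_name)
    return path if path.is_absolute() else Path.cwd() / path

print(resolve_output("outputfile.mp4"))
print(resolve_output("/home/myname/outputs/thisvideo.mp4"))
```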

# New Data

## Now plotting cases is just as easy - we just need to change some parameters

1. Set the new "value_col" name in bw.set_data().
2. Change the output file name
3. Change title

In [None]:
bw.set_data(data=df, category_col="Province_State", timeseries_col="Date", value_col="Confirmed", groupby_agg="sum", resample="1d",resample_agg="sum", output_agg=None)  # <--- value_col

bw.set_display_settings(time_in_seconds=45, video_file_name = "casesbystate.mp4") # <--- video_file_name

bw.set_chart_options(x_tick_format="{:,.0f}", dateformat="%Y-%m-%d", 
                     palette="copper", 
                     title="Top 15 States by Total Cases <mindatetime> to <currentdatetime>", y_label="State", # <--- title
                     use_top_x=30, display_top_x=15,
                     border_size=2, border_colour=(0.12,0.12,0.12),
                     font_scale=1.6, title_font_size=18,x_label_font_size=16,
                     use_data_labels="end")  
bw.test_chart(100)

In [None]:
bw.write_video()
bw.show_video()

# Picture Bars

## It's possible to map an image to the bars to give the viewer another layer of visualisation

## It is mapped from a dictionary in the format {key - category : value - filepath to image}



<span style="color:red">**WARNING** </span>

This can be SLOW to render. 

1) When I coded it I didn't have efficiency in mind.

2) I haven't got the time to fix it right now.

<span style="color:red">**WARNING** </span>


NOTES:

> "Icons" or images used to populate the bars do not need to be resized or shaped. The system will automatically account for this

> PNG images with transparency will be automatically cropped to the content in the alpha channel.

> Missing images will be replaced with a default image.


In [None]:
df_country = pd.read_csv("/kaggle/input/corona-virus-report/covid_19_clean_complete.csv")
df_country["Date"] = pd.to_datetime(df_country["Date"])
df_country

## First we need a dictionary of images

### I'm going to try and map the names of the countries to flag images.

## Locate some icons to use

In [None]:
# download a zip of flag images
!wget "https://flagpedia.net/data/flags/w320.zip"

In [None]:
# unzip images
import zipfile
with zipfile.ZipFile("w320.zip", 'r') as zip_ref:
    zip_ref.extractall("./icons/flags/")

## Create your dictionary

In [None]:
codes = pd.read_html("https://www.iban.com/country-codes",attrs = {'id': 'myTable'})
codes[0]

In [None]:
df_country = df_country.merge(codes[0],left_on="Country/Region", right_on="Country")
df_country = df_country[["Country/Region","Date","Confirmed","Deaths","Alpha-2 code"]]
df_country
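
One thing to watch: `merge` defaults to an inner join, so any country whose name is spelled differently in the two tables is silently dropped. A quick way to see which names fail to match is a left merge with `indicator=True`. The sketch below uses small made-up tables rather than the real data.

```python
import pandas as pd

# Toy tables standing in for the case data and the country-code lookup.
data = pd.DataFrame({"Country/Region": ["France", "US", "Germany"],
                     "Deaths": [10, 20, 30]})
lookup = pd.DataFrame({"Country": ["France", "Germany", "United States of America (the)"],
                       "Alpha-2 code": ["FR", "DE", "US"]})

# A left merge keeps every data row; the _merge column records whether a match was found.
checked = data.merge(lookup, how="left", left_on="Country/Region",
                     right_on="Country", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "Country/Region"].unique().tolist()
print(unmatched)
```

Any names listed can then be renamed in one of the tables before doing the real merge.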

In [None]:
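# Take one (country, code) row per country so the two lists stay the same length and order
pairs = df_country[["Country/Region","Alpha-2 code"]].drop_duplicates("Country/Region")
countries = list(pairs["Country/Region"])
codes = list(pairs["Alpha-2 code"])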

In [None]:
image_dict = {country:f"./icons/flags/{str(code).lower()}.png" for country,code in zip(countries,codes)}
image_dict
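
Since missing images are swapped for a default image, it can be handy to list which paths don't exist before rendering. Here is a minimal sketch: `missing_images` is a hypothetical helper of my own, demonstrated on a toy dictionary in a temporary folder.

```python
import os
import tempfile

def missing_images(image_dict):
    """Return the categories whose image path does not point to an existing file."""
    return sorted(name for name, path in image_dict.items() if not os.path.isfile(path))

# Toy demonstration: one real file and one path that does not exist.
with tempfile.TemporaryDirectory() as folder:
    existing = os.path.join(folder, "fr.png")
    open(existing, "wb").close()
    toy_dict = {"France": existing, "Atlantis": os.path.join(folder, "zz.png")}
    missing = missing_images(toy_dict)

print(missing)
```

On the real data you would call `missing_images(image_dict)` and fix or accept whatever comes back.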

## Once you have your dictionary of images - the process is just the same as before.

> Set the data

> Set the display settings

> Set the chart options, only this time pass in the image_dict.

In [None]:
bw.set_data(df_country, "Country/Region", "Date", "Deaths", resample="1d", groupby_agg="sum", resample_agg="sum",output_agg=None)

bw.set_display_settings(time_in_seconds=45, video_file_name = "deathsbycountrywithflag.mp4")

bw.set_chart_options(x_tick_format="{:,.0f}", dateformat="%Y-%m-%d", 
                     palette="summer", 
                     title="Top 15 Countries by Total Deaths <mindatetime> to <currentdatetime>",
                     use_top_x=15, display_top_x=15,
                     border_size=2, border_colour=(0.12,0.12,0.12),
                     font_scale=1.6,
                     use_data_labels="end", convert_bar_to_image=True,image_dict=image_dict)  # <--- Add image_dict and set convert_bar_to_image=True
bw.test_chart(100)

In [None]:
bw.write_video()
bw.show_video()

# Other Examples 

## If we want to see the mean cases per day it's easy, but our data is currently in cumsum format. Let's reverse this

In [None]:
df_mean = df.dropna()
df_mean

In [None]:
df_mean[(df_mean["Admin2"]=="Autauga") & (df_mean["Province_State"]=="Alabama")]

## The cumulative sum is by "Admin2" and "Province_State" so let's group by these and reverse

In [None]:
# numeric_only=True keeps text columns (e.g. county names) out of the sum
df_mean = df_mean.groupby(["Province_State","Date"]).sum(numeric_only=True).reset_index()
df_mean

In [None]:
## used to reverse the cumsum 

df_new = None
df_list = []

for i, (x,y) in enumerate(df_mean.groupby(["Province_State"])):
    y.reset_index(drop=True, inplace=True)
    
    print(f"\r{i}/{len(df_mean['Province_State'].drop_duplicates())}",end="")

    for itm in np.arange(len(y)-1,0,-1):
        y.loc[itm,"Deaths"] -= y.loc[itm-1,"Deaths"]
        y.loc[itm,"Confirmed"] -= y.loc[itm-1,"Confirmed"]
    df_list.append(y)
    
df_new = pd.concat(df_list)
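
The loop above works, but if it is slow on your data, the same reversal can be sketched with pandas' `groupby(...).diff()`. This is an alternative approach shown on toy data, not the code used for the outputs in this kernel. As in the loop, the first row of each state keeps its original value.

```python
import pandas as pd

toy = pd.DataFrame({
    "Province_State": ["A", "A", "A", "B", "B", "B"],
    "Date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"] * 2),
    "Deaths": [1, 3, 6, 0, 2, 2],   # cumulative totals per state
})
toy = toy.sort_values(["Province_State", "Date"]).reset_index(drop=True)

# diff() gives the day-on-day change within each state. The first row of each
# state has no predecessor, so it is filled with its original cumulative value.
daily = toy.groupby("Province_State")["Deaths"].diff().fillna(toy["Deaths"])
toy["Deaths_daily"] = daily.astype(int)
print(toy)
```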

In [None]:
df_new.tail(30)

## Now we can run the BarWriter with mean settings

1. We need to set the new DataFrame
2. We need to specify a rolling mean window to take the figure from. We will use "7rolling", which will be the average over the past 7 days.
3. We will sum the groupby (it groups by category and date). This should not have any effect, as there should be no duplicate states now, but we do it just in case.
4. We will resample daily just in case.
5. Change the output file name
6. Change title
7. Change the x_tick_format - being a mean, it's likely to have decimals
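
As a reference for what a 7 day rolling average looks like, here is a plain pandas sketch on made-up daily values. I'm assuming "7rolling" behaves roughly like `rolling(7).mean()`; how BarWriter treats the first six days is not covered here, so the `min_periods=1` choice is mine.

```python
import pandas as pd

daily = pd.Series([7, 0, 7, 14, 7, 7, 7, 14],
                  index=pd.date_range("2020-04-01", periods=8, freq="D"))

# Mean over the current day and the six days before it.
# min_periods=1 averages over fewer days at the start instead of returning NaN.
rolling = daily.rolling(window=7, min_periods=1).mean()

# Means usually have decimals, hence a two-decimal tick format.
print("{:,.2f}".format(rolling.iloc[-1]))
```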

In [None]:
bw.set_data(data=df_new, category_col="Province_State", timeseries_col="Date", value_col="Deaths", groupby_agg="sum", resample="1d",resample_agg="sum", output_agg="7rolling")


bw.set_display_settings(time_in_seconds=45, video_file_name = "rolling_daily.mp4")
bw.set_chart_options(x_tick_format="{:,.2f}", dateformat="%Y-%m-%d", 
                     palette="bone", 
                     title="Daily Rolling Average Deaths from <rollingdatetime> to <currentdatetime>", y_label="State", 
                     use_top_x=15, display_top_x=15,
                     border_size=2, border_colour=(0.12,0.12,0.12),
                     font_scale=1.3,
                     use_data_labels="end")

bw.test_chart(80)

In [None]:
bw.write_video()

# Other Chart Output examples

In [None]:
help(bw.set_chart_options)

In [None]:
bw.set_data(data=df, category_col="Province_State", timeseries_col="Date", value_col="Deaths", groupby_agg="sum", resample="1d",resample_agg="sum", output_agg=None)
bw.set_display_settings(time_in_seconds=45, video_file_name = "test.mp4")

## border size - HUGE WITH COLOUR

In [None]:

bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="Pastel1", 
                     title="Top 15 States by Total Deaths from <mindatetime> to <currentdatetime>",dateformat="%Y-%d-%m", 
                     y_label="State", 
                     use_top_x=30, display_top_x=15,
                     border_size=6, border_colour=(0.3,0.8,0.3), # <-- border size / colour
                     font_scale=1.3,
                     use_data_labels="end")
bw.test_chart(100)

## Data Labels at the base

In [None]:

bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="Pastel1", 
                     title="Top 15 States by Total Deaths from <mindatetime> to <currentdatetime>",dateformat="%Y-%d-%m", 
                     y_label="State", 
                     use_top_x=15, display_top_x=15,
                     border_size=3, border_colour=(0.3,0.3,0.3), # <-- border size / colour
                     font_scale=1.3,
                     use_data_labels="base")
bw.test_chart(100)

## Seaborn styles

In [None]:

bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="magma", 
                     title="Top 15 States by Total Deaths from <mindatetime> to <currentdatetime>",dateformat="%Y-%d-%m", 
                     y_label="State", 
                     use_top_x=15, display_top_x=15,
                     border_size=3, border_colour=(0.3,0.3,0.3),
                     font_scale=.7,
                     use_data_labels="base",
                     seaborn_style="darkgrid",
                     seaborn_context="paper") # <-- paper
bw.test_chart(100)

## Sorting - when turned off, the highest value isn't fixed to the bottom (positions don't switch)

In [None]:

bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="tab10_r", 
                     title="Top 15 States by Total Deaths from <mindatetime> to <currentdatetime>",dateformat="%Y-%d-%m", 
                     y_label="State", 
                     use_top_x=15, display_top_x=15,
                     border_size=3, border_colour=(0.3,0.3,0.3), # <-- border size / colour
                     font_scale=1.3,
                     use_data_labels="end",
                     sort=False)
bw.test_chart(100)

## Adjusting figsize and dpi

In [None]:

bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="gist_earth_r", 
                     title="Top 15 States by Total Deaths from <mindatetime> to <currentdatetime>",dateformat="%Y-%d-%m", 
                     y_label="State", 
                     use_top_x=15, display_top_x=15,
                     border_size=3, border_colour=(0.3,0.3,0.3), # <-- border size / colour
                     font_scale=1.5,
                     use_data_labels="end",
                     figsize=(10,12),
                     dpi=120)

bw.test_chart(100)

In [None]:

bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="terrain", 
                     title="Top 15 States by Total Deaths from <mindatetime> to <currentdatetime>",dateformat="%Y-%d-%m", 
                     y_label="State", 
                     use_top_x=15, display_top_x=15,
                     border_size=3, border_colour=(0.3,0.3,0.3), # <-- border size / colour
                     font_scale=1.5,
                     use_data_labels="end",
                     figsize=(20,10),
                     dpi=85)

bw.test_chart(100)