# **progplot & BarWriter**

### Timeseries animations can be made using a mix of matplotlib and opencv to produce interesting and memorable visual representations of data. 

> ### I've been working on a package that can help with this. I originally built it for my own use, but I think someone else might benefit from it.
> ### I made it in my own time, so you may run into some issues. If you do, let me know on GitHub.



<img src="https://raw.githubusercontent.com/lewis-morris/progplot/master/examples/total_crimes_by_type.gif" alt="Example"> 

In [None]:
import pandas as pd
import seaborn as sns
import numpy as np

In [None]:
df = pd.read_csv("/kaggle/input/crimes-in-boston/crime.csv",encoding='latin1')
df

In [None]:
df = df[["OFFENSE_CODE_GROUP","DISTRICT","OCCURRED_ON_DATE","STREET"]].copy()  # copy so the new column is not assigned to a slice
df["DATE"] = pd.to_datetime(df["OCCURRED_ON_DATE"],format="%Y-%m-%d").dt.date
df

## Download, Import and Initialize BarWriter

In [None]:
!pip install progplot

In [None]:
#import barwriter
from progplot import BarWriter

In [None]:
#create the barwriter object
bw = BarWriter()

## Let's take a look at Total Crimes By Type

# Set Data
-------------------------------

## BarWriter has 3 stages to data input.

1. The data is grouped by the chosen timeseries column using sum, mean or count (optional if you have already done this step in your dataframe).

> We will choose count - the crimes will then be counted for each unique date. (NOTE: When using COUNT we do not need to set a numerical column for the data.)

2. The data is resampled using sum, mean or count so that no values drop out while the animation runs (optional if you have already done this step in your dataframe).

> We are going to resample by sum into 1-week periods.

3. The data is aggregated using cumsum or a rolling mean (optional if you have already done this step in your dataframe).

> We are going to use cumsum to get the total crimes
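To make the three stages concrete, here is a rough equivalent in plain pandas. It is only a sketch of the idea on a small made-up dataframe (the `toy` data and the `CATEGORY` column are hypothetical), not BarWriter's actual internals.

```python
import pandas as pd

# Toy data standing in for the crime records (made-up values).
toy = pd.DataFrame({
    "DATE": pd.to_datetime(["2020-01-01", "2020-01-01", "2020-01-02",
                            "2020-01-09", "2020-01-09", "2020-01-10"]),
    "CATEGORY": ["Larceny", "Larceny", "Fraud", "Larceny", "Fraud", "Fraud"],
})

# Stage 1: group by date and category, counting the rows in each group.
daily = toy.groupby(["DATE", "CATEGORY"]).size().unstack(fill_value=0)

# Stage 2: resample to weekly periods, summing the daily counts.
weekly = daily.resample("W").sum()

# Stage 3: cumulative sum, giving the running total per category.
running_total = weekly.cumsum()
print(running_total)
```

Each row of `running_total` is one week, and each column is one category, which is roughly the shape of data a bar chart race needs for every frame.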

## BarWriter has 2 stages prior to video rendering.

1. Set the chart output details (and check visually)
2. Set the video output details


In [None]:
help(bw.set_data)

In [None]:
df

In [None]:
bw.set_data(data=df, category_col="OFFENSE_CODE_GROUP", timeseries_col="DATE", value_col="DISTRICT", groupby_agg="count", resample_agg="sum", output_agg="cumsum", resample = "1w")


# Video Settings (display settings)
-----------------------------------------

## Next we need to define the output settings of the video file to be created

1) (fps) The fps can be left default but you're free to change this.

2) (time_in_seconds) This is the length you want your file to be once rendered. If there are MORE frames (fps x seconds) than there are UNIQUE DATES, BarWriter automatically smooths the transition between dates so playback is not juddery.

3) (video_file_name) This can be x.mp4 for MP4V codec. 

4) (fourcccodecname) It is possible to change the fourcc codec if you are having video generation issues. This is not advised, as the default works fine on Kaggle, Colab and my local machine.
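As a rough illustration of why smoothing is needed, the sketch below works out the frame count and stretches a few values over all frames. The fps of 30 is an assumed value, and plain linear interpolation with `np.interp` is my stand-in here; BarWriter's own smoothing may work differently.

```python
import numpy as np

fps = 30                                # assumed frame rate for this example
time_in_seconds = 30
total_frames = fps * time_in_seconds    # 900 frames to fill

# Three hypothetical time steps for a single bar.
values = np.array([0.0, 10.0, 40.0])

# Spread the known time steps evenly across the frame range,
# then linearly interpolate a value for every frame in between.
step_positions = np.linspace(0, total_frames - 1, num=len(values))
frame_positions = np.arange(total_frames)
smoothed = np.interp(frame_positions, step_positions, values)

print(total_frames, smoothed[:3])
```

With only three dates and 900 frames, showing the raw values would make the bar jump twice; interpolating gives it a small change on every frame instead.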

In [None]:
help(bw.set_display_settings)

In [None]:
bw.set_display_settings(time_in_seconds=30, video_file_name = "total_crimes_by_type.mp4")


# Chart options 

-----------------------------------

## The most important step to get right. 

### There are a lot of options here so I suggest you have a play about with what you might like.

### Default options work fine, but for a more customized approach try limiting the values, adding a title and formatting the text.

### The docstring should explain as well as possible the options but please go to the end of this kernel for more examples. 

In [None]:
help(bw.set_chart_options)

## Suggested options

* set the format of the ticks

* TITLE IS IMPORTANT

* TITLE DATE FORMAT 

* use_top_x and display_top_x
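The tick and date formats used in this kernel are ordinary Python format strings and `strftime` patterns, so you can try them out on their own before passing them in. A quick check with made-up numbers:

```python
from datetime import date

# "{:,.0f}" adds thousands separators and rounds to whole numbers.
whole = "{:,.0f}".format(1234567.89)

# "{:,.2f}" keeps two decimals, which suits mean values.
two_decimals = "{:,.2f}".format(3.14159)

# "%Y-%m-%d" is the strftime pattern for year-month-day.
formatted_date = date(2018, 9, 2).strftime("%Y-%m-%d")

print(whole)           # 1,234,568
print(two_decimals)    # 3.14
print(formatted_date)  # 2018-09-02
```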

In [None]:
bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="magma", 
                     title="Top 10 Crimes by Total Offences from <mindatetime> to <currentdatetime>",dateformat="%Y-%m-%d", 
                     y_label="Offence", 
                     use_top_x=20, display_top_x=10,
                     border_size=2, border_colour=(0.3,0.3,0.3),
                     font_scale=1.3,
                     use_data_labels="end")
bw.test_chart(30)

## We can change the colours if we're not happy with the current output

In [None]:
bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="bone", # <--------Change 
                     title="Top 10 Crimes by Total Offences from <mindatetime> to <currentdatetime>",dateformat="%Y-%m-%d", 
                     y_label="Offence", 
                     use_top_x=20, display_top_x=10,
                     border_size=2, border_colour=(0.3,0.3,0.3),
                     font_scale=1.3,
                     use_data_labels="end")
bw.test_chart(30)

## Or if we want to keep the positions of each category static we can set "sort=False"

In [None]:
bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="bone",
                     title="Top 10 Crimes by Total Offences from <mindatetime> to <currentdatetime>",dateformat="%Y-%m-%d", 
                     y_label="Offence", 
                     use_top_x=10, display_top_x=10,
                     border_size=2, border_colour=(0.3,0.3,0.3),
                     font_scale=1.3,
                     use_data_labels="end",
                     sort=False) # <-------- SORTED?
bw.test_chart(30)

# Video Rendering

## The easy step

In [None]:
bw.write_video()
bw.show_video()

## Then we can make a GIF just as easily

In [None]:
bw.create_gif()
bw.show_gif()

## Let's see the mean daily offences by street - just as easy.

### 1) Set the data
### 2) Set the display options
### 3) Set the chart options
### 4) Test
### 5) Write Video

In [None]:
# we will set the groupby to count. - this will count the crimes for each unique date.

# we will resample per week and calculate the MEAN to get the mean daily value per week

# we will use "4rolling" for the output_agg; this gives us the mean over 4 windows to smooth the result.

# AS THERE IS A LOT OF DATA THIS COULD TAKE SOME TIME TO RUN.

bw.set_data(data=df, category_col="STREET", timeseries_col="DATE", value_col="DISTRICT", groupby_agg="count", resample = "1w", resample_agg="mean", output_agg="4rolling")
bw.set_display_settings(time_in_seconds=30, video_file_name = "mean_daily_crimes_by_street_.mp4")
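For intuition on the rolling step, this is roughly what a 4-window rolling mean looks like in plain pandas. I am assuming "4rolling" behaves like `rolling(4).mean()`; the weekly values below are made up.

```python
import pandas as pd

# Hypothetical weekly mean values for a single street.
weekly_mean = pd.Series(
    [2.0, 4.0, 6.0, 8.0, 10.0],
    index=pd.date_range("2020-01-05", periods=5, freq="W"),
)

# Each point becomes the mean of itself and the 3 previous weeks.
# The first 3 entries are NaN because a full window is not yet available.
smoothed = weekly_mean.rolling(window=4).mean()
print(smoothed)
```

The fourth value is the mean of 2, 4, 6 and 8, and the fifth is the mean of 4, 6, 8 and 10, so short spikes are averaged out.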

## Check the chart settings

### Don't forget the formatting and titles. 

In [None]:
bw.set_chart_options(x_tick_format="{:,.2f}", #<---- add two decimals to the formatting as mean will most likely produce floats
                     palette="bone",
                     title="Mean Daily Crimes by Street from <rollingdatetime> to <currentdatetime>",dateformat="%Y-%m-%d", ##   <-------- change 
                     y_label="Street", 
                     use_top_x=15, display_top_x=10,
                     border_size=2, border_colour=(0.3,0.3,0.3),
                     font_scale=1.3,
                     use_data_labels="end",
                     sort=True) # <-------- SORTED?
bw.test_chart(30)

## All seems well, let's write the video

In [None]:
bw.write_video()
bw.show_video()

## Playback is all over the place. This is where sort=False comes in handy.

In [None]:
bw.set_chart_options(x_tick_format="{:,.2f}", #<---- add two decimals to the formatting as mean will most likely produce floats
                     palette="bone",
                     title="Mean Daily Crimes by Street from <rollingdatetime> to <currentdatetime>",dateformat="%Y-%m-%d", ##   <-------- change 
                     y_label="Street", 
                     use_top_x=15, display_top_x=10,
                     border_size=2, border_colour=(0.3,0.3,0.3),
                     font_scale=1.3,
                     use_data_labels="end",
                     sort=False) # <-------- SORTED?
bw.test_chart(30)

In [None]:
bw.write_video()
bw.show_video()

# Picture Bars

## We can add images to replace the bars with a simple dictionary of categories/images 

### Images can be .jpg or .png, they do not need to be scaled and they can include transparency.


In [None]:
bw.set_data(data=df, category_col="OFFENSE_CODE_GROUP", timeseries_col="DATE", value_col="DISTRICT", groupby_agg="count", resample_agg="sum", output_agg="cumsum", resample = "1w")
bw.set_display_settings(time_in_seconds=30, video_file_name = "total_crimes_by_type_picture.mp4")

In [None]:
# gather the images

!wget https://raw.githubusercontent.com/lewis-morris/progplot/master/icons/investigate.png
!wget https://raw.githubusercontent.com/lewis-morris/progplot/master/icons/medical.jpg
!wget https://raw.githubusercontent.com/lewis-morris/progplot/master/icons/theft.jpg
!wget https://raw.githubusercontent.com/lewis-morris/progplot/master/icons/accident.jpg

In [None]:
img_dict = {"Investigate Person":"./investigate.png",
"Medical Assistance":"./medical.jpg",
"Larceny":"./theft.jpg",
"Motor Vehicle Accident Response":"./accident.jpg"}

In [None]:
bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="bone", # <--------Change 
                     title="Top 4 Crimes by Total Offences from <mindatetime> to <currentdatetime>",dateformat="%Y-%m-%d", 
                     y_label="Offences", 
                     use_top_x=4, display_top_x=4,
                     border_size=2, border_colour=(0.3,0.3,0.3),
                     font_scale=1.3,
                     use_data_labels="end",
                     convert_bar_to_image=True,  ## <-------- set to true
                     image_dict=img_dict  ## <------ input image dictionary
                    )
bw.test_chart(30)

## If an image is missing from the dictionary a default will be displayed

In [None]:
bw.set_chart_options(x_tick_format="{:,.0f}",
                     palette="bone",
                     title="Top 10 Crimes by Total Offences from <mindatetime> to <currentdatetime>",dateformat="%Y-%m-%d", 
                     y_label="Offence", 
                     use_top_x=10, display_top_x=10, 
                     border_size=2, border_colour=(0.3,0.3,0.3),
                     font_scale=1.3,
                     use_data_labels="end",
                     convert_bar_to_image=True,  ## <-------- set to true
                     image_dict=img_dict  ## <------ input image dictionary
                    )
bw.test_chart(30)

In [None]:
## Just to note - THIS IS SLOW
## writing with images was 1) a headache to code 2) an efficiency nightmare.
## Maybe one day I will try to speed it up, but for now it is what it is.

bw.write_video()
bw.show_video()