# Imputed Congestion Map and Bargraph Animation
## Notebook 5/7

## Gabriel del Valle
## 07/21/24
## NYC DATA SCIENCE ACADEMY

### For any questions about this project, or to request full map videos or datasets, please feel free to reach out on LinkedIn or by email:

   www.linkedin.com/in/gabriel-del-valle-147616152

   gabrielxdelvalle@gmail.com


### This notebook is the same as the previous one, but it uses the imputed_congestion.csv dataset

This Python Jupyter notebook contains several functions that can be used together to produce a video of a Manhattan Congestion Zone map, displaying average volume per street per datetime.

The notebook follows a similar process to create a bargraph video that corresponds to the animated congestion map: for each datetime frame it draws a bargraph, labeled by street, of traffic volume. The bar colors represent congestion, using the same colors as the map.

The map and bargraph videos produced by this notebook can be played side by side and stay in step with each other, frame for frame.

This notebook outlines the steps for animating the entire dataset, which produces a video 1 hour 39 minutes long. However, the functions can easily be used to produce a shorter slice of video given a start row and a number of rows.

Rendering the whole czone_October dataset takes hours; on my MacBook, frames are produced at almost one per second.


## Functions:

### standardize_street_name( )
    
    Pads all street names to 30 characters for the bargraph video, so image sizes stay constant

### animate_bargraph_img( )
    
    Given a datetime, produces a bargraph of the 20 streets with the highest volume, in descending order of Vol, with bars colored by congestion value. Generates a single image per datetime.

### animate_bargraph( )

    Given a start row and a number of rows to generate, loops through the datetime range of the congestion_streets dataset and passes each datetime to animate_bargraph_img()

### plot_congestion_anim( )

    Given a datetime, generate a Manhattan Congestion Relief Zone map which displays the congestion values of each street recorded in congestion_streets, as a color from green to yellow to red (representing 0.0 to 1.0)

### animate_map( )

    Given a start row and a number of rows to generate, loops through the datetime range of the congestion_streets dataset and inputs them into plot_congestion_anim()
    
### image_names()

    Given a start row and a number of rows corresponding to the congestion_streets dataset, image_names() produces a list of strings with the same index and date based naming scheme as plot_congestion_anim(). Note that animate_bargraph_img() names its files bargraph_{datetime}.png instead.
    
    This is useful to quickly generate a list of specific files for operations in the video making process that require file names.
    
    
### multiply_frames( )

    The videos in this notebook are written with the cv2 library at 24 frames per second.

    To show each frame for half a second, multiply frames by 12

    Exports the multiplied frames to a new directory, distinguishing the copies with a letter suffix


In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import geopandas as gpd
from shapely.geometry import MultiLineString, LineString
import re
import os
import cv2
from matplotlib.colors import Normalize
from matplotlib.pyplot import get_cmap  # matplotlib.cm.get_cmap was removed in Matplotlib 3.9

## Import datasets:

### congestion_streets.csv

    Created in the previous notebook, which adds a measure of congestion to the traffic data -- a relative measure of street busyness: 

    congestion = current street volume / maximum volume of street
    
    scale 0.0 to 1.0
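
As a minimal sketch of the formula above, the congestion value could be computed with pandas as follows. The column names `street` and `Vol` match the ones used later in this notebook; the toy values and the way `max_volume` is derived here are assumptions for illustration, not the code from the previous notebook.

```python
import pandas as pd

# Toy traffic data (made-up values): two streets, two datetimes each
traffic = pd.DataFrame({
    "street": ["BROADWAY", "BROADWAY", "CANAL STREET", "CANAL STREET"],
    "datetime": ["2023-10-01 08:00", "2023-10-01 09:00"] * 2,
    "Vol": [400, 800, 150, 300],
})

# Maximum volume observed on each street, repeated on every row of that street
traffic["max_volume"] = traffic.groupby("street")["Vol"].transform("max")

# congestion = current street volume / maximum volume of street, from 0.0 to 1.0
traffic["congestion"] = traffic["Vol"] / traffic["max_volume"]

print(traffic["congestion"].tolist())  # [0.5, 1.0, 0.5, 1.0]
```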


### czone_json.geojson

    The geojson map file which is prepared to work with the congestion_streets dataset, sharing a naming scheme and corresponding streets. 
    
    Contains only streets in the congestion_streets dataset.


### base_map.geojson

    A full detail map of Manhattan to use underneath czone_json

In [2]:
# October Traffic Map Data

OTMD = pd.read_csv("imputed_congestion.csv")
OTMDdates = OTMD['datetime'].unique()

In [3]:
json_streets = gpd.read_file('czone_json.geojson')
base_map = gpd.read_file('base_map.geojson')

### standardize_street_name( ) -- This function is needed when making the bargraph video. 

### If the length of a label changes, so does the size of the image, and images of different sizes cannot be combined into a video. 

### This function fixes this by left-padding every label with spaces to a fixed width of 30 characters, which must be at least the length of the longest name.
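
A small sketch of the padding behavior, using a made-up street name: `str.rjust(30)` adds spaces on the left until the string is 30 characters long, and it leaves longer strings unchanged.

```python
padded = "CANAL STREET".rjust(30)
too_long = ("A" * 35).rjust(30)  # hypothetical name longer than 30 characters

print(repr(padded))   # 18 spaces followed by 'CANAL STREET'
print(len(padded))    # 30
print(len(too_long))  # 35, because rjust never truncates
```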

In [4]:
def standardize_street_name(street):
    """Left-pads a street name with spaces to 30 characters (longer names are returned unchanged)."""
    return street.rjust(30)

OTMDstreets30 = OTMD.copy()

OTMDstreets30['street'] = OTMD['street'].apply(standardize_street_name)

In [5]:
import warnings

# Suppress all warnings
warnings.filterwarnings("ignore")
# Re-enable warnings
#warnings.filterwarnings("default")

### animate_bargraph_img( ) is designed to be called by animate_bargraph( )

### Produces a bargraph, in descending order of volume, colored with congestion value, per datetime.

### i is the frame index passed in by animate_bargraph( ); in this version the file name is built from the datetime alone, so i does not affect it

In [6]:
# Get congestion color map
cmap = get_cmap('RdYlGn_r')

#26 spaces so that NA label adds up to 30
spaces = "                          "
alphabetCap = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

def animate_bargraph_img(datetime, output_dir, i):
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    
    data = OTMDstreets30[OTMDstreets30['datetime'] == datetime]
    
    data20 = data.sort_values(by = 'Vol', ascending = False).head(20)
    
    
    if len(data20) < 20:
        needed = 20 - len(data20)
        dummy_streets = [f"{spaces}NA {alphabetCap[j]}" for j in range(needed)]
        dummy_data = pd.DataFrame({
            'street': dummy_streets,
            'Vol': [0] * needed,
            'congestion': [0] * needed
        })
        data20 = pd.concat([data20, dummy_data], ignore_index=True)
    
    


    
    #fig, ax = plt.subplots()
    fig, ax = plt.subplots(figsize=(10, 8))


    congestion = data20['congestion']
    streets = data20['street']
    ax.bar(streets, data20['Vol'], color=[cmap(v) for v in congestion])

    y_ticks = np.arange(0, 1501, 100)  # Creates ticks from 0 to 1500, spaced by 100
    ax.set_yticks(y_ticks)
    
    ax.set_title(datetime)
    ax.set_xlabel('Street')
    ax.set_ylabel('Traffic Volume')
    #max_vol_max = OTMD['max_volume'].max()
    ax.set_ylim(0, 1500)
    
    # A monospace font keeps the horizontal spacing of x axis ticks the same
    # This is crucial for keeping horizontal width constant
    # Which is crucial for combining images into a video
    ax.set_xticklabels(ax.get_xticklabels(), fontdict={'fontname': 'monospace'})
    ax.tick_params(axis='x', rotation=90)  # Rotate x-axis labels

    # Save the figure as an image
    output_file = f'bargraph_{datetime}.png'
    output_path = os.path.join(output_dir, output_file)
    plt.savefig(output_path, bbox_inches='tight', pad_inches=0.1, dpi = 100)
    plt.close(fig)

    return output_file

## The bargraph video is much faster to produce than the map video, and can be generated all at once rather than in chunks. 

## output_dir -- name of the folder that will be created in the current directory to save images to



# Animate_bargraph( ) return: file_names

## Names of files that were generated. 
## Useful later for duplicating frames and making video

In [7]:
def animate_bargraph(output_dir, start_row, num_rows):
    end_row = start_row + num_rows
    file_names = []

    # The slice already stops at end_row (or at the end of the data, if that comes first)
    for i, datetime in enumerate(OTMDdates[start_row:end_row], start=start_row):
        file_names.append(animate_bargraph_img(datetime, output_dir, i))

    return file_names

## plot_congestion_anim( ) designed to be called by animate_map( )

In [41]:
def plot_congestion_anim(datetime, output_dir, index):
    cmap = plt.get_cmap('RdYlGn_r')  # plt.cm.get_cmap was removed in Matplotlib 3.9
    geom_pattern = r"[-+]?\d*\.\d+|\d+"
    base_map['color'] = 'gray'

    # Filter data for the specific datetime
    datetime_data = OTMD[OTMD['datetime'] == datetime]

    # Create a new figure
    fig, ax = plt.subplots()

    base_map.plot(ax=ax, color=base_map['color'], edgecolor='black', linewidth=0.5)

    # Iterate over each street segment
    for _, row in datetime_data.iterrows():
        street_name = row['street']
        congestion = row['congestion']

        # Get geometry of the street segment
        street_geometries = list(json_streets[json_streets['st_name'] == street_name]['geometry'])

        for street_geometry in street_geometries:
            # Extract coordinates from geometry
            geometry_string = str(street_geometry)
            coordinates = re.findall(geom_pattern, geometry_string)
            coordinates = [float(coord) for coord in coordinates]

            # Split coordinates into pairs (longitude, latitude)
            coordinates_pairs = [(coordinates[i], coordinates[i + 1]) for i in range(0, len(coordinates), 2)]

            # Plot the line segment
            for i in range(len(coordinates_pairs) - 1):
                segment = coordinates_pairs[i:i + 2]
                xs, ys = zip(*segment)  # Unzip the segment into x and y coordinates
                ax.plot(xs, ys, color=cmap(congestion), linewidth=1)

    ax.set_title(f'{datetime}')
    plt.axis('off')

    # Save the figure as an image
    output_file = f'{index:04d}_{datetime}.png'
    output_path = os.path.join(output_dir, output_file)
    plt.savefig(output_path, bbox_inches='tight', pad_inches=0.1)
    plt.close(fig)
    return output_file
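
To make the coordinate parsing above concrete, here is a sketch of what the regular expression extracts from the text form of a geometry. The geometry string is a made-up example, not a street from czone_json.geojson.

```python
import re

geom_pattern = r"[-+]?\d*\.\d+|\d+"

# Hypothetical text form of a three-point street geometry (longitude latitude)
geometry_string = "LINESTRING (-73.99 40.75, -73.98 40.76, -73.97 40.77)"

# Every number in the string, in order of appearance
coordinates = [float(c) for c in re.findall(geom_pattern, geometry_string)]

# Consecutive values form (longitude, latitude) pairs
pairs = [(coordinates[i], coordinates[i + 1]) for i in range(0, len(coordinates), 2)]

# Consecutive pairs form the line segments that get plotted
segments = [pairs[i:i + 2] for i in range(len(pairs) - 1)]

print(pairs)          # [(-73.99, 40.75), (-73.98, 40.76), (-73.97, 40.77)]
print(len(segments))  # 2
```

Because the numbers are read as one flat list, a geometry made of several separate lines would also get a segment drawn from the end of one line to the start of the next. The pattern also captures a sign only for values that contain a decimal point.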

# animate_map( ) return: file_names

## Names of files that were generated. 
## Useful later for duplicating frames and making video

## output_dir -- name of the folder that will be created in the current directory to save images to

In [9]:
def animate_map(output_dir, start_row, num_rows):

    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    end_row = start_row + num_rows
    file_names = []

    # Iterate over the unique datetimes in OTMDdates from start_row up to end_row
    for i, datetime in enumerate(OTMDdates[start_row:end_row], start=start_row):
        # Call plot_congestion_anim() for the current datetime and record the file name
        file_names.append(plot_congestion_anim(datetime, output_dir, i))

    return file_names

## Important Note: attempting to generate more than 1000 frames at a time using animate_map( ) can cause the Python kernel to crash. 1000 at a time worked reliably on my MacBook M3 Pro.

### image_names( ) reproduces the map image names. The bargraph images are named bargraph_{datetime}.png, so use the list returned by animate_bargraph( ) for those

In [10]:
def image_names(start_row, num_rows):
    end_row = start_row + num_rows
    # Rebuild the names written by plot_congestion_anim(): zero-padded index, then datetime
    return [f'{i:04d}_{datetime}.png'
            for i, datetime in enumerate(OTMDdates[start_row:end_row], start=start_row)]
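
For reference, a sketch of the names this scheme produces. The datetime strings are made up and stand in for OTMDdates; the actual format in the dataset may differ.

```python
# Hypothetical datetime strings standing in for OTMDdates
sample_dates = ["2023-10-01 00:00:00", "2023-10-01 01:00:00", "2023-10-01 02:00:00"]

start_row = 9999
names = [f"{i:04d}_{dt}.png" for i, dt in enumerate(sample_dates, start=start_row)]

print(names[0])  # 9999_2023-10-01 00:00:00.png
print(names[1])  # 10000_2023-10-01 01:00:00.png

# {i:04d} pads to at least 4 digits, so indexes from 10000 up have 5 digits
# and alphabetical order no longer matches frame order
print(sorted(names) == names)  # False
```

Iterating over the list returned by image_names( ) keeps the frames in order regardless; sorting a directory listing alphabetically would not.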

## multiply_frames( )
## The videos in this notebook are written with the cv2 library at 24 frames per second. 

## To show each frame for half a second, multiply frames by 12

## Exports the multiplied frames to a new directory, distinguishing the copies with a letter suffix

## factor = multiplication rate

## file_names = list of file names, such as the result of running animate_map( ) or animate_bargraph( )
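
The arithmetic behind the factor of 12, as a short sketch. The frame count of 11852 is the number of unique datetimes reported in the "Generate Congestion Map" section; `fps` and `factor` are the values used in the video cells.

```python
fps = 24          # frame rate passed to cv2.VideoWriter
factor = 12       # copies made of each original frame
n_frames = 11852  # unique datetimes in imputed_congestion.csv

seconds_per_frame = factor / fps         # 0.5
total_seconds = n_frames * factor / fps  # 5926.0

hours, remainder = divmod(int(total_seconds), 3600)
minutes, seconds = divmod(remainder, 60)
print(f"{hours}h {minutes}m {seconds}s")  # 1h 38m 46s
```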

In [11]:
import shutil


alphabet = 'abcdefghijklmnopqrstuvwxyz'

def multiply_frames(factor, input_directory, output_directory, file_names):
    # Check if input directory exists
    if not os.path.exists(input_directory):
        raise FileNotFoundError(f"The directory {input_directory} does not exist.")

    # Collect names as they are written, so skipped files leave no placeholders
    new_file_names = []

    # Create the output directory if it doesn't exist
    os.makedirs(output_directory, exist_ok=True)

    # Process each file in the file_names list
    for title in file_names:
        file_path = os.path.join(input_directory, title)
        if not os.path.exists(file_path):
            continue  # Skip if file does not exist

        # Split "name.png" into ("name", ".png")
        stem, ext = os.path.splitext(title)

        # Create duplicates: name_a.png, name_b.png, ...
        for i in range(factor):
            suffix = alphabet[i % len(alphabet)]
            new_title = f"{stem}_{suffix}{ext}"
            shutil.copy(file_path, os.path.join(output_directory, new_title))
            new_file_names.append(new_title)

    return new_file_names

# Generate Congestion Map

Number of total frames = number of unique datetimes

In [12]:
len(OTMD['datetime'].unique())

11852

To generate all 11852 frames, the work must be broken into batches of 1000 frames at a time, or the Python kernel could die. 

Depending on your machine, consider lowering the number of frames per batch.
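
One way to avoid typing each batch by hand is to compute the (start_row, num_rows) pairs first. This is a sketch; `batch_ranges` is a hypothetical helper, not part of the original notebook.

```python
def batch_ranges(total_rows, batch_size):
    """Return (start_row, num_rows) pairs that cover rows 0..total_rows-1 in order."""
    return [(start, min(batch_size, total_rows - start))
            for start in range(0, total_rows, batch_size)]

batches = batch_ranges(11852, 1000)
print(len(batches))  # 12
print(batches[0])    # (0, 1000)
print(batches[-1])   # (11000, 852)
```

Each pair can then be passed to animate_map( ), for example `animate_map("ImputedMapFinal", start_row, num_rows)` inside a loop. Whether a loop in a single cell is as stable as separate cells depends on what makes the kernel crash, so treat it as something to test on your own machine.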

In [16]:
map_batch01 = animate_map("ImputedMapFinal", 0, 1000)

In [17]:
map_batch02 = animate_map("ImputedMapFinal", 1000, 1000)

In [18]:
map_batch03 = animate_map("ImputedMapFinal", 2000, 1000)

In [19]:
map_batch04 = animate_map("ImputedMapFinal", 3000, 1000)

In [20]:
map_batch05 = animate_map("ImputedMapFinal", 4000, 1000)

In [21]:
map_batch06 = animate_map("ImputedMapFinal", 5000, 1000)

In [22]:
map_batch07 = animate_map("ImputedMapFinal", 6000, 1000)

In [23]:
map_batch08 = animate_map("ImputedMapFinal", 7000, 1000)

In [24]:
map_batch09 = animate_map("ImputedMapFinal", 8000, 1000)

In [25]:
map_batch10 = animate_map("ImputedMapFinal", 9000, 1000)

In [26]:
map_batch11 = animate_map("ImputedMapFinal", 10000, 1000)

In [27]:
map_batch12 = animate_map("ImputedMapFinal", 11000, 852)

### Use image_names( ) to produce a single list of all the files stored in the 12 different map batches

In [28]:
imp_batch_final = image_names(0, 11852)

### Use list of image names with multiply_frames( ) to create 12 of each of the original map frames

In [29]:
imp_batch_final_12frame = multiply_frames(12, "ImputedMapFinal", "ImputedMapFinal_12Frame", imp_batch_final)

### Join images together into a video at 24 frames per second using the image names list and the cv2 library

In [31]:
fourcc = cv2.VideoWriter_fourcc(*'mp4v')

# Open the first image to get its dimensions
first_image_path = f'ImputedMapFinal_12Frame/{imp_batch_final_12frame[0]}'
first_image = cv2.imread(first_image_path)
frame_width = first_image.shape[1]
frame_height = first_image.shape[0]


#aspect_ratio = frame_width / frame_height

# Initialize the VideoWriter object with the calculated dimensions
output_video = cv2.VideoWriter('ImputedMapFinal.mp4', fourcc, 24, (frame_width, frame_height))

# Iterate over each image and add it to the video
for image in imp_batch_final_12frame:
    image_path = f'ImputedMapFinal_12Frame/{image}'
    img = cv2.imread(image_path)
    output_video.write(img)

# Release the VideoWriter object
output_video.release()

### Generate frames for the bargraph video, one per datetime:

In [60]:
graph_batch = animate_bargraph("ImputedBarGraph", 0, 11852)

### Use list of image names with multiply_frames( ) to create 12 of each of the bargraph frames

In [61]:
graph_batch_12frame = multiply_frames(12, "ImputedBarGraph", "ImputedBarGraph_12Frame", graph_batch)

### Join images together into a video at 24 frames per second using the image names list and the cv2 library

In [62]:
fourcc = cv2.VideoWriter_fourcc(*'mp4v')

# Open the first image to get its dimensions
first_image_path = f'ImputedBarGraph_12Frame/{graph_batch_12frame[0]}'
first_image = cv2.imread(first_image_path)
frame_width = first_image.shape[1]
frame_height = first_image.shape[0]


#aspect_ratio = frame_width / frame_height

# Initialize the VideoWriter object with the calculated dimensions
output_video = cv2.VideoWriter('ImputedBarGraph.mp4', fourcc, 24, (frame_width, frame_height))

# Iterate over each image and add it to the video
for image in graph_batch_12frame:
    image_path = f'ImputedBarGraph_12Frame/{image}'
    img = cv2.imread(image_path)
    output_video.write(img)

# Release the VideoWriter object
output_video.release()