## Box Office Winner - Gantt Chart

This Python notebook **visualizes the top box office releases of 2023** using **Altair**. It loads a CSV dataset, sorts the rows by date, and **builds timeline segments** marking the consecutive days each movie spent as the top release. A **bar chart** draws these segments, with one color per movie. The x-axis shows the **months**, while the y-axis lists the **top releases** in the order they first reached the top spot. The bars have **rounded corners**, each row is labeled with its **movie title**, and the chart is rendered with a **1045x768-pixel plotting area**.

![Box Office Winner 2023 Data](https://private-user-images.githubusercontent.com/3606672/304127856-f404debd-b1bf-4a98-933e-d3b27e3b3921.svg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MDc3NTAyNjAsIm5iZiI6MTcwNzc0OTk2MCwicGF0aCI6Ii8zNjA2NjcyLzMwNDEyNzg1Ni1mNDA0ZGViZC1iMWJmLTRhOTgtOTMzZS1kM2IyN2UzYjM5MjEuc3ZnP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDIxMiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDAyMTJUMTQ1OTIwWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9ZGZlZTU1OTc4YWNhMDEyNGM5MGM5NjcxOGY4NDM1ZGQ1YTQyZmM1MjY3MzYzNzhiNzUzODc3N2VjZGY0N2Q0MyZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.Dj9k8qT8S0xRrVT6TM6OoNaAVerrwJLHRbzd9VAvbWs)

The x-axis spans January 1 to December 31, 2023. The y-axis has one row for every movie that was the top release on at least one day. A rounded bar marks each stretch of consecutive days a movie spent at the top of the box office. Each movie has its own color, and its title appears to the left of its row. Rows are ordered chronologically by the date each movie first reached the top position.

In [13]:
import altair as alt
import pandas as pd

url = "https://github.com/qnzhou/practical_data_visualization_in_python/files/14239903/box_office_2023.csv"
df = pd.read_csv(url)

df

| | Date | Holiday | Day of Week | Top 10 Gross | Number of Releases | Top Release | Gross |
|---|---|---|---|---|---|---|---|
| 0 | Dec 31 2023 | New Year's Eve | Sunday | 23078184 | 40 | Wonka | 5208897 |
| 1 | Dec 30 2023 | | Saturday | 40050370 | 41 | Wonka | 8637841 |
| 2 | Dec 29 2023 | | Friday | 37348409 | 41 | Wonka | 8630268 |
| 3 | Dec 28 2023 | | Thursday | 33261609 | 43 | Wonka | 7988504 |
| 4 | Dec 27 2023 | | Wednesday | 33892628 | 42 | Wonka | 8135639 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 360 | Jan 5 2023 | | Thursday | 10864987 | 30 | Avatar: The Way of Water | 6830651 |
| 361 | Jan 4 2023 | | Wednesday | 12131291 | 30 | Avatar: The Way of Water | 7475308 |
| 362 | Jan 3 2023 | | Tuesday | 16965068 | 31 | Avatar: The Way of Water | 10544729 |
| 363 | Jan 2 2023 | | Monday | 32548656 | 30 | Avatar: The Way of Water | 21411622 |


In [14]:
df['Date'] = pd.to_datetime(df['Date'])

df = df.sort_values('Date')

df

| | Date | Holiday | Day of Week | Top 10 Gross | Number of Releases | Top Release | Gross |
|---|---|---|---|---|---|---|---|
| 364 | 2023-01-01 | New Year's Day | Sunday | 36210982 | 31 | Avatar: The Way of Water | 24519161 |
| 363 | 2023-01-02 | | Monday | 32548656 | 30 | Avatar: The Way of Water | 21411622 |
| 362 | 2023-01-03 | | Tuesday | 16965068 | 31 | Avatar: The Way of Water | 10544729 |
| 361 | 2023-01-04 | | Wednesday | 12131291 | 30 | Avatar: The Way of Water | 7475308 |
| 360 | 2023-01-05 | | Thursday | 10864987 | 30 | Avatar: The Way of Water | 6830651 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 4 | 2023-12-27 | | Wednesday | 33892628 | 42 | Wonka | 8135639 |
| 3 | 2023-12-28 | | Thursday | 33261609 | 43 | Wonka | 7988504 |
| 2 | 2023-12-29 | | Friday | 37348409 | 41 | Wonka | 8630268 |
| 1 | 2023-12-30 | | Saturday | 40050370 | 41 | Wonka | 8637841 |
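
The raw `Date` strings shown earlier all follow one pattern (`Dec 31 2023`), so the format can also be passed to `pd.to_datetime` explicitly instead of being inferred. The sketch below uses a few rows copied from the output above; it assumes every row in the CSV uses this same pattern, which only the visible rows confirm, and `sample` is an illustrative name.

```python
import pandas as pd

# Toy rows in the same "Mon D YYYY" style as the dataset
sample = pd.DataFrame({
    "Date": ["Dec 31 2023", "Jan 5 2023", "Jan 2 2023"],
    "Top Release": ["Wonka", "Avatar: The Way of Water", "Avatar: The Way of Water"],
})

# %b = abbreviated month name, %d = day of month, %Y = four-digit year.
# With an explicit format, a string that does not match raises an error
# instead of being parsed by guesswork.
sample["Date"] = pd.to_datetime(sample["Date"], format="%b %d %Y")

# Earliest date first, with a fresh 0..n-1 index
sample = sample.sort_values("Date", ignore_index=True)
print(sample)
```

An explicit format is mostly a safeguard: if a later version of the file changes its date style, the cell fails right away rather than producing wrong dates.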


In [15]:
# df['Date'] was parsed and df was sorted by date in the previous cell.

# Collect one record per uninterrupted run of the same top release.
# DataFrame.append was deprecated in pandas 1.4 and removed in 2.0,
# so the rows are gathered in a list and converted to a DataFrame once.
records = []
prev_movie = None
start_date = None
for _, row in df.iterrows():
    if row['Top Release'] != prev_movie:
        if prev_movie is not None:
            # End the previous segment on the day the new movie takes over
            records.append({
                'Start': start_date,
                'End': row['Date'],
                'Top Release': prev_movie
            })
        # Start a new segment
        start_date = row['Date']
    prev_movie = row['Top Release']

# Close the final segment at the last date in the dataset
if start_date is not None:
    records.append({
        'Start': start_date,
        'End': df['Date'].iloc[-1],
        'Top Release': prev_movie
    })

segments = pd.DataFrame(records, columns=['Start', 'End', 'Top Release'])

segments

| | Start | End | Top Release |
|---|---|---|---|
| 0 | 2023-01-01 | 2023-01-06 | Avatar: The Way of Water |
| 1 | 2023-01-06 | 2023-01-07 | M3GAN |
| 2 | 2023-01-07 | 2023-01-25 | Avatar: The Way of Water |
| 3 | 2023-01-25 | 2023-01-26 | Pathaan |
| 4 | 2023-01-26 | 2023-02-02 | Avatar: The Way of Water |
| ... | ... | ... | ... |
| 78 | 2023-12-08 | 2023-12-15 | The Boy and the Heron |
| 79 | 2023-12-15 | 2023-12-22 | Wonka |
| 80 | 2023-12-22 | 2023-12-25 | Aquaman and the Lost Kingdom |
| 81 | 2023-12-25 | 2023-12-26 | The Color Purple |
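
The same segments can also be computed without a row-by-row loop. The sketch below is an alternative, not the notebook's own method: it marks a new run wherever the title differs from the previous day's, numbers the runs with a cumulative sum, and takes the first date of each run. `build_segments`, `toy`, and `toy_segments` are hypothetical names, and the toy data is made up for illustration.

```python
import pandas as pd

def build_segments(daily, date_col="Date", label_col="Top Release"):
    """Collapse consecutive days with the same label into Start/End segments."""
    daily = daily.sort_values(date_col).reset_index(drop=True)
    # A new run starts wherever the label differs from the previous row
    run_id = (daily[label_col] != daily[label_col].shift()).cumsum().rename("run")
    runs = (
        daily.groupby(run_id)
        .agg(Start=(date_col, "first"), Label=(label_col, "first"))
        .reset_index(drop=True)
    )
    # Each segment ends on the day the next one starts;
    # the last segment ends on the final date in the data
    runs["End"] = runs["Start"].shift(-1).fillna(daily[date_col].iloc[-1])
    return runs.rename(columns={"Label": label_col})[["Start", "End", label_col]]

# Made-up data: movie "A" leads, loses one day to "B", then leads again
toy = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04", "2023-01-05"]
    ),
    "Top Release": ["A", "A", "B", "A", "A"],
})
toy_segments = build_segments(toy)
print(toy_segments)
```

As in the loop version, a movie that regains the top spot gets a separate segment for each run, and each segment's `End` equals the next segment's `Start`.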


In [25]:
chart = alt.Chart(segments).mark_bar(
    cornerRadius=5  # round all four corners of each bar
).encode(
    x=alt.X('Start:T', axis=alt.Axis(format='%B', title='Month')),
    x2='End:T',
    # Order rows by the date each movie first reached the top spot
    y=alt.Y('Top Release:N',
            sort=alt.EncodingSortField(field="Start", op="min", order='ascending')),
    color='Top Release:N'
)

# Title labels. This layer has no x/y encodings and is not added to
# final_chart, so it is not drawn.
text = alt.Chart(segments).mark_text(
    align='right',
    baseline='middle',
    dx=-5
).encode(
    text='Top Release:N'
)

final_chart = chart.properties(
    width=1045,
    height=768
)

final_chart.display()