1) Filter the data to include only weekdays (Monday to Friday) and plot a line graph showing the pedestrian counts for each day of the week.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

url = "https://data.cityofnewyork.us/api/views/6fi9-q3ta/rows.csv?accessType=DOWNLOAD"
df = pd.read_csv(url)

df['hour_beginning'] = pd.to_datetime(df['hour_beginning'])

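# Keep Monday (0) through Friday (4). The .copy() makes an independent
# DataFrame, so adding a column below does not trigger SettingWithCopyWarning.
weekdays = df[df['hour_beginning'].dt.dayofweek <= 4].copy()

weekdays['day_name'] = weekdays['hour_beginning'].dt.day_name()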

ordered_days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']

ped_counts_per_day = (
    weekdays.groupby('day_name')['Pedestrians']
    .sum()
    .reindex(ordered_days)  # groupby sorts alphabetically; restore Monday-to-Friday order
)

plt.figure(figsize=(10, 6))
plt.plot(ped_counts_per_day.index, ped_counts_per_day.values, marker='o', linestyle='-', linewidth=2)
plt.title('Total Pedestrian Counts (Weekdays Only)', fontsize=16)
plt.xlabel('Day of the Week', fontsize=12)
plt.ylabel('Pedestrian Counts', fontsize=12)
plt.xticks(fontsize=11)
plt.yticks(fontsize=11)
plt.grid(True, linestyle='--', alpha=0.6)

plt.tight_layout()
plt.show()



2) Track pedestrian counts on the Brooklyn Bridge for the year 2019 and analyze how different weather conditions influence pedestrian activity in that year. Sort the pedestrian count data by weather summary to identify any correlations (with a correlation matrix) between weather patterns and pedestrian counts for the selected year.

-This question asks you to show the relationship between a numerical feature (Pedestrians) and a non-numerical feature (Weather Summary). In such cases we use encoding. Assigning each weather condition an integer (0, 1, 2, ...) is called label encoding. One-hot encoding, which the solution below uses, instead creates one 0/1 column per weather condition, so no artificial order is implied among the categories.

-Correlation matrices are not always the most suitable way to visualize relationships involving categorical data. Nonetheless, this question was set to help you understand the concept better.
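
-To make the encoding idea concrete, here is a minimal sketch on made-up data. The `toy` DataFrame and its weather labels are hypothetical and are not taken from the dataset; the sketch only contrasts integer codes with one-hot columns.

```python
import pandas as pd

# Hypothetical toy data; these weather labels are made up for illustration.
toy = pd.DataFrame({"weather": ["clear", "rain", "cloudy", "rain"]})

# Label (integer) encoding: a single column with one integer code per category.
# Categories are ordered alphabetically, so the codes imply an arbitrary order.
label_codes = toy["weather"].astype("category").cat.codes

# One-hot encoding: one 0/1 column per category, with exactly one 1 in each row.
one_hot = pd.get_dummies(toy["weather"], dtype=int)

print(label_codes.tolist())
print(one_hot)
```

Each one-hot column can then be correlated with the pedestrian counts separately, which is what the correlation matrix in this question relies on.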

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

url = "https://data.cityofnewyork.us/api/views/6fi9-q3ta/rows.csv?accessType=DOWNLOAD"
df = pd.read_csv(url)
df['hour_beginning'] = pd.to_datetime(df['hour_beginning'])
df_2019 = df[(df['hour_beginning'].dt.year == 2019) & (df['Pedestrian_Bridge'] == 'Brooklyn Bridge')]
df_2019 = df_2019.dropna(subset=['Weather_Summary', 'Pedestrians'])
# One-hot encode the weather summary; dtype=int gives 0/1 columns rather than booleans
weather_encoded = pd.get_dummies(df_2019['Weather_Summary'], dtype=int)
weather_ped_df = pd.concat([df_2019[['Pedestrians']], weather_encoded], axis=1)
correlation_matrix = weather_ped_df.corr()
corr_with_peds = correlation_matrix[['Pedestrians']].drop('Pedestrians')

plt.figure(figsize=(10, 6))
sns.heatmap(corr_with_peds, annot=True, cmap='coolwarm', center=0)
plt.title('Correlation Between Weather Conditions and Pedestrian Counts (Brooklyn Bridge, 2019)', fontsize=14)
plt.xlabel('Pedestrian Count')
plt.ylabel('Weather Summary')
plt.tight_layout()
plt.show()
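
Since a correlation matrix is not always the clearest view of a categorical feature, one simpler alternative is to compare the average pedestrian count per weather category. The sketch below uses made-up values; the column names mirror the cell above, but the data is hypothetical.

```python
import pandas as pd

# Hypothetical toy data standing in for the filtered 2019 rows.
toy = pd.DataFrame({
    "Weather_Summary": ["clear", "rain", "clear", "rain", "cloudy"],
    "Pedestrians": [900, 200, 1100, 400, 600],
})

# Average count per weather category, sorted from busiest to quietest.
mean_by_weather = (
    toy.groupby("Weather_Summary")["Pedestrians"]
    .mean()
    .sort_values(ascending=False)
)

print(mean_by_weather)
```

The sorted averages can be passed to a bar plot, which reads more directly than a column of correlation coefficients.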

3) Implement a custom function to categorize time of day into morning, afternoon, evening, and night, and create a new column in the DataFrame to store these categories. Use this new column to analyze pedestrian activity patterns throughout the day.

-Students can also include plots that analyze this activity.


In [None]:
import pandas as pd
import matplotlib.pyplot as plt

url = "https://data.cityofnewyork.us/api/views/6fi9-q3ta/rows.csv?accessType=DOWNLOAD"
df = pd.read_csv(url)

df['hour_beginning'] = pd.to_datetime(df['hour_beginning'])

def categorize_time_of_day(hour):
    if 5 <= hour < 11:
        return 'Morning'
    elif 11 <= hour < 16:
        return 'Afternoon'
    elif 16 <= hour < 21:
        return 'Evening'
    else:
        return 'Night'

df['Time_of_Day'] = df['hour_beginning'].dt.hour.apply(categorize_time_of_day)

ped_by_time = df.groupby('Time_of_Day')['Pedestrians'].sum().reindex(['Morning', 'Afternoon', 'Evening', 'Night'])

plt.figure(figsize=(8, 6))
ped_by_time.plot(kind='bar', color='skyblue')
plt.title('Total Pedestrian Activity by Time of Day')
plt.xlabel('Time of Day')
plt.ylabel('Total Pedestrians')
plt.xticks(rotation=0)
plt.grid(axis='y', linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()
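
One caveat when reading the totals: the four categories defined above cover different numbers of hours (6, 5, 5, and 8), so the sums are not directly comparable. The sketch below uses made-up data with the same count in every hour to show how a per-hour mean removes that effect; the `toy` DataFrame is hypothetical.

```python
import pandas as pd

def categorize_time_of_day(hour):
    # Same boundaries as the solution cell above.
    if 5 <= hour < 11:
        return 'Morning'
    elif 11 <= hour < 16:
        return 'Afternoon'
    elif 16 <= hour < 21:
        return 'Evening'
    else:
        return 'Night'

# Hypothetical data: exactly 10 pedestrians in every hour of a single day.
toy = pd.DataFrame({"hour": range(24), "Pedestrians": [10] * 24})
toy["Time_of_Day"] = toy["hour"].apply(categorize_time_of_day)

order = ['Morning', 'Afternoon', 'Evening', 'Night']
grouped = toy.groupby("Time_of_Day")["Pedestrians"]

# Totals differ only because the categories span different numbers of hours.
totals = grouped.sum().reindex(order)

# The mean per hourly record is identical across categories for this uniform data.
hourly_mean = grouped.mean().reindex(order)

print(totals)
print(hourly_mean)
```

On the real dataset, swapping `.sum()` for `.mean()` in the grouping step gives the same kind of per-hour comparison.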