In [1]:
import pandas as pd
import numpy as np          
import matplotlib.pyplot as plt
import seaborn as sns   
import statsmodels.api as sm
import statsmodels.formula.api as smf
from numpy import log

In [2]:
df = pd.read_csv('top_stations.csv')

In [3]:
# Get unique stations from the aggregated data
stations = df['place_id'].unique()

# Dictionary to store the regression results for each station
station_results = {}

# Define the regression formula
formula = ("trip_count ~ max_temp + avg_humidity + min_temp + avg_windspeed + pressure_change + "
           "avg_pressure + avg_visibility + bad_weather_half_hours + trip_count_lag1 + "
           "C(time_of_day) + C(month) + C(weekend)")

for station in stations:
    # Subset data for the current station
    subset = df[df['place_id'] == station].copy()
    
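    # Negative Binomial regression for this station (log link).
    # Note: the GLM family does not estimate the dispersion parameter;
    # sm.families.NegativeBinomial() uses its default alpha=1.0.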
    model_station = smf.glm(formula=formula,
                            data=subset,
                            family=sm.families.NegativeBinomial()).fit()
    
    # Store the model summary for the station
    station_results[station] = model_station.summary()
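
The loop above keeps only the printed summary for each station. If the coefficient table is needed as data (for example, to compare stations side by side), one option is a small helper that reads it from the fitted results object. This is a sketch under assumptions: `tidy_results` is a hypothetical helper that is not part of the notebook, and it relies only on the `params`, `pvalues`, and `conf_int()` members that statsmodels GLM results expose.

```python
import numpy as np
import pandas as pd


def tidy_results(results, alpha=0.05):
    """Collect coefficients, rate ratios and p-values in one DataFrame.

    `results` is assumed to be a fitted statsmodels results object
    (hypothetical usage: tidy_results(model_station) inside the loop).
    """
    # conf_int() returns the lower and upper bounds on the log scale.
    bounds = np.asarray(results.conf_int(alpha=alpha))
    table = pd.DataFrame({
        "coef": results.params,
        "p_value": results.pvalues,
    })
    # With a log link, exponentiating gives multiplicative rate ratios.
    table["rate_ratio"] = np.exp(table["coef"])
    table["rr_ci_low"] = np.exp(bounds[:, 0])
    table["rr_ci_high"] = np.exp(bounds[:, 1])
    table["significant"] = table["p_value"] < alpha
    return table
```

Storing `tidy_results(model_station)` alongside the summary would make it possible to filter on `significant` instead of reading each printed table by eye.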



In [4]:
df['place_id'].unique()

array([ 515,  517,  607,  609,  611,  905, 1101, 1304], dtype=int64)

### Station 611: Nyugati tér

In [5]:
df[df['place_id']==611]['place_name'].head(1)

2530    Nyugati tér
Name: place_name, dtype: object

In [6]:
print(station_results[611])  # Print the summary for station 611

                 Generalized Linear Model Regression Results                  
Dep. Variable:             trip_count   No. Observations:                  648
Model:                            GLM   Df Residuals:                      627
Model Family:        NegativeBinomial   Df Model:                           20
Link Function:                    Log   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -1603.9
Date:                Tue, 15 Apr 2025   Deviance:                       121.77
Time:                        11:40:41   Pearson chi2:                     128.
No. Iterations:                     7   Pseudo R-squ. (CS):             0.2682
Covariance Type:            nonrobust                                         
                                          coef    std err          z      P>|z|      [0.025      0.975]
-------------------------------------------------------------------------------------------------------
In

### Nyugati tér Station (Station 611) – Regression Results Interpretation

**Context:**  
Nyugati tér is among the most frequently used end stations, with the highest number of recorded trips. It is a major transportation hub in central Budapest, located near the WestEnd shopping mall and the Nyugati Railway Station. The area is busy with commuters and tourists and has strong public transport connections (Metro M3, trams 4 and 6), which may contribute to the observed trip patterns.

---

#### Month & Weekend Effects  
- **Results:**  
  - The coefficients for Month [T.2], [T.3], [T.4], and [T.5] are **not statistically significant** (p > 0.05).
  - Weekend has a **significant negative effect**.

- **Interpretation:**  
  - There is **no clear seasonal effect** on trip counts at Station 611: usage appears consistent across months. Usage does, however, diminish on weekends compared with working days.

---

#### Weather Variables  
- **Max Temperature:**  
  - **Coefficient:** 0.0712 (rate ratio: exp(0.0712) ≈ 1.07)  
  - **p-value:** 0.756  
- **Min Temperature:**  
  - **Coefficient:** −0.0833 (rate ratio: exp(−0.0833) ≈ 0.92)  
  - **p-value:** 0.651  
- **Other Weather Factors:**  
  - **Avg Humidity, Avg Windspeed:** p-values > 0.05.
  - **Pressure Change:** has a significant positive effect.
- **Interpretation:**  
  - **No strong evidence** that day-to-day variations in temperature, humidity, wind, or minor “bad weather” events affect trip counts at Station 611.
  - This suggests that the station’s traffic is **comparatively insensitive to typical weather fluctuations**.
  - A potential explanation is that many users are commuters or short-distance travelers who rely on this hub regardless of moderate weather changes.
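
Because the model uses a log link, each coefficient is read on the multiplicative scale after exponentiation. A minimal sketch of that conversion, using the two temperature coefficients quoted above (the helper names `rate_ratio` and `percent_change` are illustrative, not from the notebook):

```python
import math


def rate_ratio(coef):
    """Multiplicative change in expected trips per one-unit increase."""
    return math.exp(coef)


def percent_change(coef):
    """The same effect expressed as a percentage change."""
    return (math.exp(coef) - 1.0) * 100.0


# Coefficients as reported in the bullets above.
for name, coef in [("max_temp", 0.0712), ("min_temp", -0.0833)]:
    print(f"{name}: rate ratio {rate_ratio(coef):.2f}, "
          f"percent change {percent_change(coef):+.1f}%")
```

A rate ratio close to 1 combined with a large p-value, as here, means the estimate gives no reliable evidence of an effect in either direction.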

---

#### Time-of-Day Effects  
- **Result:**  
  - Time of Day is the **primary factor** influencing trip counts at this location.
  - The **morning period** shows significantly fewer trips compared to the baseline period (likely Afternoon Rush Hour).
  - Other dayparts exhibit non-significant differences relative to the baseline.

- **Interpretation:**  
  - The strong impact of the time of day indicates that the station's usage is more closely linked to daily commuting patterns than to weather conditions.
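
The baseline is described as "likely" Afternoon Rush Hour because the formula does not name one. With a bare `C(time_of_day)` term, treatment coding uses the first level as the reference, which for plain string labels is the first in sorted order. A minimal sketch of that rule follows; `default_reference_level` and the daypart labels are hypothetical, since the notebook's actual labels are not shown.

```python
import pandas as pd


def default_reference_level(values):
    """Return the level that treatment coding would use as baseline by default."""
    series = pd.Series(values)
    if isinstance(series.dtype, pd.CategoricalDtype):
        # Categorical data keeps its declared category order.
        return series.cat.categories[0]
    # Plain labels: levels are taken in sorted order.
    return sorted(series.dropna().unique())[0]


# Hypothetical daypart labels, for illustration only.
dayparts = ["Morning", "Evening", "Afternoon Rush Hour", "Night"]
print(default_reference_level(dayparts))

# To fix the baseline explicitly, the formula term could be written as:
explicit_term = "C(time_of_day, Treatment(reference='Afternoon Rush Hour'))"
```

Naming the reference in the formula removes the guesswork when reading the `C(time_of_day)[T.…]` rows of the summary.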

---

### **Conclusion**

- **Weather Variables:**  
  Most weather variables (temperature, humidity, wind speed) are **not significant** predictors of trip counts at Nyugati tér; pressure change is the exception, with a significant positive effect.  
- **Overall Impact:**  
  - The data indicate that **Nyugati tér Station's usage is robust to typical weather fluctuations**.
  - This robustness may be due to heavy usage by commuters and travelers who depend on the station for connecting to other public transport options.
- **Implications for Policy/Planning:**  
  - Efforts to maintain or enhance service quality at this hub might focus more on managing peak-time demand than on weather mitigation strategies.

This analysis supports the hypothesis that strategic end stations like Nyugati tér are relatively **less affected by weather factors**, primarily because their user base is driven by consistent, weather-insensitive travel needs.


### Station 517: Városháza Park

In [7]:
df[df['place_id']==517]['place_name'].head(1)

609    Városháza Park
Name: place_name, dtype: object

In [8]:
print(station_results[517]) 

                 Generalized Linear Model Regression Results                  
Dep. Variable:             trip_count   No. Observations:                  640
Model:                            GLM   Df Residuals:                      619
Model Family:        NegativeBinomial   Df Model:                           20
Link Function:                    Log   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -1518.3
Date:                Tue, 15 Apr 2025   Deviance:                       122.16
Time:                        11:40:41   Pearson chi2:                     128.
No. Iterations:                     8   Pseudo R-squ. (CS):             0.2877
Covariance Type:            nonrobust                                         
                                          coef    std err          z      P>|z|      [0.025      0.975]
-------------------------------------------------------------------------------------------------------
In

### Városháza Park Station (Station 517) – Regression Results Interpretation

**Context:**  
Városháza Park is located near the City Hall in central Budapest, an area that can attract both residents and tourists. 

---

### Time-of-Day Effects  
- **Baseline:** Afternoon Rush Hour (implied reference).  
- **Coefficients (Morning, Evening, etc.)** are **not statistically significant** (p > 0.05).  
  - *Interpretation:* There is no strong evidence that time-of-day segments differ from Afternoon Rush Hour in terms of trip counts for this station.

---

### Month & Weekend Effects  
- **Month [T.2], [T.3], [T.4], [T.5]** and **Weekend** have **p > 0.05**.  
  - *Interpretation:* No clear seasonal or weekend pattern emerges. Usage appears relatively stable across months and weekdays/weekends.

---

### Weather Variables  
- **Max Temp (coef ≈ 0.1790, p=0.572)**  
- **Min Temp (coef ≈ 0.0343, p=0.858)**  
- **Avg Humidity, Avg Windspeed, Avg Visibility, Bad Weather Half Hours**  
  - All show **p-values > 0.05**.  
  - *Interpretation:* No statistically significant influence of these weather factors on bike trips to Városháza Park.

#### **Pressure Change** (coef ≈ 0.1951, p=0.074)  
- Borderline significance (p just above 0.05).  
  - If we exponentiate the coefficient, `exp(0.1951) ≈ 1.215`, implying a potential 21.5% increase in trips with each unit increase in pressure change.  
  - Since p=0.074 > 0.05, this effect is **not conventionally significant** but could merit a cautious note.
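
The exponentiation step follows from the log link shown in the summary output. As a short worked derivation (here $\beta_{pc}$ denotes the pressure change coefficient and the other predictors are held fixed; the symbols are notation introduced for this note, not taken from the notebook):

$$
\log \mathbb{E}[\text{trip\_count}] = \beta_0 + \beta_{pc}\,\text{pressure\_change} + \dots
$$

$$
\frac{\mathbb{E}[\text{trip\_count} \mid \text{pressure\_change} + 1]}{\mathbb{E}[\text{trip\_count} \mid \text{pressure\_change}]} = e^{\beta_{pc}} = e^{0.1951} \approx 1.215
$$

The ratio does not depend on the starting value of pressure change, which is why the effect is stated as a percentage per unit.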

---

### **Conclusion on Weather Sensitivity**  
- **Overall Finding:** None of the weather predictors, including Max and Min Temperature, is statistically significant for trip counts to Városháza Park. Pressure Change comes closest (p=0.074) but remains above the 0.05 threshold.
- **Implication:** Trips ending at Városháza Park seem **largely unaffected by day-to-day weather variations**, suggesting that other factors (like local attractions, short-distance or routine usage) may dominate ridership decisions at this station.



### Station 905: Kálvin tér

In [None]:
df[df['place_id']==905]['place_name'].head(1)

1867    Kálvin tér
Name: place_name, dtype: object

In [None]:
print(station_results[905])

                 Generalized Linear Model Regression Results                  
Dep. Variable:             trip_count   No. Observations:                  687
Model:                            GLM   Df Residuals:                      666
Model Family:        NegativeBinomial   Df Model:                           20
Link Function:                    Log   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -1750.3
Date:                Mon, 14 Apr 2025   Deviance:                       136.57
Time:                        12:39:35   Pearson chi2:                     140.
No. Iterations:                     8   Pseudo R-squ. (CS):             0.3447
Covariance Type:            nonrobust                                         
                                          coef    std err          z      P>|z|      [0.025      0.975]
-------------------------------------------------------------------------------------------------------
In

**No notable differences from the other top five stations, apart from pressure change being borderline non-significant.**

Kálvin tér is a major square and intersection in the city center. It serves as a key urban hub where cultural events take place, and the Hungarian National Museum is nearby. The square is also a major transport hub, served by tram, bus, and trolleybus routes, and the Kálvin tér station on the M3 (North-South) and M4 lines of the Budapest Metro is located here.

### Station 1304: Margitsziget

In [11]:
df[df['place_id']==1304]['place_name'].head(1)

4460    Margitsziget 
Name: place_name, dtype: object

In [12]:
print(station_results[1304])

                 Generalized Linear Model Regression Results                  
Dep. Variable:             trip_count   No. Observations:                  541
Model:                            GLM   Df Residuals:                      520
Model Family:        NegativeBinomial   Df Model:                           20
Link Function:                    Log   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -1296.7
Date:                Tue, 15 Apr 2025   Deviance:                       114.22
Time:                        11:40:41   Pearson chi2:                     124.
No. Iterations:                     9   Pseudo R-squ. (CS):             0.4660
Covariance Type:            nonrobust                                         
                                          coef    std err          z      P>|z|      [0.025      0.975]
-------------------------------------------------------------------------------------------------------
In

Across the top five strategic end stations in Budapest, the station-level results interpreted above point the same way: most weather variables (humidity, wind speed, visibility, and bad-weather half hours) do not significantly influence trip counts. The maximum and minimum temperature coefficients reported for Nyugati tér and Városháza Park are also not significant at the 0.05 level, so no reliable direction can be read from them. The magnitude of daily pressure change (which signals weather volatility) is the one weather variable that reaches or approaches significance: it is significant and positive at Nyugati tér, and borderline at Városháza Park (p=0.074) and Kálvin tér.

These stations, which include hubs like Nyugati tér, Városháza Park, Keleti Pályaudvar, and Szent István Park, are located in key central areas where commuters and tourists rely heavily on the network. Because these users are generally motivated by tight schedules and strong transit connections, they tend to disregard modest variations in humidity, wind speed, or minor adverse weather. Only when the weather conditions are markedly different—such as with temperature extremes or significant pressure changes—do these factors meaningfully affect travel choices. This pattern reflects the robustness of these stations as central transit nodes in Budapest, where travel is driven more by established commuting needs than by everyday weather fluctuations.

# A Note on Pressure Change

Daily pressure change, defined as the difference between the highest and lowest atmospheric pressure throughout the day, offers a unique insight into weather volatility. Unlike average temperature or humidity, this metric captures the rapid fluctuations that often signal transitional weather periods. When pressure changes significantly, it can indicate that conditions are on the brink of a shift—either improving or deteriorating. This phenomenon is especially relevant to bike riders; a marked pressure drop might warn of an impending storm or increased cloud cover, prompting commuters to finish their trips promptly, while a rise in pressure could suggest a clearing, more stable day.
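
The definition above (daily maximum minus daily minimum pressure) can be sketched as follows. The column names `timestamp` and `pressure`, the helper name, and the sample readings are assumptions for illustration; the notebook itself reads `pressure_change` as an existing column of `top_stations.csv`.

```python
import pandas as pd


def daily_pressure_change(obs, time_col="timestamp", pressure_col="pressure"):
    """Return max minus min pressure for each calendar day."""
    # Truncate each timestamp to midnight so readings group by day.
    days = pd.to_datetime(obs[time_col]).dt.normalize()
    grouped = obs.groupby(days)[pressure_col]
    return (grouped.max() - grouped.min()).rename("pressure_change")


# Made-up readings in hPa, two per day.
readings = pd.DataFrame({
    "timestamp": ["2025-04-14 00:00", "2025-04-14 12:00",
                  "2025-04-15 06:00", "2025-04-15 18:30"],
    "pressure": [1012.0, 1007.5, 1003.0, 1010.0],
})
print(daily_pressure_change(readings))
```

Note that this range is unsigned: it measures how much pressure moved during the day, not whether it rose or fell.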

This pattern appears at several stations, which makes pressure variability a candidate proxy for weather-induced changes in rider behavior, although at Városháza Park and Kálvin tér the effect is only borderline. In the broader context of weather forecasting, a decrease in pressure typically suggests the arrival of a low-pressure system, associated with rising, cooling air and a greater chance of precipitation or storms. Conversely, increasing pressure usually heralds clearer, calmer conditions. Thus, tracking pressure change not only enhances our understanding of daily weather patterns but also provides practical insights into how such fluctuations may drive decisions in urban mobility.
