# Hypothesis Question:
How do driver vacancies impact bus service cancellations and on-time performance across different regions over time?

# Requirements Outline 



## Functional Requirements

Import and process data from the bus performance dataset.

Filter and group data by region and time (e.g. monthly).

Visualize trends in driver vacancies, cancellations, and OTR.

Calculate correlations between driver vacancies and service reliability.

Generate reports or dashboards that summarize findings.

Export results for presentations or further analysis.
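As a rough illustration of the import, filter, and group steps above, here is a minimal sketch that uses only the Python standard library. The column names (`region`, `month`, `driver_vacancies`, `cancellations_pct`) and the sample rows are assumptions made for this example and are not taken from the real dataset.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the real dataset's columns and values may differ.
SAMPLE_CSV = """region,month,driver_vacancies,cancellations_pct
Region A,2023-01,40,2.0
Region A,2023-01,60,4.0
Region A,2023-02,50,3.0
Region B,2023-01,10,1.0
"""


def monthly_averages(csv_text):
    """Return {(region, month): (mean vacancies, mean cancellation %)}."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["region"], row["month"])
        groups[key].append(
            (float(row["driver_vacancies"]), float(row["cancellations_pct"]))
        )
    # Average each column separately within every (region, month) group.
    return {
        key: (mean(v for v, _ in values), mean(c for _, c in values))
        for key, values in groups.items()
    }


averages = monthly_averages(SAMPLE_CSV)
for (region, month), (vacancies, cancelled) in sorted(averages.items()):
    print(f"{region} {month}: vacancies={vacancies:.1f}, cancelled={cancelled:.1f}%")
```

In a pandas-based version of the project, the same grouping would normally be done with `DataFrame.groupby` on the region and month columns.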

## Non-Functional Requirements

Accuracy: Ensure calculations (e.g. correlation, averages) are correct and verifiable.

Performance: Handle moderate dataset sizes without noticeable delays.

Usability: Easy to use with clear labels, charts, and filters.

Scalability: Should support updates if new monthly data is added.

Security: Protect sensitive information if working with private school/community data.

Maintainability: Code and process should be easy to update or modify.
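To make the accuracy requirement checkable by hand, the correlation can be compared against the standard Pearson definition. This assumes the project uses the Pearson coefficient, which the requirements do not state explicitly. Here $x_i$ would be the driver vacancy figure and $y_i$ the cancellation rate (or OTR) for period $i$, with $\bar{x}$ and $\bar{y}$ their means over $n$ periods:

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

A value of $r$ near $+1$ means the two series rise together, a value near $-1$ means one rises as the other falls, and a value near $0$ means there is no linear relationship.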



# Use-Case

Actor: Any User

Goal: To explore, visualize, and interact with data related to driver shortages, service cancellations, and on-time performance.

Preconditions:
The bus performance dataset is already preloaded into the system.

The user has access to the text-based user interface (e.g., via terminal or Python script).

Python and required libraries (e.g., pandas, matplotlib) are installed.

Main Flow:
User opens the program and sees the following text-based menu:

1. View data table
2. Filter data by region or month
3. View visualisation
4. Analyze driver vacancies vs cancellations
5. Update a data entry
6. Exit

User selects one of the options:

Option 1: Displays the full dataset or a sample.

Option 2: Asks user for filter criteria (e.g., Region = “Sydney”, Month = “May”) and displays results.

Option 3: Shows line/bar graph (e.g., driver vacancies over time).

Option 4: Calculates and displays correlation between driver shortages and cancellations or OTR.

Option 5: Prompts for a row number and new value to update a specific field.

Option 6: Exits the program.

System executes the action and displays results in the console or as pop-up charts.
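The main flow above could be driven by a simple loop like the one sketched below. This is an illustrative outline only: `run_menu` and its parameters are hypothetical names, and each option is a placeholder where the real program would call its own function.

```python
MENU = {
    "1": "View data table",
    "2": "Filter data by region or month",
    "3": "View visualisation",
    "4": "Analyze driver vacancies vs cancellations",
    "5": "Update a data entry",
    "6": "Exit",
}


def run_menu(read_choice=input, write=print):
    """Show the menu until the user picks Exit; return the valid choices made."""
    handled = []
    while True:
        for key, label in MENU.items():
            write(f"{key}. {label}")
        choice = read_choice("Select an option: ").strip()
        if choice not in MENU:
            write("Invalid option, please try again.")
            continue
        handled.append(choice)
        if choice == "6":
            write("Goodbye.")
            return handled
        # Placeholder: the real program would run the matching action here.
        write(f"[{MENU[choice]}] selected")


# Scripted demo so the sketch runs without keyboard input.
scripted = iter(["9", "1", "6"])
history = run_menu(read_choice=lambda prompt: next(scripted), write=lambda text: None)
print(history)  # ['1', '6']
```

Passing the input and output functions as parameters keeps the loop testable without a terminal; calling `run_menu()` with no arguments uses the keyboard and console.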

Postconditions:
User has viewed data trends and visualisations.

If a valid update was made, it is saved in the session (or optionally exported).

The dataset remains available for further analysis or exploration.
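One possible way to decide whether an update from Option 5 is "valid" is sketched below. The function name `update_entry`, the list-of-dictionaries layout, and the field names are assumptions for illustration; the project may store and validate its rows differently.

```python
def update_entry(rows, row_number, field, new_value):
    """Update one field in place; return True only if the update was valid."""
    if not 0 <= row_number < len(rows):
        return False  # row number outside the dataset
    if field not in rows[row_number]:
        return False  # unknown column name
    try:
        # Convert to the field's existing type (e.g. a float stays a float).
        converted = type(rows[row_number][field])(new_value)
    except (TypeError, ValueError):
        return False  # value cannot be converted, so nothing is changed
    rows[row_number][field] = converted
    return True


# Hypothetical in-session data.
rows = [{"region": "Region A", "driver_vacancies": 40.0}]
print(update_entry(rows, 0, "driver_vacancies", "55"))   # True
print(update_entry(rows, 3, "driver_vacancies", "55"))   # False
print(update_entry(rows, 0, "driver_vacancies", "abc"))  # False
```

Invalid input leaves the session data untouched, which matches the postcondition that only valid updates are saved.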




## Research On Issue

“Bus driver shortage hits Sydney — brace for chaos” reports widespread cancellations across the city due to driver shortages

Link: https://www.7news.com.au/stories/bus-driver-shortage-hits-sydney/

Sydney school transport ‘chaos’ warning describes thousands of route cancellations affecting students, with the Rail, Tram and Bus Union (RTBU) pointing to workplace conditions and privatization as key issues

Link: https://www.9news.com.au/national/sydney-transport-chaos-warning-as-bus-services-cancelled/35bc34ff-f1f9-4186-9b29-7f5458bc11fe

A recent Auditor-General’s report highlights that driver shortages are tightly linked to cancellation rates and poor on-time performance, urging contract reforms in NSW.

Link:https://www.busnews.com.au/nsw-bus-industry-calls-for-action-after-auditor-generals-bus-contracts-findings/

### Academic & Industry Research

A study in Vietnam examines intent to quit among urban bus drivers, showing factors like scheduling pressure, lack of support, and environmental stressors predict attrition

Link: https://www.mdpi.com/2071-1050/17/7/2850

An APTA workforce report (North America) shows 96% of transit agencies report staffing shortages, with aging workforce, retirements, and pandemic-era churn cited as key causes 

Link: https://www.apta.com/research-technical-resources/research-reports/transit-workforce-shortage/


## SEE I Paragraph

S – State it:
The ongoing bus driver shortage in Sydney is significantly impacting the reliability of public transport, leading to widespread cancellations and frustration among commuters.

E – Elaborate:
This shortage has resulted in a consistent rise in service cancellations and delays across major regions in NSW. Government reports and taskforce findings reveal that the root causes include poor working conditions, an aging workforce, and insufficient recruitment efforts. These challenges are not unique to Sydney; similar patterns are observed internationally, highlighting systemic issues within the transit industry.

E – Exemplify:
For instance, the NSW Bus Industry Taskforce described the system as "neglected and underfunded," while a 7News article warned commuters to "brace for chaos" due to hundreds of cancelled routes. Additionally, academic studies from Vietnam and the Netherlands show that long hours, stress, and a lack of flexibility drive attrition rates among bus drivers worldwide.

I – Illustrate:
It’s like a sports team trying to win a match with half its players injured or missing—no matter how good the strategy or schedule, without enough skilled players (in this case, drivers), the whole system struggles to perform. Similarly, until driver numbers are stabilized, transport reliability will continue to suffer.

## Analyse and Conclude

S (State it):
Public complaints are a strong indicator of gaps in transport reliability, especially in areas like cancelled services and driver shortages.

E (Elaborate):
This means that the volume and type of complaints received can reveal how well or poorly a transport system is functioning. For example, spikes in complaints often align with service cancellations, suggesting that unreliability directly affects user experience.

E (Example):
In the dataset analysed, months with high driver vacancy rates also showed increased percentages of service cancellations. This likely contributes to more people reporting missed or delayed buses. If people cannot rely on the system, their trust and satisfaction decrease, which is reflected in complaint numbers.

I (Interpret):
I see this as confirmation of the hypothesis: "Higher driver vacancies lead to lower reliability and more negative public response." However, to strengthen the analysis, more detailed complaint data (e.g. type, location, frequency) would help draw clearer connections between specific issues and public sentiment.

Conclusion:
Based on current data, there is a clear relationship between staffing shortages and reliability issues, which likely contributes to higher complaint volumes. More granular complaint records or rider feedback would support deeper insights.
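The relationship between vacancies and cancellations described in this section can be quantified with a correlation coefficient. The sketch below assumes a Pearson correlation and uses made-up numbers chosen to be perfectly linear, so they are a sanity check of the function and not results from the project dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(spread_x * spread_y)


# Illustrative values only: cancellations rise in step with vacancies.
vacancies = [10, 20, 30, 40]
cancellations_pct = [1.0, 2.0, 3.0, 4.0]
r = pearson(vacancies, cancellations_pct)
print(f"r = {r:.2f}")  # r = 1.00
```

A high value of `r` shows that the two measures move together, but on its own it does not prove that vacancies cause the cancellations.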

## Peer Reports

Quintus - Dev's calculations and datasets for his bus problem are quite good.

+ Dev's calculations are concise, with a reliable dataset, good coding, good evaluation, and good research.
- No real minuses; the one thing I would change is to use better Markdown formatting.

Implications - Dev did a really good job on this project, and I could not do better myself.

Oscar - Dev's project meets exceptional standards. I am blown away by the level of complexity in Dev's project.

+ Dev has made exceptional use of his amazing coding skills and his amazing research skills.
- I don't see anything that I would improve. I would only suggest that he transfer his dataset into an md format.

Implications - Dev did an amazing job on this Assessment Task and should be awarded full marks.


Karna - Dev's data dictionary is well organised with reliable sources.

+ Dev uses his knowledge of Python to construct accurate datasets that show the impact of driver shortages on bus times.
- Could use more collected data to create a more varied representation of this question.

Implications - Dev's project meets the best standards of dataset organisation to create a perfect view of this problem.



## Evaluate Project

### Requirements Outline
 
The system successfully delivers on key requirements:

Data Loading & Cleaning – Automatically imports the dataset, parses dates, and handles missing values.

Visualisation – Produces clear graphs comparing vacancies and cancellations over time.

Filtering – Enables targeted analysis by region or month.

However, some advanced requirements remain unmet, such as real-time data updates, interactive dashboards, and integrating public complaint data.
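The cleaning step listed above (parsing dates and handling missing values) might look something like the following sketch. It assumes months are stored as `YYYY-MM` text and that rows with missing or unparseable values are dropped; the project may instead fill gaps or use a different date format, and the field names are hypothetical.

```python
from datetime import datetime


def clean_rows(raw_rows):
    """Parse month strings and drop rows with missing or invalid values."""
    cleaned = []
    for row in raw_rows:
        try:
            month = datetime.strptime(row["month"], "%Y-%m")
            vacancies = float(row["driver_vacancies"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows that cannot be parsed
        cleaned.append(
            {
                "region": row.get("region", "Unknown"),
                "month": month,
                "driver_vacancies": vacancies,
            }
        )
    return cleaned


# Hypothetical raw rows: one good, one with a blank value, one with a bad date.
raw = [
    {"region": "Region A", "month": "2023-01", "driver_vacancies": "40"},
    {"region": "Region A", "month": "2023-02", "driver_vacancies": ""},
    {"region": "Region B", "month": "not a date", "driver_vacancies": "12"},
]
cleaned = clean_rows(raw)
print(len(cleaned))  # 1
```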

### Peer Feedback

Peer feedback on my project was consistently positive, recognising the accuracy of my calculations, the reliability of my dataset, the strength of my coding, and the depth of my research. Reviewers noted that the project demonstrated a high level of complexity and organisation, with datasets that effectively addressed the bus performance problem.

The main suggestions for improvement were to refine the Markdown formatting for better readability, consider presenting datasets in a Markdown-friendly format for easier sharing, and increase dataset variety to allow for a more diverse representation of the issue. These are relatively minor adjustments, but implementing them could enhance the accessibility and richness of the project.

Overall, the feedback confirms that my work meets a high standard and is well-suited to its purpose, with improvements mainly centred on presentation and variety rather than core accuracy or functionality.


### Project Management

Strengths: Modular coding approach (main.py + data_module.py) improved maintainability and troubleshooting.

Weaknesses: Late introduction of new features (e.g., complaints data) created scope creep and disrupted scheduling.

Next Steps: Use a more structured sprint cycle, approving feature additions only after milestone reviews.

### Data & Security

Data Quality:

Validity: Data sourced from official transport reports is reliable in structure and format.

Accuracy: Likely accurate, though minor cancellations or unlogged disruptions may not appear.

Timeliness: Updated monthly, meaning real-time disruptions are excluded.

Bias: May underrepresent smaller or unofficial service issues.

Security:

Current Risks: Local storage without encryption, no user authentication.

Improvements: Encrypt datasets, introduce password-protected dashboards, and store files in secure, access-controlled locations.
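As one possible starting point for the password-protection improvement, the sketch below stores a salted hash of a password and never the password itself, using only the Python standard library. The function names and the iteration count are illustrative assumptions. Encrypting the dataset files would need a separate, well-tested third-party library and is not shown here.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # new random salt for each stored password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected_digest, iterations=200_000):
    """Check a login attempt against the stored salt and digest."""
    _, digest = hash_password(password, salt, iterations)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("example-passphrase")
print(verify_password("example-passphrase", salt, stored))  # True
print(verify_password("wrong-guess", salt, stored))         # False
```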

### UX & Accessibility

Current State: CLI interface works for technical users but lacks user-friendliness for the general public.

Recommended Improvements:

Build a web-based dashboard (Streamlit/Dash) for ease of use.

Add hover-over explanations and interactive filtering.

Provide alternative text and high-contrast colour palettes for accessibility.


## Conclusion

The system meets most of the core functional requirements and provides accurate, valid transport performance analysis.

However, incorporating better visual presentation, accessibility, data security, and additional datasets would enhance its usefulness. The peer feedback I received will help me refine the project and will be useful in the long run.
