This project involves analyzing ridership data for the Chicago Transit Authority's Red Line. The analysis aims to provide insights into ridership patterns, station usage, and other relevant metrics over a specified period.
- CTA_RedLine_Final.R: R script containing the data processing, analysis, and visualization steps for the project.
- Data Cleaning and Preparation: Load, filter, and merge ridership and station information data to prepare for analysis.
- Summary Statistics: Calculate and interpret summary statistics for ridership and other categorical variables.
- Correlation Analysis: Explore relationships between variables using correlation matrices and visualizations.
- Regression Modeling: Fit a multiple linear regression model to understand the impact of various factors on ridership.
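
The four steps above can be sketched end to end on a small synthetic dataset. This is a minimal illustration, not the code in CTA_RedLine_Final.R: every column name (`station_id`, `line`, `rides`, `is_weekend`, `temp_f`) and every value is a made-up assumption, and base R functions stand in for the dplyr and corrplot calls the script may use.

```r
# Synthetic stand-ins for the two CSV files.
# All column names and values are hypothetical, not the real CTA schema.
set.seed(42)
stations <- data.frame(
  station_id   = 1:3,
  station_name = c("Station A", "Station B", "Station C"),
  line         = c("Red", "Red", "Blue"),
  stringsAsFactors = FALSE
)
dates <- seq(as.Date("2024-01-01"), by = "day", length.out = 28)
daily <- data.frame(
  station_id = rep(1:3, times = length(dates)),
  date       = rep(dates, each = 3)
)
# wday is 0 for Sunday and 6 for Saturday
daily$is_weekend <- as.POSIXlt(daily$date)$wday %in% c(0, 6)
# temp_f is an invented predictor used only to give the model a numeric input
daily$temp_f <- round(rnorm(nrow(daily), mean = 35, sd = 8))
daily$rides  <- round(4000 - 1500 * daily$is_weekend + 20 * daily$temp_f +
                        rnorm(nrow(daily), sd = 200))

# 1. Data cleaning and preparation: merge, then keep Red Line rows only
merged <- merge(daily, stations, by = "station_id")
red    <- merged[merged$line == "Red", ]

# 2. Summary statistics for ridership and a categorical breakdown
print(summary(red$rides))
print(table(red$station_name, red$is_weekend))

# 3. Correlation analysis on the numeric variables
num_vars <- data.frame(
  rides      = red$rides,
  temp_f     = red$temp_f,
  is_weekend = as.numeric(red$is_weekend)
)
cor_mat <- cor(num_vars)
print(round(cor_mat, 2))

# 4. Multiple linear regression of ridership on the assumed predictors
fit <- lm(rides ~ is_weekend + temp_f + station_name, data = red)
print(summary(fit))
```

With the real data, the synthetic data frames would be replaced by the contents of the two CSV files, and the predictors would be whichever variables those files actually provide.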
- Data Owner
- CTA_DailyTotals.csv: Contains daily ridership totals for various stations.
- CTA_System.csv: Provides information about the CTA system, including station details.
Ensure you have R and the following packages installed for the project:
install.packages("corrplot") # For visualizing correlation matrices
install.packages("dplyr") # For data manipulation
install.packages("ggplot2") # For data visualization
install.packages("Hmisc") # For advanced data analysis and correlation calculations
install.packages("lubridate") # For date and time manipulation
- Install Required Packages: Use the commands above to install the necessary packages.
- Load Data: Ensure the datasets CTA_DailyTotals.csv and CTA_System.csv are placed in the project directory.
- Run the Analysis: Execute the R script CTA_RedLine_Final.R to perform the analysis and generate insights.
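
The three steps above can be wrapped in a short pre-flight check that runs the script only when everything it needs is present. This is a sketch under the assumption that the working directory is the project directory; the variable names are illustrative and are not part of CTA_RedLine_Final.R.

```r
# Packages and files named in this README
required_pkgs <- c("corrplot", "dplyr", "ggplot2", "Hmisc", "lubridate")
data_files    <- c("CTA_DailyTotals.csv", "CTA_System.csv")
script        <- "CTA_RedLine_Final.R"

# requireNamespace() returns FALSE when a package is not installed
installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs  <- required_pkgs[!installed]
missing_files <- c(data_files, script)[!file.exists(c(data_files, script))]

if (length(missing_pkgs) > 0) {
  message("Install these packages first: ", paste(missing_pkgs, collapse = ", "))
} else if (length(missing_files) > 0) {
  message("Not found in ", getwd(), ": ", paste(missing_files, collapse = ", "))
} else {
  source(script)  # runs the full analysis
}
```

Reporting the missing pieces instead of installing them automatically keeps the check free of side effects; `install.packages(missing_pkgs)` could be called in the first branch if automatic installation is preferred.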
This project is intended for data analysts and researchers interested in public transportation data analysis. The R script can be used as a template for similar analyses.
- Basic knowledge of R programming and data analysis.
- Familiarity with libraries such as ggplot2, dplyr, and corrplot.
- Andrex Ibiza, MBA
- 2024-06-13
For questions or further information, please contact me at andrexibiza@gmail.com, open a GitHub issue, or submit a pull request.
Happy Analyzing!