R packages used: tidyverse, doParallel, caret, randomForest, pdp, glmnet, shiny
This repository contains the following:
- Combine Data Files
- Data Pre-Processing
- Feature Selection and EDA
- Random Forests Model
- Support Vector Machine Regression Model
Libraries are community lifelines, but declining visits and the COVID-19 pandemic threaten their vitality. Drawing on the work of sociologist Eric Klinenberg, I analyzed data on California libraries to identify trends and possible responses.
The analysis covers California libraries, grouped by the size of the community they serve. Using publicly available data, I examined metrics such as library visits, computer usage, and program attendance.
Library visits declined from 2016 to 2020. They rebounded after the pandemic but remained below pre-pandemic levels. Across community sizes, libraries averaged about 3 visits per person per year. Small libraries showed greater resilience in program attendance.
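As a rough illustration of the visits-per-person figure, the sketch below groups libraries by community size and divides total visits by total population served. The data frame, its column names (`size_group`, `year`, `visits`, `population`), and all values are hypothetical stand-ins, not the repository's data. It also assumes a pooled ratio; averaging each library's own ratio is an equally plausible definition.

```r
library(dplyr)

# Hypothetical per-library records; names and values are illustrative only.
libs <- data.frame(
  size_group = c("small", "small", "medium", "medium", "large", "large"),
  year       = rep(2019, 6),
  visits     = c(12000, 30000, 150000, 90000, 900000, 1500000),
  population = c(4000, 10000, 50000, 30000, 300000, 500000)
)

# Pooled visits per person within each size group and year.
visits_per_person <- libs %>%
  group_by(size_group, year) %>%
  summarise(visits_pp = sum(visits) / sum(population), .groups = "drop")

print(visits_per_person)
```

Plotting `visits_pp` against `year`, one line per `size_group`, would give the kind of trend comparison described above.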
I used Support Vector Machine Regression (SVMR) and Random Forest models to identify predictors of library visits. Both models ranked computer usage, circulation of non-English materials, and adult programs as important predictors.
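The modeling step might look roughly like the sketch below. It is a minimal illustration on synthetic data, not the repository's actual code. The data frame `libs` and its column names are hypothetical. The SVM uses caret's `svmRadial` method, which requires the kernlab package, and the radial kernel is an assumption. The fold count and tuning settings are also assumptions.

```r
library(caret)  # train() wraps the kernlab (SVM) and randomForest backends

set.seed(42)
n <- 300
# Synthetic stand-in for the library data; all column names are hypothetical.
libs <- data.frame(
  computer_uses    = runif(n, 0, 5),
  non_english_circ = runif(n, 0, 2),
  adult_programs   = runif(n, 0, 1)
)
libs$visits <- 1 + 0.4 * libs$computer_uses + 0.6 * libs$non_english_circ +
  0.8 * libs$adult_programs + rnorm(n, sd = 0.3)

# Reuse the same 5 cross-validation folds so the two models are comparable.
folds <- createFolds(libs$visits, k = 5, returnTrain = TRUE)
ctrl  <- trainControl(method = "cv", index = folds)

# Support vector regression with a radial kernel (assumed kernel choice).
svm_fit <- train(visits ~ ., data = libs, method = "svmRadial",
                 preProcess = c("center", "scale"),
                 tuneLength = 3, trControl = ctrl)

# Random forest; importance = TRUE enables permutation importance.
rf_fit <- train(visits ~ ., data = libs, method = "rf",
                ntree = 200, importance = TRUE, trControl = ctrl)

# Cross-validated RMSE, R-squared, and MAE for both models.
print(summary(resamples(list(SVMR = svm_fit, RF = rf_fit))))

# Predictor importance from the random forest.
print(varImp(rf_fit))
```

Sharing the fold indices through `trainControl(index = ...)` is what lets `resamples()` compare the two models on identical splits.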
SVMR produced more accurate predictions, while Random Forest explained more of the variation in visits. The results point to adjustments in performance metrics and resource allocation strategies.
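The text does not name the metrics behind this comparison. One common reading, assumed here, is that prediction accuracy refers to root mean squared error (RMSE) and variation captured refers to R-squared. The sketch below computes both on hypothetical held-out values to show how they differ.

```r
# Hypothetical held-out observations and predictions (visits per person).
obs  <- c(2.1, 3.4, 2.8, 4.0, 3.1)
pred <- c(2.4, 3.0, 2.9, 3.6, 3.3)

# RMSE: typical size of a prediction error, in the units of the outcome.
rmse <- sqrt(mean((obs - pred)^2))

# R-squared: share of the variance in the observations explained by the model.
r2 <- 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

cat("RMSE:", round(rmse, 3), " R-squared:", round(r2, 3), "\n")
```

Because the two metrics weight errors differently across resampling folds, one model can have the lower average RMSE while the other has the higher average R-squared.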
This research gives librarians tools to predict and understand library visitation. The key findings suggest prioritizing experimentation with computer usage, non-English materials circulation, and adult programming. Future research should integrate demographic data for a deeper understanding.