itstrieu/california-public-libraries-analysis-in-r

Analysis of California Public Libraries

R packages used: tidyverse, doParallel, caret, randomForest, pdp, glmnet, shiny

This repository contains the following:

Presentation Files

  1. Project Report
  2. Presentation Slides
  3. Shiny App
  4. Shiny App Code

Markdown Files

  1. Combine Data Files
  2. Data Pre-Processing
  3. Feature Selection and EDA
  4. Random Forests Model
  5. Support Vector Machine Regression Model

Background

Libraries are community lifelines, but declining visits and the COVID-19 pandemic threaten their vitality. Drawing on sociologist Eric Klinenberg's insights, I analyzed data on California public libraries to identify trends and possible responses.

Data Description

My analysis focused on California libraries, categorized by community size. Leveraging publicly available data, I scrutinized metrics like library visits, computer usage, and program attendance.

Descriptive Analytics

From 2016 to 2020, library visits declined. They rebounded after the pandemic but remained below pre-pandemic levels. Across library size categories, average visits per person stayed at around 3 per year. Small libraries showed greater resilience in program attendance.
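A minimal sketch of how a per-person visit rate could be computed by year and size category. The data frame and its column names (`year`, `size_group`, `visits`, `population`) are hypothetical stand-ins with made-up values, not the variables or figures used in this repository.

```r
library(dplyr)

# Synthetic stand-in data: column names and values are illustrative only.
libraries <- data.frame(
  year       = rep(2016:2017, each = 3),
  size_group = rep(c("small", "medium", "large"), times = 2),
  visits     = c(30000, 150000, 900000, 28000, 160000, 880000),
  population = c(10000, 50000, 300000, 10000, 50000, 300000)
)

# Visits per person = total visits / total service population,
# computed within each year and size category.
visits_per_capita <- libraries %>%
  group_by(year, size_group) %>%
  summarise(
    visits_per_person = sum(visits) / sum(population),
    .groups = "drop"
  )

print(visits_per_capita)
```

Dividing summed visits by summed population weights each library by the community it serves. Averaging per-library rates instead would give small libraries the same weight as large ones.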

Model Building

Employing Support Vector Machine Regression (SVMR) and Random Forest models, I probed predictors of library visits. Both models spotlighted the importance of computer usage, non-English materials circulation, and adult programs.
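The sketch below shows one way to fit both model types with `caret` on synthetic data. It assumes the `randomForest` and `kernlab` packages are installed. The radial kernel, the 5-fold cross-validation, and the predictor names are assumptions made for illustration. The actual model settings and variables are in the Markdown files listed above.

```r
library(caret)

set.seed(42)

# Synthetic stand-in data: predictor names and the relationship
# to the outcome are invented for this example.
n <- 200
libraries <- data.frame(
  computer_uses    = runif(n, 0, 5),
  non_english_circ = runif(n, 0, 2),
  adult_programs   = runif(n, 0, 1)
)
libraries$visits_per_capita <- with(
  libraries,
  1 + 0.5 * computer_uses + 0.8 * non_english_circ +
    1.2 * adult_programs + rnorm(n, sd = 0.3)
)

# 5-fold cross-validation, shared by both models.
ctrl <- trainControl(method = "cv", number = 5)

# Random forest regression; importance = TRUE is passed through to
# randomForest() so permutation importance is available afterwards.
rf_fit <- train(
  visits_per_capita ~ .,
  data = libraries,
  method = "rf",
  trControl = ctrl,
  tuneLength = 2,
  importance = TRUE
)

# Support vector regression with a radial kernel (assumed kernel choice).
# Predictors are centered and scaled because SVMs are scale sensitive.
svm_fit <- train(
  visits_per_capita ~ .,
  data = libraries,
  method = "svmRadial",
  trControl = ctrl,
  preProcess = c("center", "scale"),
  tuneLength = 3
)

# Rank predictors by random forest importance.
rf_importance <- varImp(rf_fit)
print(rf_importance)
```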

Model Analysis

SVMR achieved better prediction accuracy, while the Random Forest model explained more of the variation in visits. The results suggest revisiting how library performance is measured and how resources are allocated.
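One model can win on prediction error while another wins on explained variation. The sketch below illustrates this under the assumption that accuracy is measured as RMSE and explained variation as squared correlation, which is the default R-squared definition in `caret`. The numbers are invented and are not results from this project.

```r
# Root mean squared error: average size of the prediction errors.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# R-squared as squared correlation between observed and predicted values.
r2_corr <- function(obs, pred) cor(obs, pred)^2

# Invented values for illustration only.
obs    <- c(2, 3, 4, 5, 6)
pred_a <- c(2.5, 2.5, 4.5, 4.5, 6.5)  # small errors, imperfect ordering
pred_b <- c(4, 5, 6, 7, 8)            # tracks obs exactly, but offset by 2

metrics <- data.frame(
  model = c("A", "B"),
  rmse  = c(rmse(obs, pred_a), rmse(obs, pred_b)),
  r2    = c(r2_corr(obs, pred_a), r2_corr(obs, pred_b))
)

print(metrics)
```

Here model A has the lower RMSE (0.5 versus 2), while model B has the higher squared correlation because its predictions follow the observed values perfectly apart from a constant offset.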

Conclusion

This research arms librarians with tools to predict and grasp library visitation dynamics. Key findings suggest prioritizing experimentation with computer usage, non-English materials circulation, and adult programming. Integrating demographic data in future research is advised for a deeper understanding.
