smwanzi/Final-Project

NYC CitiBike availability 24 X 7

Table of Contents:

Introduction

This project evaluates bike availability trends over a 24-hour period and seeks to predict bike availability for the NYC Citi Bike bike share system. The main sections of the project cover a clustering technique used to sort maximum bike availability across hours, and four machine learning models (Logistic Regression, Random Forest Classifier, Linear SVC, and KNeighbors Classifier) that are compared to determine which best fits the dataset. Comparative evaluation of the models indicates that the Random Forest and SVC models outperformed the Logistic Regression and K-Neighbors models.
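The four-way model comparison described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the data is synthetic (generated with `make_classification`), and the feature count, train/test split, scaling steps, and hyperparameters are all assumptions chosen only to make the example run.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in data (assumption): the project's real features
# and availability labels are not reproduced here.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The four classifiers named in the introduction. Scaling is added for the
# distance- and margin-based models; hyperparameters are illustrative only.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Linear SVC": make_pipeline(
        StandardScaler(), LinearSVC(max_iter=10000, random_state=0)
    ),
    "KNeighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
}

# Fit each model on the training split and record held-out accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

# Print models from highest to lowest test accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Accuracy on a single split is only one way to rank models; the ranking on synthetic data says nothing about the ranking on the Citi Bike dataset itself.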

Technologies:

This project was created with:

  • Python for scripting.
  • SQLAlchemy
  • PostgreSQL database for backend storage.
  • AWS to read and write intermediate data coming in from the API calls.
  • Python libraries used: scikit-learn, pandas, NumPy, Matplotlib, and Seaborn.