# Using Stochastic Gradient Descent On Linear Regression To Predict The Best Time To Go To The Gym

## 1. Introduction

We will use the "Crowdedness at the Campus Gym" dataset and scikit-learn's `SGDRegressor`, a linear regression model trained with stochastic gradient descent, to predict how many people are at the gym at a given time. With those predictions we can pick the best time to go and avoid the crowds.
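
To make the idea behind SGD-based linear regression concrete, here is a minimal NumPy sketch of the per-sample update rule for squared-error loss. It is not scikit-learn's implementation: `SGDRegressor` also applies regularization and a learning-rate schedule by default. The function name `sgd_linear_regression`, its parameters, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def sgd_linear_regression(X, y, lr=0.01, epochs=50, seed=0):
    """Fit y ~ X @ w + b by updating the parameters one sample at a time."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        # Visit the samples in a fresh random order on every pass
        for i in rng.permutation(n_samples):
            error = X[i] @ w + b - y[i]   # prediction error for one sample
            w -= lr * error * X[i]        # gradient step of 0.5 * error**2 w.r.t. w
            b -= lr * error               # gradient step of 0.5 * error**2 w.r.t. b
    return w, b


# Synthetic, noise-free data (not the gym dataset): y = 3*x0 - 2*x1 + 5
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0

w, b = sgd_linear_regression(X, y)
print(w, b)
```

Because each update uses a single observation, the cost per step does not grow with the size of the dataset, which is the main appeal of SGD on large tables.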

The dataset contains 11 columns with the following information.

* `number_people`: the number of people at the gym at the time of the observation. This is our target variable (label).
* `date`: a string with the date, time, and UTC offset of the observation.
* `timestamp`: an integer (int), the number of seconds since the start of the day (00:00).
* `day_of_week`: an integer (int) from 0 to 6, where 0 is Monday and 6 is Sunday.
* `is_weekend`: a Boolean flag indicating whether the observation happened during a weekend. 1 for yes, 0 for no.
* `is_holiday`: a Boolean flag indicating whether the observation happened during a holiday. 1 for yes, 0 for no.
* `temperature`: a float, the temperature in degrees Fahrenheit on the day of the observation.
* `is_start_of_semester`: a Boolean flag indicating whether the observation happened in the first 2 weeks of a semester. 1 for yes, 0 for no.
* `is_during_semester`: a Boolean flag indicating whether the observation happened during the active semester. 1 for yes, 0 for no.
* `month`: an integer (int) from 1 to 12, where 1 is January and 12 is December.
* `hour`: an integer (int), the hour of the day from 0 to 23.
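
Several of these columns can be derived from `date`, which is worth keeping in mind when choosing features. The sketch below checks that relationship on two observations from this dataset (2015-08-14 at 17:00:11 and 17:20:14, local time). The variable names are illustrative, and the check assumes the relationship holds for every row in the same way.

```python
import pandas as pd

# Two observations copied from the dataset
sample = pd.DataFrame({
    "date": pd.to_datetime([
        "2015-08-14 17:00:11-07:00",
        "2015-08-14 17:20:14-07:00",
    ]),
    "timestamp": [61211, 62414],
})

# Seconds since local midnight, rebuilt from the date column
local = sample["date"]
seconds = local.dt.hour * 3600 + local.dt.minute * 60 + local.dt.second

# 0 = Monday ... 6 = Sunday, the same convention as day_of_week
weekday = local.dt.dayofweek

print(seconds.tolist())  # 17 * 3600 + 0 * 60 + 11 = 61211 for the first row
print(weekday.tolist())  # 2015-08-14 was a Friday, i.e. 4
```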

## 2. Import the Libraries and Load the Data

```python
# Load the relevant libraries for the project
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline
```

```python
# Read the data into a dataframe, parsing the date column as datetimes
gym_data = pd.read_csv('crowdness_gym_data.csv', parse_dates=['date'])

# Preview the first 5 rows of the dataframe
gym_data.head()
```

| Unnamed: 0 | number_people | date | timestamp | day_of_week | is_weekend | is_holiday | temperature | is_start_of_semester | is_during_semester | month | hour |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 37 | 2015-08-14 17:00:11-07:00 | 61211 | 4 | 0 | 0 | 71.76 | 0 | 0 | 8 | 17 |
| 1 | 45 | 2015-08-14 17:20:14-07:00 | 62414 | 4 | 0 | 0 | 71.76 | 0 | 0 | 8 | 17 |
| 2 | 40 | 2015-08-14 17:30:15-07:00 | 63015 | 4 | 0 | 0 | 71.76 | 0 | 0 | 8 | 17 |
| 3 | 44 | 2015-08-14 17:40:16-07:00 | 63616 | 4 | 0 | 0 | 71.76 | 0 | 0 | 8 | 17 |
| 4 | 45 | 2015-08-14 17:50:17-07:00 | 64217 | 4 | 0 | 0 | 71.76 | 0 | 0 | 8 | 17 |
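
The `Unnamed: 0` column in the output above is not one of the 11 documented columns. A likely explanation, assumed here rather than confirmed, is that the CSV was saved together with its row index, so the first header field is empty. The sketch below reproduces that situation with a small in-memory CSV (a stand-in for the real file) and shows that passing `index_col=0` to `pd.read_csv` keeps the extra column out of the features.

```python
import io

import pandas as pd

# Stand-in for a CSV written with its index: note the empty first header field
csv_text = """,number_people,date,timestamp
0,37,2015-08-14 17:00:11-07:00,61211
1,45,2015-08-14 17:20:14-07:00,62414
"""

# Default read: the index column shows up as a regular column named "Unnamed: 0"
with_extra = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Treat the first column as the row index instead
without_extra = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=["date"])

print(with_extra.columns.tolist())
print(without_extra.columns.tolist())
```

If the dataframe is already loaded, `gym_data.drop(columns="Unnamed: 0")` should give the same result.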
