# COVID-19 Survey Linear Regression

Let's fit multivariable linear regression models to the COVID-19 survey data. We can add as many input variables as we want. Here we choose step count, stress level, and exercise as inputs, and fit one model for each output: sleep latency, sleep time, and number of wakes. After fitting, each model can predict its output for any combination of the input values.
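
As a minimal sketch of what such a fit does, the block below trains `LinearRegression` on synthetic, noise-free data with three inputs and checks that the known coefficients are recovered. The data, `true_coef`, and `true_intercept` are made up for illustration and have nothing to do with the survey.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (NOT the survey data): 200 samples, 3 inputs
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))

# Hypothetical ground truth: y = 2*x1 - 1*x2 + 0.5*x3 + 4
true_coef = np.array([2.0, -1.0, 0.5])
true_intercept = 4.0
y = X @ true_coef + true_intercept

# Ordinary least squares with an intercept term
model = LinearRegression().fit(X, y)

# One coefficient per input column, plus a single intercept
print('Coefficients:', model.coef_)
print('Intercept:', model.intercept_)
```

With real, noisy data the fitted coefficients are only estimates, so they will not match any underlying relationship this exactly.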

In [2]:
from sklearn.linear_model import LinearRegression
import numpy as np
import pandas as pd
import pickle

# Choose which variables to include in each analysis
input_variables = ['Steps', 'Stress', 'Exercise']

# Number of input features per model
n = len(input_variables)

# Load preprocessed data
filename = './data/covid_data_preprocessed.csv'
df = pd.read_csv(filename)
df_latency = df[input_variables + ['Latency']]
df_sleeptime = df[input_variables + ['Sleeptime (h)']]
df_wakes = df[input_variables + ['Wakes']]

# Fit the models
x = df_latency.drop('Latency', axis = 1)
y = df_latency['Latency']
model_latency = LinearRegression().fit(x, y)

x = df_sleeptime.drop('Sleeptime (h)', axis = 1)
y = df_sleeptime['Sleeptime (h)']
model_sleeptime = LinearRegression().fit(x, y)

x = df_wakes.drop('Wakes', axis = 1)
y = df_wakes['Wakes']
model_wakes = LinearRegression().fit(x, y)

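# Define a helper for single-sample predictions
def predict(x, model, n = 1):
    # Reshape the inputs into rows of n features (one row for one sample)
    x_tmp = np.array(x).reshape(-1, n)
    # Keep only the prediction for the first row
    y_pred = model.predict(x_tmp)[0]
    # Enforce that the result is non-negative
    if y_pred >= 0:
        return y_pred
    else:
        return 0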
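
Note that the `predict` helper returns only the first prediction, so it handles one sample at a time. If several samples need to be scored at once, a batch variant could look like the sketch below. `predict_batch` and the toy model are hypothetical names introduced here for illustration and are not part of the notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def predict_batch(rows, model, n=1):
    """Hypothetical batch variant: one non-negative prediction per row."""
    x_tmp = np.asarray(rows, dtype=float).reshape(-1, n)
    # Clip negative predictions to zero, element-wise
    return np.maximum(model.predict(x_tmp), 0.0)

# Toy model with a known relationship: y = x1 - x2
toy_X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
toy_y = toy_X[:, 0] - toy_X[:, 1]
toy_model = LinearRegression().fit(toy_X, toy_y)

# The second row would give a negative raw prediction and is clipped to 0
preds = predict_batch([[3, 1], [1, 3]], toy_model, n=2)
print(preds)
```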

Make some predictions to test the models.

In [4]:
# Predict sleep latency, sleep time, and wakes from steps, stress, and exercise
steps = 2000
stress_level = 3
exercise = 1

x = [steps, stress_level, exercise]
pred_latency = predict(x, model_latency, n)
pred_sleeptime = predict(x, model_sleeptime, n)
pred_wakes = predict(x, model_wakes, n)

print('Predicted sleep time in hours: ', round(pred_sleeptime, 2))
print('Predicted sleep latency in minutes: ', round(pred_latency, 2))
print('Predicted number of wakes: ', int(round(pred_wakes, 0)))

Predicted sleep time in hours:  7.92
Predicted sleep latency in minutes:  21.39
Predicted number of wakes:  1
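
Because the models were fitted on a DataFrame with named columns, recent scikit-learn versions typically emit a feature-name warning when predicting from a plain list or NumPy array. One way to avoid it, sketched below on made-up numbers, is to pass a one-row DataFrame with the same column names. The `toy` data and the relationship behind it are assumptions for illustration only.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

cols = ['Steps', 'Stress', 'Exercise']

# Made-up training data (NOT the survey data)
toy = pd.DataFrame({
    'Steps':    [1000, 4000, 8000, 12000, 6000],
    'Stress':   [1, 4, 2, 5, 3],
    'Exercise': [0, 1, 0, 1, 1],
})
# Hypothetical noise-free relationship for the target
toy['Latency'] = 30 - 0.001 * toy['Steps'] + 2 * toy['Stress'] - 3 * toy['Exercise']

toy_model = LinearRegression().fit(toy[cols], toy['Latency'])

# A one-row DataFrame keeps the feature names aligned with training
sample = pd.DataFrame([[2000, 3, 1]], columns=cols)
toy_pred = toy_model.predict(sample)[0]
print(round(toy_pred, 2))
```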




Save the fitted models with pickle so they can be reused without refitting.

In [None]:
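import os

# Create the output directory if it does not exist yet
os.makedirs('./models', exist_ok=True)

with open('./models/model_latency.pkl', 'wb') as f: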
    pickle.dump(model_latency, f)

with open('./models/model_sleeptime.pkl', 'wb') as f:
    pickle.dump(model_sleeptime, f)

with open('./models/model_wakes.pkl', 'wb') as f:
    pickle.dump(model_wakes, f)
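
To reuse a saved model later, it can be read back with `pickle.load`. The sketch below shows the round trip on a toy model written to a temporary directory, so it runs on its own. In the notebook the path would instead be one of the `./models/*.pkl` files above. Only unpickle files from a trusted source, and preferably load them with the same scikit-learn version that saved them.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.linear_model import LinearRegression

# Toy model with a known relationship: y = 2*x + 1
toy_model = LinearRegression().fit(
    np.array([[0.0], [1.0], [2.0]]),
    np.array([1.0, 3.0, 5.0]),
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / 'model_toy.pkl'  # hypothetical file name

    # Save, then load again in binary mode
    with open(path, 'wb') as f:
        pickle.dump(toy_model, f)
    with open(path, 'rb') as f:
        restored = pickle.load(f)

# The restored model predicts exactly like the original
restored_pred = restored.predict(np.array([[3.0]]))[0]
print(round(restored_pred, 2))
```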