# Testing the Export Functionality

In [2]:
import pandas as pd
import numpy as np
import datetime

In [6]:
bike = pd.read_csv("BikeData.csv")
bike.head(5)

Unnamed: 0,Date,Rented Bike Count,Hour,Temperature (C),Humidity (%),Wind speed (m/s),Visibility (10m),Dew point temperature (C),Solar Radiation (MJ/m2),Rainfall (mm),Snowfall (cm),Seasons,Holiday,Functioning Day
0,01/12/2017,254,0,-5.2,37,2.2,2000,-17.6,0.0,0.0,0.0,Winter,No Holiday,Yes
1,01/12/2017,204,1,-5.5,38,0.8,2000,-17.6,0.0,0.0,0.0,Winter,No Holiday,Yes
2,01/12/2017,173,2,-6.0,39,1.0,2000,-17.7,0.0,0.0,0.0,Winter,No Holiday,Yes
3,01/12/2017,107,3,-6.2,40,0.9,2000,-17.6,0.0,0.0,0.0,Winter,No Holiday,Yes
4,01/12/2017,78,4,-6.0,36,2.3,2000,-18.6,0.0,0.0,0.0,Winter,No Holiday,Yes


In [8]:
bike_preds = pd.read_csv("prediction_data.csv")
bike_preds.head(5)

Unnamed: 0,Rented Bike Count,Hour,Temperature (C),Humidity (%),Seasons,Holiday,Functioning Day,Precipitation,Day,Month,Year
0,254,0,-5.2,37,1,0,1,0,1,12,2017
1,204,1,-5.5,38,1,0,1,0,1,12,2017
2,173,2,-6.0,39,1,0,1,0,1,12,2017
3,107,3,-6.2,40,1,0,1,0,1,12,2017
4,78,4,-6.0,36,1,0,1,0,1,12,2017


## Modifying the data for the predictions

The data visualization portion doesn't require as much modification, since I can simply choose which columns to use without converting their data types or transforming them with ColumnTransformer. I performed all the modifications it needs in the Visualizations notebook.

In [39]:
# Setting bike to the "BikeData.csv" data
bike = pd.read_csv("BikeData.csv")

# Splitting data into day, month, year columns and saving to bike2
bike2 = bike.Date.str.split("/", expand=True)
bike2.rename(columns = {0:'Day', 1:'Month', 2:'Year'}, inplace=True)

# Parsing the "Date" strings (dd/mm/yyyy) once, then storing them as date objects
dates = pd.to_datetime(bike["Date"], format="%d/%m/%Y")
bike["Date"] = dates.dt.date

# Converting the season strings to int (any value not listed would become NaN)
bike["Seasons"] = bike["Seasons"].map({"Winter": 1, "Spring": 2, "Summer": 3, "Autumn": 4})

# Converting the holiday strings to int
bike["Holiday"] = bike["Holiday"].map({"No Holiday": 0, "Holiday": 1})

# Converting the functioning day strings to int
bike["Functioning Day"] = bike["Functioning Day"].map({"Yes": 1, "No": 0})

# Combining the "Date" and "Hour" data from each row into one cell in the "Datetime" column
bike["Datetime"] = dates + pd.to_timedelta(bike["Hour"], unit="h")

# Adding Precipitation indicator column: 1 if either "Rainfall (mm)" or "Snowfall (cm)" is greater than 0, else 0
bike["Precipitation"] = np.where((bike["Rainfall (mm)"] > 0) | (bike["Snowfall (cm)"] > 0), 1, 0)

# Combining bike and bike2 into one DataFrame on axis=1 (columns)
bike = pd.concat([bike, bike2], axis=1)

# Saving bike with the new "Datetime", "Precipitation", "Day", "Month" and "Year" columns
bike.to_csv("BikeDataExpanded.csv", index=False)
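The "Datetime" and "Precipitation" steps in the cell above can be checked on a few rows before running them on the full file. This is a minimal sketch: the three-row `sample` frame is made up for illustration and only borrows the column names from BikeData.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row sample using the same column names as BikeData.csv
sample = pd.DataFrame({
    "Date": ["01/12/2017", "01/12/2017", "02/12/2017"],
    "Hour": [0, 23, 5],
    "Rainfall (mm)": [0.0, 1.5, 0.0],
    "Snowfall (cm)": [0.0, 0.0, 0.3],
})

# Parse dd/mm/yyyy explicitly so 01/12/2017 is read as 1 December, not 12 January
dates = pd.to_datetime(sample["Date"], format="%d/%m/%Y")

# Add the hour of day as a timedelta to get one timestamp per row
sample["Datetime"] = dates + pd.to_timedelta(sample["Hour"], unit="h")

# 1 if there was any rain or snow in that hour, otherwise 0
sample["Precipitation"] = np.where(
    (sample["Rainfall (mm)"] > 0) | (sample["Snowfall (cm)"] > 0), 1, 0
)

print(sample[["Datetime", "Precipitation"]])
```

Passing an explicit `format` matters here because the dates are day-first, so leaving pandas to guess could swap day and month for the early days of each month.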

In [40]:
# Dropping unnecessary columns
bike_predict = bike.drop(
    ["Date", "Wind speed (m/s)", "Visibility (10m)", "Dew point temperature (C)",
     "Solar Radiation (MJ/m2)", "Rainfall (mm)", "Snowfall (cm)", "Datetime"],
    axis=1,
)
bike_predict.head()
bike_predict.to_csv("prediction_data.csv", index=False)
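Since this notebook is about testing the export, a quick round-trip check may be useful: write a frame to CSV, read it back, and compare. The sketch below is an illustration under assumptions, not part of the original notebook. The two-row `sample` frame and the file name `prediction_data_sample.csv` are hypothetical stand-ins, and a temporary directory is used so the real `prediction_data.csv` is not touched.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for bike_predict: two rows with a subset of its columns.
# Day/Month/Year are strings here because they come from str.split in the notebook.
sample = pd.DataFrame({
    "Rented Bike Count": [254, 204],
    "Hour": [0, 1],
    "Temperature (C)": [-5.2, -5.5],
    "Seasons": [1, 1],
    "Holiday": [0, 0],
    "Functioning Day": [1, 1],
    "Precipitation": [0, 0],
    "Day": ["01", "01"],
    "Month": ["12", "12"],
    "Year": ["2017", "2017"],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "prediction_data_sample.csv")
    sample.to_csv(path, index=False)   # same call as in the notebook
    reloaded = pd.read_csv(path)

# Column names and order should survive the round trip
same_columns = list(reloaded.columns) == list(sample.columns)
print("columns preserved:", same_columns)

# The zero-padded strings are parsed back as integers by read_csv
print(reloaded[["Day", "Month", "Year"]].dtypes)
```

One thing this makes visible is that the "Day", "Month" and "Year" strings produced by `str.split` come back as integers after `read_csv`, which is consistent with the `prediction_data.csv` preview earlier in the notebook.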