## Section 1. Import Libraries

### Description
---
This section includes code to import the following libraries into Python.
- #### [NumPy][Numpyid] is a package that provides arrays and high-level mathematical functions
[numpyid]: https://numpy.org/devdocs/user/absolute_beginners.html "Numpy Docs"
- #### [Pandas][id] is a package that handles data structures for data analysis 
[id]: https://pandas.pydata.org/docs/getting_started/overview.html "Pandas Docs"
- #### [datetime][datetimeid] is a standard-library module that provides classes for manipulating dates and times
[datetimeid]: https://docs.python.org/3/library/datetime.html "datetime docs"

In [1]:
# Importing libraries
import numpy as np
import pandas as pd 
import datetime as dt

### Summary
---
This code uses the import statement to import **numpy**, **pandas**, and **datetime**, and assigns them the aliases **np**, **pd**, and **dt**, respectively.

- Python's [import statement](https://docs.python.org/3/reference/import.html "import docs") searches for the named module, then binds it to a name in the local scope. In this case, that name is np, pd, or dt.
- If the module is not found, Python raises a ModuleNotFoundError. This likely means the module either needs to be installed using [pip](https://docs.python.org/3/installing/index.html "how to install with python") or the name has a typo.
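
The failure case described above can be sketched with the standard-library importlib module. The helper name try_import and the deliberately misspelled module name are made up for illustration; they are not part of the notebook.

```python
import importlib


def try_import(module_name):
    """Return the imported module, or None if Python cannot find it."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError:
        # Raised when the module is not installed or the name is misspelled.
        return None


# datetime ships with Python, so this import succeeds.
dt_module = try_import("datetime")

# Hypothetical misspelling, used only to trigger the error path.
missing_module = try_import("datetimee_not_a_real_module")

print(dt_module is not None)
print(missing_module is None)
```

If the second call returned None for a package you expected to have, installing it with pip (or correcting the spelling) would be the usual next step.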




---
---

## Section 2. Import Data

### Description
---
This section imports the weather data into a pandas DataFrame named **df_weather**. 

In [2]:
# Importing data
df_weather = pd.read_csv("data/weather.csv", index_col=None, header=0)

### Summary
---
- This code creates a [DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html "DataFrame Docs"), df_weather, via the [read_csv](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html "read_csv docs") function in pandas to import a csv file, weather.csv, from the data directory. 
- The read_csv function takes many optional arguments, but in this case it uses:<br> 
    - **filepath_or_buffer**: Accepts any string path, path object, or file-like object with a read() method. Here it is the string "data/weather.csv".<br> 
    - **index_col**: Denotes the column(s) to use as row labels. If a sequence of columns is given, a MultiIndex will be formed. In this case, index_col = None, which signifies that no column is used as the index, so pandas assigns a default integer index.<br>
    - **header**: Indicates the row number that contains the column labels and marks the start of the data. In this case, header = 0, so the first line of the file supplies the column names. 
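
To make the effect of index_col concrete, here is a minimal sketch that reads a tiny in-memory CSV instead of data/weather.csv. The two sample rows are copied from the output shown later in this notebook; the variable names are assumptions for illustration.

```python
import io

import pandas as pd

# A two-row stand-in for weather.csv, held in memory.
csv_text = "Date,MaxTemp,MinTemp\n3/1/17,68.48,51.69\n3/2/17,69.75,36.0\n"

# index_col=None: no column becomes the index, so rows are labeled 0, 1, ...
df_default = pd.read_csv(io.StringIO(csv_text), index_col=None, header=0)

# index_col=0: the first column (Date) becomes the row labels instead.
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col=0, header=0)

print(list(df_default.columns))
print(list(df_indexed.columns))
```

In both calls header=0 takes the column names from the first line of the text.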

---
---

## Section 3. View Data

### Description
---
This section looks at the data in the **df_weather** DataFrame. <br>

In [3]:
# Viewing the top 5 rows of data
df_weather.head(5)


Unnamed: 0,Date,MaxTemp,MaxTime,MinTemp,MinTime,cloudCover,precipProbability,precipAmount,sunriseTime,sunsetTime,windSpeed
0,3/1/17,68.48,3/1/17 5:00,51.69,2/28/17 9:18,0.96,0.91,0.0184,2/28/17 12:25,2/28/17 23:46,6.2
1,3/2/17,69.75,3/1/17 10:30,36.0,3/2/17 5:00,0.53,1.0,0.2422,3/1/17 12:24,3/1/17 23:47,9.69
2,3/3/17,51.93,3/2/17 21:49,30.87,3/2/17 11:39,0.18,0.36,0.0028,3/2/17 12:23,3/2/17 23:48,4.63
3,3/4/17,44.03,3/3/17 21:51,26.49,3/3/17 14:06,0.14,0.7,0.0017,3/3/17 12:21,3/3/17 23:49,2.26
4,3/5/17,65.98,3/4/17 21:24,27.4,3/4/17 10:06,0.03,0.0,0.0,3/4/17 12:20,3/4/17 23:50,4.69


In [4]:
# Viewing a sample of 7 rows of df_weather
df_weather.sample(7)

Unnamed: 0,Date,MaxTemp,MaxTime,MinTemp,MinTime,cloudCover,precipProbability,precipAmount,sunriseTime,sunsetTime,windSpeed
11,3/12/17,34.76,3/11/17 20:33,26.2,3/11/17 11:24,0.8,0.0,0.0,3/11/17 12:10,3/11/17 23:57,5.29
13,3/14/17,37.77,3/13/17 22:24,29.11,3/13/17 10:45,0.7,0.85,0.0371,3/13/17 12:07,3/13/17 23:59,2.42
30,3/31/17,77.21,3/30/17 19:21,55.29,3/31/17 4:00,0.68,0.81,0.1021,3/30/17 11:41,3/31/17 0:14,7.25
48,4/18/17,68.7,4/17/17 16:57,57.82,4/18/17 2:12,0.88,0.91,0.0138,4/17/17 11:15,4/18/17 0:30,2.53
58,4/28/17,59.95,4/27/17 23:39,49.54,4/28/17 3:18,0.75,1.0,0.1001,4/27/17 11:02,4/28/17 0:39,5.66
23,3/24/17,55.7,3/23/17 22:51,31.01,3/23/17 10:48,0.48,0.91,0.0209,3/23/17 11:51,3/24/17 0:08,3.69
60,4/30/17,86.36,4/29/17 21:48,66.1,4/29/17 7:13,0.74,1.0,0.1351,4/29/17 11:00,4/30/17 0:41,5.98


In [5]:
# View the bottom 2 rows of df_weather
df_weather.tail(2)

Unnamed: 0,Date,MaxTemp,MaxTime,MinTemp,MinTime,cloudCover,precipProbability,precipAmount,sunriseTime,sunsetTime,windSpeed
60,4/30/17,86.36,4/29/17 21:48,66.1,4/29/17 7:13,0.74,1.0,0.1351,4/29/17 11:00,4/30/17 0:41,5.98
61,,,,,,,,0.178,,,


---
---

### Summary
---
This code provides methods to view the data within the df_weather DataFrame. 
- The first line, **df_weather.head(5)**, uses the [.head(n)](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.head.html "head method docs") method to return the first n rows of the df_weather DataFrame. The default value for n is 5, but I chose to include it explicitly to show its functionality.<br>
- The second line, **df_weather.sample(7)**, uses the [.sample(n)](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sample.html#pandas.DataFrame.sample "sample method docs") method to retrieve a random sample of 7 rows from df_weather. The default value of n is 1. This code allows you to view rows of the data that are not necessarily at the start or end of the table. 
- The third line, **df_weather.tail(2)**, uses the [.tail(n)](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.tail.html#pandas.DataFrame.tail "tail method docs") method to retrieve the last n rows from df_weather. The default value of n is 5. In this case, the last row, 61, appears to have dirty data, as most of the values are NaN. 
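
The three viewing methods, plus one possible way to flag a mostly empty row like row 61, can be sketched on a small stand-in DataFrame. The frame demo, the random_state value, and the threshold of two missing values are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in data: the last row is mostly empty, like row 61 of df_weather.
demo = pd.DataFrame(
    {
        "MaxTemp": [68.48, 69.75, 51.93, np.nan],
        "MinTemp": [51.69, 36.0, 30.87, np.nan],
        "precipAmount": [0.0184, 0.2422, 0.0028, 0.178],
    }
)

first_two = demo.head(2)                 # first n rows
last_one = demo.tail(1)                  # last n rows
picked = demo.sample(n=2, random_state=0)  # random rows; seed makes it repeatable

# Count missing values per row and keep rows with at least two of them.
missing_per_row = demo.isna().sum(axis=1)
mostly_empty = demo[missing_per_row >= 2]

print(mostly_empty)
```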

---
---


## Section 4. Variables & Fahrenheit to Celsius Conversion

### Description
---
This section converts a Fahrenheit value into a Celsius value using variables and a formula, then prints it out. 

In [6]:
# Variable for Fahrenheit
f = 32

In [7]:
# Variable for Celsius
c = 0

In [8]:
# Display the fahrenheit and celsius variables
print(f)
print(c)

32
0


In [9]:
# Create a formula that converts f to c. 
f = 86
c = int((f-32)*5/9) # Converted c to an integer to match hw guideline. To display decimal, I would remove int()
print(c)

30


In [10]:
# Print a statement and convert f and c to str type

print(f"Fahrenheit {str(f)} converts to {str(c)} Celsius")

Fahrenheit 86 converts to 30 Celsius


### Summary
---
- The first line, f = 32, declares an integer variable, f, and assigns the value 32 to it. <br>
- The second line, c = 0, establishes an integer value of 0 for the variable, c. <br>
- The third code block prints both values to show they were properly assigned. <br>
- The fourth code block assigns a new value of 86 to f, then reassigns c to the result of the formula (f-32)*5/9. The formula produces a float, so int() truncates it to an integer. The value of c is then printed.<br> 
- The fifth and final code block uses an f-string in a print statement to report the conversion performed in the fourth code block. 
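
As a small sketch of the same formula, the conversion can be wrapped in a function to compare int(), which truncates toward zero, with round(). The function name fahrenheit_to_celsius is an assumption, not something defined in the notebook.

```python
def fahrenheit_to_celsius(fahrenheit):
    """Convert a Fahrenheit temperature to Celsius as a float."""
    return (fahrenheit - 32) * 5 / 9


exact = fahrenheit_to_celsius(86)                 # divides evenly
truncated = int(fahrenheit_to_celsius(100))       # int() drops the fraction
rounded = round(fahrenheit_to_celsius(100))       # round() goes to nearest integer
truncated_negative = int(fahrenheit_to_celsius(0))  # truncation moves toward zero

print(f"Fahrenheit 86 converts to {exact} Celsius")
print(truncated, rounded, truncated_negative)
```

For values that do not divide evenly, int() and round() can disagree, which is worth keeping in mind if the decimal part matters.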


---
---

## Section 5. Convert Objects to Dates using datetime

### Description
---
This section converts four columns from objects to datetime values.

In [11]:
# Convert MaxTime, MinTime, sunriseTime, and sunsetTime from objects to datetime
df_weather['MaxTime'] = pd.to_datetime(df_weather['MaxTime'], format="%m/%d/%y %H:%M")
df_weather['MinTime'] = pd.to_datetime(df_weather['MinTime'], format="%m/%d/%y %H:%M")
df_weather['sunriseTime'] = pd.to_datetime(df_weather['sunriseTime'], format="%m/%d/%y %H:%M")
df_weather['sunsetTime'] = pd.to_datetime(df_weather['sunsetTime'], format="%m/%d/%y %H:%M")

### Summary
---
- This code uses the pandas [to_datetime](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html#pandas-to-datetime "to_datetime method docs") function to convert the MaxTime, MinTime, sunriseTime, and sunsetTime values from objects to datetime data types.<br>
- Passing format="%m/%d/%y %H:%M" tells to_datetime exactly how to parse each string, so it does not have to guess the format. According to the documentation, since version 2.0 to_datetime infers a single format from the first non-missing value when none is given, and falls back to dateutil parsing only if no format can be inferred. 
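
Since the same call is repeated for four columns, one possible variation is to loop over the column names. The sketch below uses a small stand-in DataFrame with two of the columns and one missing value; the names raw, time_columns, and time_format are assumptions for illustration.

```python
import pandas as pd

# Stand-in for the string columns in df_weather, including a missing value.
raw = pd.DataFrame(
    {
        "MaxTime": ["3/1/17 5:00", "3/1/17 10:30", None],
        "MinTime": ["2/28/17 9:18", "3/2/17 5:00", None],
    }
)

time_columns = ["MaxTime", "MinTime"]
time_format = "%m/%d/%y %H:%M"  # month/day/two-digit year, 24-hour time

# Apply the same explicit format to every listed column.
for column in time_columns:
    raw[column] = pd.to_datetime(raw[column], format=time_format)

print(raw.dtypes)
```

Missing entries come back as NaT (not-a-time), the datetime counterpart of NaN.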

## Section 6. Create additional Data Features

### Description
---
This section creates three columns, MaxDay, MaxMonth, and MaxDayName, from the MaxTime column. 

In [12]:
# Create day of week for MaxTime
df_weather['MaxDay'] = df_weather['MaxTime'].dt.dayofweek
# Create Month for MaxTime
df_weather['MaxMonth'] = df_weather['MaxTime'].dt.month
# Save day name for day of week
df_weather['MaxDayName'] = df_weather['MaxTime'].dt.day_name()

In [13]:
df_weather.head(5)

Unnamed: 0,Date,MaxTemp,MaxTime,MinTemp,MinTime,cloudCover,precipProbability,precipAmount,sunriseTime,sunsetTime,windSpeed,MaxDay,MaxMonth,MaxDayName
0,3/1/17,68.48,2017-03-01 05:00:00,51.69,2017-02-28 09:18:00,0.96,0.91,0.0184,2017-02-28 12:25:00,2017-02-28 23:46:00,6.2,2.0,3.0,Wednesday
1,3/2/17,69.75,2017-03-01 10:30:00,36.0,2017-03-02 05:00:00,0.53,1.0,0.2422,2017-03-01 12:24:00,2017-03-01 23:47:00,9.69,2.0,3.0,Wednesday
2,3/3/17,51.93,2017-03-02 21:49:00,30.87,2017-03-02 11:39:00,0.18,0.36,0.0028,2017-03-02 12:23:00,2017-03-02 23:48:00,4.63,3.0,3.0,Thursday
3,3/4/17,44.03,2017-03-03 21:51:00,26.49,2017-03-03 14:06:00,0.14,0.7,0.0017,2017-03-03 12:21:00,2017-03-03 23:49:00,2.26,4.0,3.0,Friday
4,3/5/17,65.98,2017-03-04 21:24:00,27.4,2017-03-04 10:06:00,0.03,0.0,0.0,2017-03-04 12:20:00,2017-03-04 23:50:00,4.69,5.0,3.0,Saturday


### Summary
---
This section used the pandas .dt accessor, which exposes datetime properties of a Series, to produce three new columns from MaxTime. Documentation for these attributes and methods is found [here](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DatetimeIndex.dayofweek.html "dt docs"). 
- The first line of code uses the dt.dayofweek attribute (Monday is 0 and Sunday is 6) to create the MaxDay column from the datetime column "MaxTime" and append it to the df_weather DataFrame. 
- The second line of code uses the dt.month attribute to create the MaxMonth column from the datetime column "MaxTime" and append it to the df_weather DataFrame.  
- The third line of code uses the dt.day_name() method to create a MaxDayName column from the datetime column "MaxTime" and append it to the df_weather DataFrame. 
- The last line of code displays the first five rows of df_weather. 
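
The three features can be checked on a short stand-in Series. The first two timestamps match MaxTime values from the output above, and the missing value mimics row 61; the variable names are assumptions for illustration.

```python
import pandas as pd

# Two real MaxTime values plus a missing one.
times = pd.Series(pd.to_datetime(["2017-03-01 05:00", "2017-03-04 21:24", None]))

day_numbers = times.dt.dayofweek   # Monday=0 ... Sunday=6
months = times.dt.month            # 1 through 12
day_names = times.dt.day_name()    # English day names

print(day_numbers.tolist())
print(day_names.tolist())
```

The missing timestamp yields a missing day number, which is likely why MaxDay and MaxMonth display as 2.0 and 3.0 rather than 2 and 3 in the output above.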

## Section 7. Create a new column


### Description
---
This section creates a new calculated column and adds it to the DataFrame. 

In [14]:
# Create TempRange column by subtracting MinTemp from MaxTemp
df_weather['TempRange'] = df_weather['MaxTemp'] - df_weather['MinTemp']

In [15]:
# Display the first 5 records
df_weather.head(5)

Unnamed: 0,Date,MaxTemp,MaxTime,MinTemp,MinTime,cloudCover,precipProbability,precipAmount,sunriseTime,sunsetTime,windSpeed,MaxDay,MaxMonth,MaxDayName,TempRange
0,3/1/17,68.48,2017-03-01 05:00:00,51.69,2017-02-28 09:18:00,0.96,0.91,0.0184,2017-02-28 12:25:00,2017-02-28 23:46:00,6.2,2.0,3.0,Wednesday,16.79
1,3/2/17,69.75,2017-03-01 10:30:00,36.0,2017-03-02 05:00:00,0.53,1.0,0.2422,2017-03-01 12:24:00,2017-03-01 23:47:00,9.69,2.0,3.0,Wednesday,33.75
2,3/3/17,51.93,2017-03-02 21:49:00,30.87,2017-03-02 11:39:00,0.18,0.36,0.0028,2017-03-02 12:23:00,2017-03-02 23:48:00,4.63,3.0,3.0,Thursday,21.06
3,3/4/17,44.03,2017-03-03 21:51:00,26.49,2017-03-03 14:06:00,0.14,0.7,0.0017,2017-03-03 12:21:00,2017-03-03 23:49:00,2.26,4.0,3.0,Friday,17.54
4,3/5/17,65.98,2017-03-04 21:24:00,27.4,2017-03-04 10:06:00,0.03,0.0,0.0,2017-03-04 12:20:00,2017-03-04 23:50:00,4.69,5.0,3.0,Saturday,38.58


### Summary
---
This section added a new "TempRange" column to the df_weather DataFrame by performing a calculation between two other columns. The code then runs the .head(5) method in order to verify the new column's presence at the end of the DataFrame.
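
As a minimal sketch of the same column arithmetic, the stand-in frame below reuses the first two rows of temperatures from the output above and adds a row of missing values. The name demo is an assumption for illustration.

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame(
    {
        "MaxTemp": [68.48, 69.75, np.nan],
        "MinTemp": [51.69, 36.0, np.nan],
    }
)

# Subtraction is applied row by row; a missing input gives a missing result.
demo["TempRange"] = demo["MaxTemp"] - demo["MinTemp"]

print(demo)
```

The new column is placed after the existing ones, which is why it can be reached with a column position of -1.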

## Section 8. Slice and Filter Data

### Description
---
This section slices the data to show just the first five values for MaxDay and MaxMonth. <br>
It then creates a new DataFrame using a slice and shows the descriptive statistics of the new DataFrame.

In [16]:
# Display the MaxDay column only 
df_weather[['MaxDay']].head()

Unnamed: 0,MaxDay
0,2.0
1,2.0
2,3.0
3,4.0
4,5.0


In [17]:
# Display the MaxMonth column only using iloc
df_weather.iloc[:,[-3]].head()

Unnamed: 0,MaxMonth
0,3.0
1,3.0
2,3.0
3,3.0
4,3.0


In [18]:
# Create a DataFrame named df_temp from column positions 1, 3, -1, and 6:
# MaxTemp, MinTemp, TempRange, and precipProbability (precipAmount is at position 7)
df_temp = df_weather.iloc[:, [1, 3, -1, 6]]

In [19]:
# Show descriptive statistics about df_temp
df_temp.describe()

Unnamed: 0,MaxTemp,MinTemp,TempRange,precipProbability
count,61.0,61.0,61.0,61.0
mean,65.899672,44.517869,21.381803,0.374754
std,13.68652,12.397472,7.482015,0.435797
min,29.53,16.39,8.22,0.0
25%,56.63,36.0,15.75,0.0
50%,68.7,45.7,20.89,0.0
75%,76.19,52.69,25.61,0.89
max,86.36,66.1,38.58,1.0


### Summary
---
This section uses various techniques to slice and filter the data, and creates a new DataFrame. 
- The first line of code selects a single column, 'MaxDay', with double brackets and uses the .head() method to view its first five rows.
- The second line of code uses the [iloc](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html "iloc docs") indexer to display the MaxMonth column of the df_weather DataFrame by position (-3, the third column from the end). 
- The third line of code creates a new DataFrame, df_temp, composed of the 4 columns at index numbers 1, 3, -1, and 6. As the output shows, these are MaxTemp, MinTemp, TempRange, and precipProbability; index 6 selects precipProbability rather than precipAmount, which sits at index 7. 
- The final line of code runs the [.describe()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.describe.html#pandas.Series.describe "describe docs") method, which provides summary statistics (count, mean, standard deviation, minimum, quartiles, and maximum) for each numeric column of the DataFrame. 
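
Because the describe() output above lists precipProbability, it can help to compare positional selection with selection by name. This is a hedged sketch on a one-row stand-in frame with the first eight column names of df_weather; the variable names are assumptions for illustration.

```python
import pandas as pd

columns = [
    "Date", "MaxTemp", "MaxTime", "MinTemp", "MinTime",
    "cloudCover", "precipProbability", "precipAmount",
]
demo = pd.DataFrame(
    [["3/1/17", 68.48, "3/1/17 5:00", 51.69, "2/28/17 9:18", 0.96, 0.91, 0.0184]],
    columns=columns,
)

# Positional selection: position 6 is precipProbability.
by_position = demo.iloc[:, [1, 3, 6]]

# Label-based selection names the columns directly, so positions cannot drift.
by_name = demo.loc[:, ["MaxTemp", "MinTemp", "precipAmount"]]

# Look up a column's position when iloc is still preferred.
position_of_amount = demo.columns.get_loc("precipAmount")

print(list(by_position.columns))
print(position_of_amount)
```

Selecting by name with loc is one way to avoid off-by-one column positions when the column order is not memorized.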