# Working Second Interview
*Author: Abuzar Noorali*

*Interviewer: James Turner*

*Interviewed Position: Traffic Management Center Analyst – City of Sugar Land*

*Date Assigned: 3/15/2019* *Due: 3/19/2019*

# Specifications

*Language:* **Python**

*Primary Software Library:* **Pandas**

*Third-Party Libraries used:* 

- *Requests*
- *Beautiful Soup 4*

# Purpose
- This script automates the gathering of historical weather data for **Sugar Land, Texas**.

# Web-Scraping from Weather.com

- The weather history for Sugar Land, Texas is scraped from [Weather.com](https://weather.com/weather/monthly/l/77498:4:US).

In [1]:
#-*- coding: utf-8 -*-

#Import and Read HTML Code from Weather.com:

import requests

r = requests.get('https://weather.com/weather/monthly/l/77498:4:US')
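
The request above assumes the page always loads. As a minimal sketch, the fetch could be wrapped so that a timeout or an HTTP error status fails loudly instead of handing an error page to the parser. The helper name `fetch_html` and its `get` parameter are hypothetical and are not part of the original script; `timeout` and `raise_for_status()` are standard Requests features.

```python
import requests

# URL used in this notebook
URL = 'https://weather.com/weather/monthly/l/77498:4:US'


def fetch_html(url, get=requests.get, timeout=10):
    """Return the page body as text, raising on network or HTTP errors.

    `get` is a hypothetical injection point so the function can be
    exercised without touching the network.
    """
    response = get(url, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx responses
    response.raise_for_status()
    return response.text
```

Calling `fetch_html(URL)` would then stand in for `requests.get(...).text` in the cells that follow.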

# Call Beautiful Soup 4 Library
- Import the Beautiful Soup 4 library and pass it the HTML text for parsing
- Search the resulting soup object by tag and attribute to locate specific variables (e.g., date, temperature)

In [2]:
#Call Beautiful Soup 4 Library
from bs4 import BeautifulSoup

#Assign Parsed text to 'soup' variable
soup = BeautifulSoup(r.text, 'html.parser')

# Find the Date 
- Find the Day

In [3]:
#Raw HTML Data: <div class="dayCell opaque" className="dayCell opaque"><div class="date">1</div>
results1 = soup.find_all('div', attrs={'class':'dayCell opaque'})

#Start with March 1st, 2019
day = results1[5]
day = day.find('div').text

print(day)

1


- Find the Month & Year

In [4]:
#Raw HTML Data: <option value="2019-03-16-11-11-00" selected>Mar 2019</option>
results = soup.find_all('option')

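#Take the month and year from the month dropdown so they match the month being viewed
#Note: year keeps its leading space (' 2019'); the date string built later relies on it
year = results[12].text[3:8]

#The option text abbreviates the month ('Mar'); appending 'ch' spells out 'March' (valid for March only)
month = results[12].text[0:3] + 'ch'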

print(month)
print(year)

March
 2019
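
Appending `'ch'` to the abbreviation only spells out March. A minimal sketch of a more general approach, assuming the option text always has the form `'Mar 2019'` as in the raw HTML quoted above, is to parse it with the standard library's `datetime`. The variable names here are illustrative only.

```python
from datetime import datetime

# Text of the selected <option>, as quoted in the raw HTML comment
option_text = 'Mar 2019'

# %b = abbreviated month name, %Y = four-digit year
parsed = datetime.strptime(option_text, '%b %Y')

month_name = parsed.strftime('%B')  # full month name
year_text = str(parsed.year)        # year without the leading space

# Example date string for the first day of the month
first_of_month = f'{month_name} 1, {year_text}'
```

The same two lines of parsing would work unchanged for any other month the dropdown offers.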


# Find the Location

- Find City and State 
- Find Zip Code

In [5]:
#Raw HTML Data: <h1>Sugar Land, TX (77498) Monthly Weather</h1>

#Take the location from the page heading so it matches the location that was searched
results = soup.find_all('h1')

location = results[0].text[0:15]

zip_code = results[0].text[16:21]

print(location)
print(zip_code)

Sugar Land, TX 
77498
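
Slicing by fixed offsets works for this heading but depends on the exact length of the city name. As a hedged alternative, a regular expression can pull the same pieces out of the heading text. The pattern below is an assumption based only on the single heading quoted above (`City, ST (ZIP) ...`).

```python
import re

# Heading text as quoted in the raw HTML comment
heading = 'Sugar Land, TX (77498) Monthly Weather'

# City (anything up to the comma), two-letter state, five-digit zip in parentheses
pattern = r'^(?P<city>.+?),\s*(?P<state>[A-Z]{2})\s*\((?P<zip_code>\d{5})\)'
match = re.match(pattern, heading)
if match is None:
    raise ValueError(f'Unexpected heading format: {heading!r}')

city_state = f"{match['city']}, {match['state']}"
zip_code_text = match['zip_code']
```

Unlike the `[0:15]` slice, this version does not leave a trailing space on the location string.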


# Find the Minimum Temperature

In [6]:
#Raw HTML Data: <div class="temp low"><span class="">47<sup>°</sup></span></div></div>
results2 = soup.find_all('div', attrs={'class':'temp low'})

#Start with March 1st, 2019
temp_low = results2[5]
temp_low.find('span').text

'47°'

# Find the Maximum Temperature

In [7]:
#Raw HTML Data: <div class="temps"><div class="temp hi"><span class="">66<sup>°</sup></span></div>
results3 = soup.find_all('div', attrs={'class':'temp hi'})

#Start with March 1st, 2019
temp_hi = results3[5]
temp_hi.find('span').text

'66°'

# Find the Average Temperature

- NOTE: *Weather.com does not report an average temperature. I instead compute it as the mean of the minimum and maximum temperatures.*

In [8]:
#Convert String objects to Int objects then find average

#Store the minimum temperature
min_temp = int(temp_low.find('span').text[0:2])

#Store the maximum temperature
max_temp = int(temp_hi.find('span').text[0:2])

#Calculate Average temp
average_temp = ((min_temp + max_temp)/2)


#Display average Temp
print((average_temp),'°')

56.5 °
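
The `[0:2]` slice assumes every temperature has exactly two digits, so a reading such as `100°` or `9°` would be cut short or fail to convert. A small sketch of a more tolerant parser follows; the function names are hypothetical and not part of the original script.

```python
import re


def parse_temperature(text):
    """Extract the integer part of a string such as '47°'."""
    match = re.search(r'-?\d+', text)
    if match is None:
        raise ValueError(f'No temperature found in {text!r}')
    return int(match.group())


def average_temperature(low_text, high_text):
    """Mean of the low and high temperatures, as defined in the note above."""
    return (parse_temperature(low_text) + parse_temperature(high_text)) / 2
```

With the March 1st values shown earlier, `average_temperature('47°', '66°')` gives the same 56.5 as the cell above.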


# Find the Precipitation

In [28]:
#"Precips":{"sevenDayPrecipIn":1.78,"sevenDayPrecipCm":4.52,"mtdPrecipIn":0,"mtdPrecipCm":0,"precip24In":0,"precip24Cm":0},

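#Attempts to read precipitation from the page's <script> tags (left disabled):
#results = soup.find_all('script')
#precip = results[10].contents
#results[10]
#precip
#for strong_tag in soup.find_all('span'):
#    print(strong_tag.text, strong_tag.next_sibling)

#Precipitation is hard-coded as a placeholder value
precip = '0 in'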
print(precip)

0 in
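
The precipitation value above is a placeholder. One possible way to read the embedded `"Precips"` object, sketched under the assumption that the script text contains the fragment exactly as quoted in the comment, is to isolate it with a regular expression and decode it with `json`. Which of these fields, if any, corresponds to a given calendar day is not established by the source, so this only shows the extraction step.

```python
import json
import re

# Fragment as quoted in the notebook comment; a real page would supply
# the full <script> text instead (assumption)
script_text = (
    '"Precips":{"sevenDayPrecipIn":1.78,"sevenDayPrecipCm":4.52,'
    '"mtdPrecipIn":0,"mtdPrecipCm":0,"precip24In":0,"precip24Cm":0},'
)

# Capture the flat JSON object that follows the "Precips" key
match = re.search(r'"Precips":(\{[^{}]*\})', script_text)
if match is None:
    raise ValueError('Precips object not found')

precips = json.loads(match.group(1))

# 24-hour precipitation in inches, formatted like the placeholder above
precip_text = f"{precips['precip24In']} in"
```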


# Store Findings in a Data Set

In [29]:
array = []

for i in range(5, 23):
    
    #Day
    day = results1[i].find('div').text
    
    #Minimum Temperature
    temp_low = results2[i].find('span').text[0:2]
    
    #Maximum Temperature
    temp_hi = results3[i].find('span').text[0:2]
    
    #Average Temperature
    min_temp = int(temp_low)
    max_temp = int(temp_hi)
    average_temp = (max_temp + min_temp) / 2
    
    #Month + Day + Year = Date
    date = (month+' '+day+','+year)
    
    #Add the Date, Location, Zip Code, Max Temperature, Min Temperature, Average Temperature, and Precipitation to the array
    array.append((date,location,zip_code,temp_hi,temp_low,average_temp,precip))
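
The loop above relies on three separate result lists staying aligned by index. A hedged alternative is to walk each day cell once and read its date and temperatures from inside that cell. The sample HTML below is an assumption assembled from the fragments quoted in the comments of this notebook; the real page's nesting may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup built from the raw HTML fragments quoted above
SAMPLE_HTML = """
<div class="dayCell opaque"><div class="date">1</div>
  <div class="temps">
    <div class="temp hi"><span class="">66<sup>°</sup></span></div>
    <div class="temp low"><span class="">47<sup>°</sup></span></div>
  </div>
</div>
<div class="dayCell opaque"><div class="date">2</div>
  <div class="temps">
    <div class="temp hi"><span class="">76<sup>°</sup></span></div>
    <div class="temp low"><span class="">59<sup>°</sup></span></div>
  </div>
</div>
"""

sample_soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')

rows = []
# CSS selectors match on individual classes, regardless of their order
for cell in sample_soup.select('div.dayCell.opaque'):
    day = cell.find('div', class_='date').get_text(strip=True)
    high = cell.select_one('div.temp.hi span').get_text(strip=True)
    low = cell.select_one('div.temp.low span').get_text(strip=True)
    rows.append((day, high, low))
```

Because each row is built from a single cell, a missing temperature in one cell cannot shift the values of the days after it.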

# Apply Table to Data using Pandas

In [30]:
import pandas as pd
data_frame = pd.DataFrame(array, columns=['Date', 'Location', 'Zip code', 'Temperature High', 'Temperature Low', 'Average Temperature', 'Precipitation']) 

In [75]:
display(data_frame)

| | Date | Location | Zip code | Temperature High | Temperature Low | Average Temperature | Precipitation |
|---|---|---|---|---|---|---|---|
| 0 | March 1, 2019 | Sugar Land, TX | 77498 | 66 | 47 | 56.5 | 0 in |
| 1 | March 2, 2019 | Sugar Land, TX | 77498 | 76 | 59 | 67.5 | 0 in |
| 2 | March 3, 2019 | Sugar Land, TX | 77498 | 67 | 44 | 55.5 | 0 in |
| 3 | March 4, 2019 | Sugar Land, TX | 77498 | 44 | 36 | 40.0 | 0 in |
| 4 | March 5, 2019 | Sugar Land, TX | 77498 | 50 | 31 | 40.5 | 0 in |
| 5 | March 6, 2019 | Sugar Land, TX | 77498 | 60 | 36 | 48.0 | 0 in |
| 6 | March 7, 2019 | Sugar Land, TX | 77498 | 73 | 54 | 63.5 | 0 in |
| 7 | March 8, 2019 | Sugar Land, TX | 77498 | 82 | 68 | 75.0 | 0 in |
| 8 | March 9, 2019 | Sugar Land, TX | 77498 | 82 | 69 | 75.5 | 0 in |
| 9 | March 10, 2019 | Sugar Land, TX | 77498 | 81 | 69 | 75.0 | 0 in |


# Export Table to a .CSV File

In [14]:
data_frame.to_csv('Weather_Data_Output.csv', index=False, encoding='utf-8')

# Bonus Task

- Find the forecasted high and low temperatures for the next 5 days

- Find the 5 forecast days

In [15]:
#Weather.com doesn't offer a 'forecast' for the current day, only real-time weather tracking
results4 = soup.find_all('div', attrs={'class':'dayCell opaque'})

#The last opaque cell is the most recent past day; adding 2 skips the current day and gives the first forecast day
future_day = results4[-1]
future_day = int(future_day.find('div').text)
print(future_day + 2)

20


- Find 5 forecasted high temperatures

In [16]:
#Future Hi Temp
#Forecasted highs sit at every second class-less <span>, starting at index 80
results5 = soup.find_all('span', attrs={'class':''})
results5 = results5[80].text[0:2]

print(results5)

73


- Find 5 forecasted low temperatures

In [17]:
#Future Low Temp
#Forecasted lows sit at every second class-less <span>, starting at index 81
results6 = soup.find_all('span', attrs={'class':''})
results6 = results6[81].text[0:2]

print(results6)

50


# Store Findings in a Data Set

In [71]:
#Last past day on the calendar (the final opaque cell)
results4 = soup.find_all('div', attrs={'class':'dayCell opaque'})
last_day = int(results4[-1].find('div').text)

#Forecast temperatures are stored in class-less <span> tags:
#highs at every second index starting from 80, lows at every second index starting from 81
spans = soup.find_all('span', attrs={'class':''})

array2 = []

for n in range(5):

    #Weather.com doesn't offer a 'forecast' for the current day, so the first forecast day is last_day + 2
    future_day = str(last_day + 2 + n)
    date = (month+' '+future_day+','+year)

    #Future Hi Temp
    future_temp_hi = spans[80 + 2*n].text[0:2]

    #Future Low Temp
    future_temp_low = spans[81 + 2*n].text[0:2]

    #Add the Date, Location, Zip Code, Forecasted High Temp, Forecasted Low Temp to the array
    array2.append((date, location, zip_code, future_temp_hi, future_temp_low))
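
Adding an offset to the day number, as the loop above does, works in mid-month but would produce a date such as "March 35" near the end of a month. A minimal sketch using `datetime.date` and `timedelta` handles the rollover. The function name and its parameters are hypothetical; the two-day skip mirrors the loop's assumption that the current day has no forecast.

```python
from datetime import date, timedelta


def forecast_labels(last_past_day, days=5, skip=2):
    """Return date labels for the forecast days after `last_past_day`.

    `skip=2` jumps over the current day, which has no forecast.
    """
    labels = []
    for n in range(days):
        d = last_past_day + timedelta(days=skip + n)
        labels.append(f'{d:%B} {d.day}, {d.year}')
    return labels


# March 18, 2019 is the last past day in this notebook's run
march_labels = forecast_labels(date(2019, 3, 18))
```

Starting from March 29 instead, the same function rolls over into April rather than counting past the 31st.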

# Apply Table to Data using Pandas

In [72]:
import pandas as pd
data_frame2 = pd.DataFrame(array2, columns=['Date', 'Location', 'Zip code','Forecasted High Temperature', 'Forecasted Low Temperature'])

In [73]:
#Concatenate both Data Frames into one Data Frame 
frames = [data_frame, data_frame2]
Fresult = pd.concat(frames,sort=False)
#Display the new Data Frame
display(Fresult)

| | Date | Location | Zip code | Temperature High | Temperature Low | Average Temperature | Precipitation | Forecasted High Temperature | Forecasted Low Temperature |
|---|---|---|---|---|---|---|---|---|---|
| 0 | March 1, 2019 | Sugar Land, TX | 77498 | 66.0 | 47.0 | 56.5 | 0 in | | |
| 1 | March 2, 2019 | Sugar Land, TX | 77498 | 76.0 | 59.0 | 67.5 | 0 in | | |
| 2 | March 3, 2019 | Sugar Land, TX | 77498 | 67.0 | 44.0 | 55.5 | 0 in | | |
| 3 | March 4, 2019 | Sugar Land, TX | 77498 | 44.0 | 36.0 | 40.0 | 0 in | | |
| 4 | March 5, 2019 | Sugar Land, TX | 77498 | 50.0 | 31.0 | 40.5 | 0 in | | |
| 5 | March 6, 2019 | Sugar Land, TX | 77498 | 60.0 | 36.0 | 48.0 | 0 in | | |
| 6 | March 7, 2019 | Sugar Land, TX | 77498 | 73.0 | 54.0 | 63.5 | 0 in | | |
| 7 | March 8, 2019 | Sugar Land, TX | 77498 | 82.0 | 68.0 | 75.0 | 0 in | | |
| 8 | March 9, 2019 | Sugar Land, TX | 77498 | 82.0 | 69.0 | 75.5 | 0 in | | |
| 9 | March 10, 2019 | Sugar Land, TX | 77498 | 81.0 | 69.0 | 75.0 | 0 in | | |


In [74]:
Fresult.to_csv('Weather_Data_Output.csv', index=False, encoding='utf-8')
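
The historical and forecast frames have different columns, so `pd.concat` fills the columns a row lacks with missing values, which is why the forecast columns are empty for the historical rows. The sketch below reproduces this on two one-row frames using values that appear earlier in this notebook. `ignore_index=True` is an optional addition, not used in the original, that renumbers the combined rows.

```python
import pandas as pd

# One historical row and one forecast row, using values shown above
history = pd.DataFrame(
    [('March 1, 2019', 66, 47)],
    columns=['Date', 'Temperature High', 'Temperature Low'],
)
forecast = pd.DataFrame(
    [('March 20, 2019', 73, 50)],
    columns=['Date', 'Forecasted High Temperature', 'Forecasted Low Temperature'],
)

# Columns missing from a frame are filled with NaN for that frame's rows
combined = pd.concat([history, forecast], sort=False, ignore_index=True)
```

Without `ignore_index=True`, both rows would keep index 0 from their original frames.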