# WeatherPy
----

### Analysis
* As expected, the weather becomes significantly warmer as one approaches the equator (0 Deg. Latitude). More interestingly, however, is the fact that the southern hemisphere tends to be warmer this time of year than the northern hemisphere. This may be due to the tilt of the earth.
* There is no strong relationship between latitude and cloudiness. However, it is interesting to see that a strong band of cities sits at 0, 80, and 100% cloudiness.
* There is no strong relationship between latitude and wind speed. However, in northern hemispheres there is a flurry of cities with over 20 mph of wind.

---

#### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [162]:
# Dependencies and Setup
import matplotlib.pyplot as plt
import pandas as pd
import datetime as dt
import numpy as np
import requests
import time
import json

# Import API key
from api_keys import myKey

# Incorporated citipy to determine city based on latitude and longitude
from citipy import citipy

# Output File (CSV)
output_data_file = "output_data/cities.csv"

# Range of latitudes and longitudes
lat_range = (-90, 90)
lng_range = (-180, 180)

## Generate Cities List

In [163]:
# List for holding lat_lngs and cities
lat_lngs = []
cities = []
countries = []

# Create a set of random lat and lng combinations
lats = np.random.uniform(low=-90.000, high=90.000, size=1750)  
lngs = np.random.uniform(low=-180.000, high=180.000, size=1750)
lat_lngs = zip(lats, lngs)
#uniform is to have the numbers evenly distributed
#size is how many random numbers will be generated.
#we are saying, generate 1500 random evenly distributed numbers between -90 and 90.

# Identify nearest city for each lat, lng combination
for each in lat_lngs:
    city = citipy.nearest_city(each[0], each[1]).city_name 
    country = citipy.nearest_city(each[0], each[1]).country_code 
    
    # If the city is unique, then add it to a our cities list
    if city not in cities:
        cities.append(city)
        countries.append(country)

# Print the city count to confirm sufficient count
len(cities)

#612 cites are returned.   There must be at least 500 on the next part of finding the weather.  If not, then must come back to this step and increase size.

697

In [164]:
len(countries)

697

In [165]:
#??  how do I see the ln_lngs list?
print(lat_lngs)

<zip object at 0x000001DEEA6D0D48>


In [166]:
print(lngs)

[-162.06210666  170.81680706 -119.89926335 ... -140.55581961 -103.75424751
   61.61059373]


In [167]:
#

### Perform API Calls
* Perform a weather check on each city using a series of successive API calls.
* Include a print log of each city as it'sbeing processed (with the city number and city name).


In [168]:
#first, perform a check on a single city.

In [169]:
url = "http://api.openweathermap.org/data/2.5/weather?"
city = "alofi"
units = "imperial"

# Build query URL
query_url = url + "appid=" + myKey + "&q=" + city + "&units=imperial"
#note:   adding imperial units will return the temperature in farenheight and the wind speed in miles per hour
print(query_url)

http://api.openweathermap.org/data/2.5/weather?appid=8bea25d5e18b7c2fb2ee3ac12cec5dad&q=alofi&units=imperial


In [170]:
# Get weather data
weather_response = requests.get(query_url)
weather_json = weather_response.json()


In [171]:
print(json.dumps(weather_json, indent=4, sort_keys=True))

{
    "base": "stations",
    "clouds": {
        "all": 56
    },
    "cod": 200,
    "coord": {
        "lat": -19.06,
        "lon": -169.92
    },
    "dt": 1554069600,
    "id": 4036284,
    "main": {
        "humidity": 88,
        "pressure": 1011,
        "temp": 78.8,
        "temp_max": 78.8,
        "temp_min": 78.8
    },
    "name": "Alofi",
    "rain": {
        "3h": 0.625
    },
    "sys": {
        "country": "NU",
        "id": 7306,
        "message": 0.0044,
        "sunrise": 1554053160,
        "sunset": 1554096080,
        "type": 1
    },
    "weather": [
        {
            "description": "light rain",
            "icon": "10d",
            "id": 500,
            "main": "Rain"
        }
    ],
    "wind": {
        "deg": 220,
        "speed": 10.29
    }
}


In [172]:
#Success!   Able to retrieve a single city.
#now retrieve each individual piece of data needed for the loop

In [173]:
print(weather_json["name"])

Alofi


In [174]:
print(weather_json["coord"]["lat"])

-19.06


In [175]:
print(weather_json["main"]["temp"])

78.8


In [176]:
print(weather_json["main"]["humidity"])

88


In [177]:
print(weather_json["clouds"]["all"])

56


In [178]:
print(weather_json["wind"]["speed"])

10.29


### need a dataframe containing city and country to begin
NOTE:   the Weather API will take only a city; however, if multiples are found, API responds with a list of results that match a searching word. 
<BR>
Therefore, I will begin with a list of cities and countries which can also be passed to the 
    <BR>q city name and country code divided by comma, use ISO 3166 country codes
        
<BR> https://openweathermap.org/current

In [179]:
#Begin with the city and country list which was created in the begining.

cityCountry = pd.DataFrame(list(zip(cities, countries)),
              columns=['city','country'])

cityCountry.head()

Unnamed: 0,city,country
0,avarua,ck
1,nikolskoye,ru
2,punta arenas,cl
3,butaritari,ki
4,rincon,an


In [180]:
#confirm there are at least 500 countries
len(cityCountry)

697

In [181]:
# create a data frome with the values needed for the API calls

cityCountry2 = cityCountry
cityCountry2["location"] = (cityCountry2["city"] + ", " + cityCountry2["country"])
cityCountry2["newCity"] = ""
cityCountry2["newCountry"] = ""
cityCountry2["lat"] = ""
cityCountry2["long"] = ""
cityCountry2["tempF"] = ""
cityCountry2["humidity"] = ""
cityCountry2["cloud"] = ""
cityCountry2["windMPH"] = ""
cityCountry2["maxTemp"] = ""
cityCountry2["timeStamp"] = ""
cityCountry2.head()

Unnamed: 0,city,country,location,newCity,newCountry,lat,long,tempF,humidity,cloud,windMPH,maxTemp,timeStamp
0,avarua,ck,"avarua, ck",,,,,,,,,,
1,nikolskoye,ru,"nikolskoye, ru",,,,,,,,,,
2,punta arenas,cl,"punta arenas, cl",,,,,,,,,,
3,butaritari,ki,"butaritari, ki",,,,,,,,,,
4,rincon,an,"rincon, an",,,,,,,,,,


## Retrieve Data from the API
Practice on 10 cities before attemping to get data for 500+ cities

In [182]:
#create a new df with only 10 records
cityCountrySample = cityCountry2.head(10)
cityCountrySample.head(20)

Unnamed: 0,city,country,location,newCity,newCountry,lat,long,tempF,humidity,cloud,windMPH,maxTemp,timeStamp
0,avarua,ck,"avarua, ck",,,,,,,,,,
1,nikolskoye,ru,"nikolskoye, ru",,,,,,,,,,
2,punta arenas,cl,"punta arenas, cl",,,,,,,,,,
3,butaritari,ki,"butaritari, ki",,,,,,,,,,
4,rincon,an,"rincon, an",,,,,,,,,,
5,belushya guba,ru,"belushya guba, ru",,,,,,,,,,
6,hermanus,za,"hermanus, za",,,,,,,,,,
7,nagornskiy,ru,"nagornskiy, ru",,,,,,,,,,
8,souillac,mu,"souillac, mu",,,,,,,,,,
9,atar,mr,"atar, mr",,,,,,,,,,


In [183]:
cityCountrySample.dtypes

city          object
country       object
location      object
newCity       object
newCountry    object
lat           object
long          object
tempF         object
humidity      object
cloud         object
windMPH       object
maxTemp       object
timeStamp     object
dtype: object

In [184]:
#cast the timeStamp as a date type
cityCountrySample['timeStamp'] = pd.to_datetime(cityCountrySample['timeStamp'])

In [185]:
#ensure I can get a single response using parameters
parameters = {"appid": myKey,
             "units": "imperial",
             "q": "vaini, to"}
base_url = "http://api.openweathermap.org/data/2.5/weather?"

In [186]:
#get the API response
singleResponse = requests.get(base_url, params=parameters)

In [187]:
#confirm the URL
print(singleResponse.url)

http://api.openweathermap.org/data/2.5/weather?appid=8bea25d5e18b7c2fb2ee3ac12cec5dad&units=imperial&q=vaini%2C+to


In [188]:
#convert the response to JSON 
singleResponseJ = singleResponse.json()

In [189]:
#print the response to view the data
print(json.dumps(singleResponseJ, indent=4, sort_keys=True))

{
    "base": "stations",
    "clouds": {
        "all": 75
    },
    "cod": 200,
    "coord": {
        "lat": -21.2,
        "lon": -175.2
    },
    "dt": 1554069600,
    "id": 4032243,
    "main": {
        "humidity": 88,
        "pressure": 1012,
        "temp": 80.6,
        "temp_max": 80.6,
        "temp_min": 80.6
    },
    "name": "Vaini",
    "sys": {
        "country": "TO",
        "id": 7285,
        "message": 0.0043,
        "sunrise": 1554140885,
        "sunset": 1554183652,
        "type": 1
    },
    "visibility": 10000,
    "weather": [
        {
            "description": "broken clouds",
            "icon": "04d",
            "id": 803,
            "main": "Clouds"
        }
    ],
    "wind": {
        "deg": 100,
        "speed": 10.29
    }
}


In [190]:
#success!  I can get a valid URL using parameters.  

In [191]:
#Adding to ignore the warning for appending the values in the next step.  
#Error is:  SettingWithCopyWarning:   A value is trying to be set on a copy of a slice from a DataFrame
pd.options.mode.chained_assignment = None  # default='warn'   

###  Complete for 10 Records
Before applying to all records, complete the for loop for a small sample.

In [192]:
import datetime as dt
timeTest = pd.Series([dt.datetime.now()])
print(timeTest)

0   2019-03-31 19:06:22.165696
dtype: datetime64[ns]


In [193]:
# create a parameters dict that will be updated with new city each iteration
#per API, imperial units will obtain temperature in F and wind speed in mph
parameters = {"appid": myKey,
             "units": "imperial"}

print("---------------------------------------------") 
print("Beginning Data Retrieval")
print("---------------------------------------------") 
      
#begin by knowing the number of cities
print("Number of cities to obtain data: " + str(len(cityCountry))) 
print("---------------------------------------------")       

# Loop through the cityCountrySample and perform a search on each
for index, row in cityCountrySample.iterrows():
    base_url = "http://api.openweathermap.org/data/2.5/weather?"

    city = row['city']
    country = row['country']

    # update address key value
    parameters['q'] = f"{city},{country}"

    # make request
    cities_data = requests.get(base_url, params=parameters)
    
    #print the record number  ??? how do I print the index number?
    print("Retrieving index # " + str(index) + " | " + row["city"] + ", " + row["country"])
    
    #confirm the URL
    print("  " + (cities_data.url))
        
    # convert to json
    citiesJ = cities_data.json()
    
    try:
        cityCountrySample.loc[index, "newCity"] = citiesJ["name"]
    except:
        print("     Missing city name... skipping data.")     
    
    try:    
        cityCountrySample.loc[index, "newCountry"] = citiesJ["sys"]["country"]
    except: 
        print("     Missing country name... skipping data.")
 
    try:    
        cityCountrySample.loc[index, "lat"] = citiesJ["coord"]["lat"]
    except: 
        print("     Missing latitude... skipping data.")
        
    try:    
        cityCountrySample.loc[index, "long"] = citiesJ["coord"]["lon"]
    except: 
        print("     Missing longitude... skipping data.")
        
    try:    
        cityCountrySample.loc[index, "tempF"] = citiesJ["main"]["temp"]
    except: 
        print("     Missing temp(F)... skipping data.")
        
    try:    
        cityCountrySample.loc[index, "humidity"] = citiesJ["main"]["humidity"]
    except: 
        print("     Missing humidity... skipping data.")
        
    try:    
        cityCountrySample.loc[index, "cloud"] = citiesJ["clouds"]["all"]
    except: 
        print("     Missing %of cloudiness... skipping data.")
     
    try:    
        cityCountrySample.loc[index, "windMPH"] = citiesJ["wind"]["speed"]
    except: 
        print("     Missing wind speed... skipping data.")
        
    try:    
        cityCountrySample.loc[index, "maxTemp"] = citiesJ["main"]["temp_max"]
    except: 
        print("     Missing maximum temperature... skipping data.")
    
    #add the timestamp when the record was added.   ??does not work.
    try:
        cityCountrySample.loc[index, "timeStamp"] = pd.Series([dt.datetime.now()])
    except:
        dumbVariable = 1
    
    #the openweather API is free for up to 60 calls per minute.  Add a sleep timer to wait .75 seconds between calls.
    time.sleep(.75)

   

#At least 500 cities must be included for the next step of creating graphics.   Ensure there are at least 500 cities.

#obtain the number of records where a city was not found
noRecord = len(cityCountrySample[cityCountrySample['newCity'] == ""])
print("Records not found in the weather database: " + str(noRecord))

#subtract these from the number of records at the beginning
endRecords = (len(cityCountry)) - noRecord

initRecords = (len(cityCountry))

print("---------------------------------------------")
print("Beginning Records: " + str(initRecords))
print("Ending Records: " + str(endRecords))

if endRecords < 500:
    print("The data sample is less than 500, which is too small.  Please start from the beginning to obtain a new data sample.")
    print("---------------------------------------------")   
    #store the dataframe to a csv so the user can get details if needed.
    cityCountrySample.to_csv("weatherData_sample10.csv", encoding='utf-8', index=False)
    print("All data collected is stored in the same directory as this python file.   Filename = weatherData_sample10.csv")
    print("Date Retrieval Complete")
    print("---------------------------------------------")   

else: 
    print("There are more than 500 ending records!   Continue to the graphs.")


print("---------------------------------------------")   
print("Date Retrieval Complete")
print("---------------------------------------------")   


# Visualize to confirm new city and new country appear
cityCountrySample.head(10)


---------------------------------------------
Beginning Data Retrieval
---------------------------------------------
Number of cities to obtain data: 697
---------------------------------------------
Retrieving index # 0 | avarua, ck
  http://api.openweathermap.org/data/2.5/weather?appid=8bea25d5e18b7c2fb2ee3ac12cec5dad&units=imperial&q=avarua%2Cck
Retrieving index # 1 | nikolskoye, ru
  http://api.openweathermap.org/data/2.5/weather?appid=8bea25d5e18b7c2fb2ee3ac12cec5dad&units=imperial&q=nikolskoye%2Cru
Retrieving index # 2 | punta arenas, cl
  http://api.openweathermap.org/data/2.5/weather?appid=8bea25d5e18b7c2fb2ee3ac12cec5dad&units=imperial&q=punta+arenas%2Ccl
Retrieving index # 3 | butaritari, ki
  http://api.openweathermap.org/data/2.5/weather?appid=8bea25d5e18b7c2fb2ee3ac12cec5dad&units=imperial&q=butaritari%2Cki
Retrieving index # 4 | rincon, an
  http://api.openweathermap.org/data/2.5/weather?appid=8bea25d5e18b7c2fb2ee3ac12cec5dad&units=imperial&q=rincon%2Can
     Missing city

Unnamed: 0,city,country,location,newCity,newCountry,lat,long,tempF,humidity,cloud,windMPH,maxTemp,timeStamp
0,avarua,ck,"avarua, ck",Avarua,CK,-21.21,-159.78,78.8,94.0,12.0,6.93,78.8,NaT
1,nikolskoye,ru,"nikolskoye, ru",Nikolskoye,RU,59.7,30.79,32.68,82.0,90.0,8.95,33.8,NaT
2,punta arenas,cl,"punta arenas, cl",Punta Arenas,CL,-53.16,-70.91,54.27,62.0,75.0,17.22,55.4,NaT
3,butaritari,ki,"butaritari, ki",Butaritari,KI,3.07,172.79,81.53,100.0,36.0,13.24,81.53,NaT
4,rincon,an,"rincon, an",,,,,,,,,,NaT
5,belushya guba,ru,"belushya guba, ru",,,,,,,,,,NaT
6,hermanus,za,"hermanus, za",Hermanus,ZA,-34.42,19.24,57.99,84.0,88.0,5.01,57.99,NaT
7,nagornskiy,ru,"nagornskiy, ru",Nagornskiy,RU,58.77,57.55,30.99,100.0,92.0,1.01,30.99,NaT
8,souillac,mu,"souillac, mu",Souillac,MU,-20.52,57.52,75.96,83.0,40.0,8.05,77.0,NaT
9,atar,mr,"atar, mr",Atar,MR,20.52,-13.05,52.91,79.0,0.0,5.53,52.91,NaT


In [194]:
#remove the rows with no newCity  These are the records which were not found 
cityCountrySample2 = cityCountrySample[cityCountrySample.newCity != ""].reset_index()
cityCountrySample2.head(10)

Unnamed: 0,index,city,country,location,newCity,newCountry,lat,long,tempF,humidity,cloud,windMPH,maxTemp,timeStamp
0,0,avarua,ck,"avarua, ck",Avarua,CK,-21.21,-159.78,78.8,94,12,6.93,78.8,NaT
1,1,nikolskoye,ru,"nikolskoye, ru",Nikolskoye,RU,59.7,30.79,32.68,82,90,8.95,33.8,NaT
2,2,punta arenas,cl,"punta arenas, cl",Punta Arenas,CL,-53.16,-70.91,54.27,62,75,17.22,55.4,NaT
3,3,butaritari,ki,"butaritari, ki",Butaritari,KI,3.07,172.79,81.53,100,36,13.24,81.53,NaT
4,6,hermanus,za,"hermanus, za",Hermanus,ZA,-34.42,19.24,57.99,84,88,5.01,57.99,NaT
5,7,nagornskiy,ru,"nagornskiy, ru",Nagornskiy,RU,58.77,57.55,30.99,100,92,1.01,30.99,NaT
6,8,souillac,mu,"souillac, mu",Souillac,MU,-20.52,57.52,75.96,83,40,8.05,77.0,NaT
7,9,atar,mr,"atar, mr",Atar,MR,20.52,-13.05,52.91,79,0,5.53,52.91,NaT


## Retrieve data on all cities 
Since the practice run on 10 entries works well, complete the exercise on all records.

In [128]:
#create a new df to store the values from the API
apiData = cityCountry2
apiData.head(20)

Unnamed: 0,city,country,location,newCity,newCountry,lat,long,tempF,humidity,cloud,windMPH,maxTemp
0,lolua,tv,"lolua, tv",,,,,,,,,
1,taolanaro,mg,"taolanaro, mg",,,,,,,,,
2,pevek,ru,"pevek, ru",,,,,,,,,
3,ushuaia,ar,"ushuaia, ar",,,,,,,,,
4,punta arenas,cl,"punta arenas, cl",,,,,,,,,
5,sovetskiy,ru,"sovetskiy, ru",,,,,,,,,
6,bluff,nz,"bluff, nz",,,,,,,,,
7,chokurdakh,ru,"chokurdakh, ru",,,,,,,,,
8,rikitea,pf,"rikitea, pf",,,,,,,,,
9,new norfolk,au,"new norfolk, au",,,,,,,,,


In [None]:
# create a parameters dict that will be updated with new city each iteration
#per API, imperial units will obtain temperature in F and wind speed in mph
parameters = {"appid": myKey,
             "units": "imperial"}

print("---------------------------------------------") 
print("Beginning Data Retrieval")
print("---------------------------------------------") 
      
#begin by knowing the number of cities
print("Number of cities to obtain data: " + str(len(cityCountry))) 
print("---------------------------------------------")       

# Loop through the cityCountrySample and perform a search on each
for index, row in apiData.iterrows():
    base_url = "http://api.openweathermap.org/data/2.5/weather?"

    city = row['city']
    country = row['country']

    # update address key value
    parameters['q'] = f"{city},{country}"

    # make request
    cities_data = requests.get(base_url, params=parameters)
    
    #print the record number  ??? how do I print the index number?
    print("Retrieving index # " + str(index) + " | " + row["city"] + ", " + row["country"])
    
    #confirm the URL
    print("  " + (cities_data.url))
        
    # convert to json
    citiesJ = cities_data.json()
    
    try:
        apiData.loc[index, "newCity"] = citiesJ["name"]
    except:
        print("     Missing city name... skipping data.")     
    
    try:    
        apiData.loc[index, "newCountry"] = citiesJ["sys"]["country"]
    except: 
        print("     Missing country name... skipping data.")
 
    try:    
        apiData.loc[index, "lat"] = citiesJ["coord"]["lat"]
    except: 
        print("     Missing latitude... skipping data.")
        
    try:    
        apiData.loc[index, "long"] = citiesJ["coord"]["lon"]
    except: 
        print("     Missing longitude... skipping data.")
        
    try:    
        apiData.loc[index, "tempF"] = citiesJ["main"]["temp"]
    except: 
        print("     Missing temp(F)... skipping data.")
        
    try:    
        apiData.loc[index, "humidity"] = citiesJ["main"]["humidity"]
    except: 
        print("     Missing humidity... skipping data.")
        
    try:    
        apiData.loc[index, "cloud"] = citiesJ["clouds"]["all"]
    except: 
        print("     Missing %of cloudiness... skipping data.")
     
    try:    
        apiData.loc[index, "windMPH"] = citiesJ["wind"]["speed"]
    except: 
        print("     Missing wind speed... skipping data.")
        
    try:    
        apiData.loc[index, "maxTemp"] = citiesJ["main"]["temp_max"]
    except: 
        print("     Missing maximum temperature... skipping data.")
    
    try:
        #add the timestamp when the record was retrieved
        apiData.loc[index, "timeStamp"] = pd.Series([dt.datetime.now()])
    except:
        dumbVariable = 1
    
    #the openweather API is free for up to 60 calls per minute.  Add a sleep timer to wait .75 seconds between calls.
    time.sleep(.75)

   
#At least 500 cities must be included for the next step of creating graphics.   Ensure there are at least 500 cities.

#obtain the number of records where a city was not found
noRecord = len(apiData[apiData['newCity'] == ""])
print("Records not found in the weather database: " + str(noRecord))

#subtract these from the number of records at the beginning
endRecords = (len(cityCountry)) - noRecord

initRecords = (len(cityCountry))

print("---------------------------------------------")
print("Beginning Records: " + str(initRecords))
print("Ending Records: " + str(endRecords))

if endRecords < 500:
    print("The data sample is less than 500, which is too small.  Please start from the beginning to obtain a new data sample.")
    print("---------------------------------------------")  
    
    #save the data to a csv in case the user needs to view the details
    apiData.to_csv("weatherData_TooSmall.csv", encoding='utf-8', index=False)
    print("All data collected is stored in the same directory as this python file.   Filename = weatherData_TooSmall.csv")
    
    print("Date Retrieval Complete")
    print("---------------------------------------------")   

else: 
    print("There are more than 500 ending records!   Continue to the graphs.")
    apiData.to_csv("weatherData.csv", encoding='utf-8', index=False)
    print("All data collected is stored in the same directory as this python file.   Filename = weatherData.csv")
    


print("---------------------------------------------")   
print("Date Retrieval Complete")
print("---------------------------------------------")   
    
# Visualize to confirm new city and new country appear
apiData.head(10)


In [130]:
noRecord = len(apiData[apiData['newCity'] == ""])
print("Records not found in the weather database: " + str(noRecord))

Records not found in the weather database: 85


In [None]:
#remove the rows with no newCity  These are the records which were not found 
apiData2 = apiData[apiData.newCity != ""].reset_index()
apiData.head(10)

### Convert Raw Data to DataFrame
* Export the city data into a .csv.
* Display the DataFrame

### Plotting the Data
* Use proper labeling of the plots using plot titles (including date of analysis) and axes labels.
* Save the plotted figures as .pngs.

#### Latitude vs. Temperature Plot

#### Latitude vs. Humidity Plot

#### Latitude vs. Cloudiness Plot

#### Latitude vs. Wind Speed Plot