**Copyright: © NexStream Technical Education, LLC**.  
All rights reserved


# USGS Earthquake Scraper Introduction
In this project, you will create a 'web scraper' that retrieves real-time data from the U.S. Geological Survey (USGS) on the latest earthquakes around the world whose magnitude is equal to or above a user-supplied value.

The data is in JSON format, so you'll need to convert the output into a user-readable (friendly) format.

The feed is from the USGS database here:  https://earthquake.usgs.gov/earthquakes/feed/.  You should become familiar with this site.

The format of the feed summary is here: https://earthquake.usgs.gov/earthquakes/feed/v1.0/geojson.php.  You should become familiar with the fields for the JSON data.  

Note you can use a JSON viewer for a more readable format of the data.  
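
If no JSON viewer is at hand, Python's own `json` module can pretty-print the data. The sketch below is only an illustration: the sample string and its values are made up, and only the key names (`metadata`, `title`, `count`, `features`, `properties`, `mag`, `place`) follow the fields this notebook reads from the feed.

```python
import json

# Illustrative sample only: the values are invented, but the keys mirror
# the fields this notebook reads from the USGS feed.
sample = (
    '{"type": "FeatureCollection", '
    '"metadata": {"title": "Sample Feed", "count": 1}, '
    '"features": [{"type": "Feature", "properties": '
    '{"mag": 4.6, "place": "Sample Place", "title": "M 4.6 - Sample Place"}}]}'
)

# json.loads turns the JSON string into nested dicts and lists
parsed = json.loads(sample)

# json.dumps with indent=2 produces a readable, indented version
pretty = json.dumps(parsed, indent=2)
print(pretty)
```

The same two calls should work on the decoded text of the real feed, which is much longer.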






# Part 1a:  Set up the environment and script, and prompt the user for input.
Set up the script imports and prompt the user for the minimum magnitude used to filter the USGS data.  That is, any earthquake greater than or equal to the input magnitude will be retrieved from the database.  
You'll need to import the urllib.request library to access the web site.
You can also import the json library to use the functions in that library.
Check out both APIs for reference.


In [15]:
#Import the urllib.request and json libraries
import urllib.request as request
import json as datajson
import requests

####Your code here....

#Prompt the user to input a magnitude parameter of type floating point.  
while True:
    try:
        magn = float(input("Enter magnitude parameter between 2.5 to 10: "))
        print(magn)

        if magn < 2.5 or magn > 10:
            print("Magnitude parameter does not fit within range. Try again with a proper value.")
        else:
            break

    except ValueError:
        print("Magnitude parameter is not in float format. Try again.")

Enter magnitude parameter between 2.5 to 10:  4


4.0
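
One possible variation on the prompt loop above is to wrap it in a function so it can be exercised without typing at the keyboard. This is a sketch, not part of the assignment: the name `read_magnitude` and its `read`, `low`, and `high` parameters are hypothetical, and the messages are copied from the cell above.

```python
def read_magnitude(read=input, low=2.5, high=10.0):
    """Keep asking until the reader returns a float within [low, high]."""
    while True:
        raw = read(f"Enter magnitude parameter between {low} to {high}: ")
        try:
            value = float(raw)
        except ValueError:
            print("Magnitude parameter is not in float format. Try again.")
            continue
        if low <= value <= high:
            return value
        print("Magnitude parameter does not fit within range. Try again with a proper value.")

# Scripted answers stand in for the keyboard: one non-number,
# one out-of-range value, then a valid magnitude.
answers = iter(["abc", "1.0", "4"])
magnitude = read_magnitude(read=lambda prompt: next(answers))
print(magnitude)
```

Passing the reader in as an argument keeps the validation logic separate from `input`, which makes the retry behavior easy to check.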


# Part 1b:  Write the printResults function.  
In this function, you should print the output of the data you retrieved from the site:  http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson      
See the code comments for guided instruction.


Note you can use a JSON viewer for a more readable format of the data if you want to view it before processing it with your function.



In [16]:
import json
import requests

#Function printResults(data)
#In Python 3.x we need to explicitly decode the response to a string 
#i.e. data is output from data.decode("utf-8") 

def printResults(data):

    # 1.  Use the json "loads" api to load the string data into a dictionary
    ####Your code here....
    datajson = json.loads(data)

    # 2.  Access the contents of the JSON data
    #     and print out the metadata title
    ####Your code here....
    print("Metadata Title:", datajson.get('metadata', {}).get('title', 'No title found'))

    # 3.  Output the number of events
    ####Your code here....
    events = datajson.get('features', [])
    print(type(events))
    print("Number of events", len(events))

    # 4.  For each event, get the place where it occurred
    #     (printed in step 5 for events that pass the filter)
    # 5.  For each event, if the magnitude is greater than the user input
    #     print both the magnitude and the place it occurred.
    #     HINT: use the "title" field that each feature has.
    ####Your code here....
    for event in events:
        properties = event.get('properties', {})
        magnitude = properties.get('mag')
        place = properties.get('place', 'No location found')
        # Guard against a missing or null magnitude before comparing;
        # magn is the user input from Part 1a
        if magnitude is not None and magnitude > magn:
            print(f"Magnitude:{magnitude}, place:{place}")

# Fetch the feed, decode it to a string, and pass it to printResults
link = 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson'
response = requests.get(link)
printResults(response.content.decode('utf-8'))

      


Metadata Title: USGS Magnitude 2.5+ Earthquakes, Past Day
<class 'list'>
Number of events 35
Magnitude:4.6, place:55 km S of Reuleuet, Indonesia
Magnitude:4.8, place:south of the Fiji Islands
Magnitude:4.5, place:50 km SSW of Sola, Vanuatu
Magnitude:4.1, place:175 km ESE of Chignik, Alaska
Magnitude:4.4, place:109 km S of Lorengau, Papua New Guinea
Magnitude:4.2, place:58 km SSW of Labuha, Indonesia
Magnitude:4.2, place:99 km NNW of Finschhafen, Papua New Guinea
Magnitude:4.7, place:164 km W of Lata, Solomon Islands
Magnitude:4.6, place:35 km NNE of Ziro, India
Magnitude:4.8, place:65 km WNW of Gunungsitoli, Indonesia
Magnitude:4.6, place:129 km W of Tobelo, Indonesia
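
The hint about the "title" field can be followed with a small helper that returns the matching titles instead of printing them. The sketch below is an assumption-laden illustration: `titles_at_or_above` is a hypothetical name, the sample events are invented, and the comparison uses `>=` to match the "greater than or equal to" wording in Part 1a.

```python
import json

def titles_at_or_above(data, min_mag):
    """Return the 'title' of each event whose magnitude is >= min_mag.

    data is the decoded JSON text of the feed (a str).
    """
    features = json.loads(data).get("features", [])
    titles = []
    for feature in features:
        props = feature.get("properties", {})
        mag = props.get("mag")
        # Skip events with a missing or null magnitude
        if mag is not None and mag >= min_mag:
            titles.append(props.get("title", "No title found"))
    return titles

# Invented sample events, shaped like the feed's "features" list
sample_data = json.dumps({
    "features": [
        {"properties": {"mag": 4.6, "title": "M 4.6 - Sample Place A"}},
        {"properties": {"mag": 3.1, "title": "M 3.1 - Sample Place B"}},
        {"properties": {"mag": None, "title": "M ? - Sample Place C"}},
        {"properties": {"mag": 4.0, "title": "M 4.0 - Sample Place D"}},
    ]
})

selected = titles_at_or_above(sample_data, 4.0)
for title in selected:
    print(title)
```

Returning a list rather than printing directly also makes the same logic reusable when the results need to be saved to a file.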


# Part 1c:  Write the runner
In this code (either in main or in a function), you should set up the URL from the USGS site, open the URL, read the data, and call the printResults function.
http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson  
See the code comments for guided instruction.  
 
Note you can use a JSON viewer for a more readable format of the data if you want to view it before processing it with your function.

In [17]:
import urllib.request
import urllib.error
# Define a variable to hold the source URL (see the notes for the URL)
# This feed lists all earthquakes for the last day larger than Mag 2.5 (this is your minimum input)
link = 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson'
####Your code here....
# Open the URL and read the data
# See c
####Your code here....

# Print the HTTP status code of the response (200 is a valid response)
# See urllib.request.urlopen API
####Your code here....
try:
    # urlopen raises HTTPError for 4xx/5xx responses,
    # so the error path is handled in the except clauses
    response = urllib.request.urlopen(link)
    print("HTTP status", response.getcode())
except urllib.error.HTTPError as e:
    print("HTTP error", e.code)
except urllib.error.URLError as e:
    print("URL error", e.reason)


    
# If the HTTP status code of the response is valid (hint: 200) 
#    then read the data (hint: .read API) and convert to a string (hint: .decode("utf-8") API), 
#    and print the results using your printResults function from step 1b
# Make sure your code handles an error condition (i.e. non-valid status code) 
#    and print out the error code in that case.
####Your code here....


HTTP status 200
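
The remaining runner steps (check the status, read, decode, report errors) could be gathered into one function along the following lines. This is a sketch under stated assumptions: `fetch_feed`, `FEED_URL`, and the `opener` parameter are hypothetical names introduced here so the function can be tried without network access.

```python
import urllib.error
import urllib.request

FEED_URL = "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson"

def fetch_feed(url=FEED_URL, opener=urllib.request.urlopen):
    """Return the decoded feed text, or None if the request fails."""
    try:
        with opener(url) as response:
            status = response.getcode()
            print("HTTP status", status)
            if status != 200:
                # Non-valid status code: report it and give up
                print("HTTP error", status)
                return None
            # read() returns bytes; decode to get a str for json.loads
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Raised by urlopen for 4xx/5xx responses
        print("HTTP error", err.code)
    except urllib.error.URLError as err:
        # Raised when the server cannot be reached at all
        print("URL error", err.reason)
    return None
```

In the notebook, the runner would then be roughly `data = fetch_feed()` followed by `printResults(data)` when `data` is not `None`.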


# Part 2:  Output data to spreadsheet
Convert output to CSV format.  

Rewrite the printResults function as printResults2(data), which returns a list or dictionary (your choice) to the runner; the runner then converts the data to CSV format and saves it to a file.

Change your runner to assign the returned data from your printResults2 function to a variable that you then convert to CSV format and save to a file.

Include at least the four items retrieved from the database in Part 1.  
Include exception handling in your file IO processing.   

In [18]:
####Your code here....
import csv
import json
import requests

def printResults2(magn):

    # 1.  Use the json "loads" api to load the string data into a dictionary
    ####Your code here....
    link = 'https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson'
    response = requests.get(link)
    datajson = json.loads(response.content.decode('utf-8'))

    # Collect [magnitude, place] rows for events above the user input
    csvdatalist = []
    for event in datajson.get('features', []):
        properties = event.get('properties', {})
        magnitude = properties.get('mag')
        place = properties.get('place', 'No location found')
        # Guard against a missing or null magnitude before comparing
        if magnitude is not None and magnitude > magn:
            csvdatalist.append([magnitude, place])
    return csvdatalist

csvconvert = printResults2(magn)

filename = 'earthquakeresults.csv'
try:
    with open(filename, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['magnitude', 'place'])  # header row
        writer.writerows(csvconvert)
    print("File written, check location of notebook for file")
except (OSError, csv.Error) as e:
    print(f"File conversion error:{e}")



File written, check location of notebook for file
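
If the function returns a list of dictionaries instead of a list of lists, `csv.DictWriter` can write the header and rows by field name. The sketch below is one way this might look: `write_events_csv` and `FIELDS` are hypothetical names, and the four columns (`mag`, `place`, `time`, `title`) are an assumed selection of feed properties, not a requirement stated above.

```python
import csv

# Assumed column selection; each name is a key in a feature's "properties"
FIELDS = ["mag", "place", "time", "title"]

def write_events_csv(events, filename):
    """Write a list of event dicts to filename; return True on success."""
    try:
        with open(filename, "w", newline="", encoding="utf-8") as csvfile:
            # extrasaction="ignore" drops keys that are not in FIELDS;
            # keys missing from an event are written as empty cells
            writer = csv.DictWriter(csvfile, fieldnames=FIELDS, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(events)
    except OSError as err:
        # Covers missing directories, permission problems, and similar
        print(f"File conversion error: {err}")
        return False
    return True
```

Passing each feature's `properties` dictionary straight into `events` would work with this approach, since unused keys are ignored.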
