<h3>Step 1: Import the requests library</h3>

In [1]:
import requests

<h3>Step 2: Send an HTTP request, get the response, and save in a variable</h3>

In [2]:
response = requests.get("http://www.epicurious.com/search/Tofu+Chili")

<h3>Step 3: Check the response status code to see if everything went as planned</h3>
<li>status code 200: the request-response cycle was successful
<li>a 4xx or 5xx status code: the request failed (e.g., 404 = page not found)

In [3]:
print(response.status_code)

200
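Status codes fall into ranges, and 200 is only one of several success codes. As a rough sketch using only the standard library, the hypothetical helper `describe_status` (not part of requests) names the range a code belongs to:

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short description such as '404 Not Found (client error)'."""
    try:
        phrase = HTTPStatus(code).phrase   # standard reason phrase, e.g. 'OK'
    except ValueError:
        phrase = "Unknown"                 # code is not a registered status
    if 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirect"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "other"
    return "%d %s (%s)" % (code, phrase, category)

print(describe_status(200))
print(describe_status(404))
```

With a real response, this could be called as `describe_status(response.status_code)`.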


<h3>Step 4: Get the content of the response</h3>
<li>Decode the raw bytes as UTF-8 to get a string, if necessary

In [12]:
response.content.decode('utf-8')[:400]

'<!doctype html>\n<html>\n  <head><meta charset="utf-8">\n<meta name="apple-itunes-app" content="app-id=312101965" />\n<title>Search | Epicurious.com</title>\n<link rel="dns-prefetch" href="//assets.adobedtm.com">\n<link rel="dns-prefetch" href="https://www.google-analytics.com">\n<link rel="dns-prefetch" href="//tpc.googlesyndication.com">\n<link rel="dns-prefetch" href="//static.parsely.com">\n<link rel="'
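The reason for the `.decode('utf-8')` call is that `response.content` is a bytes object, not a string. A minimal sketch of the difference, using a hand-built bytes value as a stand-in for a real response body:

```python
# Stand-in for response.content: text encoded to raw bytes
raw_bytes = "Eyüp, İstanbul".encode("utf-8")

# Decoding turns the bytes back into a Python string
text = raw_bytes.decode("utf-8")

print(type(raw_bytes), len(raw_bytes))  # length counted in bytes
print(type(text), len(text))            # length counted in characters
```

Non-ASCII characters take more than one byte in UTF-8, so the two lengths differ. requests also offers `response.text`, which returns the body already decoded to a string.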

<h4>Problem: Get the contents of Wikipedia's main page and look for the string "Did you know" in it</h4>

In [14]:
import requests

url = "https://en.wikipedia.org/wiki/main_page"
response = requests.get(url)

In [16]:
print(response.status_code)

200


In [15]:
response.content.decode('utf-8')[:400]

'<!DOCTYPE html>\n<html class="client-nojs" lang="en" dir="ltr">\n<head>\n<meta charset="UTF-8"/>\n<title>Wikipedia, the free encyclopedia</title>\n<script>document.documentElement.className = document.documentElement.className.replace( /(^|\\s)client-nojs(\\s|$)/, "$1client-js$2" );</script>\n<script>(window.RLQ=window.RLQ||[]).push(function(){mw.config.set({"wgCanonicalNamespace":"","wgCanonicalSpecialPa'
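Once the page is decoded into a string, looking for a phrase is ordinary string searching. A small sketch, where `sample_html` is a made-up stand-in for the downloaded page and `find_phrase` is a hypothetical helper:

```python
def find_phrase(html_text, phrase):
    """Return the index of the first occurrence of phrase, or -1 if absent."""
    return html_text.find(phrase)

# Made-up page text; with a real response use response.content.decode('utf-8')
sample_html = "<html><body><h2>Did you know ...</h2></body></html>"

print("Did you know" in sample_html)             # membership test
print(find_phrase(sample_html, "Did you know"))  # position of the match
```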

<h2>JSON</h2>
<li>The Python library json converts JSON text to Python objects and back


In [17]:
import json
data_string = '[{"b": [2, 4], "c": 3.0, "a": "A"}]'
python_data = json.loads(data_string)
print(python_data)

[{'b': [2, 4], 'c': 3.0, 'a': 'A'}]


<h3>json.loads recursively decodes a string in JSON format into equivalent python objects</h3>
<li>data_string's outermost element is converted into a python list
<li>the first element of that list is converted into a dictionary
<li>each key of that dictionary is converted into a string
<li>each value is converted into the matching Python type (for key "b", a list of two integers)

In [18]:
print(type(data_string),type(python_data))
print(type(python_data[0]),python_data[0])
print(type(python_data[0]['b']),python_data[0]['b'])

<class 'str'> <class 'list'>
<class 'dict'> {'b': [2, 4], 'c': 3.0, 'a': 'A'}
<class 'list'> [2, 4]
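The same recursive conversion applies to every JSON type. The sketch below decodes a made-up JSON object (`sample` is illustrative, not from an API) and prints the Python type each value ends up with:

```python
import json

# Illustrative JSON text covering the basic JSON value types
sample = '{"name": "A", "count": 2, "ratio": 3.0, "tags": ["x", "y"], "active": true, "owner": null}'

decoded = json.loads(sample)
for key, value in decoded.items():
    print(key, type(value).__name__)
```

JSON objects become dicts, arrays become lists, strings become str, numbers become int or float, true/false become bool, and null becomes None.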


<h3>json.loads will raise an exception if the string is not valid JSON</h3>

In [19]:
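#Wrong: a bare word is not valid JSON, so this raises JSONDecodeError
json.loads("Hello")
#Correct: a JSON string must be wrapped in double quotes
json.loads('"Hello"')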

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
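To keep a program running when the input might be malformed, the exception can be caught. A minimal sketch, where `parse_json_or_none` is a hypothetical helper name:

```python
import json

def parse_json_or_none(text):
    """Return the decoded object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        print("Invalid JSON:", error)
        return None

print(parse_json_or_none("Hello"))    # not valid JSON
print(parse_json_or_none('"Hello"'))  # valid JSON string
```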

In [20]:
import json
data_string = json.dumps(python_data)
print(type(data_string))
print(data_string)


<class 'str'>
[{"b": [2, 4], "c": 3.0, "a": "A"}]
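json.dumps is the inverse of json.loads, so a round trip gives back equal data. The sketch below also shows two optional arguments of json.dumps, `sort_keys` and `indent`, that control how the output string looks:

```python
import json

python_data = [{"b": [2, 4], "c": 3.0, "a": "A"}]

compact = json.dumps(python_data, sort_keys=True)  # keys in alphabetical order
pretty = json.dumps(python_data, indent=2)         # multi-line, indented output
round_trip = json.loads(compact)                   # back to Python objects

print(compact)
print(pretty)
print(round_trip == python_data)
```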


<h2>requests library and JSON</h2>

In [38]:
address="Istanbul Bilgi University, Turkey"
url="https://maps.googleapis.com/maps/api/geocode/json?address=%s" % (address)
response = requests.get(url).json()
print(type(response))

<class 'dict'>
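The address contains spaces and a comma, which have to be escaped inside a URL. requests does this when it sends the request, but the escaping can be made explicit. A sketch using the standard library's `urlencode` (the variable names are illustrative):

```python
from urllib.parse import urlencode

base_url = "https://maps.googleapis.com/maps/api/geocode/json"

# urlencode escapes each value and joins the pairs as key=value
query = urlencode({"address": "Istanbul Bilgi University, Turkey"})
full_url = base_url + "?" + query

print(full_url)
```

requests can build the same query string itself if the dictionary is passed as `requests.get(base_url, params={...})`.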


<h3>Exception checking!</h3>

In [39]:
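response_data = None  # stays None if the request or the JSON parsing fails
try:
    response = requests.get(url)
    if not response.status_code == 200:
        print("HTTP error",response.status_code)
    else:
        try:
            response_data = response.json()
        except ValueError:  # raised when the body is not valid JSON
            print("Response not in valid JSON format")
except requests.exceptions.RequestException:  # connection error, timeout, invalid URL, ...
    print("Something went wrong with requests.get")
print(type(response_data))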

<class 'dict'>


In [40]:
response_data

{'results': [{'address_components': [{'long_name': '2',
     'short_name': '2',
     'types': ['subpremise']},
    {'long_name': '13', 'short_name': '13', 'types': ['street_number']},
    {'long_name': 'Kazım Karabekir Caddesi',
     'short_name': 'Kazım Karabekir Cd.',
     'types': ['route']},
    {'long_name': 'Emniyettepe Mahallesi',
     'short_name': 'Emniyettepe Mahallesi',
     'types': ['administrative_area_level_4', 'political']},
    {'long_name': 'Eyüp',
     'short_name': 'Eyüp',
     'types': ['administrative_area_level_3', 'political']},
    {'long_name': 'Eyüp',
     'short_name': 'Eyüp',
     'types': ['administrative_area_level_2', 'political']},
    {'long_name': 'İstanbul',
     'short_name': 'İstanbul',
     'types': ['administrative_area_level_1', 'political']},
    {'long_name': 'Turkey',
     'short_name': 'TR',
     'types': ['country', 'political']},
    {'long_name': '34060', 'short_name': '34060', 'types': ['postal_code']}],
   'formatted_address': 'Emniyett

In [42]:
len(response_data['results'])

2

In [43]:
response_data['results'][0]

{'address_components': [{'long_name': '2',
   'short_name': '2',
   'types': ['subpremise']},
  {'long_name': '13', 'short_name': '13', 'types': ['street_number']},
  {'long_name': 'Kazım Karabekir Caddesi',
   'short_name': 'Kazım Karabekir Cd.',
   'types': ['route']},
  {'long_name': 'Emniyettepe Mahallesi',
   'short_name': 'Emniyettepe Mahallesi',
   'types': ['administrative_area_level_4', 'political']},
  {'long_name': 'Eyüp',
   'short_name': 'Eyüp',
   'types': ['administrative_area_level_3', 'political']},
  {'long_name': 'Eyüp',
   'short_name': 'Eyüp',
   'types': ['administrative_area_level_2', 'political']},
  {'long_name': 'İstanbul',
   'short_name': 'İstanbul',
   'types': ['administrative_area_level_1', 'political']},
  {'long_name': 'Turkey',
   'short_name': 'TR',
   'types': ['country', 'political']},
  {'long_name': '34060', 'short_name': '34060', 'types': ['postal_code']}],
 'formatted_address': 'Emniyettepe Mahallesi, Kazım Karabekir Cd. No:13 D:2, 34060 Eyüp/İs

In [44]:
response_data['results'][1]

{'address_components': [{'long_name': 'İstanbul',
   'short_name': 'İstanbul',
   'types': ['administrative_area_level_1', 'political']},
  {'long_name': 'Turkey',
   'short_name': 'TR',
   'types': ['country', 'political']},
  {'long_name': '34387', 'short_name': '34387', 'types': ['postal_code']}],
 'formatted_address': 'Kuştepe Mh., İnönü Cad. No: 6, Kuştepe Kampüsü, 34387 İstanbul, Turkey',
 'geometry': {'location': {'lat': 41.0729249, 'lng': 28.9915299},
  'location_type': 'GEOMETRIC_CENTER',
  'viewport': {'northeast': {'lat': 41.0742738802915, 'lng': 28.9928788802915},
   'southwest': {'lat': 41.0715759197085, 'lng': 28.9901809197085}}},
 'place_id': 'ChIJn-4jrPu2yhQRWCjPITDlyVM',
 'plus_code': {'compound_code': '3XFR+5J Şişli, 19 Mayıs Mahallesi, Şişli/Istanbul, Turkey',
  'global_code': '8GHC3XFR+5J'},
 'types': ['establishment', 'point_of_interest', 'university']}

In [45]:
response_data['results'][0]['geometry']

{'location': {'lat': 41.0671816, 'lng': 28.9455079},
 'location_type': 'ROOFTOP',
 'viewport': {'northeast': {'lat': 41.0685305802915, 'lng': 28.9468568802915},
  'southwest': {'lat': 41.0658326197085, 'lng': 28.9441589197085}}}

In [46]:
response_data['results'][0]['geometry']['location']

{'lat': 41.0671816, 'lng': 28.9455079}
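The chain of keys above can fail if the response contains no results. A hedged sketch of a more defensive lookup, run on a trimmed copy of the data shown above (`sample_response` keeps only the geometry fields, and `first_location` is a hypothetical helper):

```python
# Trimmed stand-in for response_data, keeping only the fields used here
sample_response = {
    "results": [
        {"geometry": {"location": {"lat": 41.0671816, "lng": 28.9455079}}},
        {"geometry": {"location": {"lat": 41.0729249, "lng": 28.9915299}}},
    ]
}

def first_location(response_data):
    """Return (lat, lng) of the first result, or None if there are no results."""
    results = response_data.get("results", [])
    if not results:
        return None
    location = results[0]["geometry"]["location"]
    return (location["lat"], location["lng"])

print(first_location(sample_response))
print(first_location({"results": []}))
```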

<h2>Problem 1: Write a function that takes an address as an argument and returns a (latitude, longitude) tuple</h2>

In [None]:
def get_lat_lng(address_string):
    #python code goes here
    pass

<h2>Problem 2: Extend the function so that it takes a possibly incomplete address as an argument and returns a list of tuples of the form (complete address, latitude, longitude)</h2>

In [None]:
def get_lat_lng(address_string):
    #python code goes here
    pass

<h1>XML</h1>
<li>The Python library lxml converts an XML string to Python objects and vice versa</li>

In [None]:
data_string = """
<Bookstore>
   <Book ISBN="ISBN-13:978-1599620787" Price="15.23" Weight="1.5">
      <Title>New York Deco</Title>
      <Authors>
         <Author Residence="New York City">
            <First_Name>Richard</First_Name>
            <Last_Name>Berenholtz</Last_Name>
         </Author>
      </Authors>
   </Book>
   <Book ISBN="ISBN-13:978-1579128562" Price="15.80">
      <Remark>
      Five Hundred Buildings of New York and over one million other books are available for Amazon Kindle.
      </Remark>
      <Title>Five Hundred Buildings of New York</Title>
      <Authors>
         <Author Residence="Beijing">
            <First_Name>Bill</First_Name>
            <Last_Name>Harris</Last_Name>
         </Author>
         <Author Residence="New York City">
            <First_Name>Jorg</First_Name>
            <Last_Name>Brockmann</Last_Name>
         </Author>
      </Authors>
   </Book>
</Bookstore>
"""

In [None]:
from lxml import etree
root = etree.XML(data_string)
print(root.tag,type(root.tag))

In [None]:
print(etree.tostring(root, pretty_print=True).decode("utf-8"))

<h3>Iterating over an XML tree</h3>
<li>Use an iterator. 
<li>The iterator will generate every tree element for a given subtree

In [None]:
for element in root.iter():
    print(element)

<h4>Or just loop over the direct children of a subtree</h4>

In [None]:
for child in root:
    print(child)
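The two loops visit different sets of elements: `.iter()` walks the whole subtree, including the root itself, while looping over an element yields only its direct children. A small sketch on a made-up tree; the fallback import is an assumption for environments without lxml, since the standard library's ElementTree supports the calls used here:

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the standard library if lxml is missing
    import xml.etree.ElementTree as etree

small_tree = etree.XML(
    "<Bookstore><Book><Title>A</Title></Book><Book><Title>B</Title></Book></Bookstore>"
)

all_tags = [element.tag for element in small_tree.iter()]  # root and all descendants
child_tags = [child.tag for child in small_tree]           # direct children only

print(all_tags)
print(child_tags)
```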

<h4>Accessing the tag</h4>


In [None]:
for child in root:
    print(child.tag)

<h4>Using the iterator to get specific tags</h4>
<li>In the below example, only the author tags are accessed
<li>For each author tag, the .find function accesses the First_Name and Last_Name tags
<li>With a plain tag name, the .find function only looks at direct children, not deeper descendants, so be careful!
<li>The .text attribute holds the text of a leaf node

In [None]:
for element in root.iter("Author"):
    print(element.find('First_Name').text,element.find('Last_Name').text)
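The difference between children and deeper descendants is easy to trip over, so here is a short sketch on a cut-down copy of the first book. The fallback import is an assumption for environments without lxml; the standard library's ElementTree accepts the same path expressions used here:

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the standard library if lxml is missing
    import xml.etree.ElementTree as etree

book = etree.XML(
    "<Book><Title>New York Deco</Title>"
    "<Authors><Author><First_Name>Richard</First_Name></Author></Authors></Book>"
)

direct = book.find("First_Name")                 # not a direct child of Book
nested = book.find("Authors/Author/First_Name")  # explicit path through the children
anywhere = book.find(".//First_Name")            # search all descendants

print(direct)
print(nested.text, anywhere.text)
```

Because `.find` returns None when nothing matches, calling `.text` on the result of the first lookup would raise an AttributeError.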

<h4>Problem: Find the last names of all authors in the tree “root” using xpath</h4>

<h4>Using values of attributes as filters</h4>
<li>Example: Find the first name of the author of a book that weighs 1.5 oz

In [None]:
root.find('Book[@Weight="1.5"]/Authors/Author/First_Name').text
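Attribute values can also be read directly from an element. The sketch below uses a cut-down copy of the second book, which has no Weight attribute; the fallback import is an assumption for environments without lxml:

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the standard library if lxml is missing
    import xml.etree.ElementTree as etree

book = etree.XML(
    '<Book ISBN="ISBN-13:978-1579128562" Price="15.80">'
    "<Title>Five Hundred Buildings of New York</Title></Book>"
)

price = float(book.get("Price"))  # attribute values are strings, so convert
weight = book.get("Weight")       # missing attribute: .get returns None
isbn = book.attrib["ISBN"]        # dictionary-style access

print(price, weight, isbn)
```

Since attribute values are always strings, a filter such as `@Weight="1.5"` compares text, not numbers.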

<h4>Problem: Print first and last names of all authors who live in New York City</h4>