# What is the Requests Resource?
Requests is an Apache2 Licensed HTTP library, written in Python. It is designed to make HTTP easy for humans to use. This means you don't have to manually add query strings to URLs, or form-encode your POST data. Don't worry if that makes no sense yet. It will in due time.

### What can Requests do?

Requests allows you to send HTTP/1.1 requests using Python. With it, you can add content like headers, form data, multipart files, and parameters using simple Python data structures such as dictionaries. It also lets you access the response data in the same way.
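As a minimal sketch of what this looks like, the snippet below builds (but does not send) two requests so that the encoding can be inspected offline. The `example.com` URLs, the parameter names, and the values are placeholders chosen for illustration; they are not part of the original tutorial.

```python
import requests

# Build a GET request with query parameters and a custom header.
# Nothing is sent over the network; prepare() only assembles the request.
get_req = requests.Request(
    "GET",
    "https://example.com/search",      # placeholder URL
    params={"q": "Tofu Chili"},        # encoded into the query string
    headers={"Accept": "text/html"},
).prepare()
print(get_req.url)

# Build a POST request; a dict passed as data= is form-encoded into the body.
post_req = requests.Request(
    "POST",
    "https://example.com/submit",      # placeholder URL
    data={"name": "Ada"},
).prepare()
print(post_req.body)
```

The same `params`, `headers`, and `data` keyword arguments can be passed directly to `requests.get` and `requests.post` when the request should actually be sent.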

In programming, a library is a collection of pre-written routines, functions, and operations that a program can use. These elements are often referred to as modules.

Libraries are important because you can load a module and take advantage of everything it offers without copying its code into every program that relies on it. They stand alone, so you can build your own programs with them while they remain separate from other programs.

Think of modules as a sort of code template.

To reiterate, Requests is a Python library.

<h3>Step 1: Import the requests library</h3>

In [164]:
import requests

<h3>Step 2: Send an HTTP request, get the response, and save in a variable</h3>

In [165]:
response = requests.get("http://www.epicurious.com/search/Tofu+Chili")

In [166]:
type(response)

requests.models.Response

<h3>Step 3: Check the response status code to see if everything went as planned</h3>
<li>status code 200: the request-response cycle was successful
<li>status codes in the 4xx and 5xx ranges: it didn't work (e.g., 404 = page not found)
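To make the meaning of a status code easier to read, here is a small sketch that uses the standard library's `http.HTTPStatus` to look up the official phrase for a code. The helper name `describe_status` is hypothetical, and treating only the 2xx range as success is a simplifying assumption.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short, human-readable label for an HTTP status code."""
    phrase = HTTPStatus(code).phrase   # e.g. 404 -> "Not Found"
    if 200 <= code < 300:
        return f"{code} {phrase}: success"
    return f"{code} {phrase}: not a success code"

for code in (200, 404, 500):
    print(describe_status(code))
```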

In [167]:
print(response.status_code)

200


<h3>Step 4: Get the content of the response</h3>
<li>Decode the bytes as UTF-8 if necessary

In [5]:
response.content.decode('utf-8')



<h4>Problem: Get the contents of Wikipedia's main page and look for the string "Did you know" in it</h4>

In [6]:
url = "https://en.wikipedia.org/wiki/main_page"
#The rest of your code should go below this line
response1 = requests.get(url)
response1.status_code

200

In [7]:
response1.content

b'<!DOCTYPE html>\n<html class="client-nojs" lang="en" dir="ltr">\n<head>\n<meta charset="UTF-8"/>\n<title>Wikipedia, the free encyclopedia</title>\n<script>document.documentElement.className = document.documentElement.className.replace( /(^|\\s)client-nojs(\\s|$)/, "$1client-js$2" );</script>\n<script>(window.RLQ=window.RLQ||[]).push(function(){mw.config.set({"wgCanonicalNamespace":"","wgCanonicalSpecialPageName":false,"wgNamespaceNumber":0,"wgPageName":"Main_Page","wgTitle":"Main Page","wgCurRevisionId":807996266,"wgRevisionId":807996266,"wgArticleId":15580374,"wgIsArticle":true,"wgIsRedirect":false,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":[],"wgBreakFrames":false,"wgPageContentLanguage":"en","wgPageContentModel":"wikitext","wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgMont

The output starts with the letter **b**, which marks it as a **byte string**. To turn it into a regular (Unicode) string, we call the **decode** method and pass it the encoding scheme.

The **encoding scheme** can vary. There are many of them, but **UTF-8** and **UTF-16** are among the most common. If you're fetching an English-language web page, you can generally expect the result to come back in UTF-8, so it needs to be decoded using UTF-8 as your decoder.
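The relationship between byte strings and regular strings can be sketched without any network access. The sample word below is an assumption chosen because it contains one non-ASCII character.

```python
text = "café"                    # four characters, one of them non-ASCII
raw = text.encode("utf-8")       # str -> bytes
print(raw)                       # the é is stored as two bytes in UTF-8
print(len(text), len(raw))       # character count vs. byte count

decoded = raw.decode("utf-8")    # bytes -> str
print(decoded == text)

# Decoding with the wrong scheme does not raise here, but garbles the text.
garbled = raw.decode("latin-1")
print(garbled)
```

This is why the decoder has to match the encoding the server used: the bytes are the same, but their interpretation differs.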

In [8]:
response1.content.decode('utf-8')

'<!DOCTYPE html>\n<html class="client-nojs" lang="en" dir="ltr">\n<head>\n<meta charset="UTF-8"/>\n<title>Wikipedia, the free encyclopedia</title>\n<script>document.documentElement.className = document.documentElement.className.replace( /(^|\\s)client-nojs(\\s|$)/, "$1client-js$2" );</script>\n<script>(window.RLQ=window.RLQ||[]).push(function(){mw.config.set({"wgCanonicalNamespace":"","wgCanonicalSpecialPageName":false,"wgNamespaceNumber":0,"wgPageName":"Main_Page","wgTitle":"Main Page","wgCurRevisionId":807996266,"wgRevisionId":807996266,"wgArticleId":15580374,"wgIsArticle":true,"wgIsRedirect":false,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":[],"wgBreakFrames":false,"wgPageContentLanguage":"en","wgPageContentModel":"wikitext","wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgMonth

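Let's find the subtitle "Did you know..." on the main page and get the character position where it is located. The search string uses underscores, matching an identifier in the page's HTML rather than the visible text.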

In [9]:
response1.content.decode('utf-8').find("Did_you_know...")

14901
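A short sketch of how `find` behaves, using a made-up one-line HTML fragment rather than the real Wikipedia page: it returns the index of the first match, or -1 when the text is absent.

```python
# Hypothetical HTML fragment, used only to illustrate str.find
html = '<h2 id="Did_you_know...">Did you know ...</h2>'

position = html.find("Did_you_know...")   # index of the first match
print(position)

missing = html.find("No such text")       # -1 signals "not found"
print(missing)

# When only presence matters, the in operator reads more clearly.
found = "Did you know" in html
print(found)
```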

<h2>JSON</h2>
<li>The python library - json - deals with converting text to and from JSON


In [10]:
import json

data_string = '[{"b": [2, 4], "c": 3.0, "a": "A"}]'
python_data = json.loads(data_string)
print(python_data)

[{'b': [2, 4], 'c': 3.0, 'a': 'A'}]


<h3>json.loads recursively decodes a string in JSON format into equivalent Python objects</h3>
<li>data_string's outermost element is converted into a Python list
<li>the first element of that list is converted into a dictionary
<li>each key of that dictionary is converted into a string
<li>the value for the key "b" is converted into a list of two integer elements
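The type conversions listed above can be checked directly. The sample document below is an assumption that covers each basic JSON type once.

```python
import json

sample = (
    '{"name": "A", "count": 2, "ratio": 3.0, '
    '"tags": ["x", "y"], "active": true, "note": null}'
)
parsed = json.loads(sample)

# Print each key with the Python type its JSON value was decoded into.
for key, value in parsed.items():
    print(key, type(value).__name__, value)
```

JSON objects become dictionaries, arrays become lists, `true`/`false` become `True`/`False`, and `null` becomes `None`.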

In [11]:
print(type(data_string),type(python_data))
print(type(python_data[0]),python_data[0])
print(type(python_data[0]['b']),python_data[0]['b'])

<class 'str'> <class 'list'>
<class 'dict'> {'b': [2, 4], 'c': 3.0, 'a': 'A'}
<class 'list'> [2, 4]


<h3>json.loads will throw an exception if the format is incorrect</h3>

In [12]:
#Correct
json.loads('"Hello"')

'Hello'

The next call fails. The reason for the exception is that the Python string passed in contains the bare text Hello, which is not valid JSON.
**To be a valid JSON string value, the text must itself be wrapped in double quotes inside the Python string,** as in the previous example.

In [13]:
#Wrong
json.loads("Hello")

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
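One way to handle this error is to catch `json.JSONDecodeError`. The wrapper below is a sketch; the name `parse_json` is hypothetical.

```python
import json

def parse_json(text):
    """Return the decoded value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        print("Not valid JSON:", err)
        return None

print(parse_json('"Hello"'))   # valid: a JSON string in double quotes
print(parse_json("Hello"))     # invalid: bare word without quotes
```

Returning `None` on failure is a simplification: the valid JSON text `null` also decodes to `None`, so code that must tell the two apart should re-raise or return a separate flag instead.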

In [14]:
import json

data_string = json.dumps(python_data)
print(type(data_string))
print(data_string)


<class 'str'>
[{"b": [2, 4], "c": 3.0, "a": "A"}]
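`json.dumps` is the inverse of `json.loads`: it turns Python objects back into a JSON string. The sketch below round-trips the same data and also shows the optional `indent` and `sort_keys` arguments, which only affect layout and key order.

```python
import json

python_data = [{"b": [2, 4], "c": 3.0, "a": "A"}]

compact = json.dumps(python_data)                           # single line
pretty = json.dumps(python_data, indent=2, sort_keys=True)  # readable layout
print(pretty)

# Decoding the encoded text gives back an equal Python object.
print(json.loads(compact) == python_data)
```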


<h2>requests library and JSON</h2>

Luckily for us, we don't even have to do this.
The **requests library** has a method that automatically loads a JSON string into Python.
For example, if we send a request to the Google geocoding API that we saw earlier, then instead of calling response.content.decode and json.loads ourselves, we can call the json method on the response and it will load the data for us, assuming of course that the response body is a proper JSON string.

In [15]:
address="Columbia University, New York, NY"
url="https://maps.googleapis.com/maps/api/geocode/json?address=%s" % (address)
response = requests.get(url).json()
print(type(response))

<class 'dict'>


<h3>Exception checking!</h3>

You should always be prepared for your code not to work.
You may be expecting a JSON object back, but the server may send you a malformed one, or the request itself may fail.
So always check for exceptions.
The code below checks each step in turn: the request, the status code, and the JSON decoding.

In [16]:
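address="Columbia University, New York, NY"
url="https://maps.googleapis.com/maps/api/geocode/json?address=%s" % (address)
response_data = None  # stays None if any step below fails
try:
    response = requests.get(url)
    if not response.status_code == 200:
        print("HTTP error",response.status_code)
    else:
        try:
            response_data = response.json()
        except ValueError:
            # response.json() raises a ValueError subclass on invalid JSON
            print("Response not in valid JSON format")
except requests.exceptions.RequestException:
    # covers connection errors, timeouts, invalid URLs, and similar failures
    print("Something went wrong with requests.get")
print(type(response_data))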

<class 'dict'>


So let's see what the URL looks like.

In [17]:
url

'https://maps.googleapis.com/maps/api/geocode/json?address=Columbia University, New York, NY'
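Note that the URL above contains raw spaces and commas. A more robust way to build it is to encode the query string explicitly. The sketch below uses the standard library's `urllib.parse.urlencode`; the variable name `base_url` is introduced here for illustration.

```python
from urllib.parse import urlencode

base_url = "https://maps.googleapis.com/maps/api/geocode/json"
address = "Columbia University, New York, NY"

# urlencode escapes spaces, commas, and other reserved characters.
query = urlencode({"address": address})
full_url = base_url + "?" + query
print(full_url)
```

With Requests, the same effect is available by passing a dictionary, as in `requests.get(base_url, params={"address": address})`, which encodes the parameters before sending.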

We get the response data, and we notice that it is of type dictionary.
So let's take a look at what this dictionary looks like.

In [18]:
response_data

{'results': [{'address_components': [{'long_name': '116th St',
     'short_name': '116th St',
     'types': ['route']},
    {'long_name': 'Manhattan',
     'short_name': 'Manhattan',
     'types': ['political', 'sublocality', 'sublocality_level_1']},
    {'long_name': 'New York',
     'short_name': 'New York',
     'types': ['locality', 'political']},
    {'long_name': 'New York County',
     'short_name': 'New York County',
     'types': ['administrative_area_level_2', 'political']},
    {'long_name': 'New York',
     'short_name': 'NY',
     'types': ['administrative_area_level_1', 'political']},
    {'long_name': 'United States',
     'short_name': 'US',
     'types': ['country', 'political']},
    {'long_name': '10027', 'short_name': '10027', 'types': ['postal_code']}],
   'formatted_address': '116th St & Broadway, New York, NY 10027, USA',
   'geometry': {'location': {'lat': 40.8075355, 'lng': -73.9625727},
    'location_type': 'GEOMETRIC_CENTER',
    'viewport': {'northeast': {'l

We got JSON back, but that doesn't mean that Google actually gave us the data we wanted.
If the lookup fails on Google's side, the server still sends back a JSON object, with the error described inside it.
In that case the **'status'** field will hold an error code instead of **"OK"**.
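A sketch of checking the status field before using the results. The helper name `extract_location` is hypothetical, the payloads are hand-written stand-ins shaped like the response shown above, and `"ZERO_RESULTS"` is used as an example error status on the assumption that it matches the geocoding API's documented codes.

```python
def extract_location(payload):
    """Return (lat, lng) from a geocoding-style payload, or None."""
    status = payload.get("status")
    if status != "OK":
        print("Geocoding failed with status:", status)
        return None
    results = payload.get("results", [])
    if not results:
        return None
    location = results[0]["geometry"]["location"]
    return (location["lat"], location["lng"])

# Hand-written sample payloads (not live API responses)
ok_payload = {
    "status": "OK",
    "results": [
        {"geometry": {"location": {"lat": 40.8075355, "lng": -73.9625727}}}
    ],
}
error_payload = {"status": "ZERO_RESULTS", "results": []}

print(extract_location(ok_payload))
print(extract_location(error_payload))
```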

<h2>Problem 1: Write a function that takes an address as an argument and returns a (latitude, longitude) tuple</h2>

In [19]:
response_data['results']

[{'address_components': [{'long_name': '116th St',
    'short_name': '116th St',
    'types': ['route']},
   {'long_name': 'Manhattan',
    'short_name': 'Manhattan',
    'types': ['political', 'sublocality', 'sublocality_level_1']},
   {'long_name': 'New York',
    'short_name': 'New York',
    'types': ['locality', 'political']},
   {'long_name': 'New York County',
    'short_name': 'New York County',
    'types': ['administrative_area_level_2', 'political']},
   {'long_name': 'New York',
    'short_name': 'NY',
    'types': ['administrative_area_level_1', 'political']},
   {'long_name': 'United States',
    'short_name': 'US',
    'types': ['country', 'political']},
   {'long_name': '10027', 'short_name': '10027', 'types': ['postal_code']}],
  'formatted_address': '116th St & Broadway, New York, NY 10027, USA',
  'geometry': {'location': {'lat': 40.8075355, 'lng': -73.9625727},
   'location_type': 'GEOMETRIC_CENTER',
   'viewport': {'northeast': {'lat': 40.8088844802915,
     'lng':

In [20]:
response_data['results'][0]

{'address_components': [{'long_name': '116th St',
   'short_name': '116th St',
   'types': ['route']},
  {'long_name': 'Manhattan',
   'short_name': 'Manhattan',
   'types': ['political', 'sublocality', 'sublocality_level_1']},
  {'long_name': 'New York',
   'short_name': 'New York',
   'types': ['locality', 'political']},
  {'long_name': 'New York County',
   'short_name': 'New York County',
   'types': ['administrative_area_level_2', 'political']},
  {'long_name': 'New York',
   'short_name': 'NY',
   'types': ['administrative_area_level_1', 'political']},
  {'long_name': 'United States',
   'short_name': 'US',
   'types': ['country', 'political']},
  {'long_name': '10027', 'short_name': '10027', 'types': ['postal_code']}],
 'formatted_address': '116th St & Broadway, New York, NY 10027, USA',
 'geometry': {'location': {'lat': 40.8075355, 'lng': -73.9625727},
  'location_type': 'GEOMETRIC_CENTER',
  'viewport': {'northeast': {'lat': 40.8088844802915,
    'lng': -73.96122371970849},
  

In [21]:
for thing in response_data['results'][0]:
    print(thing)

address_components
formatted_address
geometry
place_id
types


In [22]:
response_data['results'][0]['geometry']

{'location': {'lat': 40.8075355, 'lng': -73.9625727},
 'location_type': 'GEOMETRIC_CENTER',
 'viewport': {'northeast': {'lat': 40.8088844802915,
   'lng': -73.96122371970849},
  'southwest': {'lat': 40.8061865197085, 'lng': -73.9639216802915}}}

In [23]:
response_data['results'][0]['geometry']['location']

{'lat': 40.8075355, 'lng': -73.9625727}

In [24]:
def get_lat_lng(address):
    """Return (lat, lng) for address, or None if the lookup fails."""
    import requests, time

    url="https://maps.googleapis.com/maps/api/geocode/json?address=%s" % (address)
    response_data = None  # stays None if the request or decoding fails
    try:
        response = requests.get(url)
        if not response.status_code == 200:
            print("HTTP error",response.status_code)
        else:
            try:
                response_data = response.json()
            except ValueError:
                print("Response not in valid JSON format")
    except requests.exceptions.RequestException:
        print("Something went wrong with requests.get")
    time.sleep(1)  # pause between consecutive API calls
    try:
        location = response_data['results'][0]['geometry']['location']
        return (location['lat'], location['lng'])
    except (TypeError, KeyError, IndexError):
        # no response data, or no results for this address
        print("Try another one.")
        return None

In [27]:
get_lat_lng("Columbia University, New York, NY")

(40.8075355, -73.9625727)

In [28]:
get_lat_lng("Maidan Nezalezhnosti, Kyiv, Ukraine")

(50.4507781, 30.5236861)

<h2>Problem 2: Extend the function so that it takes a possibly incomplete address as an argument and returns a list of tuples of the form (complete address, latitude, longitude)</h2>

In [36]:
get_lat_lng("London")

(51.5073509, -0.1277583)

In [41]:
address="Lon"
url="https://maps.googleapis.com/maps/api/geocode/json?address=%s" % (address)
response_data = None  # stays None if any step below fails
try:
    response = requests.get(url)
    if not response.status_code == 200:
        print("HTTP error",response.status_code)
    else:
        try:
            response_data = response.json()
        except ValueError:
            print("Response not in valid JSON format")
except requests.exceptions.RequestException:
    print("Something went wrong with requests.get")
print(type(response_data))

<class 'dict'>


In [51]:
response_data['results'][1]['address_components'][0]['long_name']

'Lon'

In [55]:
propos_adr = []
for result in response_data['results']:
    # name of the first address component, plus the coordinates
    adr = result['address_components'][0]['long_name']
    lat = result['geometry']['location']['lat']
    lng = result['geometry']['location']['lng']
    propos_adr.append((adr,lat,lng))

propos_adr

[('London', 51.5073509, -0.1277583),
 ('Lon', 34.1500792, -105.123883),
 ('Lon', 37.1836603, -93.0593459)]
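The problem statement asks for the complete address, while the loop above stores only the first address component. A sketch of the same idea using the `formatted_address` field seen in the earlier response is shown below. The helper name `summarize_results` is hypothetical, and the second sample entry is a made-up placeholder, not real API output.

```python
def summarize_results(results):
    """Return a list of (formatted_address, lat, lng) tuples."""
    return [
        (
            r["formatted_address"],
            r["geometry"]["location"]["lat"],
            r["geometry"]["location"]["lng"],
        )
        for r in results
    ]

# Hand-written sample shaped like response_data['results']
sample_results = [
    {
        "formatted_address": "116th St & Broadway, New York, NY 10027, USA",
        "geometry": {"location": {"lat": 40.8075355, "lng": -73.9625727}},
    },
    {
        "formatted_address": "Placeholder Address",   # made-up entry
        "geometry": {"location": {"lat": 0.0, "lng": 0.0}},
    },
]

summary = summarize_results(sample_results)
for row in summary:
    print(row)
```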

In [61]:
def get_lat_lng_incompl(address):
    """Return a list of (name, lat, lng) tuples, one per match (empty on failure)."""
    import requests, time

    url="https://maps.googleapis.com/maps/api/geocode/json?address=%s" % (address)
    response_data = None  # stays None if the request or decoding fails
    try:
        response = requests.get(url)
        if not response.status_code == 200:
            print("HTTP error",response.status_code)
        else:
            try:
                response_data = response.json()
            except ValueError:
                print("Response not in valid JSON format")
    except requests.exceptions.RequestException:
        print("Something went wrong with requests.get")
    time.sleep(1)  # pause between consecutive API calls
    propos_adr = []
    try:
        for result in response_data['results']:
            adr = result['address_components'][0]['long_name']
            lat = result['geometry']['location']['lat']
            lng = result['geometry']['location']['lng']
            propos_adr.append((adr,lat,lng))
    except (TypeError, KeyError, IndexError):
        # no response data, or a result with an unexpected shape
        print("Try another one.")
    return propos_adr

In [62]:
get_lat_lng_incompl("Chi")

[('Chincoteague', 39.25757369999999, -76.7104279),
 ('Chitwood Hall', 39.6361068, -79.95463989999999)]

<h1>XML</h1>
<li>The python library - lxml - deals with converting an xml string to python objects and vice versa</li>

In [33]:
data_string = """
<Bookstore>
   <Book ISBN="ISBN-13:978-1599620787" Price="15.23" Weight="1.5">
      <Title>New York Deco</Title>
      <Authors>
         <Author Residence="New York City">
            <First_Name>Richard</First_Name>
            <Last_Name>Berenholtz</Last_Name>
         </Author>
      </Authors>
   </Book>
   <Book ISBN="ISBN-13:978-1579128562" Price="15.80">
      <Remark>
      Five Hundred Buildings of New York and over one million other books are available for Amazon Kindle.
      </Remark>
      <Title>Five Hundred Buildings of New York</Title>
      <Authors>
         <Author Residence="Beijing">
            <First_Name>Bill</First_Name>
            <Last_Name>Harris</Last_Name>
         </Author>
         <Author Residence="New York City">
            <First_Name>Jorg</First_Name>
            <Last_Name>Brockmann</Last_Name>
         </Author>
      </Authors>
   </Book>
</Bookstore>
"""

In [34]:
from lxml import etree
root = etree.XML(data_string)
print(root.tag,type(root.tag))

Bookstore <class 'str'>


In [35]:
print(etree.tostring(root, pretty_print=True).decode("utf-8"))

<Bookstore>
   <Book ISBN="ISBN-13:978-1599620787" Price="15.23" Weight="1.5">
      <Title>New York Deco</Title>
      <Authors>
         <Author Residence="New York City">
            <First_Name>Richard</First_Name>
            <Last_Name>Berenholtz</Last_Name>
         </Author>
      </Authors>
   </Book>
   <Book ISBN="ISBN-13:978-1579128562" Price="15.80">
      <Remark>
      Five Hundred Buildings of New York and over one million other books are available for Amazon Kindle.
      </Remark>
      <Title>Five Hundred Buildings of New York</Title>
      <Authors>
         <Author Residence="Beijing">
            <First_Name>Bill</First_Name>
            <Last_Name>Harris</Last_Name>
         </Author>
         <Author Residence="New York City">
            <First_Name>Jorg</First_Name>
            <Last_Name>Brockmann</Last_Name>
         </Author>
      </Authors>
   </Book>
</Bookstore>



<h3>Iterating over an XML tree</h3>
<li>Use an iterator. 
<li>The iterator will generate every tree element for a given subtree

In [63]:
for element in root.iter():
    print(element)

<Element Bookstore at 0x1d3a6253c08>
<Element Book at 0x1d3a61383c8>
<Element Title at 0x1d3a62bb8c8>
<Element Authors at 0x1d3a62bb4c8>
<Element Author at 0x1d3a61383c8>
<Element First_Name at 0x1d3a62bb8c8>
<Element Last_Name at 0x1d3a62bb4c8>
<Element Book at 0x1d3a62bbf08>
<Element Remark at 0x1d3a62bb8c8>
<Element Title at 0x1d3a61383c8>
<Element Authors at 0x1d3a62bbf08>
<Element Author at 0x1d3a62bb8c8>
<Element First_Name at 0x1d3a61383c8>
<Element Last_Name at 0x1d3a62bbf08>
<Element Author at 0x1d3a62bb4c8>
<Element First_Name at 0x1d3a62bb8c8>
<Element Last_Name at 0x1d3a61383c8>


<h4>Or just iterate over the direct children of a subtree</h4>

In [64]:
for child in root:
    print(child)

<Element Book at 0x1d3a628ee88>
<Element Book at 0x1d3a62b67c8>


<h4>Accessing the tag</h4>


In [65]:
for child in root:
    print(child.tag)

Book
Book


<h4>Using the iterator to get specific tags</h4>
<li>In the below example, only the author tags are accessed
<li>For each author tag, the .find function accesses the First_Name and Last_Name tags
<li>When given a plain tag name, the .find function only looks at the direct children, not other descendants, so be careful!
<li>The .text attribute holds the text in a leaf node

In [66]:
for element in root.iter("Author"):
    print(element.find('First_Name').text,element.find('Last_Name').text)

Richard Berenholtz
Bill Harris
Jorg Brockmann
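The warning about `.find` can be demonstrated on a single book. This sketch uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml, since both accept the same simple path syntax here.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    "<Book><Authors><Author>"
    "<First_Name>Richard</First_Name><Last_Name>Berenholtz</Last_Name>"
    "</Author></Authors></Book>"
)

# First_Name is not a direct child of Book, so a plain tag name finds nothing.
direct = book.find("First_Name")
# The .// prefix searches all descendants instead.
nested = book.find(".//First_Name")

print(direct)
print(nested.text)
```

Because `find` returns `None` when nothing matches, calling `.text` on the result without checking would raise an `AttributeError`.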


<h4>Problem: Find the last names of all authors in the tree “root” using xpath</h4>

In [67]:
for element in root.findall("Book/Title"):
    print(element.text)

New York Deco
Five Hundred Buildings of New York


In [70]:
for element in root.findall("Book/Authors/Author/Last_Name"):
    print(element.text)

Berenholtz
Harris
Brockmann


<h4>Using values of attributes as filters</h4>
<li>Example: Find the first name of the author of a book that weighs 1.5 oz

In [149]:
root.find('Book[@Weight="1.5"]/Authors/Author/First_Name').text

'Richard'
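Attribute predicates can also test for the mere presence of an attribute, and attribute values are read with `.get`. The sketch below again uses the standard library's `xml.etree.ElementTree` in place of lxml, with a shortened version of the bookstore document. The variable name `store` is introduced for illustration.

```python
import xml.etree.ElementTree as ET

store = ET.fromstring("""
<Bookstore>
   <Book Price="15.23" Weight="1.5"><Title>New York Deco</Title></Book>
   <Book Price="15.80"><Title>Five Hundred Buildings of New York</Title></Book>
</Bookstore>
""")

# [@Weight="1.5"] keeps only books whose Weight attribute equals "1.5".
title = store.find('Book[@Weight="1.5"]/Title')
print(title.text)

# [@Weight] with no value matches any book that has the attribute at all.
with_weight = store.findall("Book[@Weight]")
print(len(with_weight))

# Attributes are read with .get(); a missing attribute gives None.
for book in store.findall("Book"):
    print(book.find("Title").text, book.get("Price"), book.get("Weight"))
```

Attribute values are always strings, so a numeric comparison needs an explicit conversion such as `float(book.get("Price"))`.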

<h4>Problem: Print first and last names of all authors who live in New York City</h4>

In [213]:
# Select the matching Author elements once, then read both names from each
authors = root.findall('Book/Authors/Author[@Residence="New York City"]')
for author in authors:
    print(author.find('First_Name').text,
          author.find('Last_Name').text)

Richard Berenholtz
Jorg Brockmann
