# Getting Data - II

In this section, you will learn to:
- Get data from APIs  
- Read PDF files in python


### Getting Data from APIs

APIs, or application programming interfaces, are created by developers, companies and organisations to provide restricted access to data. It is very common to get data from APIs for data analysis, for example, you can get financial data (stock prices etc.), social media data (Facebook, Twitter etc. provide APIs), weather data, data about healthcare, music, food and drinks, and from almost every domain. 


Apart from being rich sources of data, there are other reasons to use APIs:
- When the data is being updated in real time. If you use downloaded CVS files, you'll have to download data manually, and update your analysis multiple times. Through APIs, you can automate the process of getting real-time data.
- Easy access to structured and verified data - though you can scrape websites, APIs can directly provide data in structured format, and is of better quality
- Access to restricted data: You cannot scrape all websites easily, and that's often illegal (e.g. Facebook, financial data etc.). APIs are the only way to get this data.

There are many more reasons depending on the use cases and the domain of application.

A list of useful APIs is available here: https://github.com/toddmotto/public-apis

#### Example Use Case: Google Maps Geocoding API

Google Maps provides many APIs, one of which is the <a href="https://developers.google.com/maps/documentation/geocoding/start?authuser=1">Google Maps Geocoding API</a>. You can use it to geocode addresses, i.e. get the latitude-longitude coordinates, and vice-versa. 
    
To use the API, go to <a href="https://developers.google.com/maps/">Google Developers</a>, get an API key, and go to the Geocoding API page.


Once you have an API key, getting the geocoded data of an address is easy. For e.g., if you want to geocode the address "UpGrad, Nishuvi building, Anne Besant Road, Worli, Mumbai", you need to separate the words using a "+", and provide the address and your API key in this format:

https://maps.googleapis.com/maps/api/geocode/json?address=UpGrad,+Nishuvi+building,+Anne+Besant+Road,+Worli,+Mumbai&key=YOUR_API_KEY


Thus, this is a two step process:
- Join the words in the address by a plus and convert it to a form ```words+in+the+address``` 
- Connect to the URL by appending the address and the API key
- Get a response from the API and convert it to a python object (here, a dictionary)


In [33]:
import numpy as np
import pandas as pd

# Need requests to connect to the URL, json to convert JSON to dict
import requests, json
import pprint

# joining words in the address by a "+"
add = "UpGrad, Nishuvi building, Anne Besant Road, Worli, Mumbai"
split_address = add.split(" ")
address = "+".join(split_address)
print(address)



UpGrad,+Nishuvi+building,+Anne+Besant+Road,+Worli,+Mumbai


Now, we can connect to the Google Maps URL using the api key and the address and get a response. Like most APIs, Google Maps returns the geocoded data in a JSON format, which is similar to a python dict.

As seen in the earlier section, we use the ```requests.get(url)``` method to get data from a URL. 

In [36]:
api_key = "AIzaSyBXrK8md7uaOcpRpaluEGZAtdXS4pcI5xo"

url = "https://maps.googleapis.com/maps/api/geocode/json?address={0}&key={1}".format(address, api_key)
r = requests.get(url)

print(type(r.text))
print(r.text)

<class 'str'>
{
   "results" : [
      {
         "address_components" : [
            {
               "long_name" : "75",
               "short_name" : "75",
               "types" : [ "street_number" ]
            },
            {
               "long_name" : "Doctor Annie Besant Road",
               "short_name" : "Dr Annie Besant Rd",
               "types" : [ "route" ]
            },
            {
               "long_name" : "Bhim Nagar",
               "short_name" : "Bhim Nagar",
               "types" : [ "political", "sublocality", "sublocality_level_2" ]
            },
            {
               "long_name" : "Worli",
               "short_name" : "Worli",
               "types" : [ "political", "sublocality", "sublocality_level_1" ]
            },
            {
               "long_name" : "Mumbai",
               "short_name" : "Mumbai",
               "types" : [ "locality", "political" ]
            },
            {
               "long_name" : "Mumbai",
           

The dict-like structure that you see above is a JSON object, and is the most common way of exchanging data through APIs. We can easily convert the JSON object to a python dict using ```json.loads(json_object)```.

Notice that it returns the components of the address, the full address, the latitude and the longitude, PIN code, etc. 

In [40]:
# converting the json object to a dict using json.loads()
r_dict = json.loads(r.text)

# the pretty printing library pprint makes it easy to read long dicts
pprint.pprint(r_dict)

{'results': [{'address_components': [{'long_name': '75',
                                      'short_name': '75',
                                      'types': ['street_number']},
                                     {'long_name': 'Doctor Annie Besant Road',
                                      'short_name': 'Dr Annie Besant Rd',
                                      'types': ['route']},
                                     {'long_name': 'Bhim Nagar',
                                      'short_name': 'Bhim Nagar',
                                      'types': ['political',
                                                'sublocality',
                                                'sublocality_level_2']},
                                     {'long_name': 'Worli',
                                      'short_name': 'Worli',
                                      'types': ['political',
                                                'sublocality',
                                 

In [42]:
r_dict.keys()
type(r_dict['results'])
r_dict['results'][0]

{'address_components': [{'long_name': '75',
   'short_name': '75',
   'types': ['street_number']},
  {'long_name': 'Doctor Annie Besant Road',
   'short_name': 'Dr Annie Besant Rd',
   'types': ['route']},
  {'long_name': 'Bhim Nagar',
   'short_name': 'Bhim Nagar',
   'types': ['political', 'sublocality', 'sublocality_level_2']},
  {'long_name': 'Worli',
   'short_name': 'Worli',
   'types': ['political', 'sublocality', 'sublocality_level_1']},
  {'long_name': 'Mumbai',
   'short_name': 'Mumbai',
   'types': ['locality', 'political']},
  {'long_name': 'Mumbai',
   'short_name': 'Mumbai',
   'types': ['administrative_area_level_2', 'political']},
  {'long_name': 'Maharashtra',
   'short_name': 'MH',
   'types': ['administrative_area_level_1', 'political']},
  {'long_name': 'India',
   'short_name': 'IN',
   'types': ['country', 'political']},
  {'long_name': '400018', 'short_name': '400018', 'types': ['postal_code']}],
 'formatted_address': 'Ground Floor, Nishuvi Building, 75, Dr Annie