# Asynchronous Requests

If we want to make many requests, we could simply use a for loop. However, if we have thousands to send, this can take a long time, because each request has to wait for the response to the previous one before it can be sent. Sending the requests asynchronously lets several of them wait for their responses at the same time.
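
To make the cost of waiting concrete, here is a minimal sketch that simulates a slow API with `time.sleep` instead of real network calls. The `fake_request` helper, the 0.2 second delay and the function names are assumptions made for illustration; the thread pool comes from the standard library's `concurrent.futures`.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.2  # assumed response time of the simulated API, in seconds


def fake_request(i):
    """Hypothetical stand-in for a network call: sleeps, then echoes its input."""
    time.sleep(DELAY)
    return i


def run_sequential(n):
    # each call blocks until the previous one has finished
    start = time.perf_counter()
    results = [fake_request(i) for i in range(n)]
    return results, time.perf_counter() - start


def run_concurrent(n, max_workers=8):
    # all calls are handed to the pool up front and wait in parallel
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fake_request, range(n)))
    return results, time.perf_counter() - start


seq_results, seq_time = run_sequential(8)
con_results, con_time = run_concurrent(8)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

Both versions return the same results; only the total waiting time differs, since the sequential version pays the delay once per request.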


In [1]:
from requests_futures.sessions import FuturesSession

In [2]:
session = FuturesSession()
# first request is started in background
future_one = session.get('http://httpbin.org/get')
# second request is started immediately
future_two = session.get('http://httpbin.org/get?foo=bar')

In [3]:
# wait for the first request to complete, if it hasn't already
r = future_one.result()

In [4]:
r.status_code == 200

True

## Making Requests

Let's write a function that takes a session and an address and makes a request for us.

In [5]:
import numpy as np
import pandas as pd

In [7]:
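def make_request(session, address):
    """Start a background lookup for one address and return its future."""
    # request payload: the address to look up
    data = {"q": address, "n": 1}
    headers = {"Accept": "application/json"}
    api_url = "https://www.als.ogcio.gov.hk/lookup"
    future = session.post(api_url, data=data, headers=headers)
    return future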

Since the API is quite slow, this will take some time, so below is a function that prints the percentage progress.

In [13]:
import sys
import time

def print_progress(futures):
    """Print the percentage of completed futures once a second until all are done."""
    # vectorized check so it can be applied to a NumPy array of futures
    check_done = np.vectorize(lambda x: x.done())

    # basic percentage progress
    while not check_done(futures).all():
        time.sleep(1)
        percent = check_done(futures).mean() * 100
        sys.stdout.write("\r%d%%" % percent)
        sys.stdout.flush()
    print("\n")
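
As an alternative to polling once a second, the standard library's `concurrent.futures.as_completed` yields each future as soon as it finishes. The sketch below is one possible variant, not the function used in this notebook: `progress_percentages` and `slow_task` are hypothetical names, and a plain `ThreadPoolExecutor` stands in for the `FuturesSession`.

```python
import sys
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def progress_percentages(futures):
    """Yield the percentage completed each time another future finishes."""
    total = len(futures)
    for n_done, _ in enumerate(as_completed(futures), start=1):
        yield 100 * n_done // total


def slow_task(i):
    # hypothetical stand-in for a slow API call
    time.sleep(0.05 * (i % 3))
    return i


with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(slow_task, i) for i in range(10)]
    seen = []
    for percent in progress_percentages(futures):
        seen.append(percent)
        sys.stdout.write("\r%d%%" % percent)
        sys.stdout.flush()
print()
```

This version needs no NumPy and updates the display the moment a request completes, but it blocks inside the loop rather than checking at a fixed interval.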


We can now read in the OpenRice CSV and make a request for each unique address.

In [6]:
df = pd.read_csv("data/open-rice.csv")

In [16]:
%%time
# create a session with 16 workers
session = FuturesSession(max_workers=16)
# make all of the requests
futures = np.array([make_request(session, address) for address in df.address.unique()])
print_progress(futures)

100%

CPU times: user 2min 51s, sys: 8.95 s, total: 3min
Wall time: 38min 13s


It took nearly 40 minutes even with async requests.

## Parsing Response

Now that all of the requests have been made, we can parse the responses to get the JSON.

In [43]:
json_results = np.vectorize(lambda x: x.result().json())(futures)

In [55]:
json_results[0]

{'RequestAddress': {'AddressLine': ['Shop J-K., 200 Hollywood Road,']},
 'SuggestedAddress': [{'Address': {'PremisesAddress': {'EngPremisesAddress': {'BuildingName': 'KEE ON BUILDING',
      'EngStreet': {'StreetName': 'HOLLYWOOD ROAD', 'BuildingNoFrom': '200'},
      'EngDistrict': {'DcDistrict': 'CW'},
      'Region': 'HK'},
     'ChiPremisesAddress': {'Region': '香港',
      'ChiDistrict': {'DcDistrict': 'CW'},
      'ChiStreet': {'StreetName': '荷李活道', 'BuildingNoFrom': '200'},
      'BuildingName': '祺安大廈'},
     'GeospatialInformation': [{'Northing': '816264',
       'Easting': '833279',
       'Latitude': '22.2852',
       'Longitude': '114.1478'},
      {'Northing': '816264',
       'Easting': '833281',
       'Latitude': '22.2852',
       'Longitude': '114.1478'}]}},
   'ValidationInformation': {'ValidationTime': None}}]}
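
The coordinates sit several levels deep in each result. Assuming every response follows the layout of the example above, a small helper can pull out the first suggested coordinate pair. `extract_coordinates` is a hypothetical name, and returning `None` when something is missing is a choice made for this sketch rather than anything the API specifies.

```python
def extract_coordinates(result):
    """Return (latitude, longitude) of the first suggestion, or None if absent."""
    try:
        premises = result["SuggestedAddress"][0]["Address"]["PremisesAddress"]
        geo = premises["GeospatialInformation"][0]
        return float(geo["Latitude"]), float(geo["Longitude"])
    except (KeyError, IndexError, TypeError, ValueError):
        # the response did not have the assumed structure
        return None


# trimmed copy of the example response shown above
sample = {
    "SuggestedAddress": [
        {"Address": {"PremisesAddress": {
            "GeospatialInformation": [
                {"Northing": "816264", "Easting": "833279",
                 "Latitude": "22.2852", "Longitude": "114.1478"},
            ],
        }}}
    ]
}

coords = extract_coordinates(sample)
print(coords)
```

Applied to the real data, this could be mapped over `json_results`, with the `None` entries marking addresses that need a closer look.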

We'll write this JSON to disk for future use.

In [59]:
import json

# each result is encoded as a JSON string, then the list of strings is encoded again
result = [json.dumps(res) for res in json_results]
result = json.dumps(result)
with open("data/openrice_addresses.json", "w") as f:
    f.write(result)
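
Because each result is passed through `json.dumps` and the resulting list of strings is serialized again, the file holds a JSON array of JSON strings, so reading it back takes two decoding steps. The round trip below is a sketch on made-up data written to a temporary directory; `sample_results` and the file name are placeholders, not the real output.

```python
import json
import os
import tempfile

# made-up stand-in for json_results
sample_results = [{"Latitude": "22.2852"}, {"Latitude": "22.2800"}]

# same two-step encoding as in the cell above
encoded = json.dumps([json.dumps(r) for r in sample_results])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "addresses.json")
    with open(path, "w") as f:
        f.write(encoded)

    with open(path) as f:
        outer = json.load(f)  # list of JSON strings
    decoded = [json.loads(s) for s in outer]  # list of dicts

print(decoded[0])
```

Writing the dictionaries with a single `json.dump` call would remove the second decoding step, at the cost of changing the file format.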