In [51]:
import requests
import altair as alt
import pandas as pd
import httpx
import time
import asyncio

## Making Multiple Simultaneous Web Requests: Trying Out Async Coding with HTTPX

### Motivation
For the course management project, some tasks require several requests in a row to the Zoom API, and it would be great if we could speed that up!  Because the program spends most of that time waiting on the network (an I/O-bound bottleneck), sending the requests concurrently should help.

### Approach
Here, I'm trying out `httpx`, a much-hyped Python library that aims to be as easy to use as `requests` and keeps a very similar interface, while adding two things `requests` lacks: HTTP/2 support (opt-in; the code here uses the default HTTP/1.1) and asynchronous requests.  I'll make batches of different sizes and see how the total time per batch changes.  A good solution takes roughly the same amount of time to make 8 requests as to make a single request.


### Standard Synchronous Requests

In [111]:
measurements = []

for rep in range(1, 4):
    for n in [1, 2, 3, 5, 8]:
        start_time = time.perf_counter()
        for el in range(n):
            r = requests.get('http://www.example.com')
            r.raise_for_status()
        end_time = time.perf_counter()
        duration = end_time - start_time
        measurement = {'Num Requests': n, 'Rep': rep, 'Total Time': duration}
        measurements.append(measurement)

df_sync = pd.DataFrame(measurements)
alt.Chart(data=df_sync).mark_line().encode(x='Num Requests', y='Total Time', color='Rep:N')

As expected, more requests means more waiting time.

### HTTPX Written Badly: Synchronous Code Written in an Asynchronous Style

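If you `await` every single asynchronous operation immediately, there should be no concurrency benefit: each request has to finish before the next one is even sent, so the code behaves just like the synchronous version.  It's an easy mistake to make, so it's demonstrated here.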

In [108]:
measurements = []

async with httpx.AsyncClient() as client:
    for rep in range(1, 4):
        for n in [1, 2, 3, 5, 8]:
            start_time = time.perf_counter()
            for el in range(n):
                r = await client.get('http://www.example.com')
                r.raise_for_status()
            end_time = time.perf_counter()
            duration = end_time - start_time
            measurement = {'Num Requests': n, 'Rep': rep, 'Total Time': duration}
            measurements.append(measurement)


df_async = pd.DataFrame(measurements)
alt.Chart(data=df_async).mark_line().encode(x='Num Requests', y='Total Time', color='Rep:N')


As expected, the total time still grows with the number of requests.
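
The same effect can be reproduced without any network traffic. The sketch below is only an illustration, not part of the measurements above: `fake_request` is a hypothetical stand-in for `client.get()` that uses `asyncio.sleep` to simulate a round trip, and the 50 ms delay is an arbitrary assumption. Awaiting each call in turn should take roughly `n` times the delay, while handing all the calls to `asyncio.gather()` should take roughly one delay.

```python
import asyncio
import time

DELAY = 0.05  # assumed round-trip time in seconds (arbitrary choice)


async def fake_request(i):
    """Hypothetical stand-in for client.get(): wait, then return the index."""
    await asyncio.sleep(DELAY)
    return i


async def sequential(n):
    """Await each request immediately, so they run one after another."""
    start = time.perf_counter()
    results = [await fake_request(i) for i in range(n)]
    return results, time.perf_counter() - start


async def concurrent(n):
    """Create all the coroutines first, then await them as a group."""
    start = time.perf_counter()
    results = await asyncio.gather(*(fake_request(i) for i in range(n)))
    return list(results), time.perf_counter() - start


def run_demo(n=8):
    """Run both variants from a plain script and return results and timings."""
    seq_results, seq_time = asyncio.run(sequential(n))
    con_results, con_time = asyncio.run(concurrent(n))
    return seq_results, seq_time, con_results, con_time
```

In a notebook, where an event loop is already running, `await sequential(8)` and `await concurrent(8)` would replace the `asyncio.run(...)` calls.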

### HTTPX with Async Written Properly
Here, all the requests for a batch are created first, and then the whole group is awaited together (using the `asyncio.gather()` function).  Calling `client.get()` without `await` only creates a coroutine; nothing is sent until `asyncio.gather()` schedules the whole batch.  This means all the requests have a chance to go out before the program waits for any of the responses.

In [106]:
measurements = []

async with httpx.AsyncClient() as client:
    for rep in range(1, 4):
        for n in [1, 2, 3, 5, 8]:
            start_time = time.perf_counter()
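            # Calling client.get() without await creates a coroutine; nothing is sent yet.
            tasks = [client.get('http://www.example.com') for _ in range(n)]
            # gather() runs the coroutines concurrently and returns the responses in order.
            responses = await asyncio.gather(*tasks)
            # Check every response, as the other two versions do.
            for r in responses:
                r.raise_for_status()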
            end_time = time.perf_counter()
            duration = end_time - start_time
            measurement = {'Num Requests': n, 'Rep': rep, 'Total Time': duration}
            measurements.append(measurement)


df_gather = pd.DataFrame(measurements)
alt.Chart(data=df_gather).mark_line().encode(x='Num Requests', y='Total Time', color='Rep:N')

Success!  The waiting time stays about the same, even as the number of requests increases!
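
Two details of `asyncio.gather()` matter once the responses are actually used: results come back in the order the awaitables were passed in, not the order they finished, and by default the first exception propagates to the caller. The sketch below illustrates both, assuming one wants to collect failures per request rather than abort the batch. `fake_request` and its `fail` flag are hypothetical stand-ins for real HTTP calls, and the delays are arbitrary.

```python
import asyncio


async def fake_request(i, fail=False):
    """Hypothetical stand-in for client.get(); later requests finish sooner."""
    await asyncio.sleep(0.01 * (3 - i))
    if fail:
        raise RuntimeError(f"request {i} failed")
    return f"response {i}"


async def fetch_batch():
    coros = [fake_request(0), fake_request(1, fail=True), fake_request(2)]
    # return_exceptions=True places each exception in the result list
    # instead of raising the first one to the caller.
    return await asyncio.gather(*coros, return_exceptions=True)


results = asyncio.run(fetch_batch())

# Results keep the input order, so each one can be matched to its request.
successes = [r for r in results if not isinstance(r, Exception)]
failures = [r for r in results if isinstance(r, Exception)]
```

With real `httpx` responses, a call to `raise_for_status()` on each successful result would still be needed to catch HTTP error codes, since those are not raised automatically.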

## Conclusions

  - This approach should significantly speed up the Zoom requests.
  - The `httpx` library was really easy to use.