# Make your requests faster

When you start scraping web pages or querying APIs, you quickly run into a problem: making a lot of requests is really slow!

"It's because Python is slow!", you could say. Well, it doesn't have to be. Let's see how we can speed things up!

## Measure performance

In this notebook we will need to track how long some code takes to execute.
To make things easier, we will create a simple decorator that prints the number of seconds a function takes to execute.

A good opportunity to practice decorators in a practical example!

*Note that you need Python 3.6 or higher (`time.perf_counter` requires 3.3, f-strings require 3.6).*

In [1]:
import functools
import time


def print_timing(func):
    '''Timing decorator: put @print_timing just above the function you want to time.'''

    @functools.wraps(func)  # keep the decorated function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()

        # Run the decorated function (positional and keyword arguments are passed through)
        result = func(*args, **kwargs)

        end = time.perf_counter()
        execution_time = round((end - start), 2)
        print(f'{func.__name__} took {execution_time} sec')
        return result

    return wrapper


@print_timing
def example():
    time.sleep(2)


example()

example took 2.0 sec


## The API

For this example, we will use the [catfact.ninja](https://catfact.ninja) API. It's an online API you can use to get a random cat fact.

But feel free to replace the `api_url` value with any API you'd like (you will then need to adapt the `"fact"` key used in the code below).

In [2]:
api_url = "https://catfact.ninja/fact"

## The "classic" way

If you are just starting to play with requests, you will probably have something like this:

In [3]:
import requests


def basic_request(url: str):
    response = requests.get(url)
    response_json = response.json()
    print(response_json["fact"])


@print_timing
def basic_loop_request(url: str):
    # Query the API 50 times
    for _ in range(50):
        basic_request(url)


basic_loop_request(api_url)

The cat's front paw has 5 toes, but the back paws have 4. Some cats are born with as many as 7 front toes and extra back toes (polydactl).
Cats lose almost as much fluid in the saliva while grooming themselves as they do through urination.
Cats walk on their toes.
A cat rubs against people not only to be affectionate but also to mark out its territory with scent glands around its face. The tail area and paws also carry the cat’s scent.
In the original Italian version of Cinderella, the benevolent fairy godmother figure was a cat.
Normal body temperature for a cat is 102 degrees F.
In ancient Egypt, mummies were made of cats, and embalmed mice were placed with them in their tombs. In one ancient city, over 300,000 cat mummies were found.
In one stride, a cheetah can cover 23 to 26 feet (7 to 8 meters).
Cats have individual preferences for scratching surfaces and angles. Some are horizontal scratchers while others exercise their claws vertically.
Every year, nearly four million cats are 

### Results

On my machine it took **9.31 sec for 50 requests**. 

Pretty slow right? But why is that?

Each time you make a request with `requests.get`, your computer needs to create a new "session" (which means opening a new connection to the server), format your request, send it and wait to receive the response before doing it all again for the next request.

## The "session" way

To speed this up, we can use a **"session"** that will be shared by all the requests.

You can picture it as a postman who already knows you: he knows which bell to ring, where the mailbox is, and so on, instead of having to look for them each time.
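
To make the postman picture concrete, here is a minimal sketch that counts how many connections a server actually sees. It is an illustration under assumptions, not how `requests` is implemented: it uses only the standard library (`http.client` talking to a throwaway local `http.server`), and the name `count_connections` and the placeholder response body are made up for this example. A `requests.Session` gives you the same kind of connection reuse through its connection pool.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def count_connections(reuse: bool, n: int = 5) -> int:
    """Send n GET requests to a local test server and return how many
    TCP connections the server had to accept (hypothetical helper)."""
    opened = []  # one entry per connection accepted by the server

    class CountingHandler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # lets the server keep connections alive

        def setup(self):
            # setup() runs once per accepted connection, not once per request
            super().setup()
            opened.append(self.client_address)

        def do_GET(self):
            body = b'{"fact": "placeholder"}'
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the demo output quiet

    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        if reuse:
            # One connection shared by all the requests (the "session" idea)
            conn = http.client.HTTPConnection(host, port)
            for _ in range(n):
                conn.request("GET", "/fact")
                conn.getresponse().read()
            conn.close()
        else:
            # A brand new connection for every single request
            for _ in range(n):
                conn = http.client.HTTPConnection(host, port)
                conn.request("GET", "/fact")
                conn.getresponse().read()
                conn.close()
        return len(opened)
    finally:
        server.shutdown()
        server.server_close()


print("connections without reuse:", count_connections(reuse=False))
print("connections with reuse:", count_connections(reuse=True))
```

Without reuse, the server has to accept one connection per request; with reuse, a single connection serves all of them. On a real remote API each of those extra connections costs a network round trip (plus a TLS handshake over HTTPS), which is where the time goes.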

In [None]:
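# Basic usage of a session
from requests import Session

# Create one Session object and reuse it every time you reach the same site
session = Session()

for _ in range(5):
    # Note: a session is not tied to a single URL. It is a client-side state
    # container (TCP connection reuse, authentication, ...). Requests to the same
    # root URL benefit from it; other root URLs still work, just without the speed-up.
    response = session.get("xxx")
    data = response.json()
    print(data["fact"])

# Remember to close the session, or use a `with` statement to close it automatically
session.close()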

In [None]:
from requests import Session


# Note: `session: Session` is only a type annotation. It says the second argument
# should be a Session, just like `url: str` says the first one should be a str.
def session_request(url: str, session: Session):
    # Instead of using requests.get, we use our session
    response = session.get(url)
    response_json = response.json()
    print(response_json["fact"])


@print_timing
def session_loop_request(url: str):
    # Create shared session for all of your requests
    with Session() as session:  # Note: the session instance is created here!
        # Query the API 50 times
        for _ in range(50):
            session_request(url, session)


session_loop_request(api_url)

Cats are subject to gum disease and to dental caries. They should have their teeth cleaned by the vet or the cat dentist once a year.
Not every cat gets \high\" from catnip. Whether or not a cat responds to it depends upon a recessive gene: no gene"
Cat paws act as tempetature regulators, shock absorbers, hunting and grooming tools, sensors, and more
Cats have 3 eyelids.
The oldest cat on record was Crème Puff from Austin, Texas, who lived from 1967 to August 6, 2005, three days after her 38th birthday. A cat typically can live up to 20 years, which is equivalent to about 96 human years.
The ancestor of all domestic cats is the African Wild Cat which still exists today.
Miacis, the primitive ancestor of cats, was a small, tree-living creature of the late Eocene period, some 45 to 50 million years ago.
The cheetah is the world's fastest land mammal. It can run at speeds of up to 70 miles an hour (113 kilometers an hour).
A cat can travel at a top speed of approximately 31 mph (49 km) ov

### Results

It took me **6.19 sec for 50 requests**. That's better!

And as you can see, I didn't change that much in the code.

## The "Async" way

If you need even more performance, you will need to use [asyncio](https://docs.python.org/3/library/asyncio.html).

It is a standard-library module that lets you run asynchronous code.

Why is that more efficient? Well, when you send a request you need to wait for the response. And during that waiting time, your computer does nothing.
If you add up all the time the computer spends just "waiting" over 50 or more requests, you will be surprised to see that most of the run time is spent waiting for the server to respond.

[asyncio](https://docs.python.org/3/library/asyncio.html) allows you to bypass that: while one request is waiting for its response, the other requests can be sent.
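
Here is a small sketch of that idea that you can run without any network. It is only a simulation: `fake_request` is a made-up stand-in that "waits for the server" with `asyncio.sleep`, and the 0.05 second delay is an arbitrary assumption. It compares waiting for each request in turn with letting all the waits overlap.

```python
import asyncio
import time


async def fake_request(delay: float = 0.05) -> str:
    # Stand-in for a real HTTP call: the program just waits, like it would for a server
    await asyncio.sleep(delay)
    return "fact"


async def run_sequential(n: int) -> list:
    # Wait for each response before sending the next request
    return [await fake_request() for _ in range(n)]


async def run_concurrent(n: int) -> list:
    # Start all the requests, then wait for all of them together
    return await asyncio.gather(*(fake_request() for _ in range(n)))


def timed(coro_func, n: int):
    start = time.perf_counter()
    results = asyncio.run(coro_func(n))
    return time.perf_counter() - start, results


sequential_time, sequential_results = timed(run_sequential, 20)
concurrent_time, concurrent_results = timed(run_concurrent, 20)

print(f"sequential: {sequential_time:.2f} sec")
print(f"concurrent: {concurrent_time:.2f} sec")
```

The sequential version pays every delay one after the other (about 20 x 0.05 sec), while the concurrent version pays roughly a single delay because all the waits happen at the same time.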

But as always, it has a cost: complexity.

Making your code async will complicate it a lot and make debugging an unpleasant experience. Also, you will go so fast that you could get banned by the server.

My advice? Use it only if you need it.
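
If you do need it, one common way to avoid hammering the server is to cap how many requests are in flight at once with an `asyncio.Semaphore`. The sketch below is a simulation under assumptions: `MAX_CONCURRENCY = 5` is an arbitrary limit (check what the API you use tolerates), `limited_request` is a made-up name, and `asyncio.sleep` stands in for the real `await session.get(url)` call.

```python
import asyncio

MAX_CONCURRENCY = 5  # hypothetical limit, adapt it to the API you are calling


async def limited_request(i: int, semaphore: asyncio.Semaphore, stats: dict) -> int:
    # Only MAX_CONCURRENCY coroutines can be inside this block at the same time
    async with semaphore:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # stand-in for the real HTTP request
        stats["active"] -= 1
    return i


async def main(n: int = 50):
    semaphore = asyncio.Semaphore(MAX_CONCURRENCY)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(limited_request(i, semaphore, stats) for i in range(n))
    )
    return results, stats["peak"]


results, peak = asyncio.run(main())
print(f"{len(results)} requests done, at most {peak} running at the same time")
```

You trade a bit of speed for being a better citizen: the 50 requests still overlap, but never more than `MAX_CONCURRENCY` at a time.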

I will show you a simple example, but if you want to understand it better, I really recommend **[this video](https://www.youtube.com/watch?v=qAh5dDODJ5k)**!

### Requirements
To simplify the code a bit, I will use [httpx](https://www.python-httpx.org/), a Python library that works much like the `requests` module but with built-in support for async.

In [5]:
!pip install httpx

Collecting httpx
  Using cached httpx-0.28.1-py3-none-any.whl.metadata (7.1 kB)
Collecting anyio (from httpx)
  Using cached anyio-4.9.0-py3-none-any.whl.metadata (4.7 kB)
Collecting httpcore==1.* (from httpx)
  Using cached httpcore-1.0.7-py3-none-any.whl.metadata (21 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx)
  Using cached h11-0.14.0-py3-none-any.whl.metadata (8.2 kB)
Collecting sniffio>=1.1 (from anyio->httpx)
  Using cached sniffio-1.3.1-py3-none-any.whl.metadata (3.9 kB)
Using cached httpx-0.28.1-py3-none-any.whl (73 kB)
Using cached httpcore-1.0.7-py3-none-any.whl (78 kB)
Using cached anyio-4.9.0-py3-none-any.whl (100 kB)
Using cached h11-0.14.0-py3-none-any.whl (58 kB)
Using cached sniffio-1.3.1-py3-none-any.whl (10 kB)
Installing collected packages: sniffio, h11, httpcore, anyio, httpx
Successfully installed anyio-4.9.0 h11-0.14.0 httpcore-1.0.7 httpx-0.28.1 sniffio-1.3.1


### Warning!
This code won't work as-is in a Jupyter notebook: there are subtleties with async in Jupyter, because the notebook already runs its own event loop and `asyncio.run` refuses to start a second one. See [this thread](https://stackoverflow.com/questions/47518874/how-do-i-run-python-asyncio-code-in-a-jupyter-notebook) for more information.
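
As a rough illustration of that subtlety, the sketch below checks whether an event loop is already running. The helper name `has_running_loop` is made up for this example. In a plain `.py` script no loop is running at the top level, so `asyncio.run` works; inside a coroutine (or in a notebook cell, where a loop is already running) it would raise a `RuntimeError`, and in a notebook you would `await` the coroutine directly instead.

```python
import asyncio


def has_running_loop() -> bool:
    # asyncio.get_running_loop() raises RuntimeError when no loop is running
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


async def check_from_coroutine() -> bool:
    # Inside a coroutine started by asyncio.run, a loop is necessarily running
    return has_running_loop()


outside = has_running_loop()
inside = asyncio.run(check_from_coroutine())

print(f"loop running at script top level: {outside}")
print(f"loop running inside a coroutine: {inside}")
```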

To keep things simple, I will put this code in a .py file and run it from the command line:

```python
from httpx import AsyncClient
import asyncio
import time


api_url = "https://catfact.ninja/fact"


async def session_request_async(url: str, session: AsyncClient):
    # Instead of using requests.get, we await our shared async client
    response = await session.get(url)
    response_json = response.json()
    print(response_json["fact"])
    return response_json


async def session_loop_request_async(url: str):
    # Create a shared client (session) for all of your requests
    async with AsyncClient() as session:
        # Start with an empty list of tasks
        tasks = []
        # Query the API 50 times
        for _ in range(50):
            # Schedule a request and keep track of its task
            tasks.append(
                asyncio.create_task(
                    session_request_async(url, session)
                )
            )
        # Now that all the tasks are registered, wait for all of them to finish
        responses = await asyncio.gather(*tasks)
        return responses


start = time.perf_counter()

# We need to use asyncio.run to run the async function
asyncio.run(session_loop_request_async(api_url))

end = time.perf_counter()
execution_time = round((end - start), 2)
print(f'session_loop_request_async took {execution_time} sec')
```


In [6]:
!python ./assets/async_requests.py

The claws on the cat’s back paws aren’t as sharp as the claws on the front paws because the claws in the back don’t retract and, consequently, become worn.
Two members of the cat family are distinct from all others: the clouded leopard and the cheetah. The clouded leopard does not roar like other big cats, nor does it groom or rest like small cats. The cheetah is unique because it is a running cat; all others are leaping cats. They are leaping cats because they slowly stalk their prey and then leap on it.
Cats are the world's most popular pets, outnumbering dogs by as many as three to one
Unlike humans, cats cannot detect sweetness which likely explains why they are not drawn to it at all.
Unlike dogs, cats do not have a sweet tooth. Scientists believe this is due to a mutation in a key taste receptor.
In multi-cat households, cats of the opposite sex usually get along better.
Cats are now Britain's favourite pet: there are 7.7 million cats as opposed to 6.6 million dogs.
The group of 

### Results
It only took me **0.8 sec for 50 requests**! That's impressive.

But as you can see, it is harder to write, structure and debug. So make sure you **really** need it if you consider using this method.

## Summary

If we gather all our results:

| Method                     | Execution time for 50 requests |
|----------------------------|--------------------------------|
| `requests.get` loop        | 9.31 sec                       |
| `requests` with `Session`  | 6.19 sec                       |
| `httpx` with `AsyncClient` | 0.8 sec                        |
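
To put those numbers in perspective, here is a quick calculation of the speed-up of each method relative to the basic loop. It uses the timings reported in each Results section above (6.19 sec for the `Session` run); they come from one machine and one run, so treat the ratios as orders of magnitude rather than benchmarks.

```python
# Timings (in seconds) for 50 requests, as reported in the Results sections
timings = {
    "requests.get loop": 9.31,
    "requests with Session": 6.19,
    "httpx with AsyncClient": 0.8,
}

baseline = timings["requests.get loop"]

# Speed-up factor of each method compared with the basic loop
speedups = {name: round(baseline / seconds, 1) for name, seconds in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor}x")
```

The session is about 1.5 times faster than the basic loop, and the async version is more than 11 times faster.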