## Handling CSRF Tokens - Logging in to a website through its AJAX API with the Python library requests

Created by [tanyongsheng.net](https://tanyongsheng.net)

---

### Introduction

Before I start scraping, I always check whether I can do it anonymously (without logging in), so that my identity stays hidden from the web server. If that works, I skip the login and call the AJAX API directly, as shown in Part 2. If the data can only be accessed after authorization, this section applies.

### Target Website: 
[quotes.toscrape.com](https://quotes.toscrape.com/login)


<img src="../assets/static/quotes-toscrape-login-page.png" width=500px alt="quotes.toscrape.com Login page">

#### Step 1: Install Python libraries

In [1]:
%pip install requests
%pip install lxml
# to inspect which Python packages are imported in this Jupyter notebook
%pip install watermark

Note: you may need to restart the kernel to use updated packages.



In [2]:
import requests
from lxml import html

#### Step 2: Log in to the website via AJAX API with the Python library requests

In [3]:
login_page_url = "https://quotes.toscrape.com/login"

# Fetch the login page; the session stores any cookies the server sets
session = requests.Session()
response = session.get(login_page_url)
webpage_with_csrf_token = response.content

# Parse the HTML and read the hidden csrf_token input from the login form
tree = html.fromstring(webpage_with_csrf_token)
csrf_token = tree.xpath("//input[@name='csrf_token']/@value")[0]
print("CSRF token: ", csrf_token)


CSRF token:  XbLzdqJNSaHsQpVRYoWkKZvxuFygirjhMmlPeOBtUTcGnAECIDfw
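The XPath lookup above raises an `IndexError` if the page contains no `csrf_token` input, for example when the site changes its form. A more defensive variant might look like the sketch below. It swaps lxml for the standard library's `html.parser` so it runs without extra dependencies; the names `CsrfTokenParser`, `extract_csrf_token`, and `sample_html` are hypothetical and not part of the original notebook.

```python
from html.parser import HTMLParser


class CsrfTokenParser(HTMLParser):
    """Collect the value of the first <input name="csrf_token"> element."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_token_input = tag == "input" and attributes.get("name") == "csrf_token"
        if is_token_input and self.token is None:
            self.token = attributes.get("value")


def extract_csrf_token(page_html):
    """Return the CSRF token found in page_html, or raise ValueError if absent."""
    parser = CsrfTokenParser()
    parser.feed(page_html)
    if not parser.token:
        raise ValueError("no csrf_token input found in the page")
    return parser.token


# Hypothetical HTML fragment shaped like a login form, for illustration only.
sample_html = '<form><input type="hidden" name="csrf_token" value="abc123"/></form>'
print(extract_csrf_token(sample_html))  # abc123
```

In the notebook, this could be called as `extract_csrf_token(response.text)` on the response from the login page.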


In [4]:
payload = {
    "csrf_token": csrf_token,
    "username": "demo",
    "password": "demo"
}

headers = {
    'content-type': 'application/x-www-form-urlencoded',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36'
}

# Send the login form as a POST request on the same session,
# so the cookies received while fetching the token are sent along
response = session.post(login_page_url, headers=headers, data=payload)
response.status_code

200
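A 200 status code on its own does not prove that the login worked: requests follows redirects by default, and many sites also answer a rejected login with 200 and the form rendered again. One way to double-check is to look for something that only appears for logged-in users. The sketch below assumes the site shows a link to `/logout` once logged in; the helper name `looks_logged_in` and the two HTML fragments are hypothetical.

```python
def looks_logged_in(page_html):
    """Heuristic: treat the presence of a logout link as evidence of a session."""
    return 'href="/logout"' in page_html


# Hypothetical fragments of a navigation bar before and after login.
logged_out_html = '<p><a href="/login">Login</a></p>'
logged_in_html = '<p><a href="/logout">Logout</a></p>'

print(looks_logged_in(logged_out_html))  # False
print(looks_logged_in(logged_in_html))   # True
```

In the notebook, the check would be `looks_logged_in(response.text)` right after the POST request. Because the same `session` object carries the login cookies, later calls such as `session.get(...)` are made as the logged-in user.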

## Computing environment

In [5]:
%load_ext watermark

%watermark

# print the versions of the imported PyPI packages
%watermark --iversions

# print the last-updated date, time, and time zone
%watermark -u -n -t -z

Last updated: 2024-03-03T10:55:58.214936+08:00

Python implementation: CPython
Python version       : 3.10.12
IPython version      : 8.22.1

Compiler    : MSC v.1916 64 bit (AMD64)
OS          : Windows
Release     : 10
Machine     : AMD64
Processor   : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
CPU cores   : 8
Architecture: 64bit

requests: 2.31.0
lxml    : 5.1.0

Last updated: Sun Mar 03 2024 10:55:58Malay Peninsula Standard Time

