# Introduction

This is a sandbox for exploring the `urllib.parse` functions used to parse and construct URLs.

# `urlparse` and `urlsplit`

The difference between the two in this example is the extra `params` field that `urlparse` returns.

In [2]:
from urllib.parse import (
    urlparse, urlunparse, urlsplit, urlunsplit, urlencode,
    parse_qs, parse_qsl)

In [3]:
url = 'https://reqres.in/api/users?page=4&per_page=5'
parsed = urlparse(url)
parsed

ParseResult(scheme='https', netloc='reqres.in', path='/api/users', params='', query='page=4&per_page=5', fragment='')

In [5]:
splitted = urlsplit(url)
splitted

SplitResult(scheme='https', netloc='reqres.in', path='/api/users', query='page=4&per_page=5', fragment='')
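The URL above has no path parameters, so `params` comes back empty. The difference only shows up when the last path segment carries a `;`-delimited parameter. A minimal sketch, assuming a made-up URL (`example.com` and `;version=2` are illustrative and not part of the original example):

```python
from urllib.parse import urlparse, urlsplit, urlunparse, urlunsplit

# Hypothetical URL: the last path segment carries a ";version=2" parameter.
demo_url = 'https://example.com/docs;version=2?page=4'

demo_parsed = urlparse(demo_url)
demo_split = urlsplit(demo_url)

# urlparse separates the parameter from the path ...
print(demo_parsed.path, demo_parsed.params)
# ... while urlsplit leaves it attached to the path.
print(demo_split.path)

# Both results can be turned back into the original URL.
print(urlunparse(demo_parsed) == demo_url)
print(urlunsplit(demo_split) == demo_url)
```

Either pair works for round-tripping; the choice mostly depends on whether the path parameter is needed as a separate field.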

# Parsing and Constructing the Query String

## `parse_qs` and `urlencode`

In [6]:
query_dict = parse_qs(splitted.query)
query_dict

{'page': ['4'], 'per_page': ['5']}

Reconstruct the query string with `urlencode`:

In [7]:
urlencode(query_dict)

'page=%5B%274%27%5D&per_page=%5B%275%27%5D'

That is not what we want: `urlencode` encoded the string representation of each list. Passing `doseq=True` tells `urlencode` to treat each value as a sequence and emit one `key=value` pair per element.

In [8]:
urlencode(query_dict, doseq=True)

'page=4&per_page=5'

What about a mix of list and string values, without `doseq`?

In [9]:
urlencode({'page': ['1', '2'], 'per_page': '5'})

'page=%5B%271%27%2C+%272%27%5D&per_page=5'

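Again, the list was encoded as its string representation. With `doseq=True`, list values are expanded into repeated keys and plain strings are left as they are: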

In [10]:
urlencode({'page': ['1', '2'], 'per_page': '25'}, doseq=True)

'page=1&page=2&per_page=25'
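As a quick check that `parse_qs` and `urlencode(..., doseq=True)` are inverses for a query with repeated keys, here is a small round-trip sketch. The variable names are illustrative, and the query string is the one produced above:

```python
from urllib.parse import parse_qs, urlencode

query = 'page=1&page=2&per_page=25'

# parse_qs groups repeated keys into a single list per key.
as_dict = parse_qs(query)
print(as_dict)

# doseq=True expands each list back into repeated key=value pairs.
rebuilt = urlencode(as_dict, doseq=True)
print(rebuilt == query)
```

The round trip holds here because the repeated `page` keys are adjacent; `parse_qs` groups values by key, so the relative order of interleaved keys would not survive.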

## `parse_qsl` and `urlencode`

Like `parse_qs`, `parse_qsl` parses the query string, but instead of returning a dictionary it returns a list of (*key*, *value*) tuples. `urlencode` can take this list and return a query string.

In [11]:
query_list = parse_qsl(splitted.query)
query_list

[('page', '4'), ('per_page', '5')]

In [12]:
urlencode(query_list)

'page=4&per_page=5'

Note that with the list of (*key*, *value*) tuples we do not need `doseq=True`, because each value is already a single string rather than a list.
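Putting the pieces together, one possible way to change a query parameter and rebuild the full URL is to edit the list from `parse_qsl`, re-encode it, and hand the result to `urlunsplit`. This is a sketch only: the new page number and the variable names are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

url = 'https://reqres.in/api/users?page=4&per_page=5'
parts = urlsplit(url)

# Replace the value of "page" (hypothetical new value) and keep the rest.
pairs = parse_qsl(parts.query)
updated = [(key, '5' if key == 'page' else value) for key, value in pairs]

# SplitResult is a named tuple, so _replace returns a copy with a new query.
new_parts = parts._replace(query=urlencode(updated))
new_url = urlunsplit(new_parts)
print(new_url)
```

Working with the tuple list keeps the original parameter order, which a dictionary built by `parse_qs` only preserves per key.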