# Day 3

## Topics

* Writing Custom Modules
* Testing Python Programs
* Organizing Python Code
* Working with APIs
* Invoking external applications
* Text processing using regular expressions

## Writing Custom Modules

Let's create a Python file `mymodule.py` with the following content.

In [11]:
%%file mymodule.py

print("BEGIN mymodule")
x = 3

def add(a, b):
    return a+b

print(add(3, 4))
print("END mymodule")


Overwriting mymodule.py


In [5]:
!python mymodule.py

BEGIN mymodule
7
END mymodule


Let's see what happens when we import it.

In [6]:
import mymodule

BEGIN mymodule
7
END mymodule


In [7]:
mymodule.x

2

In [9]:
mymodule.add(10, 20)

30

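What will happen if we import the module again? The file has been edited in the meantime (the listing above shows the edited version with `x = 3`; the first import ran an earlier version with `x = 2`). Python caches imported modules, so importing again does not re-execute the file and `x` keeps its old value.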

In [12]:
import mymodule

In [13]:
mymodule.x

2

### Reimporting a Module

In [14]:
import importlib

In [15]:
importlib.reload(mymodule)

BEGIN mymodule
7
END mymodule


<module 'mymodule' from '/opt/zeomega-python-2024/book/live-notes/mymodule.py'>

In [16]:
mymodule.x

3
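
To see the caching behaviour without depending on the notebook state, here is a minimal self-contained sketch. It writes a throwaway module to a temporary directory; the module name `reload_demo_mod` is made up for this demo.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module name, used only for this demo.
MODULE = "reload_demo_mod"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / f"{MODULE}.py"
    path.write_text("x = 2\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        mod = importlib.import_module(MODULE)
        first = mod.x                      # value from the first import

        path.write_text("x = 30\n")        # edit the file on disk
        again = importlib.import_module(MODULE)
        cached = again.x                   # still the old value: module is cached

        importlib.reload(mod)              # re-executes the file
        reloaded = mod.x
    finally:
        # Clean up so the demo leaves no trace behind.
        sys.path.remove(tmp)
        sys.modules.pop(MODULE, None)

print(first, cached, reloaded)
```

The second import returns the module object already stored in `sys.modules`; only `importlib.reload` picks up the edit.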

### The `__name__` magic variable

In [17]:
%%file mymodule2.py

x = 2

def add(a, b):
    return a+b

print(add(3, 4))
print(__name__)


Writing mymodule2.py


Let's run this file as a script and see what the value of `__name__` will be.

In [18]:
!python mymodule2.py

7
__main__


Let's see what happens when we import the file as a module.

In [19]:
import mymodule2

7
mymodule2


We can use the `__name__` variable to find out whether the file is being run as a script or imported as a module.

In [21]:
%%file mymodule3.py

x = 2

def add(a, b):
    return a+b

# do this only when this file is run as a script
if __name__ == "__main__":
    print(add(3, 4))


Writing mymodule3.py


In [22]:
!python mymodule3.py

7


In [24]:
import mymodule3

### Example: Square Module

Let's write a square program that can be used both as a script and as a module.

In [26]:
%%file sq.py
import sys

def square(x):
    return x*x

def main():
    n = int(sys.argv[1])
    print(square(n))

if __name__ == "__main__":
    main()

Overwriting sq.py


In [27]:
import sq

In [28]:
sq.square(5)

25

In [29]:
!python sq.py 4

16


### Docstrings

In [30]:
import os

In [31]:
help(os.listdir)

Help on built-in function listdir in module posix:

listdir(path=None)
    Return a list containing the names of the files in the directory.
    
    path can be specified as either str, bytes, or a path-like object.  If path is bytes,
      the filenames returned will also be bytes; in all other circumstances
      the filenames returned will be str.
    If path is None, uses the path='.'.
    On some platforms, path may also be specified as an open file descriptor;\
      the file descriptor must refer to a directory.
      If this functionality is unavailable, using it raises NotImplementedError.
    
    The list is in arbitrary order.  It does not include the special
    entries '.' and '..' even if they are present in the directory.



In [32]:
os.listdir?

Signature: os.listdir(path=None)
Docstring:
Return a list containing the names of the files in the directory.

path can be specified as either str, bytes, or a path-like object.  If path is bytes,
  the filenames returned will also be bytes; in all other circumstances
  the filenames returned will be str.
If path is None, uses the path='.'.
On some platforms, path may also be specified as an open file descriptor;\
  the file descriptor must refer to a directory.
  If this functionality is unavailable, using it raises NotImplementedError.

The list is in arbitrary order.  It does not include the special
entries '.' and '..' even if they are present in the directory.
Type:      builtin_function_or_method

In [33]:
 help(sq.square)

Help on function square in module sq:

square(x)



#### Adding docstrings to a function

In [40]:
def add(a, b):
    """
    Adds two numbers.

        >>> add(3, 4)
        7
    """
    return a+b

In [41]:
help(add)

Help on function add in module __main__:

add(a, b)
    Adds two numbers.
    
        >>> add(3, 4)
        7



#### Using type hints

In [42]:
def add(a: int, b: int) -> int:
    """
    Adds two numbers.

        >>> add(3, 4)
        7
    """
    return a+b

In [43]:
help(add)

Help on function add in module __main__:

add(a: int, b: int) -> int
    Adds two numbers.
    
        >>> add(3, 4)
        7



In [44]:
def mean(numbers: list[float]) -> float:
    """Computes mean of a list of numbers.
    """
    pass

In [45]:
help(mean)

Help on function mean in module __main__:

mean(numbers: list[float]) -> float
    Computes mean of a list of numbers.
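
The `>>>` examples in a docstring are more than documentation: the standard `doctest` module can execute them and compare the results. Here is a minimal sketch that fills in the `mean` stub. Raising `ValueError` for an empty list is our own assumption, not something the notes specify.

```python
import doctest

def mean(numbers: list[float]) -> float:
    """Computes mean of a list of numbers.

        >>> mean([1, 2, 3, 4])
        2.5
        >>> mean([10])
        10.0
    """
    if not numbers:
        # Assumption: an empty list is treated as an error.
        raise ValueError("mean() needs at least one number")
    return sum(numbers) / len(numbers)

# Runs the examples in the docstring; prints nothing when they all pass.
doctest.run_docstring_examples(mean, {"mean": mean})
```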



#### Adding docstrings to a module

In [51]:
%%file sq2.py
"""
The square module.

The module provides function to compute the square of a number.

This can also be used as a script to compute square of a number.

USAGE:
    $ python sq2.py 5
    25
"""
import sys

def square(x: int) -> int:
    """Computes square of a number.
    
        >>> square(4)
        16
    """
    return x*x

def main():
    n = int(sys.argv[1])
    print(square(n))

if __name__ == "__main__":
    main()

Writing sq2.py


In [53]:
import sq2

In [54]:
help(sq2)

Help on module sq2:

NAME
    sq2 - The square module.

DESCRIPTION
    The module provides function to compute the square of a number.
    
    This can also be used as a script to compute square of a number.
    
    USAGE:
        $ python sq2.py 5
        25

FUNCTIONS
    main()
    
    square(x: int) -> int
        Computes square of a number.
        
        >>> square(4)
        16

FILE
    /opt/zeomega-python-2024/book/live-notes/sq2.py




#### Problem: cube module

In [56]:
%load_problem cube-module

In [None]:
%%file cube.py
# your code here





## Installing third-party modules

The Python community maintains a repository of third-party packages, the Python Package Index (PyPI), at https://pypi.org.

The `pip` tool can be used to install any package from there.

In [57]:
!pip --version

pip 24.0 from /opt/tljh/user/lib/python3.10/site-packages/pip (python 3.10)


### Installing Packages

We can install a package by specifying the name, and optionally a version.
```
$ pip install Flask
$ pip install Flask==2.2.3
$ pip install 'Flask>=2.2'
```

If we have multiple packages, we can list them in a file, one per line, and pass that file to pip. The following example installs all packages specified in the file `requirements.txt`.

```
$ pip install -r requirements.txt
```

### Example: tabulate

In [58]:
!pip install tabulate

Defaulting to user installation because normal site-packages is not writeable


In [59]:
from tabulate import tabulate

headers = ["Name", "Price", "Quantity", "Amount"]
data = [
    ["Apple", 30, 3, 90],
    ["Banana", 4, 12, 48],
    ["Mango", 100, 4, 400]
]

print(tabulate(data, headers=headers))

Name      Price    Quantity    Amount
------  -------  ----------  --------
Apple        30           3        90
Banana        4          12        48
Mango       100           4       400


In [62]:
!pip install psycopg2

Defaulting to user installation because normal site-packages is not writeable
Collecting psycopg2
  Downloading psycopg2-2.9.9.tar.gz (384 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: psycopg2
  Building wheel for psycopg2 (setup.py) ... done
  Created wheel for psycopg2: filename=psycopg2-2.9.9-cp310-cp310-linux_x86_64.whl size=166407 sha256=d6860c550aca81a8b585c243bc4d6709f69e24efc129bac987d33a8de5121719
  Stored in directory: /home/jupyter-anand/.cache/pip/wheels/7d/75/13/da1c6d88687ae81bf5e3cfa07d702981ba137963163472b050
Successfully built psycopg2
Installing collected packages: psycopg2
Successfully installed psycopg2-2.9.9


## Testing Python Programs

### Testing Manually - Naive approach

In [70]:
def square(x):
    return x*x+1

In [71]:
square(4)

17

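At a glance it seems to be working fine, but `square(4)` should be 16, not 17. Eyeballing output is easy to get wrong; an assertion makes the expectation explicit.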

In [72]:
def test_square():
    assert square(4) == 16
    print("done")

In [73]:
test_square()

AssertionError: 

### Introduction to pytest

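pytest is a very nice library for testing Python programs. It collects functions whose names start with `test_` and reports the details of every failed assertion.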

In [77]:
%%file sq.py

def square(x):
    return x*x+1

def test_square():
    assert square(4) == 16

Overwriting sq.py


In [78]:
!pytest sq.py

platform linux -- Python 3.10.14, pytest-8.2.0, pluggy-1.5.0
rootdir: /opt/zeomega-python-2024/book/live-notes
plugins: anyio-4.3.0
collected 1 item

sq.py F                                                                  [100%]

_________________________________ test_square __________________________________

    def test_square():
>       assert square(4) == 16
E       assert 17 == 16
E        +  where 17 = square(4)

sq.py:6: AssertionError
FAILED sq.py::test_square - assert 17 == 16


It is common practice to separate the main program from the test program.

In [79]:
%%file square2.py

def square(x):
    return x*x

Writing square2.py


In [82]:
%%file test_square2.py
from square2 import square  

def test_square():
    assert square(4) == 16

Overwriting test_square2.py


In [83]:
!pytest test_square2.py

platform linux -- Python 3.10.14, pytest-8.2.0, pluggy-1.5.0
rootdir: /opt/zeomega-python-2024/book/live-notes
plugins: anyio-4.3.0
collected 1 item

test_square2.py .                                                        [100%]



Let's extend the square module with one more function and add a test for it.

In [84]:
%%file square3.py

def square(x):
    return x*x

def sum_of_squares(x, y):
    return square(x) + square(y)

Writing square3.py


In [85]:
%%file test_square3.py
from square3 import square, sum_of_squares

def test_square():
    assert square(0) == 0
    assert square(4) == 16

def test_sum_of_squares():
    assert sum_of_squares(3, 4) == 25
    assert sum_of_squares(4, 3) == 25
    

Writing test_square3.py


In [86]:
!pytest test_square3.py

platform linux -- Python 3.10.14, pytest-8.2.0, pluggy-1.5.0
rootdir: /opt/zeomega-python-2024/book/live-notes
plugins: anyio-4.3.0
collected 2 items

test_square3.py ..                                                       [100%]



In [87]:
!pytest test_square3.py -v

platform linux -- Python 3.10.14, pytest-8.2.0, pluggy-1.5.0 -- /opt/tljh/user/bin/python3.10
cachedir: .pytest_cache
rootdir: /opt/zeomega-python-2024/book/live-notes
plugins: anyio-4.3.0
collected 2 items

test_square3.py::test_square PASSED                                      [ 50%]
test_square3.py::test_sum_of_squares PASSED                              [100%]



**Problem:** Write a test case for the `digit_count` function.

```python
def digit_count(number, digit):
    return str(number).count(str(digit))
```

**Hints:**

- create `digit_count.py` and `test_digit_count.py` files
- use `pytest` to run your tests

### Example: Hosts Parser

Let's write a program to parse the /etc/hosts file format.

In [88]:
!cat /etc/hosts

# /etc/hosts
127.0.0.1	localhost

# The following lines are desirable for IPv6 capable hosts
::1		localhost ip6-localhost ip6-loopback
ff02::1		ip6-allnodes
ff02::2		ip6-allrouters


Let's create a sample hosts file.

In [89]:
%%file hosts.txt
1.2.3.4 myhost www.myhost

# this is comment
127.0.0.1 localhost

1.2.3.4 foo bar
1.2.3.5 web1 web2

Writing hosts.txt


Let's write a program to parse this file format. 

At the end of the parsing, we need to know the IP address of every host specified in the file.

In [114]:
%%file hosts.py
"""
Module to parse /etc/hosts file format.
"""
import sys

def parse(filename: str) -> dict[str, str]:
    """Parses a hosts file.

    Returns a dictionary mapping from hostname to ip address 
    for every host specified in the file.

    The empty lines and comments are ignored.
    """
    result = {}
    # the with block closes the file even if parsing fails
    with open(filename) as f:
        for line in f:
            hosts = parse_line(line)
            result.update(hosts)
    return result

def parse_line(line: str) -> dict[str, str]:
    """Parses a line of a hosts file.

    Returns a dict mapping from hostname to ip address
    of all the hosts mentioned in the line.

    Empty lines and comment lines give an empty result.

        >>> parse_line("")
        {}
        >>> parse_line("1.2.3.4 h1 h2")
        {'h1': '1.2.3.4', 'h2': '1.2.3.4'}
    """
    if is_empty(line) or is_comment(line):
        return {}

    parts = line.split()
    ip = parts[0]
    hosts = parts[1:]
    
    return {h: ip for h in hosts}

def is_empty(line):
    """Checks if a line is empty.
    """
    return line.strip() == "" 

def is_comment(line):
    """Checks if a line is a comment.
    """
    return line.strip().startswith("#")

def main():
    filename = sys.argv[1]
    hosts = parse(filename)
    print(hosts)

if __name__ == "__main__":
    main()

Overwriting hosts.py


In [119]:
%%file test_hosts.py

from hosts import parse, parse_line, is_comment, is_empty

def test_is_comment():
    assert is_comment("") == False
    
    assert is_comment("# hello") == True
    assert is_comment("hello") == False

    assert is_comment("  # hello") == True
    assert is_comment("  # hello\n") == True

    assert is_comment("hello # world") == False

def test_is_empty():
    assert is_empty("") == True
    assert is_empty("hello") == False

    assert is_empty("  ") == True
    assert is_empty("\n") == True
    assert is_empty("\t") == True
    assert is_empty("   \n") == True

    assert is_empty("# hello") == False
    
def test_parse_line():
    # zero case
    assert parse_line("") == {}
    assert parse_line("\n") == {}

    # simple case
    assert parse_line("1.2.3.4 web") == {"web": "1.2.3.4"}
    assert parse_line("1.2.3.4 w1 w2") == {"w1": "1.2.3.4", "w2": "1.2.3.4"}

    # complex case: multiple spaces
    assert parse_line("1.2.3.4   web") == {"web": "1.2.3.4"}
    assert parse_line("1.2.3.4   w1   w2") == {"w1": "1.2.3.4", "w2": "1.2.3.4"}

    # trailing newline
    assert parse_line("1.2.3.4 web  \n") == {"web": "1.2.3.4"}

    # corner cases
    assert parse_line("1.2.3.4") == {}
    assert parse_line("w1 w2 1.2.3.4") == {"w2": "w1", "1.2.3.4": "w1"}

    assert parse_line("# comment") == {}

def test_parse(tmp_path):
    # zero case
    p = tmp_path / "empty.txt"
    p.write_text("")
    assert parse(p) == {}

    # simple case
    p = tmp_path / "a.txt"
    p.write_text("1.2.3.4 web\n") 
    assert parse(p) == {"web": "1.2.3.4"}

    p.write_text("1.2.3.4 web\n1.2.3.5 h1 h2\n")
    assert parse(p) == {
        "web": "1.2.3.4",
        "h1": "1.2.3.5",
        "h2": "1.2.3.5"
    }

    # Complex use case
    p = tmp_path / "a.txt"
    p.write_text("""
    1.2.3.4 web

    # comment
    1.2.3.5 h1 h2
    1.2.3.6 h2 h3    
    """)
    assert parse(p) == {
        "web": "1.2.3.4",
        "h1": "1.2.3.5",
        "h2": "1.2.3.6",
        "h3": "1.2.3.6"
    }

Overwriting test_hosts.py


In [118]:
!pytest test_hosts.py  -v

platform linux -- Python 3.10.14, pytest-8.2.0, pluggy-1.5.0 -- /opt/tljh/user/bin/python3.10
cachedir: .pytest_cache
rootdir: /opt/zeomega-python-2024/book/live-notes
plugins: anyio-4.3.0
collected 4 items

test_hosts.py::test_is_comment PASSED                                    [ 25%]
test_hosts.py::test_is_empty PASSED                                      [ 50%]
test_hosts.py::test_parse_line PASSED                                    [ 75%]
test_hosts.py::test_parse PASSED                                         [100%]
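
The tests above record that `is_comment("hello # world")` is `False`, so a line such as `1.2.3.4 web # note` would have `#` and `note` treated as hostnames. Here is a sketch of one possible variant that also drops trailing comments. The name `parse_line_inline` is hypothetical and not part of `hosts.py`.

```python
def parse_line_inline(line: str) -> dict[str, str]:
    """Parses a hosts line, ignoring anything after a '#'.

    Hypothetical variant of parse_line, shown only as a sketch.
    """
    # Keep only the text before the first '#', if there is one.
    content = line.split("#", 1)[0]
    parts = content.split()
    if len(parts) < 2:
        # Empty line, comment-only line, or an IP with no hostnames.
        return {}
    ip, *hosts = parts
    return {h: ip for h in hosts}

print(parse_line_inline("1.2.3.4 web www  # production"))
```

Because the function is small and takes a plain string, it can be tested in the same style as `test_parse_line`.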



## Working with Web & APIs

* Working with Web using requests library
* Simple APIs
* JSON APIs
* HTTP methods
* Authorization

### Working with Web

The third-party library `requests` is a popular choice for working with the web and APIs.

In [121]:
import requests

In [122]:
!curl https://anandology.com/tmp/hello.txt

Hello, world!


In [124]:
!curl https://figlet.apps.pipal.in/api/figlet

 _   _      _ _       _ 
| | | | ___| | | ___ | |
| |_| |/ _ \ | |/ _ \| |
|  _  |  __/ | | (_) |_|
|_| |_|\___|_|_|\___/(_)
                        


In [125]:
url = "https://anandology.com/tmp/hello.txt"

In [126]:
response = requests.get(url)

In [127]:
response

<Response [200]>

In [128]:
response.headers

{'Server': 'nginx/1.10.3 (Ubuntu)', 'Date': 'Wed, 15 May 2024 07:28:12 GMT', 'Content-Type': 'text/plain', 'Content-Length': '14', 'Last-Modified': 'Sat, 30 Nov 2019 10:26:32 GMT', 'Connection': 'keep-alive', 'ETag': '"5de243d8-e"', 'Access-Control-Allow-Origin': '*', 'Accept-Ranges': 'bytes'}

In [129]:
response.text

'Hello, world!\n'

In [130]:
requests.get(url).text

'Hello, world!\n'

### Example: Figlet

There is a fun Unix command called `figlet`, and `figlet.apps.pipal.in` provides an API to access it.

In [132]:
!figlet python

             _   _                 
 _ __  _   _| |_| |__   ___  _ __  
| '_ \| | | | __| '_ \ / _ \| '_ \ 
| |_) | |_| | |_| | | | (_) | | | |
| .__/ \__, |\__|_| |_|\___/|_| |_|
|_|    |___/                       


In [134]:
!figlet -f lean python

                                                                 
                            _/      _/                           
     _/_/_/    _/    _/  _/_/_/_/  _/_/_/      _/_/    _/_/_/    
    _/    _/  _/    _/    _/      _/    _/  _/    _/  _/    _/   
   _/    _/  _/    _/    _/      _/    _/  _/    _/  _/    _/    
  _/_/_/      _/_/_/      _/_/  _/    _/    _/_/    _/    _/     
 _/              _/                                              
_/          _/_/                                                 


In [135]:
!curl https://figlet.apps.pipal.in/api/figlet

 _   _      _ _       _ 
| | | | ___| | | ___ | |
| |_| |/ _ \ | |/ _ \| |
|  _  |  __/ | | (_) |_|
|_| |_|\___|_|_|\___/(_)
                        


We can specify the text and font as query parameters.

In [136]:
!curl "https://figlet.apps.pipal.in/api/figlet?text=Python"

 ____        _   _                 
|  _ \ _   _| |_| |__   ___  _ __  
| |_) | | | | __| '_ \ / _ \| '_ \ 
|  __/| |_| | |_| | | | (_) | | | |
|_|    \__, |\__|_| |_|\___/|_| |_|
       |___/                       


In [137]:
!curl "https://figlet.apps.pipal.in/api/figlet?text=Python&font=slant"

    ____        __  __              
   / __ \__  __/ /_/ /_  ____  ____ 
  / /_/ / / / / __/ __ \/ __ \/ __ \
 / ____/ /_/ / /_/ / / / /_/ / / / /
/_/    \__, /\__/_/ /_/\____/_/ /_/ 
      /____/                        


How do we pass query parameters with `requests`? We pass them as a dictionary using the `params` argument.

In [139]:
url = "https://figlet.apps.pipal.in/api/figlet"
params = {
    "text": "Python",
    "font": ""
}
output = requests.get(url, params=params).text
print(output)

 ____        _   _                 
|  _ \ _   _| |_| |__   ___  _ __  
| |_) | | | | __| '_ \ / _ \| '_ \ 
|  __/| |_| | |_| | | | (_) | | | |
|_|    \__, |\__|_| |_|\___/|_| |_|
       |___/                       
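
To get a feel for what the `params` argument does, here is a rough sketch using only the standard library. It builds the query string by hand with `urllib.parse.urlencode`; `requests` does the equivalent encoding for us, and the final URL it used can be inspected as `response.url`.

```python
from urllib.parse import urlencode

base_url = "https://figlet.apps.pipal.in/api/figlet"
params = {"text": "Hello World", "font": "slant"}

# urlencode escapes special characters; a space becomes '+'.
query = urlencode(params)
full_url = f"{base_url}?{query}"
print(full_url)
```

Passing a dictionary is safer than formatting the URL ourselves, because characters such as spaces and `&` get escaped correctly.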



In [141]:
def figlet(text, font=""):
    url = "https://figlet.apps.pipal.in/api/figlet"
    params = {
        "text": text,
        "font": font
    }
    return requests.get(url, params=params).text

In [143]:
figlet("0")

'  ___  \n / _ \\ \n| | | |\n| |_| |\n \\___/ \n       \n'

In [144]:
print(figlet("0"))

  ___  
 / _ \ 
| | | |
| |_| |
 \___/ 
       



In [145]:
for i in range(5):
    print(figlet(str(i)))

  ___  
 / _ \ 
| | | |
| |_| |
 \___/ 
       

 _ 
/ |
| |
| |
|_|
   

 ____  
|___ \ 
  __) |
 / __/ 
|_____|
       

 _____ 
|___ / 
  |_ \ 
 ___) |
|____/ 
       

 _  _   
| || |  
| || |_ 
|__   _|
   |_|  
        



In [146]:
%load_problem add-api

In [None]:
# your code here





In [148]:
%load_problem fibs-api

In [None]:
# your code here





### JSON APIs

In [149]:
import json

In [150]:
person = {
    "name": "Alice",
    "email": "alice@example.com",
    "verified": True,
    "numbers": [1, 2, 3, 4]
}

In [151]:
json.dumps(person)

'{"name": "Alice", "email": "alice@example.com", "verified": true, "numbers": [1, 2, 3, 4]}'

In [152]:
jsontext = json.dumps(person)

In [153]:
print(jsontext)

{"name": "Alice", "email": "alice@example.com", "verified": true, "numbers": [1, 2, 3, 4]}


In [154]:
json.loads(jsontext)

{'name': 'Alice',
 'email': 'alice@example.com',
 'verified': True,
 'numbers': [1, 2, 3, 4]}
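
A couple of details are worth a quick sketch: `json.dumps` accepts an `indent` argument for readable output, and a round trip through JSON keeps dicts, lists, strings, numbers and booleans, but turns tuples into lists.

```python
import json

person = {
    "name": "Alice",
    "email": "alice@example.com",
    "verified": True,
    "numbers": [1, 2, 3, 4]
}

# indent makes the output readable; sort_keys gives a stable key order.
pretty = json.dumps(person, indent=2, sort_keys=True)
print(pretty)

# Round trip: the decoded value is equal to the original dictionary.
restored = json.loads(pretty)
print(restored == person)

# JSON has no tuple type, so a tuple comes back as a list.
point = json.loads(json.dumps({"point": (1, 2)}))
print(point)
```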

The fibs API also supports responses in JSON format.

In [156]:
!curl "https://numbers.apps.pipal.in/fibs?a=1&b=1&n=5"

1
1
2
3
5


In [157]:
!curl "https://numbers.apps.pipal.in/fibs?a=1&b=1&n=5&format=json"

{"result":[1,1,2,3,5]}


In [159]:
url = "https://numbers.apps.pipal.in/fibs"
params = {
    "a": 1,
    "b": 1,
    "n": 5,
    "format": "json"
}
response = requests.get(url, params=params).json()

In [160]:
response

{'result': [1, 1, 2, 3, 5]}

In [161]:
type(response)

dict

In [162]:
response['result']

[1, 1, 2, 3, 5]

In [163]:
sum(response['result'])

12

In [164]:
def fibs(a, b, n):
    url = "https://numbers.apps.pipal.in/fibs"
    params = dict(a=a, b=b, n=n, format="json")
    response = requests.get(url, params=params).json()    
    return response['result']

In [165]:
fibs(1,1,5)

[1, 1, 2, 3, 5]

In [167]:
fibs(1,1,10)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

### HTTP Methods

Every HTTP request uses one of the HTTP methods. The common ones are GET, POST, PUT and DELETE.

In [168]:
!curl -v https://anandology.com/tmp/hello.txt 

* Host anandology.com:443 was resolved.
* IPv6: (none)
* IPv4: 139.59.87.96
*   Trying 139.59.87.96:443...
* Connected to anandology.com (139.59.87.96) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-ECDSA-AES128-GCM-SHA256 / prime256v1 / id-ecPublicKey
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: CN=anandology.com
*  start date: May 12 13:26:19 2024 GMT
*  expire date: Aug 10 13:26:18 2024 GMT
*  subj

### Example: range

In [172]:
!curl -i \
    -H "Content-type: application/json" \
    -d '{"start": 1, "stop": 20, "step": 2}' \
    https://numbers.apps.pipal.in/range

HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
Date: Wed, 15 May 2024 09:30:20 GMT
Content-Type: application/json
Content-Length: 38
Connection: keep-alive

{"result":[1,3,5,7,9,11,13,15,17,19]}


In [176]:
url = "https://numbers.apps.pipal.in/range"
data = {"start": 1, "stop": 20, "step": 2}

In [178]:
requests.post(url, json=data).json()

{'result': [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]}

In [179]:
def api_range(start, stop, step=1):
    url = "https://numbers.apps.pipal.in/range"
    data = {"start": start, "stop": stop, "step": step}
    response = requests.post(url, json=data).json()
    return response['result']

In [181]:
api_range(1, 20, 3)

[1, 4, 7, 10, 13, 16, 19]
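
To see what happens on both sides of such a POST request, here is a self-contained sketch using only the standard library. `RangeHandler` is a hypothetical local stand-in for the range API, not the real service, and `post_json` is a helper written for this demo; `urllib.request` plays the role of `requests`.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class RangeHandler(BaseHTTPRequestHandler):
    """Hypothetical local stand-in for the /range endpoint."""

    def do_POST(self):
        # The request body is JSON; Content-Length tells us how much to read.
        length = int(self.headers.get("Content-Length", 0))
        args = json.loads(self.rfile.read(length))
        numbers = list(range(args["start"], args["stop"], args.get("step", 1)))
        body = json.dumps({"result": numbers}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

def post_json(url, data):
    """POST `data` as JSON and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(data).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler makes sure we talk to localhost directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())

server = HTTPServer(("127.0.0.1", 0), RangeHandler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    url = f"http://127.0.0.1:{server.server_port}/range"
    result = post_json(url, {"start": 1, "stop": 20, "step": 3})["result"]
finally:
    server.shutdown()
    server.server_close()

print(result)
```

The `json=data` argument of `requests.post` takes care of the encoding and the `Content-Type` header that `post_json` sets by hand here.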

In [182]:
%load_problem product-api

In [None]:
# your code here





### Authorization

In [193]:
!curl https://numbers.apps.pipal.in/store

{}


In [185]:
d = requests.get("https://numbers.apps.pipal.in/store").json()

In [187]:
headers = {
    "Authorization": "Bearer abcd1234"
}

In [192]:
for k in d:
    url = f"https://numbers.apps.pipal.in/store/{k}"
    requests.delete(url, headers=headers)
    print(k)

x
x0
x1
x2
x3
x4
x5
x6
x7
x8
x9


In [None]:
!curl -X DELETE \
    -H 'Authorization: Bearer abcd1234' \
    https://numbers.apps.pipal.in/store/x

In [201]:
!curl https://numbers.apps.pipal.in/store

{}


In [217]:
!curl -X PUT -v \
    -H 'Authorization: Bearer abcd1234' \
    -H 'Content-type: application/json' \
    -d '{"value": 42}' \
    https://numbers.apps.pipal.in/store/a0

* Host numbers.apps.pipal.in:443 was resolved.
* IPv6: (none)
* IPv4: 157.245.98.225
*   Trying 157.245.98.225:443...
* Connected to numbers.apps.pipal.in (157.245.98.225) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384 / X25519 / RSASSA-PSS
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: CN=numbers.apps.pipal.in
*  start date: Mar 20 05:08:49 2024 GMT
*  expire date: Jun 18 05:08:48 2024 GMT
*  subjectAltName: host "numbers.apps.pipal.in" matched c

In [199]:
!curl https://numbers.apps.pipal.in/store/a0

{"value":42}


In [200]:
!curl -X DELETE \
    -H 'Authorization: Bearer abcd1234' \
    https://numbers.apps.pipal.in/store/a0

{"ok":true}


Let's try to do this in Python.

In [206]:
base_url = "https://numbers.apps.pipal.in/store"
headers = {"Authorization": "Bearer abcd1234"}

In [211]:
def store_list():
    url = base_url
    return requests.get(url).json()

def store_get(name):
    url = f"{base_url}/{name}"
    return requests.get(url).json()["value"]

def store_put(name, value):
    url = f"{base_url}/{name}"
    data = {"value": value}
    return requests.put(url, headers=headers, json=data).json()

In [210]:
store_get("p0")

42

In [213]:
store_put("a0", 0)

{'ok': True}

In [214]:
store_list()

{'a0': 0, 'p0': 42}

In [215]:
for i in range(5):
    name = f"a{i}"
    store_put(name, i*i)

In [216]:
store_list()

{'a0': 0, 'a1': 1, 'a2': 4, 'a3': 9, 'a4': 16, 'p0': 42}

## Regular Expressions

Regular expressions are a mini-language for pattern matching.

In [218]:
sentence = "10 apples and 20 mangoes"

In [219]:
sentence.split()

['10', 'apples', 'and', '20', 'mangoes']

In [220]:
import re

In [222]:
re.findall(r"\d+", sentence)

['10', '20']

In [223]:
re.sub(r"\d+", "X", sentence)

'X apples and X mangoes'

### Syntax

Regular expressions can have ordinary characters or special characters.

### Rule 1: Ordinary characters match themselves

In [224]:
re.match("one", "one thousand phones")

<re.Match object; span=(0, 3), match='one'>

In [225]:
re.findall("one", "one thousand phones")

['one', 'one']

### Rule 2: Special character `.` matches any one character

In [227]:
re.match(".at", "cat")

<re.Match object; span=(0, 3), match='cat'>

In [228]:
re.match(".at", "bat")

<re.Match object; span=(0, 3), match='bat'>

If we want to match `.` literally, then we need to escape it.

In [229]:
re.match(r"\.at", "bat")

In [230]:
re.match(r"\.at", ".at")

<re.Match object; span=(0, 3), match='.at'>

### Special character `|` is used to match one of the two patterns

In [232]:
re.match("a|b", "a")

<re.Match object; span=(0, 1), match='a'>

In [233]:
re.match("a|b", "b")

<re.Match object; span=(0, 1), match='b'>

In [234]:
re.findall("apple|mango", "10 apples and 20 mangoes")

['apple', 'mango']

### A character group

In [235]:
re.match("a[bc]", "ab")

<re.Match object; span=(0, 2), match='ab'>

In [236]:
re.match("a[bc]", "ac")

<re.Match object; span=(0, 2), match='ac'>

We can also specify a range.

In [237]:
re.match("a[0-9]b", "a5b")

<re.Match object; span=(0, 3), match='a5b'>

In [238]:
re.match("[a-z][0-9][a-z]", "k8s")

<re.Match object; span=(0, 3), match='k8s'>

### Modifiers

```
? - matches 0 or 1 occurrences
* - matches 0 or more occurrences
+ - matches 1 or more occurrences
```

In [239]:
re.findall("0x[0-9a-f]+", "the number 0x12fe is a hexadecimal number.")

['0x12fe']
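
The difference between the three modifiers is easiest to see side by side. A small sketch, using `re.fullmatch` so that the whole word has to match the pattern; the word list is made up for the demo.

```python
import re

words = ["color", "colour", "colouur"]

def matching(pattern):
    """Return the words that match `pattern` in full."""
    return [w for w in words if re.fullmatch(pattern, w)]

optional = matching(r"colou?r")      # u may appear 0 or 1 times
zero_or_more = matching(r"colou*r")  # u may appear 0 or more times
one_or_more = matching(r"colou+r")   # u must appear at least once

print(optional)
print(zero_or_more)
print(one_or_more)
```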

### Predefined escape codes

* `\d` - any digit
* `\s` - any whitespace
* `\w` - any word character (a letter, a digit or an underscore)

In [241]:
re.findall(r"\d+", "10 apples and 20 mangoes")

['10', '20']
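
Here is a quick sketch of the other two escape codes. The sample strings are made up. Note the `r"..."` raw strings: they stop Python from interpreting the backslash before the `re` module sees it.

```python
import re

# \w+ matches runs of letters, digits and underscores.
words = re.findall(r"\w+", "user_1 paid $20, user_2 paid $35")
print(words)

# \s+ matches any run of whitespace: spaces, tabs or newlines.
tokens = re.split(r"\s+", "10 apples\tand\n20   mangoes")
print(tokens)
```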

### Grouping

In [243]:
re.findall(r"(\d+) ([a-z]+)", "10 apples and 20 mangoes")

[('10', 'apples'), ('20', 'mangoes')]

In [244]:
text = "10 apples, 20 mangoes and 30 bananas"

In [245]:
from tabulate import tabulate

In [246]:
matches = re.findall(r"(\d+) ([a-z]+)", text)

In [247]:
matches

[('10', 'apples'), ('20', 'mangoes'), ('30', 'bananas')]

In [248]:
print(tabulate(matches))

--  -------
10  apples
20  mangoes
30  bananas
--  -------
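
Groups are handy whenever we need the parts of a match separately. As a small sketch, we can turn the matched pairs into a dictionary of counts, and read individual groups from a single match object.

```python
import re

text = "10 apples, 20 mangoes and 30 bananas"
pattern = r"(\d+) ([a-z]+)"

# With groups in the pattern, findall returns one tuple per match.
counts = {name: int(number) for number, name in re.findall(pattern, text)}
total = sum(counts.values())
print(counts, total)

# On a match object, group(1) and group(2) give the individual groups.
m = re.search(pattern, text)
first_pair = (m.group(1), m.group(2))
print(first_pair)
```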


### Match beginning and end of a string

The special characters `^` and `$` match the beginning and the end of the string.

Example: remove trailing whitespace.

In [249]:
re.sub(r"\s+$", "", "    hello world   ")

'    hello world'
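
The same idea works at the other end of the string with `^`, and the two can be combined with `|`. A short sketch:

```python
import re

s = "    hello world   "

# ^ anchors the pattern at the beginning of the string.
no_leading = re.sub(r"^\s+", "", s)

# Either leading or trailing whitespace: combine both with |.
stripped = re.sub(r"^\s+|\s+$", "", s)

print(repr(no_leading))
print(repr(stripped))
```

For plain whitespace the string method `strip()` does the same job; the regular expression version is mainly useful when the pattern to remove is more involved.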

## Python API of regular expressions

### findall

In [250]:
re.findall(r"\d+", "10 apples and 20 mangoes")

['10', '20']

### match

`match` finds a pattern only at the beginning of a string. It returns a match object, or `None` when the pattern is not found there.

In [251]:
re.match(r"\d+", "10 apples and 20 mangoes")

<re.Match object; span=(0, 2), match='10'>

In [252]:
m = re.match(r"\d+", "10 apples and 20 mangoes")

In [253]:
m.group()

'10'

### search

Similar to `match`, but finds the first occurrence of the pattern anywhere in the string.
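
A small sketch of the difference, using a made-up string where the number is not at the beginning:

```python
import re

text = "apples: 10, mangoes: 20"

# match only looks at the beginning of the string, so it finds nothing here.
at_start = re.match(r"\d+", text)

# search scans the string and returns the first match it finds.
anywhere = re.search(r"\d+", text)

print(at_start)
print(anywhere.group(), anywhere.span())
```

Both functions return `None` when nothing matches, so it is a good idea to check the result before calling `.group()`.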

### sub

In [255]:
re.sub(r"\d+", "x", "10 apples and 20 mangoes")

'x apples and x mangoes'

### split

In [257]:
re.split("<[a-z]+>", "<b>Hello</b><i>world</i>")

['', 'Hello</b>', 'world</i>']

### Example: antihtml

In [262]:
html = '<div>Try <a href="https://google.com/">Google</a> search.</div>'

In [263]:
import re

In [265]:
re.sub("<[^<>]*>", "", html)

'Try Google search.'

In [267]:
re.findall("<.*>", html)

['<div>Try <a href="https://google.com/">Google</a> search.</div>']

In [268]:
re.findall("<[^<>]*>", html)

['<div>', '<a href="https://google.com/">', '</a>', '</div>']
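
The pattern `<.*>` matched the whole string because `*` is greedy: it takes as many characters as it can. Excluding `<` and `>` with `[^<>]*` is one fix. Another, sketched here, is the non-greedy form `*?`, which takes as few characters as possible.

```python
import re

html = '<div>Try <a href="https://google.com/">Google</a> search.</div>'

# Greedy: .* runs all the way to the last '>' in the string.
greedy = re.findall(r"<.*>", html)

# Non-greedy: .*? stops at the first '>' that completes a match.
lazy = re.findall(r"<.*?>", html)

print(greedy)
print(lazy)
```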

### Example: find links

Let's write a program to find all links in an HTML page.

Using regular expressions is not a fail-proof way to do this (an HTML parser is more robust), but it is an interesting exercise.

In [274]:
%%file a.html

<ul>
<li><a title="Google" href="https://google.com/">Google</a></li>
<li><a active="true" href="https://microsoft.com/">Microsoft</a></li>
<li><a href="https://apple.com/" title="Apple">Apple</a></li>
</ul>  

href="https://flipkart.com/" is not a link.

Overwriting a.html


In [273]:
html = open("a.html").read()

In [279]:
re.findall('<a[^<>]*href="([^"]*)"[^<>]*>', html)

['https://google.com/', 'https://microsoft.com/', 'https://apple.com/']

## Hints for Assignment 03

### Problem 3: extcount

In [280]:
from pathlib import Path

In [284]:
for p in Path("files/extcount").iterdir():
    print(p.suffix)

.txt
.py
.txt
.py
.txt
.py
.csv
.yml
.csv
.py



In [287]:
[p.suffix for p in Path("files/extcount").iterdir()]


['.txt',
 '.py',
 '.txt',
 '.py',
 '.txt',
 '.py',
 '.csv',
 '.yml',
 '.csv',
 '.py',
 '']

In [289]:
data = [[2, "py"], [4, "txt"], [1, "yml"]]

In [290]:
sorted(data)

[[1, 'yml'], [2, 'py'], [4, 'txt']]

In [291]:
sorted(data, reverse=True)

[[4, 'txt'], [2, 'py'], [1, 'yml']]