# Web API

**Author:** Nico Curti

**Course:** Software and Computing for Applied Physics - 87948

**Github:** [Nico-Curti](https://github.com/Nico-Curti)

In many real case studies we need to develop applications capable of communicating with an external program or database, getting data from it and/or updating its records from another program.

All these tasks require the creation of **communications** between programs, possibly via an Internet connection.

## How can we communicate with another program?

The communication between programs can be established via a network connection, creating a "dialog" between programs that listen and communicate through **ports**.

The current Jupyter Notebook is a program running on my computer, and it is sending directives to a port on my computer (*localhost*), allowing me to write this slide!

By definition:

> A network **socket** is a software structure within a network node of a computer network that serves as an endpoint for sending and receiving data across the network. 
> The structure and properties of a socket are defined by an application programming interface (API) for the networking architecture. 
> Sockets are created only during the lifetime of a process of an application running in the node. 

In all your web API projects you need to manage (at the same time!) two main elements:

* A **server** which hosts your information and is ready to listen and send something
* A **client** which typically produces something that must be sent to the *server*

If you want to simply test this behavior, you can try to use the standard **netcat** program in Unix systems:

1. Open 2 shells (S1 and S2)
2. In S1 run the command `nc -lv 1111`
3. In S2 run the command `nc -v localhost 1111`
4. Keeping both the shells side-by-side, write something in S2

| Shell 1 | Shell 2|
|:--------|:-------|
|`$ nc -lv 1111` | `nc -v localhost 1111`|
|`Listening on 0.0.0.0 1111` | `Connection to localhost 1111 port [tcp/*] succeeded!` |
|`Hi My name is Nico`|`Hi My name is Nico`|

## You have just created your first chat system!


### Now let's move to something more interesting

Python's socket module provides an interface to the Berkeley sockets API. 

As part of its standard library, Python also has classes that make using these low-level socket functions easier.

<img src="https://files.realpython.com/media/sockets-tcp-flow.1da426797e37.jpg" width="400"/>

## A very simple implementation

Let's try to emulate the behavior of the `netcat` program with an `echo` communication.

**Echo Server** `echo-server.py`

```python
import socket

HOST = '127.0.0.1'  # Standard loopback interface address (localhost)
PORT = 65432  # Port to listen on (non-privileged ports are > 1023)

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind((HOST, PORT))
    s.listen()
    conn, addr = s.accept()
    with conn:
        print(f'Connected by {addr}')
        while True:
            data = conn.recv(1024) # number of bytes
            if not data:
                break
            conn.sendall(data)
```

**Echo Client** `echo-client.py`

```python
import socket

HOST = '127.0.0.1'  # The server's hostname or IP address
PORT = 65432  # The port used by the server

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect((HOST, PORT))
    s.sendall(b'Hello, world')
    data = s.recv(1024)

print(f'Received {data!r}')
```

Now we need to run the two scripts in two different shells, starting with the *server* (!!), which will listen on the given port.

*Shell 1*
```bash
$ python echo-server.py
```

*Shell 2*
```bash
$ python echo-client.py
Received b'Hello, world'
```
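If you prefer to try both ends without opening two shells, the following is a minimal sketch that runs the echo server in a background thread of the same process. The function names `serve_once` and `echo_roundtrip` are hypothetical, and binding to port `0` (so the operating system picks a free port) is a choice of this sketch, not part of the scripts above.

```python
import socket
import threading

HOST = '127.0.0.1'  # loopback interface, as in the scripts above


def serve_once(server_sock):
    """Accept a single client and echo back everything it sends."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # empty bytes: the client has finished sending
                break
            conn.sendall(data)


def echo_roundtrip(message):
    """Start a one-shot echo server, send `message`, return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind((HOST, 0))  # port 0: let the OS pick a free port
        server_sock.listen()
        port = server_sock.getsockname()[1]

        worker = threading.Thread(target=serve_once, args=(server_sock,))
        worker.start()

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect((HOST, port))
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # signal "nothing more to send"
            chunks = []
            while True:
                chunk = client.recv(1024)
                if not chunk:  # the server closed its side
                    break
                chunks.append(chunk)

        worker.join()
    return b''.join(chunks)


print(echo_roundtrip(b'Hello, world'))
```

Note the receive loop on the client side: a single `recv` call is not guaranteed to return the whole message, so reading until the connection is closed is the safer habit.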

Some **NOTES**:

1. The information exchanged between server and client **must** be bytes!
2. Although every variable in a program is ultimately "composed" of bytes, you always need to specify how to convert it into bytes
3. The object must therefore be **serializable** (!), and you can always define the serialization criteria in your custom classes
4. The same criterion is used for the communication between multiple processes in a **distributed computing** framework (ref. MPI)
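To make the first two notes concrete, here is a small sketch of how some common Python values can be converted to bytes and back using only the standard library. The choices of UTF-8, of a 4-byte big-endian integer, and of JSON are assumptions of this example: both sides of a communication simply need to agree on the same convention.

```python
import json

# str <-> bytes: an explicit text encoding is required
text = 'Hello, world'
as_bytes = text.encode('utf-8')
back = as_bytes.decode('utf-8')

# int <-> bytes: size and byte order must be chosen explicitly
number = 1234
num_bytes = number.to_bytes(4, byteorder='big')
num_back = int.from_bytes(num_bytes, byteorder='big')

# dict <-> bytes: serialize to a JSON string, then encode the string
record = {'Hello': 'World!'}
json_bytes = json.dumps(record).encode('utf-8')
record_back = json.loads(json_bytes.decode('utf-8'))

print(as_bytes)     # b'Hello, world'
print(num_bytes)    # b'\x00\x00\x04\xd2'
print(json_bytes)   # b'{"Hello": "World!"}'
```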

```python
import pickle

test_dict = {'Hello': 'World!'}
print(test_dict)

test_dict_ba = pickle.dumps(test_dict)
print(test_dict_ba)

test_dict_reconstructed_ba = pickle.loads(test_dict_ba)
print(test_dict_reconstructed_ba)
```

```
{'Hello': 'World!'}
b'\x80\x04\x95\x15\x00\x00\x00\x00\x00\x00\x00}\x94\x8c\x05Hello\x94\x8c\x06World!\x94s.'
{'Hello': 'World!'}
```


```python
class NewClass:
    def __init__(self, data):
        print(data)
        self.data = data

# Create an object of NewClass
new_class = NewClass(1)

# Serialize and deserialize
pickled_data = pickle.dumps(new_class)
reconstructed = pickle.loads(pickled_data)

# Verify
print('Data from reconstructed object:', reconstructed.data)
```

```
1
Data from reconstructed object: 1
```


However, not every object can be pickled out of the box: if your class holds a resource such as an open file handle, `pickle` fails.

```python
import pickle

class Test(object):
    def __init__(self, file_path='test1234567890.txt'):
        # An open file in write mode
        self.some_file_i_have_opened = open(file_path, 'wb')

my_test = Test()
# Now, watch what happens when we try to pickle this object:
pickle.dumps(my_test)
```

```
TypeError: cannot pickle '_io.BufferedWriter' object
```

If your class holds data that `pickle` cannot handle directly, such as an open file handle, you can override the magic method `__reduce__` to preserve the compatibility with `pickle` serialization.

```python
import pickle

class Test(object):
    def __init__(self, file_path='test1234567890.txt'):
        # Used later in __reduce__
        self._file_name_we_opened = file_path
        # An open file in write mode
        self.some_file_i_have_opened = open(self._file_name_we_opened, 'wb')

    def __reduce__(self):
        # we return a tuple of class_name to call,
        # and optional parameters to pass when re-creating
        return (self.__class__, (self._file_name_we_opened, ))

my_test = Test()
saved_object = pickle.dumps(my_test)
print(repr(saved_object))
```

```
b'\x80\x04\x95.\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x04Test\x94\x93\x94\x8c\x12test1234567890.txt\x94\x85\x94R\x94.'
```


If you want to see a more advanced application of this type, sending and receiving more complex data and instructions, you can take a look at the [CryptoSocket](https://github.com/Nico-Curti/CryptoSocket) repository.
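As a small intermediate step, the sketch below sends a pickled Python object through a socket. It is not taken from the repository above: the helper names (`send_object`, `recv_exactly`, `recv_object`) are hypothetical, and the 4-byte length prefix is just one common way to tell the receiver where a message ends. `socket.socketpair()` is used only to get two already connected sockets in a single process.

```python
import pickle
import socket
import struct


def send_object(sock, obj):
    """Serialize `obj` and send it prefixed by its length (4 bytes, network order)."""
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack('!I', len(payload)) + payload)


def recv_exactly(sock, n_bytes):
    """Keep reading until exactly `n_bytes` have arrived."""
    buffer = b''
    while len(buffer) < n_bytes:
        chunk = sock.recv(n_bytes - len(buffer))
        if not chunk:
            raise ConnectionError('socket closed before the message was complete')
        buffer += chunk
    return buffer


def recv_object(sock):
    """Read the length prefix, then the payload, and rebuild the object."""
    (length,) = struct.unpack('!I', recv_exactly(sock, 4))
    return pickle.loads(recv_exactly(sock, length))


# two connected sockets in the same process, standing in for client and server
left, right = socket.socketpair()
with left, right:
    send_object(left, {'Hello': 'World!', 'values': [1, 2, 3]})
    received = recv_object(right)

print(received)
```

Keep in mind that `pickle.loads` can execute arbitrary code while rebuilding an object, so it should only be used on data coming from a trusted sender.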

## Move to the Web

Most current web development is performed using other languages such as HTML, JavaScript, and CSS.

This is mainly due to the expertise of the current generation of developers and to the ease of use of JavaScript.

Since we need to keep up with them, let's move to `JavaScript`.

Some **NOTES**:

1. **HyperText Markup Language** or **HTML** is the standard <u>markup</u> language for documents <u>designed</u> to be displayed in a web browser. 
    It defines the content and structure of web content.
    
2. **Cascading Style Sheets (CSS)** is a style sheet language used for specifying the <u>presentation</u> and <u>styling</u> of a document written in a markup language such as HTML or XML (including XML dialects such as SVG, MathML or XHTML)

3. **JavaScript (JS)** is a programming language and core technology of the Web, alongside HTML and CSS. 99% of websites use JavaScript on the client side for <u>webpage behavior</u>

If you want to keep your Python habits, you can replace the JavaScript part by integrating your project with third-party libraries.

The most famous (and easiest to use) is probably [Flask](https://flask.palletsprojects.com/en/3.0.x/)

## Our first Web Example

We have a very important database with private data that cannot be shared with the entire world.

This database will be hosted by a private server with high computational power.

We have a series of collaborators who want to analyze our data and also update them with new records (data collection).

> In all your web API projects you need to manage (at the same time!) two main elements:
> * A **server** which hosts your information and is ready to listen and send something
> * A **client** which typically produces something that must be sent to the *server*

**Server**: the computer which hosts the data

**Client**: the partners who want to get access to our data

And the protection?!

### We need to develop a login authentication service

```bash
web
├── client
└── server

2 directories, 0 files
```

First of all, we need to create a fake database to use as a toy model.

For the sake of simplicity, we will use a standard `MySQL` database ([installer](https://dev.mysql.com/downloads/installer/)).

```bash
$ sudo service mysql start
$ sudo mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 11
Server version: 8.0.35-0ubuntu0.20.04.1 (Ubuntu)

Copyright (c) 2000, 2023, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>
```

MySQL has its own syntax, and to keep track of the tables and data that we want to manage it is always preferable to write a `scheme.sql` file.

**`server/scheme.sql`**
```sql
CREATE DATABASE IF NOT EXISTS `test` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
USE test;
```

**`server/scheme.sql`**
```sql
CREATE TABLE IF NOT EXISTS `accounts` (
  `id` int(20) NOT NULL AUTO_INCREMENT,
  `password` varchar(10) NOT NULL,
  `email` varchar(100) NOT NULL,
  `token` varchar(42),
  `create_date` date NOT NULL,
  `last_login` date,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
```

**`server/scheme.sql`**
```sql
# debug account
INSERT INTO `accounts` (`id`, `password`, `email`, `create_date`) VALUES (1, 'test', 'test@test.com', '2024-01-06');
```

You can find a complete list of the instructions for the MySQL database management at this [link](https://dev.mysql.com/doc/refman/8.0/en/sql-data-manipulation-statements.html)

However, the most common queries involve the use of these keywords:

* `SELECT`: extracts information from a given table
* `INSERT`: adds new records to a given table
* `UPDATE`: updates an existing record in a given table
* `DELETE`: removes an existing record from a given table

Now that we have created the schema of our database, we can simply copy and paste these instructions into the real MySQL database.
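The four keywords can be tried without a MySQL server at hand. The sketch below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in: this is an assumption of the example, and the SQLite dialect differs from MySQL in some details (for instance the column types), so the table definition is a simplified version of the one above.

```python
import sqlite3

# in-memory SQLite database: a stand-in for the MySQL server
conn = sqlite3.connect(':memory:')
conn.execute('''
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        password TEXT NOT NULL,
        email TEXT NOT NULL,
        create_date TEXT NOT NULL
    )
''')

# INSERT: add a new record
conn.execute(
    'INSERT INTO accounts (password, email, create_date) VALUES (?, ?, ?)',
    ('test', 'test@test.com', '2024-01-06'),
)

# SELECT: extract information
rows = conn.execute('SELECT id, email FROM accounts').fetchall()
print(rows)  # [(1, 'test@test.com')]

# UPDATE: modify an existing record
conn.execute('UPDATE accounts SET email = ? WHERE id = ?', ('new@test.com', 1))
updated = conn.execute('SELECT email FROM accounts WHERE id = 1').fetchone()
print(updated)  # ('new@test.com',)

# DELETE: remove an existing record
conn.execute('DELETE FROM accounts WHERE id = ?', (1,))
remaining = conn.execute('SELECT COUNT(*) FROM accounts').fetchone()[0]
print(remaining)  # 0

conn.close()
```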

```bash
web
├── client
└── server
    └── scheme.sql

2 directories, 1 file
```

For the management of the database we will use the JS language.

JS (via Node.js) should already be installed on your computer, along with its package manager (`npm`).
Since the MySQL support is provided by an external package, you can simply install it (and verify that JS is working) using the command:

```bash
web/server$ npm install mysql2

added 12 packages, and audited 14 packages in 1s

found 0 vulnerabilities
```

If `npm` is not already installed, you can get it using the commands

```bash
$ sudo apt update
$ sudo apt install nodejs
$ node -v
$ sudo apt install npm
```

For Windows users, I recommend taking a look at this [page](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm)

Something has changed in our folder tree:

```bash
web/
├── client
└── server
    ├── node_modules
    ├── package-lock.json
    ├── package.json
    └── scheme.sql

3 directories, 3 files
```

### Let's start with JS

**`server/database.js`**
```js
// connection with db
import mysql from 'mysql2'

// connect to the MYSQL db with a pool of
// async connections
const db = mysql.createPool({
  host        : '127.0.0.1',
  user        : 'nico.curti2',
  password    : 'password',
  port        : '3306',
  database    : 'test',
}).promise();
```

Since we have already set a fake account, we can try to get the data inserted in our database using a simple *query*

**`server/database.js`**
```js
const results = await db.query(`
    SELECT *
    FROM accounts
  `);
console.log(results);
```

`promise` and `await` are standard tools for the management of asynchronous processes: `.promise()` wraps the pool so that its queries return promises, while `await` is a JS keyword.

In the first case, you're making exactly a `promise` about something that will be evaluated (sooner or later), telling the rest of the program "trust me, when you need it the result will be ready".

Since every promise can take a different amount of time, the code that needs the result must "wait" for it with `await`, creating a sort of barrier in the execution.
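If you are more familiar with Python, the same idea can be sketched with the `asyncio` module of the standard library, where a *task* plays roughly the role of a JS promise. The function `slow_query` is hypothetical and simply simulates a database call with a delay.

```python
import asyncio


async def slow_query(label, delay):
    """Pretend to query a database: wait `delay` seconds, then answer."""
    await asyncio.sleep(delay)
    return f'{label} done'


async def main():
    # scheduling a task is the "promise": the work starts, nobody waits yet
    task_a = asyncio.create_task(slow_query('A', 0.2))
    task_b = asyncio.create_task(slow_query('B', 0.1))
    # `await` is the barrier: execution resumes only when the result is ready
    result_a = await task_a
    result_b = await task_b
    return [result_a, result_b]


results = asyncio.run(main())
print(results)  # ['A done', 'B done']
```

The two simulated queries run concurrently, so the whole script takes about as long as the slowest one. Inside a Jupyter notebook an event loop is already running, so there you would write `results = await main()` instead of calling `asyncio.run`.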

### Run our first JS script

To run a JS script the syntax is simply `node script.js`, in our case:

```bash
web/server$ node database.js
```

```bash
(node:4493) Warning: To load an ES module, set "type": "module" in the package.json or use the .mjs extension.
(Use `node --trace-warnings ...` to show where the warning was created)
/mnt/c/Users/utente/Desktop/CONFERENCE/SoftwareAndComputing24/web/server/database.js:3
import mysql from 'mysql2'
^^^^^^

SyntaxError: Cannot use import statement outside a module
```

### The error provides the solution!

JS works in the same way as Python, so you can write your code directly in a script file, **but** if you want to break it into a series of files (like in our case!) you need to declare the script as a "module" file.

In our case, indeed, we are trying to import other functions from an independent (series of) file(s) stored in the `mysql2` folder.

**`server/package.json`**
```json
{
  "name": "test-server",
  "type": "module",
  "version": "0.0.1",
  "keywords": [],
  "author": "Nico Curti",
  "license": "MIT",
  "dependencies": {
    "mysql2": "^3.9.7"
  }
}
```

Re-running the above code:

```bash
web/server$ node database.js
[
  [
    {
      id: 1,
      password: 'test',
      email: 'test@test.com',      
      create_date: '2024-04-26',
      last_login: '2024-04-26'
    }
  ],
  [
    `id` INT(20) NOT NULL PRIMARY KEY AUTO_INCREMENT,
    `password` VARCHAR(10) NOT NULL,
    `email` VARCHAR(100) NOT NULL,
    `create_date` DATE(10) NOT NULL,
    `last_login` DATE(10)
  ]
]
```

The program doesn't stop at the end of the task: the connection pool keeps its connections open, so the Node.js process stays alive, ready to listen and receive new information (great language!)

The output is a simple list that can be managed with the "standard" syntax of all the other languages.

**`server/database.js`**
```js
const results = await db.query(`
    SELECT *
    FROM accounts
  `);
console.log(results[0]);
```

```bash
web/server$ node database.js
[
  {
    id: 1,
    password: 'test',
    email: 'test@test.com',      
    create_date: '2024-04-26',
    last_login: '2024-04-26'
  }
]
```

**`server/database.js`**
```js
const [value, table_info] = await db.query(`
    SELECT *
    FROM accounts
  `);
console.log(value);
```

```bash
web/server$ node database.js
[
  {
    id: 1,
    password: 'test',
    email: 'test@test.com',      
    create_date: '2024-04-26',
    last_login: '2024-04-26'
  }
]
```

**`server/database.js`**
```js
const [value, table_info] = await db.query(`
    SELECT *
    FROM accounts
  `);
console.log(value[0].email, value[0].password);
```

```bash
web/server$ node database.js
test@test.com test
```

To check whether you are authorized to have access to our database we need to verify if the pair given by (`email`, `password`) is included in our `accounts` table.

It would be great to declare a function able to verify this condition!

**`server/database.js`**
```js
// check the correctness of the pair info
// provided in input
export async function checkAccount (email, pwd) {
  const [rows] = await db.query(`
    SELECT *
    FROM accounts
    WHERE
      email = ?
    AND
      password = ?
  `, [email, pwd]);
  return rows.length > 0;
}

const valid = await checkAccount('test@test.com', 'test')
console.log(valid);

const invalid = await checkAccount('test@test.com', 'hello')
console.log(invalid);
```

Some **NOTES**:

* Our function must be callable also from other programs, so we need to "mark" it with the keyword `export`.
* Our function should work in an asynchronous framework, so we need to "mark" it with the keyword `async`.
* Our function must tell us if there is at least one record with these two features, so a simple True/False is sufficient.

```bash
web/server$ node database.js
true
false
```
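The `?` placeholders in the query deserve a comment: the values are passed separately from the SQL text, so they are never interpreted as SQL code. The sketch below reproduces the same check in Python, using the built-in `sqlite3` module and an in-memory table as a stand-in for the MySQL database (an assumption of this example); `check_account` mirrors the JS function above but is not the same code.

```python
import sqlite3


def check_account(conn, email, pwd):
    """Return True if at least one record matches the (email, password) pair."""
    row = conn.execute(
        'SELECT 1 FROM accounts WHERE email = ? AND password = ? LIMIT 1',
        (email, pwd),  # values are bound to the placeholders, not pasted in
    ).fetchone()
    return row is not None


# toy table with the debug account
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE accounts (email TEXT NOT NULL, password TEXT NOT NULL)')
conn.execute('INSERT INTO accounts VALUES (?, ?)', ('test@test.com', 'test'))

print(check_account(conn, 'test@test.com', 'test'))         # True
print(check_account(conn, 'test@test.com', 'hello'))        # False
# a classic injection attempt is treated as a plain (wrong) password
print(check_account(conn, 'test@test.com', "' OR '1'='1"))  # False
```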

Some **NOTES**:

* It is never a good idea to explicitly write our credentials into a script (!)
* Furthermore, if we want to upload our code to GitHub, it represents a giant vulnerability (!!)
* One possibility is to use a configuration file, loaded at the beginning of the script.

Fortunately, the JS ecosystem has already addressed this very common issue and provides a ready-to-use solution.

```bash
web/server$ npm install dotenv
```

```bash
web/
├── client
└── server
    ├── .env
    ├── database.js
    ├── node_modules
    ├── package-lock.json
    ├── package.json
    └── scheme.sql

3 directories, 5 files
```

**`server/.env`**
```js
MYSQL_HOST='127.0.0.1'
MYSQL_USER='nico.curti2'
MYSQL_PWD='password'
MYSQL_PORT='3306'
MYSQL_DATABASE='test'
```

**`server/database.js`**
```js
// connection with db
import mysql from 'mysql2'
// usage of env variables
import dotenv from 'dotenv'

// load the env variables from .env file
dotenv.config();

// connect to the MYSQL db with a pool of
// async connections
const db = mysql.createPool({
  host        : process.env.MYSQL_HOST,
  user        : process.env.MYSQL_USER,
  password    : process.env.MYSQL_PWD,
  port        : process.env.MYSQL_PORT,
  database    : process.env.MYSQL_DATABASE,
}).promise();
```
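The same pattern works in any language, since environment variables belong to the process and not to JS. As a rough Python sketch, the hypothetical function `load_db_config` below reads the same variable names from a mapping (by default `os.environ`); the fallback values and the choice of which variables are mandatory are assumptions of this example.

```python
import os


def load_db_config(env=os.environ):
    """Collect the database settings from environment variables."""
    return {
        'host': env.get('MYSQL_HOST', '127.0.0.1'),
        'user': env['MYSQL_USER'],       # mandatory: KeyError if missing
        'password': env['MYSQL_PWD'],    # mandatory: KeyError if missing
        'port': int(env.get('MYSQL_PORT', '3306')),
        'database': env.get('MYSQL_DATABASE', 'test'),
    }


# a plain dict stands in for the real environment in this demo
example_env = {'MYSQL_USER': 'demo', 'MYSQL_PWD': 'secret'}
config = load_db_config(example_env)

# never print the password itself
print({key: value for key, value in config.items() if key != 'password'})
```

Whatever the language, the configuration file only helps if it stays out of version control, for example by listing `.env` in your `.gitignore`.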

Up to now we have defined all the functions which connect our database with a JS script.

**NOTE**: the MySQL database is running on a port of our computer (PORT=3306) and it should already be accessible from an authorized program.
With our JS script we simply build a user-friendly interface that lets *other* programs manage the database!

## Let's move to the web

We need to create the server application script

```bash
web/
├── client
└── server
    ├── .env
    ├── app.js
    ├── database.js
    ├── node_modules
    ├── package-lock.json
    ├── package.json
    └── scheme.sql

3 directories, 6 files
```

The purpose of the application script is to play the role of the **server** in our framework.

It should be able to listen for information from the environment (on a particular PORT) and to send information to the possible **client**s.

## How can we communicate with a Web page?

When you type an address such as www.google.com into your browser, you are commanding it to open a TCP channel to the server that responds to that URL (or Uniform Resource Locator).

In this situation, your computer, which is making the request, is called the client. 
The URL you are requesting is the address that belongs to the server. 
Once the TCP connection is established, the client sends an **HTTP GET** request to the server to retrieve the webpage it should display. 
After the server has sent the response, it closes the TCP connection.
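To see that an HTTP request is nothing more than text sent over a TCP socket, the sketch below writes a GET request "by hand". To stay self-contained it does not contact a real website: a tiny local server from Python's `http.server` module (with the hypothetical handler `HelloHandler`) answers on a free port chosen by the operating system.

```python
import http.server
import socket
import threading


class HelloHandler(http.server.BaseHTTPRequestHandler):
    """Answer every GET request with a short plain-text message."""

    def do_GET(self):
        body = b'Test server is running'
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output clean


# local server on a free port, running in a background thread
server = http.server.HTTPServer(('127.0.0.1', 0), HelloHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# the GET request is plain text: request line, headers, empty line
request = (
    'GET / HTTP/1.1\r\n'
    f'Host: 127.0.0.1:{port}\r\n'
    'Connection: close\r\n'
    '\r\n'
).encode('ascii')

# open the TCP channel, send the request, read until the server closes
with socket.create_connection(('127.0.0.1', port)) as sock:
    sock.sendall(request)
    response = b''
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        response += chunk

server.shutdown()
server.server_close()

# the response has the same layout: status line, headers, empty line, body
head, _, body = response.partition(b'\r\n\r\n')
status_line = head.split(b'\r\n')[0].decode('ascii')
print(status_line)
print(body.decode('ascii'))
```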

Independently of the language/software/code you are working with, there are 2 main ways to schematize the communication between 2 elements (server and client):

* **POST** request: the client sends data to the server, which takes them as input to analyze
* **GET** request: the client asks the server for a resource, which is provided as output of the request

### And in our case?

We need to provide an authentication, so:

* The **client** needs to perform a POST request to the server, providing the user information
* The **server** needs to analyze this information and return the response

Also in this case there is a series of ready-to-use solutions in the JS ecosystem; in our current example we will use the `express` package

```bash
web/server$ npm install express

added 64 packages, and audited 78 packages in 2s

13 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities
```

**`server/app.js`**
```js
// get/post requests
import express from 'express'
// import query functions
import {
  checkAccount
} from './database.js'

// declare the application
const app = express();
```

To check the correctness of our script, we can create a simple dummy page that *gets* a message in response.

**`server/app.js`**
```js
// info and port setting
app.listen(8080, () => {
  console.log('Server is running on PORT 8080');
})
// initial page => http://localhost:8080
app.get('/', async (req, res) => {
  res.status(200).send('Test server is running');
});
```

```bash
web/server$ node app.js
Server is running on PORT 8080
```

And what's happening in our browser at the address `http://localhost:8080`?

Some **NOTES**:

* In the "standard" web pages we use the HTTP**S** protocol to reach the page, while in the above address we have the **HTTP** one!
* The *Hypertext Transfer Protocol* (HTTP) is a protocol, i.e. an ensemble of rules, for the communication between client and server.
* The *Hypertext Transfer Protocol Secure* (HTTPS) is a safer extension of the classical HTTP. The security is guaranteed by an encrypted connection (TLS) between client and server.

GET request - Our server is outputting something...

...but it must be able also to get something in input - POST request

### Where do we put the data?

All the communications between server and client(s) must be performed using *serializable* data structures.

The easiest way is given by the `JSON` format, which contains both the values and the names of the data.

The structure of an HTTP request is well standardized, and for our tasks there are 2 main components to take care of:

* *header*: the headers of the request carry metadata (e.g. the content type) and are typically where sensitive information, such as authentication tokens, is read from
* *body*: the body of the request is typically used to send the data, storing all the relevant information of a POST
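The two components can be inspected from Python before anything is sent. The sketch below builds a POST request object with `urllib` from the standard library and prints its parts. The URL is only a placeholder for this example (no connection is opened), and the credentials are those of the debug account.

```python
import json
import urllib.request

# the data we want to send, serialized as JSON and then encoded to bytes
payload = {'email': 'test@test.com', 'password': 'test'}
body = json.dumps(payload).encode('utf-8')

# build the request object: nothing is sent over the network here
request = urllib.request.Request(
    'http://localhost:8080/auth',                  # placeholder URL
    data=body,                                     # the *body*
    headers={'Content-Type': 'application/json'},  # the *header*
    method='POST',
)

print(request.get_method())    # POST
print(request.header_items())  # [('Content-type', 'application/json')]
print(request.data)            # b'{"email": "test@test.com", "password": "test"}'
```

The `Content-Type` header is what tells the server how to de-serialize the body. `urllib` normalizes the capitalization of the name when storing it, which is harmless because HTTP header names are case-insensitive.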

To allow the correct management of the *body* serialization/de-serialization, we need to add an extra package to our code:

```bash
web/server$ npm install body-parser
```

And edit our app code:

**`server/app.js`**
```js
// necessary for request body parsing
import bodyParser from 'body-parser'

// declare the application
const app = express();
// set the parser for the body post request
app.use(bodyParser.json());
```

Now we can create our POST request, extracting the info from the body of the request and verifying their validity using our `checkAccount` function.

**`server/app.js`**
```js
// check authentication
app.post('/auth', async (req, res) => {
  // extract the user, pwd from the post request
  const { email, password } = req.body;
  // query the db in the user table
  let account = await checkAccount(email, password);
  // if it is valid
  if (account) {
    res.status(200).send('Login Success');
  } else {
    res.status(401).send('Login Failed');
  }
});
```

## How do we test it?

To test our framework we need to provide a *client*

Two possibilities:

1. Write another "server" which plays the role of the client.
2. Perform GET/POST requests using other languages and software.

### A ready-to-use solution

Simply using `curl`

```bash
$ curl -H 'Content-Type: application/json' -d '{"email" : "test@test.com", "password" : "test"}' -X POST http://localhost:8080/auth/
> Login Success
```

### A Pythonic one

```python
import requests

# set the url of the API
api_url = 'http://localhost:8080/auth/'
# set the user information for the login
data = {
    'email' : 'test@test.com',
    'password' : 'test',
}
# send the login request
res = requests.post(api_url, json=data)
# verify the status code
if res.status_code == 200:
    print('Login: Success')
else:
    print('Login Failed')
```

```
Login: Success
```
