<div style="position: relative;">
<img src="https://user-images.githubusercontent.com/7065401/98728503-5ab82f80-2378-11eb-9c79-adeb308fc647.png"></img>

<h1 style="color: white; position: absolute; top:27%; left:10%;">
     Secure RESTful APIs using Python
</h1>

<h3 style="color: #ef7d22; font-weight: normal; position: absolute; top:56%; left:10%;">
    David Mertz, Ph.D.
</h3>

<h3 style="color: #ef7d22; font-weight: normal; position: absolute; top:63%; left:10%;">
    Data Scientist
</h3>
</div>

# JSON Web Tokens

JSON Web Tokens, or 'JWT', are simply tokens that carry a richer context and more standardized encoded fields than the ad hoc random token shown in the last lesson.  The central concept in a JWT is a "claim" made between two communicating entities, where the features of a claim are simply JSON keys that take values in certain mandated forms.  The rules for JWTs are quite detailed, and are set out in [RFC 7519](https://tools.ietf.org/html/rfc7519).

Like many protocols, JWT does not specify anything you could not do in your own way (as in the last lesson), but it provides standard spellings for features that tokens commonly represent.  For example:

* “exp” (Expiration Time) Claim
* “nbf” (Not Before Time) Claim
* “iss” (Issuer) Claim
* “aud” (Audience) Claim
* “iat” (Issued At) Claim

All of these fields are optional.  Moreover, any other fields may also be optionally included, but do not carry prescribed meanings.
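As a minimal sketch of how these registered claims look in practice, the dictionary below spells out each one.  RFC 7519 expresses the time-valued claims as seconds since the epoch; the issuer and audience strings, the example date, and the helper name `numeric_date` are placeholders invented for illustration.

```python
from datetime import datetime, timedelta, timezone

# Arbitrary example instant; a real issuer would use the current time
issued = datetime(2021, 4, 20, tzinfo=timezone.utc)

def numeric_date(dt):
    """Convert a timezone-aware datetime to whole seconds since the epoch."""
    return int(dt.timestamp())

claims = {
    "iss": "example-issuer",      # hypothetical: who created the token
    "aud": "example-audience",    # hypothetical: who the token is meant for
    "iat": numeric_date(issued),                       # when it was issued
    "nbf": numeric_date(issued),                       # not valid before this
    "exp": numeric_date(issued + timedelta(days=30)),  # not valid after this
    "favorite-number": 42,        # extra key: allowed, but no prescribed meaning
}

for name, value in claims.items():
    print(name, value)
```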

# Encrypting tokens

The library `PyJWT` supports all the conventions of the JWT protocol, and integrates with both symmetric and public key encryption techniques.  For example, let us create a token using symmetric-key encryption.  We can create tokens that utilize **none** of the recommended fields if we like.

In [1]:
import jwt
secret_key = 'gZivP8J1kSLDow'
jwt.encode({'this': 21, 'that': 99}, secret_key)

b'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0aGlzIjoyMSwidGhhdCI6OTl9.eotKf11s826ilemfbI6e7AIHxsQiCblnfifg2vDHNxQ'

We get more capability by utilizing fields specified in the RFC.

In [2]:
from datetime import datetime, timedelta
now = datetime.utcnow()
later = now + timedelta(days=30)

In [3]:
data = {"extra-data": 42, 
        "exp": now,
        "nbf": now,
        "iss": "INE"}
crypt = jwt.encode(data, secret_key)
crypt

b'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJleHRyYS1kYXRhIjo0MiwiZXhwIjoxNjE4ODc0NTEzLCJuYmYiOjE2MTg4NzQ1MTMsImlzcyI6IklORSJ9.aXZXp-fnQ6FtAnC4MUD-0iOox1rVbZr8uSudfMwsm_E'

At this point we have a token that is protected by a secret.  How we would do key management to make this secret available to all the interested endpoints is an issue outside the scope of this course.  But distributing symmetric keys is certainly a frequent element of security design.

We can try to decode this token to find its contents.  Unlike the purely random tokens in the prior lesson, this one actually **has** meaningful data in itself.

In [4]:
try:
    jwt.decode(crypt)
except Exception as err:
    print(repr(err))

InvalidSignatureError('Signature verification failed')


As intended, without the secret we cannot successfully decode the data (except that we **can**, as shown below).  Let's try providing the secret.

In [5]:
try:
    jwt.decode(crypt, secret_key)
except Exception as err:
    print(repr(err))

ExpiredSignatureError('Signature has expired')


The exception raised is appropriately different, indicating that one of the RFC-specified fields is not satisfied.  The signature *was* verified successfully, however.  The token we created was not ideal, since its expiration time is the moment the datetime object was created, a couple of minutes ago.  Let's create one with a more useful expiration.

In [6]:
data['exp'] = later
crypt2 = jwt.encode(data, secret_key)
crypt2

b'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJleHRyYS1kYXRhIjo0MiwiZXhwIjoxNjIxNDY2NTEzLCJuYmYiOjE2MTg4NzQ1MTMsImlzcyI6IklORSJ9.HkdUdfKr88iP9uiw3GOb82zZAkehIC7e5bALzecztH8'

In [7]:
jwt.decode(crypt2, secret_key)

{'extra-data': 42, 'exp': 1621466513, 'nbf': 1618874513, 'iss': 'INE'}

However, these tokens are not actually encrypted, but merely signed.  This is useful for authentication, which is their purpose.  Encryption is handled by the SSL/TLS layer (or could be handled by other means).  The timestamps are expressed as "seconds since the epoch" (the same as what is returned by Python's `time.time()`).
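To make the "signed, not encrypted" point concrete, here is a small sketch that reads the first token with nothing but the standard library.  It assumes the token printed by `In [3]`, pasted in as a string; `b64url_decode` is a helper name made up for this example.  No secret key appears anywhere in it.

```python
import base64
import json

# The token from In [3], split across lines at its two dots
token = ("eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9."
         "eyJleHRyYS1kYXRhIjo0MiwiZXhwIjoxNjE4ODc0NTEzLCJuYmYiOjE2MTg4NzQ1MTMsImlzcyI6IklORSJ9."
         "aXZXp-fnQ6FtAnC4MUD-0iOox1rVbZr8uSudfMwsm_E")

def b64url_decode(segment):
    """Decode URL-safe base64, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)

header_b64, payload_b64, signature_b64 = token.split(".")

header = json.loads(b64url_decode(header_b64))     # plain JSON
payload = json.loads(b64url_decode(payload_b64))   # plain JSON
signature = b64url_decode(signature_b64)           # raw signature bytes

print(header)
print(payload)
print(len(signature), "signature bytes")
```

The first two segments are ordinary JSON that anyone can read; only the third segment depends on the secret.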

### Decoding without the secret key

In fact, we can decide to decode while ignoring validation if we wish to. For example, we can decode the first token:

In [8]:
jwt.decode(crypt, verify=False)

{'extra-data': 42, 'exp': 1618874513, 'nbf': 1618874513, 'iss': 'INE'}

Even though we neither provided a key, nor checked the expiration, we can see the content of the token.  We might want to skip signature verification but still check for expiration compliance:

In [9]:
try:
    jwt.decode(crypt, options={"verify_signature": False})
except Exception as err:
    print(repr(err))

ExpiredSignatureError('Signature has expired')


With no secret key we can still check expiration. For example, if we check the `crypt2` token, we get a different result.

In [10]:
jwt.decode(crypt2, options={"verify_signature": False})

{'extra-data': 42, 'exp': 1621466513, 'nbf': 1618874513, 'iss': 'INE'}

## How is this useful?

A plausible pattern for using these JWT tokens is to have services issue them upon successful login.  Some kind of credential validation at an initial step is still required if access should be restricted.  But by using tokens that themselves bundle information, this avoids the need to retain a database or in-memory mapping of which credential tokens have been issued. The tokens themselves contain the necessary information.

The data in tokens can include details like expiration date or "audience" (what we called "username" in the prior lesson).  But they may also include arbitrary `extra-data`, as in the above example.  As many such extra data fields as you like may be included, and each one can contain an arbitrary JSON data structure, not necessarily only a scalar like in the example.

### Threat model

For the use described here, you do **not** want to distribute the secret key to consumers of a service.  For example, if you utilize an expiration timestamp as part of the token and consumers had the secret key, they could create false tokens with a later expiration.

However, a consumer who gets a JWT token is still perfectly able to decode it, without verification, and inspect its contents.  What is required is simply that they send back the literal token they received from the service along with a request.  This might be attached to a session cookie. Or it might be a field in a JSON structure in a PUT request. Or it might be communicated in a designated header.  We have seen a variety of mechanisms by which a token can be communicated between a consumer and a service.
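The pattern described in this section can be sketched with the standard library alone.  This is a hand-rolled HS256 signature, written only to expose the moving parts; a real service would use `PyJWT` as elsewhere in this lesson.  The names `SECRET`, `issue_token` and `check_token`, and the user "alice", are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"server-side-secret"   # placeholder; never given to consumers

def _b64(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def _unb64(segment):
    return base64.urlsafe_b64decode(segment + b"=" * (-len(segment) % 4))

def _sign(signing_input):
    return _b64(hmac.new(SECRET, signing_input, hashlib.sha256).digest())

def issue_token(username, lifetime=3600, now=None):
    """Called once, after the login credentials have been checked."""
    now = int(time.time()) if now is None else now
    header = _b64(json.dumps({"typ": "JWT", "alg": "HS256"}).encode())
    claims = _b64(json.dumps({"aud": username, "exp": now + lifetime}).encode())
    signing_input = header + b"." + claims
    return signing_input + b"." + _sign(signing_input)

def check_token(token, now=None):
    """Return the claims if the token is genuine and unexpired, else None."""
    now = int(time.time()) if now is None else now
    try:
        header, claims, signature = token.split(b".")
        if not hmac.compare_digest(signature, _sign(header + b"." + claims)):
            return None
        payload = json.loads(_unb64(claims))
    except ValueError:
        return None
    return payload if payload.get("exp", 0) > now else None

# The service keeps no record of issued tokens; the token carries its own state
token = issue_token("alice", now=1_000)
print(check_token(token, now=1_500))   # the claims: still valid
print(check_token(token, now=9_999))   # None: expired

# A consumer can read the claims, but editing them breaks the signature
header, claims, signature = token.split(b".")
forged_claims = _b64(json.dumps({"aud": "alice", "exp": 10**12}).encode())
forged = header + b"." + forged_claims + b"." + signature
print(check_token(forged, now=9_999))  # None: signature no longer matches
```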

A service (which will possess its own secret key) is able to run code similar to this:


```python
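@app.route('/process')
def process():
    # Pick one; here we accept whichever channel the consumer used
    token = (request.cookies.get('app1-token') or
             request.form.get('app1-token') or
             request.headers.get('X-app1-token'))
    try:
        # decode() raises, rather than returning a false value, when the
        # token is missing, badly signed, or expired
        jwt.decode(token, secret_key, algorithms=['HS256'])
    except jwt.InvalidTokenError:
        abort(403)
    return interesting_data()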
```

## Public-key encryption

The module `PyJWT` also supports public-key encryption, rather than symmetric-key encryption.  This allows for a slightly different usage pattern.  In the design described above, a login step to obtain a JWT signed by a service is required before utilizing other methods on that service (assuming use of the method is restricted).  By using public-key encryption, this setup step can be avoided.

To illustrate this, I have generated several public/private key pairs, called `server1`, `server2` and `client` in the manner shown:

```
% ssh-keygen -t rsa -b 4096 -m PEM -f client.key
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in client.key
Your public key has been saved in client.key.pub
The key fingerprint is:
SHA256:PkMFmScKohhHL+WI3sBGfr5Ls+VX0kp1wPnyHwsT3iM dmertz@popkdm
The key's randomart image is:
+---[RSA 4096]----+
| o. .   oo.      |
|=.o=.   +=.      |
|oOooo. . o+      |
|= =.  .  + +     |
| . o    S = o    |
|    .  = o E +   |
|   + .. B   = +  |
|  . *  o o   o   |
|   o ..          |
+----[SHA256]-----+

% openssl rsa -in client.key -pubout -outform PEM -out client.key.pub
writing RSA key
```

Under one obvious use pattern, servers will need to look up or cache the public keys of clients.  In a micro-service or cloud-native architecture, most nodes will act as both server and client, as one node calls another for support.  But once a public key is obtained, it can be used to authenticate the identity of a requester without needing any other credentials or login system.

Technically, we need not use JWT or `PyJWT` to send requests signed using private keys, but doing so bundles up several interfaces and capabilities that are convenient.  A key server can distribute public keys upon request.  The example below uses only HTTP, and does not bother with SSL/TLS encryption of connections, because a public key is very specifically **public**.  Intercepting it is completely problem free (other security issues, such as MITM or "man in the middle" attacks, are outside the scope of this course, but should be considered).

We might obtain public keys in a manner similar to this:

In [11]:
import requests
resp = requests.get("http://localhost:5010/getkey?identity=server1")
s1_pub = resp.text
print(resp.text)

-----BEGIN PUBLIC KEY-----
MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAlqxKOuWu0GJs2XcTBnQ3
imFmPM81g3gqLMsfaZVX13QGhc70oJrIAZnEGCUIMpvDvrnYU87x4wEgWN/ya4W5
wBz1+34mWNBhHCBn7wAhbcr8YqJj2/C0IgvglG6WuaWPLDR0sd5+PPlZdCNIxYP7
wabqFT6JM0A0+Nb/OFYz7dcu9yM1vLUgya8mxkVbWKLFz6m5rsM7fa/DGnmSWXwJ
4vglNxnpOnj/0Zpu5sI7yt2Wbr8QwiLm96q1vIdmr1tqS5Ra3msn4pUrpApiGM3H
uYBziRLha79w6sTCzKq4df5M90Cclx4eayrP2uFGGduPNCT4Db0YqDkGF7fPGuxw
Q1SY+fgGBJM/XFeQfYm0DyraqRIM4bO7pS67w8XPV/EOaBZ3rKn7VMBreJ2msJye
/bAv+R2PYsYBUg52qBrUFD+ObyPNuIa0Ss2dEhdamN053YOvTxabYnVv0L1uCRRt
+PDJJl0VUgZao99b3s5rQFuUAF2tXdTv+at44kDuEyyzHKIL7T09381QsZgOcNp7
0OHPWmXJO1j/L7UYDbKcTDwg1ejD50kfk1ghT0ztd8K9d4bZHkp6NIciHzwhfDMV
yHIPh+NTl1c6rBe7P+8mY86nwIzVX9yScnG5/HWBFTjSudtSqJNgUXkk96Kafing
1NCUkUOAp578xmWTwtODB5UCAwEAAQ==
-----END PUBLIC KEY-----



In [12]:
resp = requests.get("http://localhost:5010/getkey?identity=server99")
print(resp.status_code, resp.text)

404 [Errno 2] No such file or directory: 'server99.key.pub'
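Since servers need to look up or cache the public keys of their clients, a small cache in front of the key server lookup might look like the sketch below.  `PublicKeyCache` and its parameters are hypothetical names, and the lookup itself is passed in as a function so that the example runs without a key server; in practice `fetch` would wrap a `requests.get` call like the ones above and return `None` for a 404.

```python
import time

class PublicKeyCache:
    """Remember fetched public keys for `ttl` seconds."""

    def __init__(self, fetch, ttl=300, clock=time.monotonic):
        self._fetch = fetch      # callable: identity -> PEM text, or None if unknown
        self._ttl = ttl
        self._clock = clock
        self._entries = {}       # identity -> (expiry time, PEM text)

    def get(self, identity):
        entry = self._entries.get(identity)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                  # fresh cache hit
        pem = self._fetch(identity)          # miss or stale: ask the key server
        if pem is not None:
            self._entries[identity] = (self._clock() + self._ttl, pem)
        return pem


# Stand-in for the HTTP lookup; the PEM text is a placeholder, not a real key
calls = []

def fake_fetch(identity):
    calls.append(identity)
    return {"client": "PEM text for client (placeholder)"}.get(identity)

cache = PublicKeyCache(fake_fetch, ttl=60)
cache.get("client")
cache.get("client")      # served from the cache; fake_fetch is not called again
cache.get("server99")    # unknown identities are not cached
print(calls)
```

Letting entries expire means a rotated or revoked key is eventually noticed; how long a TTL is acceptable is a design decision for each deployment.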


### Utilizing a signed transaction

Let's say a client wishes to make a request to a service.  They may simply bundle this request as a JWT and send it to the server, signed using their own private key.  For example:

In [13]:
client_private_key = open('client.key').read()
payload = {"query": "Meaning of life, the universe, and everything?", 
           "iss": "client",
           "iat": datetime.utcnow()}

payload_encoded = jwt.encode(payload, client_private_key, algorithm="RS256")
print(payload_encoded)

b'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJxdWVyeSI6Ik1lYW5pbmcgb2YgbGlmZSwgdGhlIHVuaXZlcnNlLCBhbmQgZXZlcnl0aGluZz8iLCJpc3MiOiJjbGllbnQiLCJpYXQiOjE2MTg4NzU4NDR9.agruN1hehMzJmjh6rOoxXJ2U6IpvZNofTLWXVLC-O5Le2u7-ezig6HolfHia28-cEsDb1WPAtSK2t9d2RxizgOMKXbUq4-BmO_V2Q33MPGH9PtKYRPC5Ub_qEQmCGppYSQdM6arLq3STBnf-G9oOu7s1OIg4RMj1QbpMOrw32iam728cnVpLv46pxidAWeFBtUsRu4mRpGI3INnLWzVMAiIm6vQ9FjEIY0fp8YOw8fYSThhXipQFzEqpLxINKEJVJa0KAw6mPKp7apPL9eHs3E9y-BsVc7UKOi-FEaZiFyaBowXQbqt2YxVhvlvgcSdywSCMkXQhw1SWopMvGl45meGNrxxdI-5Qc49xMoEutSEIMjn1fmgfGheQ-MIfkAV5LFxNS_cEAiH04NLZmJOOGbXkKzMHtLdE9zxFKzZn1rFfIpN2oij6EYbgeWfG8pOSCj_SGtkHmLLRxP59-lpCPN4IfuxskY2qrHvC8P2FfWRAvTI8305j4l8CSn_qQpENhjElMEi0fNMPxaPcuvQPq6z5bafpyTOOenS5FqmFdqPOt8RsfMSeEB7pbsdR68S-cU_kl8dcNGjWnzNCIk1CaA67GapU89wfpHmtxtne_UPp2NXV8QNOIS-eV_k_tkIV73r_D0twVqz5FDZ0RFVswQaJ4SWja1UAg3jn5IkKHF8'


So now the client has prepared and signed a request payload.  This payload uses the standard fields `iss` for "issuer" and `iat` for "time of issue" which the service may wish to utilize.  There is also a "query" which describes the kind of data being requested; the "iss" here acts as an "identity" to indicate how to look up the relevant public key.  In production, a *public key fingerprint* is probably a better choice than an informal name like "client".
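One possible way to derive such a fingerprint is to hash the key bytes inside the PEM file, as sketched here.  The function name and the `SHA256:` output format are choices made for this example, not something `PyJWT` prescribes, and the result will not match the fingerprint printed by `ssh-keygen`, which hashes a different encoding of the key.  The PEM below wraps placeholder bytes rather than a real key.

```python
import base64
import hashlib

def public_key_fingerprint(pem_text):
    """SHA-256 over the decoded body of a PEM block, base64-encoded."""
    lines = (line.strip() for line in pem_text.splitlines())
    # Drop the BEGIN/END armor lines and join the base64 body
    body = "".join(line for line in lines
                   if line and not line.startswith("-----"))
    der = base64.b64decode(body)
    digest = hashlib.sha256(der).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")

# Placeholder bytes standing in for a real DER-encoded public key
fake_der = b"placeholder bytes standing in for a real DER-encoded key"
pem = ("-----BEGIN PUBLIC KEY-----\n"
       + base64.b64encode(fake_der).decode("ascii") + "\n"
       + "-----END PUBLIC KEY-----\n")

print(public_key_fingerprint(pem))
```

A client could then put this string in `iss`, and the key server could index its keys by it, so that the identity is tied to the key itself rather than to a name anyone can claim.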

Let's send this payload to the server (we should use HTTPS but for simplification that is omitted here):

In [14]:
import json
headers = {'Content-Type': 'application/jwt'}
url = "http://localhost:5005"
resp = requests.post(url, headers=headers, data=payload_encoded)
json.loads(resp.text)

{'Meaning of life, the universe, and everything?': 42}

Let's see the failure mode by creating a payload that has an `iss` mismatched to the private/public keys.  There exists a key for `server2` on the key server, but it is different from `client`.

In [15]:
payload2 = {"query": "Meaning of life, the universe, and everything?", 
           "iss": "server2",
           "iat": datetime.utcnow()}

bad_payload_encoded = jwt.encode(payload2, client_private_key, algorithm="RS256")
resp = requests.post(url, headers=headers, data=bad_payload_encoded)
print(resp.status_code, resp.text)

403 Signature verification failed


In [16]:
bad_payload_encoded

b'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJxdWVyeSI6Ik1lYW5pbmcgb2YgbGlmZSwgdGhlIHVuaXZlcnNlLCBhbmQgZXZlcnl0aGluZz8iLCJpc3MiOiJzZXJ2ZXIyIiwiaWF0IjoxNjE4ODc2MDQ2fQ.jtibk_4TkcYaocEFFYeSGjFZdqi3cgLj0qipyiA4vGqObLnvAOigvEaqwLxsqcjRYYScSXh37Z4oxzjrlBz2Iyfk81YDwbOnwjSj44xKzXfBra4TDm-9zjsCDJaW3iYQrb3z8rEz_04REHJykfxoa7M_oF30ODtM-pSNLMKwETSQU9i0vDGSShPyq-O5U-P7S3kblh5g5rDnQmuLw4Dw2hRQd9tkFwCrScy2nWyRsnFMo8ABhRj8sq-2qgtYpLSTP7_a5Vf9mleDJMw5Kud4AupBiWCkv_Elmjz9v6OHNWAvUgZupmVy9j35aHA5tvg4TUJRise9GMlBvBpCAMUwG86WDENkLXu055_xc0hAHLVOJYeZrrTvVAWxuSVYlYSGPvnWKZ_hm1XazzbYzNcx5Yp2xpfYGRAuwpUfjCxRTN9MoJjtIM255wMggLtmMQDkKsxy-GDHXGPzcsYoqX4BFO49_HolpyEq_jE6fC6N_z_QAKZzooFoLxQTir3Qxz8QKLSkD35DBfKKd2SJ6qnCRBOT5RUV6JdDDm6viKHhG4lVWeOXeHEQtswCRoqrIlSKlWTD6URrqgZTJk0R8T-N0YwSV6KtXN1_jkof_ZLrd3MGbchUgpQ1di51oRQsXAOCQ7t9ULeHlMyPMtgHG8r66WQN4AezFVhx9x3o3Omx_KU'

Let's look at the server that responds to these queries (omitting the same imports and main block we've seen before):

```python
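@app.route('/', methods=['POST'])
def query():
    try:
        # Look at content without verification first, only to learn who
        # claims to have sent it; nothing here is trusted yet.
        # (PyJWT 1.x spelling; 2.x uses options={"verify_signature": False})
        payload = jwt.decode(request.data, verify=False)
        issuer = payload['iss']
    except (jwt.InvalidTokenError, KeyError) as err:
        return make_response(str(err), 403)

    # Find the public key for the requester
    resp = requests.get(keyserver, params={"identity": issuer})
    if resp.status_code != 200:
        abort(401)

    # We have found a public key, verify now
    pubkey = resp.text
    try:
        # The keyword is plural and takes a list: only RS256 is accepted
        verified = jwt.decode(request.data, pubkey, algorithms=["RS256"])
        # Real code will do something with fields
        query = verified['query']
        iat = verified['iat']
        return jsonify({query: 42})
    except Exception as err:
        return make_response(str(err), 403)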
```
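As one example of "doing something with fields", a service might reject requests whose `iat` is too old.  The sketch below is only one possible policy; the function name `is_fresh` and the default values for `max_age` and `leeway` are assumptions, not anything mandated by the RFC or by `PyJWT`.

```python
import time

def is_fresh(iat, max_age=300, leeway=30, now=None):
    """True if `iat` is at most `max_age` seconds old, and no more than
    `leeway` seconds in the future (to tolerate clock skew)."""
    now = time.time() if now is None else now
    return -leeway <= now - iat <= max_age

print(is_fresh(1_000, now=1_100))   # issued 100 seconds ago
print(is_fresh(1_000, now=2_000))   # issued 1000 seconds ago
print(is_fresh(1_000, now=900))     # issued 100 seconds in the future
```

A check like this narrows the window in which an intercepted request can be resent, but does not by itself prevent a replay inside that window.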