Adds initial auth redirect lambda and build process
Why are these changes being introduced:

* This adds the logic to check for JWT tokens and redirect if the token
  is missing, invalid, or expired
* This follows the build process expected by our infrastructure code

Relevant ticket(s):

* https://mitlibraries.atlassian.net/browse/GDT-100
JPrevost committed Mar 20, 2024
1 parent 41262a5 commit ee2ac9d
Showing 8 changed files with 892 additions and 146 deletions.
5 changes: 4 additions & 1 deletion .gitignore
@@ -134,4 +134,7 @@ dmypy.json
.DS_Store

# VSCode
.vscode/
.vscode/

.package/
*.zip
20 changes: 12 additions & 8 deletions Makefile
@@ -9,7 +9,7 @@ help: # preview Makefile commands

install: # install Python dependencies
pipenv install --dev
pipenv run pre-commit install
# pipenv run pre-commit install

update: install # update Python dependencies
pipenv clean
@@ -20,20 +20,18 @@ update: install # update Python dependencies
test: # run tests and print a coverage report
pipenv run coverage run --source=lambdas -m pytest -vv
pipenv run coverage report -m
pipenv run coverage html

coveralls: test # write coverage data to an LCOV report
pipenv run coverage lcov -o ./coverage/lcov.info

## ---- Code quality and safety commands ---- ##

lint: black mypy ruff safety # run linters
lint: black ruff safety # run linters

black: # run 'black' linter and print a preview of suggested changes
pipenv run black --check --diff .

mypy: # run 'mypy' linter
pipenv run mypy .

ruff: # run 'ruff' linter and print a preview of errors
pipenv run ruff check .

@@ -57,10 +55,16 @@ ruff-apply: # resolve 'fixable errors' with 'ruff'
# developer to match the needs of the application. This is just
# the default zip method for a very simple function.
create-zip: # Create a .zip file of code
rm -rf cf-lambda-geo-auth.py.zip
zip -j cf-lambda-geo-auth.py.zip lambdas/*
# https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-create-dependencies
rm -rf ./.package
mkdir ./.package
rm -f cf-lambda-geo-auth.zip
pip install --target ./.package pyjwt
# change directory and run the zip command on the same line, or make will execute it in the initial directory
cd ./.package; zip -r ../cf-lambda-geo-auth.zip .
cd ./lambdas; zip ../cf-lambda-geo-auth.zip *.py

upload-zip: # Upload the .zip file to AWS S3 bucket
aws s3api put-object --bucket shared-files-$$(aws sts get-caller-identity --query Account --output text) --body cf-lambda-geo-auth.py.zip --key files/cf-lambda-geo-auth.py.zip
aws s3api put-object --bucket shared-files-$$(aws sts get-caller-identity --query Account --output text) --body cf-lambda-geo-auth.zip --key files/cf-lambda-geo-auth.zip

## End of Terraform Generated Makefile Additions ##
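
Since the artifact now has to contain both the vendored `pyjwt` package and the lambda source at the zip root, a small stdlib-only sanity check of the built zip can catch packaging mistakes early. This helper is hypothetical (not part of this commit), but the expected entry names match the `create-zip` target above:

```python
import zipfile

def zip_has(path, prefixes):
    # True if every expected prefix matches at least one entry in the zip
    with zipfile.ZipFile(path) as zf:
        names = zf.namelist()
        return all(any(n.startswith(p) for n in names) for p in prefixes)

# e.g. after `make create-zip`:
# zip_has("cf-lambda-geo-auth.zip", ["jwt/", "lambda_edge.py"])
```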
4 changes: 3 additions & 1 deletion Pipfile
@@ -5,14 +5,16 @@ name = "pypi"

[packages]
sentry-sdk = "*"
boto3 = "*"
pyjwt = "*"

[dev-packages]
black = "*"
coveralls = "*"
mypy = "*"
pre-commit = "*"
pytest = "*"
ruff = "*"
moto = {extras = ["ssm"], version = "*"}

[requires]
python_version = "3.11"
513 changes: 395 additions & 118 deletions Pipfile.lock

Large diffs are not rendered by default.

16 changes: 16 additions & 0 deletions README.md
@@ -26,6 +26,14 @@ If, for any reason, the name of the base file here changes, the Terraform code m
- Dependabot security updates
- Secret scanning

## Context

This Lambda@Edge function runs during the `Viewer request` phase of a [CloudFront request](https://docs.aws.amazon.com/lambda/latest/dg/lambda-edge.html). This matters because we don't want cached responses served to unauthenticated users, which could happen if we instead ran in the `Origin request` phase once one authenticated user had requested the object.

This function checks for authenticated users by looking for a valid JWT token in a domain cookie. That cookie gets set by a separate application which is responsible for Touchstone authentication and JWT creation. The [user flow](https://github.com/MITLibraries/cdn-auth-geo?tab=readme-ov-file#how-does-this-application-integrate-with-others) is documented in that repository.

It is important to note that because we check domain cookies, the CDN Auth Geo application and the CloudFront distribution must share the same domain for the tier you are testing: `*.mitlibrary.net` for the staging and dev1 tiers, and `*.libraries.mit.edu` for prod.
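
As a concrete illustration of that shared-domain requirement, this is roughly how the auth application scopes the cookie (a stdlib sketch; the attribute values are illustrative, and only the `mitlcdnauthjwt` name comes from this commit):

```python
from http.cookies import SimpleCookie

# The JWT is set as a *domain* cookie so every host under the same domain,
# including the CloudFront distribution, receives it on requests.
cookie = SimpleCookie()
cookie["mitlcdnauthjwt"] = "abc.def.ghi"  # placeholder token value
cookie["mitlcdnauthjwt"]["domain"] = ".mitlibrary.net"  # staging/dev1 tiers
cookie["mitlcdnauthjwt"]["secure"] = True

print(cookie["mitlcdnauthjwt"].OutputString())
# e.g. mitlcdnauthjwt=abc.def.ghi; Domain=.mitlibrary.net; Secure
```

If the distribution lived on a different domain, the browser would never send this cookie to it, and every request would look unauthenticated.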

## Development

- To preview a list of available Makefile commands: `make help`
@@ -35,3 +43,11 @@ If, for any reason, the name of the base file here changes, the Terraform code m
- To lint the repo: `make lint`

## Running locally

Because of this lambda's tight coupling with CloudFront events, it is generally easier to run it in Dev1: make
changes directly in the Lambda editor, deploy, and update the CloudFront distro to see them. Yes, that is
unfortunate.

It should be possible to run moto in server mode to mock a CloudFront distribution to allow for true local development
of this function. If you figure that out, please update these docs!
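
In the meantime, the pure-Python pieces can be exercised locally without any AWS access. For example, a standalone sketch of the redirect the handler builds for unauthenticated viewers (the hostnames are made up; only the `cdn_resource` parameter name and response shape come from the code):

```python
import urllib.parse

def build_redirect(host, uri, querystring, auth_url):
    # Mirrors the 302 response assembled in lambdas/lambda_edge.py
    redirect_url = f"https://{host}{uri}?{querystring}"
    encoded = urllib.parse.quote_plus(redirect_url)
    return {
        "status": "302",
        "statusDescription": "Found",
        "headers": {
            "location": [
                {"key": "Location", "value": f"{auth_url}?cdn_resource={encoded}"}
            ]
        },
    }

resp = build_redirect(
    "cdn.example.net", "/geo/file.tif", "v=1", "https://auth.example.net/login"
)
print(resp["headers"]["location"][0]["value"])
```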

109 changes: 103 additions & 6 deletions lambdas/lambda_edge.py
@@ -1,10 +1,107 @@
import json
import logging
import urllib.parse

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
import boto3
import jwt


def handler(event: dict) -> str:
    logger.debug(json.dumps(event))
    return "You have successfully called this lambda!"
def parse_cookies(headers):
    parsed_cookie = {}
    if headers.get("cookie"):
        for cookie in headers["cookie"][0]["value"].split(";"):
            if cookie:
                parts = cookie.split("=")
                parsed_cookie[parts[0].strip()] = parts[1].strip()
    return parsed_cookie.get("mitlcdnauthjwt", "")


def validate_jwt(token, jwt_secret):
    # a missing token gets the same treatment as an invalid one
    if token == "":
        return "invalid"

    decoded = decode_jwt(token, jwt_secret)
    if decoded == "invalid":
        return "invalid"

    if decoded == "expired":
        return "expired"

    # valid JWT: user is legit for general access. This is where
    # apps needing specific user auth and not general user auth would
    # need to check that the user returned is authorized, not just
    # authenticated. For our initial purposes, authenticated is all we
    # will be checking for
    return "valid"


def decode_jwt(usertoken, jwt_secret):
    try:
        jwt.decode(usertoken, jwt_secret, algorithms=["HS256"])
    except jwt.ExpiredSignatureError:
        # Signature has expired
        logging.exception("JWT token error: jwt.ExpiredSignatureError")
        return "expired"
    except jwt.InvalidSignatureError:
        # Signature is invalid
        logging.exception("JWT token error: jwt.InvalidSignatureError")
        return "invalid"
    except jwt.DecodeError:
        # Bogus JWT token
        logging.exception("JWT token error: jwt.DecodeError")
        return "invalid"
    return "valid"


def handler(event, _context):
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    # Check for a JWT in the request cookies during the viewer-request
    # event; if the token is missing, invalid, or expired, redirect the
    # user to the sign-in page with the original request sent as
    # redirect_url in the query params.

    client = boto3.client("ssm", region_name="us-east-1")

    ssm_params = client.get_parameters_by_path(
        Path="/apps/cf-lambda-geo-auth/", WithDecryption=False
    )

    logging.info(ssm_params)

    for param in ssm_params["Parameters"]:
        if param["Name"] == "/apps/cf-lambda-geo-auth/jwt-secret":
            jwt_secret = param["Value"]
        if param["Name"] == "/apps/cf-lambda-geo-auth/auth-url":
            auth_url = param["Value"]

    # Check for the JWT in the cookie; if present and valid, proceed with
    # the request (local name chosen to avoid shadowing the jwt module)
    token = parse_cookies(headers)

    parsed_jwt = validate_jwt(token, jwt_secret)

    if parsed_jwt == "valid":
        # holy heck a valid user
        logging.info("valid user!")
        return request

    logging.info("need auth")
    # URI encode the original request to be sent as redirect_url in query params
    redirect_url = "https://{}{}?{}".format(
        headers["host"][0]["value"], request["uri"], request["querystring"]
    )
    encoded_redirect_url = urllib.parse.quote_plus(redirect_url.encode("utf-8"))

    return {
        "status": "302",
        "statusDescription": "Found",
        "headers": {
            "location": [
                {
                    "key": "Location",
                    "value": f"{auth_url}?cdn_resource={encoded_redirect_url}",
                }
            ]
        },
    }
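
To sanity-check the cookie parsing above in isolation (no AWS calls involved), a standalone copy of `parse_cookies` can be run against a hand-built slice of a viewer-request event:

```python
def parse_cookies(headers):
    # Copy of the function above: pull the mitlcdnauthjwt cookie value,
    # or return "" when the header is absent or the cookie is unset
    parsed_cookie = {}
    if headers.get("cookie"):
        for cookie in headers["cookie"][0]["value"].split(";"):
            if cookie:
                parts = cookie.split("=")
                parsed_cookie[parts[0].strip()] = parts[1].strip()
    return parsed_cookie.get("mitlcdnauthjwt", "")

# Minimal slice of a CloudFront viewer-request event's headers
headers = {
    "cookie": [{"key": "Cookie", "value": "foo=bar; mitlcdnauthjwt=abc.def.ghi"}]
}
print(parse_cookies(headers))  # abc.def.ghi
print(parse_cookies({}))       # empty string: no cookie header at all
```

One caveat worth knowing: the `split("=")` would truncate a value containing `=`; JWTs are unpadded base64url segments, so in practice this works.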
18 changes: 8 additions & 10 deletions pyproject.toml
@@ -5,19 +5,14 @@
[tool.black]
line-length = 90

[tool.mypy]
disallow_untyped_calls = true
disallow_untyped_defs = true
exclude = ["tests/"]

[tool.pytest.ini_options]
log_level = "INFO"

[tool.ruff]
target-version = "py311"
select = ["ALL", "PT"]
lint.select = ["ALL", "PT"]

ignore = [
lint.ignore = [
# default
"ANN101",
"ANN102",
@@ -27,6 +22,8 @@ ignore = [
"PTH",

# project-specific
"ANN001",
"ANN201",
"C90",
"D100",
"D101",
@@ -37,11 +34,12 @@
"PLR0913",
"PLR0915",
"S320",
"S321",
"S321",
"UP017"
]

# allow autofix behavior for specified rules
fixable = ["E", "F", "I", "Q"]
lint.fixable = ["E", "F", "I", "Q"]

# set max line length
line-length = 90
@@ -66,4 +64,4 @@ fixture-parentheses = false
max-doc-length = 90

[tool.ruff.pydocstyle]
convention = "google"
convention = "google"
