# Running models on demand using DL & AWS Lambda

In this notebook we explore what using the Descartes Labs (DL) platform in a production pipeline or application might look like. To do this, we will deploy a pre-trained ML model on AWS Lambda that uses Sentinel-2 data accessed through the DL platform. The hypothetical "user" of our Lambda function submits a geometry and a unique ID in a request and receives the model output back.

## Runtime code

To start, we need to build our "application". In `../app` you will find several files; the relevant ones for this notebook are `dl_aws_lambda.py`, `model.py`, and `utils.py`. These files contain code for accessing imagery from the DL platform, pulling a model from an AWS S3 bucket, applying that model, and returning the model output. Let's take a closer look at each of these parts:

The main function in `model.py` is `get_field_class`. This function loads the model, pulls the imagery, and returns the model output. Its inputs are a valid GeoJSON geometry, a unique field identifier, the S3 bucket where the model lives, and the name of the model. The function is shown below:

```python
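import logging
import os
import os.path as osp

import boto3
from joblib import load  # assumed: the model file is a .joblib

# get_ndvi_tseries is defined in utils.py (shown further below)

logger = logging.getLogger(__name__)


def get_field_class(geom, fid, s3_bucket, model_name):
    # /tmp is the writable scratch space in Lambda; use it to cache the model
    os.makedirs("/tmp/models", exist_ok=True)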

    temp_file_path = osp.join("/tmp/models", model_name)

    logger.info(f"Loading classifier: {model_name}")
    if not osp.exists(temp_file_path):
        s3 = boto3.client("s3")
        logger.info(f"Classifier not found locally at {temp_file_path}. Pulling from s3")
        # Download pickled model from S3 and unpickle
        s3.download_file(s3_bucket, model_name, temp_file_path)
        
    clf = load(temp_file_path)

    logger.info("Retrieving timeseries")
    ndvi_ts, ndvi_dates = get_ndvi_tseries(geom)

    
    result = {
        # Cast the NumPy integer label to a plain int so the result is JSON-serializable
        "class": int(clf.predict(ndvi_ts.reshape(1, -1))[0]),
        "fid": fid
    }
    
    return result
```

This function first pulls a model (in our case an sklearn crop type classification model) from an S3 bucket, unless a copy is already cached under `/tmp/models`, then accesses imagery over the provided geometry using the `get_ndvi_tseries` function found in `../app/utils.py`.

```python
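import logging
from datetime import datetime, timedelta

import descarteslabs as dl
import numpy as np
from scipy.interpolate import interp1d


def get_ndvi_tseries(
    geom,
    start_date="2019-04-01",
    end_date="2019-10-01"
):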

    logging.info("Searching for S2 L2A scenes")
    scenes, ctx = dl.scenes.search(
        geom,
        products="esa:sentinel-2:l2a:v1",
        start_datetime=start_date,
        end_datetime=end_date,
        limit=None,
    )
    logging.info(f"Found {len(scenes)} scenes for specified geometry")
    
    logging.info("Pulling raster data from DL Catalog")
    stack = scenes.stack(
        ["red", "nir", "cloud_mask"],
        ctx,
        flatten=lambda x: x.properties.date.strftime("%Y-%m-%d"),
        scaling="physical",
        progress=False,
    )
    
    logging.info("Masking out clouds")
    cmask = np.repeat(
        (stack[:,-1].data==1)[:, np.newaxis],
        stack.shape[1],
        axis=1
    )
    
    stack.mask = (stack.mask) | cmask
    
    logging.info("Computing NDVI")
    # Band order is red, nir, cloud_mask, so NDVI = (nir - red) / (nir + red)
    ndvi = (stack[:, 1] - stack[:, 0]) / (stack[:, 1] + stack[:, 0])

    # Median NDVI over all unmasked pixels for each date
    ndvi_ts = np.ma.median(ndvi, axis=(1, 2))
    
    dates = [
        key for key, scene in scenes.groupby(
            lambda x: x.properties.date.strftime("%Y-%m-%d")
        )
    ]
    
    dates_ts = [
        datetime.strptime(date, "%Y-%m-%d").timestamp() for date in dates
    ]
    
    new_dates = np.arange(
        datetime.strptime(start_date, "%Y-%m-%d"),
        datetime.strptime(end_date, "%Y-%m-%d"),
        timedelta(days=6)
    ).astype(datetime)
    
    new_dates_ts = [t.timestamp() for t in new_dates]
    
    tseries_masked = ndvi_ts.data[~ndvi_ts.mask]
    dates_masked = np.array(dates_ts)[~ndvi_ts.mask]
    
    logging.info(f"Interpolating time series from dates: {dates} to new dates: {new_dates.tolist()}")
    
    f_interp = interp1d(
        dates_masked,
        tseries_masked,
        bounds_error=False,
        copy=False,
        fill_value="extrapolate",
    )
    
    return f_interp(new_dates_ts)[1:], new_dates[1:]
```

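This function searches for Sentinel-2 L2A imagery over the geometry, pulls that imagery, applies a cloud mask, computes a vegetation index (NDVI), takes the median NDVI over the geometry for each date, interpolates the resulting time series onto regular 6-day intervals, and returns the interpolated values together with their dates. The intermediate raster stack has shape (n timesteps, n bands, pixel y, pixel x); the returned NDVI series is one-dimensional, with one value per regular time step.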
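To make the regular time grid concrete, here is a minimal sketch that reproduces it with the standard library only. The helper name `regular_dates` is hypothetical and not part of `utils.py`; it mirrors the `np.arange(start, end, 6 days)[1:]` logic above. With the default dates it suggests the classifier receives 30 NDVI values per geometry.

```python
from datetime import datetime, timedelta


def regular_dates(start_date="2019-04-01", end_date="2019-10-01", step_days=6):
    """Return the regular date grid, mirroring np.arange(start, end, step)[1:]."""
    start = datetime.strptime(start_date, "%Y-%m-%d")
    end = datetime.strptime(end_date, "%Y-%m-%d")
    dates = []
    current = start
    while current < end:  # the end date is exclusive, as with np.arange
        dates.append(current)
        current += timedelta(days=step_days)
    return dates[1:]  # get_ndvi_tseries drops the first grid point


grid = regular_dates()
print(len(grid), grid[0].date(), grid[-1].date())
# prints: 30 2019-04-07 2019-09-28
```

If you change `start_date` or `end_date`, the number of features changes too, so the model would need to be trained on the same grid.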

The code in `dl_aws_lambda.py` simply takes an event of the type we would expect to receive in Lambda, parses that event to get a geometry and field id, then returns the output of the `get_field_class` function.
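The handler described above might look roughly like the sketch below. This is not the contents of `dl_aws_lambda.py`; it is an illustration that assumes the event carries `geometry` and `fid` keys and that the bucket and model name come from the `MODEL_S3_BUCKET` and `MODEL_NAME` environment variables. `get_field_class` is replaced by a stand-in so the example runs on its own.

```python
import json
import os


def get_field_class(geom, fid, s3_bucket, model_name):
    """Stand-in for model.get_field_class so this sketch is self-contained."""
    return {"class": 0, "fid": fid}


def run_model(event, context=None):
    """Hypothetical handler: parse the event, run the model, return the result."""
    # Accept the payload either as a dict or as a JSON string.
    if isinstance(event, str):
        event = json.loads(event)

    missing = [key for key in ("geometry", "fid") if key not in event]
    if missing:
        raise ValueError(f"Request is missing required keys: {missing}")

    return get_field_class(
        event["geometry"],
        event["fid"],
        os.environ.get("MODEL_S3_BUCKET"),
        os.environ.get("MODEL_NAME"),
    )
```

Validating the event up front gives the caller a clear error message instead of a `KeyError` from deep inside the model code.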

We'll need to make sure that we have a model to run! To do this, we will put the model found in `../models/classifier.joblib` into an S3 bucket. For more info on using S3, please consult the [AWS docs here](https://docs.aws.amazon.com/AmazonS3/latest/userguide/setting-up-s3.html). We can point to this bucket by setting environment variables in our Docker image or by specifying them in the configuration of our Lambda function.

<img src="../images/s3_model_bucket.png" align="center"/>
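If you prefer to upload the model from Python rather than the console, a minimal sketch using `boto3` could look like this. The helper name `upload_model` and the bucket name in the usage comment are placeholders. Because `get_field_class` downloads the object using the model name as the key, the key used here should match the `MODEL_NAME` you configure later.

```python
def upload_model(local_path, bucket, key, s3_client=None):
    """Upload a local model file to s3://bucket/key and return that URI."""
    if s3_client is None:
        import boto3  # imported lazily; requires AWS credentials to be configured

        s3_client = boto3.client("s3")
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


# Example usage (placeholder bucket name, needs an existing bucket):
# upload_model("../models/classifier.joblib", "my-model-bucket", "classifier.joblib")
```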

## Building a Docker image

Now that we have code to run, we need to build a Docker image for Lambda that has the Descartes Labs client installed. We have provided a Dockerfile that will do just that. The Dockerfile can be broken down into a few parts:

```dockerfile
FROM public.ecr.aws/lambda/python:3.9

COPY app ${LAMBDA_TASK_ROOT}/app

COPY requirements.txt requirements.txt
RUN pip3 install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"
RUN pip3 install -U "descarteslabs>=1.11.0" --target "${LAMBDA_TASK_ROOT}"

ENV DESCARTESLABS_ENV=aws-production

RUN mkdir /tmp/models

# ENV MODEL_NAME=classifier.joblib
# ENV MODEL_S3_BUCKET=dl-aws-onboarding

CMD [ "app.dl_aws_lambda.run_model" ]

```

We use a base Lambda Python 3 image and copy our "application" code into the image. We then install the Python package requirements found in `requirements.txt`, followed by the Descartes Labs client with version >= 1.11.0. We set the `DESCARTESLABS_ENV` variable to use the DL AWS services. Finally, we create a temp directory for our model and specify the code we want executed at runtime (using `CMD`).

We now need to build the Docker image and push it to the AWS Elastic Container Registry (ECR) so it can be used in Lambda. For more information on how to do this, please see the following AWS documentation:
- [Creating an ECR repository](https://docs.aws.amazon.com/AmazonECR/latest/userguide/repository-create.html)
- [Build and push your Docker image](https://docs.aws.amazon.com/lambda/latest/dg/images-create.html)

A brief summary of these steps can be found below:

1) `cd ~/dl-ea-aws-onboarding`
2) `docker build -t dl-aws-onboarding-lambda -f dockerfiles/lambda/Dockerfile .` You can specify a different name for your image by swapping out "dl-aws-onboarding-lambda" for something else.
3) `docker tag dl-aws-onboarding-lambda:latest {your-container-registry}/dl-aws-onboarding-lambda:latest`
4) `docker push {your-container-registry}/dl-aws-onboarding-lambda:latest`

With our Docker image now in the ECR we can create our Lambda function. For more information on the basics of creating a function on Lambda please see the [AWS docs here](https://docs.aws.amazon.com/lambda/latest/dg/getting-started.html).

## Creating our Lambda function

We create a new function and select "Container image".

<img src="../images/create_function_container.png" align="center"/>

Then we can give our function a name and provide the URI pointing to the image we created and pushed to the ECR.

<img src="../images/ecr_uri.png" align="center"/>

You may need to adjust the execution role to [provide access to the S3 bucket you created earlier](https://aws.amazon.com/premiumsupport/knowledge-center/lambda-execution-role-s3-bucket/). 

Finally, create the function!

The last step before we can test our Lambda function is to set some environment variables in the function configuration.

We need to set the following variables:

- MODEL_NAME
- MODEL_S3_BUCKET
- DESCARTESLABS_CLIENT_ID
- DESCARTESLABS_CLIENT_SECRET

Your client ID and secret can be found after authenticating with your DL account at `~/.descarteslabs/token_info.json`. These are required to access data using the DL platform.

<img src="../images/env_vars_lambda.png" align="center"/>
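The same variables can also be set programmatically. The sketch below uses the `boto3` Lambda client's `update_function_configuration` call; the helper name `set_lambda_env` and the function name in the usage comment are placeholders. Note that this call sets the full variable map, so include every variable the function needs.

```python
def set_lambda_env(function_name, variables, lambda_client=None):
    """Set the environment variables of a Lambda function."""
    if lambda_client is None:
        import boto3  # imported lazily; requires AWS credentials to be configured

        lambda_client = boto3.client("lambda")
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": dict(variables)},
    )


# Example usage (all values are placeholders):
# set_lambda_env(
#     "my-dl-lambda",
#     {
#         "MODEL_NAME": "classifier.joblib",
#         "MODEL_S3_BUCKET": "my-model-bucket",
#         "DESCARTESLABS_CLIENT_ID": "<your client id>",
#         "DESCARTESLABS_CLIENT_SECRET": "<your client secret>",
#     },
# )
```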

## Testing the Lambda function

We can now test our Lambda function! To do this, navigate to "Test" and construct an example request. The request must provide a valid GeoJSON geometry and a unique identifier. You can use the request below as a reference:

```json
{
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [
          -91.55319213867188,
          35.805249625952506
        ],
        [
          -91.54885768890381,
          35.805249625952506
        ],
        [
          -91.54885768890381,
          35.80895624882348
        ],
        [
          -91.55319213867188,
          35.80895624882348
        ],
        [
          -91.55319213867188,
          35.805249625952506
        ]
      ]
    ]
  },
  "fid": "test-field"
}
```

We can then click "Test" and have the Lambda function process the request. You should see a JSON response that looks something like this:

```json
{
    "class": 0,
    "fid": "test-field"
}
```
Class will be either 0 for not-corn or 1 for corn.
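Outside the console, the function can be invoked from Python as well. Here is a minimal sketch using the `boto3` Lambda client's `invoke` call; the helper name `classify_field` and the function name in the usage comment are placeholders, and the request shape follows the test event above.

```python
import json


def classify_field(function_name, geometry, fid, lambda_client=None):
    """Invoke the Lambda function synchronously and return the parsed response."""
    if lambda_client is None:
        import boto3  # imported lazily; requires AWS credentials to be configured

        lambda_client = boto3.client("lambda")
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the result
        Payload=json.dumps({"geometry": geometry, "fid": fid}).encode("utf-8"),
    )
    # The response payload is a stream containing the JSON returned by the handler.
    return json.loads(response["Payload"].read())


# Example usage (placeholder function name; geometry is a GeoJSON dict):
# result = classify_field("my-dl-lambda", geometry, "test-field")
# print(result["class"], result["fid"])
```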

You may also need to adjust the available memory for your function under `General configuration` depending on the size of your test request.

From here you can now explore a variety of ways to trigger your Lambda function. [You can add triggers](https://docs.aws.amazon.com/lambda/latest/dg/lambda-invocation.html) and [create destinations](https://aws.amazon.com/blogs/compute/introducing-aws-lambda-destinations/) for the output of your Lambda function.