# Integrating AWS API Gateway with AWS Kinesis

## Introduction

In this lesson you will learn to create and configure a REST API with an Amazon Kinesis proxy integration. We will build this integration on top of the previously created REST API with the Kafka REST proxy integration (see the AWS API Gateway lesson).


## Create an IAM role for API access to Kinesis
    
To allow the API to invoke Kinesis actions, you must have appropriate IAM policies attached to an IAM role. To enable full access to Kinesis, you can create an IAM role in the IAM console with the **AmazonKinesisFullAccess** managed policy attached. This will enable both read and write actions in Kinesis.
    
Make sure API Gateway is a trusted entity of the role and is allowed to assume it through the **sts:AssumeRole** action. The trust relationships of this role should look like the one below:

<p align="center">
    <img src="images/Kinesis Role.png" width="800" height="300"/>
</p>
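
As a rough textual companion to the screenshot, a trust policy of this kind usually has the shape built below. This is only a sketch: it assumes the standard API Gateway service principal `apigateway.amazonaws.com`, and the names `TRUST_POLICY` and `trust_policy_json` are illustrative, not part of the lesson's setup.

```python
import json

# Assumed trust policy: lets the API Gateway service assume this role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "apigateway.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def trust_policy_json() -> str:
    """Serialise the policy so it can be compared with the role's trust relationships."""
    return json.dumps(TRUST_POLICY, indent=4)


print(trust_policy_json())
```

Comparing this output with the **Trust relationships** tab of your role is a quick way to spot a missing principal or a misspelled action.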

## List streams in Kinesis

To begin building our integration, navigate to the **Resources** tab of the previously created API. Use the **Create resource** button to start provisioning a new resource:

<p align="center">
    <img src="images/Create Resources 2.png" width="800" height="450"/>
</p>

Under **Resource Name**, type **streams**. Leave the rest as default and then click the **Create resource** button.

Choose the created **streams** resource, and then select the **Create method** button. Select `GET` as the method type.

In the **Create method** page you will need to define the following:

- For **Integration type** select **AWS Service**
- For **AWS Region** choose *us-east-1*
- For **AWS Service** select **Kinesis**
- For **HTTP method** select `POST` (as we will have to invoke Kinesis's `ListStreams` action)
- For **Action Type** select **Use action name**
- For **Action name** type `ListStreams`
- For **Execution role** you should copy the ARN of your Kinesis Access Role (created in the previous section)

<p align="center">
    <img src="images/List Stream Setup 2.png" width="600" height="450"/>
</p>

Finally, click **Create method** to finalize provisioning this method.

This will redirect you to the **Method Execution** page. From here, select the **Integration request** panel and click on the **Edit** button at the bottom of the page. This should redirect you to the following page:

<p align="center">
    <img src="images/Edit Integration Request.png" width="650" height="450"/>
</p>

Expand the **URL request headers parameters** panel and select the following options:

- Click the **Add request header parameter** button
- Under **Name** type **Content-Type**
- Under **Mapped from** type **'application/x-amz-json-1.1'**

<p align="center">
    <img src="images/Add Header Parameter.png" width="800" height="200"/>
</p>
   
Expand the **Mapping Templates** panel and select the following options:

- Click on the **Add mapping template** button
- Under **Content-Type** type **application/json**
- Under **Template body** type `{}` in the template editor

<p align="center">
    <img src="images/List Streams Mapping Templates 2.png" width="700" height="400"/>
</p>
    
Click on the **Save** button to save the changes.
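
Once the API is deployed, the `streams` resource can be called with a plain `GET`; the client sends no body, because the `{}` mapping template and the `Content-Type` header are applied by API Gateway on the integration side. The sketch below only builds the request with Python's standard library. `INVOKE_URL`, `STAGE` and `build_list_streams_request` are hypothetical names and placeholders, not values from this lesson.

```python
import urllib.request

# Hypothetical placeholders: replace with your own invoke URL and deployment stage.
INVOKE_URL = "https://YourAPIInvokeURL"
STAGE = "YourDeploymentStage"


def build_list_streams_request(invoke_url: str, stage: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for the /streams resource."""
    url = f"{invoke_url.rstrip('/')}/{stage}/streams"
    return urllib.request.Request(url, method="GET")


request = build_list_streams_request(INVOKE_URL, STAGE)
print(request.get_method(), request.full_url)

# To send it once the API is deployed:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```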


## Create, describe and delete streams in Kinesis

Under the `streams` resource, create a new child resource with the **Resource name** `{stream-name}`. After creating it, your **Resources** should look like this:

<p align="center">
    <img src="images/Strea Name 2.png" width="350" height="550"/>
</p>

Create the following three **Methods** for the `{stream-name}` resource: `POST`, `GET` and `DELETE`.

### Setting up the `GET` method

1. In the **Create method** page you will need to define the following:

- For **Integration type** select **AWS Service**
- For **AWS Region** choose us-east-1
- For **AWS Service** select **Kinesis**
- For **HTTP method** select `POST` 
- For **Action Type** select **Use action name**
- For **Action name** type `DescribeStream`
- For **Execution role** you should use the same ARN as in the previous step

Finally, click **Create method**. This will redirect you to the **Method Execution** page.

2. From here select the **Integration Request** panel, and then **Edit**. Expand the **URL request headers parameters** panel and select the following options:

- Click on the **Add request header parameter** button
- Under **Name** type **Content-Type**
- Under **Mapped from** type **'application/x-amz-json-1.1'**

3. Expand the **Mapping Templates** panel and select the following options:

- Click on the **Add mapping template** button
- Under **Content-Type** type **application/json**
- In the **Template body** include the following:

{
    "StreamName": "$input.params('stream-name')"
}

This template constructs the request payload for the Kinesis `DescribeStream` action. It expects the client to provide the stream name through the `{stream-name}` path parameter of the API request, and it uses this parameter to populate the `StreamName` field in the output payload.

Here's a step-by-step explanation:

- The `$input` variable gives the template access to the method request, that is, its payload and its parameters

- The `params('stream-name')` function retrieves the value of the `stream-name` parameter from the API request. API Gateway looks the name up among the path, query string and header parameters; here it resolves to the `{stream-name}` path parameter of the resource.

- The retrieved value of the `stream-name` parameter is then used to populate the `StreamName` field in the output payload

Let's see an example of how this mapping template would work with an API request. The API request would look like this `GET /streams/my-kinesis-stream`, and the mapping template output would look like this:

{
    "StreamName": "my-kinesis-stream"
}

Finally, choose the **Save** button to save these changes.
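
To make the substitution concrete, here is a small Python imitation of what the template does. It is not how API Gateway evaluates templates internally, just a sketch of the same transformation; `describe_stream_payload` and the `path_params` dictionary are hypothetical names.

```python
import json


def describe_stream_payload(path_params: dict) -> str:
    """Mimic the template: copy the `stream-name` parameter into `StreamName`."""
    return json.dumps({"StreamName": path_params.get("stream-name", "")})


print(describe_stream_payload({"stream-name": "my-kinesis-stream"}))
```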


### Setting up the `POST` method

Follow step 1 from the **Setting up the GET method** section, but for **Action name** type `CreateStream`. To set up the **URL request headers parameters** panel, follow step 2.

To set up the **Mapping Templates** panel, follow step 3, but use the following mapping template as the template body instead:

{
    "ShardCount": #if($input.path('$.ShardCount') == '') 5 #else $input.path('$.ShardCount') #end,
    "StreamName": "$input.params('stream-name')"
}

Let's break down this more complex mapping template:

- `"ShardCount":`: This is the key for the field that will hold the value of the shard count in the output payload

- `#if($input.path('$.ShardCount') == '')`: This is a conditional statement that checks whether the `"ShardCount"` field is empty in the input payload

- `5`: This is the default value to be used for `"ShardCount"` in case the input payload doesn't have a value for it. In this example, if the `"ShardCount"` field is empty, it will be set to 5.

- `#else $input.path('$.ShardCount')`: If the `"ShardCount"` field is not empty in the input payload, this part of the conditional statement will be executed. It retrieves the value of `"ShardCount"` from the input payload using the `$input.path()` function.

- `"StreamName": "$input.params('stream-name')"`: This is similar to what we discussed in the previous example. It sets the `"StreamName"` field in the output payload by retrieving the value of the `"stream-name"` parameter from the API request.

To summarize, this mapping template does the following:

- If the input payload contains a non-empty `"ShardCount"` field, it sets the `"ShardCount"` field in the output payload to the same value

- If the input payload does not contain a `"ShardCount"` field or if it is empty, it sets the `"ShardCount"` field in the output payload to a default value of 5

- It sets the `"StreamName"` field in the output payload based on the value of the `"stream-name"` parameter provided in the API request

Here's an example of how this template would work for an input payload:

Input Payload:
{
    "ShardCount": 10
}

And the output payload would look like this:

{
    "ShardCount": 10,
    "StreamName": "my-kinesis-stream"
}

Or in the case we have an input payload that does not contain `ShardCount`:

Input Payload:
{
    "SomeOtherField": "some value"
}

The output payload will look like this:

{
    "ShardCount": 5,
    "StreamName": "my-kinesis-stream"
}
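
The default-value logic of this template can be imitated in a few lines of Python. This is a sketch of the same decision rule, not API Gateway's own evaluation; `create_stream_payload` and `DEFAULT_SHARD_COUNT` are hypothetical names.

```python
DEFAULT_SHARD_COUNT = 5  # same default as in the mapping template


def create_stream_payload(body: dict, path_params: dict) -> dict:
    """Mimic the template: use ShardCount from the body, or fall back to the default."""
    shard_count = body.get("ShardCount")
    if shard_count in (None, ""):
        shard_count = DEFAULT_SHARD_COUNT
    return {"ShardCount": shard_count, "StreamName": path_params["stream-name"]}


params = {"stream-name": "my-kinesis-stream"}
print(create_stream_payload({"ShardCount": 10}, params))
print(create_stream_payload({"SomeOtherField": "some value"}, params))
```

The two calls correspond to the two input payloads discussed above: the first keeps the requested shard count, the second falls back to the default.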

### Setting up the `DELETE` method

Follow step 1 from the **Setting up the GET method** section, but for **Action name** type `DeleteStream`. To set up the **URL request headers parameters** panel, follow step 2.

To set up the **Mapping Templates** panel, follow step 3, but use the following mapping template as the template body instead:

{
    "StreamName": "$input.params('stream-name')"
}
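
With the three methods in place, a client addresses a stream through the `/streams/{stream-name}` path and picks the operation with the HTTP verb. The sketch below builds (without sending) such requests using only the standard library. `build_stream_request` and the base URL are hypothetical; the base URL is assumed to already include the deployment stage.

```python
import json
import urllib.request
from typing import Optional


def build_stream_request(base_url: str, stream_name: str, method: str,
                         shard_count: Optional[int] = None) -> urllib.request.Request:
    """Build a request for POST (create), GET (describe) or DELETE (delete)."""
    url = f"{base_url.rstrip('/')}/streams/{stream_name}"
    body = None
    if method == "POST":
        # An empty JSON object lets the template fall back to its default ShardCount.
        fields = {} if shard_count is None else {"ShardCount": shard_count}
        body = json.dumps(fields).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method=method
    )


base = "https://YourAPIInvokeURL/YourDeploymentStage"  # placeholder
for verb in ("POST", "GET", "DELETE"):
    req = build_stream_request(base, "my-kinesis-stream", verb, shard_count=2)
    print(req.get_method(), req.full_url, req.data)
```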

## Add records to streams in Kinesis

Under the `{stream-name}` resource, create two new child resources with the **Resource name** `record` and `records`. For both resources, create a `PUT` method.

### Setting up the **record** `PUT` method

Follow step 1 from the **Setting up the GET method** section, but for **Action name** type `PutRecord`. To set up the **URL request headers parameters** panel, follow step 2.

To set up the **Mapping Templates** panel, follow step 3, but use the following mapping template as the template body instead:

{
    "StreamName": "$input.params('stream-name')",
    "Data": "$util.base64Encode($input.json('$.Data'))",
    "PartitionKey": "$input.path('$.PartitionKey')"
}

This mapping template is used to transform an API request payload into the format required for writing a single record to an AWS Kinesis stream. Let's go through it step by step:

- `"StreamName": "$input.params('stream-name')"`: This sets the `"StreamName"` field in the output payload to the value of the `"stream-name"` parameter provided in the API request. This part is similar to previous examples.

- `"Data": "$util.base64Encode($input.json('$.Data'))"`: This line sets the `"Data"` field in the output payload. The `$input.json()` function returns the value of the `"Data"` field from the input payload as JSON text, so a string value keeps its surrounding double quotes and an object is passed as its JSON representation. The result is then encoded in Base64 format using the `$util.base64Encode()` function, because the Kinesis API expects the data blob to be Base64-encoded in the JSON request.

- `"PartitionKey": "$input.path('$.PartitionKey')"`: This line sets the `"PartitionKey"` field in the output payload. It uses the `$input.path()` function to retrieve the value of the `"PartitionKey"` field from the input payload.

To summarize, this mapping template does the following:

- It sets the `"StreamName"` field in the output payload based on the value of the `"stream-name"` parameter provided in the API request

- It retrieves the value of the `"Data"` field from the input payload and encodes it in Base64 format, which is required for writing to a Kinesis stream

- It retrieves the value of the `"PartitionKey"` field from the input payload

Here's an example of how this template would work for a sample API request:

PUT /streams/my-kinesis-stream/record
Request Body:
{
    "Data": "Hello, Kinesis!",
    "PartitionKey": "partition-1"
}

For the above API request, the mapping template output will look like this:

{
    "StreamName": "my-kinesis-stream",
    "Data": "IkhlbGxvLCBLaW5lc2lzISI=",     // Base64 of the JSON text "Hello, Kinesis!" (double quotes included, as returned by $input.json)
    "PartitionKey": "partition-1"
}
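
The Base64 step can be reproduced with Python's standard library, which is also handy for decoding records later. The exact bytes that get encoded depend on what the template passes to `$util.base64Encode()`, so this sketch shows both the raw text and its JSON text form (with surrounding double quotes). The helper name `b64` is hypothetical.

```python
import base64
import json


def b64(text: str) -> str:
    """Base64-encode a piece of text as UTF-8 and return an ASCII string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


message = "Hello, Kinesis!"

raw_encoded = b64(message)                    # encoding of the bare text
json_text_encoded = b64(json.dumps(message))  # encoding of the quoted JSON text

print(raw_encoded)
print(json_text_encoded)

# Decoding reverses the step and shows exactly which bytes were stored.
print(base64.b64decode(json_text_encoded).decode("utf-8"))
```

Decoding a record read back from the stream in this way is a simple check of which form your API actually wrote.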

### Setting up the **records** `PUT` method

Follow step 1 from the **Setting up the GET method** section, but for **Action name** type `PutRecords`. To set up the **URL request headers parameters** panel, follow step 2.

To set up the **Mapping Templates** panel, follow step 3, but use the following mapping template as the template body instead:

{
    "StreamName": "$input.params('stream-name')",
    "Records": [
       #foreach($elem in $input.path('$.records'))
          {
            "Data": "$util.base64Encode($elem.data)",
            "PartitionKey": "$elem.partition-key"
          }#if($foreach.hasNext),#end
        #end
    ]
}

This mapping template is used to transform an API request payload into the format required for writing records to an AWS Kinesis stream. Let's break it down step by step:

- `"StreamName": "$input.params('stream-name')"`: This sets the `"StreamName"` field in the output payload to the value of the `"stream-name"` parameter provided in the API request. This is similar to what we have seen in previous examples.

- `"Records": [...]`: This is an array key in the output payload that will hold an array of records to be written to the Kinesis stream

- `#foreach($elem in $input.path('$.records'))`: This is a loop that iterates over each element (record) in the `"records"` array from the input payload. The `$input.path()` function retrieves the value of `"records"` array from the input payload.

- `"Data": "$util.base64Encode($elem.data)"`: For each record in the `"records"` array, this line sets the `"Data"` field in the output payload. It uses the `$elem.data` syntax to access the `"data"` field of the current record. It then applies the `$util.base64Encode()` function to encode the data in Base64 format, which is the expected format for the data when writing to a Kinesis stream.

- `"PartitionKey": "$elem.partition-key"`: This line sets the `"PartitionKey"` field in the output payload for each record. It uses the `$elem.partition-key` syntax to access the `"partition-key"` field of the current record.

- `#if($foreach.hasNext),#end`: This conditional statement adds a comma (,) after each record except the last one. This is required to ensure that the output JSON is formatted correctly as an array.

- `#end`: This marks the end of the loop

To summarize, this mapping template does the following:

- It sets the `"StreamName"` field in the output payload based on the value of the `"stream-name"` parameter provided in the API request

- It iterates over the `"records"` array in the input payload and constructs an array of records in the output payload, where each record includes the encoded `"Data"` and `"PartitionKey"` fields required for writing to a Kinesis stream

Here's an example of how this template would work for a sample API request:

PUT /streams/my-kinesis-stream/records
Request Body:
{
    "records": [
        {
            "data": "Hello, Kinesis!",
            "partition-key": "partition-1"
        },
        {
            "data": "Another message",
            "partition-key": "partition-2"
        }
    ]
}

For the above API request, the mapping template output will look like this:

{
    "StreamName": "my-kinesis-stream",
    "Records": [
        {
            "Data": "SGVsbG8sIEtpbmVzaXMh",     // Base64 encoded "Hello, Kinesis!"
            "PartitionKey": "partition-1"
        },
        {
            "Data": "QW5vdGhlciBtZXNzYWdl",   // Base64 encoded "Another message"
            "PartitionKey": "partition-2"
        }
    ]
}
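
The loop in this template maps naturally onto a Python list comprehension. The sketch below imitates the transformation for the sample request; it is not API Gateway's own evaluation, and `put_records_payload` is a hypothetical name.

```python
import base64


def put_records_payload(body: dict, path_params: dict) -> dict:
    """Mimic the template: encode each record's data and rename its fields."""
    records = [
        {
            "Data": base64.b64encode(elem["data"].encode("utf-8")).decode("ascii"),
            "PartitionKey": elem["partition-key"],
        }
        for elem in body.get("records", [])
    ]
    return {"StreamName": path_params["stream-name"], "Records": records}


sample_body = {
    "records": [
        {"data": "Hello, Kinesis!", "partition-key": "partition-1"},
        {"data": "Another message", "partition-key": "partition-2"},
    ]
}
print(put_records_payload(sample_body, {"stream-name": "my-kinesis-stream"}))
```

Serialising the resulting list to JSON places the commas between elements automatically, which is the job `#if($foreach.hasNext),#end` does by hand in the template.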

## API Responses in Python

Now that we have updated our API, we can use the Python requests library to test the new API methods and obtain a response. 

Make sure to deploy the newest version of your API and use the correct API Invoke URL.

Ensure you have a uniquely identifiable `PartitionKey`, as we will need it later to read the correct data from Kinesis into Databricks.

import json

import requests

example_df = {"index": 1, "name": "Maya", "age": 25, "role": "engineer"}

# Invoke URL for a single record. To send several records, replace `record` with `records`
# and send a body in the format expected by the `records` mapping template.
invoke_url = "https://YourAPIInvokeURL/<YourDeploymentStage>/streams/<stream_name>/record"

# To send JSON messages you need to follow this structure
payload = json.dumps({
    "StreamName": "YourStreamName",
    # Data should be sent as column_name: value pairs
    "Data": {
        "index": example_df["index"],
        "name": example_df["name"],
        "age": example_df["age"],
        "role": example_df["role"],
    },
    "PartitionKey": "desired-name",
})

headers = {"Content-Type": "application/json"}

response = requests.request("PUT", invoke_url, headers=headers, data=payload)


To see whether the request was successfully processed we can `print(response.status_code)`, which should print `200`, indicating success.
We can view further details about the result with `print(response.content)`. For a single record, the response body contains the record's `SequenceNumber` alongside the `ShardId` of the shard it was written to.
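
The response body is JSON, so the two fields can be pulled out programmatically. The sketch below parses a made-up response body shaped like a `PutRecord` result; the sequence number is a hypothetical placeholder, and `summarise_put_record` is an illustrative helper. With a real call you would pass `response.content` instead.

```python
import json

# Hypothetical response body; a real one comes from response.content.
sample_content = (
    b'{"SequenceNumber": "49500000000000000000000000000000000000000000000000000001", '
    b'"ShardId": "shardId-000000000000"}'
)


def summarise_put_record(content: bytes) -> tuple:
    """Return (shard id, sequence number) from a PutRecord-style response body."""
    result = json.loads(content)
    return result["ShardId"], result["SequenceNumber"]


shard_id, sequence_number = summarise_put_record(sample_content)
print(f"Written to {shard_id} with sequence number {sequence_number}")
```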

## Visualise data coming into Kinesis Data Streams

Once you send data to a Kinesis Data Stream and receive a 200 `response.status_code`, you can visualise this data in the **Kinesis** console. 

In the console, select the stream you want to look at and then choose the **Data viewer** section. Here, select the **Shard** to read from. A stream with a single shard stores all records in `shardId-000000000000`; with several shards, the shard is determined by the hash of each record's partition key.

In the **Starting position** section select **At timestamp**. Now you can select the **Start date**, which corresponds to the date on which you sent data to your stream, and the **Start time**, the time at which you started sending data (this can be an approximation).

Alternatively, you can select **Trim horizon** as the starting position, which reads from the oldest record still retained in the shard, so it returns everything sent to the stream within its retention period.

Once everything is set up, press **Get records** and you will be able to visualise the data that has been sent to the stream.

**Note:** You may have to press **Next Records** a few times to see the incoming data.

<p align="center">
    <img src="images/Kinesis Records.png" width="900" height="550"/>
</p>
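
The two starting positions in the Data viewer correspond to the `AT_TIMESTAMP` and `TRIM_HORIZON` shard iterator types of the Kinesis `GetShardIterator` action. As a sketch, the function below only assembles the arguments for such a call; it assumes they would be passed to an SDK call such as boto3's `get_shard_iterator`, which is not invoked here. `shard_iterator_args` is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Optional


def shard_iterator_args(stream_name: str, shard_id: str,
                        start: Optional[datetime] = None) -> dict:
    """Assemble GetShardIterator arguments for either starting position."""
    args = {"StreamName": stream_name, "ShardId": shard_id}
    if start is None:
        # Read from the oldest record still retained in the shard.
        args["ShardIteratorType"] = "TRIM_HORIZON"
    else:
        # Read from the first record at or after the given time.
        args["ShardIteratorType"] = "AT_TIMESTAMP"
        args["Timestamp"] = start
    return args


print(shard_iterator_args("my-kinesis-stream", "shardId-000000000000"))
print(shard_iterator_args("my-kinesis-stream", "shardId-000000000000",
                          start=datetime(2024, 1, 1, tzinfo=timezone.utc)))
```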


## Conclusion

At this point, we should have a good understanding of:

- How to create the permissions necessary for communication between API Gateway and Kinesis
- How to create methods that allow you to list streams in Kinesis
- How to create methods that allow you to create, describe and delete streams in Kinesis
- How to create methods that allow you to add records to Kinesis streams