# Stream Filtering

Experiment: can we use Kinesis Data Analytics to filter the records written to one stream and use them to populate other streams? More precisely: how do we do this, and what is the latency between a record being written to the main stream and that record arriving on the filtered stream?

## Setup

First, we need some streams.

In [None]:
import boto3

kinesis_client = boto3.client('kinesis')

In [None]:
# Create some streams
main_stream_response = kinesis_client.create_stream(
    StreamName='main',
    ShardCount=1)

In [None]:
kinesis_client.describe_stream(StreamName='main')

In [None]:
kinesis_client.create_stream(StreamName='filtered', ShardCount=1)

In [None]:
kinesis_client.describe_stream(StreamName='filtered')
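
`create_stream` returns before the stream is usable, so a write issued straight away can fail while the stream is still `CREATING`. A minimal sketch of a polling helper follows; `wait_until_active` and its parameters are names invented for this notebook, and it only relies on the `StreamDescription.StreamStatus` field that `describe_stream` already returns above.

```python
import time

def wait_until_active(client, stream_name, poll_seconds=1.0, timeout_seconds=60.0):
    """Poll describe_stream until the stream is ACTIVE; return the seconds waited."""
    started = time.monotonic()
    while True:
        description = client.describe_stream(StreamName=stream_name)
        status = description['StreamDescription']['StreamStatus']
        if status == 'ACTIVE':
            return time.monotonic() - started
        # Give up rather than loop forever if the stream never becomes usable.
        if time.monotonic() - started > timeout_seconds:
            raise TimeoutError(f'{stream_name} is still {status} after {timeout_seconds}s')
        time.sleep(poll_seconds)
```

Usage would be `wait_until_active(kinesis_client, 'main')`, and the same for `'filtered'`. boto3 also ships a `stream_exists` waiter (`kinesis_client.get_waiter('stream_exists')`) that should serve the same purpose with less code.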

In [None]:
from datetime import datetime, timezone

def timestamp():
    the_time = datetime.now(timezone.utc)
    return the_time.isoformat()

## Stream Write

In [None]:
import uuid

event = {
    "specversion":"1.0",
    "type":"newFoo",
    "source":"foo",
    "id":str(uuid.uuid4()),
    "time":timestamp(),
    "data":{"foostuff":"foostuffval"}
}
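
To exercise a filter we will need events of more than one type. The sketch below wraps the envelope above in a small factory; `make_event` is a helper name made up here, and `newBar` is a hypothetical second event type that exists only so the filter has something to reject.

```python
import uuid
from datetime import datetime, timezone

def make_event(event_type, source='foo', data=None):
    """Build an event with the same envelope fields as the hand-written one above."""
    return {
        "specversion": "1.0",
        "type": event_type,
        "source": source,
        "id": str(uuid.uuid4()),
        "time": datetime.now(timezone.utc).isoformat(),
        "data": data if data is not None else {},
    }

# A mixed batch: a filter on type == 'newFoo' should pass two of these three.
sample_events = [make_event(t) for t in ("newFoo", "newBar", "newFoo")]
```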

In [None]:
event['source']

In [None]:
import json

prr = kinesis_client.put_record(
    StreamName='main',
    Data=json.dumps(event).encode(),
    PartitionKey=event['source']
)

In [None]:
prr

## Stream Read

In [None]:
## Read from stream

shardId = prr['ShardId']
print('shard id is %s' % shardId)

gsir = kinesis_client.get_shard_iterator(
    StreamName='main',
    ShardId=shardId,
    ShardIteratorType='TRIM_HORIZON'
)
print(gsir)

In [None]:
## Read from the current position of the iterator
grr = kinesis_client.get_records(
    ShardIterator=gsir['ShardIterator']
)

print(grr)
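
A single `get_records` call is not guaranteed to return everything in the shard, and it can come back empty even when records exist further along. A rough sketch of a loop that follows `NextShardIterator` until the response reports `MillisBehindLatest` of zero is shown here; `read_until_caught_up` and `max_calls` are names chosen for this notebook, not part of boto3.

```python
def read_until_caught_up(client, shard_iterator, max_calls=10):
    """Follow NextShardIterator until the shard is drained or max_calls is reached."""
    records = []
    for _ in range(max_calls):
        response = client.get_records(ShardIterator=shard_iterator)
        records.extend(response['Records'])
        shard_iterator = response.get('NextShardIterator')
        # No next iterator means the shard is closed; zero lag means we are at the tip.
        if shard_iterator is None or response.get('MillisBehindLatest', 0) == 0:
            break
    return records
```

Called as `read_until_caught_up(kinesis_client, gsir['ShardIterator'])`. For anything beyond a quick experiment the loop would probably want a short sleep between calls to stay under the per-shard read limits.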

In [None]:
records = grr['Records']
for r in records:
    print(r)
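
For the latency question, one option is to compare the `time` field we put in the event with the `ApproximateArrivalTimestamp` Kinesis attaches to each record. The following is a sketch only: `record_latency_seconds` is a made-up helper, it assumes the payload is the JSON event written above, and it compares the producer's clock with the service's clock, so small or even negative values are possible.

```python
import json
from datetime import datetime

def record_latency_seconds(record):
    """Seconds between the event's own 'time' field and its arrival on the stream."""
    event = json.loads(record['Data'])  # Data arrives as bytes; json.loads accepts them
    written_at = datetime.fromisoformat(event['time'])
    return (record['ApproximateArrivalTimestamp'] - written_at).total_seconds()
```

Applied to records read from the `filtered` stream instead of `main`, the same calculation should give the end-to-end figure this experiment is after.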

## Analytics App

In [41]:
ka = boto3.client('kinesisanalyticsv2')

In [None]:
car = ka.create_application(
    ApplicationName='Dave',  # the parameter is ApplicationName, not AppName
    ApplicationDescription='Dave the wonder app',
    RuntimeEnvironment='SQL-1_0',
    ServiceExecutionRole='uh-oh'  # placeholder; this needs to be a real IAM role ARN
    # Oh crap how do we specify all this stuff - maybe create one from the console and dump it...
)
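
As a starting point for "all this stuff", here is a guess at the shape of the `ApplicationConfiguration` argument for a SQL filter, written from my reading of the `kinesisanalyticsv2` request structure and not yet run against AWS. Everything specific in it is an assumption: the helper name `build_sql_filter_config`, the column names and `VARCHAR(64)` sizes, the pump name, and the stream ARNs the caller has to supply. Dumping a console-created app with `describe_application` is still the better way to confirm it.

```python
def build_sql_filter_config(input_stream_arn, output_stream_arn, event_type):
    """Sketch of an ApplicationConfiguration that forwards only one event type."""
    # Column names are prefixed to steer clear of SQL reserved words such as "time".
    columns = [
        {'Name': 'event_type', 'Mapping': '$.type', 'SqlType': 'VARCHAR(64)'},
        {'Name': 'event_source', 'Mapping': '$.source', 'SqlType': 'VARCHAR(64)'},
        {'Name': 'event_id', 'Mapping': '$.id', 'SqlType': 'VARCHAR(64)'},
        {'Name': 'event_time', 'Mapping': '$.time', 'SqlType': 'VARCHAR(64)'},
    ]
    names = ', '.join('"%s"' % c['Name'] for c in columns)
    declared = ', '.join('"%s" %s' % (c['Name'], c['SqlType']) for c in columns)
    literal = event_type.replace("'", "''")  # escape quotes inside the SQL literal
    sql = (
        'CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (%s);\n' % declared
        + 'CREATE OR REPLACE PUMP "FILTER_PUMP" AS\n'
        + '  INSERT INTO "DESTINATION_SQL_STREAM"\n'
        + '  SELECT STREAM %s FROM "SOURCE_SQL_STREAM_001"\n' % names
        + '  WHERE "event_type" = \'%s\';' % literal
    )
    return {
        'SqlApplicationConfiguration': {
            'Inputs': [{
                # The in-application stream is named <NamePrefix>_001.
                'NamePrefix': 'SOURCE_SQL_STREAM',
                'KinesisStreamsInput': {'ResourceARN': input_stream_arn},
                'InputSchema': {
                    'RecordFormat': {
                        'RecordFormatType': 'JSON',
                        'MappingParameters': {
                            'JSONMappingParameters': {'RecordRowPath': '$'}
                        },
                    },
                    'RecordEncoding': 'UTF-8',
                    'RecordColumns': columns,
                },
            }],
            'Outputs': [{
                'Name': 'DESTINATION_SQL_STREAM',
                'KinesisStreamsOutput': {'ResourceARN': output_stream_arn},
                'DestinationSchema': {'RecordFormatType': 'JSON'},
            }],
        },
        'ApplicationCodeConfiguration': {
            'CodeContent': {'TextContent': sql},
            'CodeContentType': 'PLAINTEXT',
        },
    }
```

The result would be passed to `create_application` as `ApplicationConfiguration=...`, with the two ARNs taken from the `StreamARN` field of the `describe_stream` responses. Note that this drops the `data` payload; carrying it through would need extra column mappings.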

## Cleanup

In [None]:
kinesis_client.delete_stream(StreamName='main')
kinesis_client.delete_stream(StreamName='filtered')

In [None]:
kinesis_client.list_streams()