# flash-flood User Vignette

[flash-flood](https://github.com/HumanCellAtlas/flash-flood) is an event recorder and streamer built on top of AWS S3, supporting distributed writes and fast distributed bulk reads. It can be used to store and retrieve information about transactions and events in JSON format, which can be quickly filtered with JMESPath. In this notebook we demonstrate basic usage of the flash-flood library.

Let's get started by importing `boto3` and the `FlashFlood` class:

In [None]:
import boto3
s3 = boto3.resource('s3')

from flashflood import FlashFlood

flash-flood reads and writes events in journals stored in an S3 bucket, so you must instantiate `FlashFlood` with the name of an S3 bucket you have read/write access to, along with a prefix:

In [None]:
ff = FlashFlood(s3, "my-flashflood-test-bucket", "my_prefix")

A flash-flood event consists of event data, a unique event identifier, and a timestamp. First we prepare these three values:

In [None]:
import datetime
import uuid

event_data = b'my event data'
event_uuid = str(uuid.uuid4())
event_date = datetime.datetime.now()
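
Because flash-flood is meant for JSON event data, it can help to build the three values from a dictionary rather than from raw bytes. The following is a minimal sketch; `make_event` is a hypothetical helper written for this vignette, not part of the flash-flood API:

```python
import datetime
import json
import uuid


def make_event(payload):
    """Hypothetical helper: turn a dict into (data, event_id, date)."""
    # Serialize the payload to JSON bytes so it can be filtered later.
    data = json.dumps(payload, sort_keys=True).encode("utf-8")
    # Generate a random unique identifier for the event.
    event_id = str(uuid.uuid4())
    # Timestamp the event with the current local time, as in the cell above.
    date = datetime.datetime.now()
    return data, event_id, date


data, event_id, date = make_event({"foo": 1, "kind": "example"})
```

The resulting values have the same types as `event_data`, `event_uuid`, and `event_date` above, so they can be passed to `ff.put` in the same way.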

flash-flood exposes a CRUD (create, read, update, delete) API for events:

In [None]:
# Create
ff.put(event_data, event_uuid, event_date)

# Read
event = ff.get_event(event_uuid)
print("This is the data: " + str(event.data))
print("This is the date: " + str(event.date))
print("This is the event ID: " + event.event_id)

# Update
new_event_data = b'i want to put new data'
ff.update_event(event_uuid, new_event_data)
print("This is the updated data: " + str(ff.get_event(event_uuid).data))

# Delete
ff.delete_event(event_uuid)

All events belong to a journal. Journals can be created ad hoc or manually; the call below creates one manually:

In [None]:
ff.journal(minimum_number_of_events=1)

To list existing journals, call `ff.list_journals()`.

Every event is assigned a date when it is created, and you can stream all events that occurred between two given dates. The code below creates fake events, then replays the events between two dates:

In [None]:
import json

for i in range(40, 50):
    event_data = json.dumps({'foo': i}).encode()
    event_uuid = str(uuid.uuid4())
    event_date = datetime.datetime.fromtimestamp(10000 * i)
    ff.put(event_data, event_uuid, event_date)

arbitrary_from_date = datetime.datetime.fromtimestamp(10000 * 42)
arbitrary_to_date = datetime.datetime.fromtimestamp(10000 * 48)

for event in ff.replay(from_date=arbitrary_from_date, to_date=arbitrary_to_date):
    print(event.data)
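
To see which window the two arbitrary dates describe, the timestamps can be inspected on their own. This is a small sketch using only the standard library. It prints the UTC equivalents, whereas the cell above uses `datetime.datetime.fromtimestamp` without a timezone, which returns naive local times. Whether flash-flood accepts timezone-aware datetimes is not shown here; the sketch is only for inspecting the values.

```python
import datetime

STEP_SECONDS = 10000  # spacing between the fake events above
utc = datetime.timezone.utc

# UTC equivalents of arbitrary_from_date and arbitrary_to_date.
from_utc = datetime.datetime.fromtimestamp(STEP_SECONDS * 42, tz=utc)
to_utc = datetime.datetime.fromtimestamp(STEP_SECONDS * 48, tz=utc)

print(from_utc.isoformat())  # 1970-01-05T20:40:00+00:00
print(to_utc.isoformat())    # 1970-01-06T13:20:00+00:00
```

The fake events are therefore spaced 10,000 seconds apart in early January 1970, and the replay window spans 60,000 seconds.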

Since the event data is JSON, we can use JMESPath to filter it:

In [None]:
import jmespath

events = []
for event in ff.replay(from_date=arbitrary_from_date, to_date=arbitrary_to_date):
    events.append(json.loads(event.data))

expression = jmespath.compile('events[].foo')
expression.search({'events': events})