# Bucket creation and notification configuration

First of all, if you've never used a Jupyter notebook before, here is some basic information:
- A notebook is an environment where you have *cells* that can display formatted text (like the present cell), or code (as you will see below).
- Code cells contain Python code that can be run interactively. That means you can modify the code, as you will do in the first code cell, then run it. The code will not run on your computer or in the browser, but directly on the server you are connected to.
- To run a code cell, just select it (click in the cell, or on the left side of it), and click the "Run/Play" button from the toolbar.
- You can also press CTRL+Enter to run a cell, or Shift+Enter to run the cell and automatically select the following one.
- You can navigate between cells with Up and Down arrows (on your keyboard or in the toolbar).

With this (very) quick tutorial, you should be able to run the different parts of this notebook. If you still have doubts, here is a longer tutorial: https://www.codecademy.com/articles/how-to-use-jupyter-notebooks

# Preparatory steps

## ---==== REALLY IMPORTANT STEP! ====---
### Please read carefully and make all the requested changes. There are 4 changes to make.
## Parameters

We will start by setting some parameters that will be used throughout this notebook. Replace the values where indicated, then run the cell.

In [None]:
# Warning: for all these variables, keep the opening and closing quotes when you copy/paste a value

# Enter the main namespace name of this lab. It should be in the form xraylab-{number}, like xraylab-3
namespace = 'xraylab-xx'

# Enter your bucket base name. It is the same one you put in the config map. It should be identical to your namespace if you have followed the instructions.
bucket_base_name = 'xraylab-xx'

# Enter your Access and Secret keys. They are the ones that were displayed in the instructions.
aws_access_key_id = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
aws_secret_access_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

# Do not change this value, this is the internal location for the RGW
endpoint_url = 'http://rook-ceph-rgw-ocs-storagecluster-cephobjectstore.openshift-storage.svc.cluster.local'
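
A forgotten placeholder tends to show up only later, as a connection or authorization error. The following is a minimal sketch of a sanity check, not part of the original lab: the function name `find_parameter_problems` and the checks themselves are assumptions derived from the formats described in the comments above.

```python
import re

def find_parameter_problems(namespace, bucket_base_name, access_key, secret_key):
    """Return a list of human-readable problems; an empty list means every check passed."""
    problems = []
    # The instructions describe the namespace as xraylab-{number}, like xraylab-3.
    if not re.fullmatch(r"xraylab-\d+", namespace):
        problems.append("namespace should look like xraylab-3")
    # The bucket base name is expected to match the namespace.
    if bucket_base_name != namespace:
        problems.append("bucket_base_name usually matches the namespace")
    # The template keys consist only of the letter x.
    for label, value in (("aws_access_key_id", access_key),
                         ("aws_secret_access_key", secret_key)):
        if not value or set(value) == {"x"}:
            problems.append(label + " still contains the placeholder")
    return problems
```

In this notebook it could be called as `find_parameter_problems(namespace, bucket_base_name, aws_access_key_id, aws_secret_access_key)` right after the parameters cell.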


## Imports
Of course we'll need some libraries to work with, so import them by running the following cell.

In [None]:
!pip install boto3
import boto3
import json
import botocore
import argparse

## S3 and SNS connections
Boto3 is the AWS SDK for Python, widely used to interact with cloud services like S3 and SNS. As the Ceph object gateway (RGW) exposes S3-compatible and SNS-compatible APIs, we can use the library directly to work with the storage. First, let's create the clients that connect to the storage (you can see we are using some of the parameters we defined earlier).

In [None]:
s3 = boto3.client('s3',
                endpoint_url = endpoint_url,
                aws_access_key_id = aws_access_key_id,
                aws_secret_access_key = aws_secret_access_key,
                region_name = 'default',
                config=botocore.client.Config(signature_version = 's3'))

sns = boto3.client('sns', 
                endpoint_url = endpoint_url, 
                aws_access_key_id = aws_access_key_id,
                aws_secret_access_key= aws_secret_access_key,
                region_name='default', 
                config=botocore.client.Config(signature_version = 's3'))
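
Both clients share the same endpoint, credentials, region, and signature version. As a hedged sketch, that shared part could be collected once and reused; the helper names `connection_settings` and `make_client` are hypothetical and not part of the lab.

```python
def connection_settings(endpoint_url, access_key, secret_key, region="default"):
    """Collect the keyword arguments that the S3 and SNS clients have in common."""
    return {
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": region,
    }

def make_client(service_name, settings):
    """Create a boto3 client ('s3' or 'sns') from the shared settings."""
    # Imported here so the settings helper above can be used without boto3 installed.
    import boto3
    import botocore.client
    return boto3.client(
        service_name,
        config=botocore.client.Config(signature_version="s3"),
        **settings,
    )
```

With the parameters defined earlier, `make_client('s3', connection_settings(endpoint_url, aws_access_key_id, aws_secret_access_key))` would produce a client configured like the one above.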

## Create buckets
Now that we can connect to the storage, we can create our buckets. Run the first cell, which defines a "creation function" (an S3 API call using the client we created). Then run the second cell, which creates the 3 buckets we will need.

In [None]:
def create_bucket(bucket_name):
    result = s3.create_bucket(Bucket=bucket_name)
    return result

In [None]:
create_bucket(bucket_base_name)
create_bucket(bucket_base_name+'-processed')
create_bucket(bucket_base_name+'-anonymized')

### Verification
As the previous output may have been cryptic (and anyway it's always good to check), let's list all the buckets and verify they indeed have been created.

In [None]:
for bucket in s3.list_buckets()['Buckets']:
    print(bucket['Name'])
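
Reading a printed list by eye works, but the check can also be made explicit. The sketch below compares the listing against the three expected names; `expected_bucket_names` and `missing_buckets` are hypothetical helper names, and the client is passed in as an argument so the same function works with any object exposing `list_buckets()`.

```python
def expected_bucket_names(base_name):
    """The three buckets this lab creates from the base name."""
    return [base_name, base_name + "-processed", base_name + "-anonymized"]

def missing_buckets(s3_client, base_name):
    """Return the expected bucket names that do not appear in the listing."""
    existing = {bucket["Name"] for bucket in s3_client.list_buckets()["Buckets"]}
    return [name for name in expected_bucket_names(base_name) if name not in existing]
```

Calling `missing_buckets(s3, bucket_base_name)` in this notebook should return an empty list once all three buckets exist.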

## Make buckets public read
Our Grafana dashboard will display the last image from each bucket. Instead of setting up a dedicated web server, we can query our object store directly to retrieve the images. For this to work, we have to make our buckets "public-readable". This is done by applying the following bucket policy to each of them.

In [None]:
for bucket in s3.list_buckets()['Buckets']:
    bucket_policy = {
                      "Version":"2012-10-17",
                      "Statement":[
                        {
                          "Sid":"AddPerm",
                          "Effect":"Allow",
                          "Principal": "*",
                          "Action":["s3:GetObject"],
                          "Resource":["arn:aws:s3:::{0}/*".format(bucket['Name'])]
                        }
                      ]
                    }
    bucket_policy = json.dumps(bucket_policy)
    s3.put_bucket_policy(Bucket=bucket['Name'], Policy=bucket_policy)
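
The loop above applies the policy to every bucket returned by `list_buckets()`. If the account ever holds other buckets, it may be preferable to name the targets explicitly. The following is a sketch under that assumption; `public_read_policy` and `apply_public_read` are hypothetical names, and the policy document is the same one used above.

```python
import json

def public_read_policy(bucket_name):
    """Build the public-read policy for one bucket and return it as a JSON string."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AddPerm",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject"],
                "Resource": ["arn:aws:s3:::{0}/*".format(bucket_name)],
            }
        ],
    }
    return json.dumps(policy)

def apply_public_read(s3_client, bucket_names):
    """Apply the public-read policy only to the buckets named by the caller."""
    for name in bucket_names:
        s3_client.put_bucket_policy(Bucket=name, Policy=public_read_policy(name))
```

For this lab, the list passed in would be the base bucket plus its `-processed` and `-anonymized` variants.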

# Bucket Notifications configuration

## First, let's define our endpoint (where we will send our notifications) through a small dictionary.

In [None]:
attributes = {}
attributes['push-endpoint'] = 'kafka://my-cluster-kafka-bootstrap.'+namespace+':9092'
attributes['kafka-ack-level'] = 'broker'

## Now, we define a function that creates a topic with those attributes. (We will create only one topic, so a function may seem like overkill, but it gives you a reusable snippet for when you have many to create.)

In [None]:
def create_topic(topic):
    topic_arn = sns.create_topic(Name=topic, Attributes=attributes)['TopicArn']
    return topic_arn
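
The function above reads `attributes` and `sns` from the notebook's global scope. For reuse outside this notebook, a variant that takes everything as arguments may be easier to follow. This is only a sketch: `kafka_endpoint_attributes` and `create_topic_with` are hypothetical names, and the Kafka service name and port are the ones used in this lab.

```python
def kafka_endpoint_attributes(namespace, port=9092, ack_level="broker"):
    """Build the topic attributes pointing at the lab's Kafka bootstrap service."""
    return {
        "push-endpoint": "kafka://my-cluster-kafka-bootstrap.{0}:{1}".format(namespace, port),
        "kafka-ack-level": ack_level,
    }

def create_topic_with(sns_client, topic_name, attributes):
    """Create the topic through the given SNS client and return its ARN."""
    response = sns_client.create_topic(Name=topic_name, Attributes=attributes)
    return response["TopicArn"]
```

In this notebook the equivalent call would be `create_topic_with(sns, 'xray-images', kafka_endpoint_attributes(namespace))`.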

## Create the notification topic

In [None]:
create_topic('xray-images')

### And as always, a quick check that it has been created.

In [None]:
sns.list_topics()
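
The raw `list_topics()` response is a dictionary whose `Topics` entry holds one `TopicArn` per topic. A small lookup, sketched below with the hypothetical name `find_topic_arn`, turns the visual check into a value you can reuse. It assumes the topic name is the last colon-separated segment of the ARN and does not follow pagination tokens.

```python
def find_topic_arn(sns_client, topic_name):
    """Return the ARN of the topic with the given name, or None if it is not listed."""
    for topic in sns_client.list_topics().get("Topics", []):
        arn = topic["TopicArn"]
        # The topic name is assumed to be the final segment of the ARN.
        if arn.split(":")[-1] == topic_name:
            return arn
    return None
```

In this notebook, `find_topic_arn(sns, 'xray-images')` should return a value once the topic exists.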

## The next step is to define a notification configuration, i.e. when our topic should be used. Here it's whenever a new object is created ("Events": \["s3:ObjectCreated:*"\]), in which case we use our topic, referring to it through its ARN (unique id, 'arn:aws:sns:s3a::xray-images'). We then apply this configuration to our base bucket, the one where the images will arrive:

In [None]:
bucket_notifications_configuration = {
            "TopicConfigurations": [
                {
                    "Id": 'xray-images',
                    "TopicArn": 'arn:aws:sns:s3a::xray-images',
                    "Events": ["s3:ObjectCreated:*"]
                }
            ]
        }

s3.put_bucket_notification_configuration(Bucket = bucket_base_name,
        NotificationConfiguration=bucket_notifications_configuration)
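
The cell above hard-codes the topic ARN. If the ARN is available as a value, for example the one returned by `create_topic('xray-images')`, the configuration can be built from it instead. The helper below is a sketch with a hypothetical name, `build_notification_configuration`; it produces the same structure as the cell above and, by default, uses the last segment of the ARN as the configuration `Id`.

```python
def build_notification_configuration(topic_arn, config_id=None, events=None):
    """Build a bucket notification configuration that targets a single topic."""
    if config_id is None:
        # Default the Id to the topic name, taken as the last ARN segment.
        config_id = topic_arn.split(":")[-1]
    if events is None:
        events = ["s3:ObjectCreated:*"]
    return {
        "TopicConfigurations": [
            {"Id": config_id, "TopicArn": topic_arn, "Events": list(events)}
        ]
    }
```

The result can be passed as `NotificationConfiguration` to `s3.put_bucket_notification_configuration`, exactly as the dictionary above is.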

### One last quick verification that the configuration has been applied to our bucket.

In [None]:
s3.get_bucket_notification_configuration(Bucket = bucket_base_name)
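
The response is a dictionary that, when a topic notification is configured, contains a `TopicConfigurations` list. As a sketch, the helper below (hypothetical name `notifies_topic`) checks whether a given topic ARN and event appear in such a response. Any sample response used to try it out is a hand-written stand-in, not real output from the lab cluster.

```python
def notifies_topic(response, topic_arn, event="s3:ObjectCreated:*"):
    """Return True if the response lists the topic ARN together with the given event."""
    for config in response.get("TopicConfigurations", []):
        if config.get("TopicArn") == topic_arn and event in config.get("Events", []):
            return True
    return False
```

In this notebook it could be called as `notifies_topic(s3.get_bucket_notification_configuration(Bucket=bucket_base_name), 'arn:aws:sns:s3a::xray-images')`.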

# You're done!
Buckets have been created and notifications have been configured. You're now ready to run the demo. You can leave the notebook open or close the tab, then go back to the Bookbag for the instructions on how to run the demo.