# Data Lake Example 4 - Event Notifications
Event notifications inform us about changes to, or actions taken on, objects in a bucket. They are triggered when specific actions occur (such as object creation, deletion, or updates), and they can start workflows in other systems such as Node-RED or NiFi.

Since Kafka and RabbitMQ aren’t available in the current setup of the data lake system, we can focus on using webhook-based notifications with Node-RED.
* Example use case: trigger an automated workflow when a new file is uploaded. For example, if a new video file (.webm) is uploaded, Node-RED (or NiFi) can start a workflow to process the file, such as generating thumbnails or analysing metadata.
* Notification Event: s3:ObjectCreated:* (triggers for any type of object creation event).
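
To make the use case concrete, the sketch below shows how a receiver could pick out newly uploaded .webm files from a notification body. It assumes the payload follows the S3-style event layout (a `Records` list whose entries carry `s3.bucket.name` and a URL-encoded `s3.object.key`). The function name `new_webm_objects` and the sample event are illustrative and not part of the data lake setup.

```python
from urllib.parse import unquote_plus


def new_webm_objects(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for created objects whose key ends in .webm."""
    matches = []
    for record in event.get("Records", []):
        # Only object-creation events are of interest here.
        if "ObjectCreated" not in record.get("eventName", ""):
            continue
        s3_info = record.get("s3", {})
        bucket = s3_info.get("bucket", {}).get("name", "")
        # Keys in S3-style events are URL-encoded (a space arrives as "+").
        key = unquote_plus(s3_info.get("object", {}).get("key", ""))
        if key.lower().endswith(".webm"):
            matches.append((bucket, key))
    return matches


# Hypothetical, heavily trimmed event used for illustration only.
sample_event = {
    "Records": [
        {"eventName": "s3:ObjectCreated:Put",
         "s3": {"bucket": {"name": "testbucket"},
                "object": {"key": "videos/demo+clip.webm"}}},
        {"eventName": "s3:ObjectCreated:Put",
         "s3": {"bucket": {"name": "testbucket"},
                "object": {"key": "images/athens2.png"}}},
    ]
}

print(new_webm_objects(sample_event))
```

In a real flow the same filtering could equally be done inside Node-RED; the Python version is only meant to show which fields of the event matter.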


## 0. Create a Webhook endpoint in Node-RED
There is a basic example at: https://youtu.be/HzO4wsL2Eio?si=NDiqUGyHaKv5hm4K
### 1. *Open Node-RED* in your browser 

* Typically at https://nodered.scene-dl.satrdlab.upv.es unless configured otherwise.

### 2. *Add an HTTP In Node*:

* Drag an http in node onto the canvas. This will act as the webhook endpoint.
* Set the method to POST and URL to /test-minio (or any path you prefer).
* This endpoint URL will be https://nodered.scene-dl.satrdlab.upv.es/test-minio (assuming Node-RED runs through the proxy).

### 3. *Add a Debug Node*:

* Drag a debug node onto the canvas.
* Connect the http in node to the debug node.
* Set the debug node to display msg.payload (or the complete msg object if you want to see everything). This will allow us to see the incoming notification data in the debug pane.

### 4. *Add an HTTP Response Node*:

* Drag an http response node to the canvas.
* Connect the http in node to the http response node.
* This node will send a response back to MinIO, confirming that the notification was received.

### 5. *Deploy the Flow*:

* Click the Deploy button to save and activate the flow.
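* Once deployed, Node-RED will be ready to receive event notifications at the endpoint configured in step 2, https://nodered.scene-dl.satrdlab.upv.es/test-minio (or http://localhost:1880/test-minio when Node-RED is reached locally without the proxy).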
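
Before involving MinIO, the flow can be checked by sending a hand-made request to the endpoint. The following is a minimal sketch using only the Python standard library. The payload is a simplified imitation of an object-created event, not an exact copy of what MinIO sends, and `build_test_event` and `post_event` are hypothetical helper names.

```python
import json
import urllib.request


def build_test_event(bucket: str, key: str) -> dict:
    """Build a simplified, hand-made imitation of an object-created event."""
    return {
        "EventName": "s3:ObjectCreated:Put",
        "Key": f"{bucket}/{key}",
        "Records": [
            {
                "eventName": "s3:ObjectCreated:Put",
                "s3": {"bucket": {"name": bucket}, "object": {"key": key}},
            }
        ],
    }


def post_event(url: str, event: dict, timeout: float = 10.0) -> int:
    """POST the event as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example call (needs network access to the Node-RED instance):
# status = post_event(
#     "https://nodered.scene-dl.satrdlab.upv.es/test-minio",
#     build_test_event("testbucket", "videos/demo.webm"),
# )
# print(f"Node-RED answered with HTTP {status}")
```

If the flow is deployed correctly, the test payload should show up in the debug pane and the request should return promptly instead of timing out.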

## 1. Load libraries and common configuration
For this example, it is better to use the minio library; otherwise you might get errors because MinIO's webhook functionality is not compatible with the formats expected by boto3.

Example of this incompatibility: WebhookConfigurations, CloudFunctionConfigurations, and EventBridgeConfigurations are not valid keys in the NotificationConfiguration dictionary for the native AWS S3 API. These configurations are specific to MinIO's custom webhook functionality, which isn't compatible with the standard boto3 API.

In [9]:
# Install necessary packages
!pip install minio certifi
!pip install --upgrade minio



In [9]:
from minio import Minio
from minio.error import S3Error  # needed for the except clause in the connection test below
from datetime import datetime

import requests
from requests.auth import HTTPBasicAuth
import json

import os
import ssl
import certifi
import sys
import warnings
warnings.filterwarnings('ignore')

# Some issues (SSL verification errors) might appear with the client if Python is not properly configured.
# You might find this line useful to skip the error
ssl._create_default_https_context = ssl._create_unverified_context


# MinIO server connection information
minio_url = 's3api.scene-dl.satrdlab.upv.es'  # Replace with your MinIO instance URL
access_key = 'testuser'       # Replace with your actual access key
secret_key = 'testscene'       # Replace with your actual secret key


# Initialize Minio client
minio_client = Minio(
    minio_url,  
    access_key,
    secret_key,
    secure=True  # Set to True if using HTTPS
)

# Test connection by listing buckets
try:
    buckets = minio_client.list_buckets()
    for bucket in buckets:
        print(f"Bucket: {bucket.name}")
except S3Error as e:
    print(f"Error: {e}")

Bucket: coldbucket
Bucket: testbucket


## 2. Configuring S3 Event Notifications

In [10]:
# Parameters
bucket_name = "testbucket"
node_red_webhook_url = "https://nodered.scene-dl.satrdlab.upv.es/test-minio"  # Node-RED webhook URL for notifications

# Define the notification configuration JSON
notification_config = {
    "QueueConfigurations": [
        {
            "QueueArn": "arn:minio:sqs::NodeRedQueue",
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {
                "Key": {
                    "FilterRules": [
                        {"Name": "prefix", "Value": ""},
                        {"Name": "suffix", "Value": ""}
                    ]
                }
            },
            "QueueUrl": node_red_webhook_url
        }
    ]
}

# Convert to JSON
json_config = json.dumps(notification_config)

# Get current date and time in the required format (e.g., 20231106T123456Z)
current_date = datetime.utcnow().strftime('%Y%m%dT%H%M%SZ')

# Set the notification using MinIO's REST API
headers = {
    'Content-Type': 'application/json',
    'x-amz-content-sha256': 'UNSIGNED-PAYLOAD',
    'x-amz-date': current_date
}

# Replace URL to match your MinIO server and region
response = requests.put(
    f"https://{minio_url}/{bucket_name}?notification",
    headers=headers,
    data=json_config,
    auth=HTTPBasicAuth(access_key, secret_key)  # Basic auth; MinIO rejects it and expects AWS Signature Version 4 (see the output below)
)

if response.status_code == 200:
    print("Notification successfully configured")
else:
    print("Error setting notification:", response.content)


Error setting notification: b'<?xml version="1.0" encoding="UTF-8"?>\n<Error><Code>InvalidRequest</Code><Message>The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.</Message><Resource>/testbucket</Resource><RequestId>1805AF8445567211</RequestId><HostId>dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8</HostId></Error>'
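
The error above indicates that the server does not accept HTTP Basic authentication and expects requests signed with AWS Signature Version 4. The minio SDK signs its requests and provides `set_bucket_notification`, so one possible alternative is sketched below. It rests on assumptions that are not confirmed by this notebook: a webhook target pointing at the Node-RED URL has already been registered on the MinIO server by an administrator under the hypothetical identifier `nodered`, and its ARN follows the usual `arn:minio:sqs::<id>:webhook` pattern. The helper names are illustrative.

```python
def webhook_arn(target_id: str, region: str = "") -> str:
    """Build the ARN that MinIO typically assigns to a webhook target."""
    return f"arn:minio:sqs:{region}:{target_id}:webhook"


def configure_bucket_webhook(client, bucket_name: str, target_id: str) -> None:
    """Subscribe a server-side webhook target to object-created events."""
    # Imported here so the ARN helper stays usable without the SDK installed.
    from minio.notificationconfig import NotificationConfig, QueueConfig

    config = NotificationConfig(
        queue_config_list=[
            QueueConfig(
                queue=webhook_arn(target_id),
                events=["s3:ObjectCreated:*"],
                config_id="nodered-object-created",  # hypothetical label
            )
        ]
    )
    # The SDK signs this request with AWS Signature Version 4.
    client.set_bucket_notification(bucket_name=bucket_name, config=config)


def trigger_with_upload(client, bucket_name: str, file_path: str, object_name: str) -> None:
    """Upload a local file; the resulting event should reach the webhook."""
    client.fput_object(
        bucket_name=bucket_name, object_name=object_name, file_path=file_path
    )


# Example usage with the client created in section 1 (needs server access):
# configure_bucket_webhook(minio_client, "testbucket", "nodered")
# trigger_with_upload(minio_client, "testbucket", "clip.webm", "videos/clip.webm")
```

If the target identifier on the server differs, only the `target_id` argument needs to change. Should the call fail with an ARN-related error, the webhook target is most likely not registered on the server yet.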


## 3. List all files from a bucket within a Data Lake

In [6]:
# List objects in the bucket (recursive=True also descends into prefixes such as images/)
objects = list(minio_client.list_objects(bucket_name, recursive=True))

# Print each file name (object key)
if objects:
    for obj in objects:
        print(obj.object_name)
else:
    print("No files found in the bucket.")

athens.jpg
images/athens2.png


## 4. Download file from a Data Lake

In [7]:
# File details
download_path = 'athens_download.png'
bucket_name = 'testbucket'
object_name = 'images/athens2.png'

# Download the file
minio_client.fget_object(bucket_name, object_name, download_path)
print(f"Downloaded {object_name} to {download_path}")

Downloaded images/athens2.png to athens_download.png


## 5. Delete a file from the Data Lake

In [8]:
# Delete the file
minio_client.remove_object(bucket_name, object_name)
print("Delete successful")

Delete successful
