# Getting Started with GlassFlow Python SDK: Jupyter Notebook Guide

This Jupyter Notebook guide takes a hands-on approach to the GlassFlow Python SDK. By following the steps below, you should be able to publish data to and consume data from a GlassFlow pipeline using the Python SDK in your local environment.

## Prerequisites
Before starting, ensure you have:

- A GlassFlow account. [Sign up here](http://app.glassflow.dev/) if you don't have one.
- Python 3.x installed on your system.
- [pip](https://pip.pypa.io/en/stable/installation/) installed to manage project packages.

## Creating a Pipeline using GlassFlow WebApp

### Step 1: Log in to GlassFlow WebApp

1. Navigate to the [GlassFlow WebApp](https://app.glassflow.dev/).
2. Log in with your credentials.

### Step 2: Create a New Pipeline

1. Once logged in, click on the **"Create New Pipeline"** button.
2. Provide a name for your pipeline in the **"Pipeline Name"** field. This name should be descriptive of the pipeline's purpose (e.g., "PII detection").

### Step 3: Configure a Data Source

1. After naming your pipeline, you will be prompted to configure a data source.
2. Select "SDK" as a data source for your pipeline:
    - The GlassFlow SDK option requires you to implement the logic for sending data from a custom data source to the GlassFlow pipeline in Python.
    - For built-in connectors like Webhook, Amazon SQS, Google Pub/Sub, etc., select the relevant option and provide the required connection details.

### Step 4: Define the Transformer

1. In the transformation step, you can define the logic that will be applied to the data as it passes through the pipeline.
2. You will see a built-in editor to write code for the transformer. There is also an option to choose a sample transformer from the "Template" dropdown menu. 
3. Select the "PII Detection" function template:

    - The **handler** function is mandatory and is where you'll implement your transformation logic.
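To make the role of the handler concrete, here is a minimal sketch of what such a function could look like. It is not the code of the "PII Detection" template: the `handler(data, log)` signature, the `mask_emails` helper, and the email-only regular expression are all assumptions made for illustration, so compare against the template shown in the editor before relying on it.

```python
import re

# Hypothetical pattern: real PII detection needs far more than an email regex.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def mask_emails(text):
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub("<EMAIL>", text)


def handler(data, log):
    """Mask email addresses in every string field of the incoming event.

    The (data, log) signature is an assumption; check the editor's template.
    """
    masked = {
        key: mask_emails(value) if isinstance(value, str) else value
        for key, value in data.items()
    }
    log.info("Masked event with %d fields", len(masked))
    return masked
```

The function returns a new dictionary rather than mutating its input, which keeps the transformation easy to test locally with any standard `logging` logger.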

### Step 5: Configure a Data Sink

1. Next, configure where you want the transformed data to go by selecting "SDK" as the data sink.
2. The GlassFlow SDK option requires you to implement the logic for consuming data from the GlassFlow pipeline in Python.
3. For built-in connectors like Webhook, Amazon SQS, Google Pub/Sub, etc., select the relevant option and provide the required connection details.

### Step 6: Confirm the Pipeline

1. Review your pipeline configuration and ensure all settings are correct.
2. Once satisfied, click on the **"Create Pipeline"** button to finalize the setup.

### Step 7: Copy the Pipeline Credentials

1. After the pipeline is created, you will receive a **Pipeline ID** and an **Access Token**.
2. Copy these credentials as they will be needed for interacting with the pipeline using the GlassFlow Python SDK.

## Publish and Consume Data from the Pipeline

### Step 1: Install GlassFlow SDK

Install the GlassFlow Python SDK using pip.

In [41]:
pip install glassflow

Note: you may need to restart the kernel to use updated packages.


### Step 2: Create a Data Producer

In this step, you'll create a data producer to publish events to your GlassFlow pipeline.

1. Import GlassFlow module

In [42]:
import glassflow

2. Set pipeline credentials

Replace `<your_pipeline_id>` and `<your_pipeline_access_token>` with the Pipeline ID and Access Token you copied after creating the pipeline.

In [43]:
PIPELINE_ID="<your_pipeline_id>"
PIPELINE_ACCESS_TOKEN="<your_pipeline_access_token>"
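If you would rather not paste credentials into a notebook cell, one option is to read them from environment variables. The sketch below assumes two hypothetical variable names, `GLASSFLOW_PIPELINE_ID` and `GLASSFLOW_PIPELINE_ACCESS_TOKEN`; they are not defined by the SDK, and the `load_credentials` helper is likewise only an illustration.

```python
import os


def load_credentials(env=os.environ):
    """Return (pipeline_id, access_token) read from environment variables.

    The variable names are hypothetical; pick whatever fits your setup.
    """
    pipeline_id = env.get("GLASSFLOW_PIPELINE_ID")
    access_token = env.get("GLASSFLOW_PIPELINE_ACCESS_TOKEN")

    # Collect every missing name so the error message is complete.
    missing = [
        name
        for name, value in (
            ("GLASSFLOW_PIPELINE_ID", pipeline_id),
            ("GLASSFLOW_PIPELINE_ACCESS_TOKEN", access_token),
        )
        if not value
    ]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return pipeline_id, access_token
```

With both variables set, `PIPELINE_ID, PIPELINE_ACCESS_TOKEN = load_credentials()` could stand in for the hard-coded assignments.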

3. Initialize GlassFlow Pipeline Client

In [44]:
client = glassflow.GlassFlowClient()
pipeline_client = client.pipeline_client(
    pipeline_id=PIPELINE_ID,
    pipeline_access_token=PIPELINE_ACCESS_TOKEN
)

4. Publish Data

Here's how you can publish data to your pipeline:

In [45]:
sample_event = {
    "event_type": "user_signup",
    "user_id": "12345",
    "timestamp": "2024-01-01T12:34:56Z"
}

response = pipeline_client.publish(sample_event)

if response.status_code == 200:
    print("Event published successfully")
else:
    print(f"Failed to publish event: {response.text}")


Event published successfully
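To publish more than one event, the same call can be wrapped in a small loop. The following is a sketch under stated assumptions: `make_signup_event` and `publish_all` are hypothetical helpers, not part of the SDK, and `publish_all` only assumes that the callable it receives returns an object with a `status_code` attribute, as the response in the cell above does.

```python
from datetime import datetime, timezone


def make_signup_event(user_id, now=None):
    """Build an event shaped like the sample event used in this guide."""
    now = now or datetime.now(timezone.utc)
    return {
        "event_type": "user_signup",
        "user_id": str(user_id),
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def publish_all(publish, events):
    """Call publish(event) for each event and return the ones that failed.

    An event counts as failed when the response status code is not 200.
    """
    failed = []
    for event in events:
        response = publish(event)
        if response.status_code != 200:
            failed.append(event)
    return failed
```

In the notebook this might be used as `failed = publish_all(pipeline_client.publish, [make_signup_event(i) for i in range(3)])`, after which `failed` holds any events worth retrying.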


### Step 3: Create a Data Consumer

Now, let's create a data consumer to retrieve and process events from your GlassFlow pipeline.

In [47]:
response = pipeline_client.consume()

if response.status_code == 200:
    event_data = response.json()
    print(f"Consumed event: {event_data}")
elif response.status_code == 204:
    print("No new events to consume.")
else:
    print(f"Failed to consume events: {response.text}")


Consumed event: {'event_type': 'user_signup', 'timestamp': '2024-01-01T12:34:56Z', 'user_id': '12345'}
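A single `consume()` call returns at most one event, so a consumer usually polls. The sketch below is one way to do that, not an SDK feature: `poll_events` is a hypothetical helper that assumes the callable it receives returns an object with a `status_code` attribute and a `json()` method, matching how the response is used in the cell above.

```python
import time


def poll_events(consume, max_polls=10, delay_seconds=1.0):
    """Call consume() up to max_polls times and collect the returned events.

    200 means an event was returned, 204 means nothing is available yet
    (wait, then try again), and any other status is treated as an error.
    """
    events = []
    for _ in range(max_polls):
        response = consume()
        if response.status_code == 200:
            events.append(response.json())
        elif response.status_code == 204:
            time.sleep(delay_seconds)
        else:
            raise RuntimeError(
                f"Consume failed with status {response.status_code}"
            )
    return events
```

Passing the bound method, as in `poll_events(pipeline_client.consume, max_polls=5)`, keeps the helper independent of the SDK and easy to exercise with a stub.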


Congratulations! You've set up a real-time pipeline using GlassFlow.

## Conclusion

In this guide you've learned the following:

- How to install the GlassFlow Python SDK.
- How to create a data pipeline using the GlassFlow Web App.
- How to publish data into the pipeline.
- How to consume data from the pipeline.