# Getting Started with GlassFlow

This Jupyter notebook is a hands-on guide to getting started with GlassFlow: you will **create your first pipeline** and interact with it via the [Python SDK](https://pysdk.docs.glassflow.dev/). By following the steps, you should be able to publish data to the new pipeline and consume data from it in your local environment.

## Prerequisites
Before starting, ensure you have:

- A GlassFlow account. [Sign up here](http://app.glassflow.dev/) if you don't have one.
- A personal access token from your GlassFlow account.
- Python 3.x installed on your system.
- [Pip](https://pip.pypa.io/en/stable/installation/) installed to manage project packages.

In [None]:
import glassflow
import os

In [None]:
# Please edit this variable with your own personal access token from https://app.glassflow.dev/profile

glassflow_personal_access_token = ""
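
Hard-coding a token in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming you export the token under an environment variable named `GLASSFLOW_PERSONAL_ACCESS_TOKEN` (a hypothetical name chosen for this example, not one defined by GlassFlow):

```python
import os

# Hypothetical variable name, used for illustration only.
TOKEN_ENV_VAR = "GLASSFLOW_PERSONAL_ACCESS_TOKEN"


def load_token(env_var=TOKEN_ENV_VAR, fallback=""):
    """Return the token from the environment, or `fallback` if it is unset."""
    return os.environ.get(env_var, fallback)


glassflow_personal_access_token = load_token()
if not glassflow_personal_access_token:
    print(f"No token found; set {TOKEN_ENV_VAR} or assign the token manually.")
```

With this approach the token never needs to appear in the notebook itself.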

## Creating a Pipeline
A pipeline is how you interact with GlassFlow. A pipeline consists of a **transform function** that GlassFlow runs automatically for every event entering the pipeline.
GlassFlow pipelines ingest JSON data and can do so from multiple sources. The transformed data can then be consumed from the pipeline or sent automatically to data sinks.  
In this guide, we set up a pipeline with a basic transform function, then send data to and receive data from the pipeline via the Python SDK.


### Define a basic echo transform function to use with GlassFlow

The transform function is a Python function that the GlassFlow pipeline executes for every event that enters the pipeline. It follows a basic structure in which a `handler` function serves as the entry point.

In [None]:
transform_function = """
import json
# Write a mandatory 'handler' function
def handler(data, log):
    data["transformed_by"] = "glassflow"
    return data
"""
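
Because the transform is plain Python, it can be exercised locally before it is handed to GlassFlow. The sketch below assumes the `handler(data, log)` signature shown above and passes a standard `logging` logger as a stand-in for `log`; the object GlassFlow actually supplies may differ.

```python
import logging

# Same transform source as above, repeated so this cell runs on its own.
transform_source = '''
def handler(data, log):
    data["transformed_by"] = "glassflow"
    return data
'''

# Execute the source in an isolated namespace and pull out the handler.
namespace = {}
exec(transform_source, namespace)
handler = namespace["handler"]

# Stand-in logger (an assumption; not necessarily what GlassFlow passes).
log = logging.getLogger("transform-test")

# Pass a copy so the sample event itself is left unmodified.
sample_event = {"name": "Ada", "email": "ada@example.com"}
result = handler(dict(sample_event), log)
print(result)
```

A quick local check like this catches syntax errors and obvious logic mistakes before the code is deployed to a pipeline.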

In [None]:
client = glassflow.GlassFlowClient(personal_access_token=glassflow_personal_access_token)

In [None]:
# Create a space. Spaces are a way to organize pipelines within GlassFlow.

In [None]:
example_space = client.create_space(name="examples")

In [None]:
# Create a pipeline. A pipeline needs a name, Python transformation code, and a space where it should live.

In [None]:
pipeline = client.create_pipeline(
    name="echo-pipeline",
    transformation_code=transform_function,
    space_id=example_space.id,
)

In [None]:
# Show the ID of the created pipeline

In [None]:
print(pipeline.id)

### Send events to the pipeline

In [None]:
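# Create a random data generator to produce dummy event data
from faker import Faker

fake = Faker()  # reuse one generator instead of building one per event


def random_datagen():
    """Return one fake event with a name, an email, and a unique id."""
    return {
        "name": fake.name(),
        "email": fake.email(),
        "id": fake.uuid4()
    }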

Each pipeline provides a source for publishing data to the pipeline and a sink for consuming data from it. We will use these to send and receive events.

In [None]:
data_source = pipeline.get_source()

In [None]:
# Generate some data and send it to the pipeline. Keep a local copy to compare with the output later.
input_data = []
for _ in range(100):
    d = random_datagen()
    input_data.append(d)
    data_source.publish(d)

In [None]:
display(input_data)

### Consume events from the pipeline 
Get the sink to consume the transformed events from the pipeline.

In [None]:
data_sink = pipeline.get_sink()

In [None]:
output_data = []
for i in range(100):
    resp = data_sink.consume()
    output_data.append(resp.json())

In [None]:
display(output_data)
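
To confirm the round trip, each consumed event can be compared with the event that was published. A minimal sketch, assuming consumed events are plain dictionaries that still carry the original `id` field; the helper `find_mismatches` and the sample data are illustrative, and matching by `id` avoids relying on delivery order:

```python
def find_mismatches(published, consumed):
    """Return ids of published events with no correctly transformed counterpart."""
    consumed_by_id = {event["id"]: event for event in consumed}
    mismatched = []
    for event in published:
        # The echo transform only adds the "transformed_by" field.
        expected = {**event, "transformed_by": "glassflow"}
        if consumed_by_id.get(event["id"]) != expected:
            mismatched.append(event["id"])
    return mismatched


# Illustrative data; in the notebook, pass input_data and output_data instead.
sample_in = [{"id": "1", "name": "Ada"}, {"id": "2", "name": "Bob"}]
sample_out = [
    {"id": "2", "name": "Bob", "transformed_by": "glassflow"},
    {"id": "1", "name": "Ada", "transformed_by": "glassflow"},
]
print(find_mismatches(sample_in, sample_out))
```

An empty list means every published event came back with the `transformed_by` field added and nothing else changed.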

In [None]:
# Explore the pipeline in the web UI
pipeline_url = f"https://app.glassflow.dev/pipelines/{pipeline.id}"
print(pipeline_url)

## Conclusion 

**Congratulations!** You have now set up an event-driven data pipeline entirely in Python using GlassFlow.
In this getting-started guide, we:
- Set up a new pipeline with a basic transform function
- Published events to the pipeline
- (GlassFlow executed the transform function in real time on the published events)
- Consumed the transformed events back from the pipeline

As a next step, you can try more examples in the `examples` directory of this repository, or explore the [GlassFlow web app](https://app.glassflow.dev) to set up pipelines.

GlassFlow also has managed connectors to several data sources and sinks. You can look at the supported connectors at [Docs Link]() or try an example with the `clickhouse` managed sink at `examples/clickhouse-sink`.