# Data Enrichment - Geo Location Data

This example demonstrates how to use GlassFlow to enrich event data by calling a geolocation API during the transformation stage.

## Pre-requisites

- Create your free GlassFlow account via the [GlassFlow WebApp](https://app.glassflow.dev).
- Get your [Personal Access Token](https://app.glassflow.dev/profile) to authorize the Python SDK to interact with GlassFlow Cloud.
- Get your Geoapify API key from [Geoapify](https://myprojects.geoapify.com/). The transformation stage uses this key to fetch geolocation data.



In [None]:
%pip install "glassflow>=2.0.5" pandas Faker

In [None]:
import glassflow

In [None]:
# Fill in your credentials
personal_access_token = ""
GEOAPI_KEY = ""


## Create Pipeline

In [None]:
client = glassflow.GlassFlowClient(
    personal_access_token=personal_access_token
)

In [None]:
# Get the space named "examples" (or create one if no space is found)
list_spaces = client.list_spaces()

space_name = "examples"
for s in list_spaces.spaces:
    if s["name"] == space_name:
        space = glassflow.Space(
            personal_access_token=client.personal_access_token,
            id=s["id"], 
            name=s["name"]
        )
        break
else:
    space = client.create_space(name=space_name)

print(f"Space \"{space.name}\" with ID: {space.id}")

### Transformation Function

In [None]:
%pycat transform.py
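
The contents of `transform.py` are not reproduced here, so the following is only a minimal sketch of what such an enrichment function might look like. It assumes that GlassFlow invokes a `handler(data, log)` function once per event, that `log` behaves like a standard Python logger, and that the Geoapify geocoding endpoint returns a GeoJSON body whose first feature carries `lat` and `lon` properties. The helper names `fetch_json` and `geocode` and the output fields `latitude` and `longitude` are hypothetical.

```python
import json
import os
import urllib.parse
import urllib.request

# Geoapify forward-geocoding endpoint (assumed; check the Geoapify docs).
GEOCODE_URL = "https://api.geoapify.com/v1/geocode/search"


def fetch_json(url):
    """Fetch a URL and decode the JSON response body (standard library only)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def geocode(address, api_key, fetch=fetch_json):
    """Return (lat, lon) for an address, or (None, None) if nothing matched.

    `fetch` is injectable so the lookup can be exercised without network access.
    """
    query = urllib.parse.urlencode({"text": address, "limit": 1, "apiKey": api_key})
    body = fetch(f"{GEOCODE_URL}?{query}")
    features = body.get("features") or []
    if not features:
        return None, None
    props = features[0].get("properties", {})
    return props.get("lat"), props.get("lon")


def handler(data, log):
    """Assumed GlassFlow entry point: receives one event, returns the enriched event."""
    # GEOAPI_KEY is expected to be supplied through the pipeline's env vars.
    api_key = os.environ.get("GEOAPI_KEY", "")
    try:
        lat, lon = geocode(data["address"], api_key)
    except Exception as exc:
        # Keep the event flowing even if the lookup fails.
        log.warning(f"Geocoding failed: {exc}")
        lat, lon = None, None
    return {**data, "latitude": lat, "longitude": lon}
```

Passing the HTTP call in as a parameter keeps `geocode` easy to test locally with a stub before the function is deployed to a pipeline.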

### Environment variables needed for the transformation

In [None]:
env_vars = [{
  "name": "GEOAPI_KEY",
  "value": GEOAPI_KEY
}]

### requirements.txt

In [None]:
with open("requirements.txt") as f:
    requirements_txt = f.read()
display(requirements_txt)

### Create Pipeline

In [None]:
pipeline_name = "data_enrichment-example"

pipeline = client.create_pipeline(
    name=pipeline_name, 
    transformation_file='transform.py',
    space_id=space.id, 
    env_vars=env_vars, 
    requirements=requirements_txt
)
print("Pipeline ID:", pipeline.id)

## Produce data and send it to your pipeline

### Create a dummy data generator using the Python Faker library

In [None]:
from faker import Faker

def geo_data_generator():
    fake = Faker()
    return {
        'address': fake.address(),
        'source': 'example-pipeline'
    }

In [None]:
### Get the pipeline's data source object to publish events to the pipeline

In [None]:
data_source = pipeline.get_source()

In [None]:
# Generate some data and send it to the pipeline. Store it locally to compare
n_events = 10
input_events = []
for i in range(n_events):
    event = geo_data_generator()
    input_events.append(event)
    data_source.publish(event)

In [None]:
## Display the data sent to the pipeline

In [None]:
import pandas as pd

display(pd.DataFrame(input_events))

## Consume events from the pipeline 

Get the pipeline's data sink to consume the transformed events.

In [None]:
data_sink = pipeline.get_sink()

In [None]:
output_events = []
while True:
    resp = data_sink.consume()
    if resp.status_code == 200:
        output_events.append(resp.json())
    if len(output_events) == n_events:
        # all events have been consumed
        break
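
The loop above runs until exactly `n_events` events have arrived, so it never returns if an event is lost or the transformation fails. A bounded variant could look like the sketch below. It relies only on the `consume()`, `status_code` and `json()` calls already used in this notebook; the helper name `consume_events` and its `timeout_s` and `poll_interval_s` parameters are hypothetical.

```python
import time


def consume_events(sink, expected, timeout_s=60.0, poll_interval_s=0.5):
    """Poll sink.consume() until `expected` events arrive or the deadline passes."""
    events = []
    deadline = time.monotonic() + timeout_s
    while len(events) < expected and time.monotonic() < deadline:
        resp = sink.consume()
        if resp.status_code == 200:
            events.append(resp.json())
        else:
            # Nothing ready yet: back off briefly instead of spinning.
            time.sleep(poll_interval_s)
    return events
```

It could then be called as `output_events = consume_events(data_sink, n_events)`, and a result shorter than `n_events` would indicate that some events did not come through in time.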

In [None]:
import pandas as pd

display(pd.DataFrame(output_events))

## Explore the pipeline in the web UI


In [None]:
pipeline_url = f"https://app.glassflow.dev/pipelines/{pipeline.id}"
print(pipeline_url)