<a href="https://colab.research.google.com/github/JotaBlanco/QuixStreamsNotebooks/blob/main/Conferences/DSF/Quix_Streams_PROCESS_CHAT.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Install Quix Streams
Install the Quix Streams library with pip.

[Quix Streams](https://github.com/quixio/quix-streams) is an open-source Python library for processing streaming data. It's aimed at people who work with time-series data streams, from developers and ML engineers to data scientists and data engineers.

In [1]:
! pip install quixstreams

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting quixstreams
  Downloading quixstreams-0.5.3-py3-none-manylinux2014_x86_64.whl (30.3 MB)
Collecting Deprecated<2,>=1.1 (from quixstreams)
  Downloading Deprecated-1.2.13-py2.py3-none-any.whl (9.6 kB)
Installing collected packages: Deprecated, quixstreams
Successfully installed Deprecated-1.2.13 quixstreams-0.5.3
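
The install log above shows version 0.5.3, and the calls used in the rest of this notebook (`QuixStreamingClient`, `get_topic_consumer`, and so on) are written against that release. Later major versions of the library are understood to expose a different API, so it may be worth pinning the install (`pip install quixstreams==0.5.3`) and checking which version is present. A minimal sketch of such a check, using only the standard library; the helper names `installed_version` and `is_expected_series` are illustrative, not part of Quix Streams:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_expected_series(version: Optional[str], prefix: str = "0.5.") -> bool:
    """True if the version string belongs to the release series this notebook assumes."""
    return version is not None and version.startswith(prefix)


version = installed_version("quixstreams")
if is_expected_series(version):
    print(f"quixstreams {version} matches the 0.5.x API used here")
else:
    print(f"quixstreams version is {version}; the cells below assume 0.5.x")
```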


# Import the libraries
We will mainly use pandas and the Quix Streams library (imported as `qx`).

In [2]:
import pandas as pd
import quixstreams as qx

# 1 - Create client
Let's start by creating a Quix client that we'll use to publish and subscribe to Kafka topics.

In [3]:
# Authenticate with a Quix-managed SDK token (a self-hosted Kafka cluster could be used instead)
token = 'sdk-296f2b9decff4770a525ff7d8855a78d'
client = qx.QuixStreamingClient(token)
client

<quixstreams.quixstreamingclient.QuixStreamingClient at 0x7f2e5de11d80>
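
The cell above hard-codes the SDK token, which is convenient for a demo but easy to leak when a notebook is shared. One possible alternative, sketched here under the assumption that the token has been exported in an environment variable, is to read it at runtime. The variable name `QUIX_SDK_TOKEN` and the helper `load_token` are hypothetical choices for this sketch, not names defined by Quix Streams:

```python
import os


def load_token(env_var: str = "QUIX_SDK_TOKEN") -> str:
    """Read the SDK token from an environment variable (hypothetical name).

    Raises RuntimeError if the variable is missing or blank, so the failure
    happens here rather than later, when the client tries to connect.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your SDK token")
    return token


# In the notebook this would replace the hard-coded string:
#   client = qx.QuixStreamingClient(load_token())
```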

# 2 - Clients
Create a consumer for the `chat-messages` topic, a producer for the `chat-messages-enriched` topic, and an output stream to publish to.

In [4]:
topic_name = "chat-messages"
topic_consumer = client.get_topic_consumer(topic_name)
topic_consumer

<quixstreams.topicconsumer.TopicConsumer at 0x7f2e5ed36350>

In [5]:
topic_name = "chat-messages-enriched"
topic_producer = client.get_topic_producer(topic_name)
topic_producer

<quixstreams.topicproducer.TopicProducer at 0x7f2e38999de0>

In [6]:
stream_id = "dsf"
stream_out = topic_producer.get_or_create_stream(stream_id)
stream_out

<quixstreams.streamproducer.StreamProducer at 0x7f2362242b60>

# 3 - Listen to some data
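Register a handler that displays every DataFrame received on the `chat-messages` topic. `qx.App.run()` blocks and keeps listening until the cell is interrupted.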

In [7]:
def on_stream_received_handler(stream_received: qx.StreamConsumer):
  # Called once per new stream: subscribe to its time-series data as DataFrames
  stream_received.timeseries.on_dataframe_received = on_timeseries_data_received_handler

def on_timeseries_data_received_handler(stream: qx.StreamConsumer, df_i: pd.DataFrame):
  # Called for each batch of time-series data; render it in the notebook
  display(df_i)

topic_consumer = client.get_topic_consumer("chat-messages")
topic_consumer.on_stream_received = on_stream_received_handler
qx.App.run()  # Blocks until the cell is interrupted

,timestamp,chat-message,TAG__room,TAG__role,TAG__name,TAG__phone,TAG__email
0,1684541260682000000,hi,dsf,Customer,Javi,,

,timestamp,chat-message,TAG__room,TAG__role,TAG__name,TAG__phone,TAG__email
0,1684541325710000000,hola,dsf,Customer,Javi,,
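
In the output above, the `timestamp` column holds nanoseconds since the Unix epoch, and the `TAG__` prefix marks tag columns (metadata such as room, role, and name) as opposed to the `chat-message` value itself. The sketch below shows one way to interpret a single row. It uses a plain dict and the standard library so it runs without a live topic; `ns_to_datetime` and `split_row` are illustrative helpers, not Quix Streams functions:

```python
from datetime import datetime, timedelta, timezone

TAG_PREFIX = "TAG__"


def ns_to_datetime(ns: int) -> datetime:
    """Convert nanoseconds since the Unix epoch to a UTC datetime (microsecond precision)."""
    seconds, nanos = divmod(ns, 1_000_000_000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(microseconds=nanos // 1000)


def split_row(row: dict):
    """Separate a row into (timestamp, values, tags), stripping the tag prefix."""
    when = ns_to_datetime(row["timestamp"])
    tags = {k[len(TAG_PREFIX):]: v for k, v in row.items() if k.startswith(TAG_PREFIX)}
    values = {k: v for k, v in row.items() if k != "timestamp" and not k.startswith(TAG_PREFIX)}
    return when, values, tags


# First row of the output above, written as a dict (empty tags omitted)
row = {
    "timestamp": 1684541260682000000,
    "chat-message": "hi",
    "TAG__room": "dsf",
    "TAG__role": "Customer",
    "TAG__name": "Javi",
}
when, values, tags = split_row(row)
print(when.isoformat())  # 2023-05-20T00:07:40.682000+00:00
print(values)            # {'chat-message': 'hi'}
print(tags)              # {'room': 'dsf', 'role': 'Customer', 'name': 'Javi'}
```

Inside the notebook, `pd.to_datetime(df["timestamp"], unit="ns")` should give the same conversion for a whole column at once.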


# 4 - Process data
Convert each incoming chat message to upper case and publish the result to the `chat-messages-enriched` topic.

In [8]:
def on_stream_received_handler(stream_received: qx.StreamConsumer):
  # Called once per new stream: subscribe to its time-series data as DataFrames
  stream_received.timeseries.on_dataframe_received = on_timeseries_data_received_handler

def on_timeseries_data_received_handler(stream: qx.StreamConsumer, df: pd.DataFrame):
  # Enrich: convert every chat message to upper case
  df["chat-message"] = df["chat-message"].str.upper()

  # Publish the enriched DataFrame to the "dsf" stream of the output topic
  stream_out = topic_producer.get_or_create_stream("dsf")
  stream_out.timeseries.publish(df)

topic_consumer = client.get_topic_consumer("chat-messages")
topic_producer = client.get_topic_producer("chat-messages-enriched")
topic_consumer.on_stream_received = on_stream_received_handler
qx.App.run()  # Blocks until the cell is interrupted
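
The handler in this cell both transforms the data and publishes it, which makes the transformation hard to check without a live topic. One possible arrangement, sketched below, keeps the transformation in a pure function and passes the publishing step in as a callback. To stay runnable without pandas or a broker, the sketch uses plain dicts as stand-ins for DataFrame rows; `enrich_row`, `make_handler`, and the `publish` callback are illustrative names, not part of Quix Streams:

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def enrich_row(row: Row) -> Row:
    """Return a copy of the row with the chat message upper-cased (input is not mutated)."""
    enriched = dict(row)
    enriched["chat-message"] = str(row["chat-message"]).upper()
    return enriched


def make_handler(publish: Callable[[List[Row]], None]) -> Callable[[List[Row]], None]:
    """Build a handler that enriches incoming rows and hands them to `publish`."""
    def handler(rows: List[Row]) -> None:
        publish([enrich_row(r) for r in rows])
    return handler


# Stand-in for the output stream: collect published batches in a list
published: List[List[Row]] = []
handler = make_handler(published.append)

incoming = [{"timestamp": 1684541260682000000, "chat-message": "hi", "TAG__name": "Javi"}]
handler(incoming)

print(published[0][0]["chat-message"])  # HI
print(incoming[0]["chat-message"])      # hi (original row unchanged)
```

In the notebook, the `publish` callback would wrap `stream_out.timeseries.publish`, and the pure step could be written with pandas as `df.assign(**{"chat-message": df["chat-message"].str.upper()})`, which returns a new DataFrame instead of modifying the received one.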