<a href="https://colab.research.google.com/github/JotaBlanco/QuixStreamsNotebooks/blob/main/Tutorials/Quix_Streams_SUB.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Install Quix Streams
Use pip to install the Quix Streams library.

[Quix Streams](https://github.com/quixio/quix-streams) is an open source Python library for processing streaming data. It’s aimed at people who work with time-series data streams — from developers and ML engineers to data scientists and data engineers.

In [None]:
! pip install quixstreams

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting quixstreams
  Downloading quixstreams-0.5.0-py3-none-manylinux2014_x86_64.whl (47.8 MB)
Collecting Deprecated<2,>=1.1
  Downloading Deprecated-1.2.13-py2.py3-none-any.whl (9.6 kB)
Installing collected packages: Deprecated, quixstreams
Successfully installed Deprecated-1.2.13 quixstreams-0.5.0


# Import the libraries
We will use pandas and Quix Streams (imported as qx).

In [None]:
import pandas as pd
import quixstreams as qx

# 1 - Create client
Let's start by creating a Quix client that we'll use to publish and subscribe to Kafka topics.

In [None]:
# SDK token for the Quix workspace (replace it with your own)
token = 'sdk-296f2b9decff4770a525ff7d8855a78d'
client = qx.QuixStreamingClient(token)
# client.api_url = "https://portal-api.dev.quix.ai"
client

<quixstreams.quixstreamingclient.QuixStreamingClient at 0x7ff4b73559a0>
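The cell above hardcodes the SDK token in the notebook. A minimal sketch of one alternative is to read it from an environment variable. The variable name `QUIX_SDK_TOKEN` and the helper `load_sdk_token` are hypothetical names chosen for this illustration and are not part of Quix Streams.

```python
import os
from typing import Mapping


def load_sdk_token(env: Mapping[str, str] = os.environ,
                   var_name: str = "QUIX_SDK_TOKEN") -> str:
    """Return the token stored in `var_name`, or fail with a clear message.

    Both the helper and the variable name are hypothetical; use whatever
    name your own environment defines.
    """
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return token


# Demonstration with an in-memory mapping instead of the real environment
demo_token = load_sdk_token({"QUIX_SDK_TOKEN": "sdk-example"})
print(demo_token)
```

The returned string could then be passed to `qx.QuixStreamingClient(...)` in place of the literal token.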

# 2 - Consumer client
To subscribe to data from a topic, we need to create a consumer client pointing to that topic.

In [None]:
topic_name = "test-topic"
topic_consumer = client.get_topic_consumer(topic_name)
topic_consumer

<quixstreams.topicconsumer.TopicConsumer at 0x7ff4b7355040>

# 3 - Subscribing to topics
Once you have the TopicConsumer instance you can start receiving data. These are the steps:

## 3.1 - Subscribing to streams
The TopicConsumer invokes the callback you assign to on_stream_received each time a new stream arrives on the topic.

In [None]:
def on_stream_received_handler(stream_received: qx.StreamConsumer):
  """Callback invoked each time a new stream is received."""
  print("New stream just received: " + stream_received.stream_id)

topic_consumer = client.get_topic_consumer(topic_name)
topic_consumer.on_stream_received = on_stream_received_handler
qx.App.run()
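The callback-assignment pattern used above can be tried without a Kafka connection. The sketch below uses `FakeStream` and `FakeTopicConsumer`, hypothetical stand-ins written only for this illustration. They are not Quix Streams classes and only imitate the behaviour described above: the handler fires once per new stream, not once per message.

```python
from typing import Callable, List, Optional, Set


class FakeStream:
    """Hypothetical stand-in for a received stream; it only carries an id."""

    def __init__(self, stream_id: str) -> None:
        self.stream_id = stream_id


class FakeTopicConsumer:
    """Hypothetical stand-in that imitates assigning a stream callback."""

    def __init__(self) -> None:
        # Same attribute name as in the notebook, but this is not the Quix class
        self.on_stream_received: Optional[Callable[[FakeStream], None]] = None
        self._seen: Set[str] = set()

    def feed(self, stream_id: str) -> None:
        """Simulate an incoming message; fire the callback for unseen streams only."""
        if stream_id in self._seen:
            return
        self._seen.add(stream_id)
        if self.on_stream_received is not None:
            self.on_stream_received(FakeStream(stream_id))


received: List[str] = []


def on_stream_received_handler(stream: FakeStream) -> None:
    received.append(stream.stream_id)


consumer = FakeTopicConsumer()
consumer.on_stream_received = on_stream_received_handler
for stream_id in ["room-a", "room-a", "room-b"]:
    consumer.feed(stream_id)

print(received)  # ['room-a', 'room-b']
```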

## 3.2 - Subscribing to Timeseries data
You can subscribe to time-series data from a stream by assigning a handler to the on_data_received callback of the StreamConsumer's timeseries property.

### 3.2.1 - qx.TimeseriesData
This is how you read the data in the standard TimeseriesData format:

In [None]:
def on_stream_received_handler(stream_received: qx.StreamConsumer):
  # Register the data callback on every new stream
  stream_received.timeseries.on_data_received = on_timeseries_data_received_handler

def on_timeseries_data_received_handler(stream: qx.StreamConsumer, data: qx.TimeseriesData):
  print("Data from stream " + stream.stream_id)
  # Use data as a context manager, as in the original cell
  with data:
    print(data)

topic_consumer = client.get_topic_consumer(topic_name)
topic_consumer.on_stream_received = on_stream_received_handler
qx.App.run()

### 3.2.2 - pd.DataFrame
This is how you read the data as a pandas DataFrame:

In [None]:
def on_stream_received_handler(stream_received: qx.StreamConsumer):
  stream_received.timeseries.on_dataframe_received = on_timeseries_data_received_handler

def on_timeseries_data_received_handler(stream: qx.StreamConsumer, df: pd.DataFrame):
  print("Data from stream " + stream.stream_id)
  display(df)

topic_consumer = client.get_topic_consumer(topic_name)
topic_consumer.on_stream_received = on_stream_received_handler
qx.App.run()

In [None]:
# Accumulates every DataFrame received across callbacks
df = pd.DataFrame()

def on_stream_received_handler(stream_received: qx.StreamConsumer):
  stream_received.timeseries.on_dataframe_received = on_timeseries_data_received_handler

def on_timeseries_data_received_handler(stream: qx.StreamConsumer, df_i: pd.DataFrame):
  global df
  # DataFrame.append was removed in pandas 2.0; pd.concat keeps the same row index
  df = pd.concat([df, df_i])
  print("Data from stream " + stream.stream_id)
  display(df_i)

topic_name = "chat-messages-enriched"
topic_consumer = client.get_topic_consumer(topic_name)
topic_consumer.on_stream_received = on_stream_received_handler
qx.App.run()

Data from stream javi test


Unnamed: 0,timestamp,score,sentiment,average_sentiment,chat-message,label,TAG__room,TAG__role,TAG__name,TAG__phone,TAG__email
0,1678611430219000000,0.96304,-0.96304,0.068536,hola hola pepsicola,NEGATIVE,javi test,Customer,Javier,,


Data from stream javi test


Unnamed: 0,timestamp,score,sentiment,average_sentiment,chat-message,label,TAG__room,TAG__role,TAG__name,TAG__phone,TAG__email
0,1678611436148000000,0.99628,0.99628,0.134804,buenas noches señora,POSITIVE,javi test,Customer,Javier,,


Data from stream javi test


Unnamed: 0,timestamp,score,sentiment,average_sentiment,chat-message,label,TAG__room,TAG__role,TAG__name,TAG__phone,TAG__email
0,1678611441609000000,0.744404,-0.744404,0.07619,buenas noches señoooooooooora,NEGATIVE,javi test,Customer,Javier,,


Data from stream javi test


Unnamed: 0,timestamp,score,sentiment,average_sentiment,chat-message,label,TAG__room,TAG__role,TAG__name,TAG__phone,TAG__email
0,1678611467183000000,0.998928,0.998928,0.133861,buenas noches señoraa!,POSITIVE,javi test,Customer,Javier,,


Data from stream javi test


Unnamed: 0,timestamp,score,sentiment,average_sentiment,chat-message,label,TAG__room,TAG__role,TAG__name,TAG__phone,TAG__email
0,1678611473567000000,0.989889,0.989889,0.184216,buenas noches señoooooooooooooora!!!!!!!!!,POSITIVE,javi test,Customer,Javier,,


In [None]:
df

Unnamed: 0,timestamp,score,sentiment,average_sentiment,chat-message,label,TAG__room,TAG__role,TAG__name,TAG__phone,TAG__email
0,1678611430219000000,0.96304,-0.96304,0.068536,hola hola pepsicola,NEGATIVE,javi test,Customer,Javier,,
0,1678611436148000000,0.99628,0.99628,0.134804,buenas noches señora,POSITIVE,javi test,Customer,Javier,,
0,1678611441609000000,0.744404,-0.744404,0.07619,buenas noches señoooooooooora,NEGATIVE,javi test,Customer,Javier,,
0,1678611467183000000,0.998928,0.998928,0.133861,buenas noches señoraa!,POSITIVE,javi test,Customer,Javier,,
0,1678611473567000000,0.989889,0.989889,0.184216,buenas noches señoooooooooooooora!!!!!!!!!,POSITIVE,javi test,Customer,Javier,,
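The timestamp column in the output above holds large integers. They appear to be nanoseconds since the Unix epoch, which is an assumption inferred from their magnitude and from the dates shown elsewhere in this notebook. Under that assumption, a minimal standard-library sketch for turning one value into a readable UTC datetime is shown below. `ns_to_utc` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def ns_to_utc(ns: int) -> datetime:
    """Convert integer nanoseconds since the Unix epoch to an aware UTC datetime."""
    seconds, nanos = divmod(ns, 1_000_000_000)
    # datetime stores microseconds only, so sub-microsecond digits are dropped
    return (datetime.fromtimestamp(seconds, tz=timezone.utc)
            + timedelta(microseconds=nanos // 1_000))


first_message = ns_to_utc(1678611430219000000)
print(first_message.isoformat())  # 2023-03-12T08:57:10.219000+00:00
```

For a whole column, `pd.to_datetime(df["timestamp"], unit="ns")` performs the same conversion in pandas.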


In [None]:
# Work on a copy so the original df keeps its integer timestamps
df_2 = df.copy()
df_2["timestamp"] = [pd.Timestamp.now() for _ in df_2["timestamp"]]
df_2

Unnamed: 0,timestamp,score,sentiment,average_sentiment,chat-message,label,TAG__room,TAG__role,TAG__name,TAG__phone,TAG__email
0,2023-03-12 09:15:38.079699,0.96304,-0.96304,0.068536,hola hola pepsicola,NEGATIVE,javi test,Customer,Javier,,
0,2023-03-12 09:15:38.079918,0.99628,0.99628,0.134804,buenas noches señora,POSITIVE,javi test,Customer,Javier,,
0,2023-03-12 09:15:38.079929,0.744404,-0.744404,0.07619,buenas noches señoooooooooora,NEGATIVE,javi test,Customer,Javier,,
0,2023-03-12 09:15:38.079934,0.998928,0.998928,0.133861,buenas noches señoraa!,POSITIVE,javi test,Customer,Javier,,
0,2023-03-12 09:15:38.079939,0.989889,0.989889,0.184216,buenas noches señoooooooooooooora!!!!!!!!!,POSITIVE,javi test,Customer,Javier,,


In [None]:
df = pd.DataFrame({
    "timestamp": [pd.Timestamp.now()],
    "score": [0.8],
    "sentiment": [-0.8],
    "average_sentiment": [0.5],
    "chat-message": ["qué tal guapis, como esta la peñita"],
    "TAG__name": ["Bot on behalf of Javi"]
})


# Publish the DataFrame to the "javi test" stream of the chat-messages topic
topic_name = "chat-messages"
topic_producer = client.get_topic_producer(topic_name)
stream_chat = topic_producer.get_or_create_stream("javi test")
stream_chat.timeseries.publish(df)
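Several columns in this notebook, such as `TAG__name` in the DataFrame published above, carry a `TAG__` prefix. Assuming that prefix marks tag columns while the remaining non-timestamp columns are parameter values, the sketch below separates the two groups before publishing. `split_columns` is a hypothetical helper and not part of Quix Streams.

```python
from typing import List, Sequence, Tuple

TAG_PREFIX = "TAG__"


def split_columns(columns: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Split column names into (tag names without prefix, value columns).

    Assumes that tag columns start with TAG__ and that "timestamp" is the
    time index, so it belongs to neither group.
    """
    tags = [c[len(TAG_PREFIX):] for c in columns if c.startswith(TAG_PREFIX)]
    values = [c for c in columns
              if not c.startswith(TAG_PREFIX) and c != "timestamp"]
    return tags, values


columns = ["timestamp", "score", "sentiment", "average_sentiment",
           "chat-message", "TAG__name"]
tags, values = split_columns(columns)
print(tags)    # ['name']
print(values)  # ['score', 'sentiment', 'average_sentiment', 'chat-message']
```

With a real DataFrame, the same helper could be called as `split_columns(list(df.columns))`.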