<a href="https://colab.research.google.com/github/JotaBlanco/QuixStreamsNotebooks/blob/main/Conferences/BerlinTimeseriesMeetup/Quix_Streams_SUB.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Install Quix Streams
Use pip to install the Quix Streams library.

[Quix Streams](https://github.com/quixio/quix-streams) is an open source Python library for processing streaming data. It’s aimed at people who work with time-series data streams — from developers and ML engineers to data scientists and data engineers.

In [1]:
! pip install quixstreams

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting quixstreams
  Downloading quixstreams-0.5.0-py3-none-manylinux2014_x86_64.whl (47.8 MB)
Collecting Deprecated<2,>=1.1
  Downloading Deprecated-1.2.13-py2.py3-none-any.whl (9.6 kB)
Installing collected packages: Deprecated, quixstreams
Successfully installed Deprecated-1.2.13 quixstreams-0.5.0


# Import the libraries
We will mainly use pandas and Quix Streams.

In [2]:
import pandas as pd
import quixstreams as qx

# 1 - Create client
Let's start by creating a Quix client that we'll use to publish and subscribe to Kafka topics.

In [3]:
token = 'sdk-296f2b9decff4770a525ff7d8855a78d'
client = qx.QuixStreamingClient(token)
# client.api_url = "https://portal-api.dev.quix.ai"
client

<quixstreams.quixstreamingclient.QuixStreamingClient at 0x7fc999052fd0>
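
A token pasted directly into a notebook is easy to leak when the notebook is shared. One possible alternative, sketched here with the standard library only, is to read it from an environment variable. The variable name `QUIX_SDK_TOKEN` and the helper `load_token` are hypothetical and not part of Quix Streams.

```python
import os
from typing import Mapping, Optional


def load_token(env: Optional[Mapping[str, str]] = None,
               var_name: str = "QUIX_SDK_TOKEN") -> str:
    """Return the token stored under `var_name`, or raise if it is missing or blank."""
    # Hypothetical helper: falls back to the real process environment by default.
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return token


# Demo with an explicit mapping so the sketch runs without touching the real environment.
demo_token = load_token({"QUIX_SDK_TOKEN": "sdk-example"})
print(demo_token)
```

The returned string would then be passed to `qx.QuixStreamingClient(token)` in place of the literal.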

# 2 - Consumer client
To subscribe to data from a topic, we need to create a consumer client pointing to that topic.

In [4]:
topic_name = "test-topic"
topic_consumer = client.get_topic_consumer(topic_name)
topic_consumer

<quixstreams.topicconsumer.TopicConsumer at 0x7fc96b8eaf40>

# 3 - Subscribing to topics
Once you have the TopicConsumer instance you can start receiving data. These are the steps:

## 3.1 - Subscribing to streams
The TopicConsumer invokes the callback you assign to on_stream_received once for every new stream it receives.

In [5]:
def on_stream_received_handler(stream_received: qx.StreamConsumer):
  """
  My callback to new streams received is defined here
  """
  print("New stream just received: " + stream_received.stream_id)

topic_consumer = client.get_topic_consumer(topic_name)
topic_consumer.on_stream_received = on_stream_received_handler
qx.App.run()

New stream just received:test-stream_1
New stream just received:test-stream_2
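
To make the callback mechanism concrete, here is a toy stand-in for the consumer, written with the standard library only. `ToyStream`, `ToyTopicConsumer` and `feed` are hypothetical names for illustration and do not reflect how Quix Streams is implemented. The sketch only shows the idea that a handler assigned to `on_stream_received` runs once per new stream ID.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class ToyStream:
    stream_id: str


@dataclass
class ToyTopicConsumer:
    # Handler to call for each new stream; None means nobody is listening.
    on_stream_received: Optional[Callable[[ToyStream], None]] = None
    _seen: Set[str] = field(default_factory=set)

    def feed(self, stream_id: str) -> None:
        """Simulate a message arriving; fire the callback only for unseen streams."""
        if stream_id in self._seen:
            return
        self._seen.add(stream_id)
        if self.on_stream_received is not None:
            self.on_stream_received(ToyStream(stream_id))


received: List[str] = []


def on_stream_received_handler(stream: ToyStream) -> None:
    received.append(stream.stream_id)
    print("New stream just received: " + stream.stream_id)


consumer = ToyTopicConsumer()
consumer.on_stream_received = on_stream_received_handler
for sid in ["test-stream_1", "test-stream_2", "test-stream_1"]:
    consumer.feed(sid)
```

The third message belongs to a stream that was already seen, so the handler fires only twice.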


## 3.2 - Subscribing to Timeseries data
You can subscribe to time-series data from streams using the on_data_received callback of the StreamConsumer instance.

### 3.2.1 - qx.TimeseriesData
This is how you read the data in the standard TimeseriesData format:

In [6]:
def on_stream_received_handler(stream_received: qx.StreamConsumer):
  stream_received.timeseries.on_data_received = on_timeseries_data_received_handler

def on_timeseries_data_received_handler(stream: qx.StreamConsumer, data: qx.TimeseriesData):
  print("Data from stream " + stream.stream_id)
  with data:
    print(data)

topic_consumer = client.get_topic_consumer(topic_name)
topic_consumer.on_stream_received = on_stream_received_handler
qx.App.run()

Data from stream test-stream_2
  Length:1
    Time:1678726201999970048
      Tags: {}
      Params:
        Param A: 16.0
        Param B: 1.0
Data from stream test-stream_2
  Length:1
    Time:1678726201999970048
      Tags: {}
      Params:
        Param A: 16.0
        Param B: 1.0
Data from stream test-stream_2
  Length:1
    Time:1678726201999970048
      Tags: {}
      Params:
        Param A: 16.0
        Param B: 1.0
Data from stream test-stream_2
  Length:1
    Time:1678726201999970048
      Tags: {}
      Params:
        Param A: 16.0
        Param B: 1.0
Data from stream test-stream_2
  Length:1
    Time:1678726201999970048
      Tags: {}
      Params:
        Param A: 16.0
        Param B: 1.0
Data from stream test-stream_1
  Length:1
    Time:1678726201999970048
      Tags: {}
      Params:
        Param A: 16.0
        Param B: 1.0
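
The `Time` values printed above have the magnitude of nanoseconds since the Unix epoch. Assuming that interpretation (the notebook itself does not state the unit), a standard-library sketch for turning one into a readable UTC datetime could look like this. `ns_to_datetime` is a hypothetical helper, not a Quix Streams function.

```python
from datetime import datetime, timedelta, timezone


def ns_to_datetime(ts_ns: int) -> datetime:
    """Convert nanoseconds since the Unix epoch to a UTC datetime (microsecond precision)."""
    seconds, remainder_ns = divmod(ts_ns, 1_000_000_000)
    # Integer arithmetic avoids float rounding; datetime cannot store sub-microsecond digits,
    # so the last three nanosecond digits are dropped.
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(
        microseconds=remainder_ns // 1000
    )


print(ns_to_datetime(1678726201999970048).isoformat())
# → 2023-03-13T16:50:01.999970+00:00
```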


### 3.2.2 - pd.DataFrame
This is how you read the data as a pandas DataFrame:

In [7]:
def on_stream_received_handler(stream_received: qx.StreamConsumer):
  stream_received.timeseries.on_dataframe_received = on_timeseries_data_received_handler

def on_timeseries_data_received_handler(stream: qx.StreamConsumer, df: pd.DataFrame):
  print("Data from stream " + stream.stream_id)
  display(df)

topic_consumer = client.get_topic_consumer(topic_name)
topic_consumer.on_stream_received = on_stream_received_handler
qx.App.run()

Data from stream test-stream_1


|   | timestamp | Param A | Param B |
|---|-----------|---------|---------|
| 0 | 1678726201999970048 | 16.0 | 1.0 |


Data from stream test-stream_1


|   | timestamp | Param A | Param B |
|---|-----------|---------|---------|
| 0 | 1678726201999970048 | 16.0 | 1.0 |


Data from stream test-stream_1


|   | timestamp | Param A | Param B |
|---|-----------|---------|---------|
| 0 | 1678726201999970048 | 16.0 | 1.0 |


In [8]:
df = pd.DataFrame()

def on_stream_received_handler(stream_received: qx.StreamConsumer):
  stream_received.timeseries.on_dataframe_received = on_timeseries_data_received_handler

def on_timeseries_data_received_handler(stream: qx.StreamConsumer, df_i: pd.DataFrame):
  global df
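  # DataFrame.append was deprecated and then removed in pandas 2.0; pd.concat is the supported equivalent
  df = pd.concat([df, df_i])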
  print("Data from stream " + stream.stream_id)
  display(df_i)

topic_consumer = client.get_topic_consumer(topic_name)
topic_consumer.on_stream_received = on_stream_received_handler
qx.App.run()

Data from stream test-stream_1


|   | timestamp | Param A | Param B |
|---|-----------|---------|---------|
| 0 | 1678726737081316096 | 15.0 | 9.0 |


Data from stream test-stream_1


|   | timestamp | Param A | Param B |
|---|-----------|---------|---------|
| 0 | 1678726737792115968 | 14.0 | 2.0 |


Data from stream test-stream_1


|   | timestamp | Param A | Param B |
|---|-----------|---------|---------|
| 0 | 1678726741138672896 | 12.0 | 3.0 |


Data from stream test-stream_1


|   | timestamp | Param A | Param B |
|---|-----------|---------|---------|
| 0 | 1678726743268410880 | 19.0 | 6.0 |


Data from stream test-stream_1


|   | timestamp | Param A | Param B |
|---|-----------|---------|---------|
| 0 | 1678726745204865024 | 15.0 | 2.0 |


In [9]:
df

|   | timestamp | Param A | Param B |
|---|-----------|---------|---------|
| 0 | 1678726737081316096 | 15.0 | 9.0 |
| 0 | 1678726737792115968 | 14.0 | 2.0 |
| 0 | 1678726741138672896 | 12.0 | 3.0 |
| 0 | 1678726743268410880 | 19.0 | 6.0 |
| 0 | 1678726745204865024 | 15.0 | 2.0 |
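
Every row in the accumulated frame above carries index 0, and growing a DataFrame inside the callback copies the accumulated data on each message. A possible alternative, sketched under assumptions, is to buffer the small frames in a list and concatenate once. `frames` and `buffer_frame` are hypothetical names, the sample rows are copied from the output above, and reading `timestamp` as nanoseconds since the Unix epoch is an assumption.

```python
import pandas as pd

frames = []  # one small DataFrame per callback invocation


def buffer_frame(df_i: pd.DataFrame) -> None:
    """Hypothetical helper: would be called from the on_dataframe_received handler."""
    frames.append(df_i)


# Simulate three messages using values taken from the notebook output.
buffer_frame(pd.DataFrame({"timestamp": [1678726737081316096], "Param A": [15.0], "Param B": [9.0]}))
buffer_frame(pd.DataFrame({"timestamp": [1678726737792115968], "Param A": [14.0], "Param B": [2.0]}))
buffer_frame(pd.DataFrame({"timestamp": [1678726741138672896], "Param A": [12.0], "Param B": [3.0]}))

# Concatenate once; ignore_index=True renumbers the rows 0..n-1 instead of repeating 0.
combined = pd.concat(frames, ignore_index=True)

# Assumption: timestamps are nanoseconds since the Unix epoch.
combined["time"] = pd.to_datetime(combined["timestamp"], unit="ns", utc=True)
print(combined)
```

Concatenating once at the end keeps the callback cheap, at the cost of holding the individual frames in memory until they are combined.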
