<a href="https://colab.research.google.com/github/vu-topics-in-big-data-2023/examples/blob/main/example-pulsar-zookeeper/pulsar_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Pulsar Demo

In this notebook we will demonstrate how to work with Apache Pulsar.

Apache Pulsar documentation: http://pulsar.apache.org/docs/en/pulsar-2.0/

# 1. Pulsar Setup

Set up your own standalone Pulsar instance using the Docker instructions at http://pulsar.apache.org/docs/en/standalone-docker/
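Once the container is running, it can help to confirm that the broker port accepts connections before going further. The sketch below is a plain TCP check using only the standard library. The helper name `broker_reachable` and the host `localhost` are assumptions for illustration; port 6650 is the one used in the service URLs later in this notebook. A successful connection only shows that something is listening on the port, not that Pulsar is fully started.

```python
import socket

def broker_reachable(host: str, port: int = 6650, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

# Assumed host: a standalone instance whose port is published on this machine.
print(broker_reachable("localhost"))
```

If this prints False, check that the container is up and that port 6650 was published when it was started.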



# 2. Install Python pulsar-client

The Python pulsar-client package is used to connect to a running Pulsar instance.

Documentation links for additional detail:
http://pulsar.apache.org/docs/en/client-libraries-python/

http://pulsar.apache.org/api/python/2.5.0-SNAPSHOT/



In [None]:
!pip install pulsar-client==2.5.0

# 3. Pulsar Producer

Pulsar producers publish messages to Pulsar topics. Here I create a Python script that runs in the background and sends one message every second to a topic, which we are calling 'my-pulsar-topic'.

In [None]:
%%file producer.py
#!/usr/bin/env python3

import pulsar
import datetime
import time

def main_run():
  # Connect to the Pulsar broker.
  # Replace this address with the service URL of your own Pulsar instance.
  client = pulsar.Client('pulsar://34.67.159.115:6650')

  # set topic we will publish to
  topic = 'my-pulsar-topic'

  # create the producer
  producer = client.create_producer(topic)

  # send a message every second
  message_num = 0
  while True:
    cur_time = datetime.datetime.now().strftime("%H:%M:%S")
    message = 'hello-pulsar, message number: {}, time: {}'.format(message_num, 
                                                                  cur_time)
    producer.send(message.encode('utf-8'))
    message_num += 1
    time.sleep(1)

if __name__ == "__main__": 
  main_run()

Overwriting producer.py


In [None]:
# Note: nohup and & let us run the program in the background, detached from this cell.
!nohup /usr/bin/python3 producer.py &

nohup: appending output to 'nohup.out'
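The nohup cell starts the producer from the shell, so stopping it later means finding the process again by its command line. As an alternative, here is a minimal sketch that launches the script from Python with the standard `subprocess` module and keeps a handle to the process. The helper names `start_background` and `stop_background` and the log file name `producer.log` are hypothetical. Unlike nohup, this makes no attempt to detach the process from the notebook session.

```python
import subprocess
import sys

def start_background(script_path, log_path="producer.log"):
    """Start a Python script as a child process, appending its output to log_path."""
    log_file = open(log_path, "a")
    proc = subprocess.Popen(
        [sys.executable, script_path],
        stdout=log_file,
        stderr=subprocess.STDOUT,  # send stderr to the same log file
    )
    log_file.close()  # the child process keeps its own handle to the file
    return proc

def stop_background(proc, timeout=5.0):
    """Ask the process to exit; kill it if it has not stopped after timeout seconds."""
    proc.terminate()
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
    return proc.returncode
```

In the notebook this would be used as `proc = start_background("producer.py")` in one cell and `stop_background(proc)` in a later one.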


# 4. Read messages from Pulsar

There are two interfaces for reading messages from Pulsar: the consumer interface and the reader interface. The reader interface is the simpler of the two; it follows the dumb broker/smart client model. In this model the client (i.e., the script below) tells Pulsar where it wants to start reading messages (I start at the earliest message using "pulsar.MessageId.earliest").

To learn more about reading data from Pulsar and the various APIs, see the documentation at: http://pulsar.apache.org/docs/en/concepts-clients/

In [None]:
import pulsar

client = pulsar.Client('pulsar://34.67.159.115:6650')
topic = 'my-pulsar-topic'

reader = client.create_reader(topic, pulsar.MessageId.earliest)

num_messages = 10
for i in range(num_messages):
    msg = reader.read_next()
    print("Received message '{}' id='{}'".format(msg.data(), msg.message_id()))
    # A reader does not acknowledge messages; the client chooses where to read from.

Received message 'b'hello-pulsar, message number: 0, time: 18:04:01'' id='(135,0,-1,-1)'
Received message 'b'hello-pulsar, message number: 1, time: 18:04:02'' id='(135,1,-1,-1)'
Received message 'b'hello-pulsar, message number: 2, time: 18:04:03'' id='(135,2,-1,-1)'
Received message 'b'hello-pulsar, message number: 3, time: 18:04:04'' id='(135,3,-1,-1)'
Received message 'b'hello-pulsar, message number: 4, time: 18:04:05'' id='(135,4,-1,-1)'
Received message 'b'hello-pulsar, message number: 5, time: 18:04:07'' id='(135,5,-1,-1)'
Received message 'b'hello-pulsar, message number: 6, time: 18:04:08'' id='(135,6,-1,-1)'
Received message 'b'hello-pulsar, message number: 7, time: 18:04:09'' id='(135,7,-1,-1)'
Received message 'b'hello-pulsar, message number: 8, time: 18:04:10'' id='(135,8,-1,-1)'
Received message 'b'hello-pulsar, message number: 9, time: 18:04:11'' id='(135,9,-1,-1)'
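The output shows each payload as a Python bytes literal (the `b'...'` prefix) because `msg.data()` returns the raw bytes that the producer encoded as UTF-8. A minimal sketch of turning a payload back into structured values follows. The helper `parse_message` and the regular expression are my own additions; they assume the exact text format built in producer.py.

```python
import re

# Matches the text format built in producer.py:
# 'hello-pulsar, message number: <n>, time: <HH:MM:SS>'
MESSAGE_PATTERN = re.compile(
    r"^hello-pulsar, message number: (\d+), time: (\d{2}:\d{2}:\d{2})$"
)

def parse_message(data: bytes):
    """Decode a payload and return (message_number, time_string), or None if it does not match."""
    text = data.decode("utf-8")
    match = MESSAGE_PATTERN.match(text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)

print(parse_message(b"hello-pulsar, message number: 5, time: 18:04:07"))
# prints (5, '18:04:07')
```

Inside the reader loop, the same helper could be applied as `parse_message(msg.data())`.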


In [None]:
# -f matches against the full command line; the process name itself is python3.
!pkill -f producer.py
!cat nohup.out

2020-03-18 18:04:01.517 INFO  ConnectionPool:85 | Created connection for pulsar://34.67.159.115:6650
2020-03-18 18:04:01.557 INFO  ClientConnection:330 | [172.28.0.2:55800 -> 34.67.159.115:6650] Connected to broker
2020-03-18 18:04:01.641 INFO  HandlerBase:53 | [persistent://public/default/my-pulsar-topic, ] Getting connection from pool
2020-03-18 18:04:01.680 INFO  ConnectionPool:85 | Created connection for pulsar://localhost:6650
2020-03-18 18:04:01.720 INFO  ClientConnection:332 | [172.28.0.2:55804 -> 34.67.159.115:6650] Connected to broker through proxy. Logical broker: pulsar://localhost:6650
2020-03-18 18:04:01.827 INFO  ProducerImpl:151 | [persistent://public/default/my-pulsar-topic, ] Created producer on broker [172.28.0.2:55804 -> 34.67.159.115:6650] 
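
Section 4 mentions two interfaces for reading messages but demonstrates only the reader. The other is the consumer interface, where the client subscribes to a topic under a named subscription and acknowledges each message so that the broker can track progress. The sketch below is one way this could look; it is not part of the original demo. The function name `consume_messages` and the subscription name 'my-subscription' are hypothetical. The function takes an already created `pulsar.Client`, so the connection details stay in the notebook cell.

```python
def consume_messages(client, topic, subscription_name, count, timeout_millis=5000):
    """Receive and acknowledge `count` messages; return their decoded text.

    `client` is expected to be a pulsar.Client. receive() raises an exception
    if no message arrives within timeout_millis.
    """
    consumer = client.subscribe(topic, subscription_name)
    received = []
    try:
        for _ in range(count):
            msg = consumer.receive(timeout_millis=timeout_millis)
            received.append(msg.data().decode("utf-8"))
            # Acknowledge so the broker does not redeliver this message
            # to the subscription.
            consumer.acknowledge(msg)
    finally:
        consumer.close()
    return received

# Usage in the notebook (needs a reachable broker; the address is a placeholder):
# import pulsar
# client = pulsar.Client('pulsar://localhost:6650')
# print(consume_messages(client, 'my-pulsar-topic', 'my-subscription', 10))
# client.close()
```

Note that a new subscription starts at the latest position by default, so this sketch only sees messages published after it subscribes. The producer therefore needs to be running while it executes.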
