# Exercise 04 : Structured Streaming with Azure EventHub or Kafka
You can use Structured Streaming (see "Exercise 02 : Structured Streaming") with Azure EventHub or HDInsight Kafka. This exercise uses Azure EventHub.

Before starting:
1. Create an Event Hub Namespace resource in the Azure Portal.
2. Create a new Event Hub in that namespace.
3. Create a SAS policy on the new Event Hub entity and copy its connection string.
4. Install the Event Hub library as follows:
    - In the workspace, right-click "Shared". From the context menu, select "Create" > "Library".
    - Install "com.microsoft.azure:azure-eventhubs-spark_2.11:2.3.1" using the "Maven Coordinate" source.
    - Attach the installed library to your cluster.

In [2]:
# Read Event Hub's stream
conf = {}
conf["eventhubs.connectionString"] = "Endpoint=sb://myhub01.servicebus.windows.net/;SharedAccessKeyName=testpolicy01;SharedAccessKey=5sDXk9yYTG...;EntityPath=hub01"

read_df = (
  spark
    .readStream
    .format("eventhubs")
    .options(**conf)
    .load()
)
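
The connection string is a semicolon-separated list of `Key=Value` pairs. As a quick sanity check before starting the stream, the sketch below splits such a string into its parts and reports which of the keys seen in the example above are missing. The helper names `parse_connection_string` and `missing_keys` are hypothetical and not part of the Event Hub library, and the sample string uses placeholder values.

```python
def parse_connection_string(conn_str):
    """Split 'Key=Value;Key=Value' into a dict (hypothetical helper)."""
    parts = {}
    for item in conn_str.split(";"):
        if not item:
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only: Base64 keys may end with '=' padding
        key, _, value = item.partition("=")
        parts[key] = value
    return parts

def missing_keys(conn_str, required=("Endpoint", "SharedAccessKeyName",
                                     "SharedAccessKey", "EntityPath")):
    """Return the required keys that are absent or empty."""
    parts = parse_connection_string(conn_str)
    return [key for key in required if not parts.get(key)]

# Placeholder values, not a real namespace or key
sample = ("Endpoint=sb://example.servicebus.windows.net/;"
          "SharedAccessKeyName=policy01;SharedAccessKey=abc123=;EntityPath=hub01")
parsed = parse_connection_string(sample)
```

Calling `missing_keys(conf["eventhubs.connectionString"])` before `load()` gives an early hint when, for example, a namespace-level string without `EntityPath` was copied by mistake.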

The following cell starts a streaming query that keeps running as a background job.

In [4]:
# Write streams into memory
from pyspark.sql.types import *
import pyspark.sql.functions as F

read_schema = StructType([
  StructField("event_name", StringType(), True),
  StructField("event_time", StringType(), True)])
decoded_df = read_df.select(F.from_json(F.col("body").cast("string"), read_schema).alias("payload"))

query1 = (
  decoded_df
    .writeStream
    .format("memory")
    .queryName("read_hub")
    .start()
)
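
To make the decoding step concrete, here is a plain-Python sketch of what `cast("string")` followed by `from_json` does to a single event: the `body` arrives as bytes, is decoded to text, and only the fields named in `read_schema` are kept. This is an illustration rather than Spark code. `decode_body` is a hypothetical helper, UTF-8 encoding is assumed, and the sample event is made up.

```python
import json

# Mirrors the two fields declared in read_schema
SCHEMA_FIELDS = ("event_name", "event_time")

def decode_body(body):
    """Decode one event body (bytes) into a dict with the schema's fields."""
    text = body.decode("utf-8")   # plays the role of cast("string")
    record = json.loads(text)     # plays the role of from_json
    # Fields absent from the JSON become None, much like nulls in Spark;
    # fields outside the schema are dropped
    return {name: record.get(name) for name in SCHEMA_FIELDS}

sample = b'{"event_name":"Open","event_time":"1540601000","extra":"ignored"}'
payload = decode_body(sample)
```

The resulting `payload` corresponds to the `payload` struct column that the `%sql` cell queries as `payload.event_name` and `payload.event_time`.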

In [5]:
%sql select payload.event_name, payload.event_time from read_hub

event_name,event_time
Open,1540601000
Open,1540601000
Open,1540601000


Publish an event to the Event Hub, then run the previous query again to see the new row. (You can also watch the streaming activity displayed under the background job above.)

In [7]:
from pyspark.sql import Row

write_schema = StructType([StructField("body", StringType())])
write_row = [Row(body='{"event_name":"Open","event_time":"1540601000"}')]
write_df = spark.createDataFrame(write_row, write_schema)

(write_df
  .write
  .format("eventhubs")
  .options(**conf)
  .save())
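
Hand-written JSON inside a Python string is easy to get wrong. As a small sketch using only the standard library, the same `body` value can be built with `json.dumps`; the helper name `make_body` is hypothetical.

```python
import json

def make_body(event_name, event_time):
    """Serialize one event as a compact JSON string (hypothetical helper)."""
    event = {"event_name": event_name, "event_time": str(event_time)}
    # Compact separators drop the default spaces after ',' and ':'
    return json.dumps(event, separators=(",", ":"))

body = make_body("Open", 1540601000)
```

The resulting string can be passed as `Row(body=body)` when building `write_row`.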

When you are done, cancel (stop) the streaming jobs started above.
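
Instead of cancelling each job by hand, the active queries can also be stopped from code. The sketch below relies on PySpark's `spark.streams.active` list and `StreamingQuery.stop()`. The helper name `stop_all_queries` is hypothetical, and it accepts any object with an `active` attribute, so the stand-in objects let it run without a cluster.

```python
from types import SimpleNamespace

def stop_all_queries(stream_manager):
    """Stop every active streaming query and return their names."""
    stopped = []
    # Copy the list first, since stopping a query changes what is active
    for query in list(stream_manager.active):
        query.stop()
        stopped.append(query.name)
    return stopped

# Stand-in objects so the sketch runs without a cluster
fake_query = SimpleNamespace(name="read_hub", stop=lambda: None)
fake_manager = SimpleNamespace(active=[fake_query])
stopped = stop_all_queries(fake_manager)
```

On the cluster, the call would be `stop_all_queries(spark.streams)`; to stop only the query from this exercise, `query1.stop()` is enough.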