# Consuming Bluesky's Jetstream Websocket

In this notebook we want to consume Bluesky's Jetstream Websocket and insert the events into a Redis Stream.

Jetstream is a Bluesky service that streams events from the network as JSON over a Websocket, so any client that supports Websockets can consume it.

A Redis Stream is an append-only data structure for storing and consuming a stream of events. It plays a role similar to a Kafka topic, but it lives inside Redis and takes less setup, which makes it a convenient way to pass events between the parts of a distributed system.

## Consuming Bluesky's Jetstream Websocket

The reusable function defined in the cells below consumes the Jetstream Websocket:

- It connects to the Jetstream Websocket and listens for events.
- Each event is passed to the `onEvent` function.
- It stops after `limit` messages have been received.

In [21]:
%use coroutines
%use serialization

In [22]:
import kotlinx.serialization.SerialName
import kotlinx.serialization.Serializable

@Serializable
data class JetStreamEvent(
    val did: String,
    @SerialName("time_us") val timeUs: Long,
    val kind: String? = null,
    val commit: Commit? = null
) {
    @Serializable
    data class Commit(
        val rev: String? = null,
        val operation: String? = null,
        val collection: String? = null,
        val rkey: String? = null,
        val record: Record? = null,
        val cid: String? = null
    )

    @Serializable
    data class Record(
        @SerialName("\$type") val type: String? = null,
        val timeUs: String? = null,
        val text: String? = null,
        val langs: List<String>? = null,
        val facets: List<Facet>? = null,
        val reply: Reply? = null,
        val embed: Embed? = null
    )

    @Serializable
    data class Reply(
        val parent: PostRef? = null,
        val root: PostRef? = null
    )

    @Serializable
    data class PostRef(
        val cid: String? = null,
        val uri: String? = null
    )

    @Serializable
    data class Facet(
        @SerialName("\$type") val type: String? = null,
        val features: List<Feature>? = null,
        val index: Index? = null
    )

    @Serializable
    data class Feature(
        @SerialName("\$type") val type: String? = null,
        val did: String? = null
    )

    @Serializable
    data class Index(
        val byteStart: Int? = null,
        val byteEnd: Int? = null
    )

    @Serializable
    data class Embed(
        @SerialName("\$type") val type: String? = null,
        val images: List<EmbedImage>? = null
    )

    @Serializable
    data class EmbedImage(
        val alt: String? = null,
        val aspectRatio: AspectRatio? = null,
        val image: Image? = null
    )

    @Serializable
    data class AspectRatio(
        val height: Int? = null,
        val width: Int? = null
    )

    @Serializable
    data class Image(
        @SerialName("\$type") val type: String? = null,
        val ref: Ref? = null,
        val mimeType: String? = null,
        val size: Int? = null
    )

    @Serializable
    data class Ref(
        @SerialName("\$link") val link: String? = null
    )
}
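
The `time_us` field is mapped to a plain `Long`. Assuming it holds microseconds since the Unix epoch, it can be turned into a `java.time.Instant` whenever a readable timestamp is needed. The helper below is a minimal sketch; the name `timeUsToInstant` is hypothetical and not part of the notebook's `JetStreamEvent` class.

```kotlin
import java.time.Instant

// Hypothetical helper: converts a Jetstream `time_us` value
// (assumed to be microseconds since the Unix epoch) into an Instant.
fun timeUsToInstant(timeUs: Long): Instant {
    val seconds = Math.floorDiv(timeUs, 1_000_000L)
    // The remaining microseconds are converted to nanoseconds.
    val nanos = Math.floorMod(timeUs, 1_000_000L) * 1_000L
    return Instant.ofEpochSecond(seconds, nanos)
}
```

With the classes above, this would be called as `timeUsToInstant(event.timeUs)`.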

In [23]:
import dev.raphaeldelio.*
import io.ktor.client.plugins.websocket.webSocket
import io.ktor.websocket.Frame
import io.ktor.websocket.readText

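// `webSocketClient` and `jsonParser` are not defined in this notebook;
// they are presumably provided by the `dev.raphaeldelio` import above.
suspend fun consumeJetstream(limit: Int = 1000, onEvent: (JetStreamEvent) -> Unit) {
    webSocketClient.webSocket("wss://jetstream2.us-east.bsky.network/subscribe?wantedCollections=app.bsky.feed.post") {
        // Read `limit` frames; non-text frames count toward the limit but are skipped.
        repeat(limit) {
            val message = incoming.receive()
            if (message is Frame.Text) {
                val event: JetStreamEvent = jsonParser.decodeFromString<JetStreamEvent>(message.readText())
                onEvent(event)
            }
        }
    }
}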
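
The subscribe URL above hard-codes a single `wantedCollections` filter. Jetstream accepts that query parameter more than once, so a small builder can make the filter configurable. This is only a sketch: `jetstreamUrl` is a hypothetical helper, and the default host is simply the one used in this notebook.

```kotlin
import java.net.URLEncoder

// Hypothetical helper: builds a Jetstream subscribe URL, repeating the
// `wantedCollections` query parameter once per requested collection.
fun jetstreamUrl(
    collections: List<String>,
    host: String = "jetstream2.us-east.bsky.network"
): String {
    val base = "wss://$host/subscribe"
    // Without collections, no filter parameter is added to the URL.
    if (collections.isEmpty()) return base
    val query = collections.joinToString("&") { collection ->
        "wantedCollections=" + URLEncoder.encode(collection, "UTF-8")
    }
    return "$base?$query"
}
```

Calling `jetstreamUrl(listOf("app.bsky.feed.post"))` reproduces the URL used in `consumeJetstream`.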

Example of consuming the Jetstream Websocket and printing the event's `did` and `text` fields.

In [24]:
runBlocking {
    consumeJetstream(limit = 5) { event ->
        println("${event.did}-${event.commit?.record?.text}")
    }
}

did:plc:iympv5mtzubu54r6bqpn4gni-Lovely butt and beautiful soles
did:plc:ceotnfex74ghkz3yvdadgxvp-but with a kick in the head like putting wings on lead
did:plc:jcthpxmtx7ott5uroe2ietsm-
did:plc:dux2vh2wiybyg2qdvd44nemv-Mais a fica difícil te ajudar né Leona! Você fica falando que a prioridade é o emprego mais na primeira oportunidade pega o dono do patrão na porta da casa?!? O mulher podia ser mais inteligente né!! 

#DonadeMim
did:plc:s6vhyuqnexl5r6cp76d4c4nw-Biri tarafından yok sayıldığınızı farkettiğiniz de, onu bir daha rahatsız etmeyin.
Virginia Woolf👌


## Inserting into Redis Streams

To connect to Redis, we're going to use Jedis, a Java client for Redis. Jedis has a small, synchronous API and covers the commands needed here, including the Stream commands.

In [25]:
@file:DependsOn("redis.clients:jedis:6.0.0")

Creating a reusable Jedis client.

`JedisPooled` is a Jedis client backed by a connection pool: each command borrows a connection and returns it afterwards, so one instance can be shared safely across threads. Created without arguments, it connects to `localhost:6379`.

In [26]:
import redis.clients.jedis.JedisPooled
val jedisPooled = JedisPooled()

A Redis Stream is a sequence of hash-like entries:
- Each entry is a map of field-value pairs.
- Both the fields and the values are strings.
- The fields name the pieces of data and the values hold the data itself.
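
Because everything in an entry is a string, values such as numbers, lists, or nulls have to be flattened before they are stored. The sketch below shows one possible convention. `toStreamFields` is a hypothetical helper, and dropping nulls and joining lists with commas are assumptions made for illustration, not what the notebook's own mapping does.

```kotlin
// Hypothetical helper: flattens arbitrary values into the string-to-string
// shape a stream entry needs. Null values are dropped and collections are
// joined with commas; both choices are assumptions made for this sketch.
fun Map<String, Any?>.toStreamFields(): Map<String, String> =
    entries
        .filter { it.value != null }
        .associate { (key, value) ->
            key to when (value) {
                is Iterable<*> -> value.joinToString(",")
                else -> value.toString()
            }
        }
```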

Let's create an extension function to convert a `JetStreamEvent` object to a `Map<String, String>` (the hash) for Redis Streams.

In [27]:
// Note: calling toString() on a null value yields the literal string "null".
fun JetStreamEvent.toMap() = mapOf(
    "did" to this.did,
    // Use the event-level `time_us`; the `timeUs` property on Record is not populated.
    "timeUs" to this.timeUs.toString(),
    "text" to this.commit?.record?.text.toString(),
    "langs" to this.commit?.record?.langs.toString(),
    "operation" to this.commit?.operation.toString(),
    "rkey" to this.commit?.rkey.toString(),
    "parentUri" to (this.commit?.record?.reply?.parent?.uri ?: ""),
    "rootUri" to (this.commit?.record?.reply?.root?.uri ?: ""),
    "uri" to "at://${this.did}/app.bsky.feed.post/${this.commit?.rkey}",
)
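
The `uri`, `parentUri`, and `rootUri` fields all share the `at://<did>/<collection>/<rkey>` layout, so a consumer reading the stream back may want to split them again. The following is a minimal sketch under that assumption; `AtUri` is a hypothetical type, not part of this notebook or of any library.

```kotlin
// Hypothetical value type for the `at://<did>/<collection>/<rkey>` layout
// used by the URI fields of the hash.
data class AtUri(val did: String, val collection: String, val rkey: String) {
    override fun toString() = "at://$did/$collection/$rkey"

    companion object {
        // Returns null unless the input is `at://` followed by exactly
        // three non-empty, slash-separated parts.
        fun parse(uri: String): AtUri? {
            if (!uri.startsWith("at://")) return null
            val parts = uri.removePrefix("at://").split("/")
            if (parts.size != 3 || parts.any { it.isEmpty() }) return null
            return AtUri(parts[0], parts[1], parts[2])
        }
    }
}
```

Note that `parse` returns `null` for the empty string that the mapping stores when a post is not a reply.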

To add an entry to a Redis Stream, we need to use the `XADD` command:

`XADD streamName id field value [field value ...]`

Passing `*` as the `id` tells Redis to generate the entry ID itself.
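
To make the command shape concrete, the sketch below flattens a stream name and a map into the argument list a raw `XADD` call would carry, using `*` so that Redis generates the entry ID. `xaddArgs` is a hypothetical helper for illustration only; Jedis assembles the actual command itself.

```kotlin
// Hypothetical helper: lays out the arguments of an XADD command.
// The default id "*" asks Redis to generate the entry ID.
fun xaddArgs(
    streamName: String,
    fields: Map<String, String>,
    id: String = "*"
): List<String> {
    require(fields.isNotEmpty()) { "XADD needs at least one field-value pair" }
    // Each map entry contributes its field name followed by its value.
    return listOf("XADD", streamName, id) +
        fields.flatMap { (field, value) -> listOf(field, value) }
}
```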

Let's create a function that encapsulates the `XADD` command and takes a stream name and a hash as parameters:

In [28]:
import redis.clients.jedis.StreamEntryID
import redis.clients.jedis.params.XAddParams

fun addToStream(streamName: String, hash: Map<String, String>) {
    jedisPooled.xadd(
        streamName,
        XAddParams.xAddParams().id(StreamEntryID.NEW_ENTRY),
        hash
    )
}
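
`StreamEntryID.NEW_ENTRY` stands for the `*` ID, so Redis assigns each entry an ID of the form `<millisecondsTime>-<sequenceNumber>`. As a rough sketch, such an ID can be split back into the insertion time and the sequence number; `parseEntryId` is a hypothetical helper that works on the ID's string form.

```kotlin
import java.time.Instant

// Hypothetical helper: splits an auto-generated stream entry ID of the form
// "<millisecondsTime>-<sequenceNumber>" into its timestamp and sequence.
fun parseEntryId(id: String): Pair<Instant, Long> {
    val parts = id.split("-")
    require(parts.size == 2) { "Expected <millisecondsTime>-<sequenceNumber>, got: $id" }
    return Instant.ofEpochMilli(parts[0].toLong()) to parts[1].toLong()
}
```

The timestamp reflects when the entry was added to the stream, not the `time_us` of the Jetstream event.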

Now let's consume the Jetstream Websocket and insert the events into a Redis Stream using the function `addToStream`.

In [31]:
runBlocking {
    consumeJetstream(limit = 4000) { event ->
        addToStream("jetstream", event.toMap())
    }
}

To inspect the result, open Redis Insight and browse the `jetstream` stream.