Skip to content

tspannhw/pulsar-mastodon-sink

main
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
 
 
 
 
 
 

pulsar-mastodon-sink

Mastodon data streaming

Run the app

python3 stream.py

2023-01-17 17:28:17.197 INFO  [0x16ec2b000] HandlerBase:72 | [persistent://public/default/mastodon-partition-0, ] Getting connection from pool
2023-01-17 17:28:17.200 INFO  [0x16ec2b000] ProducerImpl:190 | [persistent://public/default/mastodon-partition-0, ] Created producer on broker [127.0.0.1:56776 -> 127.0.0.1:6650]
20230117222817.0

JSON Schema

class mastodondata(Record):
    language = String()
    created_at = String()
    ts = Float()
    uuid = String()
    uri = String()
    url = String()
    favourites_count = Integer()
    replies_count = Integer()
    reblogs_count = Integer()
    content = String()
    username = String()
    accountname = String()
    displayname = String()
    note = String()
    followers_count = Integer()
    statuses_count = Integer()



Schemafied Data

bin/pulsar-client consume "persistent://public/default/mastodon" -s "mreader2" -n 0

----- got message -----
key:[20230117222421_069e9afb-7b4a-481e-b5d3-40c03e421415], properties:[], content:{
 "language": "ja",
 "created_at": "2023-01-17 22:24:18+00:00",
 "ts": 20230117222421.0,
 "uuid": "20230117222421_069e9afb-7b4a-481e-b5d3-40c03e421415",
 "uri": "https://mstdn.jp/users/hikara/statuses/109706887715535436",
 "url": "https://mstdn.jp/@hikara/109706887715535436",
 "favourites_count": 0,
 "replies_count": 0,
 "reblogs_count": 0,
 "content": "<p>\u3093\u30fc\u307e\u3063\uff01</p>",
 "username": "hikara",
 "accountname": "hikara@mstdn.jp",
 "displayname": "\u3070\u3076\u30732023",
 "note": "<p>B70 \u541b\u306e\u6b66\u52c7\u4f1d\u3067\u3059</p>",
 "followers_count": 1640,
 "statuses_count": 51576
}

----- got message -----
key:[20230117223725_f68a866e-1df1-4d5d-b977-eb151535f240], properties:[], content:{
 "language": "en",
 "created_at": "2023-01-17 22:37:25.347000+00:00",
 "ts": 20230117223725.0,
 "uuid": "20230117223725_f68a866e-1df1-4d5d-b977-eb151535f240",
 "uri": "https://mastodon.social/users/PaaSDev/statuses/109706939291797415",
 "url": "https://mastodon.social/@PaaSDev/109706939291797415",
 "favourites_count": 0,
 "replies_count": 0,
 "reblogs_count": 0,
 "content": "<p>I am working on an Apache Pulsar streaming application in Python to ingest mastodon messages.   <a href=\"https://github.com/tspannhw/pulsar-mastodon-sink\" target=\"_blank\" rel=\"nofollow noopener noreferrer\"><span class=\"invisible\">https://</span><span class=\"ellipsis\">github.com/tspannhw/pulsar-mas</span><span class=\"invisible\">todon-sink</span></a></p>",
 "username": "PaaSDev",
 "accountname": "PaaSDev",
 "displayname": "",
 "note": "",
 "followers_count": 0,
 "statuses_count": 1
}

Get Schema from Pulsar Schema Registry


bin/pulsar-admin schemas get persistent://public/default/mastodo
n
{
  "version": 5,
  "schemaInfo": {
    "name": "mastodon",
    "schema": {
      "type": "record",
      "name": "mastodondata",
      "fields": [
        {
          "name": "language",
          "type": [
            "null",
            "string"
          ]
        },
        {
          "name": "created_at",
          "type": [
            "null",
            "string"
          ]
        },
        {
          "name": "ts",
          "type": [
            "null",
            "float"
          ]
        },
        {
          "name": "uuid",
          "type": [
            "null",
            "string"
          ]
        },
        {
          "name": "uri",
          "type": [
            "null",
            "string"
          ]
        },
        {
          "name": "url",
          "type": [
            "null",
            "string"
          ]
        },
        {
          "name": "favourites_count",
          "type": [
            "null",
            "int"
          ]
        },
        {
          "name": "replies_count",
          "type": [
            "null",
            "int"
          ]
        },
        {
          "name": "reblogs_count",
          "type": [
            "null",
            "int"
          ]
        },
        {
          "name": "content",
          "type": [
            "null",
            "string"
          ]
        },
        {
          "name": "username",
          "type": [
            "null",
            "string"
          ]
        },
        {
          "name": "accountname",
          "type": [
            "null",
            "string"
          ]
        },
        {
          "name": "displayname",
          "type": [
            "null",
            "string"
          ]
        },
        {
          "name": "note",
          "type": [
            "null",
            "string"
          ]
        },
        {
          "name": "followers_count",
          "type": [
            "null",
            "int"
          ]
        },
        {
          "name": "statuses_count",
          "type": [
            "null",
            "int"
          ]
        }
      ]
    },
    "type": "JSON",
    "properties": {}
  }
}

REFERENCE

https://hachyderm.io/api/v1/streaming/public

About

Mastodon data streaming

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages