Add efficient webhook support to your Lemmy instance. Especially useful for bots and AutoModerators.
To make the Docker image part of your docker-compose stack, add this to your compose file:
services:
  # ...
  redis:
    image: redis
    ports: # you don't need to bind ports if you don't want to
      - 6379:6379
  webhooks:
    image: ghcr.io/rikudousage/lemmy-webhook:latest
    environment:
      - LEMMY_HOST=postgres # the hostname of the postgres database
      - REDIS_HOST=redis # the hostname of the redis server, you can use the above redis container if you define it as part of this stack
      - LEMMY_PASSWORD=superSecr3t # the password to the postgres database
      - API_REGISTRATION_ENABLED=1 # whether to allow users to register themselves via the api
      - CORS_ALLOW_ORIGIN=^.*$$ # a regex for cors (you need to escape $ with another $)
      - LARGE_PAYLOAD_SIZE=1024 # payloads larger than this size (in bytes) will be stored in a temporary table instead of fed directly to the consumer, default is 4096. If set to 0, all payloads will be stored.
    ports:
      - 8080:80 # you can skip this, if you don't use the management api
    volumes:
      - ./volumes/database:/opt/database # bind a directory where the SQLite database will be created
Afterwards, run docker-compose up -d and you're done!
The LARGE_PAYLOAD_SIZE setting is important for avoiding "payload string too long" errors in Postgres. By default, Postgres allows 8000 bytes in the payload. You can set LARGE_PAYLOAD_SIZE to 0 to store every payload in the table first.
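The size rule can be pictured with a short Python sketch. It only illustrates the rule as described here and is not the service's actual code: the function name route_payload and the labels "table" and "direct" are hypothetical, and measuring the size on the UTF-8 encoded payload is an assumption.

```python
def route_payload(payload: str, large_payload_size: int = 4096) -> str:
    """Say where a payload would go under the LARGE_PAYLOAD_SIZE rule.

    Returns "table" when the payload would be stored in the temporary
    table first, or "direct" when it would be fed straight to the consumer.
    """
    if large_payload_size == 0:
        # 0 means every payload is stored in the table first.
        return "table"
    # Assumption: the size is measured on the UTF-8 encoded payload.
    size_in_bytes = len(payload.encode("utf-8"))
    return "table" if size_in_bytes > large_payload_size else "direct"


if __name__ == "__main__":
    for size in (100, 2000):
        print(size, route_payload("x" * size, large_payload_size=1024))
```

Keeping the threshold well under the Postgres limit leaves headroom for whatever wraps the payload.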
To create webhooks, you can either use the API or insert them directly into the database. The API is described in a separate readme.
The table is quite simple and consists of these fields:
- url - the URL of the webhook
- method - can be GET, POST, PATCH, DELETE, PUT (taken from the RequestMethod enum)
- body_expression (optional) - an expression that will be converted to JSON and sent as the body of the request, more on expressions below
- filter_expression (optional) - an expression that must evaluate to true if this webhook is to run, more on expressions below
- object_type - the type of object this webhook is interested in, currently:
  - post
  - comment
  - instance
  - private_message (only INSERT operation)
  - person
  - registration_application
  - private_message_report
  - local_user
  - community_follower - a subscription by a user to a community
- operation (optional) - the kind of operation this webhook is interested in, can be INSERT, UPDATE, DELETE (taken from the DatabaseOperation enum)
- headers (optional) - a JSON object with keys as header names and values as header values
- enhanced_filter (optional) - an expression that must evaluate to true if this webhook is to run, more on expressions below
- enabled - whether the webhook is enabled or not
Expressions allow better interaction with the webhooks, for example filtering and setting the request body.
The basic syntax is very similar to JavaScript.
In every expression you have access to the data variable, which contains the fields of the object the webhook was triggered for.
This is an example data object:
{
    "timestamp": {
        "date": "2024-01-05 23:15:09.811926",
        "timezone_type": 1,
        "timezone": "+00:00"
    },
    "operation": "INSERT",
    "schema": "public",
    "table": "comment",
    "data": {
        "id": 4763628,
        "creatorId": 2,
        "postId": 4435272,
        "content": "teeest",
        "removed": false,
        "deleted": false,
        "apId": "http://changeme.invalid/52570b072a832e6a986330de",
        "local": true,
        "distinguished": false,
        "path": "0.123.456"
    },
    "previous": null
}
Note that the timestamp property is in fact a PHP DateTimeImmutable object, including its methods and properties; the above is just its JSON representation.
So for example, if you only want to trigger a webhook for comments by local users, you would use this as your filter expression:
data.data.local
The timestamp, operation, schema and table properties have the same structure for every type of object, but the data property varies based on what you're being notified about. Here's a list of all table values currently possible and the DTO that will be passed as the data property:
- post - PostData
- comment - CommentData
- instance - InstanceData
- private_message - PrivateMessageData
- person - PersonData
- registration_application - RegistrationApplicationData
- private_message_report - PrivateMessageReportData
- local_user - LocalUserData
- community_follower - CommunitySubscriptionData
If the operation is an UPDATE, you'll also get access to the previous property, which contains the data from the previous version of the object. If the operation is not an UPDATE, the previous property is null.
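On the receiving side, the data and previous properties can be compared to see what an UPDATE actually changed. The following Python sketch assumes the webhook body is the whole data object (a body expression of just data); the helper name changed_fields is hypothetical.

```python
from typing import Any, Dict, Tuple


def changed_fields(event: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each changed field name to an (old value, new value) pair.

    Returns an empty dict when "previous" is null, which is the case
    for every operation other than UPDATE.
    """
    previous = event.get("previous")
    if previous is None:
        return {}
    current = event["data"]
    return {
        key: (previous.get(key), value)
        for key, value in current.items()
        if previous.get(key) != value
    }


if __name__ == "__main__":
    update_event = {
        "operation": "UPDATE",
        "table": "comment",
        "data": {"id": 4763628, "content": "test", "path": "0.123.456"},
        "previous": {"id": 4763628, "content": "teeest", "path": "0.123.456"},
    }
    print(changed_fields(update_event))
```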
There are two kinds of expressions, simple and enhanced. Enhanced expressions have access to additional functions for interacting with the database, while simple expressions are limited to accessing only the data variable and a few simple functions.
Simple expressions have access to these functions:
- lowercase(text) - returns the string converted to lowercase
- transliterate(text) - returns the string transliterated to standard latin characters:
  - example: transliterate("Hélľö, hów ärě ýöů?") -> Hello, how are you?
  - example: transliterate("𝐻𝐞𝒍𝓁𝓸 𝔱𝕙𝖊𝗋𝚎!") -> Hello there!
- merge(arrayOrDictionary1, arrayOrDictionary2, ..., arrayOrDictionaryN) - recursively merges an arbitrary number of arrays or dictionaries
- comment_parent_id(commentOrPath) - returns the comment's parent id as an integer or null if it's a top level comment, can accept either the whole comment data object, or just the path
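To make the behaviour of comment_parent_id concrete, here is a rough Python equivalent. It is a sketch, not the project's implementation, and it assumes that a comment path has the form 0.<ancestor ids>.<own id>, so the parent is the second-to-last segment and a path with fewer than three segments belongs to a top level comment.

```python
from typing import Any, Dict, Optional, Union


def comment_parent_id(comment_or_path: Union[Dict[str, Any], str]) -> Optional[int]:
    """Return the parent comment id, or None for a top level comment.

    Accepts either a comment data object (a dict with a "path" key)
    or the path string itself.
    """
    if isinstance(comment_or_path, dict):
        path = comment_or_path["path"]
    else:
        path = comment_or_path
    segments = path.split(".")
    # Assumed layout: "0.<ancestor ids...>.<own id>".
    if len(segments) < 3:
        return None
    return int(segments[-2])


if __name__ == "__main__":
    print(comment_parent_id("0.123.456"))
    print(comment_parent_id({"id": 456, "path": "0.456"}))
```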
Note: Previous versions contained the function string_contains. The function still exists for backwards compatibility, but shouldn't be used in new expressions. Instead, use the built-in contains operator like this: "some string" contains "another string", e.g. data.data.content contains '@my_bot@my_instance'
Enhanced expressions, in addition to the above, have access to these functions:
- community(communityId) - returns the CommunityData DTO for community with given ID (or null if no such community exists)
- instance(instanceId) - returns the InstanceData DTO for instance with given ID (or null if no such instance exists)
- post(postId) - returns the PostData DTO for post with given ID (or null if no such post exists)
- person(personId) - returns the PersonData DTO for a person with given ID (or null if no such person exists)
- comment(commentId) - returns the CommentData DTO for a comment with given ID (or null if no such comment exists)
- local_user(userId) - returns the LocalUserData DTO for a local user with given ID (or null if no such user exists)
- private_message(privateMessageId) - returns the PrivateMessageData DTO for a private message with given ID (or null if no such private message exists)
- global_ban(personId) - returns a ModBanData DTO for the given user or null if no ban exists

Note that in all the cases above, null will also be returned if you don't have permission to access any of the given object types.
Simple expressions can be used everywhere, but enhanced expressions cannot be used in the filter_expression field. That's because filter_expression runs synchronously on the main thread and could potentially block further processing if it took too long. If you need to filter on more complex expressions, you can use the enhanced_filter field. You can also use both fields: the webhook will first be filtered based on filter_expression on the main thread and then on enhanced_filter in the worker thread.
The filter expressions use the Symfony ExpressionLanguage; you can read more about the syntax in the official documentation.
Example filter expressions:
data.data.local
!data.data.local
data.data.creatorId === 2
lowercase(data.data.content) contains "@chatgpt@lemmings.world" (I use that one for my ChatGPT bot)
data.data.content !== data.previous.content

Lemmy first creates the comment with placeholder values, for example path is always 0 for INSERT. You can use this expression to only trigger when the final path has been resolved:
data.data.path !== data.previous.path

Example body expressions:
data
{title: data.data.name, hasUrl: data.data.url !== null}
{id: data.data.id, banReason: global_ban(data.data.id)?.reason}
{
    title: data.data.name,
    community: community(data.data.communityId).name,
    instance: instance(community(data.data.communityId).instanceId).domain
}
{commentId: data.data.id, mentionedBot: "ChatGPT@lemmings.world"} (I use that one for my ChatGPT bot)

Example enhanced filter:
instance(community(post(data.data.postId).communityId).instanceId).domain === 'my.instance.org'
The webhooks work by first filtering based on your operation and type criteria, meaning if a new post is created, all webhooks that are created with post as the value of object_type and INSERT as operation (or without any operation specified) will be fetched. Afterwards, each webhook's filter_expression is checked; if it evaluates to true, the webhook is triggered in a worker. The worker then checks the result of the enhanced_filter expression and continues only if it evaluates to true. An HTTP request is then constructed with an optional body (from body_expression) and headers.
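The flow described above can be modelled in a few lines of Python. This is only a sketch of the order of the checks, not the service's code: plain Python callables stand in for the expressions, the Webhook class and both function names are hypothetical, and it assumes that object_type is compared against the table property and that a missing filter counts as passing.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

Event = Dict[str, Any]


@dataclass
class Webhook:
    url: str
    method: str
    object_type: str
    operation: Optional[str] = None
    filter_expression: Optional[Callable[[Event], bool]] = None
    enhanced_filter: Optional[Callable[[Event], bool]] = None
    body_expression: Optional[Callable[[Event], Any]] = None
    headers: Dict[str, str] = field(default_factory=dict)
    enabled: bool = True


def select_webhooks(webhooks: List[Webhook], event: Event) -> List[Webhook]:
    """Main thread: match type and operation, then run filter_expression."""
    selected = []
    for hook in webhooks:
        if not hook.enabled or hook.object_type != event["table"]:
            continue
        # A webhook without an operation matches every operation.
        if hook.operation is not None and hook.operation != event["operation"]:
            continue
        if hook.filter_expression is not None and not hook.filter_expression(event):
            continue
        selected.append(hook)
    return selected


def build_request(hook: Webhook, event: Event) -> Optional[Dict[str, Any]]:
    """Worker: run enhanced_filter, then assemble the HTTP request."""
    if hook.enhanced_filter is not None and not hook.enhanced_filter(event):
        return None
    body = None
    if hook.body_expression is not None:
        body = json.dumps(hook.body_expression(event))
    return {"url": hook.url, "method": hook.method,
            "headers": dict(hook.headers), "body": body}


if __name__ == "__main__":
    event = {"operation": "INSERT", "table": "comment",
             "data": {"id": 1, "local": True}, "previous": None}
    hook = Webhook(url="https://example.com/webhook", method="POST",
                   object_type="comment", operation="INSERT",
                   filter_expression=lambda e: e["data"]["local"],
                   body_expression=lambda e: e["data"])
    for selected_hook in select_webhooks([hook], event):
        print(build_request(selected_hook, event))
```

Splitting the work this way mirrors the reason given earlier for the two filter fields: the cheap check runs before a worker is involved, and anything that may touch the database waits until the worker.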
So, this is a full SQL insert for getting only new local posts using a POST request:
INSERT INTO webhooks (url, method, body_expression, filter_expression, object_type, operation, headers, enhanced_filter)
VALUES ('https://example.com/webhook', 'POST', 'data.data', 'data.data.local', 'post', 'INSERT', null, null);
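If you would rather insert rows from a script, a parameterized query saves you from escaping the quotes that expressions often contain. The Python sketch below runs against an in-memory SQLite database with a hypothetical schema built from the columns named in this readme; the real table's column types, defaults and file location are not shown here, so treat SCHEMA and add_webhook as assumptions.

```python
import json
import sqlite3
from typing import Dict, Optional

# Hypothetical schema, used only so the sketch runs on its own.
SCHEMA = """
CREATE TABLE IF NOT EXISTS webhooks (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT NOT NULL,
    method TEXT NOT NULL,
    body_expression TEXT,
    filter_expression TEXT,
    object_type TEXT NOT NULL,
    operation TEXT,
    headers TEXT,
    enhanced_filter TEXT,
    enabled INTEGER NOT NULL DEFAULT 1
)
"""


def add_webhook(conn: sqlite3.Connection, url: str, method: str, object_type: str,
                operation: Optional[str] = None,
                body_expression: Optional[str] = None,
                filter_expression: Optional[str] = None,
                enhanced_filter: Optional[str] = None,
                headers: Optional[Dict[str, str]] = None) -> int:
    """Insert one webhook row and return its row id."""
    cursor = conn.execute(
        "INSERT INTO webhooks (url, method, body_expression, filter_expression, "
        "object_type, operation, headers, enhanced_filter) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (url, method, body_expression, filter_expression, object_type, operation,
         json.dumps(headers) if headers is not None else None, enhanced_filter),
    )
    conn.commit()
    return cursor.lastrowid


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.execute(SCHEMA)
    row_id = add_webhook(connection, "https://example.com/webhook", "POST", "post",
                         operation="INSERT", body_expression="data.data",
                         filter_expression="data.data.local")
    print(connection.execute("SELECT * FROM webhooks WHERE id = ?", (row_id,)).fetchone())
```

Against a real installation you would skip the CREATE TABLE statement and open the database file that the service created in the mounted directory.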