Implement a RabbitMQ-based AMQP connector #3546
Conversation
@satta not urgent, but when you get a chance I'd be curious to hear your feedback on the overall design. Also, I've exposed only a few basic tuning knobs so far. Settings like hostname, port, vhost, etc., are orthogonal and part of the plugin-specific configuration file, as my intuition is that these are mostly shared between operators. Finally, I thought about adding an optional positional argument that is a URL combining these settings.
I'm not sure how useful that is right now. Exposing it probably only makes sense when using multiple different RabbitMQ deployments, so I'd only add it if it's really needed and the configuration file turns out to be a pain to work with.
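For reference, AMQP connection URLs conventionally have the shape `amqp://user:pass@host:port/vhost`. A minimal, dependency-free sketch of how such a URL could be split into its components — the struct and function names here are illustrative, not part of the plugin:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical holder for the parts of an amqp:// URL.
struct amqp_url {
  std::string user, password, host, vhost;
  int port = 5672; // default AMQP port
};

// Parse "amqp://user:pass@host:port/vhost"; returns nullopt if the
// scheme doesn't match. All components except the host are optional.
std::optional<amqp_url> parse_amqp_url(const std::string& input) {
  constexpr std::string_view scheme = "amqp://";
  if (input.rfind(scheme, 0) != 0)
    return std::nullopt;
  amqp_url result;
  auto rest = input.substr(scheme.size());
  // Split off the vhost (everything after the first '/').
  if (auto slash = rest.find('/'); slash != std::string::npos) {
    result.vhost = rest.substr(slash + 1);
    rest = rest.substr(0, slash);
  }
  // Split credentials from the host part.
  if (auto at = rest.find('@'); at != std::string::npos) {
    auto creds = rest.substr(0, at);
    rest = rest.substr(at + 1);
    if (auto colon = creds.find(':'); colon != std::string::npos) {
      result.user = creds.substr(0, colon);
      result.password = creds.substr(colon + 1);
    } else {
      result.user = creds;
    }
  }
  // Split host from port.
  if (auto colon = rest.find(':'); colon != std::string::npos) {
    result.host = rest.substr(0, colon);
    result.port = std::stoi(rest.substr(colon + 1));
  } else {
    result.host = rest;
  }
  return result;
}
```

A real implementation would also need percent-decoding of credentials; this sketch only shows the basic component split.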
Sounds good to me, but I guess I'll need to play around with it to get a better idea of how it feels in practice.
Yup, these are the most important ones. Definitely mandatory is some way of specifying whether a queue is to be temporary (i.e., removed when the client disconnects) or persistent (storing incoming deliveries until the client comes back). Both styles of behaviour can be useful in practice. This is usually configured with the `auto_delete` option, either when declaring a queue or via a server-side policy (which is based on a name pattern and does not require any client parameterization). I'll have to test whether this parameter can be specified on the client side. Also note that declaring queues only makes sense for sources; sinks (i.e., components that send to RabbitMQ) can only specify exchanges, and this should be handled appropriately. I haven't looked at the code yet, but that's one point that comes to mind.
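To make the temporary-vs-persistent distinction concrete, one could map a single operator-facing choice onto the AMQP queue-declaration flags. A sketch under assumed names — this is not the plugin's actual API, just an illustration of the two behaviours described above:

```cpp
#include <cassert>

// Illustrative operator-facing choice for queue lifetime.
enum class queue_lifetime {
  temporary, // auto-deleted when the consumer disconnects
  persistent // survives disconnects and stores incoming deliveries
};

// Flags as they would feed into a queue declaration.
struct declare_flags {
  bool durable;
  bool auto_delete;
};

// A temporary queue is auto-deleted and not durable; a persistent
// queue is durable and sticks around after the client disconnects.
constexpr declare_flags to_flags(queue_lifetime lifetime) {
  return lifetime == queue_lifetime::temporary
           ? declare_flags{false, true}
           : declare_flags{true, false};
}
```

The server-side policy route mentioned above would bypass this entirely, since the server then decides based on the queue name pattern.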
This is maybe a simplification that suggests a bit too obviously how the plugin should be used: it kind of assumes that there is one cluster per node that is usually interacted with. Might be the case for me, but maybe not for everyone.
That would probably be better (and also what many other client interfaces do). I'd prefer this, but I can see one also has to keep clarity for operators in mind. I'll try to get my build environment up and running to test it out and send/receive some data. It might also be helpful to allow access to message headers, e.g., to determine whether the payload needs to be decompressed. Not sure how such out-of-band values would fit into the pipeline pattern Tenzir uses.
FTR: In order to get the plugin to build with Debian's packaged rabbitmq-c headers, I had to apply this patch:

```diff
diff --git a/plugins/rabbitmq/src/plugin.cpp b/plugins/rabbitmq/src/plugin.cpp
index 79c91a06d9..bfe1d13c67 100644
--- a/plugins/rabbitmq/src/plugin.cpp
+++ b/plugins/rabbitmq/src/plugin.cpp
@@ -12,8 +12,8 @@
 #include <caf/expected.hpp>

-#include <rabbitmq-c/amqp.h>
-#include <rabbitmq-c/tcp_socket.h>
+#include <amqp.h>
+#include <amqp_tcp_socket.h>

 using namespace std::chrono_literals;
```
These are currently hard-coded for the consumer before I declare the queue:

```cpp
auto passive = amqp_boolean_t{0};
auto durable = amqp_boolean_t{0};
auto exclusive = amqp_boolean_t{0};
auto auto_delete = amqp_boolean_t{1};
// and before consume
auto no_local = amqp_boolean_t{0};
auto no_ack = amqp_boolean_t{1};
```

And this for the producer:

```cpp
auto mandatory = amqp_boolean_t{0};
auto immediate = amqp_boolean_t{0};
```

Which of those should I expose to the operator?
Yep, that's the way I've implemented it.
Okay, I'll make the URL an optional positional argument.
We can expose the list of headers simply in the schema layout.
I'd suggest all of them! They are required to implement various use cases in which the connecting client plays a specific role and needs to behave accordingly. It's surely fine to set the defaults as above (corresponding to a typical temporary consumer) but they should be adjustable IMHO. We'd also need a routing key for the consumer, in case one wants to bind to a topic exchange.
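Gathering the flags from the snippets above into one place, an options record with defaults matching the current hard-coded values (a typical temporary consumer) could look like this. The names are illustrative, not the operator's actual flags:

```cpp
#include <cassert>
#include <string>

// Sketch of an options record covering the flags discussed above.
// Defaults mirror the current hard-coded values; all of them would be
// adjustable by the operator.
struct amqp_options {
  // Queue declaration (consumer only):
  bool passive = false;
  bool durable = false;
  bool exclusive = false;
  bool auto_delete = true;
  // Consuming:
  bool no_local = false;
  bool no_ack = true;
  // Publishing (producer only):
  bool mandatory = false;
  bool immediate = false;
  // Binding to a topic exchange (consumer only):
  std::string routing_key; // empty = no explicit binding
};
```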
Great 👍🏻
The headers are, IIRC, indeed key-value pairs of strings. So that would work.
I have not yet been able to receive data from a local instance. The connection and channel are set up, but queue handling still seems to be a bit off; I'm not sure why yet. Also, if one is not using a temporary queue, one might not need to care about the exchange, since the binding has already been set up and one only needs the queue name. I'd suggest to:
I also tried:

```diff
diff --git a/plugins/rabbitmq/src/plugin.cpp b/plugins/rabbitmq/src/plugin.cpp
index 79c91a06d9..6584218b79 100644
--- a/plugins/rabbitmq/src/plugin.cpp
+++ b/plugins/rabbitmq/src/plugin.cpp
@@ -237,14 +237,14 @@ public:
   // TODO: need a better name for this function.
   auto consume(amqp_channel_t channel, std::string_view exchange,
                std::string_view queue) -> caf::error {
-    TENZIR_DEBUG("declaring queue");
+    TENZIR_DEBUG("declaring queue {}", queue);
     auto passive = amqp_boolean_t{0};
     auto durable = amqp_boolean_t{0};
     auto exclusive = amqp_boolean_t{0};
     auto auto_delete = amqp_boolean_t{1};
     auto arguments = amqp_empty_table;
     auto* declare
-      = amqp_queue_declare(conn_, channel, amqp_empty_bytes, passive, durable,
+      = amqp_queue_declare(conn_, channel, as_amqp_bytes(queue), passive, durable,
                            exclusive, auto_delete, arguments);
     if (auto err = to_error(amqp_get_rpc_reply(conn_)))
       return err;
```

but just got a segfault, apparently during error logging:
I hear you. I took this approach from the official example at https://github.com/alanxz/rabbitmq-c/blob/master/examples/amqp_consumer.c. It highly confused me as well. I thought "it's the way to do it," but I am not surprised it doesn't work. However, this worked for me locally after starting RabbitMQ:
I'm not quite sure what to make of it. At first I wanted to get the scaffold in place before going deeper. I guess it's now time to do that. :-)
@satta I exposed a bunch more options for saver and loader, plus added the ability to provide an optional URL. Mind taking a look at the Markdown file to see whether the new options work for you? I'll take a look at the routing-key vs. queue-name question next.
EDIT: I fixed the confusion of declared queues vs. routing keys. We now only allow setting routing keys for the publisher, and queue name plus routing key for the consumer. Moreover, the default queue name is the empty string, resulting in randomly generated queue names by the AMQP server.
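As background on why the consumer needs a routing key for topic exchanges: the AMQP server matches a message's routing key against the binding pattern, where `*` matches exactly one dot-separated word and `#` matches zero or more words. A small illustrative matcher — this logic lives in the server, not in the plugin; it's only here to make the semantics concrete:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a routing key or binding pattern into dot-separated words.
static std::vector<std::string> split_words(const std::string& key) {
  std::vector<std::string> words;
  std::stringstream ss{key};
  for (std::string word; std::getline(ss, word, '.');)
    words.push_back(word);
  return words;
}

// AMQP topic matching: '*' matches exactly one word, '#' matches zero
// or more words. Recursive formulation for clarity.
static bool match_words(const std::vector<std::string>& pattern, std::size_t p,
                        const std::vector<std::string>& key, std::size_t k) {
  if (p == pattern.size())
    return k == key.size();
  if (pattern[p] == "#")
    // '#' either consumes zero words, or one word while staying on '#'.
    return match_words(pattern, p + 1, key, k)
           || (k < key.size() && match_words(pattern, p, key, k + 1));
  if (k == key.size())
    return false;
  if (pattern[p] == "*" || pattern[p] == key[k])
    return match_words(pattern, p + 1, key, k + 1);
  return false;
}

bool topic_matches(const std::string& pattern, const std::string& key) {
  return match_words(split_words(pattern), 0, split_words(key), 0);
}
```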
@satta we have a problem with injecting headers into the loader: the reason is that the loader only forwards blocks of bytes to a parser. However, the headers are structured data that we can't simply add in the current framework. For example, if the payload is CSV or JSON, there might be a chance to simply add header fields. But what do you do if the payload is a PCAP file? It's simply not possible to inject structured data into a stream of bytes. I'm not quite sure how we can solve this.
I would need to understand how important this is. Then we should discuss the solution space with @dominiklohmann. The only option I currently see is to make it possible to pass structured, per-chunk metadata when we communicate between connector and format.
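One shape the per-chunk metadata idea could take: byte chunks that carry an optional key-value map alongside the raw payload, which the format could consult before interpreting the bytes. This is purely a sketch of the solution space, not Tenzir's actual chunk API, and the `content-encoding` header is a hypothetical example:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Sketch: a chunk of raw bytes plus out-of-band metadata, e.g., AMQP
// message headers (which are, per the discussion above, key-value
// pairs of strings).
struct chunk_with_metadata {
  std::vector<std::byte> data;
  std::map<std::string, std::string> metadata;
};

// Example policy a parser could apply: decide whether the payload
// needs decompression based on a (hypothetical) header.
bool needs_decompression(const chunk_with_metadata& chunk) {
  auto it = chunk.metadata.find("content-encoding");
  return it != chunk.metadata.end() && it->second == "gzip";
}
```

This sidesteps the PCAP problem: the structured values travel next to the byte stream instead of being injected into it.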
@tobim mind taking a quick look at the Nix build failure in CI? I can't make sense of the seemingly unrelated Perl linker errors.
That seems to be an error coming in via the `rabbitmq-c` documentation tooling; disabling `xmlto` for the static build fixes it:

```diff
diff --git a/nix/overlay.nix b/nix/overlay.nix
index 09fdf28ece..b77cebadf8 100644
--- a/nix/overlay.nix
+++ b/nix/overlay.nix
@@ -182,6 +182,13 @@ in {
     configureFlags = old.configureFlags ++ ["--enable-prof" "--enable-stats"];
     doCheck = !isStatic;
   });
+  rabbitmq-c =
+    if !isStatic
+    then prev.rabbitmq-c
+    else
+      prev.rabbitmq-c.override {
+        xmlto = null;
+      };
   tenzir-source = inputs.nix-filter.lib.filter {
     root = ./..;
     include = [
```
Co-authored-by: Tobias Mayer <tobim@fastmail.fm>
@satta there's a known issue with the AWS C++ SDK used in Arrow before version 13; can you double-check whether you're running the newest version of Arrow?
At least I'm on 13:
Disabling jemalloc didn't make a difference.
FYI regarding the segfault, removing
@Dakostu is there an upstream issue tracking this leak? |
@mavam @satta
For some reason, my static library doesn't get shown? Not available? Nonetheless, I was able to build this branch and launch it. The following pipeline works:

```shell
./tenzir 'from s3 sentinel-cogs/sentinel-s2-l2a-cogs/1/C/CV/2023/1/S2B_1CCV_20230101_0_L2A/tileinfo_metadata.json | write json'
```

```
[08:51:45.209] loaded configuration file: /home/dakostu/.config/tenzir/tenzir.yaml
{
  "path": "tiles/1/C/CV/2023/1/1/0",
  "timestamp": "2023-01-01T21:05:55.632000",
  "utmZone": 1,
(...)
```

@satta Just to make sure: did you install Arrow manually? What were the options for your Arrow installation?
I got the debs from their repo:
BTW, Arrow 14 seems to be available just now from the repo. Would it make sense to try that?
Our Dockerfile & CI on GitHub also use Debian packages to install Arrow (even the same version), and during automated tests the segfault does not happen. Something seems strange here.
OK I'll give it a try. Here we go:
Just in case it helps, here's a
I can't see anything that stands out, but I guess you know better what to expect ;)
Segfault still present:
FTR the issue was fixed by using a workaround in the FluentBit library, see fluent/fluent-bit#8011 |
I set up a basic RabbitMQ server and was able to send/receive events. I approve, with some comments left.
Force-pushed fa02274 to 3fbe680.
This PR adds a RabbitMQ connector, making it possible to produce and consume messages via AMQP.
Definition of Done
- `<url>` as optional positional argument
- `amqp` `--set` options (after merging the `fluent-bit` PR that factors the implementation)