## Runner Example

This example builds a `Configuration` from a Python dict, passes it to a `Runner`, and runs it against a local Kafka broker.


1. Prepare the Kafka cluster

In [2]:
%%bash
# restart the Kafka container from a clean state (down -v also removes its volumes)
docker compose -f ../../../../../examples/compose/docker-compose.yml down -v
docker compose -f ../../../../../examples/compose/docker-compose.yml up -d kafka
# create the topic; the admin client retries until the broker accepts connections
docker exec -i kafka /bin/bash -c "/opt/bitnami/kafka/bin/kafka-topics.sh --create --topic consumer --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1"
# give Kafka time to settle before producing
sleep 10

 Container kafka  Stopping
 Container kafka  Stopped
 Container kafka  Removing
 Container kafka  Removed
 Network compose_kafka  Removing
 Network compose_kafka  Removed
 Network compose_kafka  Creating
 Network compose_kafka  Created
 Container kafka  Creating
 Container kafka  Created
 Container kafka  Starting
 Container kafka  Started
[2025-10-02 13:55:24,107] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Node may not be available. (org.apache.kafka.clients.NetworkClient)
[2025-10-02 13:55:24,211] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Node may not be available. (org.apache.kafka.clients.NetworkClient)
[2025-10-02 13:55:24,312] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Node may not be available. (org.apache.kafka.clients.NetworkClient)
[2025-10-02 13:55:24,514]

Created topic consumer.
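
The fixed `sleep 10` above is a guess at how long the broker needs. As a minimal sketch, the wait could instead poll the broker port from Python. The helper name `wait_for_port` and its defaults are assumptions made for illustration and are not part of logprep or Kafka. A successful TCP connect only shows that the listener is up, so a short extra delay may still be needed.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only proves the listener is up,
            # not that the broker has finished initializing.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait and retry.
            time.sleep(interval)
    return False
```

Calling `wait_for_port("127.0.0.1", 9092)` before creating the topic would likely avoid most of the `Connection to node -1 ... could not be established` warnings shown above.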


2. Produce messages


In [3]:
%%bash
# produce 1 event to the Kafka topic "consumer"
docker exec -i kafka /bin/bash -c "echo '{\"message\": \"the message\"}' | /opt/bitnami/kafka/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic consumer"
# optional: show the events in the topic (uncomment to run)
# docker exec -i kafka /bin/bash -c "/opt/bitnami/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic consumer --from-beginning --max-messages 10"
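
To send several events without shelling into the container, the same JSON payload could be produced from Python. This is a sketch under assumptions: `encode_event` and `produce_events` are hypothetical helpers, and `producer` is assumed to be any object exposing `produce(topic, value=...)` and `flush()`, as `confluent_kafka.Producer` does.

```python
import json
from typing import Any, Iterable


def encode_event(event: dict[str, Any]) -> bytes:
    """Serialize one event as compact UTF-8 JSON, one message per event."""
    return json.dumps(event, separators=(",", ":")).encode("utf-8")


def produce_events(producer: Any, topic: str, events: Iterable[dict[str, Any]]) -> int:
    """Send each event to `topic` and return the number of events queued.

    `producer` is assumed to expose produce(topic, value=...) and flush(),
    as confluent_kafka.Producer does.
    """
    count = 0
    for event in events:
        producer.produce(topic, value=encode_event(event))
        count += 1
    producer.flush()  # block until queued messages are delivered
    return count
```

Assuming the `confluent_kafka` package is installed, `produce_events(Producer({"bootstrap.servers": "127.0.0.1:9092"}), "consumer", [{"message": "the message"}] * 3)` would queue three copies of the event used above.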




3. Configure and start the runner


In [5]:
from logprep.ng.runner import Runner
from logprep.ng.util.configuration import Configuration

def get_config():
    return {
        "process_count": 2,
        "pipeline": [
            {
                "processor_0": {
                    "type": "ng_generic_adder",
                    "rules": [
                        {
                            "filter": "*",
                            "generic_adder": {"add": {"event.tags": "generic added tag"}},
                        }
                    ],
                }
            },

            {
                "processor_1": {
                    "type": "ng_pseudonymizer",
                    "pubkey_analyst": "../../../../../examples/exampledata/rules/pseudonymizer/example_analyst_pub.pem",
                    "pubkey_depseudo": "../../../../../examples/exampledata/rules/pseudonymizer/example_depseudo_pub.pem",
                    "regex_mapping": "../../../../../examples/exampledata/rules/pseudonymizer/regex_mapping.yml",
                    "hash_salt": "a_secret_tasty_ingredient",
                    "outputs": [{"opensearch": "pseudonyms"}],
                    "rules": [
                        {
                            "filter": "user.name",
                            "pseudonymizer": {
                                "id": "pseudonymizer-1a3c69b2-5d54-4b6b-ab07-c7ddbea7917c",
                                "mapping": {"user.name": "RE_WHOLE_FIELD"},
                            },
                        }
                    ],
                    "max_cached_pseudonyms": 1000000,
                },
            },
        ],
        "input": {
            "kafka": {
                "type": "ng_confluentkafka_input",
                "topic": "consumer",
                "kafka_config": {
                    "bootstrap.servers": "127.0.0.1:9092",
                    "group.id": "cgroup3",
                    "enable.auto.commit": "true",
                    "auto.commit.interval.ms": "10000",
                    "enable.auto.offset.store": "false",
                    "queued.min.messages": "100000",
                    "queued.max.messages.kbytes": "65536",
                    "statistics.interval.ms": "60000"
                }
            },
        },
        "output": {
            "kafka": {
                "type": "ng_confluentkafka_output",
                "topic": "consumer",
                "flush_timeout": 300,
                "send_timeout": 0,
                "kafka_config": {
                    "bootstrap.servers": "127.0.0.1:9092"
                },
            },
        },
    }


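# build the configuration object from the dict and hand it to the runner
runner = Runner(configuration=Configuration(**get_config()))
# run() keeps processing until it is interrupted (here via KeyboardInterrupt)
runner.run()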

No error output configured.


KeyboardInterrupt: 
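
In the configuration above, the pseudonymizer sends its extra data to an output named `opensearch`, while the `output` section only defines `kafka`. Whether logprep reports this itself is not shown here, so the following is only a sketch of a pre-flight check. The helper `undefined_outputs` is hypothetical and works on the plain dict, not on logprep's `Configuration` object.

```python
from typing import Any


def undefined_outputs(config: dict[str, Any]) -> set[str]:
    """Return output names referenced by processors but missing from config['output']."""
    defined = set(config.get("output", {}))
    referenced: set[str] = set()
    for step in config.get("pipeline", []):
        # Each pipeline entry maps one processor name to its settings.
        for settings in step.values():
            for target in settings.get("outputs", []):
                # Each target maps an output name to a destination,
                # e.g. {"opensearch": "pseudonyms"}.
                referenced.update(target)
    return referenced - defined
```

Running `undefined_outputs(get_config())` before constructing the `Runner` lists any output names that would need to be added to the `output` section or renamed in the processor settings.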