Description
Query engine
No response
Question
Upon increasing the tasks.max for our connector from 1 to 4, we saw no improvement in connector performance despite the increased parallelism. In fact, it appears that throughput decreased by a factor of 4. The source topic has 8 partitions.
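For context, our expectation was that the 8 partitions would be split across the tasks, so each of the 4 tasks would consume 2 partitions instead of a single task consuming all 8. A rough sketch of that arithmetic, assuming the sink tasks behave like an evenly balanced consumer group (the helper name is hypothetical and not part of Kafka Connect or Iceberg):

```python
def partitions_per_task(num_partitions: int, num_tasks: int) -> list[int]:
    """Spread partitions over tasks as evenly as possible.

    Assumption: tasks share the topic like members of one balanced
    consumer group; the actual assignment depends on the configured assignor.
    """
    base, extra = divmod(num_partitions, num_tasks)
    # The first `extra` tasks take one additional partition each.
    return [base + 1 if i < extra else base for i in range(num_tasks)]


print(partitions_per_task(8, 1))  # tasks.max = 1
print(partitions_per_task(8, 4))  # tasks.max = 4
```

This only describes how we expected the read side to parallelize; it says nothing about how commits are coordinated across tasks, which may be where the slowdown comes from.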
Given the discussion in #11818 and in the blog https://javaagile.blogspot.com/2025/05/iceberg-and-kafka-connect.html, is the recommendation to only have one task? If so, how do we scale the connector?
Here is the deployment for our connector.
apiVersion: platform.confluent.io/v1beta1
kind: Connector
metadata:
  name: <cluster name>
  namespace: <namespace>
spec:
  class: org.apache.iceberg.connect.IcebergSinkConnector
  configs:
    errors.deadletterqueue.context.headers.enable: "true"
    errors.deadletterqueue.topic.name: <dlq topic>
    errors.tolerance: all
    iceberg.catalog: iceberg
    iceberg.catalog.client.region: <region>
    iceberg.catalog.credential: <credentials>
    iceberg.catalog.header.X-Iceberg-Access-Delegation: "true"
    iceberg.catalog.scope: PRINCIPAL_ROLE:ALL
    iceberg.catalog.token-refresh-enabled: "true"
    iceberg.catalog.type: rest
    iceberg.catalog.uri: <catalog uri>
    iceberg.catalog.warehouse: <warehouse>
    iceberg.control.commit.interval-ms: "300000"
    iceberg.control.topic: <control topic>
    iceberg.tables: <table>
    iceberg.tables.auto-create-enabled: "true"
    iceberg.tables.evolve-schema-enabled: "true"
    iceberg.tables.schema-force-optional: "true"
    key.converter: org.apache.kafka.connect.converters.ByteArrayConverter
    topics: <topic>
    value.converter: io.confluent.connect.json.JsonSchemaConverter
    value.converter.auto.register.schemas: "false"
    value.converter.basic.auth.credentials.source: USER_INFO
    value.converter.basic.auth.user.info: <credentials>
    value.converter.decimal.format: NUMERIC
    value.converter.schema.registry.url: <registry url>
    value.converter.schemas.enable: "false"
    value.converter.use.latest.version: "true"
    value.converter.use.optional.for.nonrequired: "true"
  connectClusterRef:
    name: <cluster name>
  taskMax: 4