Connector throughput decreased after setting a higher tasks.max #13399

Open
@aaronphilip

Query engine

No response

Question

After increasing tasks.max for our connector from 1 to 4, we saw no improvement in throughput despite the added parallelism. In fact, throughput appears to have decreased by a factor of 4. The source topic has 8 partitions.
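
To make the expected parallelism concrete, here is a minimal sketch of how the 8 partitions would spread across tasks. It assumes a simple round-robin split; the real assignment depends on the consumer partition assignor in use, which is not something we have checked. The function name `spread_partitions` is hypothetical and exists only for illustration.

```python
def spread_partitions(num_partitions: int, num_tasks: int) -> dict[int, list[int]]:
    """Assign partition ids to task ids round-robin (illustrative assumption only)."""
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")
    assignment: dict[int, list[int]] = {task: [] for task in range(num_tasks)}
    for partition in range(num_partitions):
        # Partition p goes to task p modulo the number of tasks.
        assignment[partition % num_tasks].append(partition)
    return assignment


if __name__ == "__main__":
    # Compare the two settings from this issue: tasks.max=1 and tasks.max=4.
    for tasks in (1, 4):
        for task, partitions in spread_partitions(8, tasks).items():
            print(f"tasks.max={tasks} task={task} partitions={partitions}")
```

Under that assumption each of the 4 tasks reads 2 partitions instead of one task reading all 8, which is why we expected throughput to go up rather than down.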

Given the discussion in #11818 and in the blog post at https://javaagile.blogspot.com/2025/05/iceberg-and-kafka-connect.html, is the recommendation to run only one task? If so, how do we scale the connector?

Here is the deployment for our connector.

apiVersion: platform.confluent.io/v1beta1
kind: Connector
metadata:
  name: <cluster name>
  namespace: <namespace>
spec:
  class: org.apache.iceberg.connect.IcebergSinkConnector
  configs:
    errors.deadletterqueue.context.headers.enable: "true"
    errors.deadletterqueue.topic.name: <dlq topic>
    errors.tolerance: all
    iceberg.catalog: iceberg
    iceberg.catalog.client.region: <region>
    iceberg.catalog.credential: <credentials>
    iceberg.catalog.header.X-Iceberg-Access-Delegation: "true"
    iceberg.catalog.scope: PRINCIPAL_ROLE:ALL
    iceberg.catalog.token-refresh-enabled: "true"
    iceberg.catalog.type: rest
    iceberg.catalog.uri: <catalog uri>
    iceberg.catalog.warehouse: <warehouse>
    iceberg.control.commit.interval-ms: "300000"
    iceberg.control.topic: <control topic>
    iceberg.tables: <table>
    iceberg.tables.auto-create-enabled: "true"
    iceberg.tables.evolve-schema-enabled: "true"
    iceberg.tables.schema-force-optional: "true"
    key.converter: org.apache.kafka.connect.converters.ByteArrayConverter
    topics: <topic>
    value.converter: io.confluent.connect.json.JsonSchemaConverter
    value.converter.auto.register.schemas: "false"
    value.converter.basic.auth.credentials.source: USER_INFO
    value.converter.basic.auth.user.info: <credentials> 
    value.converter.decimal.format: NUMERIC
    value.converter.schema.registry.url: <registry url>
    value.converter.schemas.enable: "false"
    value.converter.use.latest.version: "true"
    value.converter.use.optional.for.nonrequired: "true"
  connectClusterRef:
    name: <cluster name>
  taskMax: 4
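
For anyone reproducing this, one sanity check is to confirm that all four tasks are actually running. Below is a minimal sketch that uses the Kafka Connect REST endpoint `GET /connectors/{name}/status`. The base URL, the connector name, the helper names `fetch_status` and `summarize_task_states`, and the sample payload are all placeholders or assumptions, not values from our deployment.

```python
import json
from collections import Counter
from urllib.request import urlopen


def summarize_task_states(status: dict) -> Counter:
    """Count tasks per state in a connector status payload."""
    return Counter(task["state"] for task in status.get("tasks", []))


def fetch_status(base_url: str, connector: str) -> dict:
    """Fetch connector status from a Connect worker (base_url is a placeholder)."""
    with urlopen(f"{base_url}/connectors/{connector}/status") as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical payload in the shape returned by the status endpoint.
    sample = {
        "name": "example-connector",
        "connector": {"state": "RUNNING"},
        "tasks": [{"id": i, "state": "RUNNING"} for i in range(4)],
    }
    print(summarize_task_states(sample))
```

If fewer than four tasks report `RUNNING`, the throughput comparison is not measuring the configuration shown above.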

Labels: question
