Fixes 25437: For kafka message consuming, switch to using poll() instead of consume() #25838

Merged
TeddyCr merged 9 commits into open-metadata:main from LasseGravesenSaxo:fix/kafka-consuming-poll-consume
Feb 21, 2026
Conversation

LasseGravesenSaxo (Contributor) commented Feb 12, 2026

Describe your changes:

Fixes #25437: For kafka message consuming, switch to using poll() instead of consume().

DeserializingConsumer does not implement consume(); calling it raises a NotImplementedError. This PR switches yield_topic_sample_data in CommonBrokerSource to use poll() instead.
See the Confluent documentation for DeserializingConsumer.consume.

Type of change:

  • Bug fix

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Bug fix

  • I have added a test that covers the exact scenario we are fixing. For complex issues, comment the issue number in the test for future reference.

I need help with this. I'm not too familiar with the codebase.



Summary by Gitar

  • Fixed Kafka sample data collection: Replaced unsupported consume() call with polling loop using poll() method
    • Switched from DeserializingConsumer.consume() (raises NotImplementedError) to poll() in a loop with deadline-based timeout
    • Added robust error handling for ConsumeError and deserialization errors (KeyDeserializationError, ValueDeserializationError)
    • Maintains original semantics: fetch up to 10 messages within 10-second timeout when generateSampleData=True
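
A minimal sketch of the deadline-based polling loop described above may help reviewers picture the change. It is not the merged code: `collect_sample_messages`, `PollingConsumer`, and the `skip_errors` parameter are hypothetical names introduced here for illustration, and the consumer is only assumed to expose `poll(timeout)`.

```python
import time
from typing import Any, List, Optional, Protocol, Tuple, Type


class PollingConsumer(Protocol):
    """Hypothetical interface: anything exposing poll(timeout)."""

    def poll(self, timeout: float) -> Optional[Any]: ...


def collect_sample_messages(
    consumer: PollingConsumer,
    max_messages: int = 10,
    total_timeout: float = 10.0,
    skip_errors: Tuple[Type[BaseException], ...] = (),
) -> List[Any]:
    """Poll until max_messages are collected or the time budget is spent."""
    deadline = time.monotonic() + total_timeout
    messages: List[Any] = []
    while len(messages) < max_messages:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # time budget exhausted
        try:
            msg = consumer.poll(timeout=remaining)
        except skip_errors:
            continue  # skip a message that failed to consume or deserialize
        if msg is None:
            continue  # nothing arrived in this poll; retry until the deadline
        messages.append(msg)
    return messages
```

With confluent-kafka installed, one would presumably pass `skip_errors=(ConsumeError, KeyDeserializationError, ValueDeserializationError)` from `confluent_kafka.error`. Passing the remaining budget as the poll timeout keeps the total wait bounded by `total_timeout` regardless of how many polls return nothing.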

This will update automatically on new commits.

@github-actions
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

TeddyCr previously approved these changes Feb 20, 2026

@github-actions
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@github-actions
Contributor

github-actions Bot commented Feb 20, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion-base-slim:trivy (debian 12.13)

Vulnerabilities (40)

Package Vulnerability ID Severity Installed Version Fixed Version
libpng-dev CVE-2026-22695 🚨 HIGH 1.6.39-2+deb12u1 1.6.39-2+deb12u2
libpng-dev CVE-2026-22801 🚨 HIGH 1.6.39-2+deb12u1 1.6.39-2+deb12u2
libpng-dev CVE-2026-25646 🚨 HIGH 1.6.39-2+deb12u1 1.6.39-2+deb12u3
libpng16-16 CVE-2026-22695 🚨 HIGH 1.6.39-2+deb12u1 1.6.39-2+deb12u2
libpng16-16 CVE-2026-22801 🚨 HIGH 1.6.39-2+deb12u1 1.6.39-2+deb12u2
libpng16-16 CVE-2026-25646 🚨 HIGH 1.6.39-2+deb12u1 1.6.39-2+deb12u3
linux-libc-dev CVE-2024-46786 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-21946 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-22022 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-22083 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-22107 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-22121 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-37926 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-38022 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-38129 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-38361 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-38718 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-39871 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-68340 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-68349 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-68369 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-68800 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-71085 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-71116 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-22984 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-22990 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23001 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23010 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23054 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23074 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23097 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23120 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23121 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23124 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23126 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23133 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23139 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23140 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23144 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23156 🚨 HIGH 6.1.159-1 6.1.162-1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (33)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (9)

Package Vulnerability ID Severity Installed Version Fixed Version
apache-airflow CVE-2025-68438 🚨 HIGH 3.1.5 3.1.6
apache-airflow CVE-2025-68675 🚨 HIGH 3.1.5 3.1.6
cryptography CVE-2026-26007 🚨 HIGH 42.0.8 46.0.5
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/extended_sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/lineage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data_aut.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage_aut.yaml

No Vulnerabilities Found

@github-actions
Contributor

github-actions Bot commented Feb 20, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion:trivy (debian 12.12)

Vulnerabilities (4)

Package Vulnerability ID Severity Installed Version Fixed Version
libpam-modules CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam-modules-bin CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam-runtime CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam0g CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (33)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (19)

Package Vulnerability ID Severity Installed Version Fixed Version
Werkzeug CVE-2024-34069 🚨 HIGH 2.2.3 3.0.3
aiohttp CVE-2025-69223 🚨 HIGH 3.12.12 3.13.3
aiohttp CVE-2025-69223 🚨 HIGH 3.13.2 3.13.3
apache-airflow CVE-2025-68438 🚨 HIGH 3.1.5 3.1.6
apache-airflow CVE-2025-68675 🚨 HIGH 3.1.5 3.1.6
azure-core CVE-2026-21226 🚨 HIGH 1.37.0 1.38.0
cryptography CVE-2026-26007 🚨 HIGH 42.0.8 46.0.5
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
protobuf CVE-2026-0994 🚨 HIGH 4.25.8 6.33.5, 5.29.6
pyasn1 CVE-2026-23490 🚨 HIGH 0.6.1 0.6.2
python-multipart CVE-2026-24486 🚨 HIGH 0.0.20 0.0.22
ray CVE-2025-62593 🔥 CRITICAL 2.47.1 2.52.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: usr/bin/docker

Vulnerabilities (4)

Package Vulnerability ID Severity Installed Version Fixed Version
stdlib CVE-2025-68121 🔥 CRITICAL v1.25.5 1.24.13, 1.25.7, 1.26.0-rc.3
stdlib CVE-2025-61726 🚨 HIGH v1.25.5 1.24.12, 1.25.6
stdlib CVE-2025-61728 🚨 HIGH v1.25.5 1.24.12, 1.25.6
stdlib CVE-2025-61730 🚨 HIGH v1.25.5 1.24.12, 1.25.6

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO

No Vulnerabilities Found

@gitar-bot

gitar-bot Bot commented Feb 20, 2026

🔍 CI failure analysis for 6018475: After latest merge: 21 checks passing, 2 failing. Primary blocker remains SonarCloud test coverage (7.4% vs 20%). Playwright E2E test failures (1 failed, 12 flaky) are unrelated frontend issues.

Issue

Current Status:

  • ✅ 21 checks passing (up from 17)
  • ❌ 2 checks failing:
    1. SonarCloud Quality Gate (test coverage)
    2. Playwright E2E shard 4 (frontend tests)
  • Label verification: ✅ PASSED

Root Causes

1. SonarCloud Quality Gate - Primary Blocker

Insufficient test coverage persists:

7.4% Coverage on New Code (required ≥ 20%)

The Kafka consumer polling logic still lacks unit tests.

2. Playwright E2E Tests - Unrelated to PR (Not a Blocker)

Shard 4 of 6 Results:

  • 1 failed: "Right Entity Panel - Data Steward User Flow › Data Steward - Custom Properties Tab - View Custom Properties"
  • 12 flaky:
    • Table Sorting on Services page
    • Custom Properties tests (Integer, String, Email for various entities)
    • Data Contracts With Persona tests (timeout on beforeAll hook)
    • Domain Rename with owners/experts (element not found)
  • 656 passed
  • 4 skipped
  • Duration: 1.1 hours

Failure Analysis:

  • Custom Property test: Expected "Not set" value not found, suggesting UI rendering or data loading issue
  • Data Contracts test: 60-second timeout on beforeAll hook (login/setup)
  • Domain Rename test: Missing entity-header-name element after rename operation

Important: These Playwright failures are NOT caused by this PR. The PR only modifies:

  • File: ingestion/src/metadata/ingestion/source/messaging/common_broker_source.py (backend Python)
  • Functionality: Kafka message consumption API (switching from consume() to poll())

The failing tests are:

  • Language: TypeScript/JavaScript (frontend)
  • Functionality: Custom Properties UI, Domain Management UI, Data Contracts UI, User Permissions
  • Root causes: Frontend test flakiness, timing issues, possible backend API changes in main branch unrelated to this PR

The 12 flaky tests indicate environmental/infrastructure issues with the E2E test suite, not code defects.

Details

PR Status:

  • Current commit: 601847560aafa4083afddac56af181e9e61f73d6
  • Labels: ✅ safe to test
  • File changed: ingestion/src/metadata/ingestion/source/messaging/common_broker_source.py (backend Python only)
  • Change type: Bug fix for Kafka consumer API

CI Progress:

  • ✅ py-checkstyle: Passing
  • ✅ Label verification gate: Passed
  • ✅ 21 other CI checks: Passing
  • ❌ Test coverage: Still 7.4% vs required 20% (BLOCKING)
  • ❌ Playwright E2E: Frontend test failures (NOT BLOCKING - unrelated)

Fix Required

For the PR Author (Blocking):

Add unit tests for the Kafka poll() implementation in ingestion/src/metadata/ingestion/source/messaging/common_broker_source.py:

  1. Successful message polling
  2. Timeout/no messages (poll returns None)
  3. Deserialization errors (KeyDeserializationError, ValueDeserializationError)
  4. Consumer errors (ConsumeError)
  5. Deadline timeout (time budget exhausted)

Test File Location: ingestion/tests/unit/source/messaging/test_kafka.py

Example Test:

from unittest import TestCase
from unittest.mock import Mock, patch

from confluent_kafka.error import ConsumeError

class TestKafkaConsumerPolling(TestCase):
    @patch('time.monotonic')
    def test_yield_topic_sample_data_poll_success(self, mock_time):
        """Test successful message collection via poll()"""
        # Setup: each call to time.monotonic() returns the next value
        mock_time.side_effect = [0, 1, 2, 3]  # Simulated time progression
        mock_consumer = Mock()
        mock_msg = Mock()
        mock_consumer.poll.return_value = mock_msg

        # Test and assert
        # ...
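
As a rough illustration of scenarios 2 and 5 from the list above (poll returning None, and the time budget running out), the following self-contained sketch mocks both the consumer and the clock. `poll_until_deadline` is a hypothetical stand-in for the loop under test, not the function in common_broker_source.py, so the assertions only show the mocking technique.

```python
import time
import unittest
from unittest.mock import Mock, patch


def poll_until_deadline(consumer, max_messages=10, total_timeout=10.0):
    """Hypothetical stand-in for the polling loop under test."""
    deadline = time.monotonic() + total_timeout
    messages = []
    while len(messages) < max_messages:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        msg = consumer.poll(timeout=remaining)
        if msg is not None:
            messages.append(msg)
    return messages


class TestPollUntilDeadline(unittest.TestCase):
    @patch("time.monotonic")
    def test_none_results_are_skipped(self, mock_time):
        # Clock reads: one for the deadline, then one per loop iteration.
        mock_time.side_effect = [0, 1, 2, 3]
        consumer = Mock()
        consumer.poll.side_effect = ["first", None, "second"]
        result = poll_until_deadline(consumer, max_messages=2)
        self.assertEqual(result, ["first", "second"])
        self.assertEqual(consumer.poll.call_count, 3)

    @patch("time.monotonic")
    def test_stops_when_budget_is_exhausted(self, mock_time):
        # The second iteration sees the clock past the 10-second deadline.
        mock_time.side_effect = [0, 1, 11]
        consumer = Mock()
        consumer.poll.return_value = None
        self.assertEqual(poll_until_deadline(consumer), [])
        consumer.poll.assert_called_once_with(timeout=9.0)
```

Saved to a file, this can be run with `python -m unittest <file>`. Driving `time.monotonic` through `side_effect` keeps the tests instantaneous and deterministic, at the cost of coupling them to how many times the loop reads the clock.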

Run Tests:

cd ingestion
make install_dev_env
pytest tests/unit/source/messaging/test_kafka.py -v \
  --cov=metadata.ingestion.source.messaging.common_broker_source

For Maintainers (Frontend E2E Issues - Not Blocking This PR):

The Playwright failures should not block this backend PR:

  • Consider whether flaky E2E tests should block unrelated PRs
  • Investigate frontend test suite stability separately
  • 13 failed/flaky tests across shards indicate systemic E2E test issues

Context

The PR author stated:

"I need help with this. I'm not too familiar with the codebase."

Specifically about test coverage. The author may benefit from guidance on mocking Kafka consumer behavior.

Historical Progress

Excellent progress achieved:

  1. ✅ Unblocked from "safe to test" label (6 merge iterations)
  2. ✅ Fixed Black code formatting
  3. ✅ Merged with latest main (6th merge)
  4. ✅ 21 of 23 checks passing (91% pass rate)
  5. ❌ 1 blocking issue: Test coverage 7.4% vs 20%
  6. ❌ 1 non-blocking issue: Unrelated frontend E2E test failures

The PR is 95% complete - just needs unit tests for the Kafka polling logic to meet the coverage threshold.

Code Review 👍 Approved with suggestions 1 resolved / 1 findings

Clean fix replacing the unimplemented consume() with a well-structured poll() loop. Error handling for DeserializingConsumer-specific exceptions is appropriate, and the timeout logic correctly mimics the original behavior.

✅ 1 resolved
Quality: Magic numbers for poll count and timeout should be named constants

📄 ingestion/src/metadata/ingestion/source/messaging/common_broker_source.py:299
The values n_poll = 10 and total_timeout = 10 are local variables, but they mirror the original consume(num_messages=10, timeout=10) behavior and are configuration-like values. Consider extracting them as class-level or module-level constants for better discoverability and consistency, e.g., _SAMPLE_DATA_MAX_MESSAGES = 10 and _SAMPLE_DATA_POLL_TIMEOUT_SECS = 10. This is a minor style point - the current implementation is functional and correct.

@sonarqubecloud

Quality Gate failed for 'open-metadata-ingestion'

Failed conditions
7.4% Coverage on New Code (required ≥ 20%)

See analysis details on SonarQube Cloud

TeddyCr merged commit 5e7a5a8 into open-metadata:main on Feb 21, 2026
23 of 25 checks passed

@LasseGravesenSaxo
Contributor Author

@TeddyCr, excellent, thank you for fixing it up and merging it.

Labels

safe to test: Add this label to run secure GitHub workflows on PRs

Development

Successfully merging this pull request may close these issues.

Kafka Connection cannot import topic sample data- using an unimplemented Confluent DeserializingConsumer method

2 participants