
Fix Trivy scans #24867

Merged
pmbrull merged 15 commits into main from fix-trivy-scans
Dec 19, 2025


SumanMaharana commented Dec 17, 2025

Security CVE Fixes

Summary

This PR addresses 10 CVEs reported from Docker Scout and Trivy scans.

Phase 1: Direct Dependencies

| CVE | Component | Severity | Status | Action |
|---|---|---|---|---|
| CVE-2024-47561 | Apache Avro | 9.2 Critical | Fixed | Updated to >=1.11.4 |
| CVE-2023-44981 | Apache ZooKeeper | 9.1 Critical | N/A | Not a direct dependency |
| CVE-2025-58367 | DeepDiff | 10.0 Critical | Already Safe | Resolves to 8.6.1 |
| CVE-2025-58068 | Eventlet | 9.1 Critical | Already Safe | Airflow 3.1.2 has 0.40.3 |

Phase 2: Airflow Constraint Updates

| CVE | Component | Severity | Status | Action |
|---|---|---|---|---|
| CVE-2025-62727 | starlette | HIGH | Fixed | Updated 0.48.0 → 0.49.1 |
| CVE-2025-66418 | urllib3 | HIGH | Fixed | Updated 2.5.0 → 2.6.0 |
| CVE-2025-66471 | urllib3 | HIGH | Fixed | Updated 2.5.0 → 2.6.0 |
| CVE-2024-34069 | Werkzeug | HIGH | Cannot Fix | Flask 2.2.5 incompatible with 3.x |
| CVE-2025-62593 | ray | CRITICAL | Fixed | Updated 2.47.1 → 2.52.0 |

Why Airflow 3.x Helps

Upgrading to Airflow 3.1.2 already addresses several vulnerabilities out-of-the-box:

| CVE | Package | Airflow 3.1.2 Status |
|---|---|---|
| CVE-2025-58068 | Eventlet | Already patched (0.40.3) |
| CVE-2025-58367 | DeepDiff | Already safe (8.6.1 via transitive deps) |

However, we still needed to update the constraint file for:

  • starlette, urllib3, ray - minor/patch version bumps for security fixes
  • Werkzeug - blocked by Flask 2.2.5 dependency (requires Flask 3.x)

Files Changed

| File | Change |
|---|---|
| ingestion/setup.py | Updated Avro version constraint |
| ingestion/airflow-constraints-3.1.2.txt | Updated starlette, urllib3, ray versions |

Changes Made

CVE-2024-47561 - Apache Avro RCE (Fixed)

Vulnerability: Schema parsing in Apache Avro Java SDK allows arbitrary code execution when reading Avro data.

File Changed: ingestion/setup.py

- "avro": "avro>=1.11.3,<1.12",
+ "avro": "avro>=1.11.4,<1.12",
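As a rough illustration of what the tightened specifier accepts, the sketch below checks plain dotted versions against `>=1.11.4,<1.12` using only the standard library. The helper names `parse_version` and `satisfies_avro_pin` are hypothetical and not part of this PR; real resolution is done by pip, which also handles pre-releases and other version forms that this sketch ignores.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a plain dotted release such as '1.11.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_avro_pin(version: str) -> bool:
    """Return True if `version` falls inside >=1.11.4,<1.12 (hypothetical helper)."""
    parsed = parse_version(version)
    # Tuples compare element by element, which matches numeric release ordering.
    return (1, 11, 4) <= parsed < (1, 12)


for candidate in ("1.11.3", "1.11.4", "1.12.0"):
    print(candidate, satisfies_avro_pin(candidate))
```

Only 1.11.4 passes here: 1.11.3 is below the new floor and 1.12.0 is excluded by the unchanged upper bound.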


No Changes Required

CVE-2023-44981 - Apache ZooKeeper (Not Applicable)

Vulnerability: Authorization bypass in SASL Quorum Peer authentication.

Status: ZooKeeper is not a direct dependency of OpenMetadata.

Analysis:

  • No zookeeper dependency found in any pom.xml or Python requirements
  • OpenMetadata uses confluent-kafka Python client (not the Kafka/ZooKeeper server)
  • The CVE only affects ZooKeeper servers with quorum.auth.enableSasl=true (disabled by default)
  • OpenMetadata does not run a ZooKeeper server - it's a metadata platform

Conclusion: False positive. If flagged by scanner, it's from the Apache Airflow base Docker image or transitive dependencies, not from OpenMetadata code.
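The manifest search described in the analysis could be reproduced with something like the sketch below. It is an assumption about how such a check might be done, not the procedure actually used for this PR; the function names and the default file patterns are hypothetical. A search like this only sees declared dependencies, so it cannot rule out packages that arrive transitively or through the base image.

```python
import re
from pathlib import Path


def mentions_dependency(text: str, name: str) -> bool:
    """Case-insensitive whole-word search for a package or artifact name."""
    return re.search(rf"\b{re.escape(name)}\b", text, re.IGNORECASE) is not None


def files_mentioning(
    root: Path,
    name: str,
    patterns: tuple[str, ...] = ("pom.xml", "*requirements*.txt", "setup.py"),
) -> list[Path]:
    """List manifest files under `root` whose contents mention `name`."""
    hits = []
    for pattern in patterns:
        for path in root.rglob(pattern):
            if path.is_file() and mentions_dependency(
                path.read_text(errors="ignore"), name
            ):
                hits.append(path)
    return sorted(hits)
```

Calling `files_mentioning(Path("."), "zookeeper")` from a repository checkout would return an empty list if no matching manifest declares it.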


CVE-2025-58367 - DeepDiff (Already Safe)

Vulnerability: Class pollution via Delta class enables Pickle deserialization RCE.

Status: DeepDiff is a transitive dependency via dbt-common and collate-data-diff.

Current Version: Resolves to 8.6.1 (the patched version)

No action required - dependency resolution already pulls in the fixed version.
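One way to confirm what actually resolves in a given environment is to ask the standard library's `importlib.metadata`. The sketch below is illustrative only: `installed_version` and `is_at_least` are hypothetical helpers, and the comparison looks only at the leading numeric release, so pre-release suffixes are ignored.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


def release_tuple(text: str) -> tuple[int, ...]:
    """Extract the leading numeric release, e.g. '8.6.1' -> (8, 6, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_at_least(installed: str, minimum: str) -> bool:
    """Compare two versions by their numeric release segments only."""
    return release_tuple(installed) >= release_tuple(minimum)


found = installed_version("deepdiff")
if found is None:
    print("deepdiff is not installed in this environment")
else:
    print(f"deepdiff {found}: patched={is_at_least(found, '8.6.1')}")
```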


CVE-2025-58068 - Eventlet (Already Safe)

Vulnerability: HTTP Request Smuggling in WSGI parser due to improper trailer handling.

Status: The Airflow 3.1.2 constraints already include the patched version.

Current Version: eventlet==0.40.3 in airflow-constraints-3.1.2.txt

No action required - already using patched version.


Phase 2 Changes - Airflow Constraints

File Changed: ingestion/airflow-constraints-3.1.2.txt

CVE-2025-62727 - Starlette Request Smuggling (Fixed)

Vulnerability: HTTP request smuggling via malformed requests in Starlette.

- starlette==0.48.0
+ starlette==0.49.1

Compatibility: Verified safe - no breaking API changes, minor version bump.


CVE-2025-66418 & CVE-2025-66471 - urllib3 DoS (Fixed)

Vulnerability: Denial of Service via unbounded decompression chain and resource consumption.

- urllib3==2.5.0
+ urllib3==2.6.0

Compatibility: Verified safe - urllib3.util.Url class API unchanged in 2.6.0. Only usage in codebase is in ingestion/src/metadata/ingestion/source/database/deltalake/clients/s3.py.


CVE-2024-34069 - Werkzeug CSRF in Debugger (Cannot Fix)

Vulnerability: CSRF vulnerability in Werkzeug debugger allows arbitrary code execution if debugger is exposed.

Status: CANNOT UPDATE - Airflow 3.1.2 uses Flask 2.2.5, which requires Werkzeug < 3.0

Compatibility Analysis:

  • Flask 2.2.x requires Werkzeug >= 2.2, < 3.0
  • Flask 3.0+ is required for Werkzeug 3.x support
  • Updating Werkzeug to 3.0.3 would break Airflow

Risk Mitigation:

  • LOW RISK in production: The Werkzeug debugger should never be enabled in production
  • Only exploitable if debug mode is enabled AND attacker can reach the debugger
  • Ensure FLASK_DEBUG=0 and FLASK_ENV=production in all deployments
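The last mitigation could be enforced with a small startup guard along these lines. This is a sketch under assumptions: `debug_requested` and `assert_debugger_disabled` are hypothetical names, and the interpretation of `FLASK_DEBUG` and `FLASK_ENV` is loosely modeled on Flask 2.2 behavior, so it should be checked against the Flask version actually deployed.

```python
import os
from typing import Mapping

# Values treated as "debug off"; anything else non-empty counts as "on".
FALSY = {"0", "false", "no"}


def debug_requested(env: Mapping[str, str]) -> bool:
    """Return True if the environment appears to enable Flask debug mode."""
    value = env.get("FLASK_DEBUG", "")
    if value:
        return value.strip().lower() not in FALSY
    # Assumption: with FLASK_DEBUG unset, FLASK_ENV=development implies debug.
    return env.get("FLASK_ENV", "production").strip().lower() == "development"


def assert_debugger_disabled(env: Mapping[str, str] = os.environ) -> None:
    """Fail fast instead of starting with the Werkzeug debugger reachable."""
    if debug_requested(env):
        raise RuntimeError("Refusing to start: Flask debug mode is enabled")
```

Calling `assert_debugger_disabled()` early in a container entrypoint would turn a misconfiguration into a startup failure rather than an exposed debugger.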

Future Fix: Requires waiting for Apache Airflow to upgrade to Flask 3.x


CVE-2025-62593 - Ray Arbitrary Code Execution (Fixed)

Vulnerability: Ray allows arbitrary code execution via insecure deserialization or exposed dashboard.

- ray==2.47.1
+ ray==2.52.0

Compatibility: Verified safe - Ray is not directly used in OpenMetadata ingestion code, only as an Airflow dependency for distributed task execution.
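To guard against the constraint pins drifting back, a check like the following could compare a constraints file against the patched versions listed in this description. It is a minimal sketch: `parse_constraints` and `stale_pins` are hypothetical helpers, the parser only understands simple `name==version` lines, and the sample text is made up for illustration rather than copied from the real constraints file.

```python
# Patched versions as listed in the Phase 2 section of this PR description.
EXPECTED_PINS = {"starlette": "0.49.1", "urllib3": "2.6.0", "ray": "2.52.0"}


def parse_constraints(text: str) -> dict[str, str]:
    """Collect `name==version` pins, ignoring comments and environment markers."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue
        name, _, pinned = line.partition("==")
        pins[name.strip().lower()] = pinned.strip()
    return pins


def stale_pins(text: str, expected: dict[str, str] = EXPECTED_PINS) -> dict:
    """Map each package whose pin differs to (found, expected)."""
    pins = parse_constraints(text)
    return {
        name: (pins.get(name), wanted)
        for name, wanted in expected.items()
        if pins.get(name) != wanted
    }


sample = """\
# made-up excerpt in the style of a pip constraints file
starlette==0.49.1
urllib3==2.5.0
ray==2.52.0
"""
print(stale_pins(sample))  # {'urllib3': ('2.5.0', '2.6.0')}
```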


Testing

  1. Running cd ingestion && make install_dev_env
  2. Running make unit_ingestion to ensure tests pass
  3. Rebuilding Docker images and re-running Trivy/Docker Scout scans

Related Issues

  • Reported via Office Hours + OSS community feedback
  • Detected by Docker Scout and Trivy scans on PRs


github-actions Bot commented Dec 17, 2025

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion:trivy (debian 12.12)

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (33)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.12.7 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.13.4 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42003 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4.2 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42004 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4 |
| com.google.code.gson:gson | CVE-2022-25647 | 🚨 HIGH | 2.2.4 | 2.8.9 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.3.0 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.3.0 | 3.25.5, 4.27.5, 4.28.2 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.7.1 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.7.1 | 3.25.5, 4.27.5, 4.28.2 |
| com.nimbusds:nimbus-jose-jwt | CVE-2023-52428 | 🚨 HIGH | 9.8.1 | 9.37.2 |
| com.squareup.okhttp3:okhttp | CVE-2021-0341 | 🚨 HIGH | 3.12.12 | 4.9.2 |
| commons-beanutils:commons-beanutils | CVE-2025-48734 | 🚨 HIGH | 1.9.4 | 1.11.0 |
| commons-io:commons-io | CVE-2024-47554 | 🚨 HIGH | 2.8.0 | 2.14.0 |
| dnsjava:dnsjava | CVE-2024-25638 | 🚨 HIGH | 2.1.7 | 3.6.0 |
| io.netty:netty-codec-http2 | CVE-2025-55163 | 🚨 HIGH | 4.1.96.Final | 4.2.4.Final, 4.1.124.Final |
| io.netty:netty-codec-http2 | GHSA-xpw8-rcwv-8f8p | 🚨 HIGH | 4.1.96.Final | 4.1.100.Final |
| io.netty:netty-handler | CVE-2025-24970 | 🚨 HIGH | 4.1.96.Final | 4.1.118.Final |
| net.minidev:json-smart | CVE-2021-31684 | 🚨 HIGH | 1.3.2 | 1.3.3, 2.4.4 |
| net.minidev:json-smart | CVE-2023-1370 | 🚨 HIGH | 1.3.2 | 2.4.9 |
| org.apache.avro:avro | CVE-2024-47561 | 🔥 CRITICAL | 1.7.7 | 1.11.4 |
| org.apache.avro:avro | CVE-2023-39410 | 🚨 HIGH | 1.7.7 | 1.11.3 |
| org.apache.derby:derby | CVE-2022-46337 | 🔥 CRITICAL | 10.14.2.0 | 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0 |
| org.apache.ivy:ivy | CVE-2022-46751 | 🚨 HIGH | 2.5.1 | 2.5.2 |
| org.apache.mesos:mesos | CVE-2018-1330 | 🚨 HIGH | 1.4.3 | 1.6.0 |
| org.apache.thrift:libthrift | CVE-2019-0205 | 🚨 HIGH | 0.12.0 | 0.13.0 |
| org.apache.thrift:libthrift | CVE-2020-13949 | 🚨 HIGH | 0.12.0 | 0.14.0 |
| org.apache.zookeeper:zookeeper | CVE-2023-44981 | 🔥 CRITICAL | 3.6.3 | 3.7.2, 3.8.3, 3.9.1 |
| org.eclipse.jetty:jetty-server | CVE-2024-13009 | 🚨 HIGH | 9.4.56.v20240826 | 9.4.57.v20241219 |
| org.lz4:lz4-java | CVE-2025-12183 | 🚨 HIGH | 1.8.0 | 1.8.1 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (6)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
| Werkzeug | CVE-2024-34069 | 🚨 HIGH | 2.2.3 | 3.0.3 |
| deepdiff | CVE-2025-58367 | 🔥 CRITICAL | 7.0.1 | 8.6.1 |
| ray | CVE-2025-62593 | 🔥 CRITICAL | 2.47.1 | 2.52.0 |
| starlette | CVE-2025-62727 | 🚨 HIGH | 0.48.0 | 0.49.1 |
| urllib3 | CVE-2025-66418 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2025-66471 | 🚨 HIGH | 1.26.20 | 2.6.0 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO

No Vulnerabilities Found
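For triage, findings like the ones above can also be filtered programmatically from a JSON report. The sketch below assumes a report shaped like the output of `trivy image --format json`, with `Results`, `Vulnerabilities`, `PkgName`, `VulnerabilityID`, `Severity`, `InstalledVersion` and `FixedVersion` keys; those field names should be verified against a real report before relying on them. `fixable_findings` is a hypothetical helper, and the sample entry reuses the starlette row from the Python table.

```python
import json


def fixable_findings(report: dict, severities=("HIGH", "CRITICAL")) -> list[tuple]:
    """Return (target, package, id, installed, fixed) for findings with a fix."""
    rows = []
    for result in report.get("Results", []):
        # Targets without findings may omit the key or set it to null.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in severities and vuln.get("FixedVersion"):
                rows.append((
                    result.get("Target", ""),
                    vuln.get("PkgName", ""),
                    vuln.get("VulnerabilityID", ""),
                    vuln.get("InstalledVersion", ""),
                    vuln["FixedVersion"],
                ))
    return rows


sample_report = json.loads("""
{"Results": [
  {"Target": "Python", "Vulnerabilities": [
    {"PkgName": "starlette", "VulnerabilityID": "CVE-2025-62727",
     "Severity": "HIGH", "InstalledVersion": "0.48.0", "FixedVersion": "0.49.1"}
  ]},
  {"Target": "Node.js"}
]}
""")
for row in fixable_findings(sample_report):
    print(" | ".join(row))
```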


github-actions Bot commented Dec 17, 2025

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion-base-slim:trivy (debian 12.12)

Vulnerabilities (6)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
| libpng-dev | CVE-2025-64720 | 🚨 HIGH | 1.6.39-2 | 1.6.39-2+deb12u1 |
| libpng-dev | CVE-2025-65018 | 🚨 HIGH | 1.6.39-2 | 1.6.39-2+deb12u1 |
| libpng-dev | CVE-2025-66293 | 🚨 HIGH | 1.6.39-2 | 1.6.39-2+deb12u1 |
| libpng16-16 | CVE-2025-64720 | 🚨 HIGH | 1.6.39-2 | 1.6.39-2+deb12u1 |
| libpng16-16 | CVE-2025-65018 | 🚨 HIGH | 1.6.39-2 | 1.6.39-2+deb12u1 |
| libpng16-16 | CVE-2025-66293 | 🚨 HIGH | 1.6.39-2 | 1.6.39-2+deb12u1 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (33)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.12.7 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.13.4 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42003 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4.2 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42004 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4 |
| com.google.code.gson:gson | CVE-2022-25647 | 🚨 HIGH | 2.2.4 | 2.8.9 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.3.0 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.3.0 | 3.25.5, 4.27.5, 4.28.2 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.7.1 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.7.1 | 3.25.5, 4.27.5, 4.28.2 |
| com.nimbusds:nimbus-jose-jwt | CVE-2023-52428 | 🚨 HIGH | 9.8.1 | 9.37.2 |
| com.squareup.okhttp3:okhttp | CVE-2021-0341 | 🚨 HIGH | 3.12.12 | 4.9.2 |
| commons-beanutils:commons-beanutils | CVE-2025-48734 | 🚨 HIGH | 1.9.4 | 1.11.0 |
| commons-io:commons-io | CVE-2024-47554 | 🚨 HIGH | 2.8.0 | 2.14.0 |
| dnsjava:dnsjava | CVE-2024-25638 | 🚨 HIGH | 2.1.7 | 3.6.0 |
| io.netty:netty-codec-http2 | CVE-2025-55163 | 🚨 HIGH | 4.1.96.Final | 4.2.4.Final, 4.1.124.Final |
| io.netty:netty-codec-http2 | GHSA-xpw8-rcwv-8f8p | 🚨 HIGH | 4.1.96.Final | 4.1.100.Final |
| io.netty:netty-handler | CVE-2025-24970 | 🚨 HIGH | 4.1.96.Final | 4.1.118.Final |
| net.minidev:json-smart | CVE-2021-31684 | 🚨 HIGH | 1.3.2 | 1.3.3, 2.4.4 |
| net.minidev:json-smart | CVE-2023-1370 | 🚨 HIGH | 1.3.2 | 2.4.9 |
| org.apache.avro:avro | CVE-2024-47561 | 🔥 CRITICAL | 1.7.7 | 1.11.4 |
| org.apache.avro:avro | CVE-2023-39410 | 🚨 HIGH | 1.7.7 | 1.11.3 |
| org.apache.derby:derby | CVE-2022-46337 | 🔥 CRITICAL | 10.14.2.0 | 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0 |
| org.apache.ivy:ivy | CVE-2022-46751 | 🚨 HIGH | 2.5.1 | 2.5.2 |
| org.apache.mesos:mesos | CVE-2018-1330 | 🚨 HIGH | 1.4.3 | 1.6.0 |
| org.apache.thrift:libthrift | CVE-2019-0205 | 🚨 HIGH | 0.12.0 | 0.13.0 |
| org.apache.thrift:libthrift | CVE-2020-13949 | 🚨 HIGH | 0.12.0 | 0.14.0 |
| org.apache.zookeeper:zookeeper | CVE-2023-44981 | 🔥 CRITICAL | 3.6.3 | 3.7.2, 3.8.3, 3.9.1 |
| org.eclipse.jetty:jetty-server | CVE-2024-13009 | 🚨 HIGH | 9.4.56.v20240826 | 9.4.57.v20241219 |
| org.lz4:lz4-java | CVE-2025-12183 | 🚨 HIGH | 1.8.0 | 1.8.1 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (3)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
| starlette | CVE-2025-62727 | 🚨 HIGH | 0.48.0 | 0.49.1 |
| urllib3 | CVE-2025-66418 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2025-66471 | 🚨 HIGH | 1.26.20 | 2.6.0 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/extended_sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/lineage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data_aut.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage_aut.yaml

No Vulnerabilities Found


gitar-bot Bot commented Dec 19, 2025

🔍 CI failure analysis for 4e6b41b: Five CI jobs failed: py-run-tests (IBM timeout), py-run-build-tests (Airflow API timeout), playwright (5,6) (timezone), playwright (6,6) (login timeout), and playwright (2,6) (browser crash after 415 tests, 99.5% pass rate). All appear to be flaky or infrastructure issues; only the py-run-build-tests timeout is potentially related to the PR (Airflow 3.1.5 upgrade).

Issue

Five separate CI jobs have failed due to a combination of infrastructure issues and flaky test failures.


Summary of All Failures

  1. py-run-tests (3.10): IBM server timeout - Infrastructure issue
  2. py-run-build-tests: Airflow API timeout - Potentially related to Airflow 3.1.5 upgrade
  3. playwright (5,6): Time value timezone mismatch - Flaky test
  4. playwright (6,6): Login/refresh test timeout - Flaky test
  5. playwright (2,6): Browser crash during test - NEW FAILURE

Failure 5: playwright-ci-postgresql (2, 6) - Job 58526724449

Root Cause

Flaky Playwright Test - Browser Crash During Click

Error:

Test timeout of 180000ms exceeded.
Error: locator.click: Target page, context or browser has been closed

Failing Test

Test: Dashboard › Validate restore with Inherited domain and data products assigned

  • File: playwright/e2e/Features/RestoreEntityInheritedFields.spec.ts:75:9
  • Timeout: 180,000ms (3 minutes)
  • Pass Rate: 415 tests passed, 2 flaky (99.5% pass rate)
  • Duration: 1.5 hours runtime

Details

The test failed at line 81 in entity.ts while attempting to click on a dashboard element:

at visitEntityPage (playwright/utils/entity.ts:81:63)
at DashboardClass.visitEntityPage (DashboardClass.ts:230:5)
at RestoreEntityInheritedFields.spec.ts:80:7

Code location:

79 |   await waitForSearchResponse;
80 |
81 |   await page.getByTestId(dataTestId).getByTestId('data-name').click();
82 |   await page.waitForLoadState('networkidle');
83 |   await page.waitForSelector('[data-testid="loader"]', {
84 |     state: 'detached',

What happened:

  1. Test was attempting to visit a dashboard entity page
  2. Waited for search API response to complete
  3. Attempted to click on the dashboard element with test ID pw-dashboard-service-c121fc2a-pw-dashboard-1a4d2e35
  4. Browser/page/context crashed or closed before the click could complete
  5. Test timed out after 3 minutes

Why the Browser Crashed

The error "Target page, context or browser has been closed" indicates:

  • Browser process crashed: Out of memory, segfault, or resource exhaustion
  • Page closed unexpectedly: Navigation error, JavaScript exception that killed the page
  • Context destroyed: Test infrastructure issue that invalidated the browser context
  • Resource exhaustion: After 1.5 hours and 415 tests, system resources may be depleted

Flakiness Indicators

  1. High success rate: 415 tests passed (99.5% pass rate)
  2. Multiple flaky tests: 2 other tests marked as "flaky" in the same run
  3. Long runtime: 1.5 hours of continuous testing before failure
  4. Browser instability: Browser crashes are classic symptoms of resource exhaustion
  5. No code-level error: The test logic is sound - this is an infrastructure/environment issue
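The 99.5% figure can be reproduced with a one-line calculation, assuming the 2 flaky tests are the only non-passing tests counted in the denominator (the report does not spell out the formula, so this is an assumption). `pass_rate` is a hypothetical helper.

```python
def pass_rate(passed: int, not_passed: int) -> float:
    """Percentage of tests that passed out of all counted tests."""
    total = passed + not_passed
    if total == 0:
        raise ValueError("no tests were run")
    return 100.0 * passed / total


# 415 passed, 2 flaky: 415 / 417 is roughly 0.9952
print(f"{pass_rate(415, 2):.1f}%")  # 99.5%
```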

Historical Pattern

This is another timeout/crash failure similar to previous issues in this PR:

  • Job 58526724451 (playwright 6,6): Login test timeout after 5 minutes
  • Job 58526724472 (playwright 5,6): Timezone test flakiness
  • Job 58519402724 (Ingestion Bot): Navigation timeout stuck on signin

All playwright failures in this PR show the same characteristics:

  • High pass rates (99%+)
  • Long runtimes (1-1.5 hours)
  • Timeout or crash at the end of test suite
  • Multiple flaky tests per run

Relation to PR

This failure is completely unrelated to the PR changes (Apache Avro security fix and Airflow 3.1.5 upgrade). The PR modifies:

  • docker/development/docker-compose.yml
  • ingestion/Dockerfile
  • ingestion/Dockerfile.ci
  • ingestion/airflow-constraints-3.1.5.txt
  • ingestion/setup.py

None of these changes affect:

  • Dashboard functionality
  • Playwright browser stability
  • Entity restoration features
  • Frontend component behavior
  • Test infrastructure

Solution

The solution is to retry the CI job. This is a flaky test pattern where:

  1. Browser crashes after extended test runs are environmental issues
  2. After 415 tests (1.5 hours), browser memory/resources may be exhausted
  3. The test code is correct - the browser simply crashed during execution
  4. A single failure among 415 passed tests is consistent with environmental flakiness

Combined Analysis - All Five Failures

Summary Table

| Job | Failure Type | Root Cause | Related to PR? |
|---|---|---|---|
| py-run-tests (3.10) | Infrastructure | IBM server timeout | No |
| py-run-build-tests | Dependency | Airflow API timeout | Potentially |
| playwright (5,6) | Flaky Test | Timezone mismatch | No |
| playwright (6,6) | Flaky Test | Login timeout (5 min) | No |
| playwright (2,6) | Flaky Test | Browser crash | No |

Common Patterns Across All Playwright Failures

  1. Extremely high pass rates: 99-99.7% tests passing
  2. Multiple flaky tests: 2-4 flaky tests per run
  3. Long runtimes: 1.1-1.5 hours before failure
  4. Timeout/crash at end: Failures occur late in test execution
  5. Infrastructure issues: Resource exhaustion, timing problems, browser instability

Root Cause Analysis

The pattern is clear: CI environment degradation after extended test runs

  • Tests pass consistently at the start
  • After 1+ hours and 400+ tests, resources are depleted
  • Memory leaks, browser instability, timing issues emerge
  • Random tests fail with timeouts or crashes
  • Different test fails each time (login, timezone, dashboard, etc.)

Solution

All five jobs should be retried:

  1. Job 1 (py-run-tests 3.10): IBM infrastructure timeout - transient
  2. Job 2 (py-run-build-tests): Airflow API timeout - needs retry to confirm if related to 3.1.5
  3. Job 3 (playwright 5,6): Timezone flakiness - transient
  4. Job 4 (playwright 6,6): Login timeout - transient
  5. Job 5 (playwright 2,6): Browser crash - transient

No code changes are needed - all failures appear to be transient infrastructure/environmental issues. The Playwright test suite may benefit from:

  • Breaking into smaller test groups to reduce runtime
  • Adding browser restart intervals to prevent resource exhaustion
  • Increasing timeouts for known problematic tests

However, these are test infrastructure improvements, not fixes for code regressions. The PR code itself is not causing these failures.
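The first suggestion, splitting the suite into smaller groups, amounts to sharding. Job names such as "playwright (2,6)" suggest the suite already runs as six shards, though that reading is an assumption. The sketch below shows a simple round-robin assignment of spec files to shards; the `shard` function and the file names are hypothetical, and Playwright's own `--shard` option may distribute tests differently.

```python
def shard(items: list[str], index: int, total: int) -> list[str]:
    """Return the items assigned to 1-based shard `index` out of `total`."""
    if total < 1 or not 1 <= index <= total:
        raise ValueError("index must be between 1 and total")
    # Sort first so every CI worker derives the same assignment.
    return sorted(items)[index - 1::total]


# Hypothetical spec names, used only to show the distribution.
specs = [f"spec_{n:02d}.spec.ts" for n in range(1, 8)]
for i in range(1, 4):
    print(i, shard(specs, i, 3))
```

Raising `total` shortens each shard's runtime, which is the effect the suggestion is after, at the cost of more parallel CI workers.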

Code Review ✅ Approved

Clean Airflow 3.1.2 → 3.1.5 upgrade with consistent updates across all files and new constraints file.

What Works Well

  • Consistent version updates across all Dockerfiles (Dockerfile, Dockerfile.ci) and setup.py
  • Properly updated constraints file references to use the new airflow-constraints-3.1.5.txt
  • Added configurable logging level via environment variable in docker-compose for development flexibility

Minor Notes

  • The old airflow-constraints-3.1.2.txt file remains in the repo but is no longer referenced. Consider removing it in a follow-up cleanup PR if not needed for backwards compatibility.


github-actions Bot commented

Changes have been cherry-picked to the 1.11.4 branch.


Labels

  • Ingestion
  • safe to test: Add this label to run secure GitHub workflows on PRs
  • To release: Will cherry-pick this PR into the release branch
