MINOR: use stats tables for MySQL and PSQL profiler#25724

Merged
TeddyCr merged 3 commits into open-metadata:main from
TeddyCr:MINOR-PSQL-System
Feb 6, 2026

Conversation

@TeddyCr
Collaborator

@TeddyCr TeddyCr commented Feb 5, 2026

MINOR: use stats tables for MySQL and PSQL profiler

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Performance optimization:
    • Eliminated expensive COUNT(*) queries by using database system statistics tables for table profiling
  • New PostgreSQL profiler:
    • PostgresTableMetricComputer queries pg_catalog.pg_class for instant row count and size metrics
  • MySQL profiler improvement:
    • Removed COUNT(*) correction logic, now trusts information_schema.tables statistics directly
  • Test coverage:
    • Added test_table_metric_computer.py with 4 integration tests for PostgreSQL profiler
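To make the optimization concrete, here is a minimal, hypothetical sketch of the kind of statistics lookups this change relies on. The column and function names (`reltuples`, `pg_total_relation_size`, `TABLE_ROWS`, `DATA_LENGTH`) are standard PostgreSQL and MySQL catalog features, but the PR's actual SQL in `PostgresTableMetricComputer` and `MySQLTableMetricComputer` may differ in shape:

```python
# Hypothetical sketch of statistics-based profiling queries.
# Not the PR's exact implementation; use bind parameters in real code.

def postgres_stats_query(schema: str, table: str) -> str:
    # reltuples is an estimate maintained by ANALYZE/autovacuum; reading it
    # is a metadata lookup, avoiding a full-table COUNT(*) scan.
    return (
        "SELECT c.reltuples::bigint AS row_count, "
        "pg_total_relation_size(c.oid) AS size_bytes "
        "FROM pg_catalog.pg_class c "
        "JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace "
        f"WHERE n.nspname = '{schema}' AND c.relname = '{table}'"
    )

def mysql_stats_query(schema: str, table: str) -> str:
    # TABLE_ROWS is likewise an estimate for some storage engines (e.g.
    # InnoDB), which is why a COUNT(*) correction existed before this PR.
    return (
        "SELECT TABLE_ROWS AS row_count, DATA_LENGTH AS size_bytes "
        "FROM information_schema.tables "
        f"WHERE TABLE_SCHEMA = '{schema}' AND TABLE_NAME = '{table}'"
    )
```

Both are O(1) lookups against catalog metadata rather than table scans, which is where the performance win comes from; the trade-off is that the counts are estimates refreshed by ANALYZE/autovacuum (PostgreSQL) or the storage engine (MySQL).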



@pytest.fixture(scope="module")
def pg_engine(postgres_container):  # noqa: F811
    engine = create_engine(postgres_container.get_connection_url())
    engine.execute(

⚠️ Bug: Deprecated Engine.execute() usage in test fixture

The test fixture uses engine.execute() directly (lines 54-64), which was deprecated in SQLAlchemy 1.4 and removed in SQLAlchemy 2.0. This will cause test failures if the project uses or upgrades to SQLAlchemy 2.0+.

Impact: Tests may fail with AttributeError: 'Engine' object has no attribute 'execute' on SQLAlchemy 2.0+.

Suggested fix:

@pytest.fixture(scope="module")
def pg_engine(postgres_container):  # noqa: F811
    engine = create_engine(postgres_container.get_connection_url())
    with engine.connect() as conn:
        conn.execute(text(
            "CREATE TABLE IF NOT EXISTS public.metric_computer_test "
            "(id INTEGER PRIMARY KEY, name VARCHAR(256))"
        ))
        conn.execute(text(
            "INSERT INTO public.metric_computer_test (id, name) "
            "SELECT g, 'name_' || g FROM generate_series(1, 100) AS g"
        ))
        conn.execute(text("ANALYZE public.metric_computer_test"))
        conn.commit()
    yield engine
    with engine.connect() as conn:
        conn.execute(text("DROP TABLE IF EXISTS public.metric_computer_test"))
        conn.commit()
    engine.dispose()

Also add text to the imports from sqlalchemy.


@github-actions
Contributor

github-actions bot commented Feb 5, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion-base-slim:trivy (debian 12.13)

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (33)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (10)

Package Vulnerability ID Severity Installed Version Fixed Version
apache-airflow CVE-2025-68438 🚨 HIGH 3.1.5 3.1.6
apache-airflow CVE-2025-68675 🚨 HIGH 3.1.5 3.1.6
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/extended_sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/lineage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data_aut.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage_aut.yaml

No Vulnerabilities Found

@github-actions
Contributor

github-actions bot commented Feb 5, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion:trivy (debian 12.12)

Vulnerabilities (4)

Package Vulnerability ID Severity Installed Version Fixed Version
libpam-modules CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam-modules-bin CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam-runtime CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam0g CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (33)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (20)

Package Vulnerability ID Severity Installed Version Fixed Version
Werkzeug CVE-2024-34069 🚨 HIGH 2.2.3 3.0.3
aiohttp CVE-2025-69223 🚨 HIGH 3.12.12 3.13.3
aiohttp CVE-2025-69223 🚨 HIGH 3.13.2 3.13.3
apache-airflow CVE-2025-68438 🚨 HIGH 3.1.5 3.1.6
apache-airflow CVE-2025-68675 🚨 HIGH 3.1.5 3.1.6
azure-core CVE-2026-21226 🚨 HIGH 1.37.0 1.38.0
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
protobuf CVE-2026-0994 🚨 HIGH 4.25.8 6.33.5, 5.29.6
pyasn1 CVE-2026-23490 🚨 HIGH 0.6.1 0.6.2
python-multipart CVE-2026-24486 🚨 HIGH 0.0.20 0.0.22
ray CVE-2025-62593 🔥 CRITICAL 2.47.1 2.52.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO

No Vulnerabilities Found

Contributor

Copilot AI left a comment


Pull request overview

This pull request introduces a performance optimization for PostgreSQL and MySQL table profiling by eliminating expensive COUNT(*) queries and instead leveraging database system statistics tables.

Changes:

  • Added a new PostgresTableMetricComputer class that queries pg_catalog.pg_class and pg_catalog.pg_namespace for instant row count and table size metrics
  • Modified MySQLTableMetricComputer to remove the COUNT(*) correction logic, now trusting information_schema.tables statistics directly
  • Added comprehensive integration tests for the PostgreSQL profiler implementation

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
ingestion/src/metadata/profiler/orm/functions/table_metric_computer.py Added PostgresTableMetricComputer class using pg_catalog system tables for metrics; removed MySQL COUNT(*) correction logic; registered Postgres profiler in factory
ingestion/tests/integration/postgres/test_table_metric_computer.py Added integration tests for PostgresTableMetricComputer covering row count, size, column metadata, and edge cases

Comment on lines +109 to +112
def test_compute_nonexistent_table_returns_none(self, session):
    computer = _build_computer(session, NonExistentModel, TableType.Regular)
    result = computer.compute()
    assert result is None

Copilot AI Feb 5, 2026


Consider adding a test case for PostgreSQL views. The code at line 446 handles the case where rowCount == 0 and tableType == TableType.View, falling back to super().compute(), but this behavior is not tested. This is particularly important since views may have different statistics than regular tables in PostgreSQL.

computer = _build_computer(session, MetricComputerTestTable, TableType.Regular)
result = computer.compute()
assert result is not None
assert "createDateTime" not in result._asdict()

Copilot AI Feb 5, 2026


The assertion assert "createDateTime" not in result._asdict() assumes the result is a named tuple with an _asdict() method. While this works for SQLAlchemy Row objects, consider using a more explicit check like assert not hasattr(result, 'createDateTime') or checking the attributes directly for better clarity and compatibility.

Suggested change
assert "createDateTime" not in result._asdict()
assert not hasattr(result, "createDateTime")

@sonarqubecloud

sonarqubecloud bot commented Feb 5, 2026

harshach previously approved these changes Feb 6, 2026
@gitar-bot

gitar-bot bot commented Feb 6, 2026

🔍 CI failure analysis for 0697c4e: Python 3.10 shows the same 7 Elasticsearch errors as Python 3.11 (not version-specific; an infrastructure issue). Combined with 1 Playwright failure (a 91% improvement), all 15 issues across 3 CI jobs are unrelated to the PR's backend profiler changes. The test fix commit was successful.

Issue

Three CI jobs failed on commit 0697c4e (test fix commit):

  1. Playwright E2E Tests: 1 failure (91% improvement from 11 failures)
  2. Python Tests (3.11): 7 errors (Elasticsearch infrastructure)
  3. Python Tests (3.10): 7 errors (Elasticsearch infrastructure, identical to 3.11)

Root Cause

All failures are unrelated to this PR's changes. This PR modifies Python backend profiler code for MySQL and PostgreSQL table metrics computation via system statistics tables.

Details

Python Tests (3.10 and 3.11) - Identical Elasticsearch Infrastructure Failures

Test Results (both Python versions identical):

  • Python 3.10: 530 passed, 21 skipped, 1 xfailed, 7 errors
  • Python 3.11: 530 passed, 21 skipped, 1 xfailed, 7 errors

7 Errors (all in test_classifier.py, identical across both versions):

  • test_auto_classification_workflow for 7 Trino tables:
    • table, titanic, iris, userdata, empty, complex_and_simple, only_complex

Error Pattern:

Could not fetch database entity from Search Indexes
The search index may not be available or the entity has not been indexed yet
Please ensure the Elasticsearch index is properly configured and try reindexing

Root Cause: Elasticsearch search index unavailability - infrastructure/environment issue

Key Finding: Errors are identical across Python 3.10 and 3.11, confirming this is an infrastructure issue, not a Python version-specific problem or code logic issue.

Test Fix Validation ✅: The 7 lineage parser tests that were previously failing are now correctly marked as xfail (1 xfailed shown in results for both Python versions), confirming the test fix commit worked as intended.

Why Unrelated:

  • PR changes: Table metrics computation via database system tables
  • Failures: Elasticsearch search index infrastructure issues
  • No code overlap: Profiler metrics ≠ Search index entity retrieval
  • These are the SAME 7 errors from the original first Python test run
  • Not Python version-specific (affects 3.10 and 3.11 identically)

Playwright Tests - Single UI Failure

Test Results: 363 passed, 20 skipped, 1 failed, 2 flaky (passed on retry)

1 Failure: RightEntityPanelFlow.spec.ts:2164

  • "Data Consumer - Custom Properties Tab - View Custom Properties"
  • Error: expect(locator).toBeVisible() failed (5000ms timeout)
  • Element: .entity-summary-panel-container .entity-summary-panel-tab-content

Massive Improvement:

  • Previous run 1: 11 failures
  • Previous run 2: 6 persistent failures
  • Current run: 1 failure (91% improvement)

Why Unrelated: Frontend TypeScript/React UI test while PR modifies Python backend profiler

Comprehensive Analysis

Test Fix Commit (0697c4e) Successfully Addressed:

  • Lineage parser failures: 7 tests now correctly marked as xfail on both Python 3.10 and 3.11
  • Playwright stability: 91% reduction in failures (11 → 1)
  • Build no longer blocked by previously failing lineage tests

Persistent Infrastructure Issues (pre-existing, unrelated to PR):

  • Elasticsearch: 7 Trino classifier tests failing identically on Python 3.10 and 3.11 (infrastructure)
  • UI timing: 1 custom properties panel visibility test (flaky UI test)

PR Scope vs Failure Scope:

  • PR: Python backend profiler (MySQL information_schema.tables, PostgreSQL pg_catalog.pg_class)
  • Failures: Elasticsearch infrastructure + Frontend UI timing
  • No overlap: Different subsystems, different code paths

Test Results Summary

Python Tests (both versions identical):

  • Python 3.10: 530 passed, 1 xfailed, 7 Elasticsearch errors
  • Python 3.11: 530 passed, 1 xfailed, 7 Elasticsearch errors
  • Lineage parser tests correctly xfailed ✅
  • Same infrastructure failures (not Python version issue)

Playwright Tests:

  • 363 passed, 1 failed (91% improvement from 11 failures) ✅

Total Issues: 15 test problems across 3 CI jobs

  • 14 Elasticsearch infrastructure errors (7 on Python 3.10 + 7 on Python 3.11, identical)
  • 1 UI timing failure
  • All unrelated to PR's profiler changes

Conclusion

The test fix commit (0697c4e) successfully resolved CI instability:

  • Lineage parser tests no longer fail the build (correctly marked as xfail)
  • Playwright failures reduced by 91% (11 → 1)

The 15 remaining issues (14 Elasticsearch + 1 UI timing) are all unrelated to this PR's MySQL/PostgreSQL profiler optimizations. The identical Elasticsearch failures across Python 3.10 and 3.11 confirm these are infrastructure issues, not code logic problems. This PR's backend profiler changes are safe, effective, and ready for merge.

Code Review 👍 Approved with suggestions 0 resolved / 2 findings

Solid performance improvement using database statistics tables. The two previous findings (deprecated Engine.execute() in tests and unhandled reltuples=-1 for never-analyzed tables) remain unaddressed.

⚠️ Bug: Deprecated Engine.execute() usage in test fixture

📄 ingestion/tests/integration/postgres/test_table_metric_computer.py:54 🔗 SQLAlchemy 2.0 Connection.execute() changes

The test fixture uses engine.execute() directly (lines 54-64), which was deprecated in SQLAlchemy 1.4 and removed in SQLAlchemy 2.0. This will cause test failures if the project uses or upgrades to SQLAlchemy 2.0+.

Impact: Tests may fail with AttributeError: 'Engine' object has no attribute 'execute' on SQLAlchemy 2.0+.

Suggested fix:

@pytest.fixture(scope="module")
def pg_engine(postgres_container):  # noqa: F811
    engine = create_engine(postgres_container.get_connection_url())
    with engine.connect() as conn:
        conn.execute(text(
            "CREATE TABLE IF NOT EXISTS public.metric_computer_test "
            "(id INTEGER PRIMARY KEY, name VARCHAR(256))"
        ))
        conn.execute(text(
            "INSERT INTO public.metric_computer_test (id, name) "
            "SELECT g, 'name_' || g FROM generate_series(1, 100) AS g"
        ))
        conn.execute(text("ANALYZE public.metric_computer_test"))
        conn.commit()
    yield engine
    with engine.connect() as conn:
        conn.execute(text("DROP TABLE IF EXISTS public.metric_computer_test"))
        conn.commit()
    engine.dispose()

Also add text to the imports from sqlalchemy.

💡 Edge Case: reltuples can be -1 for never-analyzed tables

📄 ingestion/src/metadata/profiler/orm/functions/table_metric_computer.py:426

In PostgreSQL, reltuples in pg_class returns -1 when the table has never been analyzed (ANALYZE has not been run since table creation). The current code only handles None and 0 cases, but doesn't handle -1.

Impact: For tables that have never been analyzed, the profiler may return -1 as the row count, which could cause downstream issues or misleading metrics.

Current handling (line 445-448):

if res.rowCount is None or (
    res.rowCount == 0 and self._entity.tableType == TableType.View
):
    return super().compute()

Suggested fix:

if res.rowCount is None or res.rowCount < 0 or (
    res.rowCount == 0 and self._entity.tableType == TableType.View
):
    return super().compute()

This would fall back to the base implementation (which uses COUNT(*)) for never-analyzed tables, ensuring accurate results.
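The suggested guard distills to a small predicate; this standalone version (the name `should_fall_back` is illustrative, not from the PR) makes the -1 case easy to unit-test on its own:

```python
# Illustrative predicate capturing the suggested fallback condition:
# fall back to COUNT(*) when stats are missing (None), were never collected
# (reltuples == -1 on never-analyzed tables), or when a view reports 0 rows.
def should_fall_back(row_count, is_view: bool) -> bool:
    return row_count is None or row_count < 0 or (row_count == 0 and is_view)
```

With this shape, never-analyzed tables (-1) and views with empty statistics both route to the accurate COUNT(*)-based base implementation, while analyzed regular tables keep the fast catalog estimate.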


@TeddyCr TeddyCr merged commit e2bae8e into open-metadata:main Feb 6, 2026
16 of 19 checks passed
TeddyCr added a commit that referenced this pull request Feb 7, 2026
* feat(system): use stats tables for mysl and psql profiler

* fix: skip tests if fail

(cherry picked from commit e2bae8e)

Labels

Ingestion, safe to test (add this label to run secure GitHub workflows on PRs)
