Skip to content

fix(profiler): SAP HANA CREATE_TIME from SYS.TABLES + uppercase catalog matching#27483

Open
IceS2 wants to merge 6 commits intomainfrom
task/debug-sap-hana-profiler-create-time-miss-53888850
Open

fix(profiler): SAP HANA CREATE_TIME from SYS.TABLES + uppercase catalog matching#27483
IceS2 wants to merge 6 commits intomainfrom
task/debug-sap-hana-profiler-create-time-miss-53888850

Conversation

@IceS2
Copy link
Copy Markdown
Contributor

@IceS2 IceS2 commented Apr 17, 2026

Fixes #27439

Summary

  • CREATE_TIME fix: HANA's SYS.M_TABLES lacks CREATE_TIME. Query SYS.TABLES via a second CTE + LEFT JOIN so createDateTime is populated in table profiles.
  • Case sensitivity fix: HANA catalog stores identifiers in uppercase but OM stores them lowercase. Apply .upper() to schema/table names in WHERE clauses so rows actually match.
  • Tests: 5 new unit tests covering uppercase matching, LEFT JOIN NULL handling, CTE structure, nonexistent table, and column metadata labels.

Test plan

  • Unit tests pass (pytest test_table_metric_computer.py — 28/28)
  • E2E validated against HANA Express: libraries_ table profiled with rowCount=7, columnCount=9, sizeInByte=16480, createDateTime=2026-04-16T15:32:08
image

…case catalog matching

HANA's SYS.M_TABLES lacks CREATE_TIME — query SYS.TABLES via CTE + LEFT JOIN.
HANA catalog stores identifiers uppercase but OM stores lowercase — apply .upper() to WHERE filters.
@IceS2 IceS2 requested a review from a team as a code owner April 17, 2026 12:20
Copilot AI review requested due to automatic review settings April 17, 2026 12:20
@github-actions github-actions bot added Ingestion safe to test Add this label to run secure Github workflows on PRs labels Apr 17, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes SAP HANA table profiling so createDateTime is correctly populated and table/schema matching works when OpenMetadata stores identifiers in lowercase but HANA system views use uppercase.

Changes:

  • Update SAPHanaTableMetricComputer to pull CREATE_TIME from SYS.TABLES via a second CTE and LEFT JOIN to SYS.M_TABLES.
  • Uppercase schema/table identifiers used in the HANA system-view filters to ensure rows match.
  • Add unit tests covering the new CTE/join structure, uppercase matching, and NULL create-time behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
ingestion/src/metadata/profiler/orm/functions/table_metric_computer.py Reworks HANA metrics query to join SYS.M_TABLES with SYS.TABLES and uppercase identifiers for matching.
ingestion/tests/unit/observability/profiler/test_table_metric_computer.py Adds tests validating the new HANA query shape (CTEs + LEFT JOIN) and uppercase filtering behavior.

Comment on lines +353 to +355
assert sql_upper.count("WITH") >= 1 or sql_upper.count("AS (") >= 2 or sql_upper.count("AS \n(") >= 2, (
f"Expected two CTEs in query, got: {sql}"
)
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion in this test doesn’t actually verify that there are two CTEs: sql_upper.count("WITH") >= 1 will pass even with a single CTE, and the or chain makes the check very permissive. Consider tightening this to assert the presence of both CTE definitions (e.g., both M_TABLES and TABLES appear in the WITH ... clause, or count CTE AS ( occurrences in a dialect-stable way).

Suggested change
assert sql_upper.count("WITH") >= 1 or sql_upper.count("AS (") >= 2 or sql_upper.count("AS \n(") >= 2, (
f"Expected two CTEs in query, got: {sql}"
)
assert "WITH" in sql_upper, f"Expected WITH clause in query, got: {sql}"
assert "M_TABLES AS (" in sql_upper, (
f"Expected M_TABLES CTE definition in query, got: {sql}"
)
assert "TABLES AS (" in sql_upper, (
f"Expected TABLES CTE definition in query, got: {sql}"
)

Copilot uses AI. Check for mistakes.
Comment on lines +362 to +367
"""When table not found in either system view, execute().first() returns None."""
session = _build_mock_session()
session.execute.return_value.first.return_value = None
computer = _build_computer(session, SAPHanaTableMetricComputer)
result = computer.compute()
assert result is None
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_compute_returns_none_for_nonexistent_table duplicates test_compute_returns_none_when_no_result earlier in the same class (both only assert compute() returns None when execute().first() is None). Consider removing one, or adjusting this one to cover a distinct case not already exercised.

Suggested change
"""When table not found in either system view, execute().first() returns None."""
session = _build_mock_session()
session.execute.return_value.first.return_value = None
computer = _build_computer(session, SAPHanaTableMetricComputer)
result = computer.compute()
assert result is None
"""When table is absent from HANA system views, compute returns None and
still queries using uppercased identifiers expected by the catalog."""
session = _build_mock_session()
session.execute.return_value.first.return_value = None
computer = _build_computer(session, SAPHanaTableMetricComputer)
result = computer.compute()
sql = str(
session.execute.call_args[0][0].compile(
compile_kwargs={"literal_binds": True}
)
)
assert result is None
assert "TEST_SCHEMA" in sql, (
f"Nonexistent-table lookup must use uppercased schema name, got: {sql}"
)
assert "TEST_TABLE" in sql, (
f"Nonexistent-table lookup must use uppercased table name, got: {sql}"
)

Copilot uses AI. Check for mistakes.
session.execute.return_value.first.return_value = mock_result
computer = _build_computer(session, SAPHanaTableMetricComputer)
computer.compute()
sql = str(session.execute.call_args[0][0].compile(compile_kwargs={"literal_binds": True}))
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These .compile(compile_kwargs={"literal_binds": True}) lines exceed Black’s default line length (88) and will likely fail the repo’s black --check formatting step (see ingestion/noxfile.py:55). Running Black should wrap these expressions into multiple lines.

Suggested change
sql = str(session.execute.call_args[0][0].compile(compile_kwargs={"literal_binds": True}))
sql = str(
session.execute.call_args[0][0].compile(
compile_kwargs={"literal_binds": True}
)
)

Copilot uses AI. Check for mistakes.
session.execute.return_value.first.return_value = mock_result
computer = _build_computer(session, SAPHanaTableMetricComputer)
computer.compute()
sql = str(session.execute.call_args[0][0].compile(compile_kwargs={"literal_binds": True}))
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is likely not Black-formatted (it exceeds Black’s 88-char line length). Please run the formatter or wrap the str(session.execute...compile(...)) call so black --check passes.

Copilot uses AI. Check for mistakes.
session.execute.return_value.first.return_value = mock_result
computer = _build_computer(session, SAPHanaTableMetricComputer)
computer.compute()
sql = str(session.execute.call_args[0][0].compile(compile_kwargs={"literal_binds": True}))
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line likely exceeds Black’s 88-character limit and will fail formatting checks. Consider wrapping the .compile(compile_kwargs={...}) call across lines and re-running black.

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

IceS2 added 2 commits April 17, 2026 16:43
…ack formatting

- Guard against None schema/table name before .upper() in HANA compute
- Tighten CTE assertion to check both M_TABLES and TABLES definitions
- Make nonexistent-table test distinct by verifying uppercase in query
- Apply Black formatting fixes
Copilot AI review requested due to automatic review settings April 17, 2026 15:14
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

TeddyCr
TeddyCr previously approved these changes Apr 17, 2026
@TeddyCr TeddyCr enabled auto-merge (squash) April 17, 2026 15:26
auto-merge was automatically disabled April 17, 2026 15:47

Head branch was pushed to by a user without write access

Copilot AI review requested due to automatic review settings April 17, 2026 15:50
@IceS2 IceS2 review requested due to automatic review settings April 17, 2026 15:50
@gitar-bot
Copy link
Copy Markdown

gitar-bot bot commented Apr 17, 2026

Code Review ✅ Approved

Updates SAP HANA profiler to correctly parse CREATE_TIME from SYS.TABLES and implements case-insensitive catalog matching. No issues found.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 17, 2026

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • www.antlr.org
    • Triggering command: /usr/bin/curl curl REDACTED (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@sonarqubecloud
Copy link
Copy Markdown

@github-actions
Copy link
Copy Markdown
Contributor

🔴 Playwright Results — 3 failure(s), 26 flaky

✅ 3634 passed · ❌ 3 failed · 🟡 26 flaky · ⏭️ 111 skipped

Shard Passed Failed Flaky Skipped
🔴 Shard 1 453 1 4 26
🟡 Shard 2 647 0 4 7
🔴 Shard 3 651 1 8 1
🟡 Shard 4 632 0 2 27
🔴 Shard 5 609 1 1 42
🟡 Shard 6 642 0 7 8

Genuine Failures (failed on all attempts)

Features/DataAssetRulesDisabled.spec.ts › Verify the Chart entity item action after rules disabled (shard 1)
Error: �[2mexpect(�[22m�[31mlocator�[39m�[2m).�[22mtoContainText�[2m(�[22m�[32mexpected�[39m�[2m)�[22m failed

Locator: getByTestId('domain-link')
Expected substring: �[32m"PW Domain 4�[7m7413388�[27m"�[39m
Received string:    �[31m"PW Domain 4�[7m1703c7c�[27m"�[39m
Timeout: 15000ms

Call log:
�[2m  - Expect "toContainText" with timeout 15000ms�[22m
�[2m  - waiting for getByTestId('domain-link')�[22m
�[2m    19 × locator resolved to <a data-testid="domain-link" href="/domain/%22PW%25domain.41703c7c%22" class="no-underline domain-link domain-link-text font-medium text-sm render-domain-lebel-style">PW Domain 41703c7c</a>�[22m
�[2m       - unexpected value "PW Domain 41703c7c"�[22m

Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3)
Error: Timeout 60000ms exceeded while waiting on the predicate
Pages/Glossary.spec.ts › Add and Remove Assets (shard 5)
�[31mTest timeout of 180000ms exceeded.�[39m
🟡 26 flaky test(s) (passed on retry)
  • Features/DataAssetRulesDisabled.spec.ts › Verify the Database entity item action after rules disabled (shard 1, 1 retry)
  • Features/Pagination.spec.ts › should test pagination on Classification Tags page (shard 1, 1 retry)
  • Flow/Tour.spec.ts › Tour should work from welcome screen (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/ChangeSummaryBadge.spec.ts › Automated badge should appear on entity description with Automated source (shard 2, 1 retry)
  • Features/DomainTierCertificationVoting.spec.ts › DataProduct - Certification assign, update, and remove (shard 2, 1 retry)
  • Features/Glossary/GlossaryP3Tests.spec.ts › should handle multiple rapid API calls (shard 2, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Features/UserProfileOnlineStatus.spec.ts › Should show online status badge on user profile for active users (shard 3, 1 retry)
  • Features/UserProfileOnlineStatus.spec.ts › Should not show online status for inactive users (shard 3, 1 retry)
  • Flow/ConditionalPermissions.spec.ts › User with matchAnyTag permission can only view Container with the tag (shard 3, 1 retry)
  • Flow/CustomizeWidgets.spec.ts › Total Data Assets Widget (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Create Data Contract and validate for DashboardDataModel (shard 4, 1 retry)
  • Pages/ExplorePageRightPanel.spec.ts › Should verify deleted user not visible in owner selection for database (shard 5, 1 retry)
  • Features/AutoPilot.spec.ts › Create Service and check the AutoPilot status (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: searchIndex (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › Inactive Announcement create & delete (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
  • Pages/Users.spec.ts › Check permissions for Data Steward (shard 6, 1 retry)
  • VersionPages/EntityVersionPages.spec.ts › Directory (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ingestion safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

The profiler fails because it queries a CREATE_TIME column from SYS.M_TABLES that doesn't exist

4 participants