Fix #26274: Powerbi - Add support for bigquery NativeQuery Lineage #26275

Merged
ulixius9 merged 3 commits into main from issue-26274
Mar 6, 2026

Conversation

@ulixius9
Member

@ulixius9 ulixius9 commented Mar 6, 2026

Describe your changes:

Fixes #26274

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • BigQuery NativeQuery Support:
    • Added _parse_bigquery_query_source() method to extract lineage from Value.NativeQuery expressions with inline SQL
    • Integrated BigQuery NativeQuery parsing into _parse_bigquery_source() with BIGQUERY_QUERY_EXPRESSION_KW constant
  • Error Handling:
    • Added try/except handling for invalid FQN characters in build_es_fqn_search_string() to skip problematic tables
  • Tests:
    • Added 8 comprehensive test cases covering direct BigQuery navigation, NativeQuery with comments/CTEs, block comments, and edge cases
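
To make the NativeQuery case concrete, here is a minimal sketch of pulling inline SQL out of a Power Query M expression. It is not the code merged in this PR: the helper name `extract_native_query_sql`, the keyword value, and the "first string literal that starts with SELECT or WITH" heuristic are all assumptions made for illustration.

```python
import re

# Assumed value: the PR names this constant, but its definition is not shown on this page.
BIGQUERY_QUERY_EXPRESSION_KW = "Value.NativeQuery(GoogleBigQuery.Database("

# An M string literal; M escapes a double quote inside a string by doubling it.
_M_STRING = re.compile(r'"((?:[^"]|"")*)"')
# Hypothetical heuristic: treat a literal as SQL if it starts with SELECT or WITH.
_SQL_START = re.compile(r"^\s*(select|with)\b", re.IGNORECASE)


def extract_native_query_sql(m_expression):
    """Return the inline SQL of a BigQuery NativeQuery expression, or None."""
    start = m_expression.find(BIGQUERY_QUERY_EXPRESSION_KW)
    if start == -1:
        return None  # direct navigation or some other source type
    # Scan the string literals that follow the keyword and keep the first SQL-looking one.
    for match in _M_STRING.finditer(m_expression, start):
        literal = match.group(1).replace('""', '"')
        if _SQL_START.match(literal):
            return literal.strip()
    return None


example = (
    "let\n"
    '    Source = Value.NativeQuery(GoogleBigQuery.Database(){[Name="my-project"]}[Data], '
    '"SELECT id FROM sales.orders", null, [EnableFolding=true])\n'
    "in\n"
    "    Source"
)
print(extract_native_query_sql(example))  # SELECT id FROM sales.orders
```

The extracted SQL would then go to a SQL lineage parser to resolve table names; that step is left out here.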

This will update automatically on new commits.

@github-actions
Contributor

github-actions bot commented Mar 6, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion-base-slim:trivy (debian 12.13)

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (37)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.12.7 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.13.4 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.15.2 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.airlift:aircompressor CVE-2025-67721 🚨 HIGH 0.27 2.0.3
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (9)

Package Vulnerability ID Severity Installed Version Fixed Version
apache-airflow CVE-2025-68438 🚨 HIGH 3.1.5 3.1.6
apache-airflow CVE-2025-68675 🚨 HIGH 3.1.5 3.1.6, 2.11.1
cryptography CVE-2026-26007 🚨 HIGH 42.0.8 46.0.5
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/extended_sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/lineage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data_aut.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage_aut.yaml

No Vulnerabilities Found

@github-actions
Contributor

github-actions bot commented Mar 6, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion:trivy (debian 12.12)

Vulnerabilities (4)

Package Vulnerability ID Severity Installed Version Fixed Version
libpam-modules CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam-modules-bin CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam-runtime CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam0g CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (38)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.12.7 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.13.4 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.15.2 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.16.1 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.airlift:aircompressor CVE-2025-67721 🚨 HIGH 0.27 2.0.3
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (22)

Package Vulnerability ID Severity Installed Version Fixed Version
Authlib CVE-2026-28802 🚨 HIGH 1.6.6 1.6.7
Werkzeug CVE-2024-34069 🚨 HIGH 2.2.3 3.0.3
aiohttp CVE-2025-69223 🚨 HIGH 3.12.12 3.13.3
aiohttp CVE-2025-69223 🚨 HIGH 3.13.2 3.13.3
apache-airflow CVE-2025-68438 🚨 HIGH 3.1.5 3.1.6
apache-airflow CVE-2025-68675 🚨 HIGH 3.1.5 3.1.6, 2.11.1
azure-core CVE-2026-21226 🚨 HIGH 1.37.0 1.38.0
cryptography CVE-2026-26007 🚨 HIGH 42.0.8 46.0.5
google-cloud-aiplatform CVE-2026-2472 🚨 HIGH 1.130.0 1.131.0
google-cloud-aiplatform CVE-2026-2473 🚨 HIGH 1.130.0 1.133.0
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
protobuf CVE-2026-0994 🚨 HIGH 4.25.8 6.33.5, 5.29.6
pyasn1 CVE-2026-23490 🚨 HIGH 0.6.1 0.6.2
python-multipart CVE-2026-24486 🚨 HIGH 0.0.20 0.0.22
ray CVE-2025-62593 🔥 CRITICAL 2.47.1 2.52.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: usr/bin/docker

Vulnerabilities (4)

Package Vulnerability ID Severity Installed Version Fixed Version
stdlib CVE-2025-68121 🔥 CRITICAL v1.25.5 1.24.13, 1.25.7, 1.26.0-rc.3
stdlib CVE-2025-61726 🚨 HIGH v1.25.5 1.24.12, 1.25.6
stdlib CVE-2025-61728 🚨 HIGH v1.25.5 1.24.12, 1.25.6
stdlib CVE-2025-61730 🚨 HIGH v1.25.5 1.24.12, 1.25.6

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO

No Vulnerabilities Found

@gitar-bot

gitar-bot bot commented Mar 6, 2026

🔍 CI failure analysis for b414c12: All CI failures are infrastructure/flakiness issues unrelated to this PR's PowerBI ingestion changes. Retrying the jobs should resolve these failures.

Issue

Multiple CI failures observed across different jobs. None are related to this PR's changes (PowerBI ingestion Python files in ingestion/src/metadata/ingestion/source/dashboard/powerbi/).


playwright-ci-postgresql (4, 6) — Browser/context crashes

Fails on:

Entity.spec.ts:2137:7 - Test timeout of 180000ms exceeded.
Error: page.waitForLoadState: Target page, context or browser has been closed

Root cause: 6 out of 712 tests fail due to browser/page context closure during test execution. Failures cluster around the end of the 1.3-hour test suite, consistent with resource exhaustion (memory/CPU) after sustained parallel execution across 4 shards:

  • Entity.spec.ts:2137 — browser closed during waitForLoadState('networkidle')
  • LineageSettings.spec.ts:134 — 540s timeout, browser closed
  • Customproperties-part2.spec.ts:187 — 180s timeout waiting for /api/v1/search/query
  • Domains.spec.ts:1168 — 540s timeout, browser closed
  • Entity.spec.ts:1376 — glossary tag element not found
  • Entity.spec.ts:294 — ML Model Tier assertion mismatch

playwright-ci-postgresql (6, 6) — Lineage UI E2E test (previous run)

Failed on:

playwright/e2e/Pages/Lineage.spec.ts:466 - "Verify table search with special characters as handled"

Root cause: The test attempts to click a lineage node for a table with a forward slash in its name (pw-table-with/slash-...) but the element was never found in the DOM within the 60s timeout. 608 out of 609 tests passed.


py-run-tests (3.10) — Flaky network test (previous run)

Failed on:

tests/unit/topology/dashboard/test_lookml_bitbucket_reader.py::TestLookMLBitBucketReader::test_lookml_read_and_parse

Root cause: The test makes a real network call to api.bitbucket.org which timed out in CI (HTTPSConnectionPool(host='api.bitbucket.org', port=443): Read timed out. (read timeout=30)). All other 4172 tests passed.


Details

This PR only modifies files in ingestion/src/metadata/ingestion/source/dashboard/powerbi/ (Python backend for PowerBI ingestion). None of these failures are in the same module or related to the changes made:

  • The browser crash failures are infrastructure/resource-exhaustion issues in Playwright E2E tests
  • The LookML test depends on external Bitbucket network connectivity
  • The lineage special-character test is a pre-existing flaky test unrelated to PowerBI ingestion

All failures can be addressed by retrying the CI jobs (say "Gitar retry it").

Code Review 👍 Approved with suggestions 0 resolved / 2 findings

Adds BigQuery NativeQuery lineage support to PowerBI ingestion, but two minor issues remain: the NativeQuery keyword check runs before comment stripping in the raw M expression, and the SQL line comment pattern is misnamed and applied to SQL queries rather than M comments.

💡 Edge Case: NativeQuery keyword check on raw expression matches M comments

📄 ingestion/src/metadata/ingestion/source/dashboard/powerbi/metadata.py:1156

In _parse_bigquery_source, the check if BIGQUERY_QUERY_EXPRESSION_KW in source_expression at line 1156 is evaluated on the raw M expression before comments are stripped. If a user has a commented-out Value.NativeQuery(GoogleBigQuery.Database( line (e.g., // Source = Value.NativeQuery(GoogleBigQuery.Database(...)) but the actual source is a direct BigQuery navigation, the code will enter the NativeQuery parsing path and return None, silently skipping lineage for a valid direct connection.

The block-comment test case (MOCK_BIGQUERY_NATIVE_QUERY_BLOCK_COMMENTS_EXP) works because both the commented and real sources are NativeQuery. The untested edge case is a commented-out NativeQuery with a real direct navigation source.

Consider stripping M comments at the beginning of _parse_bigquery_source (before the keyword checks), similar to how _parse_bigquery_query_source does it internally.

Suggested fix
        # Strip M language comments before checking keywords
        cleaned_expression = re.sub(
            r"/\*.*?\*/", "", source_expression, flags=re.DOTALL
        )
        cleaned_expression = re.sub(
            SQL_LINE_COMMENT_PATTERN, "", cleaned_expression
        )

        # Handle Value.NativeQuery with inline SQL
        if BIGQUERY_QUERY_EXPRESSION_KW in cleaned_expression:
            return self._parse_bigquery_query_source(source_expression)
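
The edge case can be reproduced in isolation. The sketch below assumes a value for `BIGQUERY_QUERY_EXPRESSION_KW` and uses a hypothetical helper, `strip_m_comments`; it only shows that the keyword test gives different answers on the raw and the comment-stripped expression, not how the merged code behaves end to end.

```python
import re

# Assumed value: the constant's definition is not shown on this page.
BIGQUERY_QUERY_EXPRESSION_KW = "Value.NativeQuery(GoogleBigQuery.Database("


def strip_m_comments(expression):
    """Remove /* ... */ block comments, then // line comments (hypothetical helper)."""
    without_blocks = re.sub(r"/\*.*?\*/", "", expression, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", "", without_blocks)


# A commented-out NativeQuery line followed by a real direct-navigation source.
EXPRESSION = """let
    // Source = Value.NativeQuery(GoogleBigQuery.Database(), "SELECT 1"),
    Source = GoogleBigQuery.Database(),
    project = Source{[Name="my-project"]}[Data]
in
    project"""

raw_hit = BIGQUERY_QUERY_EXPRESSION_KW in EXPRESSION
clean_hit = BIGQUERY_QUERY_EXPRESSION_KW in strip_m_comments(EXPRESSION)
print(raw_hit, clean_hit)  # True False
```

Note that this naive `//` removal would also truncate an M string literal that contains `//`, so a production version would probably need to skip string literals.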

💡 Quality: M-language comment pattern misnamed and applied to SQL queries

📄 ingestion/src/metadata/ingestion/source/dashboard/powerbi/constants.py:50 📄 ingestion/src/metadata/ingestion/source/dashboard/powerbi/metadata.py:1054

The constant SQL_LINE_COMMENT_PATTERN = r"//[^\n]*" strips //-style comments, which are Power Query M language comments, not SQL comments (SQL uses --). The name SQL_LINE_COMMENT_PATTERN is misleading.

More importantly, this pattern is applied to extracted SQL queries at line 1054 (_parse_bigquery_query_source) and line 1249 (_parse_snowflake_query_source), after M comments have already been stripped from the surrounding expression. Applying // removal to SQL content is unnecessary (M comments should be gone) and could theoretically corrupt SQL containing // characters in string literals (e.g., URLs).

Consider renaming to M_LINE_COMMENT_PATTERN and removing its application to already-extracted SQL queries (keep only re.sub(r"--[^\n]*", ...) for SQL).
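
The corruption risk is easy to demonstrate. The snippet below is an illustrative sketch with a made-up query, assuming both patterns match up to the end of the line; it compares `//` stripping with `--` stripping on SQL that holds a URL in a string literal.

```python
import re

# Made-up SQL for illustration: one SQL comment and one URL string literal.
SQL = "SELECT id FROM logs -- recent rows only\nWHERE url = 'https://example.com/a'"

# Applying the M-style pattern to SQL cuts the URL literal at "//".
slash_stripped = re.sub(r"//[^\n]*", "", SQL)

# Applying only the SQL-style pattern removes the comment and keeps the literal.
dash_stripped = re.sub(r"--[^\n]*", "", SQL)

print(repr(slash_stripped))
print(repr(dash_stripped))
```

The `--` pattern has the same weakness for string literals that contain `--`, so it is still a heuristic rather than a full SQL comment parser.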

@sonarqubecloud

sonarqubecloud bot commented Mar 6, 2026

@ulixius9 ulixius9 merged commit c45858a into main Mar 6, 2026
38 of 41 checks passed
@ulixius9 ulixius9 deleted the issue-26274 branch March 6, 2026 08:32
harshsoni2024 added a commit that referenced this pull request Mar 6, 2026
…26275)

* Fix #26274: Powerbi - Add support for bigquery NativeQuery Lineage

* log invalid fqn table search

* sql line comment pattern constant

---------

Co-authored-by: harshsoni2024 <harshsoni2024@gmail.com>
ulixius9 added a commit that referenced this pull request Mar 6, 2026
…26275) (#26282)

* Fix #26274: Powerbi - Add support for bigquery NativeQuery Lineage

* log invalid fqn table search

* sql line comment pattern constant

---------

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
ulixius9 added a commit that referenced this pull request Mar 6, 2026
* Fix #26274: Powerbi - Add support for bigquery NativeQuery Lineage (#26275)

* Fix #26274: Powerbi - Add support for bigquery NativeQuery Lineage

* log invalid fqn table search

* sql line comment pattern constant

---------

Co-authored-by: harshsoni2024 <harshsoni2024@gmail.com>

* bump ingestion version

---------

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
harshsoni2024 added a commit that referenced this pull request Mar 6, 2026
…26275)

* Fix #26274: Powerbi - Add support for bigquery NativeQuery Lineage

* log invalid fqn table search

* sql line comment pattern constant

---------

Co-authored-by: harshsoni2024 <harshsoni2024@gmail.com>
Labels

Ingestion
safe to test (Add this label to run secure Github workflows on PRs)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Powerbi Add support for bigquery NativeQuery Lineage

2 participants