
fix: enable subprocess coverage tracking for CLI E2E tests#27329

Open
ulixius9 wants to merge 6 commits into main from fix/cli-e2e-subprocess-coverage

Conversation

@ulixius9
Member

Summary

  • CLI E2E tests run connectors via `subprocess.Popen("metadata ingest")`, but subprocess coverage data was silently lost, meaning SonarCloud never saw the E2E execution coverage
  • Added `parallel = true` to the coverage config so parent and child processes write to separate `.coverage.<pid>` files instead of colliding on a single `.coverage` file
  • Changed the `COVERAGE_PROCESS_START` path from relative to absolute using `GITHUB_WORKSPACE` to ensure the subprocess finds the config regardless of working directory
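The mechanism the bullets describe can be sketched in a few lines. This is an illustrative stand-in, not the PR's actual workflow code: the helper name is hypothetical, and only the `ingestion/pyproject.toml` layout and the environment variables come from the description above.

```python
import os

def coverage_env(workspace=None):
    """Build a child-process environment whose COVERAGE_PROCESS_START is an
    absolute path, so the subprocess finds the coverage config regardless of
    its working directory. `workspace` defaults to GITHUB_WORKSPACE in CI."""
    env = os.environ.copy()
    root = workspace or env.get("GITHUB_WORKSPACE") or os.getcwd()
    env["COVERAGE_PROCESS_START"] = os.path.join(root, "ingestion", "pyproject.toml")
    return env

# The E2E harness would then pass this env to the child, e.g.:
#   subprocess.Popen(["metadata", "ingest", "-c", cfg], env=coverage_env())
```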

Evidence: Metabase (zero unit tests, only E2E) shows 53.6% on SonarCloud with client.py at 17.2%. Inspection of .coverage.metabase confirms only import-time + in-process setUpClass lines are present — method bodies executed in the subprocess (API calls, response parsing) have zero coverage data.

Expected impact: All connectors in the E2E matrix should see significant SonarCloud coverage increases once subprocess execution data is captured (e.g., Metabase client.py from 17% to ~70%+).

Test plan

  • Trigger workflow manually for a single connector (e.g., metabase) with debug: false
  • Check coverage report output in CI logs — connector files should show higher coverage than before
  • Verify .coverage.* glob picks up multiple parallel data files during coverage combine
  • After SonarCloud upload, confirm coverage increase for the tested connector

🤖 Generated with Claude Code

CLI E2E tests run connectors via `subprocess.Popen("metadata ingest")`
but the subprocess coverage data was silently lost. Two issues:

1. Missing `parallel = true` in coverage config — parent pytest process
   and child subprocess both wrote to the same `.coverage` file, causing
   data collision. With parallel mode, each process writes to its own
   `.coverage.<pid>` file that `coverage combine` can merge.

2. `COVERAGE_PROCESS_START` used a relative path (`ingestion/pyproject.toml`)
   in sitecustomize.py. Resolved to absolute using `GITHUB_WORKSPACE`.
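A minimal sketch of such a sitecustomize.py hook, under the layout described above (hypothetical: the real file is injected by the workflow YAML; `coverage.process_startup()` is Coverage.py's documented entry point for measuring subprocesses):

```python
# Hypothetical sitecustomize.py: Python imports this module automatically at
# interpreter startup, so it also runs inside every spawned subprocess.
import os

workspace = os.environ.get("GITHUB_WORKSPACE", os.getcwd())
config_path = os.path.join(workspace, "ingestion", "pyproject.toml")

# Only enable measurement when the config actually exists, so unrelated
# Python processes on the runner are unaffected.
if os.path.exists(config_path):
    os.environ.setdefault("COVERAGE_PROCESS_START", config_path)

try:
    import coverage
    coverage.process_startup()  # no-op when COVERAGE_PROCESS_START is unset
except ImportError:
    pass  # coverage not installed; run without measurement
```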

Evidence: Metabase (zero unit tests, only E2E) shows 53.6% on SonarCloud
with client.py at 17.2% — inspection of .coverage.metabase confirms only
import-time + in-process setup lines are present, with zero method body
coverage from the subprocess execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 13, 2026 17:51
@ulixius9 ulixius9 requested review from a team, akash-jain-10, harshach and tutte as code owners April 13, 2026 17:51
@github-actions github-actions bot added the "Ingestion" and "safe to test" labels Apr 13, 2026
harshach previously approved these changes Apr 13, 2026
Contributor

Copilot AI left a comment


Pull request overview

Enables Python coverage collection for CLI E2E tests that execute connector code in subprocesses, so SonarCloud reflects coverage from those subprocess runs.

Changes:

  • Enabled Coverage.py parallel mode so multiple processes write separate .coverage.* data files instead of colliding.
  • Made COVERAGE_PROCESS_START point to an absolute pyproject.toml path (via GITHUB_WORKSPACE) so subprocesses can reliably find the coverage config.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

  • ingestion/pyproject.toml: Turns on Coverage.py parallel mode to prevent multi-process data-file collisions.
  • .github/workflows/py-cli-e2e-tests.yml: Updates the sitecustomize.py injection so subprocess coverage startup uses an absolute config path.

source = ["metadata"]
relative_files = true
branch = true
parallel = true

Copilot AI Apr 13, 2026


Setting parallel = true in the global [tool.coverage.run] config changes Coverage.py’s default output from a single .coverage file to per-process .coverage.* files. This workflow (and other CI/local targets) currently assumes .coverage exists (e.g., mv ingestion/.coverage ... in the python-unittests/python-integration matrix entries, and local make run_python_tests doesn’t run coverage combine). Consider either (a) scoping parallel mode to the CLI E2E job via a dedicated coverage config used only for COVERAGE_PROCESS_START, or (b) updating the unit/integration flows to coverage combine (or move/rename the .coverage.* set) before referencing ingestion/.coverage.

Suggested change
parallel = true

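Option (b) from the review comment can be sketched with a stdlib-only helper that collects the data files before any step expects a single `ingestion/.coverage`. This is illustrative, not the PR's CI code; the function name and directory arguments are placeholders:

```python
import glob
import os
import shutil

def gather_coverage_data(src_dir, dest_dir):
    """Move every .coverage* data file (the single file, or the parallel
    .coverage.<suffix> set) into dest_dir, returning the new paths so a
    later `coverage combine` step can be pointed at them."""
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for path in glob.glob(os.path.join(src_dir, ".coverage*")):
        target = os.path.join(dest_dir, os.path.basename(path))
        shutil.move(path, target)
        moved.append(target)
    return sorted(moved)
```

After gathering, a single `coverage combine <dest_dir>` merges the parallel files into one report input.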
* MINOR: Fix snowflake e2e

* fix pyformat
Copilot AI review requested due to automatic review settings April 14, 2026 07:19
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

@github-actions
Contributor

The Python checkstyle failed.

Please run `make py_format` and `py_format_check` in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with `make install_test precommit_install`.

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment on lines 34 to 35
class SnowflakeCliTest(CliCommonDB.TestSuite, SQACommonMethods):
"""

Copilot AI Apr 14, 2026


The PR description/title are focused on subprocess coverage collection, but this file also introduces substantial Snowflake E2E behavior changes (unskipping the suite, relaxing assertions, adding multiple new xfail’ed tests/features). To keep the coverage fix reviewable and reduce CI risk, consider splitting the Snowflake test refactor/feature additions into a separate PR (or explicitly call out why they’re required for the coverage change).

Comment on lines +418 to +438
def build_config_file_for_usage(self) -> None:
"""Build config file for usage ingestion"""
import yaml

self.build_config_file(E2EType.INGEST)

with open(self.test_file_path, "r", encoding="utf-8") as file:
config = yaml.safe_load(file)

config["source"]["type"] = "snowflake-usage"
config["source"]["sourceConfig"] = {
"config": {
"type": "DatabaseUsage",
"queryLogDuration": 1,
"resultLimit": 10000,
}
}

with open(self.test_file_path, "w", encoding="utf-8") as file:
yaml.dump(config, file, default_flow_style=False)


Copilot AI Apr 14, 2026


build_config_file_for_usage is added but not referenced anywhere in this test module (or the CLI E2E suite), which increases maintenance surface without exercising the behavior. Either remove it until it’s needed, or add a test that uses it so the configuration path is actually validated in CI.

Suggested change
def build_config_file_for_usage(self) -> None:
"""Build config file for usage ingestion"""
import yaml
self.build_config_file(E2EType.INGEST)
with open(self.test_file_path, "r", encoding="utf-8") as file:
config = yaml.safe_load(file)
config["source"]["type"] = "snowflake-usage"
config["source"]["sourceConfig"] = {
"config": {
"type": "DatabaseUsage",
"queryLogDuration": 1,
"resultLimit": 10000,
}
}
with open(self.test_file_path, "w", encoding="utf-8") as file:
yaml.dump(config, file, default_flow_style=False)

"""Test stored procedures, tags, dynamic tables, streams, constraints,
and clustering in a single ingestion workflow."""
# -- 1. Create all Snowflake objects --
# Stored procedure (requires raw connection for USE DATABASE)

Copilot AI Apr 14, 2026


The comment says the stored procedure creation “requires raw connection for USE DATABASE”, but the code doesn’t execute USE DATABASE (and uses fully-qualified names). Please update the comment to match the actual reason for using raw_connection() (or switch to the SQLAlchemy connection if raw access isn’t required).

Suggested change
# Stored procedure (requires raw connection for USE DATABASE)
# Stored procedure
# Use a raw DBAPI connection for the procedure creation statement.

Copilot uses AI. Check for mistakes.
@sonarqubecloud

Comment on lines +707 to +714
countries_table = self.retrieve_table("e2e_snowflake.E2E_DB.E2E_TEST.COUNTRIES")
self.assertIsNotNone(countries_table)
regions_table = self.retrieve_table("e2e_snowflake.E2E_DB.E2E_TEST.REGIONS")
self.assertIsNotNone(regions_table)
pk_columns = [
col for col in regions_table.columns if col.name.root == "REGION_ID"
]
self.assertGreater(len(pk_columns), 0, "REGION_ID column should exist")


💡 Quality: FK constraint assertion only checks column existence, not the FK

In test_snowflake_features_ingestion, the foreign key constraint validation (lines 707-714) retrieves the countries and regions tables but only asserts that the REGION_ID column exists on the regions table. It never actually checks that the FK relationship was ingested — e.g., verifying countries_table has a constraint referencing regions_table, or checking col.constraint on the REGION_ID column in countries. This means the FK creation SQL runs but the test would pass even if FK ingestion is completely broken.

Suggested fix:

# After retrieving countries_table, verify FK is present:
fk_columns = [
    col for col in countries_table.columns
    if col.name.root == "REGION_ID" and col.constraint
]
self.assertGreater(
    len(fk_columns), 0,
    "REGION_ID in countries should have an FK constraint",
)


@gitar-bot

gitar-bot bot commented Apr 14, 2026

Code Review 👍 Approved with suggestions (0 resolved / 1 finding)

Enables subprocess coverage tracking for CLI E2E tests, ensuring more comprehensive execution metrics. Update the FK constraint assertion in test_snowflake_features_ingestion to verify the foreign key relationship rather than just column existence.

💡 Quality: FK constraint assertion only checks column existence, not the FK

📄 ingestion/tests/cli_e2e/test_cli_snowflake.py:707-714

In test_snowflake_features_ingestion, the foreign key constraint validation (lines 707-714) retrieves the countries and regions tables but only asserts that the REGION_ID column exists on the regions table. It never actually checks that the FK relationship was ingested — e.g., verifying countries_table has a constraint referencing regions_table, or checking col.constraint on the REGION_ID column in countries. This means the FK creation SQL runs but the test would pass even if FK ingestion is completely broken.

Suggested fix
# After retrieving countries_table, verify FK is present:
fk_columns = [
    col for col in countries_table.columns
    if col.name.root == "REGION_ID" and col.constraint
]
self.assertGreater(
    len(fk_columns), 0,
    "REGION_ID in countries should have an FK constraint",
)


