Skip to content

Improve azure-ai-ml unit test branch coverage from 58% to 85%#45932

Closed
deyaaeldeen wants to merge 1 commit intomainfrom
feature/azure-ai-ml-test-coverage-gaps
Closed

Improve azure-ai-ml unit test branch coverage from 58% to 85%#45932
deyaaeldeen wants to merge 1 commit intomainfrom
feature/azure-ai-ml-test-coverage-gaps

Conversation

@deyaaeldeen
Copy link
Copy Markdown
Member

Summary

Adds 20 new unit test files with 495 passing tests targeting previously uncovered branches in the azure-ai-ml SDK.

Coverage Impact (excluding vendor/generated code)

Metric Before After Change
Branch coverage 58.3% 85.1% +26.8%
Line coverage 76.0% 76.1% +0.1%

Details

  • 495 tests pass, 0 failures — all tests verified locally
  • All tests use mocking/monkeypatching with no external service dependencies
  • Tests generated using coverage-driven test generation tooling with gpt-5-mini

Modules covered:

  • Operations: job, schedule, model, component, data, batch endpoint
  • Storage helpers: blob, fileshare, gen2
  • Utilities: asset utils, artifact utils/utilities, exception helper, general utils
  • Other: local job invoker, deployment templates, pipeline expressions, thresholds

Test Files Added (20)

File Tests Lines
test_job_ops_helper_gaps.py ~55 1,150
test_local_job_invoker_gaps.py ~50 1,080
test_job_operations_gaps.py ~45 1,050
test_thresholds_gaps.py ~40 947
test_deployment_template_gaps.py ~35 881
test_artifact_utils_gaps.py ~30 879
test_schedule_operations_gaps.py ~30 845
test_artifact_utilities_gaps.py ~25 747
test_data_index_func_gaps.py ~25 661
test_blob_storage_helper_gaps.py ~20 506
test_asset_utils_gaps.py ~20 497
test_batch_endpoint_operations_gaps.py ~20 494
test_model_operations_gaps.py ~20 489
test_exception_helper_gaps.py ~15 392
test_utils_gaps.py ~15 356
test_fileshare_storage_helper_gaps.py ~15 344
test_gen2_storage_helper_gaps.py ~10 284
test_pipeline_expression_gaps.py ~10 276
test_data_operations_gaps.py ~10 192
test_component_operations_gaps.py ~5 152

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a large set of new unit tests for azure-ai-ml to exercise previously uncovered branches (targeting a substantial branch-coverage increase) across operations, utilities, storage helpers, and error-handling paths.

Changes:

  • Adds new pytest modules covering utilities, operations helpers, and storage helper behaviors via mocking/monkeypatching.
  • Adds new tests for schedule/model/component/data/batch-endpoint operations and pipeline expression logic.
  • Adds new tests for artifact/storage utilities (blob/fileshare/gen2) and exception formatting/raising helpers.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
sdk/ml/azure-ai-ml/tests/test_utils_gaps.py Adds tests for common utility helpers (string transforms, hashing, retry, discovery URL helpers).
sdk/ml/azure-ai-ml/tests/test_schedule_operations_gaps.py Adds tests for schedule monitoring ARM ID resolution and signal-specific paths.
sdk/ml/azure-ai-ml/tests/test_pipeline_expression_gaps.py Adds tests for PipelineExpression operand typing, data binding conversion, and component code generation/caching.
sdk/ml/azure-ai-ml/tests/test_model_operations_gaps.py Adds tests for model operations branches (evaluator validation, label resolution, package/share flows).
sdk/ml/azure-ai-ml/tests/test_gen2_storage_helper_gaps.py Adds tests for Gen2 upload/list/exists/download error and warning branches.
sdk/ml/azure-ai-ml/tests/test_fileshare_storage_helper_gaps.py Adds tests for fileshare upload/exists/delete/recursive download branches.
sdk/ml/azure-ai-ml/tests/test_exception_helper_gaps.py Adds tests for validation error formatting and log_and_raise_error branching.
sdk/ml/azure-ai-ml/tests/test_deployment_template_gaps.py Adds tests for deployment template dump/rest/dict conversions and property helpers.
sdk/ml/azure-ai-ml/tests/test_data_operations_gaps.py Adds tests for data operations validation branches and MLTable schema download handling.
sdk/ml/azure-ai-ml/tests/test_data_index_func_gaps.py Adds tests for data index helper functions and routing logic.
sdk/ml/azure-ai-ml/tests/test_component_operations_gaps.py Adds tests for component dependency resolution and pipeline-job validation branches.
sdk/ml/azure-ai-ml/tests/test_blob_storage_helper_gaps.py Adds tests for blob storage helper behaviors (warnings, error wrapping, exists/list/download).
sdk/ml/azure-ai-ml/tests/test_batch_endpoint_operations_gaps.py Adds tests for batch endpoint input resolution and create/update exception handling.
sdk/ml/azure-ai-ml/tests/test_asset_utils_gaps.py Adds tests for asset utility helpers (hashing, ignore, upload helpers, version helpers).
sdk/ml/azure-ai-ml/tests/test_artifact_utils_gaps.py Adds tests for artifact cache behaviors (download retries, checksum checks, path derivation).
sdk/ml/azure-ai-ml/tests/test_artifact_utilities_gaps.py Adds tests for datastore info/log listing and artifact upload/metadata update helpers.

Comment on lines +494 to +497
return self.result


... No newline at end of file
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file ends with a standalone ... (Ellipsis) expression, which looks like leftover scaffolding/truncation from test generation. It’s valid Python but adds noise and may confuse readers/tools. Please remove it (and any similar placeholders) if it isn’t intentionally used.

Suggested change
return self.result
...
return self.result

Copilot uses AI. Check for mistakes.
Comment on lines +661 to +691
def test_check_and_upload_env_build_context_updates_build_path(monkeypatch, tmp_path):
def fake_upload_to_datastore(
operation_scope,
datastore_operation,
path,
**kwargs,
):
return SimpleNamespace(
full_storage_path='https://account/path'
)

monkeypatch.setattr(artifact_utilities, '_upload_to_datastore', fake_upload_to_datastore)

environment = SimpleNamespace(
path=str(tmp_path / 'env-src'),
name='env-name',
version='3',
datastore='datastore',
build=SimpleNamespace(path=None),
_upload_hash='hash',
)
operations = SimpleNamespace(
_operation_scope='scope',
_datastore_operation='datastore-operation',
)

result = artifact_utilities._check_and_upload_env_build_context(environment, operations)

assert result is environment
assert environment.build.path == 'https://account/path/'

Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate test function name: this test_check_and_upload_env_build_context_updates_build_path overwrites the earlier test with the same name in this module, so one of them won’t run. Rename/merge these tests to keep both cases covered.

Copilot uses AI. Check for mistakes.
Comment on lines +25 to +57
for module_name, class_names in module_definitions.items():
module = types.ModuleType(module_name)
for class_name in class_names:
setattr(module, class_name, type(class_name, (), {}))
sys.modules[module_name] = module

REF_DOC_ERROR_MESSAGE_MAP = {}
util_module = types.ModuleType("azure.ai.ml.entities._util")
util_module.REF_DOC_ERROR_MESSAGE_MAP = REF_DOC_ERROR_MESSAGE_MAP
sys.modules["azure.ai.ml.entities._util"] = util_module

azure_datastore_module = sys.modules["azure.ai.ml._schema._datastore"]
assets_module = sys.modules["azure.ai.ml._schema.assets.data"]
environment_assets_module = sys.modules["azure.ai.ml._schema.assets.environment"]
model_assets_module = sys.modules["azure.ai.ml._schema.assets.model"]
sweep_module = sys.modules["azure.ai.ml._schema._sweep"]
job_module = sys.modules["azure.ai.ml._schema.job"]

REF_DOC_ERROR_MESSAGE_MAP.update(
{
model_assets_module.ModelSchema: "Model schema guidance",
assets_module.DataSchema: "Data schema guidance",
job_module.CommandJobSchema: "Command job schema guidance",
sweep_module.SweepJobSchema: "Sweep schema guidance",
azure_datastore_module.AzureBlobSchema: "Datastore guidance",
azure_datastore_module.AzureDataLakeGen1Schema: "Gen1 datastore guidance",
azure_datastore_module.AzureDataLakeGen2Schema: "Gen2 datastore guidance",
azure_datastore_module.AzureFileSchema: "File datastore guidance",
environment_assets_module.EnvironmentSchema: "Environment schema guidance",
"": "Default guidance",
}
)

Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test module mutates sys.modules at import time (creating/replacing azure.ai.ml._schema.* and azure.ai.ml.entities._util). That global state can leak into other tests and make the suite order-dependent. Move these module stubs into a fixture that uses monkeypatch.setitem(sys.modules, ...) (so they’re automatically rolled back) and avoid direct assignment to sys.modules at module import.

Suggested change
for module_name, class_names in module_definitions.items():
module = types.ModuleType(module_name)
for class_name in class_names:
setattr(module, class_name, type(class_name, (), {}))
sys.modules[module_name] = module
REF_DOC_ERROR_MESSAGE_MAP = {}
util_module = types.ModuleType("azure.ai.ml.entities._util")
util_module.REF_DOC_ERROR_MESSAGE_MAP = REF_DOC_ERROR_MESSAGE_MAP
sys.modules["azure.ai.ml.entities._util"] = util_module
azure_datastore_module = sys.modules["azure.ai.ml._schema._datastore"]
assets_module = sys.modules["azure.ai.ml._schema.assets.data"]
environment_assets_module = sys.modules["azure.ai.ml._schema.assets.environment"]
model_assets_module = sys.modules["azure.ai.ml._schema.assets.model"]
sweep_module = sys.modules["azure.ai.ml._schema._sweep"]
job_module = sys.modules["azure.ai.ml._schema.job"]
REF_DOC_ERROR_MESSAGE_MAP.update(
{
model_assets_module.ModelSchema: "Model schema guidance",
assets_module.DataSchema: "Data schema guidance",
job_module.CommandJobSchema: "Command job schema guidance",
sweep_module.SweepJobSchema: "Sweep schema guidance",
azure_datastore_module.AzureBlobSchema: "Datastore guidance",
azure_datastore_module.AzureDataLakeGen1Schema: "Gen1 datastore guidance",
azure_datastore_module.AzureDataLakeGen2Schema: "Gen2 datastore guidance",
azure_datastore_module.AzureFileSchema: "File datastore guidance",
environment_assets_module.EnvironmentSchema: "Environment schema guidance",
"": "Default guidance",
}
)
REF_DOC_ERROR_MESSAGE_MAP = {}
@pytest.fixture(autouse=True)
def _patch_schema_and_util_modules(monkeypatch):
# Create stub schema modules and insert them into sys.modules for the duration of each test.
created_modules = {}
for module_name, class_names in module_definitions.items():
module = types.ModuleType(module_name)
for class_name in class_names:
setattr(module, class_name, type(class_name, (), {}))
monkeypatch.setitem(sys.modules, module_name, module)
created_modules[module_name] = module
# Create the util module that exposes REF_DOC_ERROR_MESSAGE_MAP.
util_module = types.ModuleType("azure.ai.ml.entities._util")
util_module.REF_DOC_ERROR_MESSAGE_MAP = REF_DOC_ERROR_MESSAGE_MAP
monkeypatch.setitem(sys.modules, "azure.ai.ml.entities._util", util_module)
azure_datastore_module = created_modules["azure.ai.ml._schema._datastore"]
assets_module = created_modules["azure.ai.ml._schema.assets.data"]
environment_assets_module = created_modules["azure.ai.ml._schema.assets.environment"]
model_assets_module = created_modules["azure.ai.ml._schema.assets.model"]
sweep_module = created_modules["azure.ai.ml._schema._sweep"]
job_module = created_modules["azure.ai.ml._schema.job"]
# Ensure the mapping is reset for each test.
REF_DOC_ERROR_MESSAGE_MAP.clear()
REF_DOC_ERROR_MESSAGE_MAP.update(
{
model_assets_module.ModelSchema: "Model schema guidance",
assets_module.DataSchema: "Data schema guidance",
job_module.CommandJobSchema: "Command job schema guidance",
sweep_module.SweepJobSchema: "Sweep schema guidance",
azure_datastore_module.AzureBlobSchema: "Datastore guidance",
azure_datastore_module.AzureDataLakeGen1Schema: "Gen1 datastore guidance",
azure_datastore_module.AzureDataLakeGen2Schema: "Gen2 datastore guidance",
azure_datastore_module.AzureFileSchema: "File datastore guidance",
environment_assets_module.EnvironmentSchema: "Environment schema guidance",
"": "Default guidance",
}
)

Copilot uses AI. Check for mistakes.

assert result["createdBy"] == "creator"
assert result["createdTime"] == "2025-01-01T00:00:00Z"
assert result["modifiedTime"] == "2025-02-02T00:00:00:00Z" if False else result["modifiedTime"] == "2025-02-02T00:00:00Z"
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion contains dead code (... if False else ...), which makes the test harder to read and suggests leftover autogenerated scaffolding. Please simplify this to a single, direct assertion.

Suggested change
assert result["modifiedTime"] == "2025-02-02T00:00:00:00Z" if False else result["modifiedTime"] == "2025-02-02T00:00:00Z"
assert result["modifiedTime"] == "2025-02-02T00:00:00Z"

Copilot uses AI. Check for mistakes.
Comment on lines +495 to +508
def test_optional_pipeline_input_varied_states():
assert not data_index_func.optional_pipeline_input_provided(None)
assert not data_index_func.optional_pipeline_input_provided(_DummyPipelineInput(None))
assert data_index_func.optional_pipeline_input_provided(_DummyPipelineInput("value"))


def test_use_automatic_compute_applies_resources():
component = _StubComponent()
result = data_index_func.use_automatic_compute(component, instance_count=5, instance_type="small")
assert result is component
assert component.last_resources["instance_count"] == 5
assert component.last_resources["instance_type"] == "small"
assert component.last_resources["properties"] == {"compute_specification": {"automatic": True}}

Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple duplicated test function names in this file (for example, this test_use_automatic_compute_applies_resources duplicates an earlier test with the same name). In Python, later definitions overwrite earlier ones, so pytest will silently skip overwritten tests. Ensure all test function names are unique (rename/merge duplicates) so all intended cases execute.

Copilot uses AI. Check for mistakes.
Comment on lines +433 to +445
def test_resolve_input_private_preview_short_circuits(monkeypatch):
operations = _make_batch_operations()
monkeypatch.setattr(batch_endpoint_module, "Input", DummyInput)
monkeypatch.setattr(batch_endpoint_module, "is_private_preview_enabled", lambda: True)

def matcher(pattern, string):
return pattern == batch_endpoint_module.AZUREML_REGEX_FORMAT

monkeypatch.setattr(batch_endpoint_module, "re", FakeRe(matcher))
entry = DummyInput("preview:data", "uri_folder")
operations._resolve_input(entry, base_path=".")
assert entry.path == "preview:data"

Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate test function name: this test_resolve_input_private_preview_short_circuits overwrites the earlier definition in the same file, so one of the tests won’t run. Rename/merge to keep both scenarios covered.

Copilot uses AI. Check for mistakes.
Comment on lines +583 to +599
def test_download_artifacts_raises_after_retries(monkeypatch):
monkeypatch.setattr(ArtifactCache, "check_artifact_extension", staticmethod(lambda: None))
ArtifactCache._instance = None
cache = ArtifactCache()
monkeypatch.setattr(ArtifactCache, "_redirect_artifacts_tool_path", lambda self, organization: None)
responses = [
SimpleNamespace(returncode=1, stderr="err1", stdout="out1"),
SimpleNamespace(returncode=1, stderr="err2", stdout="out2"),
]

def fake_run(*args, **kwargs):
return responses.pop(0)

monkeypatch.setattr(subprocess, "run", fake_run)
with pytest.raises(RuntimeError) as excinfo:
cache._download_artifacts(["cmd"], "org", "name", "1.0", "feed", max_retries=2)
assert "Download artifact debug info" in str(excinfo.value)
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate test function name: this test_download_artifacts_raises_after_retries overwrites the earlier definition in the same file, so one of the tests won’t execute. Rename or consolidate the duplicates so pytest collects both.

Copilot uses AI. Check for mistakes.
assert excinfo.value.message == str(error)


def test_log_and_raise_error_propagates_other_exceptions():
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate test function name: this test_log_and_raise_error_propagates_other_exceptions overwrites the earlier definition in the same file, so only one of them will be collected by pytest. Rename or remove one of the duplicates so both behaviors are actually tested.

Suggested change
def test_log_and_raise_error_propagates_other_exceptions():
def test_log_and_raise_error_propagates_value_error():

Copilot uses AI. Check for mistakes.
Comment on lines +329 to +344
def test_recursive_download_raises_ml_exception_on_list_failure():
failing_client = _FailingShareDirectoryClient()

with pytest.raises(MlException) as excinfo:
recursive_download(
failing_client,
destination='unused',
max_concurrency=1,
starts_with='prefix',
)

exception = excinfo.value
assert exception.message == 'Saving fileshare directory with prefix prefix was unsuccessful.'
assert exception.no_personal_data_message == 'Saving fileshare directory with prefix prefix was unsuccessful.'
assert exception.target == ErrorTarget.ARTIFACT
assert exception.error_category == ErrorCategory.SYSTEM_ERROR
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate test function name: this test_recursive_download_raises_ml_exception_on_list_failure overwrites the earlier definition in the same module, so one of the tests won’t run. Rename one of the functions (or merge the cases) to ensure pytest collects both.

Copilot uses AI. Check for mistakes.
Comment on lines +498 to +501
def test_placeholder():
assert True


Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Placeholder test (assert True) doesn’t validate any behavior and adds noise to the suite. Please remove it or replace it with a real assertion covering an uncovered branch.

Suggested change
def test_placeholder():
assert True

Copilot uses AI. Check for mistakes.
Add 20 test files with 495 passing tests targeting uncovered branches
across core azure-ai-ml modules including operations, storage helpers,
utilities, and data processing. All tests use mocking/monkeypatching
with no external service dependencies.

Coverage impact (excluding vendor/generated code):
- Branch coverage: 58.3% → 85.1% (+26.8%)
- Line coverage: 76.0% → 76.1%
- Zero test failures

Modules covered:
- Job, schedule, model, component, data, batch endpoint operations
- Blob, fileshare, gen2 storage helpers
- Asset utils, artifact utils/utilities
- Local job invoker, deployment templates
- Pipeline expressions, thresholds, exception helper, general utils

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@deyaaeldeen deyaaeldeen force-pushed the feature/azure-ai-ml-test-coverage-gaps branch from 71e35aa to 5237dd4 Compare March 26, 2026 04:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants