Skip to content

test(e2e): winml inspect E2E tests + fix --list-tasks crash#676

Merged
DingmaomaoBJTU merged 4 commits into
mainfrom
qiowu/inspect_e2e
May 20, 2026
Merged

test(e2e): winml inspect E2E tests + fix --list-tasks crash#676
DingmaomaoBJTU merged 4 commits into
mainfrom
qiowu/inspect_e2e

Conversation

@DingmaomaoBJTU
Copy link
Copy Markdown
Collaborator

@DingmaomaoBJTU DingmaomaoBJTU commented May 19, 2026

Summary

Adds tests/e2e/test_inspect_e2e.py39 tests covering winml inspect.
Offline tests run without downloading any model weights; network tests hit
HuggingFace Hub with real model IDs.

Also fixes a crash in get_known_tasks(): HF_MODEL_CLASS_MAPPING contains
(model_type, None) sentinel entries for per-model-type defaults; iterating
without a None guard caused sorted() to raise
TypeError: '<' not supported between instances of 'NoneType' and 'str'
whenever --list-tasks was invoked.

uv run pytest tests/e2e/ -m "e2e and not network" -k inspect_e2e   # offline (30)
uv run pytest tests/e2e/ -m "e2e and network"     -k inspect_e2e   # network (9)

CLI commands under test

# ── offline: CLI surface ─────────────────────────────────────────────
winml inspect
winml inspect --help
winml inspect --model-type bert --format xml

# ── offline: --list-tasks ────────────────────────────────────────────
winml inspect --list-tasks

# ── offline: --model-type / --model-class (no weight download) ───────
winml inspect --model-type bert -f json
winml inspect --model-type bert -t feature-extraction -f json
winml inspect --model-type resnet -f json
winml inspect --model-class BertForMaskedLM -f json
winml inspect --model-type bert --verbose -f json
winml inspect --model-type bert --hierarchy -f json
winml inspect --model-type totally_nonexistent_model_xyz_123

# ── network: real model IDs ──────────────────────────────────────────
winml inspect -m bert-base-uncased -f json
winml inspect -m bert-base-uncased -f json -t feature-extraction
winml inspect -m bert-base-uncased -f json -t next-sentence-prediction
winml inspect -m microsoft/resnet-50 -f json
winml inspect -m facebook/convnext-tiny-224 -f json
winml inspect -m google/vit-base-patch16-224 -f json
winml inspect -m openai/clip-vit-base-patch32 -f json
winml inspect -m openai/clip-vit-base-patch32 -f json -t image-feature-extraction
winml inspect -m facebook/detr-resnet-50 -f json

Coverage

TestInspectCliSurface — CLI surface (11 tests, offline)

  • winml inspect (no args) → exit 2 with UsageError
  • winml inspect --help → exit 0
  • All flags documented in --help: -m, -f, --model-type, --model-class, --list-tasks, -v/--verbose, -H/--hierarchy
  • --format xml → exit non-zero, error mentions the bad choice

TestInspectListTasks--list-tasks (6 tests, offline)

  • Exit 0, non-empty output, every line is a non-empty string
  • Includes well-known tasks: feature-extraction, mask-generation
  • Works without any -m / --model-type / --model-class argument
  • Contrast: bare winml inspect → exit 2; winml inspect --list-tasks → exit 0

TestInspectModelTypeOnly — offline inspection (13 tests, offline)

  • --model-type bert and --model-type resnet resolve correct model_type + task in JSON
  • --model-class BertForMaskedLM resolves to bert / fill-mask
  • All 14 keys in EXPECTED_TOP_KEYS present in JSON output
  • loader, exporter, winml sub-sections have required fields
  • exporter.input_tensors and output_tensors are lists
  • -v / --verbose and table format both exit 0
  • --hierarchy without -mhierarchy: null (no crash)
  • Unknown model type exits non-zero

TestInspectBertbert-base-uncased (3 tests, network)

  • Auto-detect resolves to fill-mask, task_source == "TasksManager"
  • Explicit feature-extraction override accepted
  • next-sentence-prediction: clean success or clean error (no raw traceback)

TestInspectVision — vision models (3 tests, network)

  • microsoft/resnet-50, facebook/convnext-tiny-224, google/vit-base-patch16-224
  • Auto-detect → image-classification for all three

TestInspectCLIPopenai/clip-vit-base-patch32 (2 tests, network)

  • Auto-detect → feature-extraction or zero-shot-image-classification
  • Explicit image-feature-extraction override accepted

TestInspectDETRfacebook/detr-resnet-50 (1 test, network)

  • Auto-detect → object-detection

Add offline and network E2E tests for `winml inspect`, covering:
- CLI surface: help flags, no-args UsageError, invalid --format
- --list-tasks: exits 0, non-empty task lines, known tasks present
- Offline inspection via --model-type / --model-class (no weight download)
- Network inspection for bert, vision models (resnet/convnext/vit), CLIP, DETR

Also fix a bug in get_known_tasks(): HF_MODEL_CLASS_MAPPING contains
(model_type, None) sentinel entries for per-model-type defaults; iterating
over them without a None guard caused sorted() to crash with
"'<' not supported between instances of 'NoneType' and 'str'".
@DingmaomaoBJTU DingmaomaoBJTU requested a review from a team as a code owner May 19, 2026 09:22
Copy link
Copy Markdown
Contributor

@xieofxie xieofxie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review — 3 issues, posted inline so each can be resolved individually. The get_known_tasks() None-guard fix looks clean.

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

Comment thread tests/e2e/test_inspect_e2e.py Outdated
Comment thread tests/e2e/test_inspect_e2e.py Outdated
Comment thread tests/e2e/test_inspect_e2e.py Outdated
Address PR review comments on #676:

1. Move TestInspectCliSurface (11 tests) and TestInspectListTasks (6 tests)
   from tests/e2e/ to tests/cli/test_inspect_cli.py per tests/CLAUDE.md:
   CLI surface tests belong under tests/cli/, not tests/e2e/.
   They now run under the default CI filter without -m e2e.

2. Replace tautological assertions in test_list_tasks_all_lines_are_strings
   (isinstance(str.strip(), str) is always true; len > 0 is guaranteed by
   the if-guard) with a regex invariant: each task line must match
   r"^[a-z][a-z0-9-]*$".

3. Collapse two redundant tests (test_list_tasks_no_model_flag_needed and
   test_list_tasks_overrides_missing_model_requirement both just asserted
   exit_code == 0, already covered by test_list_tasks_exits_zero) into a
   single test_list_tasks_is_sorted that verifies the --list-tasks output
   is in ascending lexicographic order.
Comment thread tests/cli/test_inspect_cli.py
Comment thread tests/cli/test_inspect_cli.py
Comment thread tests/cli/test_inspect_cli.py Outdated
- Parametrize 8 test_help_documents_* into one test with a
  class-scoped help_output fixture (one --help call per class)
- Merge test_invalid_format_exits_nonzero + test_invalid_format_names_bad_choice
  into single test_invalid_format
- Extract shared _run helper to tests/_helpers.py with correct
  -> Result return type; both cli and e2e test files import from there
Copy link
Copy Markdown
Collaborator

@timenick timenick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. All three review comments cleanly addressed in 41d1947:

  • 8 test_help_documents_* collapsed into a single parametrized test with a class-scoped help_output fixture (one --help invocation per class)
  • test_invalid_format_* merged into one method
  • _run extracted to tests/_helpers.py with the correct Result return type, shared by both the CLI and E2E test files

The get_known_tasks() None-sentinel fix is also correct — verified that MODEL_CLASS_MAPPING (e.g. in models/hf/sam.py) intentionally uses (model_type, None) entries as per-model-type defaults.

🤖 Generated with GitHub Copilot CLI

@DingmaomaoBJTU DingmaomaoBJTU merged commit 271c1f7 into main May 20, 2026
9 checks passed
@DingmaomaoBJTU DingmaomaoBJTU deleted the qiowu/inspect_e2e branch May 20, 2026 03:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants