Skip to content

Test updates: stabler debup hash and cli integration coverage#462

Open
rugeli wants to merge 6 commits intomainfrom
feature/tests
Open

Test updates: stabler debup hash and cli integration coverage#462
rugeli wants to merge 6 commits intomainfrom
feature/tests

Conversation

@rugeli
Copy link
Copy Markdown
Collaborator

@rugeli rugeli commented Apr 14, 2026

Problem

follow-up to #419, closes #460

Solution

  • added _normalize_for_hashing to sort lists whose element order is not meaningful (e.g. "interior": ["A", "B", "C"], "interior": [ {"object": "green_sphere", "count": 5}, "outer_sphere"] ), while leaving order-sensitive lists(vectors, colors, etc.) unchanged.
  • added integration tests in test_pack_cli.py
  • refactored aws tests to write temporary files to tmp_path instead of the working dir

One note: this change can alter hash values for recipes that have lists, so re-run the same recipe after this merge may hash differently. Shouldn't cause issues, but worth noting so we can keep an eye on database tidiness.

Type of change

  • Refactor (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

@github-actions
Copy link
Copy Markdown
Contributor

Packing analysis report

Analysis for packing results located at cellpack/tests/outputs/test_spheres/spheresSST

Ingredient name Encapsulating radius Average number packed
ext_A 25 236.0

Packing image

Packing image

Distance analysis

Expected minimum distance: 50.00
Actual minimum distance: 50.01

Ingredient key Pairwise distance distribution
ext_A Distance distribution ext_A

Comment on lines +66 to +68
test_file = tmp_path / "test_file.txt"
test_file.write_text("test file")
aws_handler.upload_file(str(test_file))
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a bit out of scope but a simple change, switched tests to write via tmp_path so they leave no trace like test_file.txt in the repo root

Comment thread cellpack/tests/test_pack_cli.py Outdated
pack(recipe=recipe_data, config_path=str(config_path))


def test_pack_with_default_config(tmp_path, monkeypatch):
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to tmp_path in aws tests, we don't want test files written inside the repo, so we change directory to tmp_path with monkeypatch.chdir()

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR strengthens recipe deduplication by normalizing recipe JSON prior to hashing (so semantically equivalent recipes hash the same), and adds/updates tests to cover both the hashing behavior and the CLI workflow.

Changes:

  • Add DataDoc._normalize_for_hashing() and update generate_hash() to canonicalize order-insensitive lists while preserving positional numeric lists.
  • Add integration tests for cellpack/bin/pack.py covering recipe path input, recipe dict input, and default config behavior.
  • Refactor AWS handler tests to write temporary files under tmp_path instead of the working directory.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
cellpack/autopack/DBRecipeHandler.py Implements normalization logic to produce stable dedup hashes across semantically equivalent JSON.
cellpack/tests/test_data_doc.py Adds test cases that assert hash stability across dict key order and order-insensitive list permutations.
cellpack/tests/test_pack_cli.py Introduces CLI integration tests covering recipe-as-path, recipe-as-dict, and default config flows.
cellpack/tests/test_aws_handler.py Updates tests to use tmp_path for safer cross-platform temp file handling.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread cellpack/tests/test_pack_cli.py Outdated
Comment on lines +61 to +65
def test_pack_with_recipe_path(tmp_path):
recipe_path = tmp_path / "recipe.json"
recipe_path.write_text(json.dumps(recipe_data))
config_path = _write_config(tmp_path)
pack(recipe=str(recipe_path), config_path=str(config_path))
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recipe_data is a module-level dict that gets passed into pack() (directly or via json.dumps). RecipeLoader mutates the input dict when recipe is a dict (it reuses nested references and later replaces objects[*]["representations"]/partners with class instances), so test order can become flaky (e.g., json.dumps(recipe_data) can fail after another test runs). Use a per-test fresh recipe dict (e.g., a pytest fixture returning a new dict or copy.deepcopy(recipe_data) at call sites) before writing/packing.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

have an integration test to check if the changes in this file don't break the local workflow

2 participants