Add strategy and strategy_prompt params to dedupe #110

Merged: nikosbosse merged 2 commits into main from feat/dedupe-strategy-modes on Feb 9, 2026

Conversation

@nikosbosse (Contributor)

Summary

  • Add strategy parameter to dedupe() / dedupe_async(): "identify", "select" (default), or "combine"
  • Add strategy_prompt parameter for guiding LLM selection/combining behavior
  • Update generated DedupeOperation model with new fields
  • Update docs with strategy examples
  • Add integration tests for each strategy mode

Depends on server-side changes in futuresearch/delphos#4175 — merge that first.

Test plan

  • SDK model serialization (to_dict/from_dict) handles new fields
  • Integration tests for identify, select, combine, and strategy_prompt
  • Docs updated with examples

🤖 Generated with Claude Code

- Add `strategy` parameter to `dedupe()` / `dedupe_async()`:
  `"identify"`, `"select"` (default), or `"combine"`
- Add `strategy_prompt` parameter for guiding LLM selection/combining
- Update generated `DedupeOperation` model with new fields
- Convert strategy string to `DedupeOperationStrategy` enum before
  passing to generated model (prevents AttributeError on serialization)
- Update docs with strategy examples
- Add integration tests for each strategy mode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
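
The enum-conversion bullet in the commit message above can be illustrated with a minimal sketch. Everything here is a local stand-in, not the SDK's code: the enum is re-declared with member values taken from the summary (only the `SELECT` member name appears in the diff, so `IDENTIFY` and `COMBINE` are assumed), and `serialize_strategy` and `to_strategy_enum` are hypothetical helpers. The sketch assumes the generated model reads `.value` during serialization, which is why a plain string would raise `AttributeError`.

```python
from enum import Enum


class DedupeOperationStrategy(str, Enum):
    """Local stand-in for the generated enum (member names partly assumed)."""

    IDENTIFY = "identify"
    SELECT = "select"
    COMBINE = "combine"


def serialize_strategy(strategy):
    """Hypothetical stand-in for a to_dict() step that reads `.value`."""
    return strategy.value


def to_strategy_enum(strategy: str) -> DedupeOperationStrategy:
    """Convert a user-supplied string to the enum, rejecting unknown names."""
    try:
        return DedupeOperationStrategy(strategy)
    except ValueError:
        allowed = ", ".join(member.value for member in DedupeOperationStrategy)
        raise ValueError(
            f"Unknown strategy {strategy!r}; expected one of: {allowed}"
        ) from None
```

Passing the raw string `"select"` to `serialize_strategy` fails because `str` has no `.value` attribute, whereas `serialize_strategy(to_strategy_enum("select"))` succeeds. Converting early also means an unsupported name is rejected before any request is built.
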
input_: DedupeOperationInputType2 | list[DedupeOperationInputType1Item] | UUID
equivalence_relation: str
session_id: None | Unset | UUID = UNSET
strategy: DedupeOperationStrategy | Unset = DedupeOperationStrategy.SELECT

Contributor:

How does Unset work? Would it still work if we already have a default?

Contributor (author):

Good question. Unset is how the generated client distinguishes "not provided" from an explicit value. When UNSET is passed, the field is omitted from the serialized JSON request, letting the server apply its own default. The default of DedupeOperationStrategy.SELECT here is the server's documented default — it's used when someone instantiates the model directly without specifying strategy. But from ops.py, when the user passes strategy=None, we pass UNSET to let the server decide. So | Unset in the type is needed because it's a valid runtime value (meaning "omit from request"), even though the attr default is SELECT.
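
The behavior described in this reply can be sketched as follows. This is a simplified illustration and not the generated client: `Unset`, `UNSET`, `Strategy`, `DedupeRequest`, and `build_request` are hypothetical local stand-ins that only mirror the pattern (a sentinel that causes the field to be omitted from the serialized request, alongside a model-level default of `SELECT`).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional, Union


class Unset:
    """Sentinel type meaning "field not provided"; always falsy."""

    def __bool__(self) -> bool:
        return False


UNSET = Unset()


class Strategy(str, Enum):
    IDENTIFY = "identify"
    SELECT = "select"
    COMBINE = "combine"


@dataclass
class DedupeRequest:
    """Hypothetical, trimmed-down stand-in for the generated model."""

    equivalence_relation: str
    # Default used when the model is instantiated directly without a strategy.
    strategy: Union[Strategy, Unset] = Strategy.SELECT

    def to_dict(self) -> Dict[str, Any]:
        body: Dict[str, Any] = {"equivalence_relation": self.equivalence_relation}
        # An UNSET field is left out entirely, so the server applies its default.
        if not isinstance(self.strategy, Unset):
            body["strategy"] = self.strategy.value
        return body


def build_request(equivalence_relation: str, strategy: Optional[str] = None) -> DedupeRequest:
    """Mirrors the wrapper behavior described above: None becomes UNSET."""
    chosen = UNSET if strategy is None else Strategy(strategy)
    return DedupeRequest(equivalence_relation, strategy=chosen)
```

In this sketch, `DedupeRequest("same person").to_dict()` includes `"strategy": "select"`, while `build_request("same person").to_dict()` contains no `strategy` key at all. That is the distinction the `| Unset` part of the type annotation exists to express.
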

Contributor (author):

The above was Claude :)

Comment on lines +642 to +648
Args:
    equivalence_relation: Description of what makes items equivalent
    session: Optional session. If not provided, one will be created automatically.
    input: The input table (DataFrame, UUID, or TableResult)
    strategy: Strategy for handling duplicates: 'identify' (cluster only),
        'select' (pick best, default), 'combine' (synthesize combined row)
    strategy_prompt: Optional instructions guiding how selection or combining is performed

Contributor:

Sorry, this goes beyond your PR, but I think these descriptions might not be informative enough for Claude Code (CC) or similar tools.

Contributor (author):

Agreed — expanded the docstrings for strategy and strategy_prompt on both dedupe() and the generated DedupeOperation model to be much more descriptive. Each strategy mode now explains what columns are added, when to use it, and how strategy_prompt interacts with it. Should be much more useful for CC and similar tools reading the docstrings.

Address PR review feedback: make parameter descriptions more verbose
so that Claude Code and similar tools can understand the full behavior
of each strategy mode and how strategy_prompt interacts with them.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@nikosbosse merged commit 2a4a49b into main on Feb 9, 2026
4 checks passed
@nikosbosse deleted the feat/dedupe-strategy-modes branch on February 9, 2026 12:40