Update vicreg entropy #193

Merged
m1rl0k merged 5 commits into test from update-vicreg-entropy
Jan 23, 2026
m1rl0k (Collaborator) commented Jan 21, 2026

No description provided.

Replaces print statements with logger calls for better logging consistency across modules. Updates docstrings and comments to clarify the distinction between tool-level and retrieval-level intent classification, especially in intent_classifier.py and mcp_router/intent.py. Improves TOON format handling to preserve structured results for internal callers while optimizing external output. Adds debug and composability improvements to hybrid search and context modules.
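A minimal sketch of why the print-to-logger swap matters under a stdio transport, assuming the server writes its JSON-RPC frames to stdout: a stray print would land in the same stream as the protocol, while a logger with a stderr handler stays out of it. The names `get_stdio_safe_logger` and `handle_request` are hypothetical and are not taken from this PR.

```python
import json
import logging
import sys


def get_stdio_safe_logger(name, stream=None):
    """Return a logger that writes to stderr (or a given stream), never stdout.

    Hypothetical helper for illustration only.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep records away from any root handlers
    if not logger.handlers:
        handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
        handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger


def handle_request(request, logger):
    """Build a JSON-RPC response string; diagnostics go to the logger only."""
    logger.debug("handling method=%s", request.get("method"))
    return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "result": "ok"})
```

With this shape, whatever the caller writes to stdout is exactly the string returned by `handle_request`, so the client can always parse it as JSON.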
The 'symbol' field is now included in compact TOON-formatted outputs in both search and context results. The TOON helper is now imported from a shared location to avoid duplication, with a fallback for standalone usage. Test logging for TOON decode errors now uses the logging module instead of print to prevent stdio issues.
Introduces relative path extraction and indexing in the Neo4j backend for improved filtering and cleaner output, updating all relevant Cypher queries and node properties. Enhances fallback cache implementations in hybrid_search.py and semantic_expansion.py with bounded OrderedDicts for FIFO eviction. Upgrades mcp to 1.25.0. Extends VICReg with batch entropy regularization (VICReg-E) to improve activation diversity and prevent collapse, with validated loss improvements.
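The bounded-OrderedDict fallback cache mentioned above can be sketched as follows. This is an illustration of FIFO eviction in general, not the code in hybrid_search.py or semantic_expansion.py; the class name `BoundedFIFOCache` and the default size are assumptions.

```python
from collections import OrderedDict


class BoundedFIFOCache:
    """Size-capped cache that evicts the oldest inserted key first (hypothetical name)."""

    def __init__(self, max_size=256):
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self._max_size = max_size
        self._data = OrderedDict()

    def get(self, key, default=None):
        # No move_to_end here: reads do not refresh an entry, which is
        # what makes this FIFO rather than LRU.
        return self._data.get(key, default)

    def set(self, key, value):
        if key in self._data:
            # Updating an existing key keeps its original insertion position.
            self._data[key] = value
            return
        if len(self._data) >= self._max_size:
            # popitem(last=False) removes the oldest inserted entry.
            self._data.popitem(last=False)
        self._data[key] = value

    def __len__(self):
        return len(self._data)
```

FIFO is the simpler choice for a fallback cache: reads stay side-effect free, at the cost of occasionally evicting an entry that is still in active use.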

augmentcode bot commented Jan 21, 2026

🤖 Augment PR Summary

Summary: This PR refines retrieval/search behavior and improves observability while preserving JSON-RPC compatibility.

Changes:

  • Adds a cap-based escape hatch to ReFRAG “gate-first” candidate gathering to skip gating when it would exceed the configured cap.
  • Clarifies the intentional split between retrieval-level intent (4 broad categories) and tool-level router intent (12+ categories) via expanded module docstrings.
  • Replaces several debug print statements with structured logging to avoid corrupting stdio in MCP/JSON-RPC contexts.
  • Makes TOON output more composable by preserving structured results as results_json while emitting TOON text in results.
  • Updates context_search to pass through the caller’s output_format and prefer results_json when present.
  • Extends compact TOON encodings to include symbol (search/context) and pattern results.
  • Updates tests to log TOON decode failures via logging instead of stderr prints.

Technical Notes: These changes primarily target better token efficiency, safer logging under JSON-RPC, and clearer intent routing boundaries.

@augmentcode augmentcode bot left a comment

Review completed. 2 suggestions posted.

if isinstance(results, list):
    # Replace JSON array with TOON string (handles empty arrays too)
    # Preserve original list for internal callers before TOON encoding
    response["results_json"] = results

Including the full structured results_json alongside TOON-encoded results can negate the token-saving goal and may cause unexpectedly large responses for external clients. Consider whether results_json should be returned only in internal/composability contexts rather than in the tool’s public response.

Other Locations
  • scripts/mcp_impl/toon.py:116
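One way to act on this suggestion, sketched under assumptions: attach the structured list only when the caller is internal, so external responses carry the TOON string alone. The `internal` flag and the function name `apply_output_format` are hypothetical, and `fake_toon_encode` is a stand-in that does not reproduce the real TOON format or the encoder in scripts/mcp_impl/toon.py.

```python
def fake_toon_encode(rows):
    """Stand-in encoder (NOT real TOON): one comma-joined line of values per row."""
    return "\n".join(",".join(str(v) for v in row.values()) for row in rows)


def apply_output_format(response, encode=fake_toon_encode, internal=False):
    """Return a copy of `response` with `results` encoded as text.

    The structured list is kept under `results_json` only for internal
    callers; `internal` is a hypothetical flag, not an option from this PR.
    """
    results = response.get("results")
    if not isinstance(results, list):
        return response
    out = dict(response)  # shallow copy so the caller's dict is not mutated
    if internal:
        out["results_json"] = results
    out["results"] = encode(results)  # empty lists encode to an empty string
    return out
```

External clients then pay only for the compact text, while composing callers such as context_search can still read the structured list.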

Introduces Intent-Aware Expansion Cascade (IAEC) for routing query expansion strategies based on query intent, adds pattern-based expansion using discovered code patterns, and implements calibrated confidence for reranking with score-separation-aware early stopping. Updates configuration documentation and .env.example with new options. Refactors hybrid expansion and search to support intent-driven strategies, and enhances pipeline edge extraction for Neo4j/Qdrant compatibility. (This fixes Qdrant graph-only collections on K8s deploy.)
Added upload-pvc.yaml, code-models-pvc.yaml, upload-service.yaml, and learning-reranker-worker.yaml to the Kubernetes deployment process and kustomization.yaml. Updated Dockerfile to include templates directory. These changes support new service deployments and resource management.
@m1rl0k m1rl0k merged commit 17a7a17 into test Jan 23, 2026
1 check passed
@m1rl0k m1rl0k deleted the update-vicreg-entropy branch January 23, 2026 13:20