```typescript
const duplicates = await consolidationService.detectDuplicateMemories(
  'Reminder: invoices go out on the 5th.',
  0.75
);

if (duplicates.length > 1) {
  const primaryId = duplicates[0].id;
  const others = duplicates.slice(1).map(d => d.id);

  // Example pattern; consult the actual ConsolidationService implementation
  // for the exact method names available in this version.
  if (consolidationService.validateConsolidationEligibility) {
    const validation = await consolidationService.validateConsolidationEligibility(primaryId, others);
    if (!validation.isValid) {
      console.log('Consolidation not safe:', validation.reasons);
      // Abort or adjust as needed
    }
  }

  if (consolidationService.previewConsolidation) {
    const preview = await consolidationService.previewConsolidation(primaryId, others);
    console.log(preview.summary);
  }

  const result = await consolidationService.consolidateMemories(primaryId, others);
  console.log(`Merged ${result.consolidated} memories`);
}
```

The service is responsible for safety checks and atomic updates. Exact helper methods (e.g. eligibility validation, preview, analytics) depend on the concrete `ConsolidationService` implementation; use the methods available on your version rather than assuming undocumented ones.
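
Since the helper surface varies between versions, one defensive pattern is to probe optional methods with optional chaining. The `ConsolidationServiceLike` shape and `safeConsolidate` helper below are hypothetical stand-ins for illustration, not real memorits types:

```typescript
// Hypothetical minimal shape -- the real ConsolidationService type may differ;
// only consolidateMemories is assumed mandatory here.
interface ConsolidationServiceLike {
  consolidateMemories(primaryId: string, duplicateIds: string[]): Promise<{ consolidated: number }>;
  validateConsolidationEligibility?(
    primaryId: string,
    duplicateIds: string[],
  ): Promise<{ isValid: boolean; reasons?: string[] }>;
  previewConsolidation?(primaryId: string, duplicateIds: string[]): Promise<{ summary: string }>;
}

async function safeConsolidate(
  service: ConsolidationServiceLike,
  primaryId: string,
  others: string[],
): Promise<{ consolidated: number; skipped: boolean }> {
  // Optional chaining yields undefined when a helper is missing,
  // so older and newer service versions degrade gracefully.
  const validation = await service.validateConsolidationEligibility?.(primaryId, others);
  if (validation && !validation.isValid) {
    return { consolidated: 0, skipped: true };
  }
  const preview = await service.previewConsolidation?.(primaryId, others);
  if (preview) console.log(preview.summary);

  const result = await service.consolidateMemories(primaryId, others);
  return { ...result, skipped: false };
}
```

Optional chaining turns each missing helper into a no-op instead of a runtime `TypeError`, so the same wrapper works across service versions.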

## Scheduling

`DatabaseManager` can run consolidation on a schedule. These controls are advanced and should be treated as internal; the stable entrypoint remains `memori.getConsolidationService()`.

```typescript
import { Memori } from 'memorits';

const memori = new Memori({ databaseUrl: 'file:./memori.db' });
await memori.enable();

const consolidationService = memori.getConsolidationService();

// For most applications, call consolidationService methods explicitly from your own scheduler.
// Example (simplified):

async function runScheduledConsolidation() {
  const duplicates = await consolidationService.detectDuplicateMemories('...', 0.8);
  // apply your own selection/validation/merge policy here using the public service API
}
```

If you reach into `DatabaseManager` to use `startConsolidationScheduling` or related methods, treat that as unstable, internal-only usage: APIs and access patterns may change between versions.
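
If you want periodic consolidation without touching internals, a plain timer in your own code stays on the public surface. This generic loop assumes nothing about memorits beyond the async function you pass in:

```typescript
// Drive consolidation from your own timer; `run` is any async job,
// e.g. the runScheduledConsolidation function sketched earlier.
function startConsolidationLoop(run: () => Promise<void>, intervalMinutes: number): () => void {
  let inFlight = false;
  const timer = setInterval(async () => {
    if (inFlight) return; // skip overlapping runs
    inFlight = true;
    try {
      await run();
    } finally {
      inFlight = false;
    }
  }, intervalMinutes * 60_000);
  return () => clearInterval(timer); // call the returned function to stop the loop
}
```

The `inFlight` flag prevents overlapping runs when a consolidation pass takes longer than the interval.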

## Metrics & Analytics

If your `ConsolidationService` implementation exposes analytics helpers
(e.g. `getConsolidationAnalytics`, `getConsolidationHistory`), you can use them to
inspect duplicate density, success rates, and historical operations:

```typescript
if (consolidationService.getConsolidationAnalytics) {
  const analytics = await consolidationService.getConsolidationAnalytics();
  console.log(`Success rate: ${analytics.successRate}%`);
}

if (consolidationService.getConsolidationHistory) {
  const history = await consolidationService.getConsolidationHistory({ limit: 50 });
  console.log(`Recent consolidations: ${history.length}`);
}
```

Always consult the concrete `ConsolidationService` type in your version of `memorits` to see which analytics methods are available.
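
When history helpers are present, a small aggregation turns raw entries into the duplicate-density view mentioned above. The `HistoryEntry` fields here are assumptions for illustration; map them to whatever your `ConsolidationService` actually returns:

```typescript
// Hypothetical history-entry shape; real field names depend on your service version.
interface HistoryEntry { namespace: string; durationMs: number; success: boolean }

function summarizeHistory(entries: HistoryEntry[]) {
  const byNamespace = new Map<string, number>();
  let successes = 0;
  let totalDuration = 0;
  for (const e of entries) {
    byNamespace.set(e.namespace, (byNamespace.get(e.namespace) ?? 0) + 1);
    if (e.success) successes++;
    totalDuration += e.durationMs;
  }
  return {
    successRate: entries.length ? (successes / entries.length) * 100 : 0,
    avgDurationMs: entries.length ? totalDuration / entries.length : 0,
    byNamespace, // which namespaces are most affected
  };
}
```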

## Extending the Service

---

**File: `docs/developer/advanced-features/duplicate-management.md`**

Memorits includes tooling to surface and manage duplicate memories so your knowledge base stays clean.

## Finding Potential Duplicates

`Memori.findDuplicateMemories` wraps the duplicate detection pipeline. It compares supplied content against existing memories using the internal similarity scoring implemented by the duplicate management components (threshold-based, implementation detail), returning only candidates above the configured similarity threshold.

```typescript
import { Memori } from 'memorits';

const memori = new Memori({ databaseUrl: 'file:./memori.db' });
await memori.enable();

// Call signature shown is illustrative; check your memorits version for the exact options.
const candidates = await memori.findDuplicateMemories('Reminder: invoices go out on the 5th.', 0.75);
console.log(`Found ${candidates.length} potential duplicates`);
```

Behind the scenes `DuplicateManager`:
- Filters results above the requested threshold.
- Emits structured logs (`component: DuplicateManager`) so you can trace detection.
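
The real scoring is an implementation detail, but the threshold-filtering step above can be illustrated with a simple token-overlap (Jaccard) score. This is a sketch of the general technique, not memorits' actual algorithm:

```typescript
// Illustrative Jaccard similarity over word tokens: |A ∩ B| / |A ∪ B|.
function jaccardSimilarity(a: string, b: string): number {
  const tokens = (s: string) => new Set(s.toLowerCase().split(/\W+/).filter(Boolean));
  const ta = tokens(a);
  const tb = tokens(b);
  let overlap = 0;
  for (const t of ta) if (tb.has(t)) overlap++;
  const unionSize = ta.size + tb.size - overlap;
  return unionSize === 0 ? 0 : overlap / unionSize;
}

// Score candidates against a query and keep only those above the threshold.
function filterCandidates(items: { id: string; content: string }[], query: string, threshold: number) {
  return items
    .map(m => ({ ...m, score: jaccardSimilarity(m.content, query) }))
    .filter(m => m.score >= threshold)
    .sort((x, y) => y.score - x.score);
}
```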

## Consolidation via Public API

When you need structured consolidation workflows, use the consolidation service exposed through `Memori` instead of reaching into private agents:

```typescript
import { Memori } from 'memorits';

const memori = new Memori({ databaseUrl: 'file:./memori.db' });
await memori.enable();

const consolidationService = memori.getConsolidationService();

const duplicates = await consolidationService.detectDuplicateMemories(
  'Reminder: invoices go out on the 5th.',
  0.75
);

if (duplicates.length > 1) {
  const primaryId = duplicates[0].id;
  const others = duplicates.slice(1).map(d => d.id);

  const validation = await consolidationService.validateConsolidationEligibility(primaryId, others);
  if (validation.isValid) {
    const preview = await consolidationService.previewConsolidation(primaryId, others);
    console.log(preview.summary);

    const result = await consolidationService.consolidateMemories(primaryId, others);
    console.log(`Merged ${result.consolidated} memories`);
  }
}
```

The consolidation run is responsible for safety checks, atomic updates, and rollback support (see concrete methods on the `ConsolidationService` / `MemoryConsolidationService` implementations).

## Supporting Utilities (Advanced/Internal)

Internal components such as `DuplicateManager` and repository helpers expose lower-level utilities (e.g. candidate detection, consolidation validation, history/analytics). These are wired behind `Memori.findDuplicateMemories` and the consolidation service.

If you choose to import them directly from internal paths (e.g. `src/core/infrastructure/database/DuplicateManager.ts`), treat this as unstable advanced usage:
- APIs and paths may change without notice.
- Prefer `Memori.findDuplicateMemories` and `memori.getConsolidationService()` for stable integration.

## Practical Workflow

1. Use `Memori.findDuplicateMemories` when ingesting new information to alert users about potential duplicates.
2. Use `memori.getConsolidationService()` for controlled consolidation flows (detect → validate → preview → consolidate → rollback if needed).
3. Log or review similarity scores and previews before deleting or merging content.
4. If you must call internal managers directly, isolate that logic in your own adapter and treat it as non-stable API surface.

Duplicate detection is designed to be conservative: it surfaces likely matches without deleting anything automatically. This keeps the system safe by default while giving you the hooks to build richer moderation or review workflows.
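
Building on that conservative default, a review-first policy can be a small triage step. The `Candidate` shape and thresholds below are illustrative assumptions, not memorits API:

```typescript
// Never delete automatically: split matches into flag/review/ignore buckets.
interface Candidate { id: string; score: number }

function triage(candidates: Candidate[], reviewThreshold: number, autoFlagThreshold: number) {
  return {
    autoFlag: candidates.filter(c => c.score >= autoFlagThreshold),                          // near-certain duplicates
    review: candidates.filter(c => c.score >= reviewThreshold && c.score < autoFlagThreshold), // human review queue
    ignore: candidates.filter(c => c.score < reviewThreshold),                               // below interest
  };
}
```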