Skip to content

fix(distill): per-day bucketing, UUID7 filenames, content-based timestamps#212

Merged
galexy merged 9 commits into
sageox:mainfrom
galexy:fix/distill-logical-issues-211
Mar 14, 2026
Merged

fix(distill): per-day bucketing, UUID7 filenames, content-based timestamps#212
galexy merged 9 commits into
sageox:mainfrom
galexy:fix/distill-logical-issues-211

Conversation

@galexy
Copy link
Copy Markdown
Contributor

@galexy galexy commented Mar 13, 2026

Summary

Fixes #211ox distill had several logical issues with how observations and discussions are assigned to daily summary files.

Problems fixed:

  • Intra-day overwrites: Running distill twice on the same day would overwrite the daily file, losing earlier content. Now each run produces a new file with a UUID7 suffix (YYYY-MM-DD-{uuid7}.md)
  • Multi-day accumulation: Missing multiple days of distill would dump all observations into a single file for today. Now observations are grouped by their RecordedAt date, producing one LLM call + one file per day
  • Multi-clone collision: Two clones running distill would overwrite each other's daily files. UUID7 filenames eliminate collision
  • Fresh clone re-reads everything: Without a state file, all historical observations would be reprocessed into one giant daily file. Now high-water marks are inferred from existing daily/weekly/monthly files
  • Discussion facts use filesystem mtime: Unreliable across clones since all files get clone timestamp. Now dates are parsed from content footer (created YYYY-MM-DD) or filename prefix
  • Single-period weekly/monthly: Only one summary was produced even if multiple week/month boundaries were crossed. Now each period gets its own summary via enumerateWeeks/enumerateMonths
  • Dry-run saved state: --dry-run would write the state file as a side effect, breaking subsequent runs

Changes

cmd/ox/distill.go

  • New helper functions: inferDailyHighWater, inferWeeklyHighWater, inferMonthlyHighWater — scan existing memory files to set high-water marks on fresh clones
  • New helpers: endOfDay, endOfMonth, isoWeekRange, groupObservationsByDay, readDailyFilesForDateRange, readWeeklyFilesForMonth
  • New types: distillPlan, isoWeek — structured plan with per-period enumeration
  • determineLayers returns distillPlan with explicit week/month lists instead of []string
  • distillDaily loops per-day with UUID7 filenames
  • distillWeekly takes a specific isoWeek, reads only that week's daily files
  • distillMonthly takes a specific month, reads only overlapping weekly files
  • loadDistillStateV2 accepts tcPath parameter; infers high-water marks when no state exists
  • LastDaily uses actual processing time (not end-of-day) to support intra-day re-runs

cmd/ox/distill_discussions.go

  • New type discussionFactEntry with parsed date
  • readPendingDiscussionFacts returns map[string][]discussionFactEntry grouped by date
  • Dates parsed from content footer or filename prefix instead of filesystem mtime

Flow diagram

graph TD
    A[runDistill] --> B[loadDistillStateV2]
    B -->|v2 exists| C[Use existing state]
    B -->|v1 exists| D[Migrate v1 → v2]
    B -->|no state| E[Infer high-water from files]
    
    A --> F[determineLayers]
    F --> G[distillPlan]
    G -->|Daily=true| H[distillDaily]
    G -->|Weeks list| I[distillWeekly per week]
    G -->|Months list| J[distillMonthly per month]
    
    H --> K[groupObservationsByDay]
    K --> L[Per-day loop]
    L --> M[LLM call per day]
    M --> N["Write YYYY-MM-DD-{uuid7}.md"]
    
    I --> O[readDailyFilesForDateRange]
    O --> P[LLM call per week]
    P --> Q[Write YYYY-WXX.md]
    
    J --> R[readWeeklyFilesForMonth]
    R --> S[LLM call per month]
    S --> T[Write YYYY-MM.md]
Loading

Test plan

  • 22 new unit tests covering all helper functions (infer, group, ISO week, date-range reads, state fallback, fact date parsing)
  • Updated existing tests for new distillPlan return type and loadDistillStateV2 signature
  • go vet and go build clean
  • All cmd/ox/ tests pass
  • Manual UAT across 7 scenarios (multi-day catch-up, intra-day runs, fresh clone recovery, multi-clone, weekly/monthly catch-up, discussion dating) — see docs/ai/uat/distill-fix-uat.md
  • UAT discovered and fixed 3 additional bugs (end-of-day state, infer granularity, dry-run state save)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Daily memory files now include unique identifiers for improved organization
    • Plan-driven distillation system for better daily, weekly, and monthly processing
  • Improvements

    • Enhanced date-based grouping and filtering of recorded facts
    • Smarter state detection from existing memory high-water marks
  • Tests

    • Expanded test coverage for distillation workflows and date handling

galexy and others added 3 commits March 13, 2026 09:53
…tamps (sageox#211)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: SageOx <ox@sageox.ai>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: SageOx <ox@sageox.ai>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: SageOx <ox@sageox.ai>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

The PR refactors the distillation system to use a plan-driven architecture with per-day UUID-based memory files, high-water inference from existing files, and grouped discussion facts handling to address multi-run and missed-day issues in daily/weekly/monthly summarization.

Changes

Cohort / File(s) Summary
Plan-driven distillation core
cmd/ox/distill.go
Added distillPlan and isoWeek types; refactored determineLayers to return a plan with Daily, Weeks, Months fields; updated runDistill to use plan-based conditional execution; modified loadDistillStateV2 to accept tcPath and infer high-water marks from memory files; per-day distillation now uses UUID7-suffixed filenames (memory/daily/YYYY-MM-DD-{uuid7}.md).
High-water inference and time utilities
cmd/ox/distill.go
Added inferDailyHighWater, inferWeeklyHighWater, inferMonthlyHighWater helpers; added endOfDay, endOfMonth, isoWeekRange, enumerateWeeks, enumerateMonths time utilities; added regexp patterns (dailyDateRe, weeklyRe, monthlyRe) for parsing distilled filenames and dates.
Filesystem and file reading
cmd/ox/distill.go
Added readDailyFilesForDateRange and readWeeklyFilesForMonth to enumerate memory files within date ranges; added groupObservationsByDay to partition observations by date for per-day processing.
Discussion facts refactoring
cmd/ox/distill_discussions.go
Refactored readPendingDiscussionFacts to return map[string][]discussionFactEntry (grouped by date) instead of separate slices; added parseFactDate helper to extract dates from file footers or YYYY-MM-DD prefixes; added skip optimization when memory/.discussion-facts file exists.
Test updates
cmd/ox/distill_test.go, cmd/ox/distill_discussions_test.go
Expanded tests for plan-driven behavior (TestDetermineLayers with subtests, TestDetermineLayers_MultipleWeeks/Months); added extensive coverage for high-water inference, date parsing, file enumeration, and migration paths; updated test calls to pass tcPath parameter to loadDistillStateV2; refactored readPendingDiscussionFacts tests to validate grouped return structure.

Sequence Diagram(s)

sequenceDiagram
    participant Client as runDistill
    participant StateLoader as loadDistillStateV2
    participant HighWater as High-water Inference
    participant FileSystem as Filesystem
    participant Planner as determineLayers
    participant Distiller as distillDaily/Weekly/Monthly

    Client->>StateLoader: Load state (projectRoot, tcPath)
    StateLoader->>FileSystem: Check if state file exists
    alt State file exists
        FileSystem-->>StateLoader: Return state
    else State file missing
        StateLoader->>HighWater: Infer marks from memory
        HighWater->>FileSystem: Read daily/weekly/monthly files
        FileSystem-->>HighWater: Return file metadata & dates
        HighWater-->>StateLoader: Return inferred high-water times
    end
    StateLoader-->>Client: Return state
    
    Client->>Planner: determineLayers(state, explicit, now)
    Planner-->>Client: Return distillPlan (Daily, Weeks, Months)
    
    alt plan.isEmpty() == false
        Client->>FileSystem: Read pending observations & discussions
        FileSystem-->>Client: Return data since high-water
        
        opt Daily enabled
            Client->>Distiller: distillDaily(plan.Daily data)
            Distiller->>FileSystem: Write memory/daily/YYYY-MM-DD-{uuid7}.md
        end
        
        opt Weeks defined
            Client->>Distiller: distillWeekly(plan.Weeks data)
            Distiller->>FileSystem: Write memory/weekly/YYYY-W{wk}.md
        end
        
        opt Months defined
            Client->>Distiller: distillMonthly(plan.Months data)
            Distiller->>FileSystem: Write memory/monthly/YYYY-MM.md
        end
        
        Client->>StateLoader: Save updated state with new high-water
        StateLoader->>FileSystem: Persist to cache directory
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

sageox

Suggested reviewers

  • rsnodgrass

Poem

🐰 A plan-born distillation so bright,
Each day gets its UUID each night,
No more overwrite fears,
High-water perseveres,
The rabbit hops forward—progress in sight! 🌱

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 36.36% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main changes: per-day bucketing with UUID7 filenames and content-based timestamps for discussion fact dating.
Linked Issues check ✅ Passed The PR implements all key objectives from issue #211: prevents intra-day overwrites via UUID7 files, groups observations by RecordedAt date, avoids cross-clone collisions, infers high-water marks on fresh clones, parses discussion dates from content, enumerates multiple periods, and prevents dry-run state persistence.
Out of Scope Changes check ✅ Passed All changes are directly aligned with fixing the distill logic issues in #211; refactoring of state loading, plan-driven execution, and helper functions are necessary to implement the required fixes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/ox/distill.go (1)

490-493: ⚠️ Potential issue | 🟠 Major

--dry-run still saves state at the end of runDistill.

The PR objective states "Prevents --dry-run from saving state as a side effect," and line 465 correctly guards the post-discussion-extraction save. However, line 491 unconditionally saves state after all distillation steps, including dry-run mode.

🐛 Proposed fix to skip state save during dry-run
 	// save updated state
+	if distillDryRun {
+		return nil
+	}
 	if err := saveDistillStateV2(projectRoot, state); err != nil {
 		slog.Warn("failed to save distill state", "error", err)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/ox/distill.go` around lines 490 - 493, runDistill currently always calls
saveDistillStateV2(projectRoot, state) at the end (unconditional on dry-run);
change it to skip saving when the dry-run flag is set by wrapping that call in
the same dry-run check used earlier (e.g., if dryRun / opts.DryRun / cfg.DryRun
is true then do not call saveDistillStateV2). Ensure you reference the same
dry-run boolean used at the earlier guard around the post-discussion save so the
final save uses the identical condition and no state is persisted during
--dry-run.
🧹 Nitpick comments (2)
cmd/ox/distill.go (2)

530-557: Week enumeration logic is correct but dense.

The loop correctly identifies completed ISO weeks between lastTime and now. The 13-week (91-day) max lookback for fresh states prevents unbounded processing.

Consider adding a brief inline comment explaining that weekEnd.After(cursor) && !weekEnd.After(now) selects only completed weeks where the Sunday has passed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/ox/distill.go` around lines 530 - 557, The enumerateWeeks function is
correct but a bit dense; add a short inline comment near the conditional that
explains the intent of the check `weekEnd.After(cursor) && !weekEnd.After(now)`
so future readers understand we only include ISO weeks whose Sunday (weekEnd
from isoWeekRange) has passed relative to cursor and is not after now; place
this comment next to the conditional inside the for loop (in the enumerateWeeks
function, referencing cursor, weekEnd and isoWeekRange) and keep it concise.

761-761: DailyCount only tracks observations, not discussion facts.

Line 761 increments state.DailyCount by len(dayObs) but excludes len(dayFacts). If this counter is meant to track total items distilled, facts should also be counted. If it's intentionally observation-only (for backward compatibility with v1's ObservationCount), consider adding a comment to clarify.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/ox/distill.go` at line 761, state.DailyCount is only incremented by
len(dayObs) but excludes len(dayFacts); decide intent and fix accordingly: if
DailyCount should represent total distilled items, update the increment at the
state.DailyCount line to add both len(dayObs) and len(dayFacts) (i.e.,
state.DailyCount += len(dayObs) + len(dayFacts)); if it is intentionally
observation-only for compatibility, add a clear comment next to the
state.DailyCount update documenting that it tracks only observations (and
mention v1 ObservationCount) so future readers aren’t confused; refer to the
variables state.DailyCount, dayObs and dayFacts to locate the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/ai/uat/distill-fix-uat.md`:
- Line 28: The docs claim `last_daily` is "end of latest day" but the code
(distill.go, symbol state.LastDaily) sets it to the current processing time via
now.Format(time.RFC3339); update the documentation (distill-state-v2.json
example and the prose) to state that last_daily is the actual processing
timestamp in RFC3339 format (or alternatively change the code to set
state.LastDaily to the end-of-day timestamp if you prefer code behavior), and
mention the exact format (RFC3339) so the doc matches the implementation.

---

Outside diff comments:
In `@cmd/ox/distill.go`:
- Around line 490-493: runDistill currently always calls
saveDistillStateV2(projectRoot, state) at the end (unconditional on dry-run);
change it to skip saving when the dry-run flag is set by wrapping that call in
the same dry-run check used earlier (e.g., if dryRun / opts.DryRun / cfg.DryRun
is true then do not call saveDistillStateV2). Ensure you reference the same
dry-run boolean used at the earlier guard around the post-discussion save so the
final save uses the identical condition and no state is persisted during
--dry-run.

---

Nitpick comments:
In `@cmd/ox/distill.go`:
- Around line 530-557: The enumerateWeeks function is correct but a bit dense;
add a short inline comment near the conditional that explains the intent of the
check `weekEnd.After(cursor) && !weekEnd.After(now)` so future readers
understand we only include ISO weeks whose Sunday (weekEnd from isoWeekRange)
has passed relative to cursor and is not after now; place this comment next to
the conditional inside the for loop (in the enumerateWeeks function, referencing
cursor, weekEnd and isoWeekRange) and keep it concise.
- Line 761: state.DailyCount is only incremented by len(dayObs) but excludes
len(dayFacts); decide intent and fix accordingly: if DailyCount should represent
total distilled items, update the increment at the state.DailyCount line to add
both len(dayObs) and len(dayFacts) (i.e., state.DailyCount += len(dayObs) +
len(dayFacts)); if it is intentionally observation-only for compatibility, add a
clear comment next to the state.DailyCount update documenting that it tracks
only observations (and mention v1 ObservationCount) so future readers aren’t
confused; refer to the variables state.DailyCount, dayObs and dayFacts to locate
the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a759030d-0d77-49d9-b48e-c8aefea1828d

📥 Commits

Reviewing files that changed from the base of the PR and between f8ad308 and d841039.

📒 Files selected for processing (5)
  • cmd/ox/distill.go
  • cmd/ox/distill_discussions.go
  • cmd/ox/distill_discussions_test.go
  • cmd/ox/distill_test.go
  • docs/ai/uat/distill-fix-uat.md

Comment thread docs/ai/uat/distill-fix-uat.md Outdated
…ount, docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: SageOx <ox@sageox.ai>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
cmd/ox/distill.go (1)

58-60: inferDailyHighWater comment does not match the return value.

The comment says end-of-day, but the function currently returns start-of-day (time.Parse("2006-01-02", ...)). Please align the comment with behavior to avoid future regressions.

Also applies to: 87-91

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/ox/distill.go` around lines 58 - 60, The docstring for
inferDailyHighWater is inaccurate: the function returns the start-of-day (parsed
via time.Parse("2006-01-02", ...)) but the comment says end-of-day; update the
comment to state that inferDailyHighWater returns the start-of-day (midnight)
UTC for the latest YYYY-MM-DD prefix, or zero time if no files, and make the
equivalent wording change in the similar comment block around the other
high-water inference (the block covering lines 87-91 / the other infer*HighWater
function).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/ox/distill.go`:
- Around line 517-521: The monthly-rollover check uses a 30-day duration and
calls enumerateMonths from lastMonthly which can both miss month transitions and
re-include the already-processed month; change the logic to compare calendar
month/year (e.g., now.Year()/now.Month() vs
lastMonthly.Year()/lastMonthly.Month()) instead of a 30*24*time.Hour threshold,
and call enumerateMonths starting at the first day of the next month (e.g.,
lastMonthly.AddDate(0,1,0) or adjust enumerateMonths to exclude its start) so
you only generate truly new months; apply the same change to the other identical
block around enumerateMonths (lines ~564-585) and keep references to
lastMonthlyTime(), enumerateMonths, and the explicit variable in the updated
logic.
- Around line 194-199: In groupObservationsByDay, skip any observation whose
RecordedAt is the zero (sentinel) time before bucketing to avoid grouping into
"0001-01-01"; add a guard in the loop checking obs.RecordedAt.IsZero() (or
equivalent) and continue to the next observation so only valid RecordedAt values
are formatted and appended to the groups map.

In `@docs/ai/uat/distill-fix-uat.md`:
- Around line 58-61: The test expectation is stricter than current behavior
because inferDailyHighWater (in cmd/ox/distill.go) seeds the high-water mark to
start-of-day which lets pre-existing same-day observations be re-processed;
change inferDailyHighWater so it seeds the high-water mark to exclude
already-present same-day data (e.g., set the initial high-water to the end of
the previous day or to the
start-of-next-day/just-after-the-last-existing-timestamp) so recovery runs only
surface truly new observations for today; update the logic in
inferDailyHighWater and any callers that rely on its returned boundary to ensure
comparisons use a strict greater-than (>) check for new observations.

---

Nitpick comments:
In `@cmd/ox/distill.go`:
- Around line 58-60: The docstring for inferDailyHighWater is inaccurate: the
function returns the start-of-day (parsed via time.Parse("2006-01-02", ...)) but
the comment says end-of-day; update the comment to state that
inferDailyHighWater returns the start-of-day (midnight) UTC for the latest
YYYY-MM-DD prefix, or zero time if no files, and make the equivalent wording
change in the similar comment block around the other high-water inference (the
block covering lines 87-91 / the other infer*HighWater function).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c25652e3-c265-4006-94fa-4e1fe302aac5

📥 Commits

Reviewing files that changed from the base of the PR and between d841039 and 0082522.

📒 Files selected for processing (2)
  • cmd/ox/distill.go
  • docs/ai/uat/distill-fix-uat.md

Comment thread cmd/ox/distill.go
Comment thread cmd/ox/distill.go
Comment thread docs/ai/uat/distill-fix-uat.md Outdated
…erate fix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: SageOx <ox@sageox.ai>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/ox/distill.go (1)

339-340: ⚠️ Potential issue | 🟡 Minor

Reject invalid --layer values explicitly instead of silent no-op.

A typo in --layer currently falls through to plan.isEmpty() and prints “Nothing to distill”, which is misleading for CLI users.

Proposed fix
@@
 	now := time.Now().UTC()
+
+	switch distillLayer {
+	case "", "daily", "weekly", "monthly":
+	default:
+		return fmt.Errorf("invalid --layer %q (expected: daily, weekly, monthly)", distillLayer)
+	}
 
 	// determine which layers to run
 	plan := determineLayers(state, distillLayer, now)

Also applies to: 505-530

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/ox/distill.go` around lines 339 - 340, Validate the --layer flag early
instead of letting invalid values fall through to plan.isEmpty(): after flags
are parsed (where distillCmd and distillLayer are defined), check if
distillLayer is non-empty and is one of the allowed values
("daily","weekly","monthly"); if not, return/exit with a clear error message
stating the invalid value and the allowed options. Apply the same validation in
the other distill code path that also reads distillLayer (the second distill
handler around the other diff block), and avoid treating an invalid layer as
"Nothing to distill" by performing this check before creating or evaluating
plan.isEmpty().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/ox/distill.go`:
- Around line 475-490: The loop currently returns on first failure and only
saves state once at the end, causing already-completed periods to be
reprocessed; after each successful call to distillDaily, distillWeekly, and
distillMonthly (inside the plan.Daily block and both plan.Weeks and plan.Months
loops) persist the progress by invoking the existing state-save/commit helper
used later in this file (the same logic currently executed at the end of
runDistill) and handle/return any save errors wrapped similarly, so each
completed period is recorded immediately and a later failure won’t cause
re-distill of finished periods.
- Around line 721-774: The loop currently writes and commits each day's memory
(writeMemoryFile, commitMemoryFile) but only updates state.LastDaily after the
whole loop, which causes retries to reprocess already-successful days; fix by
updating the checkpoint per successful day: after a successful write+commit for
a given day, set state.DailyCount += len(dayObs)+len(dayFacts), set
state.LastDaily = now.Format(time.RFC3339) (or a per-day timestamp you prefer)
and state.LastDailyHash = "" and then persist the state immediately using the
repository's existing state-save routine (the same persistence used elsewhere in
the codebase) so that retries will skip already-distilled days.

---

Outside diff comments:
In `@cmd/ox/distill.go`:
- Around line 339-340: Validate the --layer flag early instead of letting
invalid values fall through to plan.isEmpty(): after flags are parsed (where
distillCmd and distillLayer are defined), check if distillLayer is non-empty and
is one of the allowed values ("daily","weekly","monthly"); if not, return/exit
with a clear error message stating the invalid value and the allowed options.
Apply the same validation in the other distill code path that also reads
distillLayer (the second distill handler around the other diff block), and avoid
treating an invalid layer as "Nothing to distill" by performing this check
before creating or evaluating plan.isEmpty().

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0c9b30c0-7579-49f3-a1c1-325fc05f7dc1

📥 Commits

Reviewing files that changed from the base of the PR and between 0082522 and d8a8c78.

📒 Files selected for processing (2)
  • cmd/ox/distill.go
  • docs/ai/uat/distill-fix-uat.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/ai/uat/distill-fix-uat.md

Comment thread cmd/ox/distill.go
Comment thread cmd/ox/distill.go
…act files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: SageOx <ox@sageox.ai>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
cmd/ox/distill.go (1)

57-59: Align inferDailyHighWater docs with actual behavior.

The comment says it returns end-of-day, but the implementation intentionally returns start-of-day. Keeping the contract precise avoids incorrect reuse by future callers.

Proposed fix
-// inferDailyHighWater scans memory/daily/ for the latest YYYY-MM-DD prefix.
-// Returns end-of-day UTC for that date, or zero time if no files.
+// inferDailyHighWater scans memory/daily/ for the latest YYYY-MM-DD prefix.
+// Returns start-of-day UTC for that date, or zero time if no files.

Also applies to: 84-90

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/ox/distill.go` around lines 57 - 59, The function comment for
inferDailyHighWater is inaccurate: it states the function returns the end-of-day
UTC but the implementation returns the start-of-day; update the docstring to
state it returns the start-of-day UTC for the latest YYYY-MM-DD prefix (or
alternatively change inferDailyHighWater to compute end-of-day if that was
intended) so the contract matches the actual behavior; reference the
inferDailyHighWater function and ensure the same wording is corrected for the
related comment block around lines 84-90.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/ox/distill_discussions.go`:
- Around line 243-249: The footer date match returned by factFooterDateRe
(inside parseFactDate) must be validated and rejected if it is a
sentinel/invalid date (e.g. "0001-01-01") before returning it; update
parseFactDate to parse the matched string (from factFooterDateRe) into a date
and only return it when it represents a valid/non-sentinel day (for example year
>= 2 or after a sensible epoch), otherwise continue to fallback logic
(factFilenameDateRe) or return empty; ensure you reference the
factFooterDateRe/factFilenameDateRe match handling so the invalid footer
timestamp is not used for bucketing.

In `@cmd/ox/distill.go`:
- Around line 246-249: Replace the silent-continue on read errors with a
fail-fast error return so unreadable distilled inputs don't get dropped: locate
the os.ReadFile call that assigns data, err (the line using
filepath.Join(dailyDir, f.name)) and the analogous block around the 306-308
region, and change the error branch to return/wrap the error (e.g. return
fmt.Errorf("reading distilled input %s: %w", filepath.Join(dailyDir, f.name),
err)) or propagate it up through the enclosing function instead of using
continue; ensure any callers handle the returned error appropriately.

---

Nitpick comments:
In `@cmd/ox/distill.go`:
- Around line 57-59: The function comment for inferDailyHighWater is inaccurate:
it states the function returns the end-of-day UTC but the implementation returns
the start-of-day; update the docstring to state it returns the start-of-day UTC
for the latest YYYY-MM-DD prefix (or alternatively change inferDailyHighWater to
compute end-of-day if that was intended) so the contract matches the actual
behavior; reference the inferDailyHighWater function and ensure the same wording
is corrected for the related comment block around lines 84-90.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 14073526-a176-4187-b7b8-ee0a206415a2

📥 Commits

Reviewing files that changed from the base of the PR and between d8a8c78 and 3b3a04e.

📒 Files selected for processing (2)
  • cmd/ox/distill.go
  • cmd/ox/distill_discussions.go

Comment thread cmd/ox/distill_discussions.go
Comment thread cmd/ox/distill.go
galexy and others added 3 commits March 13, 2026 18:53
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: SageOx <ox@sageox.ai>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: SageOx <ox@sageox.ai>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@galexy galexy merged commit f71160f into sageox:main Mar 14, 2026
1 check passed
@galexy galexy deleted the fix/distill-logical-issues-211 branch March 14, 2026 02:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: ox distill has logical issues with determining which observations and discussion to include in daily summaries

1 participant