
Prevent integration event and history sync from overwhelming local machines #82

@kjgbot


Problem

When a workspace has large integrations or high-volume webhook traffic, Pear can slow down the user's machine. Recent debugging showed several contributing patterns:

  • Local historical Relayfile mounts can contain a lot of files and data.
  • Polling or recursively watching mounted integration trees can produce a large amount of filesystem work.
  • Incoming webhook events can arrive in bursts and currently risk turning into unbounded broker injections.
  • Historical data, incoming webhook events, local mounts, and writeback are still too coupled in the runtime model.
  • Logging every integration event during a burst can add noticeable overhead.

The desired product behavior is:

  • The source/remote Relayfile instance may continue syncing and retaining historical integration data.
  • Pear, as a consumer, should not download or mount historical data unless the user explicitly enables historical download.
  • Incoming webhook data should still sync remotely, trigger Relayfile events, and notify selected agents without requiring historical local mounts.
  • Writeback should remain available without requiring a full local mirror.

Current Direction

Recent PR work made historical local download/mount opt-in in Pear, but more architectural hardening is needed before this is safe for high-volume users.

The main fix is not a language rewrite. Rust or Swift may help later for a constrained sidecar, but the first-order issue is unbounded dataflow: polling, recursive watching, broad subscriptions, and event fanout.

Proposed Architecture

1. Event-only default path

Default integration flow should be:

  1. Provider webhook arrives in cloud.
  2. Cloud/source Relayfile writes the incoming webhook data and emits a scoped event.
  3. Pear subscribes to selected Relayfile event paths via the SDK.
  4. Pear injects a compact <integration-event> into selected agents.
  5. Agents read additional context on demand: via mounted history when historical download is enabled, otherwise via future targeted SDK/API reads.

Pear should not need to mount .integrations/ for event delivery.
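
As a rough sketch of step 4, the compact event could be rendered as below. The field names, the attribute names, and the self-closing tag layout are assumptions for illustration only; the issue does not define the actual `<integration-event>` format.

```typescript
// Hypothetical event shape; field names are illustrative assumptions.
interface IntegrationEvent {
  provider: string;      // e.g. "slack"
  resourcePath: string;  // path of the changed resource in Relayfile
  resourceId?: string;
  title?: string;
  status?: string;
  actor?: string;
}

// Escape characters that would break an XML-style attribute value.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Render only the fields that are present so the injected message stays small.
function renderIntegrationEvent(event: IntegrationEvent): string {
  const pairs: Array<[string, string | undefined]> = [
    ["provider", event.provider],
    ["path", event.resourcePath],
    ["id", event.resourceId],
    ["title", event.title],
    ["status", event.status],
    ["actor", event.actor],
  ];
  const attrs = pairs
    .filter(([, value]) => value !== undefined && value !== "")
    .map(([key, value]) => `${key}="${escapeAttr(value as string)}"`)
    .join(" ");
  return `<integration-event ${attrs} />`;
}
```

The point of the sketch is that the message carries a pointer to the resource rather than its contents, so delivery does not depend on a local mount.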

2. Historical download is a consumer cache

Historical download/mount should be treated as an optional local cache:

  • Default off.
  • Explicit user toggle per project/integration/resource.
  • Bounded by max files, max bytes, max depth, or selected resources where possible.
  • Never required for incoming webhook notification.
  • Never automatically enabled just because an integration is connected.
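
One way to encode the "default off, never auto-enabled" rule is sketched below. The settings shape and function names are hypothetical and not taken from Pear's code.

```typescript
// Hypothetical settings shape; names and fields are assumptions for illustration.
interface HistoricalDownloadSettings {
  enabled: boolean;
  maxFiles?: number;
  maxBytes?: number;
  maxDepth?: number;
  resources?: string[]; // explicitly selected resource paths
}

// Default is off.
const DEFAULT_HISTORICAL_DOWNLOAD: HistoricalDownloadSettings = { enabled: false };

// Resolve the effective settings for one integration. Only an explicit user
// choice can turn the cache on; the connection state is deliberately ignored.
function resolveHistoricalDownload(
  userChoice: HistoricalDownloadSettings | undefined,
  _integrationConnected: boolean,
): HistoricalDownloadSettings {
  return userChoice ? { ...userChoice } : { ...DEFAULT_HISTORICAL_DOWNLOAD };
}

// Event delivery never consults this; only the mount path does.
function shouldMountHistory(settings: HistoricalDownloadSettings): boolean {
  return settings.enabled;
}
```

Passing the connection state in and ignoring it is intentional here: it makes the "connected does not imply mounted" rule visible at the call site.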

3. Writeback should use a small command surface

Writeback should not require recursively watching a full historical tree. Prefer one of:

  • SDK writeback API from Pear/agents.
  • A tiny .integrations/.outbox or provider-specific command area.
  • Targeted file paths only for known writeback-enabled command files.

Avoid recursive watchers on provider roots or history mirrors.

Implementation Tracks

Track A: Remove broad local recursive watchers

  • Audit src/main/integration-event-bridge.ts local mount watching.
  • Do not call watch(localRoot, { recursive: true }) for full integration history trees.
  • Only enable local watching when historical download is on and only for a bounded writeback/outbox command surface.
  • Ensure webhook-only mode uses remote Relayfile event stream only.

Acceptance criteria:

  • Starting Pear with connected integrations and historical download off creates no recursive watchers for integration data.
  • Incoming Slack/GitHub/Linear/Notion webhook events still notify subscribed agents.
  • Writeback still works through the intended command/API path.
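
A minimal sketch of the watcher policy described in Track A, written as a pure planning function so it can be checked without touching the filesystem. The input shape and the function name are assumptions; the outbox location is passed in rather than hard-coded because the issue leaves it open.

```typescript
// Hypothetical inputs; names are assumptions for illustration.
interface WatchPlanInput {
  localRoot: string;           // local mount root for the project
  historicalDownload: boolean; // the user's explicit opt-in
  outboxRelativePath: string;  // e.g. ".integrations/.outbox"
}

interface WatchTarget {
  path: string;
  recursive: false; // the type itself rules out recursive watchers
}

// Decide which local paths, if any, may be watched.
function planLocalWatchers(input: WatchPlanInput): WatchTarget[] {
  if (!input.historicalDownload) {
    // Webhook-only mode: rely on the remote event stream, watch nothing locally.
    return [];
  }
  const root = input.localRoot.replace(/\/+$/, "");
  const outbox = input.outboxRelativePath.replace(/^\/+/, "");
  // Only the bounded command surface is watched, never the history tree.
  return [{ path: `${root}/${outbox}`, recursive: false }];
}
```

Each returned target could then be handed to Node's `fs.watch(target.path, { recursive: false }, listener)`, which keeps the watcher count at one directory per project instead of one per subtree.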

Track B: Add bounded event queue and backpressure

Add a per-project integration event dispatcher before broker injection:

  • Max queued events per project.
  • Max injected events per second per project.
  • Coalesce by provider + resource path, and for Slack by channel/thread/message path.
  • Drop or compact duplicate/low-value events during bursts.
  • Emit a summary event when events are compacted, e.g. 12 Slack messages changed in #proj-cloud.
  • Avoid Promise.all fanout over large recipient lists without limits.

Acceptance criteria:

  • A burst of 1,000 file events does not create 1,000 immediate broker messages.
  • CPU usage stays bounded during bursts instead of scaling with the raw event count.
  • Agents receive useful summaries rather than a flood.
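
The dispatcher could look roughly like the sketch below: a bounded map keyed by provider + resource path, with a `flush` that releases a capped number of entries per tick. The class name, message wording, and the idea of calling `flush` from a once-per-second timer are assumptions, not an existing Pear API.

```typescript
// Hypothetical queue entry; field names are assumptions for illustration.
interface QueuedEvent {
  provider: string;
  resourcePath: string;
  count: number; // how many raw events were folded into this entry
}

class BoundedEventQueue {
  private readonly entries = new Map<string, QueuedEvent>();
  private readonly maxQueued: number;
  private readonly maxPerFlush: number;
  dropped = 0;

  constructor(maxQueued: number, maxPerFlush: number) {
    this.maxQueued = maxQueued;
    this.maxPerFlush = maxPerFlush;
  }

  // Coalesce by provider + resource path; drop new keys once the queue is full.
  enqueue(provider: string, resourcePath: string): void {
    const key = `${provider}:${resourcePath}`;
    const existing = this.entries.get(key);
    if (existing) {
      existing.count += 1;
      return;
    }
    if (this.entries.size >= this.maxQueued) {
      this.dropped += 1;
      return;
    }
    this.entries.set(key, { provider, resourcePath, count: 1 });
  }

  // Release at most maxPerFlush entries, oldest first. Each entry becomes one
  // broker message; coalesced entries are reported as a summary.
  flush(): string[] {
    const keys = Array.from(this.entries.keys()).slice(0, this.maxPerFlush);
    const messages: string[] = [];
    for (const key of keys) {
      const entry = this.entries.get(key);
      if (!entry) continue;
      this.entries.delete(key);
      messages.push(
        entry.count === 1
          ? `${entry.provider} change at ${entry.resourcePath}`
          : `${entry.count} ${entry.provider} changes at ${entry.resourcePath}`,
      );
    }
    return messages;
  }

  get depth(): number {
    return this.entries.size;
  }
}
```

With this shape, 1,000 events on the same resource collapse into a single queued entry, and the number of broker messages per tick is capped by `maxPerFlush` regardless of how many events arrived.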

Track C: Server-side scoped subscriptions

Make sure Relayfile/cloud only sends relevant changes to Pear:

  • Subscribe to exact selected resources, not provider roots.
  • Avoid replaying historical changes as new events when Pear starts, unless explicitly requested.
  • Support from=now or equivalent resume semantics for default subscriptions.
  • Ensure event stream reconnect does not replay large historical backlogs by default.

Acceptance criteria:

  • Restarting Pear after a large integration sync does not inject old historical records as fresh events.
  • Event subscriptions only receive paths configured in the integration listener UI.
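
The subscription semantics above could be sketched as follows. The request shape is hypothetical: whether `@relayfile/sdk` supports `from=now` or a cursor is an open question in this issue, so nothing here should be read as the SDK's actual API. The matcher is a defensive client-side check that mirrors what the server is expected to enforce.

```typescript
// Hypothetical request shape; @relayfile/sdk may expose something different.
interface SubscriptionRequest {
  paths: string[];
  from: "now" | { cursor: string };
}

// Strip leading and trailing slashes so equivalent paths compare equal.
function normalizePath(path: string): string {
  return path.replace(/^\/+|\/+$/g, "");
}

// Build a subscription for exactly the selected resources. Without an explicit
// cursor the stream starts at "now", so startup does not replay history.
function buildSubscription(selectedPaths: string[], resumeCursor?: string): SubscriptionRequest {
  const paths = Array.from(new Set(selectedPaths.map(normalizePath))).filter((p) => p !== "");
  return { paths, from: resumeCursor ? { cursor: resumeCursor } : "now" };
}

// Accept an event only if its path is a selected resource or sits beneath one,
// matching on whole path segments.
function matchesSelection(eventPath: string, request: SubscriptionRequest): boolean {
  const target = normalizePath(eventPath);
  return request.paths.some((p) => target === p || target.startsWith(`${p}/`));
}
```

Filtering out empty paths also means a bare `/` can never turn into a subscription on everything.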

Track D: Lazy context reads

Do not require a local mirror for agents to fetch context:

  • Provide a targeted read path for event context, either via Relayfile SDK/API or a small Pear IPC bridge.
  • Event payload should include provider, resource path, resource id, title/status/actor when available.
  • Agents can request a specific file/resource by path without downloading the whole integration tree.

Acceptance criteria:

  • An agent can handle a Slack message event and fetch the specific message/thread context without historical download enabled.
  • Fetching context does not mount the provider root locally.
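
A targeted read path might be shaped like the sketch below. `ResourceReader` is a hypothetical interface; a real implementation might wrap the Relayfile SDK/API or a small Pear IPC bridge, neither of which is shown. The in-memory class only stands in for the remote store.

```typescript
// Hypothetical reader interface, not an existing SDK type.
interface ResourceReader {
  read(path: string): Promise<string | undefined>;
}

// In-memory stand-in for the remote store, for illustration only.
class InMemoryReader implements ResourceReader {
  readonly requestedPaths: string[] = [];
  private readonly store: Map<string, string>;

  constructor(store: Map<string, string>) {
    this.store = store;
  }

  async read(path: string): Promise<string | undefined> {
    this.requestedPaths.push(path); // record what was fetched
    return this.store.get(path);
  }
}

// Fetch context for one event by its resource path. Nothing is listed,
// mirrored, or mounted: exactly one targeted read is issued.
function fetchEventContext(
  reader: ResourceReader,
  event: { resourcePath: string },
): Promise<string | undefined> {
  return reader.read(event.resourcePath);
}
```

Recording the requested paths makes the second acceptance criterion easy to assert in a test: a single event should produce a single read of a single path.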

Track E: Recipient and broker fanout optimization

  • Cache project agent recipient lists briefly and invalidate when agents change.
  • Cache explicit integration notification targets.
  • Avoid listing agents for every event during bursts.
  • Rate-limit broker sendMessage calls per project.

Acceptance criteria:

  • Event bursts do not repeatedly call listAgents for each event.
  • Broker message sends are bounded and observable.
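
Two small building blocks for Track E are sketched below: a short-lived recipient cache with explicit invalidation, and a token bucket for limiting sends. Both class names are hypothetical, the loader stands in for whatever `listAgents` actually returns, and the clock is injected so the behavior can be tested deterministically.

```typescript
// Caches the result of an expensive lookup (e.g. a listAgents call) briefly.
class RecipientCache {
  private cached: { value: Promise<string[]>; expiresAt: number } | undefined;
  private readonly loader: () => Promise<string[]>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(loader: () => Promise<string[]>, ttlMs: number, now: () => number = () => Date.now()) {
    this.loader = loader;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Concurrent callers share one in-flight lookup; failures are not cached.
  get(): Promise<string[]> {
    const t = this.now();
    if (!this.cached || t >= this.cached.expiresAt) {
      const value = this.loader();
      this.cached = { value, expiresAt: t + this.ttlMs };
      value.catch(() => {
        if (this.cached?.value === value) this.cached = undefined;
      });
    }
    return this.cached.value;
  }

  // Call when agents are added, removed, or retargeted.
  invalidate(): void {
    this.cached = undefined;
  }
}

// Token bucket: allows short bursts up to `capacity`, then `refillPerSecond`.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity;
    this.lastRefill = now();
  }

  // Returns true if a send may proceed now, false if it should wait or be queued.
  tryTake(): boolean {
    const t = this.now();
    const elapsedSeconds = (t - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = t;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

One instance of each per project would keep a burst in one workspace from starving another, though the right TTL and bucket size would need to come from measurement.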

Track F: Logging and telemetry budgets

  • Gate verbose integration event logs behind a debug flag.
  • Aggregate repetitive event stream errors and polling fallback logs.
  • Add basic counters for events received, events injected, events coalesced, events dropped, queue depth, and mount count.

Acceptance criteria:

  • A high-volume event stream does not spam the main-process console.
  • We can diagnose bottlenecks from counters without enabling verbose logs.
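
The counters and log aggregation in Track F could be as small as the sketch below. The class names, the `(xN)` summary format, and the shape of the debug gate are all assumptions for illustration.

```typescript
type CounterName = "received" | "injected" | "coalesced" | "dropped";

// Cheap in-process counters plus two gauges, readable without verbose logs.
class IntegrationCounters {
  private readonly counts: Record<CounterName, number> = {
    received: 0,
    injected: 0,
    coalesced: 0,
    dropped: 0,
  };
  queueDepth = 0; // gauges are set directly by their owners
  mountCount = 0;

  increment(name: CounterName, by: number = 1): void {
    this.counts[name] += by;
  }

  snapshot(): Record<CounterName | "queueDepth" | "mountCount", number> {
    return { ...this.counts, queueDepth: this.queueDepth, mountCount: this.mountCount };
  }
}

// Collapses repeated messages (stream errors, polling fallbacks) into one line each.
class LogAggregator {
  private readonly repeats = new Map<string, number>();

  record(message: string): void {
    this.repeats.set(message, (this.repeats.get(message) ?? 0) + 1);
  }

  // Returns one line per distinct message, in first-seen order, then resets.
  drain(): string[] {
    const lines = Array.from(this.repeats.entries()).map(([message, count]) =>
      count > 1 ? `${message} (x${count})` : message,
    );
    this.repeats.clear();
    return lines;
  }
}

// Verbose per-event logging only happens when the debug flag is on.
function debugLog(enabled: boolean, sink: (line: string) => void, line: string): void {
  if (enabled) sink(line);
}
```

Draining the aggregator on a fixed interval would bound log output to one line per distinct message per interval, however many times it repeats.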

Track G: Mount resource budgets

For historical download/mount when explicitly enabled:

  • Add warnings before mounting large integrations.
  • Enforce or expose limits: max files, max bytes, selected resources only.
  • Prefer narrow resource mounts over provider roots.
  • Avoid mode: poll for high-volume paths if the SDK can support event-driven or demand-loaded reads.

Acceptance criteria:

  • Enabling historical download for a large integration does not silently start an unbounded local sync.
  • UI communicates that historical download may be expensive.
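
A budget check before mounting might look like the sketch below. It assumes that a size estimate (file count and byte count) is available before the mount starts, which the issue does not confirm; the type names, the 80% warning threshold, and the message wording are also illustrative.

```typescript
interface MountBudget {
  maxFiles: number;
  maxBytes: number;
}

// Assumed to come from the remote side before any data is downloaded.
interface MountEstimate {
  files: number;
  bytes: number;
}

type MountDecision =
  | { allowed: true; warning?: string }
  | { allowed: false; reason: string };

// Refuse mounts over budget and warn when an estimate is close to a limit.
function checkMountBudget(
  estimate: MountEstimate,
  budget: MountBudget,
  warnRatio: number = 0.8,
): MountDecision {
  if (estimate.files > budget.maxFiles) {
    return { allowed: false, reason: `estimated ${estimate.files} files exceeds limit of ${budget.maxFiles}` };
  }
  if (estimate.bytes > budget.maxBytes) {
    return { allowed: false, reason: `estimated ${estimate.bytes} bytes exceeds limit of ${budget.maxBytes}` };
  }
  const nearLimit =
    estimate.files >= budget.maxFiles * warnRatio ||
    estimate.bytes >= budget.maxBytes * warnRatio;
  return nearLimit
    ? { allowed: true, warning: "historical download is close to its configured limit" }
    : { allowed: true };
}
```

The `warning` and `reason` strings are what the UI would surface, so the user sees the cost before a sync starts rather than after.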

Open Questions

  • Does @relayfile/sdk support event subscriptions with from=now or a no-backfill cursor? If not, add/track this in Relayfile.
  • Can writeback be exposed directly through SDK/API so Pear does not need local filesystem watching for command files?
  • Can Relayfile expose targeted resource reads suitable for agents without mounting local history?
  • What event compaction semantics should Slack use: per channel, per thread, per message, or time window?

Non-goals

  • Rewriting the whole Electron app in Rust or Swift.
  • Mounting every connected integration by default.
  • Treating historical source sync as disabled. The source remote instance can still retain history; Pear should just avoid downloading it unless requested.

Files To Inspect First

  • src/main/integration-event-bridge.ts
  • src/main/integration-mounts.ts
  • src/main/integrations.ts
  • src/main/relayfile-mount-launcher.ts
  • Relayfile SDK event subscription and mount APIs
