Storing AI transcripts on a git branch is architecturally problematic for privacy

## The problem

Entire stores full AI session transcripts (prompts, responses, tool calls, file paths, commands) on the `entire/checkpoints/v1` branch within the same git repository, which means transcripts inherit the repo's access model.

For **open source projects**, this is especially concerning because the repo is public, so the transcripts are too. Anyone who fetches the branch can read the complete AI conversation history, including the developer's reasoning, mistakes, internal context, and potentially sensitive information that slipped past redaction.

Even for **private repos**, transcripts become visible to every collaborator with read access, and the trust boundary for "who can see the code" is not the same as "who should see the raw AI session history."

## Why a flag doesn't solve this

`--skip-push-sessions` exists, but it is opt-in rather than the default, so sessions are pushed to the remote on every `git push` unless you explicitly pass it. Even with the flag, the transcripts are still committed to a local branch that can be inadvertently pushed, forked, or included in mirrors. The fundamental issue is that coupling transcript storage to the git repo means the transcripts will always travel with the code.
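To make the "inadvertently pushed" risk concrete, here is a minimal sketch of how one might check whether the checkpoint branch has already reached a remote. It only relies on `git ls-remote --heads`; the helper names `remote_has_branch` and `list_remote_heads` are hypothetical and not part of Entire.

```python
import subprocess

# Branch name taken from this issue.
CHECKPOINT_BRANCH = "entire/checkpoints/v1"


def remote_has_branch(ls_remote_output: str, branch: str = CHECKPOINT_BRANCH) -> bool:
    """Return True if `git ls-remote --heads` output lists the given branch.

    Each output line has the form "<sha>\\t<ref>".
    """
    target = f"refs/heads/{branch}"
    for line in ls_remote_output.splitlines():
        parts = line.split("\t")
        if len(parts) == 2 and parts[1].strip() == target:
            return True
    return False


def list_remote_heads(remote: str = "origin") -> str:
    """Run `git ls-remote --heads <remote>` in the current repo and return its stdout."""
    result = subprocess.run(
        ["git", "ls-remote", "--heads", remote],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Running `remote_has_branch(list_remote_heads("origin"))` inside a clone would report whether the transcripts are already visible to anyone who can fetch from that remote.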

## The sensitivity of AI transcripts

AI coding transcripts are uniquely sensitive because they can contain:
- Internal reasoning and architectural decision-making
- Partial secrets or credentials that slip past entropy-based redaction
- Context about proprietary systems shared in prompts
- Debugging discussions that reveal security weaknesses
- File paths and system information

These transcripts capture a complete record of how code was written, including the parts developers would never put in a commit message or PR description.
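As an illustration of how a secret can slip past entropy-based redaction, here is a minimal sketch of a per-token Shannon entropy filter. The threshold of 4.0 bits per character and the whitespace tokenization are assumptions made for the example, not a description of how Entire's redaction actually works.

```python
import math
from collections import Counter

# Hypothetical cutoff in bits per character; real redactors pick their own.
THRESHOLD = 4.0


def shannon_entropy(token: str) -> float:
    """Bits of entropy per character, estimated from character frequencies."""
    if not token:
        return 0.0
    counts = Counter(token)
    n = len(token)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def redact(text: str, threshold: float = THRESHOLD) -> str:
    """Replace whitespace-separated tokens whose entropy exceeds the threshold."""
    return " ".join(
        "[REDACTED]" if shannon_entropy(tok) > threshold else tok
        for tok in text.split()
    )


# A random-looking key is caught, but a weak, repetitive password is not.
print(redact("token f3A9zQ7LxP2mK8vRtY6b pass hunter2hunter2"))
```

Under these assumptions the random-looking key is replaced, while `hunter2hunter2` passes through untouched because its repeated characters keep its entropy low. Human-chosen passwords and short credentials tend to look like ordinary text to this kind of filter.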

## Alternative approaches

Projects like [AgentLogs](https://github.com/agentlogs/agentlogs) decouple transcript storage from the repository entirely. That seems like a sounder architecture for this kind of data, because the transcripts don't inherit the repo's access model and can be managed with their own access controls.

Has the team considered an architecture where transcripts are stored outside the git repo (e.g., a local database, a self-hostable server, or an optional remote backend), rather than on a git branch?
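For the local database option, a minimal sketch might look like the following. It assumes a SQLite file kept outside the working tree and a hypothetical schema that links transcripts to commits by SHA; none of these names come from Entire or AgentLogs.

```python
import sqlite3


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) a transcript store at a path outside the git repo."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS transcripts (
            commit_sha TEXT NOT NULL,
            session_id TEXT NOT NULL,
            body       TEXT NOT NULL,
            PRIMARY KEY (commit_sha, session_id)
        )
        """
    )
    return conn


def save_transcript(conn: sqlite3.Connection, commit_sha: str, session_id: str, body: str) -> None:
    """Insert or overwrite the transcript for one session of one commit."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO transcripts VALUES (?, ?, ?)",
            (commit_sha, session_id, body),
        )


def transcripts_for_commit(conn: sqlite3.Connection, commit_sha: str) -> list:
    """Return (session_id, body) pairs recorded for a commit."""
    rows = conn.execute(
        "SELECT session_id, body FROM transcripts WHERE commit_sha = ? ORDER BY session_id",
        (commit_sha,),
    )
    return rows.fetchall()
```

The commit SHA keeps the link between code and session history, but the transcript itself never enters the object store, so `git push`, forks, and mirrors cannot carry it along.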
