Merged
Changes from all commits
20 commits
ab6535b
tests/log(refactor[setup_logger]): remove add_from_fs references
tony Nov 2, 2025
6690af4
config/reader(chore[todo]): plan duplicate-aware loader alignment
tony Nov 2, 2025
f981d9c
config/reader(feat[duplicates]): add duplicate-aware loader
tony Nov 2, 2025
0558358
tests/config_reader(add[coverage]): verify duplicate-aware loader
tony Nov 2, 2025
ceaf4f5
cli/fmt(refactor): reuse duplicate-aware config loader
tony Nov 2, 2025
e79fdbf
cli/add(feat[path-mode]): Detect origin and prompt for path imports
tony Nov 2, 2025
5c4b1f1
tests/cli(test[add]): Cover path-mode workflow
tony Nov 2, 2025
ddcd619
docs(CHANGES) note `vcspull add` improvement
tony Nov 2, 2025
54aa40a
cli(add,discover)(feat[no-merge]): Add duplicate merge opt-out
tony Nov 2, 2025
dd5ba36
tests/cli(add,discover): Cover merge opt-out paths
tony Nov 2, 2025
7dfb41c
tests/cli(snapshot): Update add/discover expectations
tony Nov 2, 2025
b59175b
docs(CHANGES): Note add/discover --no-merge option
tony Nov 2, 2025
113ce52
config/load(feat[duplicates]): preserve workspace roots when reading
tony Nov 2, 2025
3b468a6
docs(CHANGES) note duplicate-aware config loader behavior
tony Nov 2, 2025
9e18ea8
test/config(refactor[yaml-fixture]): switch to multiline string
tony Nov 2, 2025
de1e1a2
tests(fixtures[dedent]): normalize multi-line YAML/JSON samples
tony Nov 2, 2025
6dc41b6
docs(CHANGES) Copy tweaks
tony Nov 2, 2025
b6b26be
docs(CHANGES) clarify duplicate merge defaults for users
tony Nov 2, 2025
bd66efe
docs(CHANGES) align fmt notes with user-facing tone
tony Nov 2, 2025
cf8813f
docs(cli): reflect duplicate-aware defaults succinctly
tony Nov 2, 2025
37 changes: 31 additions & 6 deletions CHANGES
@@ -37,17 +37,42 @@ _Upcoming changes will be written here._

#### `vcspull fmt` gains duplicate root merging (#479)

- Detects repeated workspace root labels and merges their repositories during
formatting so users keep a single normalized section.
- Adds `--no-merge` for workflows that need to preserve duplicate entries while
still seeing diagnostic warnings.
- Running `vcspull fmt` now consolidates repeated workspace sections into one
merged block so no repositories vanish during cleanup; pass `--no-merge` if
you want the command to surface duplicates without rewriting them.
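The merge behaves like a per-root dictionary union. A minimal sketch of the idea, using hypothetical helper and variable names rather than vcspull's actual implementation:

```python
def merge_workspace_sections(
    occurrences: dict[str, list[dict[str, object]]],
) -> tuple[dict[str, dict[str, object]], list[tuple[str, str]]]:
    """Union repeated workspace-root sections, keeping every repository.

    ``occurrences`` maps each workspace root to every mapping that appeared
    under it in the file. Returns the merged config plus any (root, repo)
    pairs whose entries conflicted; on conflict the later occurrence wins.
    """
    merged: dict[str, dict[str, object]] = {}
    conflicts: list[tuple[str, str]] = []
    for root, sections in occurrences.items():
        combined: dict[str, object] = {}
        for repos in sections:
            for name, entry in repos.items():
                if name in combined and combined[name] != entry:
                    conflicts.append((root, name))
                combined[name] = entry
        merged[root] = combined
    return merged, conflicts


# Two "~/code/" sections collapse into one merged block; nothing is dropped.
merged, conflicts = merge_workspace_sections(
    {
        "~/code/": [
            {"flask": {"repo": "https://github.com/pallets/flask.git"}},
            {"requests": {"repo": "https://github.com/psf/requests.git"}},
        ],
    },
)
```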

#### `vcspull add` protects workspace configs during imports (#480)

- Default behavior now merges duplicate workspace roots so prior entries stay
intact; add `--no-merge` to keep the raw sections and handle them yourself.
- You can invoke `vcspull add ~/study/python/project`; the command inspects the
path, auto-fills the `origin` remote, shows the tilde-shortened workspace, and
asks for confirmation unless you supply `--yes`.
- CLI previews now contract `$HOME` to `~/…`, matching the rest of the UX.

#### `vcspull discover` honors --no-merge (#480)

- Running `vcspull discover --no-merge` only reports duplicates—it leaves the
file untouched until you decide to act.

#### Configuration loader: Support for duplicate workspace roots (#480)

- Commands backed by `load_configs` (`list`, `status`, `sync`, etc.)
automatically keep every repository even when workspace sections repeat; pass
`merge_duplicates=False` to fall back to the legacy "last entry wins"
behavior with a warning.
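Duplicate-aware parsing hinges on seeing repeated keys before the parser collapses them. The stdlib `json` module illustrates the idea via `object_pairs_hook`; vcspull's loader does the equivalent for YAML with a `SafeLoader` subclass, so this sketch is only an analogy:

```python
import json


def merge_duplicate_keys(pairs: list[tuple[str, object]]) -> dict[str, object]:
    # The hook receives every key/value pair, including duplicates that a
    # plain dict() construction would silently overwrite.
    out: dict[str, object] = {}
    for key, value in pairs:
        if key in out and isinstance(out[key], dict) and isinstance(value, dict):
            out[key] = {**out[key], **value}  # union repeated sections
        else:
            out[key] = value  # last occurrence wins for non-mapping values
    return out


doc = (
    '{"~/code/": {"flask": {"repo": "https://github.com/pallets/flask.git"}},'
    ' "~/code/": {"requests": {"repo": "https://github.com/psf/requests.git"}}}'
)

# json.loads alone would keep only the last "~/code/" section; the hook
# lets us merge both so neither repository is lost.
merged = json.loads(doc, object_pairs_hook=merge_duplicate_keys)
```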

### Development

#### Snapshot coverage for formatter tests (#479)

- Formatter scenarios now use [Syrupy]-backed JSON and YAML snapshots to guard
against regressions in duplicate workspace-root merging.
- Formatter scenarios are checked against [Syrupy] JSON/YAML snapshots, making it
obvious when future changes alter the merged output or log text.

#### CLI `add` path-mode regression tests (#480)

- Parameterized pytest scenarios cover interactive prompts, duplicate merging,
and path inference to keep the redesigned workflow stable.

[syrupy]: https://github.com/syrupy-project/syrupy

3 changes: 1 addition & 2 deletions CLAUDE.md
@@ -103,7 +103,6 @@ Follow this workflow for code changes:
- `cli/__init__.py`: Main CLI entry point with argument parsing
- `cli/sync.py`: Repository synchronization functionality
- `cli/add.py`: Adding new repositories to configuration
- `cli/add_from_fs.py`: Scanning filesystem for repositories

3. **Repository Management**
- Uses `libvcs` package for VCS operations (git, svn, hg)
@@ -247,4 +246,4 @@ When stuck in debugging loops:
1. **Pause and acknowledge the loop**
2. **Minimize to MVP**: Remove all debugging cruft and experimental code
3. **Document the issue** comprehensively for a fresh approach
4. Format for portability (using quadruple backticks)
4. Format for portability (using quadruple backticks)
16 changes: 11 additions & 5 deletions README.md
@@ -104,6 +104,10 @@ $ vcspull add my-lib https://github.com/example/my-lib.git --path ~/code/my-lib
`-w ~/projects/libs`.
- Pass `-f/--file` to add to an alternate YAML file.
- Use `--dry-run` to preview changes before writing.
- Point at an existing checkout (`vcspull add ~/projects/example`) to infer the
name and remote; add `--yes` to skip the confirmation prompt.
- Append `--no-merge` if you prefer to review duplicate workspace roots
yourself instead of having vcspull merge them automatically.
- Follow with `vcspull sync my-lib` to clone or update the working tree after registration.

### Discover local checkouts and add en masse
@@ -116,8 +120,9 @@ $ vcspull discover ~/code --recursive
```

The scan shows each repository before import unless you opt into `--yes`. Add
`-w ~/code/` to pin the resulting workspace root or `-f` to
write somewhere other than the default `~/.vcspull.yaml`.
`-w ~/code/` to pin the resulting workspace root or `-f` to write somewhere other
than the default `~/.vcspull.yaml`. Duplicate workspace roots are merged by
default; include `--no-merge` to keep them separate while you review the log.

### Inspect configured repositories

@@ -148,15 +153,16 @@ command for automation workflows.

### Normalize configuration files

After importing or editing by hand, run the formatter to tidy up keys and keep
entries sorted:
After importing or editing by hand, run the formatter to tidy up keys, merge
duplicate workspace sections, and keep entries sorted:

```console
$ vcspull fmt -f ~/.vcspull.yaml --write
```

Use `vcspull fmt --all --write` to format every YAML file that vcspull can
discover under the standard config locations.
discover under the standard config locations. Add `--no-merge` if you only want
duplicate roots reported, not rewritten.

## Sync your repos

17 changes: 9 additions & 8 deletions docs/cli/add.md
@@ -63,6 +63,11 @@ The `--path` flag is useful when:
- Using non-standard directory layouts
- The repository name doesn't match the desired directory name

You can also point `vcspull add` at an existing checkout. Supplying a path such
as `vcspull add ~/projects/example` infers the repository name, inspects its
`origin` remote, and prompts before writing. Add `--yes` when you need to skip
the confirmation in scripts.

## Choosing configuration files

By default, vcspull looks for the first YAML configuration file in:
@@ -142,14 +147,10 @@ $ vcspull add tooling https://github.com/company/tooling.git \

## Handling duplicates

If a repository with the same name already exists in the workspace, vcspull will warn you:

```console
$ vcspull add flask https://github.com/pallets/flask.git -w ~/code/
WARNING: Repository 'flask' already exists in workspace '~/code/'.
```

The existing entry is preserved and not overwritten.
vcspull merges duplicate workspace sections by default so existing repositories
stay intact. When conflicts appear, the command logs what it kept. Prefer to
resolve duplicates yourself? Pass `--no-merge` to leave every section untouched
while still surfacing warnings.

## After adding repositories

4 changes: 4 additions & 0 deletions docs/cli/discover.md
@@ -45,6 +45,10 @@ Scan complete: 2 repositories added, 0 skipped
```

The command prompts for each repository before adding it to your configuration.
When a matching workspace section already exists, vcspull merges the new entry
into it so previously tracked repositories stay intact. Prefer to review
duplicates yourself? Add `--no-merge` to keep every section untouched while
still seeing a warning.

## Recursive scanning

9 changes: 8 additions & 1 deletion docs/cli/fmt.md
@@ -6,6 +6,11 @@
entries stay consistent. By default the formatter prints the proposed changes to
stdout. Apply the updates in place with `--write`.

When duplicate workspace roots are encountered, the formatter merges them into a
single section so repositories are never dropped. Prefer to review duplicates
without rewriting them? Pass `--no-merge` to leave the original sections in
place while still showing a warning.

## Command

```{eval-rst}
@@ -19,12 +24,14 @@ stdout. Apply the updates in place with `--write`.

## What gets formatted

The formatter performs three main tasks:
The formatter performs four main tasks:

- Expands string-only entries into verbose dictionaries using the `repo` key.
- Converts legacy `url` keys to `repo` for consistency with the rest of the
tooling.
- Sorts directory keys and repository names alphabetically to minimize diffs.
- Consolidates duplicate workspace roots into a single merged section while
logging any conflicts.

For example:

132 changes: 132 additions & 0 deletions src/vcspull/_internal/config_reader.py
@@ -1,5 +1,6 @@
from __future__ import annotations

import copy
import json
import pathlib
import typing as t
@@ -106,6 +107,20 @@ def _from_file(cls, path: pathlib.Path) -> dict[str, t.Any]:
assert isinstance(path, pathlib.Path)
content = path.open(encoding="utf-8").read()

# TODO(#?): Align this loader with the duplicate-aware YAML handling that
# ``vcspull fmt`` introduced in November 2025. The formatter now uses a
# custom SafeLoader subclass to retain and merge duplicate workspace root
# sections so repos are never overwritten. ConfigReader currently drops
# later duplicates because PyYAML keeps only the last key. Options:
# 1) Extract the formatter's loader/merge helpers into a shared utility
# that ConfigReader can reuse here;
# 2) Replace ConfigReader entirely when reading vcspull configs and call
# the formatter helpers directly;
# 3) Keep this basic loader but add an opt-in path for duplicate-aware
# parsing so commands like ``vcspull add`` can avoid data loss.
# Revisit once the new ``vcspull add`` flow lands so both commands share
# the same duplication safeguards.

if path.suffix in {".yaml", ".yml"}:
fmt: FormatLiteral = "yaml"
elif path.suffix == ".json":
@@ -204,3 +219,120 @@ def dump(self, fmt: FormatLiteral, indent: int = 2, **kwargs: t.Any) -> str:
indent=indent,
**kwargs,
)


class _DuplicateTrackingSafeLoader(yaml.SafeLoader):
"""SafeLoader that records duplicate top-level keys."""

def __init__(self, stream: str) -> None:
super().__init__(stream)
self.top_level_key_values: dict[t.Any, list[t.Any]] = {}
self._mapping_depth = 0


def _duplicate_tracking_construct_mapping(
loader: _DuplicateTrackingSafeLoader,
node: yaml.nodes.MappingNode,
deep: bool = False,
) -> dict[t.Any, t.Any]:
loader._mapping_depth += 1
loader.flatten_mapping(node)
mapping: dict[t.Any, t.Any] = {}

for key_node, value_node in node.value:
construct = t.cast(
t.Callable[[yaml.nodes.Node], t.Any],
loader.construct_object,
)
key = construct(key_node)
value = construct(value_node)

if loader._mapping_depth == 1:
loader.top_level_key_values.setdefault(key, []).append(copy.deepcopy(value))

mapping[key] = value

loader._mapping_depth -= 1
return mapping


_DuplicateTrackingSafeLoader.add_constructor(
yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
_duplicate_tracking_construct_mapping,
)


class DuplicateAwareConfigReader(ConfigReader):
"""ConfigReader that tracks duplicate top-level YAML sections."""

def __init__(
self,
content: RawConfigData,
*,
duplicate_sections: dict[str, list[t.Any]] | None = None,
) -> None:
super().__init__(content)
self._duplicate_sections = duplicate_sections or {}

@property
def duplicate_sections(self) -> dict[str, list[t.Any]]:
"""Mapping of top-level keys to the list of duplicated values."""
return self._duplicate_sections

@classmethod
def _load_yaml_with_duplicates(
cls,
content: str,
) -> tuple[dict[str, t.Any], dict[str, list[t.Any]]]:
loader = _DuplicateTrackingSafeLoader(content)

try:
data = loader.get_single_data()
finally:
dispose = t.cast(t.Callable[[], None], loader.dispose)
dispose()

if data is None:
loaded: dict[str, t.Any] = {}
else:
if not isinstance(data, dict):
msg = "Loaded configuration is not a mapping"
raise TypeError(msg)
loaded = t.cast("dict[str, t.Any]", data)

duplicate_sections = {
t.cast(str, key): values
for key, values in loader.top_level_key_values.items()
if len(values) > 1
}

return loaded, duplicate_sections

@classmethod
def _load_from_path(
cls,
path: pathlib.Path,
) -> tuple[dict[str, t.Any], dict[str, list[t.Any]]]:
if path.suffix.lower() in {".yaml", ".yml"}:
content = path.read_text(encoding="utf-8")
return cls._load_yaml_with_duplicates(content)

return ConfigReader._from_file(path), {}

@classmethod
def from_file(cls, path: pathlib.Path) -> DuplicateAwareConfigReader:
content, duplicate_sections = cls._load_from_path(path)
return cls(content, duplicate_sections=duplicate_sections)

@classmethod
def _from_file(cls, path: pathlib.Path) -> dict[str, t.Any]:
content, _ = cls._load_from_path(path)
return content

@classmethod
def load_with_duplicates(
cls,
path: pathlib.Path,
) -> tuple[dict[str, t.Any], dict[str, list[t.Any]]]:
reader = cls.from_file(path)
return reader.content, reader.duplicate_sections
12 changes: 3 additions & 9 deletions src/vcspull/cli/__init__.py
@@ -15,7 +15,7 @@
from vcspull.log import setup_logger

from ._formatter import VcspullHelpFormatter
from .add import add_repo, create_add_subparser
from .add import add_repo, create_add_subparser, handle_add_command
from .discover import create_discover_subparser, discover_repos
from .fmt import create_fmt_subparser, format_config_file
from .list import create_list_subparser, list_repos
@@ -368,14 +368,7 @@ def cli(_args: list[str] | None = None) -> None:
max_concurrent=getattr(args, "max_concurrent", None),
)
elif args.subparser_name == "add":
add_repo(
name=args.name,
url=args.url,
config_file_path_str=args.config,
path=args.path,
workspace_root_path=args.workspace_root_path,
dry_run=args.dry_run,
)
handle_add_command(args)
elif args.subparser_name == "discover":
discover_repos(
scan_dir_str=args.scan_dir,
@@ -384,6 +377,7 @@ def discover_repos(
workspace_root_override=args.workspace_root_path,
yes=args.yes,
dry_run=args.dry_run,
merge_duplicates=args.merge_duplicates,
)
elif args.subparser_name == "fmt":
format_config_file(