diff --git a/CHANGES b/CHANGES
index 2c0a4ac5..02f995b8 100644
--- a/CHANGES
+++ b/CHANGES
@@ -37,17 +37,42 @@ _Upcoming changes will be written here._
 
 #### `vcspull fmt` gains duplicate root merging (#479)
 
-- Detects repeated workspace root labels and merges their repositories during
-  formatting so users keep a single normalized section.
-- Adds `--no-merge` for workflows that need to preserve duplicate entries while
-  still seeing diagnostic warnings.
+- Running `vcspull fmt` now consolidates repeated workspace sections into one
+  merged block so no repositories vanish during cleanup; pass `--no-merge` if
+  you want the command to surface duplicates without rewriting them.
+
+#### `vcspull add` protects workspace configs during imports (#480)
+
+- Default behavior now merges duplicate workspace roots so prior entries stay
+  intact; add `--no-merge` to keep the raw sections and handle them yourself.
+- You can invoke `vcspull add ~/study/python/project`; the command inspects the
+  path, auto-fills the `origin` remote, shows the tilde-shortened workspace,
+  and asks for confirmation unless you supply `--yes`.
+- CLI previews now contract `$HOME` to `~/…`, matching the rest of the UX.
+
+#### `vcspull discover` honors `--no-merge` (#480)
+
+- Running `vcspull discover --no-merge` only reports duplicates; it leaves the
+  file untouched until you decide to act.
+
+#### Configuration loader: support for duplicate workspace roots (#480)
+
+- Commands backed by `load_configs` (`list`, `status`, `sync`, etc.)
+  automatically keep every repository even when workspace sections repeat; pass
+  `merge_duplicates=False` to fall back to the legacy "last entry wins"
+  behavior with a warning.
 
 ### Development
 
 #### Snapshot coverage for formatter tests (#479)
 
-- Formatter scenarios now use [Syrupy]-backed JSON and YAML snapshots to guard
-  against regressions in duplicate workspace-root merging.
+- Formatter scenarios are checked against [Syrupy] JSON/YAML snapshots,
+  making it obvious when future changes alter the merged output or log text.
+
+#### CLI `add` path-mode regression tests (#480)
+
+- Parameterized pytest scenarios cover interactive prompts, duplicate merging,
+  and path inference to keep the redesigned workflow stable.
 
 [syrupy]: https://github.com/syrupy-project/syrupy
 
diff --git a/CLAUDE.md b/CLAUDE.md
index 70aeec05..20a0dc2b 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -103,7 +103,6 @@ Follow this workflow for code changes:
    - `cli/__init__.py`: Main CLI entry point with argument parsing
    - `cli/sync.py`: Repository synchronization functionality
    - `cli/add.py`: Adding new repositories to configuration
-   - `cli/add_from_fs.py`: Scanning filesystem for repositories
 
 3. **Repository Management**
    - Uses `libvcs` package for VCS operations (git, svn, hg)
@@ -247,4 +246,4 @@ When stuck in debugging loops:
 1. **Pause and acknowledge the loop**
 2. **Minimize to MVP**: Remove all debugging cruft and experimental code
 3. **Document the issue** comprehensively for a fresh approach
-4. Format for portability (using quadruple backticks)
\ No newline at end of file
+4. Format for portability (using quadruple backticks)
diff --git a/README.md b/README.md
index a7989e92..d016771e 100644
--- a/README.md
+++ b/README.md
@@ -104,6 +104,10 @@ $ vcspull add my-lib https://github.com/example/my-lib.git --path ~/code/my-lib
   `-w ~/projects/libs`.
 - Pass `-f/--file` to add to an alternate YAML file.
 - Use `--dry-run` to preview changes before writing.
+- Point at an existing checkout (`vcspull add ~/projects/example`) to infer the + name and remote; add `--yes` to skip the confirmation prompt. +- Append `--no-merge` if you prefer to review duplicate workspace roots + yourself instead of having vcspull merge them automatically. - Follow with `vcspull sync my-lib` to clone or update the working tree after registration. ### Discover local checkouts and add en masse @@ -116,8 +120,9 @@ $ vcspull discover ~/code --recursive ``` The scan shows each repository before import unless you opt into `--yes`. Add -`-w ~/code/` to pin the resulting workspace root or `-f` to -write somewhere other than the default `~/.vcspull.yaml`. +`-w ~/code/` to pin the resulting workspace root or `-f` to write somewhere other +than the default `~/.vcspull.yaml`. Duplicate workspace roots are merged by +default; include `--no-merge` to keep them separate while you review the log. ### Inspect configured repositories @@ -148,15 +153,16 @@ command for automation workflows. ### Normalize configuration files -After importing or editing by hand, run the formatter to tidy up keys and keep -entries sorted: +After importing or editing by hand, run the formatter to tidy up keys, merge +duplicate workspace sections, and keep entries sorted: ```console $ vcspull fmt -f ~/.vcspull.yaml --write ``` Use `vcspull fmt --all --write` to format every YAML file that vcspull can -discover under the standard config locations. +discover under the standard config locations. Add `--no-merge` if you only want +duplicate roots reported, not rewritten. ## Sync your repos diff --git a/docs/cli/add.md b/docs/cli/add.md index 6d40e06b..aab6e762 100644 --- a/docs/cli/add.md +++ b/docs/cli/add.md @@ -63,6 +63,11 @@ The `--path` flag is useful when: - Using non-standard directory layouts - The repository name doesn't match the desired directory name +You can also point `vcspull add` at an existing checkout. Supplying a path such +as `vcspull add ~/projects/example` infers the repository name, inspects its +`origin` remote, and prompts before writing. Add `--yes` when you need to skip +the confirmation in scripts. + ## Choosing configuration files By default, vcspull looks for the first YAML configuration file in: @@ -142,14 +147,10 @@ $ vcspull add tooling https://github.com/company/tooling.git \ ## Handling duplicates -If a repository with the same name already exists in the workspace, vcspull will warn you: - -```console -$ vcspull add flask https://github.com/pallets/flask.git -w ~/code/ -WARNING: Repository 'flask' already exists in workspace '~/code/'. -``` - -The existing entry is preserved and not overwritten. +vcspull merges duplicate workspace sections by default so existing repositories +stay intact. When conflicts appear, the command logs what it kept. Prefer to +resolve duplicates yourself? Pass `--no-merge` to leave every section untouched +while still surfacing warnings. ## After adding repositories diff --git a/docs/cli/discover.md b/docs/cli/discover.md index 331f6669..4e7ee35d 100644 --- a/docs/cli/discover.md +++ b/docs/cli/discover.md @@ -45,6 +45,10 @@ Scan complete: 2 repositories added, 0 skipped ``` The command prompts for each repository before adding it to your configuration. +When a matching workspace section already exists, vcspull merges the new entry +into it so previously tracked repositories stay intact. Prefer to review +duplicates yourself? Add `--no-merge` to keep every section untouched while +still seeing a warning. 
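+
+For example, with a configuration that repeats the `~/code/` root, a
+`--no-merge` run only surfaces the duplicates (the root and count shown here
+are illustrative):
+
+```console
+$ vcspull discover ~/code --no-merge
+• Duplicate workspace root ~/code/ appears 2 times; skipping merge because --no-merge was provided.
+```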
## Recursive scanning diff --git a/docs/cli/fmt.md b/docs/cli/fmt.md index 20807009..ef12e69d 100644 --- a/docs/cli/fmt.md +++ b/docs/cli/fmt.md @@ -6,6 +6,11 @@ entries stay consistent. By default the formatter prints the proposed changes to stdout. Apply the updates in place with `--write`. +When duplicate workspace roots are encountered, the formatter merges them into a +single section so repositories are never dropped. Prefer to review duplicates +without rewriting them? Pass `--no-merge` to leave the original sections in +place while still showing a warning. + ## Command ```{eval-rst} @@ -19,12 +24,14 @@ stdout. Apply the updates in place with `--write`. ## What gets formatted -The formatter performs three main tasks: +The formatter performs four main tasks: - Expands string-only entries into verbose dictionaries using the `repo` key. - Converts legacy `url` keys to `repo` for consistency with the rest of the tooling. - Sorts directory keys and repository names alphabetically to minimize diffs. +- Consolidates duplicate workspace roots into a single merged section while + logging any conflicts. For example: diff --git a/src/vcspull/_internal/config_reader.py b/src/vcspull/_internal/config_reader.py index 1c4890d3..22583b6d 100644 --- a/src/vcspull/_internal/config_reader.py +++ b/src/vcspull/_internal/config_reader.py @@ -1,5 +1,6 @@ from __future__ import annotations +import copy import json import pathlib import typing as t @@ -106,6 +107,20 @@ def _from_file(cls, path: pathlib.Path) -> dict[str, t.Any]: assert isinstance(path, pathlib.Path) content = path.open(encoding="utf-8").read() + # TODO(#?): Align this loader with the duplicate-aware YAML handling that + # ``vcspull fmt`` introduced in November 2025. The formatter now uses a + # custom SafeLoader subclass to retain and merge duplicate workspace root + # sections so repos are never overwritten. ConfigReader currently drops + # later duplicates because PyYAML keeps only the last key. Options: + # 1) Extract the formatter's loader/merge helpers into a shared utility + # that ConfigReader can reuse here; + # 2) Replace ConfigReader entirely when reading vcspull configs and call + # the formatter helpers directly; + # 3) Keep this basic loader but add an opt-in path for duplicate-aware + # parsing so commands like ``vcspull add`` can avoid data loss. + # Revisit once the new ``vcspull add`` flow lands so both commands share + # the same duplication safeguards. 
+ if path.suffix in {".yaml", ".yml"}: fmt: FormatLiteral = "yaml" elif path.suffix == ".json": @@ -204,3 +219,120 @@ def dump(self, fmt: FormatLiteral, indent: int = 2, **kwargs: t.Any) -> str: indent=indent, **kwargs, ) + + +class _DuplicateTrackingSafeLoader(yaml.SafeLoader): + """SafeLoader that records duplicate top-level keys.""" + + def __init__(self, stream: str) -> None: + super().__init__(stream) + self.top_level_key_values: dict[t.Any, list[t.Any]] = {} + self._mapping_depth = 0 + + +def _duplicate_tracking_construct_mapping( + loader: _DuplicateTrackingSafeLoader, + node: yaml.nodes.MappingNode, + deep: bool = False, +) -> dict[t.Any, t.Any]: + loader._mapping_depth += 1 + loader.flatten_mapping(node) + mapping: dict[t.Any, t.Any] = {} + + for key_node, value_node in node.value: + construct = t.cast( + t.Callable[[yaml.nodes.Node], t.Any], + loader.construct_object, + ) + key = construct(key_node) + value = construct(value_node) + + if loader._mapping_depth == 1: + loader.top_level_key_values.setdefault(key, []).append(copy.deepcopy(value)) + + mapping[key] = value + + loader._mapping_depth -= 1 + return mapping + + +_DuplicateTrackingSafeLoader.add_constructor( + yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG, + _duplicate_tracking_construct_mapping, +) + + +class DuplicateAwareConfigReader(ConfigReader): + """ConfigReader that tracks duplicate top-level YAML sections.""" + + def __init__( + self, + content: RawConfigData, + *, + duplicate_sections: dict[str, list[t.Any]] | None = None, + ) -> None: + super().__init__(content) + self._duplicate_sections = duplicate_sections or {} + + @property + def duplicate_sections(self) -> dict[str, list[t.Any]]: + """Mapping of top-level keys to the list of duplicated values.""" + return self._duplicate_sections + + @classmethod + def _load_yaml_with_duplicates( + cls, + content: str, + ) -> tuple[dict[str, t.Any], dict[str, list[t.Any]]]: + loader = _DuplicateTrackingSafeLoader(content) + + try: + data = loader.get_single_data() + finally: + dispose = t.cast(t.Callable[[], None], loader.dispose) + dispose() + + if data is None: + loaded: dict[str, t.Any] = {} + else: + if not isinstance(data, dict): + msg = "Loaded configuration is not a mapping" + raise TypeError(msg) + loaded = t.cast("dict[str, t.Any]", data) + + duplicate_sections = { + t.cast(str, key): values + for key, values in loader.top_level_key_values.items() + if len(values) > 1 + } + + return loaded, duplicate_sections + + @classmethod + def _load_from_path( + cls, + path: pathlib.Path, + ) -> tuple[dict[str, t.Any], dict[str, list[t.Any]]]: + if path.suffix.lower() in {".yaml", ".yml"}: + content = path.read_text(encoding="utf-8") + return cls._load_yaml_with_duplicates(content) + + return ConfigReader._from_file(path), {} + + @classmethod + def from_file(cls, path: pathlib.Path) -> DuplicateAwareConfigReader: + content, duplicate_sections = cls._load_from_path(path) + return cls(content, duplicate_sections=duplicate_sections) + + @classmethod + def _from_file(cls, path: pathlib.Path) -> dict[str, t.Any]: + content, _ = cls._load_from_path(path) + return content + + @classmethod + def load_with_duplicates( + cls, + path: pathlib.Path, + ) -> tuple[dict[str, t.Any], dict[str, list[t.Any]]]: + reader = cls.from_file(path) + return reader.content, reader.duplicate_sections diff --git a/src/vcspull/cli/__init__.py b/src/vcspull/cli/__init__.py index 199f5875..5a181c20 100644 --- a/src/vcspull/cli/__init__.py +++ b/src/vcspull/cli/__init__.py @@ -15,7 +15,7 @@ from vcspull.log 
import setup_logger from ._formatter import VcspullHelpFormatter -from .add import add_repo, create_add_subparser +from .add import add_repo, create_add_subparser, handle_add_command from .discover import create_discover_subparser, discover_repos from .fmt import create_fmt_subparser, format_config_file from .list import create_list_subparser, list_repos @@ -368,14 +368,7 @@ def cli(_args: list[str] | None = None) -> None: max_concurrent=getattr(args, "max_concurrent", None), ) elif args.subparser_name == "add": - add_repo( - name=args.name, - url=args.url, - config_file_path_str=args.config, - path=args.path, - workspace_root_path=args.workspace_root_path, - dry_run=args.dry_run, - ) + handle_add_command(args) elif args.subparser_name == "discover": discover_repos( scan_dir_str=args.scan_dir, @@ -384,6 +377,7 @@ def cli(_args: list[str] | None = None) -> None: workspace_root_override=args.workspace_root_path, yes=args.yes, dry_run=args.dry_run, + merge_duplicates=args.merge_duplicates, ) elif args.subparser_name == "fmt": format_config_file( diff --git a/src/vcspull/cli/add.py b/src/vcspull/cli/add.py index 5c6baba7..8220907a 100644 --- a/src/vcspull/cli/add.py +++ b/src/vcspull/cli/add.py @@ -4,20 +4,23 @@ import logging import pathlib +import subprocess import traceback import typing as t from colorama import Fore, Style -from vcspull._internal.config_reader import ConfigReader +from vcspull._internal.config_reader import DuplicateAwareConfigReader from vcspull.config import ( canonicalize_workspace_path, expand_dir, find_home_config_files, + merge_duplicate_workspace_roots, normalize_workspace_roots, save_config_yaml, workspace_root_label, ) +from vcspull.util import contract_user_home if t.TYPE_CHECKING: import argparse @@ -34,12 +37,21 @@ def create_add_subparser(parser: argparse.ArgumentParser) -> None: The parser to configure """ parser.add_argument( - "name", - help="Name for the repository in the config", + "target", + help=( + "Repository name (when providing a URL) or filesystem path to an " + "existing project" + ), ) parser.add_argument( "url", - help="Repository URL (e.g., https://github.com/user/repo.git)", + nargs="?", + help="Repository URL when explicitly specifying the name", + ) + parser.add_argument( + "--name", + dest="override_name", + help="Override detected repository name when importing from a path", ) parser.add_argument( "-f", @@ -71,6 +83,20 @@ def create_add_subparser(parser: argparse.ArgumentParser) -> None: action="store_true", help="Preview changes without writing to config file", ) + parser.add_argument( + "--no-merge", + dest="merge_duplicates", + action="store_false", + help="Skip merging duplicate workspace roots before writing", + ) + parser.add_argument( + "-y", + "--yes", + dest="assume_yes", + action="store_true", + help="Automatically confirm interactive prompts", + ) + parser.set_defaults(merge_duplicates=True) def _resolve_workspace_path( @@ -102,6 +128,179 @@ def _resolve_workspace_path( return cwd +def _detect_git_remote(repo_path: pathlib.Path) -> str | None: + """Return the ``origin`` remote URL for a Git repository if available.""" + try: + result = subprocess.run( + ["git", "-C", str(repo_path), "remote", "get-url", "origin"], + check=True, + capture_output=True, + text=True, + ) + except FileNotFoundError: + log.debug("git executable not found when inspecting %s", repo_path) + return None + except subprocess.CalledProcessError: + log.debug("No git remote 'origin' configured for %s", repo_path) + return None + + remote = 
result.stdout.strip() + return remote or None + + +def _normalize_detected_url(remote: str | None) -> tuple[str, str]: + """Return display and config URLs derived from a detected remote.""" + if remote is None: + return "", "" + + display_url = remote + config_url = remote + + normalized = remote.strip() + + if normalized and not normalized.startswith("git+"): + if normalized.startswith(("http://", "https://", "file://")): + config_url = f"git+{normalized}" + else: + config_url = normalized + elif normalized: + config_url = normalized + + return display_url, config_url + + +def handle_add_command(args: argparse.Namespace) -> None: + """Entry point for the ``vcspull add`` CLI command.""" + explicit_url = getattr(args, "url", None) + + if explicit_url: + add_repo( + name=args.target, + url=explicit_url, + config_file_path_str=args.config, + path=args.path, + workspace_root_path=args.workspace_root_path, + dry_run=args.dry_run, + merge_duplicates=args.merge_duplicates, + ) + return + + repo_input = getattr(args, "target", None) + if repo_input is None: + log.error("A repository path or name must be provided.") + return + + cwd = pathlib.Path.cwd() + repo_path = expand_dir(pathlib.Path(repo_input), cwd=cwd) + + if not repo_path.exists(): + log.error("Repository path %s does not exist.", repo_path) + return + + if not repo_path.is_dir(): + log.error("Repository path %s is not a directory.", repo_path) + return + + override_name = getattr(args, "override_name", None) + repo_name = override_name or repo_path.name + + detected_remote = _detect_git_remote(repo_path) + display_url, config_url = _normalize_detected_url(detected_remote) + + if not config_url: + display_url = contract_user_home(repo_path) + config_url = str(repo_path) + log.warning( + "Unable to determine git remote for %s; using local path in config.", + repo_path, + ) + + workspace_root_input = ( + args.workspace_root_path + if getattr(args, "workspace_root_path", None) + else repo_path.parent.as_posix() + ) + + workspace_path = expand_dir(pathlib.Path(workspace_root_input), cwd=cwd) + workspace_label = workspace_root_label( + workspace_path, + cwd=cwd, + home=pathlib.Path.home(), + ) + + summary_url = display_url or config_url + + display_path = contract_user_home(repo_path) + + log.info("%sFound new repository to import:%s", Fore.GREEN, Style.RESET_ALL) + log.info( + " %s+%s %s%s%s (%s%s%s)", + Fore.GREEN, + Style.RESET_ALL, + Fore.CYAN, + repo_name, + Style.RESET_ALL, + Fore.YELLOW, + summary_url, + Style.RESET_ALL, + ) + log.info( + " %s•%s workspace: %s%s%s", + Fore.BLUE, + Style.RESET_ALL, + Fore.MAGENTA, + workspace_label, + Style.RESET_ALL, + ) + log.info( + " %s↳%s path: %s%s%s", + Fore.BLUE, + Style.RESET_ALL, + Fore.BLUE, + display_path, + Style.RESET_ALL, + ) + + prompt_text = f"{Fore.CYAN}?{Style.RESET_ALL} Import this repository? [y/N]: " + + proceed = True + if args.dry_run: + log.info( + "%s?%s Import this repository? [y/N]: %sskipped (dry-run)%s", + Fore.CYAN, + Style.RESET_ALL, + Fore.YELLOW, + Style.RESET_ALL, + ) + elif getattr(args, "assume_yes", False): + log.info( + "%s?%s Import this repository? 
[y/N]: %sy (auto-confirm)%s", + Fore.CYAN, + Style.RESET_ALL, + Fore.GREEN, + Style.RESET_ALL, + ) + else: + try: + response = input(prompt_text) + except EOFError: + response = "" + proceed = response.strip().lower() in {"y", "yes"} + if not proceed: + log.info("Aborted import of '%s' from %s", repo_name, repo_path) + return + + add_repo( + name=repo_name, + url=config_url, + config_file_path_str=args.config, + path=str(repo_path), + workspace_root_path=workspace_root_input, + dry_run=args.dry_run, + merge_duplicates=args.merge_duplicates, + ) + + def add_repo( name: str, url: str, @@ -109,6 +308,8 @@ def add_repo( path: str | None, workspace_root_path: str | None, dry_run: bool, + *, + merge_duplicates: bool = True, ) -> None: """Add a repository to the vcspull configuration. @@ -148,42 +349,111 @@ def add_repo( config_file_path = home_configs[0] # Load existing config - raw_config: dict[str, t.Any] = {} + raw_config: dict[str, t.Any] + duplicate_root_occurrences: dict[str, list[t.Any]] if config_file_path.exists() and config_file_path.is_file(): try: - loaded_config = ConfigReader._from_file(config_file_path) + ( + raw_config, + duplicate_root_occurrences, + ) = DuplicateAwareConfigReader.load_with_duplicates(config_file_path) + except TypeError: + log.exception( + "Config file %s is not a valid YAML dictionary.", + config_file_path, + ) + return except Exception: log.exception("Error loading YAML from %s. Aborting.", config_file_path) if log.isEnabledFor(logging.DEBUG): traceback.print_exc() return - - if loaded_config is None: - raw_config = {} - elif isinstance(loaded_config, dict): - raw_config = loaded_config - else: - log.error( - "Config file %s is not a valid YAML dictionary.", - config_file_path, - ) - return else: + raw_config = {} + duplicate_root_occurrences = {} log.info( "Config file %s not found. 
A new one will be created.", config_file_path, ) + duplicate_merge_conflicts: list[str] = [] + duplicate_merge_changes = 0 + duplicate_merge_details: list[tuple[str, int]] = [] + + if merge_duplicates: + ( + raw_config, + duplicate_merge_conflicts, + duplicate_merge_changes, + duplicate_merge_details, + ) = merge_duplicate_workspace_roots(raw_config, duplicate_root_occurrences) + for message in duplicate_merge_conflicts: + log.warning(message) + + if duplicate_merge_changes and duplicate_merge_details: + for label, occurrence_count in duplicate_merge_details: + log.info( + "%s•%s Merged %s%d%s duplicate entr%s for workspace root %s%s%s", + Fore.BLUE, + Style.RESET_ALL, + Fore.YELLOW, + occurrence_count, + Style.RESET_ALL, + "y" if occurrence_count == 1 else "ies", + Fore.MAGENTA, + label, + Style.RESET_ALL, + ) + else: + if duplicate_root_occurrences: + duplicate_merge_details = [ + (label, len(values)) + for label, values in duplicate_root_occurrences.items() + ] + for label, occurrence_count in duplicate_merge_details: + log.warning( + "%s•%s Duplicate workspace root %s%s%s appears %s%d%s time%s; " + "skipping merge because --no-merge was provided.", + Fore.BLUE, + Style.RESET_ALL, + Fore.MAGENTA, + label, + Style.RESET_ALL, + Fore.YELLOW, + occurrence_count, + Style.RESET_ALL, + "" if occurrence_count == 1 else "s", + ) + + duplicate_merge_conflicts = [] + cwd = pathlib.Path.cwd() home = pathlib.Path.home() - normalization_result = normalize_workspace_roots( - raw_config, - cwd=cwd, - home=home, - ) - raw_config, workspace_map, merge_conflicts, merge_changes = normalization_result - config_was_normalized = merge_changes > 0 + if merge_duplicates: + ( + raw_config, + workspace_map, + merge_conflicts, + merge_changes, + ) = normalize_workspace_roots( + raw_config, + cwd=cwd, + home=home, + ) + config_was_normalized = (merge_changes + duplicate_merge_changes) > 0 + else: + ( + _normalized_preview, + workspace_map, + merge_conflicts, + _merge_changes, + ) = normalize_workspace_roots( + raw_config, + cwd=cwd, + home=home, + ) + config_was_normalized = False for message in merge_conflicts: log.warning(message) diff --git a/src/vcspull/cli/discover.py b/src/vcspull/cli/discover.py index f3f60590..8d66afd3 100644 --- a/src/vcspull/cli/discover.py +++ b/src/vcspull/cli/discover.py @@ -11,11 +11,12 @@ from colorama import Fore, Style -from vcspull._internal.config_reader import ConfigReader +from vcspull._internal.config_reader import DuplicateAwareConfigReader from vcspull.config import ( canonicalize_workspace_path, expand_dir, find_home_config_files, + merge_duplicate_workspace_roots, normalize_workspace_roots, save_config_yaml, workspace_root_label, @@ -104,6 +105,13 @@ def create_discover_subparser(parser: argparse.ArgumentParser) -> None: action="store_true", help="Preview changes without writing to config file", ) + parser.add_argument( + "--no-merge", + dest="merge_duplicates", + action="store_false", + help="Skip merging duplicate workspace roots before writing", + ) + parser.set_defaults(merge_duplicates=True) def _resolve_workspace_path( @@ -142,6 +150,8 @@ def discover_repos( workspace_root_override: str | None, yes: bool, dry_run: bool, + *, + merge_duplicates: bool = True, ) -> None: """Scan filesystem for git repositories and add to vcspull config. 
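+
+    When ``merge_duplicates`` is true (the default), duplicate workspace
+    root sections in an existing config file are merged before any new
+    repositories are written; when it is false, duplicates are only
+    reported and the original sections are left untouched.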
@@ -186,27 +196,36 @@ def discover_repos( else: config_file_path = home_configs[0] - raw_config: dict[str, t.Any] = {} + raw_config: dict[str, t.Any] + duplicate_root_occurrences: dict[str, list[t.Any]] if config_file_path.exists() and config_file_path.is_file(): try: - loaded_config = ConfigReader._from_file(config_file_path) + ( + raw_config, + duplicate_root_occurrences, + ) = DuplicateAwareConfigReader.load_with_duplicates(config_file_path) + except TypeError: + log.exception( + "Config file %s is not a valid YAML dictionary.", + config_file_path, + ) + return except Exception: log.exception("Error loading YAML from %s. Aborting.", config_file_path) if log.isEnabledFor(logging.DEBUG): traceback.print_exc() return - - if loaded_config is None: + if raw_config is None: raw_config = {} - elif isinstance(loaded_config, dict): - raw_config = loaded_config - else: + elif not isinstance(raw_config, dict): log.error( "Config file %s is not a valid YAML dictionary.", config_file_path, ) return else: + raw_config = {} + duplicate_root_occurrences = {} log.info( "%si%s Config file %s%s%s not found. A new one will be created.", Fore.CYAN, @@ -216,15 +235,79 @@ def discover_repos( Style.RESET_ALL, ) + duplicate_merge_conflicts: list[str] = [] + duplicate_merge_changes = 0 + duplicate_merge_details: list[tuple[str, int]] = [] + + if merge_duplicates: + ( + raw_config, + duplicate_merge_conflicts, + duplicate_merge_changes, + duplicate_merge_details, + ) = merge_duplicate_workspace_roots(raw_config, duplicate_root_occurrences) + for message in duplicate_merge_conflicts: + log.warning(message) + if duplicate_merge_changes and duplicate_merge_details: + for label, occurrence_count in duplicate_merge_details: + log.info( + "%s•%s Merged %s%d%s duplicate entr%s for workspace root %s%s%s", + Fore.BLUE, + Style.RESET_ALL, + Fore.YELLOW, + occurrence_count, + Style.RESET_ALL, + "y" if occurrence_count == 1 else "ies", + Fore.MAGENTA, + label, + Style.RESET_ALL, + ) + else: + if duplicate_root_occurrences: + duplicate_merge_details = [ + (label, len(values)) + for label, values in duplicate_root_occurrences.items() + ] + for label, occurrence_count in duplicate_merge_details: + log.warning( + "%s•%s Duplicate workspace root %s%s%s appears %s%d%s time%s; " + "skipping merge because --no-merge was provided.", + Fore.BLUE, + Style.RESET_ALL, + Fore.MAGENTA, + label, + Style.RESET_ALL, + Fore.YELLOW, + occurrence_count, + Style.RESET_ALL, + "" if occurrence_count == 1 else "s", + ) + cwd = pathlib.Path.cwd() home = pathlib.Path.home() - normalization_result = normalize_workspace_roots( - raw_config, - cwd=cwd, - home=home, - ) - raw_config, workspace_map, merge_conflicts, merge_changes = normalization_result + if merge_duplicates: + ( + raw_config, + workspace_map, + merge_conflicts, + merge_changes, + ) = normalize_workspace_roots( + raw_config, + cwd=cwd, + home=home, + ) + else: + ( + _, + workspace_map, + merge_conflicts, + merge_changes, + ) = normalize_workspace_roots( + raw_config, + cwd=cwd, + home=home, + ) for message in merge_conflicts: log.warning(message) @@ -350,7 +433,9 @@ def discover_repos( Style.RESET_ALL, ) - changes_made = merge_changes > 0 + changes_made = merge_duplicates and ( + merge_changes > 0 or duplicate_merge_changes > 0 + ) if not repos_to_add: if existing_repos: diff --git a/src/vcspull/cli/fmt.py b/src/vcspull/cli/fmt.py index 1ac60078..31ca7170 100644 --- a/src/vcspull/cli/fmt.py +++ b/src/vcspull/cli/fmt.py @@ -8,13 +8,13 @@ import traceback import typing as t -import yaml from 
colorama import Fore, Style -from vcspull._internal.config_reader import ConfigReader +from vcspull._internal.config_reader import DuplicateAwareConfigReader from vcspull.config import ( find_config_files, find_home_config_files, + merge_duplicate_workspace_roots, normalize_workspace_roots, save_config_yaml, ) @@ -25,149 +25,6 @@ log = logging.getLogger(__name__) -class _DuplicateTrackingSafeLoader(yaml.SafeLoader): - """PyYAML loader that records duplicate top-level keys.""" - - def __init__(self, stream: str) -> None: - super().__init__(stream) - self.top_level_key_values: dict[t.Any, list[t.Any]] = {} - self._mapping_depth = 0 - - -def _duplicate_tracking_construct_mapping( - loader: _DuplicateTrackingSafeLoader, - node: yaml.nodes.MappingNode, - deep: bool = False, -) -> dict[t.Any, t.Any]: - loader._mapping_depth += 1 - loader.flatten_mapping(node) - mapping: dict[t.Any, t.Any] = {} - - for key_node, value_node in node.value: - construct = t.cast( - t.Callable[[yaml.nodes.Node], t.Any], - loader.construct_object, - ) - key = construct(key_node) - value = construct(value_node) - - if loader._mapping_depth == 1: - loader.top_level_key_values.setdefault(key, []).append(copy.deepcopy(value)) - - mapping[key] = value - - loader._mapping_depth -= 1 - return mapping - - -_DuplicateTrackingSafeLoader.add_constructor( - yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG, - _duplicate_tracking_construct_mapping, -) - - -def _load_yaml_config_with_duplicates( - config_file_path: pathlib.Path, -) -> tuple[dict[str, t.Any], dict[str, list[t.Any]]]: - content = config_file_path.read_text(encoding="utf-8") - loader = _DuplicateTrackingSafeLoader(content) - - try: - data = loader.get_single_data() - finally: - dispose = t.cast(t.Callable[[], None], loader.dispose) - dispose() - - if data is None: - return {}, {} - if not isinstance(data, dict): - msg = f"Config file {config_file_path} is not a valid YAML dictionary." - raise TypeError(msg) - - duplicates = { - t.cast(str, key): values - for key, values in loader.top_level_key_values.items() - if len(values) > 1 - } - - return t.cast("dict[str, t.Any]", data), duplicates - - -def _load_config_with_duplicate_roots( - config_file_path: pathlib.Path, -) -> tuple[dict[str, t.Any], dict[str, list[t.Any]]]: - if config_file_path.suffix.lower() in {".yaml", ".yml"}: - return _load_yaml_config_with_duplicates(config_file_path) - - return ConfigReader._from_file(config_file_path), {} - - -def _merge_duplicate_workspace_root_entries( - label: str, - occurrences: list[t.Any], -) -> tuple[t.Any, list[str], int]: - conflicts: list[str] = [] - change_count = max(len(occurrences) - 1, 0) - - if not occurrences: - return {}, conflicts, change_count - - if not all(isinstance(entry, dict) for entry in occurrences): - conflicts.append( - ( - f"Workspace root '{label}' contains duplicate entries that are not " - "mappings. Keeping the last occurrence." - ), - ) - return occurrences[-1], conflicts, change_count - - merged: dict[str, t.Any] = {} - - for entry in occurrences: - assert isinstance(entry, dict) - for repo_name, repo_config in entry.items(): - if repo_name not in merged: - merged[repo_name] = copy.deepcopy(repo_config) - elif merged[repo_name] != repo_config: - conflicts.append( - ( - f"Workspace root '{label}' contains conflicting definitions " - f"for repository '{repo_name}'. Keeping the existing entry." 
- ), - ) - - return merged, conflicts, change_count - - -def _merge_duplicate_workspace_roots( - config_data: dict[str, t.Any], - duplicate_roots: dict[str, list[t.Any]], -) -> tuple[dict[str, t.Any], list[str], int, list[tuple[str, int]]]: - if not duplicate_roots: - return config_data, [], 0, [] - - merged_config = copy.deepcopy(config_data) - conflicts: list[str] = [] - change_count = 0 - details: list[tuple[str, int]] = [] - - for label, occurrences in duplicate_roots.items(): - ( - merged_value, - entry_conflicts, - entry_changes, - ) = _merge_duplicate_workspace_root_entries( - label, - occurrences, - ) - merged_config[label] = merged_value - conflicts.extend(entry_conflicts) - change_count += entry_changes - details.append((label, len(occurrences))) - - return merged_config, conflicts, change_count, details - - def create_fmt_subparser(parser: argparse.ArgumentParser) -> None: """Create ``vcspull fmt`` argument subparser.""" parser.add_argument( @@ -315,17 +172,11 @@ def format_single_config( # Load existing config try: - raw_config, duplicate_root_occurrences = _load_config_with_duplicate_roots( - config_file_path, + raw_config, duplicate_root_occurrences = ( + DuplicateAwareConfigReader.load_with_duplicates(config_file_path) ) - if not isinstance(raw_config, dict): - log.error( - "Config file %s is not a valid YAML dictionary.", - config_file_path, - ) - return False except TypeError: - log.exception("Invalid configuration in %s", config_file_path) + log.exception("Config file %s is not a mapping", config_file_path) return False except Exception: log.exception("Error loading config from %s", config_file_path) @@ -349,7 +200,7 @@ def format_single_config( duplicate_merge_conflicts, duplicate_merge_changes, duplicate_merge_details, - ) = _merge_duplicate_workspace_roots(working_config, duplicate_root_occurrences) + ) = merge_duplicate_workspace_roots(working_config, duplicate_root_occurrences) elif duplicate_root_occurrences: duplicate_merge_details = [ (label, len(values)) for label, values in duplicate_root_occurrences.items() diff --git a/src/vcspull/config.py b/src/vcspull/config.py index 6fbbbb6f..ccb0efa0 100644 --- a/src/vcspull/config.py +++ b/src/vcspull/config.py @@ -14,7 +14,7 @@ from vcspull.validator import is_valid_config from . import exc -from ._internal.config_reader import ConfigReader +from ._internal.config_reader import ConfigReader, DuplicateAwareConfigReader from .util import get_config_dir, update_dict log = logging.getLogger(__name__) @@ -241,6 +241,8 @@ def find_config_files( def load_configs( files: list[pathlib.Path], cwd: pathlib.Path | Callable[[], pathlib.Path] = pathlib.Path.cwd, + *, + merge_duplicates: bool = True, ) -> list[ConfigDict]: """Return repos from a list of files. 
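+
+    When ``merge_duplicates`` is true (the default), duplicate workspace
+    root sections are merged so every repository survives loading; pass
+    ``merge_duplicates=False`` to keep the legacy last-entry-wins behavior,
+    which logs a warning about the duplicate roots instead.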
@@ -268,9 +270,44 @@ def load_configs( if isinstance(file, str): file = pathlib.Path(file) assert isinstance(file, pathlib.Path) - conf = ConfigReader._from_file(file) - assert is_valid_config(conf) - newrepos = extract_repos(conf, cwd=cwd) + + config_content, duplicate_roots = ( + DuplicateAwareConfigReader.load_with_duplicates(file) + ) + + if merge_duplicates: + ( + config_content, + merge_conflicts, + _merge_change_count, + merge_details, + ) = merge_duplicate_workspace_roots(config_content, duplicate_roots) + + for conflict in merge_conflicts: + log.warning("%s: %s", file, conflict) + + for root_label, occurrence_count in merge_details: + duplicate_count = max(occurrence_count - 1, 0) + if duplicate_count == 0: + continue + plural = "entry" if duplicate_count == 1 else "entries" + log.info( + "%s: merged %d duplicate %s for workspace root '%s'", + file, + duplicate_count, + plural, + root_label, + ) + elif duplicate_roots: + duplicate_list = ", ".join(sorted(duplicate_roots.keys())) + log.warning( + "%s: duplicate workspace roots detected (%s); keeping last occurrences", + file, + duplicate_list, + ) + + assert is_valid_config(config_content) + newrepos = extract_repos(config_content, cwd=cwd) if not repos: repos.extend(newrepos) @@ -444,6 +481,74 @@ def save_config_yaml(config_file_path: pathlib.Path, data: dict[t.Any, t.Any]) - config_file_path.write_text(yaml_content, encoding="utf-8") +def merge_duplicate_workspace_root_entries( + label: str, + occurrences: list[t.Any], +) -> tuple[t.Any, list[str], int]: + """Merge duplicate entries for a single workspace root.""" + conflicts: list[str] = [] + change_count = max(len(occurrences) - 1, 0) + + if not occurrences: + return {}, conflicts, change_count + + if not all(isinstance(entry, dict) for entry in occurrences): + conflicts.append( + ( + f"Workspace root '{label}' contains duplicate entries that are not " + "mappings. Keeping the last occurrence." + ), + ) + return occurrences[-1], conflicts, change_count + + merged: dict[str, t.Any] = {} + + for entry in occurrences: + assert isinstance(entry, dict) + for repo_name, repo_config in entry.items(): + if repo_name not in merged: + merged[repo_name] = copy.deepcopy(repo_config) + elif merged[repo_name] != repo_config: + conflicts.append( + ( + f"Workspace root '{label}' contains conflicting definitions " + f"for repository '{repo_name}'. Keeping the existing entry." 
+ ), + ) + + return merged, conflicts, change_count + + +def merge_duplicate_workspace_roots( + config_data: dict[str, t.Any], + duplicate_roots: dict[str, list[t.Any]], +) -> tuple[dict[str, t.Any], list[str], int, list[tuple[str, int]]]: + """Merge duplicate workspace root sections captured during load.""" + if not duplicate_roots: + return copy.deepcopy(config_data), [], 0, [] + + merged_config = copy.deepcopy(config_data) + conflicts: list[str] = [] + change_count = 0 + details: list[tuple[str, int]] = [] + + for label, occurrences in duplicate_roots.items(): + ( + merged_value, + entry_conflicts, + entry_changes, + ) = merge_duplicate_workspace_root_entries( + label, + occurrences, + ) + merged_config[label] = merged_value + conflicts.extend(entry_conflicts) + change_count += entry_changes + details.append((label, len(occurrences))) + + return merged_config, conflicts, change_count, details + + def canonicalize_workspace_path( label: str, *, diff --git a/tests/cli/__snapshots__/test_add.ambr b/tests/cli/__snapshots__/test_add.ambr new file mode 100644 index 00000000..6c2c10b5 --- /dev/null +++ b/tests/cli/__snapshots__/test_add.ambr @@ -0,0 +1,94 @@ +# serializer version: 1 +# name: test_add_repo_duplicate_merge_behavior[merge-off] + dict({ + 'log': ''' + WARNING vcspull.cli.add:add.py:414 • Duplicate workspace root ~/study/python/ appears 2 times; skipping merge because --no-merge was provided. + INFO vcspull.cli.add:add.py:558 ✓ Successfully added 'pytest-docker' (git+https://github.com/avast/pytest-docker.git) to under '~/study/python/'. + + ''', + 'test_id': 'merge-off', + }) +# --- +# name: test_add_repo_duplicate_merge_behavior[merge-on] + dict({ + 'log': ''' + INFO vcspull.cli.add:add.py:395 • Merged 2 duplicate entries for workspace root ~/study/python/ + INFO vcspull.cli.add:add.py:558 ✓ Successfully added 'pytest-docker' (git+https://github.com/avast/pytest-docker.git) to under '~/study/python/'. + + ''', + 'test_id': 'merge-on', + }) +# --- +# name: test_handle_add_command_path_mode[path-auto-confirm] + dict({ + 'log': ''' + INFO vcspull.cli.add:add.py:235 Found new repository to import: + INFO vcspull.cli.add:add.py:236 + pytest-docker (https://github.com/avast/pytest-docker) + INFO vcspull.cli.add:add.py:247 • workspace: ~/study/python/ + INFO vcspull.cli.add:add.py:255 ↳ path: ~/study/python/pytest-docker + INFO vcspull.cli.add:add.py:276 ? Import this repository? [y/N]: y (auto-confirm) + INFO vcspull.cli.add:add.py:374 Config file not found. A new one will be created. + INFO vcspull.cli.add:add.py:558 ✓ Successfully added 'pytest-docker' (git+https://github.com/avast/pytest-docker) to under '~/study/python/'. + + ''', + 'test_id': 'path-auto-confirm', + }) +# --- +# name: test_handle_add_command_path_mode[path-interactive-accept] + dict({ + 'log': ''' + INFO vcspull.cli.add:add.py:235 Found new repository to import: + INFO vcspull.cli.add:add.py:236 + project-alias (https://github.com/example/project) + INFO vcspull.cli.add:add.py:247 • workspace: ~/study/python/ + INFO vcspull.cli.add:add.py:255 ↳ path: ~/study/python/pytest-docker + INFO vcspull.cli.add:add.py:374 Config file not found. A new one will be created. + INFO vcspull.cli.add:add.py:558 ✓ Successfully added 'project-alias' (git+https://github.com/example/project) to under '~/study/python/'. 
+ + ''', + 'test_id': 'path-interactive-accept', + }) +# --- +# name: test_handle_add_command_path_mode[path-interactive-decline] + dict({ + 'log': ''' + INFO vcspull.cli.add:add.py:235 Found new repository to import: + INFO vcspull.cli.add:add.py:236 + pytest-docker (https://github.com/example/decline) + INFO vcspull.cli.add:add.py:247 • workspace: ~/study/python/ + INFO vcspull.cli.add:add.py:255 ↳ path: ~/study/python/pytest-docker + INFO vcspull.cli.add:add.py:290 Aborted import of 'pytest-docker' from + + ''', + 'test_id': 'path-interactive-decline', + }) +# --- +# name: test_handle_add_command_path_mode[path-no-merge] + dict({ + 'log': ''' + INFO vcspull.cli.add:add.py:235 Found new repository to import: + INFO vcspull.cli.add:add.py:236 + pytest-docker (https://github.com/example/no-merge) + INFO vcspull.cli.add:add.py:247 • workspace: ~/study/python/ + INFO vcspull.cli.add:add.py:255 ↳ path: ~/study/python/pytest-docker + INFO vcspull.cli.add:add.py:276 ? Import this repository? [y/N]: y (auto-confirm) + WARNING vcspull.cli.add:add.py:414 • Duplicate workspace root ~/study/python/ appears 2 times; skipping merge because --no-merge was provided. + INFO vcspull.cli.add:add.py:558 ✓ Successfully added 'pytest-docker' (git+https://github.com/example/no-merge) to under '~/study/python/'. + + ''', + 'test_id': 'path-no-merge', + }) +# --- +# name: test_handle_add_command_path_mode[path-no-remote] + dict({ + 'log': ''' + WARNING vcspull.cli.add:add.py:213 Unable to determine git remote for ; using local path in config. + INFO vcspull.cli.add:add.py:235 Found new repository to import: + INFO vcspull.cli.add:add.py:236 + pytest-docker (~/study/python/pytest-docker) + INFO vcspull.cli.add:add.py:247 • workspace: ~/study/python/ + INFO vcspull.cli.add:add.py:255 ↳ path: ~/study/python/pytest-docker + INFO vcspull.cli.add:add.py:276 ? Import this repository? [y/N]: y (auto-confirm) + INFO vcspull.cli.add:add.py:374 Config file not found. A new one will be created. + INFO vcspull.cli.add:add.py:558 ✓ Successfully added 'pytest-docker' () to under '~/study/python/'. + + ''', + 'test_id': 'path-no-remote', + }) +# --- diff --git a/tests/cli/__snapshots__/test_discover.ambr b/tests/cli/__snapshots__/test_discover.ambr new file mode 100644 index 00000000..15368c7a --- /dev/null +++ b/tests/cli/__snapshots__/test_discover.ambr @@ -0,0 +1,15 @@ +# serializer version: 1 +# name: test_discover_repos[discover-no-merge] + dict({ + 'log': ''' + WARNING vcspull.cli.discover:discover.py:272 • Duplicate workspace root ~/code/ appears 2 times; skipping merge because --no-merge was provided. + INFO vcspull.cli.discover:discover.py:469 + Found 1 new repository to import: + INFO vcspull.cli.discover:discover.py:478 + repo3 (git+https://github.com/user/repo3.git) + INFO vcspull.cli.discover:discover.py:527 + Importing 'repo3' (git+https://github.com/user/repo3.git) under '~/code/'. + INFO vcspull.cli.discover:discover.py:546 ✓ Successfully updated . 
+ + ''', + 'test_id': 'discover-no-merge', + }) +# --- diff --git a/tests/cli/test_add.py b/tests/cli/test_add.py index aa959a94..cb1ae0bb 100644 --- a/tests/cli/test_add.py +++ b/tests/cli/test_add.py @@ -2,11 +2,17 @@ from __future__ import annotations +import argparse +import logging +import subprocess +import textwrap import typing as t import pytest +from syrupy.assertion import SnapshotAssertion -from vcspull.cli.add import add_repo +from vcspull.cli.add import add_repo, handle_add_command +from vcspull.util import contract_user_home if t.TYPE_CHECKING: import pathlib @@ -29,6 +35,17 @@ class AddRepoFixture(t.NamedTuple): expected_log_messages: list[str] +def init_git_repo(repo_path: pathlib.Path, remote_url: str | None) -> None: + """Initialize a git repository with an optional origin remote.""" + repo_path.mkdir(parents=True, exist_ok=True) + subprocess.run(["git", "init", "-q", str(repo_path)], check=True) + if remote_url: + subprocess.run( + ["git", "-C", str(repo_path), "remote", "add", "origin", remote_url], + check=True, + ) + + ADD_REPO_FIXTURES: list[AddRepoFixture] = [ AddRepoFixture( test_id="simple-add-default-workspace", @@ -297,3 +314,293 @@ def test_add_repo_creates_new_file( assert "./" in config assert "newrepo" in config["./"] + + +class AddDuplicateMergeFixture(t.NamedTuple): + """Fixture describing duplicate merge toggles for add_repo.""" + + test_id: str + merge_duplicates: bool + expected_repo_names: set[str] + expected_warning: str | None + + +ADD_DUPLICATE_MERGE_FIXTURES: list[AddDuplicateMergeFixture] = [ + AddDuplicateMergeFixture( + test_id="merge-on", + merge_duplicates=True, + expected_repo_names={"repo1", "repo2", "pytest-docker"}, + expected_warning=None, + ), + AddDuplicateMergeFixture( + test_id="merge-off", + merge_duplicates=False, + expected_repo_names={"repo2", "pytest-docker"}, + expected_warning="Duplicate workspace root", + ), +] + + +@pytest.mark.parametrize( + list(AddDuplicateMergeFixture._fields), + ADD_DUPLICATE_MERGE_FIXTURES, + ids=[fixture.test_id for fixture in ADD_DUPLICATE_MERGE_FIXTURES], +) +def test_add_repo_duplicate_merge_behavior( + test_id: str, + merge_duplicates: bool, + expected_repo_names: set[str], + expected_warning: str | None, + tmp_path: pathlib.Path, + monkeypatch: MonkeyPatch, + caplog: t.Any, + snapshot: SnapshotAssertion, +) -> None: + """Duplicate workspace roots log appropriately based on merge toggle.""" + import yaml + + caplog.set_level(logging.INFO) + + monkeypatch.setenv("HOME", str(tmp_path)) + monkeypatch.chdir(tmp_path) + + config_file = tmp_path / ".vcspull.yaml" + config_file.write_text( + textwrap.dedent( + """\ + ~/study/python/: + repo1: + repo: git+https://example.com/repo1.git + ~/study/python/: + repo2: + repo: git+https://example.com/repo2.git + """ + ), + encoding="utf-8", + ) + + add_repo( + name="pytest-docker", + url="git+https://github.com/avast/pytest-docker.git", + config_file_path_str=str(config_file), + path=str(tmp_path / "study/python/pytest-docker"), + workspace_root_path="~/study/python/", + dry_run=False, + merge_duplicates=merge_duplicates, + ) + + with config_file.open(encoding="utf-8") as fh: + config_after = yaml.safe_load(fh) + + assert "~/study/python/" in config_after + repos = config_after["~/study/python/"] + assert set(repos.keys()) == expected_repo_names + + if expected_warning is not None: + assert expected_warning in caplog.text + + normalized_log = caplog.text.replace(str(config_file), "") + snapshot.assert_match({"test_id": test_id, "log": normalized_log}) + + +class 
PathAddFixture(t.NamedTuple): + """Fixture describing CLI path-mode add scenarios.""" + + test_id: str + remote_url: str | None + assume_yes: bool + prompt_response: str | None + dry_run: bool + expected_written: bool + expected_url_kind: str # "remote" or "path" + override_name: str | None + expected_warning: str | None + merge_duplicates: bool + preexisting_yaml: str | None + + +PATH_ADD_FIXTURES: list[PathAddFixture] = [ + PathAddFixture( + test_id="path-auto-confirm", + remote_url="https://github.com/avast/pytest-docker", + assume_yes=True, + prompt_response=None, + dry_run=False, + expected_written=True, + expected_url_kind="remote", + override_name=None, + expected_warning=None, + merge_duplicates=True, + preexisting_yaml=None, + ), + PathAddFixture( + test_id="path-interactive-accept", + remote_url="https://github.com/example/project", + assume_yes=False, + prompt_response="y", + dry_run=False, + expected_written=True, + expected_url_kind="remote", + override_name="project-alias", + expected_warning=None, + merge_duplicates=True, + preexisting_yaml=None, + ), + PathAddFixture( + test_id="path-interactive-decline", + remote_url="https://github.com/example/decline", + assume_yes=False, + prompt_response="n", + dry_run=False, + expected_written=False, + expected_url_kind="remote", + override_name=None, + expected_warning=None, + merge_duplicates=True, + preexisting_yaml=None, + ), + PathAddFixture( + test_id="path-no-remote", + remote_url=None, + assume_yes=True, + prompt_response=None, + dry_run=False, + expected_written=True, + expected_url_kind="path", + override_name=None, + expected_warning="Unable to determine git remote", + merge_duplicates=True, + preexisting_yaml=None, + ), + PathAddFixture( + test_id="path-no-merge", + remote_url="https://github.com/example/no-merge", + assume_yes=True, + prompt_response=None, + dry_run=False, + expected_written=True, + expected_url_kind="remote", + override_name=None, + expected_warning="Duplicate workspace root", + merge_duplicates=False, + preexisting_yaml=""" +~/study/python/: + repo1: + repo: git+https://example.com/repo1.git +~/study/python/: + repo2: + repo: git+https://example.com/repo2.git +""", + ), +] + + +@pytest.mark.parametrize( + list(PathAddFixture._fields), + PATH_ADD_FIXTURES, + ids=[fixture.test_id for fixture in PATH_ADD_FIXTURES], +) +def test_handle_add_command_path_mode( + test_id: str, + remote_url: str | None, + assume_yes: bool, + prompt_response: str | None, + dry_run: bool, + expected_written: bool, + expected_url_kind: str, + override_name: str | None, + expected_warning: str | None, + merge_duplicates: bool, + preexisting_yaml: str | None, + tmp_path: pathlib.Path, + monkeypatch: MonkeyPatch, + caplog: t.Any, + snapshot: SnapshotAssertion, +) -> None: + """CLI path mode prompts and adds repositories appropriately.""" + caplog.set_level(logging.INFO) + + monkeypatch.setenv("HOME", str(tmp_path)) + monkeypatch.chdir(tmp_path) + + repo_path = tmp_path / "study/python/pytest-docker" + init_git_repo(repo_path, remote_url) + + config_file = tmp_path / ".vcspull.yaml" + + if preexisting_yaml is not None: + config_file.write_text(preexisting_yaml, encoding="utf-8") + + expected_input = prompt_response if prompt_response is not None else "y" + monkeypatch.setattr("builtins.input", lambda _: expected_input) + + args = argparse.Namespace( + target=str(repo_path), + url=None, + override_name=override_name, + config=str(config_file), + path=None, + workspace_root_path=None, + dry_run=dry_run, + assume_yes=assume_yes, + 
merge_duplicates=merge_duplicates, + ) + + handle_add_command(args) + + log_output = caplog.text + contracted_path = contract_user_home(repo_path) + + assert "Found new repository to import" in log_output + assert contracted_path in log_output + + normalized_log = log_output.replace(str(config_file), "") + normalized_log = normalized_log.replace(str(repo_path), "") + snapshot.assert_match({"test_id": test_id, "log": normalized_log}) + + if dry_run: + assert "skipped (dry-run)" in log_output + + if assume_yes: + assert "auto-confirm" in log_output + + if expected_warning is not None: + assert expected_warning in log_output + + repo_name = override_name or repo_path.name + + if expected_written: + import yaml + + assert config_file.exists() + with config_file.open(encoding="utf-8") as fh: + config_data = yaml.safe_load(fh) + + workspace = "~/study/python/" + assert workspace in config_data + assert repo_name in config_data[workspace] + + repo_entry = config_data[workspace][repo_name] + expected_url: str + if expected_url_kind == "remote" and remote_url is not None: + cleaned_remote = remote_url.strip() + expected_url = ( + cleaned_remote + if cleaned_remote.startswith("git+") + else f"git+{cleaned_remote}" + ) + else: + expected_url = str(repo_path) + + assert repo_entry == {"repo": expected_url} + else: + if config_file.exists(): + import yaml + + with config_file.open(encoding="utf-8") as fh: + config_data = yaml.safe_load(fh) + if config_data is not None: + workspace = config_data.get("~/study/python/") + if workspace is not None: + assert repo_name not in workspace + assert "Aborted import" in log_output diff --git a/tests/cli/test_discover.py b/tests/cli/test_discover.py index 1a21d4a5..d10e3d11 100644 --- a/tests/cli/test_discover.py +++ b/tests/cli/test_discover.py @@ -6,6 +6,7 @@ import typing as t import pytest +from syrupy.assertion import SnapshotAssertion from vcspull.cli.discover import discover_repos @@ -41,6 +42,8 @@ class DiscoverFixture(t.NamedTuple): preexisting_config: dict[str, t.Any] | None user_input: str | None expected_workspace_labels: set[str] | None + merge_duplicates: bool + preexisting_yaml: str | None DISCOVER_FIXTURES: list[DiscoverFixture] = [ @@ -59,6 +62,8 @@ class DiscoverFixture(t.NamedTuple): preexisting_config=None, user_input=None, expected_workspace_labels={"~/code/"}, + merge_duplicates=True, + preexisting_yaml=None, ), DiscoverFixture( test_id="discover-recursive", @@ -76,6 +81,8 @@ class DiscoverFixture(t.NamedTuple): preexisting_config=None, user_input=None, expected_workspace_labels={"~/code/"}, + merge_duplicates=True, + preexisting_yaml=None, ), DiscoverFixture( test_id="discover-dry-run", @@ -91,6 +98,8 @@ class DiscoverFixture(t.NamedTuple): preexisting_config=None, user_input=None, expected_workspace_labels=None, + merge_duplicates=True, + preexisting_yaml=None, ), DiscoverFixture( test_id="discover-default-config", @@ -106,6 +115,8 @@ class DiscoverFixture(t.NamedTuple): preexisting_config=None, user_input=None, expected_workspace_labels={"~/code/"}, + merge_duplicates=True, + preexisting_yaml=None, ), DiscoverFixture( test_id="discover-workspace-normalization", @@ -125,6 +136,8 @@ class DiscoverFixture(t.NamedTuple): }, user_input=None, expected_workspace_labels={"~/code/"}, + merge_duplicates=True, + preexisting_yaml=None, ), DiscoverFixture( test_id="discover-interactive-confirm", @@ -140,6 +153,8 @@ class DiscoverFixture(t.NamedTuple): preexisting_config=None, user_input="y", expected_workspace_labels={"~/code/"}, + 
merge_duplicates=True, + preexisting_yaml=None, ), DiscoverFixture( test_id="discover-interactive-abort", @@ -155,6 +170,32 @@ class DiscoverFixture(t.NamedTuple): preexisting_config=None, user_input="n", expected_workspace_labels=None, + merge_duplicates=True, + preexisting_yaml=None, + ), + DiscoverFixture( + test_id="discover-no-merge", + repos_to_create=[ + ("repo3", "git+https://github.com/user/repo3.git"), + ], + recursive=False, + workspace_override=None, + dry_run=False, + yes=True, + expected_repo_count=2, + config_relpath=".vcspull.yaml", + preexisting_config=None, + user_input=None, + expected_workspace_labels={"~/code/"}, + merge_duplicates=False, + preexisting_yaml=""" +~/code/: + repo1: + repo: git+https://example.com/repo1.git +~/code/: + repo2: + repo: git+https://example.com/repo2.git +""", ), ] @@ -263,9 +304,12 @@ def test_discover_repos( preexisting_config: dict[str, t.Any] | None, user_input: str | None, expected_workspace_labels: set[str] | None, + merge_duplicates: bool, + preexisting_yaml: str | None, tmp_path: pathlib.Path, monkeypatch: MonkeyPatch, caplog: t.Any, + snapshot: SnapshotAssertion, ) -> None: """Test discovering repositories from filesystem.""" import logging @@ -291,7 +335,9 @@ def test_discover_repos( target_config_file.parent.mkdir(parents=True, exist_ok=True) config_argument = str(target_config_file) - if preexisting_config is not None: + if preexisting_yaml is not None: + target_config_file.write_text(preexisting_yaml, encoding="utf-8") + elif preexisting_config is not None: import yaml target_config_file.write_text( @@ -310,8 +356,13 @@ def test_discover_repos( workspace_root_override=workspace_override, yes=yes, dry_run=dry_run, + merge_duplicates=merge_duplicates, ) + if preexisting_yaml is not None or not merge_duplicates: + normalized_log = caplog.text.replace(str(target_config_file), "") + snapshot.assert_match({"test_id": test_id, "log": normalized_log}) + if dry_run: # In dry-run mode, config file should not be created/modified if expected_repo_count == 0: @@ -383,8 +434,8 @@ def test_discover_config_load_edges( if mode == "non_dict": monkeypatch.setattr( - "vcspull.cli.discover.ConfigReader._from_file", - lambda _path: ["invalid"], + "vcspull.cli.discover.DuplicateAwareConfigReader.load_with_duplicates", + lambda _path: (["invalid"], {}), ) else: # mode == "exception" @@ -393,7 +444,7 @@ def _raise(_path: pathlib.Path) -> t.NoReturn: raise ValueError(error_message) monkeypatch.setattr( - "vcspull.cli.discover.ConfigReader._from_file", + "vcspull.cli.discover.DuplicateAwareConfigReader.load_with_duplicates", _raise, ) diff --git a/tests/test_config.py b/tests/test_config.py index 24ab02b4..f7fcf5f7 100644 --- a/tests/test_config.py +++ b/tests/test_config.py @@ -2,6 +2,8 @@ from __future__ import annotations +import logging +import textwrap import typing as t import pytest @@ -143,3 +145,67 @@ def test_extract_repos_injects_workspace_root( assert repo["workspace_root"] == expected_root expected_path = config.expand_dir(pl.Path(expected_root), cwd=tmp_path) / name assert repo["path"] == expected_path + + +def _write_duplicate_config(tmp_path: pathlib.Path) -> pathlib.Path: + config_path = tmp_path / "config.yaml" + config_path.write_text( + textwrap.dedent( + """\ + ~/workspace/: + alpha: + repo: git+https://example.com/alpha.git + ~/workspace/: + beta: + repo: git+https://example.com/beta.git + """ + ), + encoding="utf-8", + ) + return config_path + + +def test_load_configs_merges_duplicate_workspace_roots( + tmp_path: pathlib.Path, + 
caplog: pytest.LogCaptureFixture, + monkeypatch: pytest.MonkeyPatch, +) -> None: + """Duplicate workspace roots are merged to keep every repository.""" + monkeypatch.setenv("HOME", str(tmp_path)) + caplog.set_level(logging.INFO, logger="vcspull.config") + + config_path = _write_duplicate_config(tmp_path) + + repos = config.load_configs([config_path], cwd=tmp_path) + + repo_names = {repo["name"] for repo in repos} + assert repo_names == {"alpha", "beta"} + + merged_messages = [message for message in caplog.messages if "merged" in message] + assert merged_messages, "Expected a merge log entry for duplicate roots" + + +def test_load_configs_can_skip_merging_duplicates( + tmp_path: pathlib.Path, + caplog: pytest.LogCaptureFixture, + monkeypatch: pytest.MonkeyPatch, +) -> None: + """The merge step can be skipped while still warning about duplicates.""" + monkeypatch.setenv("HOME", str(tmp_path)) + caplog.set_level(logging.WARNING, logger="vcspull.config") + + config_path = _write_duplicate_config(tmp_path) + + repos = config.load_configs( + [config_path], + cwd=tmp_path, + merge_duplicates=False, + ) + + repo_names = {repo["name"] for repo in repos} + assert repo_names == {"beta"} + + warning_messages = [ + message for message in caplog.messages if "duplicate" in message + ] + assert warning_messages, "Expected a warning about duplicate workspace roots" diff --git a/tests/test_config_reader.py b/tests/test_config_reader.py new file mode 100644 index 00000000..0380d378 --- /dev/null +++ b/tests/test_config_reader.py @@ -0,0 +1,92 @@ +"""Tests for config reader utilities.""" + +from __future__ import annotations + +import pathlib +import textwrap + +from vcspull._internal.config_reader import DuplicateAwareConfigReader + + +def _write(tmp_path: pathlib.Path, name: str, content: str) -> pathlib.Path: + file_path = tmp_path / name + file_path.write_text(content, encoding="utf-8") + return file_path + + +def test_duplicate_aware_reader_records_yaml_duplicates(tmp_path: pathlib.Path) -> None: + """YAML duplicate workspace roots are captured without data loss.""" + yaml_content = textwrap.dedent( + """\ + ~/study/python/: + repo1: + repo: git+https://example.com/repo1.git + ~/study/python/: + repo2: + repo: git+https://example.com/repo2.git + """ + ) + config_path = _write(tmp_path, "config.yaml", yaml_content) + + reader = DuplicateAwareConfigReader.from_file(config_path) + + assert reader.content == { + "~/study/python/": { + "repo2": {"repo": "git+https://example.com/repo2.git"}, + }, + } + assert "~/study/python/" in reader.duplicate_sections + captured = reader.duplicate_sections["~/study/python/"] + assert len(captured) == 2 + assert captured[0] == { + "repo1": {"repo": "git+https://example.com/repo1.git"}, + } + assert captured[1] == { + "repo2": {"repo": "git+https://example.com/repo2.git"}, + } + + +def test_duplicate_aware_reader_handles_yaml_without_duplicates( + tmp_path: pathlib.Path, +) -> None: + """YAML without duplicate keys reports an empty duplicate map.""" + yaml_content = textwrap.dedent( + """\ + ~/code/: + repo: + repo: git+https://example.com/repo.git + """ + ) + config_path = _write(tmp_path, "single.yaml", yaml_content) + + reader = DuplicateAwareConfigReader.from_file(config_path) + + assert reader.content == { + "~/code/": { + "repo": {"repo": "git+https://example.com/repo.git"}, + }, + } + assert reader.duplicate_sections == {} + + +def test_duplicate_aware_reader_passes_through_json(tmp_path: pathlib.Path) -> None: + """JSON configs remain supported and duplicates remain 
empty.""" + json_content = textwrap.dedent( + """\ + { + "~/code/": { + "repo": {"repo": "git+https://example.com/repo.git"} + } + } + """ + ) + config_path = _write(tmp_path, "config.json", json_content) + + reader = DuplicateAwareConfigReader.from_file(config_path) + + assert reader.content == { + "~/code/": { + "repo": {"repo": "git+https://example.com/repo.git"}, + }, + } + assert reader.duplicate_sections == {} diff --git a/tests/test_log.py b/tests/test_log.py index 6631b272..f1fd0400 100644 --- a/tests/test_log.py +++ b/tests/test_log.py @@ -487,7 +487,6 @@ def test_setup_logger_leaves_cli_loggers_propagating(caplog: LogCaptureFixture) for logger_name in [ "vcspull.cli.add", - "vcspull.cli.add_from_fs", "vcspull.cli.sync", ]: logger = logging.getLogger(logger_name)