Skip to content

Anvil/fix adversarial review findings 1776030818#4

Merged
binhex merged 11 commits into
mainfrom
anvil/fix-adversarial-review-findings-1776030818
Apr 13, 2026
Merged

Anvil/fix adversarial review findings 1776030818#4
binhex merged 11 commits into
mainfrom
anvil/fix-adversarial-review-findings-1776030818

Conversation

@binhex
Copy link
Copy Markdown
Owner

@binhex binhex commented Apr 13, 2026

No description provided.

binhex and others added 11 commits April 13, 2026 11:31
If a download succeeds when apply_speed_check=True, that user is recorded
as the preferred user for the session. On each subsequent album, their
shared files are browsed first (user_browse). If they have the album and
it passes all filters, a direct download is attempted without a speed
check (already proven). preferred_user is cleared only on actual download
failure, not on local post-processing failures. Falls back to full
network search on browse error, no-match, filter failure, or download
failure.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Format-specific tag deletion: MP3 uses mutagen.id3.delete() to remove
  both ID3v1 trailer and ID3v2 header; FLAC uses FLAC.clear_pictures()
  + save(deleteid3=True) to strip embedded pictures and stray ID3 blocks;
  M4A/MP4 clears MP4Tags; generic fallback handles OGG, OPUS, WMA etc.
- strip_audio_tags() now returns bool; _apply_result skips DB record when
  any file fails to strip (prevents album being silently marked complete)
- --strip-tags without --completed-pattern is now a hard UsageError instead
  of a warning, matching the other cross-option validations

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- filter.py (M9): use explicit None checks in filter_by_bitrate/bitdepth;
  treat br==0 as absent so FLAC files with bitRate=0 are not excluded

- organizer.py (H6): skip zero-byte existing dest files rather than
  treating them as already placed; (M10): remove .aac from MP4 handler

- search.py (M11): add word-boundary matching in user_browse dir filter
  to prevent short artist names (e.g. 'Yes', 'Air') matching unrelated
  path components

- download.py (M6, M2, M3/H4, H2/D1, M4, H3/D2, M1/D7, H5/D5):
  - only delete files (not dirs) in _cleanup_incomplete_files
  - exclude Queued/Requested states from cancel-scope when include_active=False
  - guard against transfers missing 'id' key in poll response
  - pass existing_ids to ALL _cancel_transfers_for_filenames call sites
  - retry spontaneous Cancelled same as Errored/TimedOut
  - late-discovery block for transfers enqueued after our poll starts
  - per-file active_since queue timeout replacing all-or-nothing check
  - absolute wallclock ceiling (download_timeout * 3) for InProgress stalls

- cli.py (M7, H1, M5, M8):
  - add envvar='SLSKD_API_KEY' to --slskd-api-key option
  - _apply_result organises succeeded files on partial success; only
    DB-records on full success; DB exceptions caught
  - pass max_queue_length=None for preferred-user and --search-user paths
  - wrap all db.add_skipped calls in try/except

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- search.py: guard state.get('id') in network_search; always delete search
  in finally block; fix _matches_path to use normalised phrase matching so
  separator variants (underscores, dots) match correctly; fix user_browse
  startswith comparison to use normalised forward-slash paths and require
  full dir prefix (dir + '/') to prevent short-prefix false matches
- filter.py: guard resp.get('username') in build_candidates to prevent
  KeyError on malformed responses; use Counter instead of set for
  filter_responses_by_album so duplicate tokens (e.g. 'Yeah Yeah Yeahs')
  require matching frequency in directory name; add explicit parens to
  filter_by_search_type condition for clarity; remove unreachable dead code
  in pick_best_directory (lines 346-348)
- organizer.py: use src_path.resolve().parent.name in use_parent_prefix path
  to ensure resolved path is used; wrap both dest_dir.mkdir() and
  dest.parent.mkdir() in try/except OSError; fix TOCTOU race on dest.stat()
  with FileNotFoundError guard; explicitly unlink zero-byte dest before
  shutil.move for cross-platform compatibility
- scanner.py: add symlink guard in scan_library to skip symlinked directories
- download.py: replace poll_failed flag with time-based API-unreachable
  detection (abort after 60s of consecutive failures); guard new_tf.get('id')
  to skip transfers with missing id; remove remove_completed_downloads() from
  all failure/cancel paths — keep only on the success path so global slskd
  state is not polluted on errors
- cli.py: wrap final db.add_skipped() call in try/except
- database.py: normalise db_path via Path.expanduser().resolve() before
  building SQLAlchemy URL; remove unused os import

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- filter.py: fix _leaf() mixed-separator heuristic (rfind-based)
- filter.py: remove bitdepth==0 pass-through (treat as invalid metadata)
- filter.py: use \w Unicode boundary in filter_by_words regex
- filter.py: deduplicate candidates by username in build_candidates
- search.py: use \w Unicode boundary in _matches_path punctuation fallback
- search.py: normalise constructed path to forward slashes in user_browse
- download.py: handle disappeared transfers (tf is None) with retry/abort
- cli.py: guard db.is_downloaded/is_skipped with try/except
- cli.py: normalise filenames before set comparison in speed fallback
- utils.py: add safe_path_parts() helper (DRY)
- download.py/cli.py: use safe_path_parts() instead of inline expression
- client.py: verify_connection return type -> None, remove return True

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- download.py: include logical_fn in _cleanup_incomplete_files when it has
  already been popped from active (tf-is-None abort path); prevented the
  vanished file's incomplete partial from being cleaned up
- search.py: add re.UNICODE to _matches_path main-branch pattern for
  consistent Unicode word-boundary handling with non-ASCII artist/album names
- search.py: add early return False for empty/whitespace term in _matches_path
  to prevent boundary-only regex matching everything

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- _collect_ids: explicitly str()-coerce IDs to match set[str] annotation
- _cancel_transfers_for_filenames: normalise filename separators/case before
  membership check so forward-slash paths (user_browse) match backslash paths
  that slskd may return in the transfer listing
- _find_new_transfer_id: same normalisation for filename comparison
- batch-retry path: discard logical_fn from ever_started after successful
  re-enqueue so the per-file queue-timeout fires if the retried transfer gets
  stuck in Queued state (previously bypassed indefinitely)
- fallback ID-discovery: guard chosen.get('id') to prevent KeyError when a
  transfer dict lacks an 'id' field, matching the defensive pattern used
  everywhere else in the module

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Round-6 adversarial review found that tf.get('id', '') produces ''
when the 'id' key is absent, which passes the 'not in known_ids'
check, then tf['id'] raises KeyError — crashing download_and_monitor
and bypassing all transfer-cancel / file-cleanup logic.

Fix: extract tid = tf.get('id'); gate on tid is not None before
str(tid) comparison and return, consistent with the pattern already
used in the fallback ID-discovery path (~line 362).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…D coercion

Round-7 adversarial review found two bugs:

1. HIGH: Per-file state machine ran even when poll_failed=True.
   current_by_id is empty on a failed poll, so every active transfer
   appeared as 'vanished' and burned retry budget on a transient API
   blip. With max_retries=3 and multiple files, 3 intermittent poll
   failures would silently abort all downloads. All other timeout/check
   sections already had 'if not poll_failed:' guards — add the same
   guard around the per-file for-loop.

2. MEDIUM: Transfer ID type inconsistency — active dict accumulated a
   mix of raw API values (lines 344, 570) and explicit str() values
   (lines 366, 545). current_by_id was built with raw keys (line 406),
   so current_by_id.get(str_id) could silently return None, triggering
   spurious 'vanished transfer' retries. Fix: str()-ify current_by_id
   keys at build time and all active[fn] = tid assignment sites.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ncel orphaned retries; remove dead filter_by_search_type

Round-8 adversarial review findings:

1. MEDIUM (download.py): All three ID-discovery blocks (initial,
   fallback, late) compared slskd filenames against enqueued_by_fn
   using direct string equality. user_browse constructs forward-slash
   paths but slskd can echo backslash paths — causing zero transfers
   to be tracked and immediate download failure. Fix: build a
   norm_to_fn lookup (fn.replace('\','/').lower() -> canonical fn)
   at enqueue time; apply normalised fallback in all three discovery
   blocks. Consistent with _cancel_transfers_for_filenames and
   _find_new_transfer_id which already normalised.

2. LOW (download.py): When re-enqueue succeeded but
   _find_new_transfer_id returned None, the newly created slskd
   transfer was silently orphaned — consuming bandwidth with no speed
   floor enforcement for the rest of the session. Fix: call
   _cancel_transfers_for_filenames for the untracked filename,
   skipping IDs already in active or existing_ids.

3. LOW (filter.py): filter_by_search_type was dead code — defined
   publicly but never called anywhere. The identical threshold logic
   lived in pick_best_directory._valid_for_type. Removed the
   duplicate function; _SINGLE_MAX_TRACKS/_ALBUM_MIN_TRACKS constants
   remain for _valid_for_type.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…racked

When re-enqueue succeeds but _find_new_transfer_id returns None, cancel
the orphaned transfer and now also call _cleanup_incomplete_files so any
partial file in slskd's incomplete directory is removed. Consistent with
all other abort paths in the monitor loop.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@binhex binhex merged commit a5b2f3d into main Apr 13, 2026
binhex added a commit that referenced this pull request Apr 19, 2026
…gs-1776030818

Anvil/fix adversarial review findings 1776030818
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant