Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Fixed

- **O(N) query regression in bulk folder operations** (Issue #1199): `DocumentFolderService.move_documents_to_folder()` and `DocumentFolderService.delete_folder()` in `opencontractserver/corpuses/folder_service.py` previously issued ~3 DB round-trips per document (an EXISTS-style disambiguation fetch, an `UPDATE` via `save(update_fields=…)`, and an `INSERT` via `DocumentPath.objects.create()`). For a 100-document batch that was ~300 queries instead of the single-query `.update()` the old non-lineage code used. Both methods now:
1. Pre-fetch all occupied paths in the target directory in a **single** query via the new `_fetch_occupied_paths_in_directory` helper and pass the shared mutable set to `_disambiguate_path` via a new `occupied_override` parameter (replaces the single-purpose `extra_occupied` kwarg).
2. Batch-deactivate every superseded `DocumentPath` row with one `.filter(pk__in=…).update(is_current=False)` call.
3. Batch-insert every successor row with one `.bulk_create()` call, then manually dispatch `post_save` (`created=True`) via the new `_dispatch_document_path_created_signals` helper so the document-text embedding side effect wired up in `documents/signals.py::process_doc_on_document_path_create` still fires (bulk_create normally bypasses per-row signals).
4. Use `select_related("document")` + `select_for_update(of=("self",))` on the affected-path query so `current.document` accesses inside the build loop no longer N+1, and the row lock stays scoped to the `DocumentPath` table. Net result: a 100-document bulk move now executes roughly 4 DB round-trips instead of ~300, and the old batched `.update()` performance is restored without sacrificing the Path Tree history nodes introduced in PR #1195.
- **TOCTOU race on `DocumentPath` uniqueness** (Issue #1200): `DocumentFolderService.move_document_to_folder()`, `move_documents_to_folder()`, and `delete_folder()` previously caught `IntegrityError` from the `unique_active_path_per_corpus` partial unique constraint and either bubbled it up to the caller as a "Path conflict, please retry" error (single move) or rolled back the entire batch (bulk move / folder delete). Under concurrent moves of different documents to the same target folder, two transactions could both observe a candidate path as free in `_disambiguate_path()` and race to insert it; the loser hit the partial unique index and the operation failed. New helper `DocumentFolderService._create_successor_path_with_retry()` (`opencontractserver/corpuses/folder_service.py`) wraps the deactivate-then-create pair in a savepoint and retries with a freshly disambiguated path on `IntegrityError`, treating each losing path as occupied. Up to `MAX_PATH_CREATE_RETRIES + 1` attempts (`opencontractserver/constants/document_processing.py`) run before propagating the conflict. The partial unique index added in migration `0023_documentpath_documentpathgroupobjectpermission_and_more` remains the authoritative correctness guarantee. New test classes `TestMoveDocumentIntegrityRecovery`, `TestBulkMoveIntegrityRecovery`, and `TestDeleteFolderIntegrityRecovery` cover transient-failure recovery, disambiguated retry path selection, and exhausted-retry rollback (`opencontractserver/tests/test_document_folder_service.py`).
- **Bulk move loop recomputed target folder path per document** (Issue #1202): `DocumentFolderService.move_documents_to_folder()` (`opencontractserver/corpuses/folder_service.py`) called `_compute_moved_path()` once per document, and each call invoked `target_folder.get_path()` — which walks ancestors via a recursive CTE query. For an N-document bulk move to the same folder, this issued N redundant CTE queries for an O(1) value. The target folder path is now resolved once before the loop and threaded through `_compute_moved_path()` via a new optional `target_folder_path` parameter.
- **Backend coverage disappearing from Codecov dashboard** (`.codecov.yml`): Added a `backend` flag with `carryforward: true` (paths: `opencontractserver/`, `config/`), mirroring the pattern already in place for `frontend-unit` / `frontend-component`. Without carryforward, any commit whose backend `pytest` job fails or times out (e.g. the merge commit `f166a59`, where the push-to-main `pytest` check failed after PR #1213 merged) causes Codecov to record backend files at 0% for that commit, which then replaces the healthy ~90% backend coverage on the dashboard — making it look as if only frontend coverage (~39%) is being collected. With carryforward, a flaky/failing backend run inherits the parent commit's backend coverage so the dashboard continues to reflect both suites.
Expand Down
Loading
Loading