LCORE-625: Check if transcript directory is writable #602

tisnik · 2025-09-30T08:19:17Z

Description

LCORE-625: Check if transcript directory is writable

Type of change

Related Tickets & Documents

Related Issue #LCORE-625

Summary by CodeRabbit

New Features
- Improved validation for configured storage paths: checks now ensure paths are directories and writable when required, and respect whether a path must already exist.
Bug Fixes
- Better error handling and logging when saving transcripts to disk — IO errors are now reported and surfaced with path context.

coderabbitai · 2025-09-30T08:19:27Z

Warning

Rate limit exceeded

@tisnik has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 20 minutes and 14 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 93828af and 3388ab8.

📒 Files selected for processing (2)

tests/unit/models/config/test_user_data_collection.py (2 hunks)
tests/unit/utils/test_checks.py (2 hunks)

Walkthrough

Adds a new directory_check(...) in src/utils/checks.py, uses it from UserDataCollection.check_storage_location_is_set_when_needed in src/models/config.py to validate feedback/transcripts directories, and wraps the file write in store_transcript with an I/O try/except in src/utils/transcripts.py.

Changes

Cohort / File(s)	Summary
New directory validation `src/utils/checks.py`	Added `directory_check(path: FilePath, must_exists: bool, must_be_writable: bool, desc: str) -> None` which verifies existence (when `must_exists=True`), confirms the path is a directory, and optionally checks writability; raises `InvalidConfigurationError` with descriptive messages on failure.
Config validation uses checker `src/models/config.py`	`UserDataCollection.check_storage_location_is_set_when_needed` now calls `checks.directory_check` for `feedback_path` and `transcripts_path` when those features are enabled; retains prior missing-path errors and adds directory/writability validation.
Transcript write error handling `src/utils/transcripts.py`	`store_transcript` now wraps the file write in a try/except catching `IOError`/`OSError`, logs an error including the transcript path on failure, and re-raises the exception.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller as ConfigCheck
  participant Config as models.config:UserDataCollection
  participant Checks as utils.checks:directory_check
  participant FS as Filesystem/OS

  Config->>Checks: directory_check(path, must_exists=True, must_be_writable=..., desc)
  Checks->>FS: stat(path) / isdir(path)
  alt Path missing & must_exists == True
    Checks-->>Config: raise InvalidConfigurationError ("{desc} must be set")
  else Path exists
    alt Not a directory
      Checks-->>Config: raise InvalidConfigurationError ("{desc} is not a directory")
    else Is a directory
      alt must_be_writable == True
        Checks->>FS: access(path, WRITE)
        alt Not writable
          Checks-->>Config: raise InvalidConfigurationError ("{desc} is not writable")
        else Writable
          Checks-->>Config: return None
        end
      else
        Checks-->>Config: return None
      end
    end
  end
  Note over Config,Checks: Called for feedback and transcripts when enabled

sequenceDiagram
  autonumber
  actor Caller as StoreTranscriptCaller
  participant Transcripts as utils.transcripts:store_transcript
  participant FS as Filesystem/OS
  participant Logger as logs

  Caller->>Transcripts: store_transcript(path, payload)
  Transcripts->>FS: open(path, "w") / write(payload)
  alt Write succeeds
    Transcripts->>Logger: log success
    Transcripts-->>Caller: return
  else Write raises IOError/OSError
    Transcripts->>Logger: log error with path
    Transcripts-->>Caller: re-raise exception
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

LCORE-625: catch storing transcript issues #601 — similar change wrapping transcript file write in a try/except and logging I/O errors.

Poem

I twitch my whiskers, check each gate,
A burrow's path I validate.
I poke for write, I sniff the floor,
I log the fall, I guard the door.
Configs neat — I hop once more. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title succinctly references the JIRA issue and accurately summarizes the primary functionality introduced—verifying that the transcript directory is writable—making it clear to reviewers what change to expect without extraneous detail.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🧪 Early access (Sonnet 4.5): enabled

We are currently testing the Sonnet 4.5 model, which is expected to improve code review quality. However, this model may lead to increased noise levels in the review comments. Please disable the early access features if the noise level causes any inconvenience.

Note:

Public repositories are always opted into early access features.
You can enable or disable early access features from the CodeRabbit UI or by updating the CodeRabbit configuration file.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/utils/checks.py (1)
60-60: Consider reordering parameters for consistency.

For consistency with similar validation functions, consider placing desc before should_be_writable, as descriptive parameters typically precede behavioral flags in the codebase.

Apply this diff if you choose to reorder:
-def directory_check(path: FilePath, should_be_writable: bool, desc: str) -> None:
+def directory_check(path: FilePath, desc: str, should_be_writable: bool = False) -> None:
Note: This would require updating all call sites, so only consider if you haven't integrated this function yet.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 761bb29 and 2fa47aa.

📒 Files selected for processing (1)

src/utils/checks.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

src/utils/checks.py

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

src/utils/checks.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: build-pr
GitHub Check: e2e_tests

🔇 Additional comments (1)

src/utils/checks.py (1)

77-82: LGTM - Implementation is correct.

The implementation correctly:

Validates directory existence using os.path.isdir()

Conditionally checks writability with os.access(path, os.W_OK)

Raises appropriate errors with clear messages

Follows the same pattern as file_check

src/utils/checks.py

coderabbitai · 2025-09-30T08:29:26Z

Note

Unit test generation is an Early Access feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Generating unit tests... This may take up to 20 minutes.

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2fa47aa and 86115f9.

📒 Files selected for processing (2)

src/models/config.py (1 hunks)
src/utils/checks.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

src/utils/checks.py

🧰 Additional context used

📓 Path-based instructions (5)

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

src/models/config.py

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

src/models/config.py

src/{models/config.py,configuration.py}

📄 CodeRabbit inference engine (CLAUDE.md)

src/{models/config.py,configuration.py}: All configuration uses Pydantic models extending ConfigurationBase
Configuration base models must set model_config with extra="forbid" to reject unknown fields

Files:

src/models/config.py

src/{models/**/*.py,configuration.py}

📄 CodeRabbit inference engine (CLAUDE.md)

src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)

Files:

src/models/config.py

src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation

Files:

src/models/config.py

🧬 Code graph analysis (1)

src/models/config.py (1)

src/utils/checks.py (1)

directory_check (60-87)

🪛 GitHub Actions: Pyright

src/models/config.py

[error] 238-238: pyright: Argument of type "str" cannot be assigned to parameter "path" of type "FilePath" in function "directory_check". "str" is not assignable to "Path".

[error] 249-249: pyright: Argument of type "str" cannot be assigned to parameter "path" of type "FilePath" in function "directory_check". "str" is not assignable to "Path".

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: build-pr
GitHub Check: e2e_tests

🔇 Additional comments (2)

src/models/config.py (2)
237-242: Incorrect parameter order in directory_check call.

The desc parameter is passed in the wrong position. According to the function signature in src/utils/checks.py, desc is the fourth parameter, not the second.

Apply this diff to fix the parameter order:
             checks.directory_check(
                 self.feedback_storage,
-                desc="Check directory to store feedback",
                 must_exists=False,
                 must_be_writable=True,
+                desc="Check directory to store feedback",
             )
Likely an incorrect or invalid review comment.

248-253: Incorrect parameter order in directory_check call.

The desc parameter is passed in the wrong position. According to the function signature in src/utils/checks.py, desc is the fourth parameter, not the second.

Apply this diff to fix the parameter order:
             checks.directory_check(
                 self.transcripts_storage,
-                must_exists=False,
-                desc="Check directory to store transcripts",
+                must_exists=False,
                 must_be_writable=True,
+                desc="Check directory to store transcripts",
             )
Likely an incorrect or invalid review comment.

src/models/config.py

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (3)

src/utils/checks.py (1)
70-70: Fix grammatical error in docstring.

The parameter description has a grammatical error: "Should the directory exists?" should be "Should the directory exist?"

Apply this diff:
-        must_exists (bool): Should the directory exists?
+        must_exists (bool): Should the directory exist?
src/utils/transcripts.py (2)
83-88: LGTM! Error handling for file write is appropriate.

The try/except block correctly catches I/O errors, logs them with context, and re-raises to propagate the failure. This provides a runtime safety net complementing the upfront directory writability check.

Minor note: In Python 3, IOError is an alias for OSError, making the tuple redundant (though harmless).

Optionally simplify to:
-    except (IOError, OSError) as e:
+    except OSError as e:
60-60: Consider wrapping directory creation in error handling for clearer diagnostics.

The mkdir call could fail due to runtime issues (disk full, permission changes, path conflicts). While the try/except around the file write will catch these failures, wrapping mkdir separately would provide clearer error messages distinguishing directory creation failures from file write failures.
     transcripts_path = construct_transcripts_path(user_id, conversation_id)
-    transcripts_path.mkdir(parents=True, exist_ok=True)
+    try:
+        transcripts_path.mkdir(parents=True, exist_ok=True)
+    except OSError as e:
+        logger.error("Failed to create transcript directory %s: %s", transcripts_path, e)
+        raise

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 745e06d and 93828af.

📒 Files selected for processing (3)

src/models/config.py (1 hunks)
src/utils/checks.py (1 hunks)
src/utils/transcripts.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

src/models/config.py

🧰 Additional context used

📓 Path-based instructions (2)

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

src/utils/checks.py
src/utils/transcripts.py

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

src/utils/checks.py
src/utils/transcripts.py

🧬 Code graph analysis (1)

src/utils/transcripts.py (1)

src/models/config.py (1)

dump (565-568)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: build-pr
GitHub Check: e2e_tests

🔇 Additional comments (1)

src/utils/checks.py (1)

79-87: LGTM! Logic correctly handles conditional existence and writability checks.

The implementation correctly handles all valid parameter combinations:

Early return when path doesn't exist and must_exists=False is appropriate

Subsequent checks (directory type, writability) only run when the path exists

Error messages are clear and include the description context

coderabbitai bot reviewed Sep 30, 2025

View reviewed changes

src/utils/checks.py Show resolved Hide resolved

tisnik force-pushed the lcore-625-check-if-transcript-directory-is-writable branch from 2fa47aa to 86115f9 Compare September 30, 2025 08:47

coderabbitai bot reviewed Sep 30, 2025

View reviewed changes

src/models/config.py Show resolved Hide resolved

lightspeed-core deleted a comment from coderabbitai bot Sep 30, 2025

tisnik force-pushed the lcore-625-check-if-transcript-directory-is-writable branch 2 times, most recently from 745e06d to e4194f1 Compare September 30, 2025 09:46

LCORE-625: catch storing transcript issues

93828af

tisnik force-pushed the lcore-625-check-if-transcript-directory-is-writable branch from e4194f1 to 93828af Compare September 30, 2025 09:48

Updated unit tests accordingly

3388ab8

coderabbitai bot reviewed Sep 30, 2025

View reviewed changes

tisnik merged commit 80c7938 into lightspeed-core:main Sep 30, 2025
18 of 19 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

LCORE-625: Check if transcript directory is writable #602

LCORE-625: Check if transcript directory is writable #602

Uh oh!

tisnik commented Sep 30, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Sep 30, 2025 •

edited

Loading

Rate limit exceeded

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot commented Sep 30, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

LCORE-625: Check if transcript directory is writable #602

LCORE-625: Check if transcript directory is writable #602

Uh oh!

Conversation

tisnik commented Sep 30, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Related Tickets & Documents

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Sep 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot commented Sep 30, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tisnik commented Sep 30, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Sep 30, 2025 •

edited

Loading