Fix main minor issues #174
Review Summary by Qodo
Add file staging, fix node names, and improve evaluation workflow
Walkthroughs
Description
• Add file staging logic to CLI for FILE_ANALYZER tool integration
• Fix node name mismatches in workflow output processing
• Improve evaluation script with auto-loading local CSV datasets
• Add session management and logging to CLI workflow
• Update documentation with file attachment and evaluation instructions
Diagram
flowchart LR
CLI["CLI Input<br/>-f flag"]
STAGE["_prepare_session_files<br/>Validates & copies files"]
SESSION["Session Input Dir<br/>Staged files"]
TOOL["FILE_ANALYZER Tool<br/>Discovers files"]
EVAL["Evaluation Script<br/>Auto-load CSV"]
LANGSMITH["LangSmith Dataset<br/>Create if missing"]
CLI -->|file paths| STAGE
STAGE -->|validates| SESSION
SESSION -->|reads| TOOL
EVAL -->|check exists| LANGSMITH
EVAL -->|load local| LANGSMITH
File Changes
1. app/core/main.py
Code Review by Qodo
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (1)
Summary by CodeRabbit
Walkthrough
Adds CLI support to stage local files into a per-session input directory and session-scoped logging; refactors the LangSmith evaluation script to provision datasets and run evaluations from a main() entry point; updates README docs for CLI/file usage and LangSmith instructions; and renames a couple of Streamlit LangGraph dispatch keys.
Changes
Session File Attachment & CLI
LangSmith Evaluation (automated evaluator)
Streamlit LangGraph Dispatch Key Rename
Tests / Test Harness
Sequence Diagram(s)
sequenceDiagram
participant CLI as CLI (user)
participant Main as app.core.main
participant FS as Session Input Dir (filesystem)
participant Workflow as Workflow Runner
participant LangSmith as LangSmith Client
participant OpenAI as OpenAI API
CLI->>Main: start with -f files and -q, api_key
Main->>Main: create_user_session(), initialize_session_context(session_id)
Main->>FS: _prepare_session_files(session_id, file_paths) (validate & copy)
FS-->>Main: staging success / error
Main->>Workflow: create_workflow(session_id, api_key=openai_key)
CLI->>Workflow: invoke workflow with session inputs
Workflow->>OpenAI: LLM call (uses provided OpenAI key)
Workflow->>LangSmith: (optional) report/run evaluation
LangSmith-->>Main: dataset/run status
Main-->>CLI: output results / error
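The staging step in the diagram (validate, then copy into the session input directory) could look roughly like the sketch below. It is not the PR's implementation: the real helper is `_prepare_session_files(session_id, file_paths)`, while this hypothetical `prepare_session_files` takes the target directory directly because the session-to-directory mapping is not shown here. The exception class is a local stand-in that reuses the name mentioned in the review.

```python
import shutil
from pathlib import Path


class SessionFilePreparationError(Exception):
    """Stand-in for the PR's error type; raised when a file cannot be staged."""


def prepare_session_files(input_dir, file_paths):
    """Validate every path first, then copy the files into input_dir.

    Hypothetical helper: validating up front means a bad path aborts
    before anything is created or copied.
    """
    sources = [Path(p).expanduser() for p in file_paths]
    for source in sources:
        if not source.is_file():
            raise SessionFilePreparationError(f"Not a readable file: {source}")

    target_dir = Path(input_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    staged = []
    for source in sources:
        destination = target_dir / source.name
        shutil.copy2(source, destination)  # copy2 also preserves timestamps
        staged.append(destination)
    return staged
```

Validating all paths before the first copy keeps a failed run from leaving a half-populated input directory behind.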
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
PR Reviewer Guide 🔍
Here are some key observations to aid the review process:
PR Code Suggestions ✨
Explore these optional code suggestions:
Actionable comments posted: 2
🧹 Nitpick comments (1)
app/core/main.py (1)
314-332: 💤 Low value
Stage files before initializing models.
File staging (`_prepare_session_files`) currently runs after `langsmith_setup()` and `llm_creation(...)`. If a user supplies an invalid `-f` path, we still pay the cost of LLM client construction and LangSmith setup before failing. Validating user-supplied inputs first is cheaper and gives faster feedback on bad CLI arguments.
♻️ Proposed reordering
     # Create a user session (mirrors the Streamlit session lifecycle) and
     # reconfigure the logger so subsequent CLI logs land in the session file.
     session_id = create_user_session()
     initialize_session_context(session_id)
     global logger
     logger = setup_logger(__name__)
+
+    # Stage user-provided files into the session's input directory early so
+    # we fail fast before incurring model/LangSmith initialization costs.
+    if args.file:
+        try:
+            _prepare_session_files(session_id, args.file)
+        except SessionFilePreparationError as exc:
+            logger.error(str(exc))
+            print(f"Error: {exc}")
+            return
+
     # Initialize LangSmith if available
     langsmith_setup()

     # Get endpoint URL from arguments or environment
     endpoint_url = (
         args.endpoint
         or os.environ.get("KG_ENDPOINT_URL")
         or "https://enpkg.commons-lab.org/graphdb/repositories/ENPKG"
     )

     models = llm_creation(api_key=args.api_key)
-
-    # Stage user-provided files into the session's input directory
-    if args.file:
-        try:
-            _prepare_session_files(session_id, args.file)
-        except SessionFilePreparationError as exc:
-            logger.error(str(exc))
-            print(f"Error: {exc}")
-            return
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/core/main.py` around lines 314 - 332, Move the user file staging to run before expensive setup: validate and call _prepare_session_files(session_id, args.file) (handling SessionFilePreparationError) before calling langsmith_setup() and llm_creation(api_key=args.api_key); specifically, check args.file early, run the try/except around _prepare_session_files to log and exit on error, and only if staging succeeds proceed to call langsmith_setup() and llm_creation() so we avoid constructing LLM clients or initializing LangSmith when CLI file inputs are invalid.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/core/tests/evaluation.py`:
- Around line 18-76: The module runs all evaluation logic at import time
(load_dotenv(), Client(...), dataset bootstrap, etc.); move that top-level logic
into a new main() function and wrap invocation with if __name__ == "__main__":
main() so imports no longer execute side effects. Specifically, relocate the
calls that perform key checks, env var mutation, Client construction, dataset
existence check (client.read_dataset / client.create_dataset), CSV parsing loop,
and client.create_examples into main(), keep helper imports and constants (e.g.,
dataset_name, local_data_path) at module scope if needed, and call main() only
under the __name__ guard.
- Around line 45-54: Replace the broad exception handler around
client.read_dataset with a handler for the specific LangSmithNotFoundError so
only a 404 (dataset-not-found) triggers the create-from-local flow (use the
LangSmithNotFoundError class where exceptions are raised by
client.read_dataset); and when raising the FileNotFoundError for missing
local_data_path, use exception chaining suppression by raising
FileNotFoundError(...) from None. Ensure you reference the read_dataset call and
the local_data_path/FileNotFoundError site when making the change.
---
Nitpick comments:
In `@app/core/main.py`:
- Around line 314-332: Move the user file staging to run before expensive setup:
validate and call _prepare_session_files(session_id, args.file) (handling
SessionFilePreparationError) before calling langsmith_setup() and
llm_creation(api_key=args.api_key); specifically, check args.file early, run the
try/except around _prepare_session_files to log and exit on error, and only if
staging succeeds proceed to call langsmith_setup() and llm_creation() so we
avoid constructing LLM clients or initializing LangSmith when CLI file inputs
are invalid.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: b4e3bdce-8093-408b-b749-3dd6b539fb3c
⛔ Files ignored due to path filters (2)
app/data/big_benchmark.csv is excluded by !**/*.csv, !**/*.csv
app/data/evaluation_dataset.csv is excluded by !**/*.csv, !**/*.csv
📒 Files selected for processing (5)
README.md
app/core/main.py
app/core/tests/evaluation.py
docs/examples/langsmith-evaluation.md
streamlit_webapp/streamlit_utils.py
by __name__ == __main__ so importing the module no longer triggers
network calls or env mutations
- Replace broad except Exception with LangSmithNotFoundError for the
read_dataset call; add from None to the FileNotFoundError raise
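The two changes described above (an import-safe entry point, and a narrow not-found handler with `from None`) can be sketched offline as follows. This is an illustration under stated assumptions, not the code in `app/core/tests/evaluation.py`: `FakeClient`, `DatasetNotFoundError`, and `ensure_dataset` are hypothetical stand-ins so the example runs without network access. In the real script the client is a LangSmith `Client` and the exception is `LangSmithNotFoundError`.

```python
import csv
import tempfile
from pathlib import Path


class DatasetNotFoundError(Exception):
    """Hypothetical stand-in for LangSmithNotFoundError."""


class FakeClient:
    """Hypothetical in-memory client used only to keep the sketch offline."""

    def __init__(self):
        self.datasets = {}

    def read_dataset(self, dataset_name):
        if dataset_name not in self.datasets:
            raise DatasetNotFoundError(dataset_name)
        return self.datasets[dataset_name]

    def create_dataset(self, dataset_name):
        self.datasets[dataset_name] = []
        return self.datasets[dataset_name]


def ensure_dataset(client, dataset_name, local_data_path):
    """Return the dataset, creating it from a local CSV only when it is missing."""
    try:
        return client.read_dataset(dataset_name=dataset_name)
    except DatasetNotFoundError:
        # Only "not found" reaches this branch; other errors propagate.
        path = Path(local_data_path)
        if not path.is_file():
            # `from None` suppresses the chained not-found traceback.
            raise FileNotFoundError(f"Local dataset not found: {path}") from None
        dataset = client.create_dataset(dataset_name=dataset_name)
        with path.open(newline="", encoding="utf-8") as handle:
            dataset.extend(csv.DictReader(handle))
        return dataset


def main():
    # Side effects live here, so importing this module stays inert.
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = Path(tmp) / "examples.csv"
        csv_path.write_text("question,answer\nq1,a1\n", encoding="utf-8")
        dataset = ensure_dataset(FakeClient(), "demo", csv_path)
        print(f"Loaded {len(dataset)} example(s)")


if __name__ == "__main__":
    main()
```

Catching only the not-found error means authentication or network failures are no longer mistaken for a missing dataset.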
PR Type
enhancement, tests, documentation
Description
Add file staging logic for CLI input files
Update evaluation script to auto-load local CSV datasets
Enhance CLI with file attachment option
Improve documentation with new usage examples
Diagram Walkthrough
File Walkthrough
main.py
Add file staging and enhance CLI options
app/core/main.py
SessionFilePreparationError for error handling
evaluation.py
Update evaluation script for dataset handling
app/core/tests/evaluation.py
streamlit_utils.py
Correct agent name case sensitivity
streamlit_webapp/streamlit_utils.py
README.md
Update README with new usage examples
README.md