Patient case summary#81
ashhass wants to merge 9 commits into patchy631:main from ashhass:patient_case_summary
Conversation
…engineering-hub into patient_case_summary
Walkthrough

This pull request introduces a Patient Case Summary Generator project with updated documentation, a new Streamlit interface, and a comprehensive backend summarization module. The README details system features, installation instructions, API key configurations, and dependency management. The Streamlit application in …

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant U as User
    participant SA as Streamlit App
    participant WF as Workflow Engine
    participant GS as Summarizer Module
    U->>SA: Upload patient JSON file
    SA->>WF: Initiate summary generation
    WF->>GS: Process patient data asynchronously
    GS-->>WF: Return generated case summary
    WF-->>SA: Deliver summary and details
    SA->>U: Display summary (or error messages)
```

```mermaid
sequenceDiagram
    participant GRW as GuidelineWorkflow
    participant CTX as Context
    participant P as PatientInfoEvent
    participant CB as ConditionBundleEvent
    participant MG as MatchGuidelineEvent
    participant MR as MatchGuidelineResultEvent
    participant GC as GenerateCaseSummaryEvent
    participant SE as StopEvent
    GRW->>GRW: parse_patient_info(ctx)
    GRW-->>P: Emit PatientInfoEvent
    GRW->>GRW: create_condition_bundles(P)
    GRW-->>CB: Emit ConditionBundleEvent
    GRW->>GRW: dispatch_guideline_match(CB)
    GRW-->>MG: Emit MatchGuidelineEvent
    GRW->>GRW: handle_guideline_match(MG)
    GRW-->>MR: Emit MatchGuidelineResultEvent
    GRW->>GRW: gather_guideline_match(MR)
    GRW-->>GC: Emit GenerateCaseSummaryEvent
    GRW->>GRW: generate_output(GC)
    GRW-->>SE: Emit StopEvent
```
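The second diagram's event chain can be illustrated with a plain-Python sketch. The event classes and step functions below are simplified stand-ins for the llama_index workflow types used in the PR, not the project's actual code:

```python
from dataclasses import dataclass

# Hypothetical minimal event types mirroring the diagram; the real project
# uses llama_index workflow events carrying richer payloads.
@dataclass
class PatientInfoEvent:
    info: dict

@dataclass
class ConditionBundleEvent:
    bundles: list

@dataclass
class GenerateCaseSummaryEvent:
    matches: list

@dataclass
class StopEvent:
    result: str

def parse_patient_info(raw: dict) -> PatientInfoEvent:
    # In the real workflow this parses the uploaded patient JSON.
    return PatientInfoEvent(info=raw)

def create_condition_bundles(ev: PatientInfoEvent) -> ConditionBundleEvent:
    return ConditionBundleEvent(bundles=ev.info.get("conditions", []))

def match_guidelines(ev: ConditionBundleEvent) -> GenerateCaseSummaryEvent:
    # Each bundle would be matched against retrieved guidelines here.
    return GenerateCaseSummaryEvent(matches=[f"guideline for {c}" for c in ev.bundles])

def generate_output(ev: GenerateCaseSummaryEvent) -> StopEvent:
    return StopEvent(result="; ".join(ev.matches))

def run_workflow(raw: dict) -> str:
    # Steps fire in sequence, each consuming the previous step's event.
    ev1 = parse_patient_info(raw)
    ev2 = create_condition_bundles(ev1)
    ev3 = match_guidelines(ev2)
    return generate_output(ev3).result
```

Each step consumes one event type and emits the next, which is what lets the real workflow engine dispatch steps by event type rather than by call order.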
Actionable comments posted: 4
🧹 Nitpick comments (4)
patient-case-summarizer/README.md (1)
51-60: Reduce excessive exclamation marks.

Static analysis flagged that line 60 may use too many exclamation marks. This is a style preference, but consider toning them down to maintain a more professional tone.
```diff
-## Contribution
-Contributions are welcome! Please fork the repository and submit a pull request with your improvements.
+## Contribution
+Contributions are welcome. Please fork the repository and submit a pull request with your improvements.
```

🧰 Tools
🪛 LanguageTool
[style] ~60-~60: Using many exclamation marks might seem excessive (in this case: 4 exclamation marks for a text that’s 1254 characters long)
Context: ... Contribution Contributions are welcome! Please fork the repository and submit a...(EN_EXCESSIVE_EXCLAMATION)
patient-case-summarizer/app.py (1)

133-135: Consider refactoring the asynchronous call approach.

Calling asyncio.run(run_workflow()) within the Streamlit button callback might conflict with Streamlit's event loop if you plan to scale or incorporate more async operations. A better approach could be integrating the async code with Streamlit's experimental async APIs or running tasks in a separate thread.

patient-case-summarizer/summarizer_code.py (2)
439-439: Use a context manager for file operations.

For enhanced readability and safety, replace explicit file open/close with a context manager.

```diff
- patient_info_dict = json.load(open(str(patient_info_path), "r"))
+ with open(str(patient_info_path), "r") as f:
+     patient_info_dict = json.load(f)
```

🧰 Tools
🪛 Ruff (0.8.2)
439-439: Use a context manager for opening files
(SIM115)
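The context-manager pattern recommended here can be demonstrated end to end on a throwaway file (the path and payload below are illustrative, not from the PR):

```python
import json
import os
import tempfile

# Create a small JSON file to stand in for the patient-info file.
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as f:
    json.dump({"patient": "demo"}, f)

# The handle is closed automatically when the block exits,
# even if json.load raises on malformed input.
with open(path, "r") as f:
    patient_info_dict = json.load(f)

os.unlink(path)
```

With `json.load(open(...))`, closing the file is left to garbage collection; the `with` form guarantees it deterministically, which is exactly what Ruff's SIM115 rule checks for.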
469-469: Use a context manager for file operations.

Similarly, when loading condition bundles, a context manager ensures proper resource handling.

```diff
- json.load(open(str(condition_info_path), "r"))
+ with open(str(condition_info_path), "r") as fp:
+     json.load(fp)
```

🧰 Tools
🪛 Ruff (0.8.2)
469-469: Use a context manager for opening files
(SIM115)
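The separate-thread alternative suggested in the app.py nitpick above can be sketched as follows. This is a minimal illustration, not the PR's code: `run_workflow` is a stand-in coroutine, and the helper simply runs it on a fresh event loop in a worker thread so it never touches the host framework's loop:

```python
import asyncio
import threading

async def run_workflow():
    # Stand-in for the real workflow coroutine in app.py.
    await asyncio.sleep(0)
    return "summary"

def run_async_in_thread(coro_factory):
    """Run a coroutine on a fresh event loop in a worker thread,
    avoiding conflicts with a host framework's own event loop."""
    result = {}

    def target():
        # asyncio.run creates and tears down a new loop in this thread.
        result["value"] = asyncio.run(coro_factory())

    t = threading.Thread(target=target)
    t.start()
    t.join()
    return result["value"]

summary = run_async_in_thread(run_workflow)
```

Blocking on `t.join()` keeps the calling code synchronous, which fits a button-callback style of UI; error propagation and timeouts would need to be added for production use.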
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
patient-case-summarizer/demo.mov is excluded by !**/*.mov
📒 Files selected for processing (3)
patient-case-summarizer/README.md (1 hunks)
patient-case-summarizer/app.py (1 hunks)
patient-case-summarizer/summarizer_code.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
patient-case-summarizer/README.md
[style] ~60-~60: Using many exclamation marks might seem excessive (in this case: 4 exclamation marks for a text that’s 1254 characters long)
Context: ... Contribution Contributions are welcome! Please fork the repository and submit a...
(EN_EXCESSIVE_EXCLAMATION)
🪛 Ruff (0.8.2)
patient-case-summarizer/app.py
4-4: pathlib.Path imported but unused
Remove unused import: pathlib.Path
(F401)
5-5: patient_case_summary.LogEvent imported but unused
Remove unused import: patient_case_summary.LogEvent
(F401)
patient-case-summarizer/summarizer_code.py
9-9: Redefinition of unused List from line 5
Remove definition: List
(F811)
11-11: llama_index.llms.ollama.Ollama imported but unused
Remove unused import: llama_index.llms.ollama.Ollama
(F401)
439-439: Use a context manager for opening files
(SIM115)
469-469: Use a context manager for opening files
(SIM115)
612-612: IPython.display.clear_output imported but unused
Remove unused import: IPython.display.clear_output
(F401)
🔇 Additional comments (1)
patient-case-summarizer/README.md (1)
1-49: Documentation looks good.

Your README provides clear instructions for setting up and running the Patient Case Summarizer. The explanation of the LlamaCloud setup and OpenAI API key usage is well-structured.
patient-case-summarizer/app.py (outdated)

```python
from pathlib import Path
from patient_case_summary import GuidelineRecommendationWorkflow, retriever, LogEvent
```
🛠️ Refactor suggestion
Remove unused imports.
According to static analysis, pathlib.Path and LogEvent are imported but never used. Cleaning them up helps maintain clarity and reduce clutter.
```diff
-from pathlib import Path
-from patient_case_summary import GuidelineRecommendationWorkflow, retriever, LogEvent
+from patient_case_summary import GuidelineRecommendationWorkflow, retriever
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
🧰 Tools
🪛 Ruff (0.8.2)
4-4: pathlib.Path imported but unused
Remove unused import: pathlib.Path
(F401)
5-5: patient_case_summary.LogEvent imported but unused
Remove unused import: patient_case_summary.LogEvent
(F401)
```python
draw_all_possible_flows(GuidelineRecommendationWorkflow, filename="guideline_rec_workflow.html")

# Run the workflow
from IPython.display import clear_output
```
🛠️ Refactor suggestion
Remove unused import “clear_output.”
This import is never used, so it can be safely deleted to clean up dependencies.
```diff
-from IPython.display import clear_output
```
🧰 Tools
🪛 Ruff (0.8.2)
612-612: IPython.display.clear_output imported but unused
Remove unused import: IPython.display.clear_output
(F401)
```python
from pathlib import Path
from datetime import datetime
from pydantic import BaseModel, Field
from typing import Optional, List, Tuple
```
🛠️ Refactor suggestion
Remove redundant re-import of “List.”
List is already imported at line 5. You can remove the redefinition in line 9 to avoid duplicates.
```diff
-from typing import Optional, List, Tuple
+from typing import Optional, Tuple
```
🧰 Tools
🪛 Ruff (0.8.2)
9-9: Redefinition of unused List from line 5
Remove definition: List
(F811)
```python
from pydantic import BaseModel, Field
from typing import Optional, List, Tuple

from llama_index.llms.ollama import Ollama
```
🛠️ Refactor suggestion
Remove unused import “Ollama.”
Static analysis detected that llama_index.llms.ollama.Ollama is not used.
```diff
-from llama_index.llms.ollama import Ollama
```
🧰 Tools
🪛 Ruff (0.8.2)
11-11: llama_index.llms.ollama.Ollama imported but unused
Remove unused import: llama_index.llms.ollama.Ollama
(F401)
Actionable comments posted: 2
🧹 Nitpick comments (7)
patient-case-summarizer/app.py (1)
117-117: Avoid hardcoding model and URL values.

The model name ("deepseek-r1:7b") and Ollama's base URL are hardcoded. This limits flexibility and could cause issues if the application is deployed in different environments.

```diff
-llm = Ollama(model="deepseek-r1:7b", base_url="http://localhost:11434")
+# Load from environment variables with defaults
+import os
+ollama_model = os.environ.get("OLLAMA_MODEL", "deepseek-r1:7b")
+ollama_base_url = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
+llm = Ollama(model=ollama_model, base_url=ollama_base_url)
```

patient-case-summarizer/summarizer_code.py (6)
437-439: Use context manager for file operations.

Using a context manager for file operations is safer and ensures resources are properly released even if an exception occurs.

```diff
- patient_info_dict = json.load(open(str(patient_info_path), "r"))
+ with open(str(patient_info_path), "r") as file:
+     patient_info_dict = json.load(file)
```

🧰 Tools
🪛 Ruff (0.8.2)
437-437: Use a context manager for opening files
(SIM115)
463-473: Use context manager for file operations and improve error handling.

Similar to the previous comment, use a context manager for file operations. Also, consider adding error handling for potential JSON parsing issues.

```diff
 if condition_info_path.exists():
-    condition_bundles = ConditionBundles.model_validate(
-        json.load(open(str(condition_info_path), "r"))
-    )
+    try:
+        with open(str(condition_info_path), "r") as file:
+            condition_bundles = ConditionBundles.model_validate(
+                json.load(file)
+            )
+    except (json.JSONDecodeError, ValueError) as e:
+        logging.error(f"Error loading condition bundles: {e}")
+        # Regenerate if corrupt
+        condition_bundles = await create_condition_bundles(ev.patient_info)
+        with open(condition_info_path, "w") as fp:
+            fp.write(condition_bundles.model_dump_json())
```

🧰 Tools
🪛 Ruff (0.8.2)
467-467: Use a context manager for opening files
(SIM115)
513-516: Potential for optimization.

The code fetches guidelines for each query individually and updates a dictionary. This could be more efficiently implemented by batching the requests or using a more efficient data structure.

```diff
-guideline_docs_dict = {}
-# fetch all relevant guidelines as text
-for query in guideline_queries.queries:
-    if self._verbose:
-        ctx.write_event_to_stream(LogEvent(msg=f">> Generating query: {query}"))
-    cur_guideline_docs = self.guideline_retriever.retrieve(query)
-    guideline_docs_dict.update({
-        d.id_: d for d in cur_guideline_docs
-    })
+# Fetch all relevant guidelines at once
+guideline_docs_dict = {}
+if guideline_queries.queries:
+    combined_query = " OR ".join(f"({query})" for query in guideline_queries.queries)
+    if self._verbose:
+        ctx.write_event_to_stream(LogEvent(msg=f">> Generating combined query: {combined_query}"))
+    all_guideline_docs = self.guideline_retriever.retrieve(combined_query)
+    guideline_docs_dict = {d.id_: d for d in all_guideline_docs}
```

Note: This solution assumes the retriever supports the OR operator in queries. If not, the original implementation might be necessary.
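The dict-merge deduplication that the original per-query loop relies on (and that any batched replacement must preserve) can be seen in miniature. The retriever below is a stub keyed on substring match, purely for illustration:

```python
# Stub retriever: returns (doc_id, text) pairs. Overlapping queries
# return overlapping documents, which the dict merge deduplicates by id.
def retrieve(query):
    corpus = {
        "d1": "diabetes guideline",
        "d2": "hypertension guideline",
    }
    return [(doc_id, text) for doc_id, text in corpus.items() if query in text]

queries = ["diabetes", "guideline"]

# Per-query loop with dict-based dedup (mirrors the structure of the
# original code): a document retrieved by two queries appears once.
docs = {}
for q in queries:
    docs.update({doc_id: text for doc_id, text in retrieve(q)})
```

Because documents are keyed by id, repeated retrievals are idempotent; a batched version must also key results by id or it will return duplicates.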
596-603: Duplicated workflow initialization.

This workflow initialization is duplicated - one instance in the class and one here. Consider removing this global instance or making it configurable if needed for testing.

```diff
-llm = OpenAI(model="gpt-4o")
-workflow = GuidelineRecommendationWorkflow(
-    guideline_retriever=retriever,
-    llm=llm,
-    verbose=True,
-    timeout=None,  # don't worry about timeout to make sure it completes
-)
+# Factory function to create workflow with default settings
+def create_workflow(
+    retriever=retriever,
+    model="gpt-4o",
+    verbose=True,
+    timeout=None
+):
+    llm = OpenAI(model=model)
+    return GuidelineRecommendationWorkflow(
+        guideline_retriever=retriever,
+        llm=llm,
+        verbose=verbose,
+        timeout=timeout
+    )
```
605-607: Visualization code should be conditional.

This visualization code will always run when the module is imported. This could cause issues in production environments or when imported by other modules.

```diff
-# Visualize the workflow
-from llama_index.utils.workflow import draw_all_possible_flows
-draw_all_possible_flows(GuidelineRecommendationWorkflow, filename="guideline_rec_workflow.html")
+def visualize_workflow():
+    """Generate a visualization of the workflow for documentation purposes."""
+    from llama_index.utils.workflow import draw_all_possible_flows
+    draw_all_possible_flows(GuidelineRecommendationWorkflow, filename="guideline_rec_workflow.html")
+    print("Workflow visualization generated at guideline_rec_workflow.html")
+
+# Only visualize when explicitly requested
+if os.environ.get("VISUALIZE_WORKFLOW", "").lower() == "true":
+    visualize_workflow()
```
609-622: Parameterize patient file path in main function.

The main() function hardcodes the patient JSON file path, which limits reusability. Consider parameterizing it.

```diff
-async def main():
-    handler = workflow.run(patient_json_path="data/almeta_buckridge.json")
+async def main(patient_json_path="data/almeta_buckridge.json", workflow=None):
+    """Run the workflow on a patient JSON file.
+
+    Args:
+        patient_json_path (str): Path to patient JSON file
+        workflow (GuidelineRecommendationWorkflow, optional): Workflow instance.
+            If None, creates a new instance with default settings.
+
+    Returns:
+        The case summary generated by the workflow.
+    """
+    if workflow is None:
+        workflow = create_workflow()
+
+    handler = workflow.run(patient_json_path=patient_json_path)
```
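The parameterized-main pattern proposed above, in a standalone sketch. The workflow is replaced by a stub coroutine so the shape of the change is visible without the real dependencies:

```python
import asyncio

async def fake_workflow(patient_json_path):
    # Stand-in for workflow.run(...); the real version streams events
    # and returns a generated case summary.
    await asyncio.sleep(0)
    return f"summary for {patient_json_path}"

async def main(patient_json_path="data/almeta_buckridge.json"):
    """Run the (stubbed) workflow on a patient JSON file.

    The path is a parameter with a sensible default, so callers and
    tests can point the entrypoint at any patient file.
    """
    return await fake_workflow(patient_json_path)

# Callers override the default path; tests can pass fixture paths.
result = asyncio.run(main("data/example.json"))
```

Keeping the default argument preserves the original one-liner invocation (`asyncio.run(main())`) while making the entrypoint reusable.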
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
patient-case-summarizer/app.py (1 hunks)
patient-case-summarizer/summarizer_code.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
patient-case-summarizer/summarizer_code.py
437-437: Use a context manager for opening files
(SIM115)
467-467: Use a context manager for opening files
(SIM115)
🔇 Additional comments (2)
patient-case-summarizer/app.py (2)
132-173: Good error handling and result presentation.

The implementation properly handles potential errors during workflow execution and displays results in a structured format. The UI formatting for patient details and condition summary is well-designed.
176-178: Good cleanup practice.

The code properly cleans up temporary files after processing, which helps prevent accumulation of unnecessary files on the server.
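The cleanup pattern praised here — persist an upload to a temporary file, process it, then remove it — in a minimal form. Names and the uploaded payload are illustrative, not taken from app.py:

```python
import os
import tempfile

uploaded_bytes = b'{"name": "demo"}'  # stand-in for an uploaded file's contents

# Persist the upload so file-path-based APIs can read it.
fd, tmp_path = tempfile.mkstemp(suffix=".json")
try:
    with os.fdopen(fd, "wb") as f:
        f.write(uploaded_bytes)
    # ... process tmp_path here (e.g. hand it to the workflow) ...
    processed = os.path.getsize(tmp_path) > 0
finally:
    # Always remove the temporary file, even if processing raises.
    os.unlink(tmp_path)

cleaned_up = not os.path.exists(tmp_path)
```

Putting the `unlink` in a `finally` block is what prevents temporary files from accumulating when processing fails partway through.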
```python
    project_name="llamacloud_demo",
    organization_id="085b9680-2c3a-4fdc-a80c-e4ddf82c380d",
    api_key=os.environ.get("LLAMA_CLOUD_API_KEY")
)
```
🛠️ Refactor suggestion
Security concern: Hardcoded organization ID.
A hardcoded organization ID in production code can be a security risk and reduces code portability.
```diff
 index = LlamaCloudIndex(
     name="medical_guidelines_0",
     project_name="llamacloud_demo",
-    organization_id="085b9680-2c3a-4fdc-a80c-e4ddf82c380d",
+    organization_id=os.environ.get("LLAMA_CLOUD_ORG_ID"),
     api_key=os.environ.get("LLAMA_CLOUD_API_KEY")
 )
```

Also, consider validating that the environment variables are set:
```python
def validate_env_vars():
    required_vars = ["LLAMA_CLOUD_ORG_ID", "LLAMA_CLOUD_API_KEY"]
    missing = [var for var in required_vars if not os.environ.get(var)]
    if missing:
        raise EnvironmentError(f"Missing required environment variables: {', '.join(missing)}")

# Call this at app startup
validate_env_vars()
```

```python
guideline_queries = await llm.astructured_predict(
    GuidelineQueries,
    prompt,
    patient_info=patient_info.demographic_str,
    condition_info=ev.bundle.json()
)
```
|
Missing llm reference.
This code uses llm variable but it's not defined in the current scope - it should be self.llm.
```diff
-guideline_queries = await llm.astructured_predict(
+guideline_queries = await self.llm.astructured_predict(
```
Added an AI Agent that can analyze uploaded patient data and generate medical case summaries for further review by clinicians.