Welcome to the Document Assistant project! This project will help you build a sophisticated document processing system using LangChain and LangGraph. You'll create an AI assistant that can answer questions, summarize documents, and perform calculations on financial and healthcare documents.
This document assistant uses a multi-agent architecture with LangGraph to handle different types of user requests:
- Q&A Agent: Answers specific questions about document content
- Summarization Agent: Creates summaries and extracts key points from documents
- Calculation Agent: Performs mathematical operations on document data
- Python 3.9+
- OpenAI API key
- Clone the repository:
```bash
cd <repository_path>
```
- Create a virtual environment:
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```
- Install dependencies:
```bash
pip install -r requirements.txt
```
- Create a `.env` file:
```bash
cp .env.example .env
# Edit .env and add your OpenAI API key
```

Run the assistant:
```bash
python main.py
```

Project structure:
```
doc_assistant_project/
├── src/
│   ├── schemas.py      # Pydantic models
│   ├── retrieval.py    # Document retrieval
│   ├── tools.py        # Agent tools
│   ├── prompts.py      # Prompt templates
│   ├── agent.py        # LangGraph workflow
│   └── assistant.py    # Main agent
├── sessions/           # Saved conversation sessions
├── main.py             # Entry point
├── requirements.txt    # Dependencies
└── README.md           # This file
```
The LangGraph agent follows this workflow: classify_intent --> [qa_agent | summarization_agent | calculation_agent] --> update_memory --> END
Create a Pydantic model for structured Q&A responses with the following fields:
- `question`: The original user question (string)
- `answer`: The generated answer (string)
- `sources`: List of source document IDs used (list of strings)
- `confidence`: Confidence score between 0 and 1 (float)
- `timestamp`: When the response was generated (datetime)
Purpose: This schema ensures consistent formatting of answers and tracks which documents were referenced.
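A minimal sketch of such a schema, using the fields listed above (the class name `QAResponse` is an assumption here; use whatever name the starter code in `src/schemas.py` expects):

```python
from datetime import datetime
from pydantic import BaseModel, Field

class QAResponse(BaseModel):
    """Structured response produced by the Q&A agent."""
    question: str = Field(description="The original user question")
    answer: str = Field(description="The generated answer")
    sources: list[str] = Field(default_factory=list, description="Source document IDs used")
    confidence: float = Field(ge=0.0, le=1.0, description="Confidence score between 0 and 1")
    timestamp: datetime = Field(default_factory=datetime.now, description="When the response was generated")
```

The `ge`/`le` constraints make Pydantic reject confidence values outside [0, 1] at validation time instead of letting bad values flow downstream.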
Create a Pydantic model for intent classification with these fields:
- `intent_type`: The classified intent ("qa", "summarization", "calculation", or "unknown")
- `confidence`: Confidence in classification (float between 0 and 1)
- `reasoning`: Explanation for the classification (string)
Purpose: This schema helps the system understand what type of request the user is making and route it to the appropriate agent.
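A sketch of this schema (using a `Literal` type to constrain `intent_type` to the four allowed values; adjust to match the starter code in `src/schemas.py`):

```python
from typing import Literal
from pydantic import BaseModel, Field

class UserIntent(BaseModel):
    """Classification of the user's request, used to route the graph."""
    intent_type: Literal["qa", "summarization", "calculation", "unknown"] = Field(
        description="The classified intent"
    )
    confidence: float = Field(ge=0.0, le=1.0, description="Confidence in the classification")
    reasoning: str = Field(description="Explanation for the classification")
```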
The AgentState class is already defined, but you need to understand its structure:
- `user_input`: Current user input
- `messages`: Conversation messages with LangGraph message annotation
- `intent`: Classified user intent
- `next_step`: Next node to execute in the graph
- `conversation_summary`: Summary of recent conversation
- `active_documents`: Document IDs currently being discussed
- `current_response`: The response being built
- `tools_used`: List of tools used in current turn
- `session_id` and `user_id`: Session management
- `actions_taken`: List of agent nodes executed (to be added in Task 2.6)
Implement the classify_intent function:
The classify_intent function is the first node in the graph. Its purpose is to query the LLM with the user's input
and message history (if any exists), instructing the LLM to classify the intent so that the graph can direct the request to the appropriate node.
Some of the code for this function is already provided, but you need to complete it by doing the following steps:
- Configure the `llm` to use structured output with the `UserIntent` schema
- Create a prompt by calling the `get_intent_classification_prompt()` function from `prompts.py` (HINT: you will need to call `format` on the returned value and pass in the `user_input` and `conversation_history`)
- Make sure you read the prompt in `prompts.py` to understand what input variables it expects and what it is asking the LLM to return
- Invoke the LLM with the prompt
- Implement conditional logic that sets the `next_step` based on the classified `intent`:
  - "qa" --> "qa_agent"
  - "summarization" --> "summarization_agent"
  - "calculation" --> "calculation_agent"
  - default --> "qa_agent"
- Update the state with `actions_taken = ["classify_intent"]`, also include the new `intent` value and `next_step`, then return the updated state
Key concepts:
- Use `llm.with_structured_output(UserIntent)` for structured responses
- The function should return a state update with `actions_taken`, `intent`, and `next_step`
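Putting the steps together, a minimal sketch might look like this. It assumes `UserIntent` and `get_intent_classification_prompt` are importable from the project's `schemas.py` and `prompts.py`, and that the `llm` reaches the node via the `config` object (as set up in Task 2.6); adapt to how the starter code actually passes the LLM in:

```python
def route_for_intent(intent_type: str) -> str:
    """Map a classified intent to the next graph node; default to qa_agent."""
    return {
        "qa": "qa_agent",
        "summarization": "summarization_agent",
        "calculation": "calculation_agent",
    }.get(intent_type, "qa_agent")

def classify_intent(state, config):
    """First node of the graph: classify the user's intent and pick the next node."""
    llm = config["configurable"]["llm"]
    # UserIntent and get_intent_classification_prompt come from the project's
    # schemas.py and prompts.py modules.
    structured_llm = llm.with_structured_output(UserIntent)
    prompt = get_intent_classification_prompt().format(
        user_input=state["user_input"],
        conversation_history=state.get("messages", []),
    )
    intent = structured_llm.invoke(prompt)
    return {
        "actions_taken": ["classify_intent"],
        "intent": intent,
        "next_step": route_for_intent(intent.intent_type),
    }
```

Factoring the routing table into `route_for_intent` keeps the conditional logic trivially testable without an LLM in the loop.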
Take a look at the code for the qa_agent node in agent.py. Pay attention to the parameters it takes, how it retrieves and constructs the prompt, how it enforces structured output, and how it updates the state object.
- Implement the `calculation_agent` and `summarization_agent` functions to follow the same pattern, but some function calls must be modified to accept values specific to the respective nodes.
- Be sure to properly retrieve the prompt templates
- Take a look at the defined structured output schemas that correspond to each node and pass them to the appropriate function.
- Make sure to return an updated state object that includes all the same fields that are updated by the `qa_agent` node.
Complete the update_memory function by doing the following steps:
- Extract the `llm` from the `config` parameter. (HINT: you may need to modify the function signature in order to do this)
- Pass in the correct schema to enforce structured output
- Update the state with `conversation_summary`, `active_documents`, and `next_step`
Purpose: This function maintains conversation context and tracks document references across turns.
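A rough sketch of the shape this function can take. The `MemoryUpdate` schema name, the summary prompt text, and the `next_step` value here are all placeholders; use the actual schema and values your `schemas.py` defines:

```python
def get_llm_from_config(config):
    """Extract the llm that process_message stores under config['configurable']."""
    return config["configurable"]["llm"]

def update_memory(state, config):
    """Summarize the conversation and track referenced documents (sketch)."""
    llm = get_llm_from_config(config)
    # MemoryUpdate is a placeholder name for the structured-output schema
    # defined in the project's schemas.py.
    structured_llm = llm.with_structured_output(MemoryUpdate)
    update = structured_llm.invoke(
        "Summarize this conversation and list the document IDs discussed:\n"
        f"{state.get('messages', [])}"
    )
    return {
        "conversation_summary": update.conversation_summary,
        "active_documents": update.active_documents,
        "next_step": "end",  # placeholder: update_memory is the last node before END
    }
```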
Complete the create_workflow function that:
- Adds all agent nodes (classify_intent, qa_agent, summarization_agent, calculation_agent, update_memory)
- In the `add_conditional_edges` method, maps each intent type to the corresponding agent node
- Adds edges from each agent to update_memory
- Fixes the returned value so that it is compiled with a checkpointer (see Task 2.6)
Graph Structure:
```
classify_intent --> [qa_agent|summarization_agent|calculation_agent] --> update_memory --> END
```
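A sketch of the wiring (the imports are kept inside the function so the snippet loads even without LangGraph installed; `AgentState` and the node functions are the ones described in the tasks above):

```python
def route_next(state) -> str:
    """Conditional-edge selector: follow the next_step set by classify_intent."""
    return state.get("next_step", "qa_agent")

def create_workflow():
    from langgraph.graph import StateGraph, END
    from langgraph.checkpoint.memory import InMemorySaver

    workflow = StateGraph(AgentState)  # AgentState from the project's agent.py
    workflow.add_node("classify_intent", classify_intent)
    workflow.add_node("qa_agent", qa_agent)
    workflow.add_node("summarization_agent", summarization_agent)
    workflow.add_node("calculation_agent", calculation_agent)
    workflow.add_node("update_memory", update_memory)

    workflow.set_entry_point("classify_intent")
    # Map each routing outcome to its agent node.
    workflow.add_conditional_edges(
        "classify_intent",
        route_next,
        {
            "qa_agent": "qa_agent",
            "summarization_agent": "summarization_agent",
            "calculation_agent": "calculation_agent",
        },
    )
    # Every agent funnels into update_memory before the turn ends.
    for agent in ("qa_agent", "summarization_agent", "calculation_agent"):
        workflow.add_edge(agent, "update_memory")
    workflow.add_edge("update_memory", END)

    # Task 2.6: compile with a checkpointer so state persists across turns.
    return workflow.compile(checkpointer=InMemorySaver())
```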
To practice using state reducers and persistent memory, extend AgentState and your workflow as follows:
- Add the `operator.add` reducer to the `actions_taken` field of the `AgentState` schema. It will accumulate the names of each agent node that runs during a turn.
- (From Task 2.5) Import and use the `InMemorySaver` from the correct langgraph package and compile the workflow with a checkpointer. A checkpointer persists state across invocations, so your assistant will remember prior state even if you invoke the workflow multiple times. Modify `create_workflow` to call `workflow.compile(checkpointer=InMemorySaver())`. You will need to import `InMemorySaver`.
- In the `process_message` method in `assistant.py`, you must properly set the values of the `configurable` dictionary within the `config` object. Specifically, you must set:
  - The `thread_id` to the `current_sessions.session_id`
  - The `llm` to the configured LLM instance
  - The `tools`
These additions will enable you to track the flow of the agent and experiment with persistent state. Refer back to the state management and memory demo exercises for examples.
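The reducer behavior can be seen in isolation: `operator.add` on lists is concatenation, so LangGraph merges each node's `actions_taken` update by appending rather than overwriting. A sketch (only the reducer-relevant field is shown; the real `AgentState` has all the fields listed in Task 2.3, and the placeholder values in `config` stand in for your real session, LLM, and tools):

```python
import operator
from typing import Annotated, TypedDict

class AgentState(TypedDict, total=False):
    # Annotated with operator.add: LangGraph concatenates each node's update
    # onto the existing list instead of replacing it.
    actions_taken: Annotated[list, operator.add]

# What the reducer does when two nodes each return an actions_taken update:
merged = operator.add(["classify_intent"], ["qa_agent"])

# The config passed to the compiled workflow in process_message (sketch):
config = {
    "configurable": {
        "thread_id": "session-123",  # placeholder; use the real session_id
        "llm": None,                 # the configured LLM instance
        "tools": [],                 # the agent tools
    }
}
```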
Complete the get_chat_prompt_template function in prompts.py:
- Finish implementing the function so that it supports ALL the `intent_type` parameters, which could be "qa", "summarization", or "calculation"
- Review prompts.py so you are aware of all the prompts in the file, then make sure the `get_chat_prompt_template` function sets the system_prompt to the correct value based on the `intent_type` parameter.

Make sure to use the existing prompts already defined in the file (`QA_SYSTEM_PROMPT`, `SUMMARIZATION_SYSTEM_PROMPT`, `CALCULATION_SYSTEM_PROMPT`).
Purpose: This provides context-aware prompts for different types of tasks.
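A sketch of the selection logic. The fallback to the QA prompt for unrecognized intents and the exact message layout are assumptions; match whatever the starter code expects:

```python
# QA_SYSTEM_PROMPT, SUMMARIZATION_SYSTEM_PROMPT, and CALCULATION_SYSTEM_PROMPT
# are the constants already defined in prompts.py.

def select_system_prompt(intent_type: str, prompts: dict) -> str:
    """Pick the system prompt for an intent; falls back to the QA prompt."""
    return prompts.get(intent_type, prompts["qa"])

def get_chat_prompt_template(intent_type: str):
    from langchain_core.prompts import ChatPromptTemplate  # local import: sketch only

    system_prompt = select_system_prompt(intent_type, {
        "qa": QA_SYSTEM_PROMPT,
        "summarization": SUMMARIZATION_SYSTEM_PROMPT,
        "calculation": CALCULATION_SYSTEM_PROMPT,
    })
    return ChatPromptTemplate.from_messages(
        [("system", system_prompt), ("placeholder", "{messages}")]
    )
```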
Implement the CALCULATION_SYSTEM_PROMPT constant in prompts.py:
- Write a system prompt for the calculation agent that instructs the LLM to:
  - Determine the document that must be retrieved and retrieve it using the document reader tool
  - Determine the mathematical expression to calculate based on the user's input
  - Use the calculator tool to perform the calculation
  - Make sure the LLM uses the calculator tool for ALL calculations no matter how simple
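One possible wording that covers the points above (treat this as a starting point, not the required text):

```python
CALCULATION_SYSTEM_PROMPT = """You are a calculation assistant for financial and healthcare documents.

When a user asks for a calculation:
1. Determine which document is needed and retrieve it with the document reader tool.
2. Determine the mathematical expression to calculate from the user's request and the document data.
3. Evaluate the expression with the calculator tool.

Use the calculator tool for ALL calculations, no matter how simple.
Never compute a result yourself."""
```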
Implement the create_calculator_tool function that:
- Uses the `@tool` decorator to create a LangChain tool
- Takes a mathematical expression as input
- Validates the expression for safety (only allow basic math operations)
- Evaluates the expression using Python's `eval()` function
- Logs the tool usage with the ToolLogger
- Returns a formatted result string
- Handles errors gracefully
Tools are functions decorated with @tool that can be called by LLMs. They must:
- Have clear docstrings describing their purpose and parameters
- Handle errors gracefully
- Return string results
- Log their usage for debugging
The state flows through nodes and gets updated at each step. Key principles:
- Always return the updated state from node functions
- Use the state to pass information between nodes
- The state persists conversation context and intermediate results
Use llm.with_structured_output(YourSchema) to get reliable, typed responses from LLMs instead of parsing strings.
The system maintains conversation context via the InMemorySaver checkpointer by:
- Storing conversation messages with metadata
- Tracking active documents
- Summarizing conversations
- Providing context to subsequent requests
- Unit Testing: Test individual functions with sample inputs
- Integration Testing: Test the complete workflow with various user inputs
- Edge Cases: Test error handling and edge cases
- Missing Error Handling: Always wrap external calls in try/except blocks
- Incorrect State Updates: Ensure you're updating and returning the state correctly
- Prompt Engineering: Make sure your prompts are clear and specific
- Tool Security: Validate all inputs to prevent security issues
After implementation, your assistant should be able to:
- Classify user intents correctly
- Search and retrieve relevant documents
- Answer questions with proper source citations
- Generate comprehensive summaries
- Perform calculations on document data
- Maintain conversation context across turns
Good luck with your implementation! Remember to test thoroughly and refer to the existing working code for guidance on patterns and best practices.
