An agent-based scheduling system that uses LlamaIndex workflows and the ReAct pattern to automate doctor discovery and appointment coordination through structured reasoning and tool execution.
This project implements a conversational AI agent that helps patients:
- Search for doctors by disease/specialty
- View doctor information (experience, contact details)
- Schedule appointments with selected doctors
- Maintain conversation context across multiple interactions
Tech Stack: LlamaIndex, Groq (LLaMA-3.3-70B), HuggingFace Embeddings, Python, Async Workflows
Manual scheduling in healthcare and service-based industries is time-consuming, error-prone, and difficult to scale. This project demonstrates how an agent-based system can automate scheduling workflows by combining semantic search, reasoning, and tool execution in a controlled, observable pipeline.
- Semantic Search: Uses vector embeddings to search doctors by specialty or disease
- Automated Scheduling: Books appointments with selected doctors
- ReAct Agent Pattern: Employs reasoning and action steps for intelligent decision-making
- Memory Management: Maintains conversation history for contextual responses
- Tool Integration: Combines multiple tools (search + scheduling) seamlessly
The system uses LlamaIndex workflows with three main events:
- PrepEvent: Prepares the request for processing
- InputEvent: Formats input for the LLM
- ToolCallEvent: Triggers appropriate tool calls
User Input → PrepEvent → InputEvent → LLM Processing → ToolCallEvent → Response
                ↑                                              ↓
                └──────────────────────────────────────────────┘
In addition to the core workflow, the system includes structured agent traces and evaluation hooks. Each agent run logs intent detection, tool selection, execution outcomes, and termination reasons, enabling observability and debuggability without exposing chain-of-thought.
The agent emits structured traces for each execution, including:
- Detected intent
- Selected tool (if any)
- Tool execution outcome
- Stop reason and latency
These traces improve transparency, debugging, and evaluation, and are returned as part of the API response.
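A minimal sketch of how such traces could be collected is shown here. The `TraceRecorder` class and the `TOOL_RESULT`, `STOPPED`, and `LATENCY_MS` step names are assumptions made for illustration; only `INTENT_DETECTED` and `TOOL_SELECTED` appear in the sample API response, and the actual code in `app/traces.py` may differ.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TraceRecorder:
    """Collects structured trace steps for one agent run (hypothetical helper)."""

    steps: List[Dict[str, str]] = field(default_factory=list)
    _started: float = field(default_factory=time.perf_counter)

    def log(self, step: str, detail: str) -> None:
        # Record the decision and its outcome only, never free-form model reasoning.
        self.steps.append({"step": step, "detail": detail})

    def finish(self, stop_reason: str) -> List[Dict[str, str]]:
        # Close the run with the stop reason and the elapsed wall-clock time.
        latency_ms = (time.perf_counter() - self._started) * 1000
        self.log("STOPPED", stop_reason)
        self.log("LATENCY_MS", f"{latency_ms:.1f}")
        return self.steps


recorder = TraceRecorder()
recorder.log("INTENT_DETECTED", "SCHEDULE_APPOINTMENT")
recorder.log("TOOL_SELECTED", "schedule_appointment_tool")
recorder.log("TOOL_RESULT", "success")
traces = recorder.finish("final_answer")
```

Keeping each entry to a `step`/`detail` pair makes the list directly serializable as the `traces` field of a JSON response.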
Since this is an agentic system, evaluation focuses on decision quality and behavior rather than text similarity alone.
Tracked metrics include:
- Intent Accuracy: Correct identification of user intent
- Tool Selection Accuracy: Correct choice of tools for a given intent
- Task Completion Rate: Successful end-to-end task execution
- Invalid Action Rate: Attempts to perform unsupported actions
- Latency: End-to-end response time
These metrics are collected via lightweight evaluation hooks during agent execution.
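As a rough illustration, the metrics listed above could be aggregated from per-run records like this. The `RunRecord` fields, the `summarize` function, and the tool name `doctor_query_tool` are hypothetical and not taken from `app/evaluation.py`; latency is reported as a mean, which is one of several reasonable choices.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RunRecord:
    """Outcome of one evaluated agent run (hypothetical structure)."""

    expected_intent: str
    detected_intent: str
    expected_tool: Optional[str]
    selected_tool: Optional[str]
    completed: bool
    invalid_action: bool
    latency_s: float


def summarize(runs: List[RunRecord]) -> Dict[str, float]:
    """Aggregate behavioral metrics over a non-empty list of runs."""
    n = len(runs)
    if n == 0:
        raise ValueError("need at least one run to compute metrics")
    return {
        "intent_accuracy": sum(r.detected_intent == r.expected_intent for r in runs) / n,
        "tool_selection_accuracy": sum(r.selected_tool == r.expected_tool for r in runs) / n,
        "task_completion_rate": sum(r.completed for r in runs) / n,
        "invalid_action_rate": sum(r.invalid_action for r in runs) / n,
        "mean_latency_s": sum(r.latency_s for r in runs) / n,
    }


runs = [
    RunRecord("SEARCH", "SEARCH", "doctor_query_tool", "doctor_query_tool", True, False, 1.0),
    RunRecord("SCHEDULE", "SCHEDULE", "schedule_appointment_tool", "schedule_appointment_tool", True, False, 2.0),
    RunRecord("SCHEDULE", "SEARCH", "schedule_appointment_tool", "doctor_query_tool", False, False, 1.5),
    RunRecord("SEARCH", "SEARCH", "doctor_query_tool", "cancel_appointment_tool", False, True, 0.5),
]
metrics = summarize(runs)
```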
The project is organized as a modular, production-style codebase, with a notebook retained for demonstration purposes.
agent_scheduling_system/
│
├── app/
│ ├── api.py # FastAPI service exposing the agent
│ ├── agent.py # Agent assembly and wiring
│ ├── workflow.py # Event-driven agent workflow (core logic)
│ ├── tools.py # LLM-callable tools (search, scheduling)
│ ├── memory.py # Conversation memory management
│ ├── traces.py # Structured agent tracing and observability
│ ├── evaluation.py # Agent evaluation metrics
│ └── config.py # LLM and embedding configuration
│
├── data/
│ └── doctors.json # Sample doctor database
│
├── notebooks/
│ └── scheduling_demo.ipynb # Interactive demo and exploration
│
├── Dockerfile
├── requirements.txt
├── LICENSE.txt
├── README.md
├── .dockerignore
└── .env.example
The scheduling agent follows an event-driven workflow using LlamaIndex workflows and the ReAct (Reasoning + Acting) pattern.
- User input enters through a StartEvent and is normalized in a PrepEvent
- Conversation history is reconstructed to maintain multi-turn context
- The LLM reasons over the structured input and decides whether to:
  - respond directly, or
  - invoke scheduling/search tools
- Tool results are fed back into the workflow, enabling iterative reasoning
- The process terminates deterministically via a StopEvent
This design ensures controlled tool usage, clear reasoning boundaries, and safe agent execution.
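The loop described above can be sketched in plain Python, without the LlamaIndex workflow classes. Everything here is an assumption for illustration: `run_agent`, the scripted `decide` function standing in for the LLM, the tool names, and the `max_steps` guard used to force termination.

```python
from typing import Callable, Dict, List, Tuple

# A decision is ("tool", tool_name, argument) or ("answer", text, "").
Decision = Tuple[str, str, str]


def run_agent(
    user_input: str,
    decide: Callable[[List[str]], Decision],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[str, str]:
    """Reason/act loop that returns (response, stop_reason)."""
    history = [f"user: {user_input}"]
    for _ in range(max_steps):
        kind, name_or_text, arg = decide(history)
        if kind == "answer":
            # The model chose to respond directly, so the run ends here.
            return name_or_text, "final_answer"
        # Otherwise execute the tool and feed the observation back in.
        observation = tools[name_or_text](arg)
        history.append(f"observation: {observation}")
    # Hard stop so the loop cannot run forever.
    return "Sorry, I could not complete the request.", "max_steps_reached"


def scripted_decide(history: List[str]) -> Decision:
    # Stand-in for the LLM: search first, then schedule, then answer.
    seen = sum(h.startswith("observation:") for h in history)
    if seen == 0:
        return ("tool", "search_doctors", "cardiologist")
    if seen == 1:
        return ("tool", "schedule_appointment", "Dr. John Smith")
    return ("answer", "Appointment request recorded.", "")


tools = {
    "search_doctors": lambda specialty: "Dr. John Smith (Cardiology)",
    "schedule_appointment": lambda doctor: f"request recorded for {doctor}",
}
result = run_agent("Book a cardiologist for Ben Jones", scripted_decide, tools)
```

The step limit is what makes termination deterministic in this sketch: even if the decision function keeps asking for tools, the loop exits with an explicit stop reason.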
- Python 3.8+
- Jupyter Notebook (for demo)
- Docker (optional, for deployment)
- Groq API Key
# Install required packages
pip install llama-index
pip install llama-index-llms-groq
pip install llama-index-embeddings-huggingface
pip install llama-index-readers-json
pip install llama-index-utils-workflow
- API Key Configuration
  - Obtain a Groq API key from Groq Console
  - In Google Colab, store it as a secret named GROQ_API_KEY
- Prepare Data Files
  - doctors.json: JSON file containing doctor information
    [
      {
        "name": "Dr. John Smith",
        "specialty": "Cardiology",
        "experience": "15 years",
        "email": "john.smith@hospital.com"
      }
    ]
- Upload Required Files
  - Upload doctors.json to your Colab environment
  - The system will create Doctor appointment requests.csv automatically
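Because the agent depends on a well-formed doctors.json, it can help to validate the file before indexing it. The following is a sketch only: `load_doctors` and `REQUIRED_FIELDS` are hypothetical names, and the required fields are inferred from the sample record rather than from a documented schema.

```python
import json
from typing import Dict, List

# Fields inferred from the sample record; an assumption, not a documented schema.
REQUIRED_FIELDS = ("name", "specialty", "experience", "email")


def load_doctors(raw_json: str) -> List[Dict[str, str]]:
    """Parse doctor records and fail early on structural problems."""
    records = json.loads(raw_json)
    if not isinstance(records, list):
        raise ValueError("doctors.json must contain a JSON array of records")
    for i, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(f"record {i} is not a JSON object")
        missing = [name for name in REQUIRED_FIELDS if name not in record]
        if missing:
            raise ValueError(f"record {i} is missing fields: {', '.join(missing)}")
    return records


sample = """[
  {"name": "Dr. John Smith", "specialty": "Cardiology",
   "experience": "15 years", "email": "john.smith@hospital.com"}
]"""
doctors = load_doctors(sample)
```

To check a file on disk, pass its contents, for example `load_doctors(open("doctors.json", encoding="utf-8").read())`.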
# Search for doctors by specialty
response = await scheduling_agent.run(
input="Which doctors are cardiologists?"
)
print(response.get("response"))
# Schedule an appointment
response = await scheduling_agent.run(
input="Please setup an appointment with John Smith for Ben Jones next week in the afternoons"
)
print(response.get("response"))
# Combined search and scheduling
response = await scheduling_agent.run(
input="Find a neurologist and request an appointment for Beth Wilson at the earliest"
)
print(response.get("response"))
- Loads doctor database from JSON
- Splits documents into chunks using SentenceSplitter
- Creates vector embeddings using HuggingFace's BAAI/bge-small-en-v1.5
- Indexes documents for semantic search
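To show what the semantic search step does conceptually, here is a toy ranking by cosine similarity. It is not the LlamaIndex implementation: the three-dimensional vectors are made-up stand-ins for real embeddings, and `cosine` and `top_k` are hypothetical helpers.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int = 1,
) -> List[Tuple[str, float]]:
    """Return the k chunks whose vectors are most similar to the query."""
    scored = [(chunk, cosine(query_vec, vec)) for chunk, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy vectors standing in for embeddings of indexed document chunks.
index = {
    "cardiology chunk": [0.9, 0.1, 0.0],
    "neurology chunk": [0.1, 0.9, 0.0],
}
# Toy vector standing in for the embedded query "Which doctors are cardiologists?"
best = top_k([0.8, 0.2, 0.0], index, k=1)
```

In the real pipeline the vectors come from the embedding model and the index handles storage and lookup, but the ranking idea is the same.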
- Doctor Query Tool: Searches for doctors by specialty/disease
- Appointment Tool: Schedules appointments with selected doctors
- Receives user input
- Reasons about which tools to use
- Executes tool calls
- Synthesizes final response
- Maintains conversation memory
The agent:
- Identifies intent (search vs scheduling)
- Resolves specialty constraints
- Extracts patient and timing preferences
- Selects the appropriate tools via ReAct reasoning
- Confirms the request, or fails gracefully if constraints cannot be met
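from google.colab import userdata  # Colab secrets; outside Colab, read the key from the environment
from llama_index.core import Settings
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.groq import Groq

# Groq-hosted LLaMA 3.3 70B model, authenticated with the GROQ_API_KEY secret
llm = Groq(
    model="llama-3.3-70b-versatile",
    api_key=userdata.get('GROQ_API_KEY')
)
# Embedding model used for semantic search
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)
splitter = SentenceSplitter(chunk_size=200)  # chunk size in tokens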
- Doctor appointment requests.csv: Contains all scheduled appointments with fields:
  - Requested Date
  - Patient Name
  - Doctor Name
  - Scheduling Comments
The CSV file represents a persistent side effect of agent actions and serves as a lightweight stand-in for a database or external scheduling system.
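A minimal sketch of how an appointment request could be written with those four columns, using Python's standard `csv` module. The `append_request` helper is hypothetical, and an in-memory buffer stands in for the real file so the example has no side effects.

```python
import csv
import io
from typing import Dict, TextIO

# Column names as listed above.
FIELDNAMES = ["Requested Date", "Patient Name", "Doctor Name", "Scheduling Comments"]


def append_request(stream: TextIO, row: Dict[str, str], write_header: bool) -> None:
    """Write one appointment request, optionally preceded by the header row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES)
    if write_header:
        writer.writeheader()
    writer.writerow(row)


buffer = io.StringIO()  # stand-in for the CSV file on disk
append_request(
    buffer,
    {
        "Requested Date": "next week, afternoons",
        "Patient Name": "Ben Jones",
        "Doctor Name": "John Smith",
        "Scheduling Comments": "",
    },
    write_header=True,
)
rows = list(csv.DictReader(io.StringIO(buffer.getvalue())))
```

When writing to disk, open the file in append mode with `newline=""`, and write the header only if the file does not exist yet.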
User: "Find a neurologist and schedule an appointment next week"
Agent reasoning:
- Identifies specialty = Neurology
- Uses semantic search to retrieve matching doctors
- Selects scheduling tool
- Writes appointment request to CSV
- Responds with confirmation summary
The agent is exposed via a FastAPI service.
uvicorn app.api:app --reload
POST /run
{
"query": "Find a neurologist and schedule an appointment"
}
{
"output": "Appointment request recorded for ...",
"traces": [
{ "step": "INTENT_DETECTED", "detail": "SCHEDULE_APPOINTMENT" },
{ "step": "TOOL_SELECTED", "detail": "schedule_appointment_tool" }
]
}
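For reference, a client could build the request and read the traces with only the standard library, roughly as follows. This sketch assumes the service listens on http://localhost:8000 (the port used in the Docker command); `build_run_request` and `selected_tools` are hypothetical helper names.

```python
import json
import urllib.request
from typing import List


def build_run_request(query: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST /run request carrying the query as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def selected_tools(response_body: str) -> List[str]:
    """Extract the names of tools the agent selected from a response body."""
    payload = json.loads(response_body)
    return [
        trace["detail"]
        for trace in payload.get("traces", [])
        if trace.get("step") == "TOOL_SELECTED"
    ]


request = build_run_request("Find a neurologist and schedule an appointment")
# To send it against a running server:
#   with urllib.request.urlopen(request) as resp:
#       print(selected_tools(resp.read().decode("utf-8")))

example_response = json.dumps({
    "output": "Appointment request recorded for ...",
    "traces": [
        {"step": "INTENT_DETECTED", "detail": "SCHEDULE_APPOINTMENT"},
        {"step": "TOOL_SELECTED", "detail": "schedule_appointment_tool"},
    ],
})
tools_used = selected_tools(example_response)
```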
The API can be run as a containerized service.
docker build -t scheduling-agent-api .
docker run -p 8000:8000 --env-file .env scheduling-agent-api
The agent handles the following failure modes:
- Tool parsing errors
- Non-existent tool calls
- Tool execution failures
- LLM response parsing issues
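One way to handle the tool-related cases is to turn every failure into an error message that is returned to the LLM as an observation, so the agent can retry or explain the problem instead of crashing. The sketch below assumes tool arguments arrive as a JSON string; `call_tool` and the example tool are hypothetical, not the project's actual implementation.

```python
import json
from typing import Callable, Dict


def call_tool(tools: Dict[str, Callable[..., str]], name: str, raw_args: str) -> str:
    """Run a tool by name and return either its result or an error message."""
    tool = tools.get(name)
    if tool is None:
        # Non-existent tool call: tell the model which tools it can use.
        return f"Error: no tool named '{name}'. Available tools: {', '.join(sorted(tools))}"
    try:
        kwargs = json.loads(raw_args)
        if not isinstance(kwargs, dict):
            raise ValueError("arguments must be a JSON object")
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        return f"Error: could not parse tool arguments: {exc}"
    try:
        return str(tool(**kwargs))
    except Exception as exc:  # tool execution failure
        return f"Error: tool '{name}' failed: {exc}"


def schedule_appointment(doctor: str, patient: str) -> str:
    # Hypothetical tool body used only for this example.
    return f"Appointment request recorded for {patient} with {doctor}"


tools = {"schedule_appointment": schedule_appointment}
ok = call_tool(tools, "schedule_appointment", '{"doctor": "John Smith", "patient": "Ben Jones"}')
unknown = call_tool(tools, "cancel_appointment", "{}")
bad_json = call_tool(tools, "schedule_appointment", "{doctor: John Smith}")
bad_args = call_tool(tools, "schedule_appointment", '{"doctor": "John Smith"}')
```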
- Requires internet connection for LLM API calls
- JSON database must be properly formatted
- Appointment scheduling writes to CSV (no calendar integration)
- No real-time availability checking
- Integration with calendar APIs
- Real-time doctor availability checking
- Email notifications to doctors and patients
- Multi-language support
- Web interface for non-technical users
- Database backend instead of CSV
- llama-index: Core framework
- llama-index-llms-groq: Groq LLM integration
- llama-index-embeddings-huggingface: Embedding models
- llama-index-readers-json: JSON document reader
- llama-index-utils-workflow: Workflow visualization
- nest-asyncio: Async support in notebooks
Contributions are welcome! Please feel free to submit a Pull Request.
MIT License - feel free to use this project for your own purposes.
- Built with LlamaIndex
- Powered by Groq
- Embeddings from HuggingFace
For questions or issues, please open an issue in the GitHub repository.
Note: This is a demonstration project. For production use, implement proper security, authentication, and data validation.