# 🤖 SmartAgent Project Report

## Introduction

**SmartAgent** is a multi-turn conversational AI assistant that demonstrates the integration of large language models (LLMs) with traditional programmatic tools. Unlike generic chatbot systems that rely solely on LLM responses, SmartAgent classifies the intent behind each user message with keyword rules and chooses the most appropriate way to respond: an LLM, a custom tool, or a combination of both.

This project was developed to showcase how AI agents can process unstructured user input, leverage external utilities, and retain memory across conversations, all within an intuitive Streamlit interface.



## Objectives

- Demonstrate tool-augmented LLM behavior using rule-based intent classification
- Support user queries involving:
  - Natural language calculations
  - PDF or TXT document summarization
  - Simulated web search
- Add memory persistence for multi-turn chat
- Implement OCR fallback for scanned or image-based PDF files


## How SmartAgent Works

SmartAgent is powered by a modular architecture combining three key components:

1. **Prompt Classifier**
   - Routes the prompt to a calculator, file reader, search stub, or LLM based on keywords.

2. **Tool Layer**
   - `calculator.py`: Extracts numbers and operators from natural language math queries.
   - `file_reader.py`: Uses PyMuPDF to extract text, and `pytesseract`+`pdf2image` for OCR fallback.
   - `search_stub.py`: Returns mock search results.

3. **LLM Integration**
   - Uses the OpenAI Python SDK (v1.x client interface, `openai>=1.0.0`) to call GPT-3.5.
   - Injects document content into prompt context for better answers.
   - Handles fallback queries if no tool match is found.
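
The keyword routing described under **Prompt Classifier** can be sketched as follows. This is a minimal illustration, not the project's actual code: the keyword patterns, the `ROUTES` table, and the `classify` function name are all assumptions chosen to match the sample prompts in this report.

```python
import re

# Hypothetical keyword rules; the real classifier's keywords are not documented here.
# Order matters: the first matching route wins.
ROUTES = [
    ("calculator", re.compile(
        r"\b(add|plus|sum|subtract|minus|multiply|times|divide)\b|\d\s*[-+*/]\s*\d")),
    ("file_reader", re.compile(r"\b(pdf|txt|document|doc|file|summari[sz]e)\b")),
    ("search", re.compile(r"\b(search|look up|find)\b")),
]


def classify(prompt: str) -> str:
    """Return the name of the tool that should handle the prompt, or "llm"."""
    text = prompt.lower()
    for tool, pattern in ROUTES:
        if pattern.search(text):
            return tool
    # No keyword matched: fall back to the LLM.
    return "llm"


if __name__ == "__main__":
    for example in ["Add 5 and 7", "Read this PDF", "Search AI assistants", "Who are you?"]:
        print(f"{example!r} -> {classify(example)}")
```

Keeping the rules in an ordered table makes the routing easy to inspect, and adding a new tool is one more entry. The trade-off is that keyword matching misroutes prompts that mention a keyword in passing.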


## Screenshots

Below are sample screenshots showing SmartAgent in action:

### Calculator
![calculator.png](attachment:calculator.png)

### OCR-Based PDF Parsing
![Text_PDF.png](attachment:Text_PDF.png)



## Sample Prompts and Responses

| Prompt                      | Tool Used   | Response Summary                       |
|-----------------------------|-------------|----------------------------------------|
| `Add 5 and 7`               | Calculator  | 12.0                                   |
| `Read this PDF`             | File Reader | Summarized text from .pdf file         |
| `What is this scanned doc?` | OCR + LLM   | GPT summary after OCR extraction       |
| `Search AI assistants`      | Search Stub | 3 mock results shown                   |
| `Who are you?`              | LLM         | Returns assistant persona and greeting |
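
To illustrate how a prompt such as `Add 5 and 7` could produce `12.0`, here is a rough sketch of natural-language calculation by extracting numbers and an operator. It is an assumption about how `calculator.py` might work, not its actual contents: the keyword tables and the `calculate` function are hypothetical, and the sketch handles only two non-negative operands.

```python
import operator
import re

# Hypothetical keyword and symbol tables (assumptions, not taken from calculator.py).
KEYWORDS = {
    "add": operator.add, "plus": operator.add, "sum": operator.add,
    "subtract": operator.sub, "minus": operator.sub,
    "multiply": operator.mul, "times": operator.mul,
    "divide": operator.truediv,
}
SYMBOLS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

NUMBER = re.compile(r"\d+(?:\.\d+)?")  # non-negative integers and decimals only


def calculate(query: str) -> float:
    """Evaluate a two-operand query such as 'Add 5 and 7'."""
    text = query.lower()
    numbers = [float(n) for n in NUMBER.findall(text)]
    if len(numbers) != 2:
        raise ValueError("expected exactly two numbers in the query")

    # Use the first operator word or symbol that appears in the query.
    op = None
    for token in re.findall(r"[a-z]+|[-+*/]", text):
        op = KEYWORDS.get(token) or SYMBOLS.get(token)
        if op is not None:
            break
    if op is None:
        raise ValueError("no supported operator found in the query")

    a, b = numbers
    # "Subtract 3 from 10" means 10 - 3, so swap the operands.
    if "subtract" in text and " from " in text:
        a, b = b, a
    return op(a, b)


if __name__ == "__main__":
    print(calculate("Add 5 and 7"))  # 12.0
```

Operands are parsed as floats, which is why the result is `12.0` rather than `12`. Division by zero raises `ZeroDivisionError` in this sketch and would need handling in a real tool.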


## Implementation Notes

- **OCR Integration**: When no extractable text is found using PyMuPDF, the system automatically converts the PDF to images and uses `pytesseract` for text extraction.
- **Prompt Construction**: For document-related queries, the file contents are prepended to the prompt to give GPT context.
- **Modularity**: Each tool lives in its own module, so new tools such as weather APIs or summarizers can be added without changing the existing ones.
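
The OCR fallback and prompt construction notes above might look roughly like the sketch below. The function names, the `max_chars` limit, and the prompt wording are assumptions made for illustration. The library calls (`fitz.open`, `page.get_text`, `convert_from_path`, `pytesseract.image_to_string`) are imported lazily so that the fallback logic can be exercised without the OCR dependencies installed. Running real OCR additionally requires the Tesseract and Poppler binaries on the system.

```python
from typing import Callable


def extract_with_pymupdf(path: str) -> str:
    """Extract embedded text from every page using PyMuPDF."""
    import fitz  # PyMuPDF

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def extract_with_ocr(path: str) -> str:
    """Render each page to an image and run Tesseract OCR on it."""
    import pytesseract
    from pdf2image import convert_from_path

    images = convert_from_path(path)
    return "\n".join(pytesseract.image_to_string(image) for image in images)


def extract_text(
    path: str,
    primary: Callable[[str], str] = extract_with_pymupdf,
    fallback: Callable[[str], str] = extract_with_ocr,
) -> str:
    """Try direct text extraction first; use OCR only if nothing was found."""
    text = primary(path)
    if text.strip():
        return text
    return fallback(path)


def build_prompt(document_text: str, question: str, max_chars: int = 6000) -> str:
    """Prepend (possibly truncated) document text to the user's question.

    max_chars is a hypothetical limit to keep the prompt within the model's context.
    """
    excerpt = document_text.strip()[:max_chars]
    return f"Document content:\n{excerpt}\n\nQuestion: {question}"
```

Passing the extractors as parameters keeps the fallback decision separate from the libraries that do the work, which also makes it straightforward to swap in a different OCR backend.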


## Technologies Used

- Python 3.11+
- Streamlit for web interface
- OpenAI GPT-3.5 (via `openai>=1.0.0`)
- PyMuPDF for parsing PDFs
- `pytesseract` and `pdf2image` for OCR
- `python-dotenv` for environment variable management


## Key Takeaways

- Combining LLMs with external tools provides flexible, powerful interactions
- OCR fallback enables document handling beyond text-based PDFs
- Prompt design is crucial for contextually relevant GPT responses
- Streamlit makes it easy to rapidly prototype agent-like systems with memory and user input flow


## Conclusion

SmartAgent is a functional prototype showing how an LLM agent can combine reasoning, tool use, and memory to solve real-world tasks. Its hybrid architecture lets it handle requests that an LLM alone cannot, such as reading scanned documents, which makes it a solid foundation for resume analyzers, research bots, or intelligent assistants in document-driven workflows.
