File Analysis Agent

A Python project for analyzing file formats and schemas using AI agents powered by LangChain and LangGraph. It includes two agents: FileAnalysisAgent for detailed multi-step reasoning and tool execution, and SimplifiedFileAnalysisAgent for streamlined processing.

Features

Analyzes file formats and generates schemas for files such as .tar and .json.
Runs tool calls through asynchronous LangGraph workflows.
Processes queries concurrently, assigning each one a unique session ID.
Separates reasoning, tool execution, and summarization into modular steps.
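
As a rough illustration of the concurrent processing with per-query session IDs, here is a minimal sketch using only the standard library. The helper names `process_query` and `process_all` are hypothetical and the agent call is replaced by a short sleep; the project's real agents may be wired differently.

```python
import asyncio
import uuid
from typing import Dict, List


async def process_query(query: str) -> Dict[str, str]:
    """Handle one query under its own session ID (hypothetical helper)."""
    session_id = str(uuid.uuid4())  # unique identifier for this query
    await asyncio.sleep(0.01)  # stand-in for the agent's asynchronous work
    return {"session_id": session_id, "query": query}


async def process_all(queries: List[str]) -> List[Dict[str, str]]:
    # gather() runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*(process_query(q) for q in queries))


results = asyncio.run(
    process_all(
        [
            "can you generate schema for file, axxx.zz.tar",
            "can you analyze the format of file, data.json",
        ]
    )
)
for item in results:
    print(item["session_id"], "->", item["query"])
```

Because each coroutine generates its own UUID, results from overlapping queries can be told apart even when they finish out of order.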

Project Structure

file-analysis-agent/
├── demo.py                 # Demo script to run agents
├── src/
│   ├── agent/
│   │   ├── file_analysis_multi_models.py  # FileAnalysisAgent implementation
│   │   ├── file_analysis_simple_model.py  # SimplifiedFileAnalysisAgent implementation
│   │   ├── __init__.py
│   ├── tools/
│   │   ├── mock_async_file_tools.py  # Mock file analysis tools
│   │   ├── __init__.py
├── README.md               # Project documentation
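
For orientation, a mock tools module such as `mock_async_file_tools.py` might look roughly like the sketch below. The tool names `detect_file_format` and `generate_json_schema` appear in the example output of this README; the signatures and canned return values are assumptions, and the schema is limited to the two fields visible in that output.

```python
import asyncio
from typing import Any, Dict


async def detect_file_format(file_path: str) -> str:
    """Mock tool: always reports JSON, regardless of the path (assumed behavior)."""
    await asyncio.sleep(0)  # yield control, as a real I/O-bound tool would
    return "json"


async def generate_json_schema(file_path: str) -> Dict[str, Any]:
    """Mock tool: returns a canned schema fragment (assumed behavior)."""
    await asyncio.sleep(0)
    return {
        "type": "object",
        "properties": {
            "id": {"type": "integer", "description": "User ID"},
            "name": {"type": "string", "description": "User name"},
        },
    }


detected = asyncio.run(detect_file_format("axxx.zz.tar"))
schema = asyncio.run(generate_json_schema("axxx.zz.tar"))
print(detected, list(schema["properties"]))
```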

Requirements

Python 3.8+
Dependencies: langchain-ollama, langgraph (asyncio, uuid, and typing ship with the Python standard library and need no installation)
Ollama server running locally at http://localhost:11434 with models qwen3:8b and llama3.2:3b
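
To confirm these requirements before running the demo, a small check along the following lines may help. It is a sketch that assumes the Ollama server exposes its model list at `/api/tags` as a JSON object with a `models` array of entries carrying a `name` field; the helper names are hypothetical.

```python
import json
import urllib.request
from typing import Iterable, List

OLLAMA_URL = "http://localhost:11434"
REQUIRED_MODELS = ["qwen3:8b", "llama3.2:3b"]


def missing_models(tags_payload: dict, required: Iterable[str]) -> List[str]:
    """Return the required models that are absent from a model-list payload."""
    installed = {m.get("name") for m in tags_payload.get("models", [])}
    return [name for name in required if name not in installed]


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 2.0) -> dict:
    """Fetch the locally available models from the Ollama server (assumed endpoint)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        missing = missing_models(fetch_tags(), REQUIRED_MODELS)
    except (OSError, ValueError) as exc:
        # Connection failures raise OSError subclasses; bad JSON raises ValueError.
        print(f"Could not query Ollama at {OLLAMA_URL}: {exc}")
    else:
        if missing:
            print(f"Missing models: {missing}")
        else:
            print("All required models are available.")
```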

Installation

Clone the repository:

git clone <repository-url>
cd file-analysis-agent

Install dependencies:
```
pip install langchain-ollama langgraph
```
Make sure the Ollama server is running, then pull the required models:
```
ollama pull qwen3:8b
ollama pull llama3.2:3b
```

Usage

Run the demo script to test the agents with example queries:

Single mode (one agent):

python demo.py --mode single --agent file        # Runs FileAnalysisAgent
python demo.py --mode single --agent simplified  # Runs SimplifiedFileAnalysisAgent

Multi mode (both agents concurrently):
```
python demo.py --mode multi
```

The demo processes queries like "can you generate schema for file, axxx.zz.tar" and "can you analyze the format of file, data.json", displaying results with session IDs, summaries, and tool outputs.
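
The command-line options used above could be parsed along these lines. This is a sketch, not the actual contents of `demo.py`: the flag names and values come from the commands in this section, while the defaults and function names are assumptions.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run the file analysis demo")
    # "single" runs one agent; "multi" runs both agents concurrently.
    parser.add_argument("--mode", choices=["single", "multi"], default="single")
    # Only meaningful in single mode; the default value is an assumption.
    parser.add_argument("--agent", choices=["file", "simplified"], default="file")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv, or sys.argv[1:] when argv is None."""
    return build_parser().parse_args(argv)


example = parse_args(["--mode", "single", "--agent", "simplified"])
print(example.mode, example.agent)
```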

Sequence flow for multi-model agent

graph TD
    A[Start] --> B{agent_decision_node<br> reasoning_llm: qwen3:8b}
    B -->|needs_tool: true| C[agent_tool_node<br> tool_llm: llama3.2:3b <br>Tool Calling & Execution<br> invoke MCPFileAnalysisTools ]
    B -->|needs_tool: false| D[summarize_node<br> summarize_llm: llama3.2:3b ]
    C --> B
    D --> E[END]
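
The control flow in the diagram can be sketched in plain Python without LangGraph or any LLM. Everything here is illustrative: the decision step is replaced by a fixed plan of tool names, the tools are simple callables, and `MAX_STEPS` is an assumed safety limit that the README does not mention.

```python
from typing import Any, Callable, Dict, List

MAX_STEPS = 5  # assumed guard against an endless decision/tool loop


def decide(state: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for agent_decision_node: pick the next unexecuted tool, if any."""
    done = {r["tool"] for r in state["tool_results"]}
    pending = [t for t in state["plan"] if t not in done]
    state["needs_tool"] = bool(pending)
    state["next_tool"] = pending[0] if pending else None
    return state


def run_tool(state: Dict[str, Any], tools: Dict[str, Callable[[str], str]]) -> Dict[str, Any]:
    """Stand-in for agent_tool_node: execute the chosen tool and record its result."""
    name = state["next_tool"]
    state["tool_results"].append({"tool": name, "result": tools[name](state["query"])})
    return state


def summarize(state: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for summarize_node: list the tool results in execution order."""
    lines = [f"{i}. {r['tool']}: {r['result']}" for i, r in enumerate(state["tool_results"], 1)]
    state["summary"] = "\n".join(lines)
    return state


def run_agent(query: str, plan: List[str], tools: Dict[str, Callable[[str], str]]) -> Dict[str, Any]:
    state: Dict[str, Any] = {"query": query, "plan": plan, "tool_results": []}
    for _ in range(MAX_STEPS):
        state = decide(state)
        if not state["needs_tool"]:  # needs_tool: false -> go to summarize
            break
        state = run_tool(state, tools)  # needs_tool: true -> tool node, then loop back
    return summarize(state)


mock_tools = {
    "detect_file_format": lambda query: "json",
    "generate_json_schema": lambda query: '{"type": "object"}',
}
final = run_agent(
    "can you generate schema for file, axxx.zz.tar",
    ["detect_file_format", "generate_json_schema"],
    mock_tools,
)
print(final["summary"])
```

The loop returns to the decision step after every tool call, which mirrors the `C --> B` edge, and leaves for summarization only once no further tool is needed.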

Example Output

============================================================
AGENT: FileAnalysisAgent
QUERY: can you generate schema for file, axxx.zz.tar
SESSION ID: 9d7459b9-1de2-4e6d-8aaf-fcef2decddca
============================================================

FINAL SUMMARY:
The user requested a schema for the "axxx.zz.tar" file to analyze its structure. 

The tool detected that the file format is JSON. A schema was generated, which consists of an object with properties: id (integer), name (string), email (string), created_at (date-time string), and active (boolean). The schema specifies that both 'id' and 'name' are required.

No limitations or errors were reported in the analysis of this file.

TOOL RESULTS:
1. detect_file_format: json
2. generate_json_schema: {
  "type": "object",
  "properties": {
    "id": {
      "type": "integer",
      "description": "User ID"
    },
    "name": {
      "type": "string",
      "description": "User name"
    },
    
..........

Example detail analysis

MESSAGE HISTORY:
1. User: can you generate schema for file, axxx.zz.tar
2. Assistant: <think>
Okay, the user wants a schema for the file "axxx.zz.tar". Let me think about how to approach this.

First, I need to check the file format. The file has a .tar extension, which usually means i...
3. Assistant: Tool detect_file_format executed successfully. Result: json
4. Assistant: <think>
Okay, the user wants a schema generated for the file "axxx.zz.tar". Previously, the detect_file_format tool was called on this file, and it returned that the format is JSON. But wait, the file...
5. Assistant: Tool generate_json_schema executed successfully. Result: {
  "type": "object",
  "properties": {
    "id": {
      "type": "integer",
      "description": "User ID"
    },
    "name": {
      "type": ...
6. Assistant: <think>
Okay, let me try to figure out how to approach this. The user wants a schema for the file "axxx.zz.tar". First, I need to check what tools are available. The tools provided include functions t...
7. Assistant: Here's a summary of the results:

You requested a schema for the file "axxx.zz.tar". After analyzing the file using our tools, we detected that the file format is JSON. The JSON schema was generated, ...
