Vincontra/Multi-Language-Code-Parser-using-Tree-sitter


🚀 Multi-Language Code Parser using Tree-sitter

📌 Overview

This project is a multi-language code parser built on Tree-sitter. It parses source code written in several programming languages and visualizes the result. It provides a Flask-based web UI, detects the language automatically, and produces both a text-based and a visual Abstract Syntax Tree (AST).

Supported Languages:

  • C
  • C++
  • Java
  • JavaScript
  • Python

The language is detected from the file extension. The code is then parsed into an AST and written out both as a text-based parse tree and as a visual diagram.


🎯 Key Features

  • βœ… Automatic Language Detection - Identifies language by file extension
  • βœ… Multi-Language Parsing - Supports 5 programming languages
  • βœ… AST Generation - Uses Tree-sitter for accurate parse trees
  • βœ… Visual Representation - Generates beautiful parse tree diagrams using Graphviz
  • βœ… Web-Based Interface - Easy-to-use Flask UI for file uploads
  • βœ… Dual Output - Text (tree.txt) and visual (tree.png) parse trees

📁 Project Structure

lpccProj/
│
├── app.py                  # Flask backend & web server
├── parser.sh               # Core parsing script (multi-language dispatcher)
├── visualize.py            # Converts AST → Graphviz output
│
├── grammers/               # Language grammars (Tree-sitter parsers)
│   ├── tree-sitter-c/      # C language parser
│   ├── tree-sitter-cpp/    # C++ language parser
│   ├── tree-sitter-java/   # Java language parser
│   ├── tree-sitter-javascript/  # JavaScript parser
│   └── tree-sitter-python/ # Python language parser
│
├── samples/                # Sample code files for testing
├── uploads/                # Directory for uploaded files
├── templates/              # HTML templates
│   └── index.html          # Web UI
├── static/                 # Static assets
├── tree.txt                # Output: Raw parse tree
├── tree.dot                # Output: Graphviz format
├── tree.png                # Output: Visual parse tree diagram
│
└── README.md               # This file

⚙️ How It Works

Processing Pipeline

User uploads code file
        ↓
App extracts file extension
        ↓
Parser selects appropriate Tree-sitter grammar
        ↓
Tree-sitter generates Parse Tree
        ↓
tree.txt created (raw output)
        ↓
visualize.py converts to tree.dot (Graphviz format)
        ↓
Graphviz renders tree.png
        ↓
Web UI displays results
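
The pipeline above can be driven from Python roughly as follows. This is a minimal sketch, not the project's actual app.py: it assumes parser.sh reads the source path from standard input (as in the CLI usage in this README) and writes tree.txt, tree.dot, and tree.png into the project root. The function name run_pipeline and its runner parameter are hypothetical.

```python
import subprocess
from pathlib import Path

# Output file names as listed in this README.
OUTPUT_FILES = ("tree.txt", "tree.dot", "tree.png")


def run_pipeline(source_path, project_root=".", runner=subprocess.run):
    """Feed a source file path to parser.sh and return the expected output paths.

    Assumption: parser.sh reads the path from stdin and produces all three
    output files in project_root.
    """
    root = Path(project_root)
    runner(
        ["./parser.sh"],
        input=f"{source_path}\n",
        text=True,
        cwd=root,
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
    return {name: root / name for name in OUTPUT_FILES}
```

Passing the runner as a parameter keeps the function easy to exercise without Tree-sitter or Graphviz installed, since a stub can stand in for subprocess.run.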

🛠️ Prerequisites & Installation

System Requirements

Linux / WSL Environment:

sudo apt update
sudo apt install -y nodejs npm python3 python3-venv graphviz build-essential

Key Dependencies:

  • Tree-sitter - Parser generator framework
  • Graphviz - Graph visualization tool
  • Flask - Python web framework
  • Node.js & npm - For building Tree-sitter parsers

Setup Instructions

1. Clone the Repository

git clone <repository-url> lpccProj  # clone into lpccProj, the directory name used in this README
cd lpccProj

2. Create Python Virtual Environment

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install Python Dependencies

pip install flask

4. Build Tree-Sitter Grammars (if needed)

Navigate to each grammar directory and build:

cd grammers/tree-sitter-c
npm install
npm run build
cd ../..

Repeat for other language parsers if needed.

5. Verify Installation

Test the parser with a sample file:

echo "samples/test.c" | ./parser.sh

You should see tree.txt and tree.png generated in the project root.


🚀 Usage

Command Line Interface (CLI)

Parse a single file:

echo "path/to/your/code.py" | ./parser.sh

This will generate:

  • tree.txt - Text representation of the parse tree
  • tree.dot - Graphviz format
  • tree.png - Visual diagram
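
To confirm that a run produced all three files, a small check like the one below can help. It is a sketch that assumes the outputs land in the project root under the names listed above; the helper name missing_outputs is hypothetical.

```python
from pathlib import Path

# Output file names as listed in this README.
EXPECTED_OUTPUTS = ("tree.txt", "tree.dot", "tree.png")


def missing_outputs(project_root="."):
    """Return the names of expected output files that are absent or empty."""
    root = Path(project_root)
    return [
        name
        for name in EXPECTED_OUTPUTS
        if not (root / name).is_file() or (root / name).stat().st_size == 0
    ]
```

An empty list means every output exists and is non-empty.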

Supported File Extensions

| Extension | Language   |
|-----------|------------|
| .c        | C          |
| .cpp      | C++        |
| .java     | Java       |
| .js       | JavaScript |
| .py       | Python     |
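
Extension-based detection amounts to a table lookup. The project does this in parser.sh; the Python version below is only an illustrative sketch. The function name detect_language is hypothetical, and lowercasing the extension is an assumption, not documented behavior.

```python
from pathlib import Path

# Extension-to-language table taken from this README.
LANGUAGES = {
    ".c": "C",
    ".cpp": "C++",
    ".java": "Java",
    ".js": "JavaScript",
    ".py": "Python",
}


def detect_language(filename):
    """Return the language name for a file, based only on its extension."""
    suffix = Path(filename).suffix.lower()  # assumption: case-insensitive match
    try:
        return LANGUAGES[suffix]
    except KeyError:
        raise ValueError(f"unsupported file extension: {suffix!r}") from None
```

Because only the extension is inspected, a file with the wrong extension will be parsed with the wrong grammar.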

Web Interface

Start the Flask Server

python app.py

Server runs at: http://localhost:5000

Using the Web UI

  1. Open http://localhost:5000 in your browser
  2. Click "Choose File" and select a code file
  3. Click "Submit"
  4. View the generated parse tree diagram

📊 Example Output

Input: Simple Python Function

def greet(name):
    return f"Hello, {name}!"

Generated Parse Tree (tree.txt, shown in simplified form)

module
  function_definition
    name: greet
    parameters
      identifier: name
    block
      return_statement
        f_string

Visual Output (tree.png)

The system generates a Graphviz diagram showing the hierarchical structure of your code.


🔧 Troubleshooting

Issue: Parser not executable

chmod +x parser.sh

Issue: Graphviz not found

# Ubuntu/Debian
sudo apt install graphviz

# macOS
brew install graphviz

Issue: Python modules not found

source venv/bin/activate
pip install flask

Issue: Tree-sitter build fails

Ensure you have Node.js 14+ and npm installed:

node --version
npm --version

🀝 Contributing

We welcome contributions! Here's how to get involved:

Development Setup

  1. Create a feature branch: git checkout -b feature/your-feature
  2. Make your changes and commit: git commit -am 'Add new feature'
  3. Push to the branch: git push origin feature/your-feature
  4. Submit a Pull Request

Contribution Ideas

  • Add support for new languages
  • Improve the web UI design
  • Optimize parsing performance
  • Add more visualization options
  • Enhance error messages

Code Style

  • Use clear, descriptive variable names
  • Add comments for complex logic
  • Follow Python PEP 8 conventions for Python files
  • Test changes with the provided sample files

📚 Documentation


📝 License

This project is provided as-is for educational and development purposes. Please refer to the LICENSE file for full details.


👤 Support & Contact

For questions, issues, or suggestions:

  1. Check Existing Issues - Browse GitHub Issues for solutions
  2. Create a New Issue - Report bugs or request features
  3. Discussion Forum - Use Discussions tab for general questions

🎓 Learning Resources

This project demonstrates:

  • Compiler Construction - AST generation and parsing
  • Language Processing - Multi-language support
  • Web Development - Flask backend integration
  • Graph Visualization - Graphviz output generation
  • Shell Scripting - Dynamic language detection

Perfect for students learning about parsing, compilers, or code analysis tools!


📈 Roadmap

  • Support for additional languages (Go, Rust, TypeScript)
  • Interactive tree visualization
  • Symbol table generation
  • Code metrics analysis
  • Docker containerization
  • REST API endpoints

Made with ❤️ for code visualization enthusiasts

Verify installations:

node -v
npm -v
python3 --version
dot -V
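
The same check can be scripted. The sketch below only tests whether each command is on PATH and does not check versions; the helper name missing_tools is hypothetical.

```python
import shutil

# Command names used in this README; "dot" is the Graphviz renderer.
REQUIRED_TOOLS = ("node", "npm", "python3", "dot")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("All tools found." if not missing else "Missing: " + ", ".join(missing))
```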

📦 Installation Steps

1️⃣ Clone Repository

git clone <your-repo-url> lpccProj  # clone into lpccProj, the directory name used in this README
cd lpccProj

2️⃣ Install Tree-sitter CLI

sudo npm install -g tree-sitter-cli

Verify:

tree-sitter --version

3️⃣ Setup Python Virtual Environment

python3 -m venv venv
source venv/bin/activate

4️⃣ Install Python Dependencies

pip install flask  # --break-system-packages is unnecessary inside the activated venv

▶️ Running the Project

🟢 Step 1: Start Flask Server

python app.py

Output:

Running on http://127.0.0.1:5000

🟢 Step 2: Open Browser

Go to:

http://127.0.0.1:5000

🟢 Step 3: Upload Code File

Upload a file with one of the supported extensions:

  • .c
  • .cpp
  • .java
  • .js
  • .py

📊 Outputs Explained

📄 tree.txt

  • Raw AST in text format
  • Shows hierarchical structure of code

Example:

(function_definition
  (for_statement
    (if_statement

🌳 tree.png

  • Visual representation of AST
  • Generated using Graphviz

🧾 tree.dot

  • Intermediate graph format
  • Used to generate PNG
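
Rendering the DOT file to PNG uses Graphviz's dot command with its -Tpng and -o options. The wrapper below is a sketch; the function names dot_command and render_png are hypothetical, and whether the project invokes dot from parser.sh or from visualize.py is not stated in this README.

```python
import subprocess


def dot_command(dot_path="tree.dot", png_path="tree.png"):
    """Build the Graphviz command line that renders a DOT file to PNG."""
    return ["dot", "-Tpng", str(dot_path), "-o", str(png_path)]


def render_png(dot_path="tree.dot", png_path="tree.png"):
    """Run Graphviz; raises CalledProcessError if rendering fails."""
    subprocess.run(dot_command(dot_path, png_path), check=True)
```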

🧠 Core Components

🔹 parser.sh

Handles:

  • Language detection
  • Grammar selection
  • Parsing execution

🔹 visualize.py

  • Converts text AST → graph
  • Uses Graphviz
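
One way to turn an S-expression parse tree into Graphviz DOT is sketched below. It is not the project's visualize.py: it assumes tree.txt holds Tree-sitter style S-expressions such as (function_definition (for_statement ...)), and it ignores field names and source positions. The function name sexp_to_dot is hypothetical.

```python
import re

# Matches either "(" followed by a node type, or a closing ")".
TOKEN = re.compile(r"\(\s*([A-Za-z_]\w*)|\)")


def sexp_to_dot(sexp):
    """Convert an S-expression parse tree into DOT source text."""
    lines = ["digraph AST {", "  node [shape=box];"]
    stack = []   # ids of the nodes that are currently open
    next_id = 0
    for match in TOKEN.finditer(sexp):
        node_type = match.group(1)
        if node_type is not None:
            # Opening parenthesis: create a node and link it to its parent.
            lines.append(f'  n{next_id} [label="{node_type}"];')
            if stack:
                lines.append(f"  n{stack[-1]} -> n{next_id};")
            stack.append(next_id)
            next_id += 1
        elif stack:
            # Closing parenthesis: the most recently opened node is complete.
            stack.pop()
    lines.append("}")
    return "\n".join(lines)
```

Each node gets a numeric id so that repeated node types (for example several identifier nodes) stay distinct in the graph.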

🔹 app.py

  • Flask backend
  • Handles file upload
  • Triggers parsing pipeline
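
Before saving an upload, a handler would typically validate the file name. The helper below is a standard-library sketch of that step, not code from app.py: the name upload_destination, the sanitizing rules, and the uploads default are assumptions based on the project structure shown in this README.

```python
import re
from pathlib import Path

# Extensions supported according to this README.
ALLOWED_EXTENSIONS = {".c", ".cpp", ".java", ".js", ".py"}


def upload_destination(filename, upload_dir="uploads"):
    """Return a safe path inside upload_dir for an uploaded file name.

    Raises ValueError for unsupported extensions or unusable names.
    """
    # Keep only the final path component so "../../x.py" cannot escape upload_dir.
    name = Path(filename.replace("\\", "/")).name
    # Replace anything outside a conservative character set.
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name)
    if not name or Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported upload: {filename!r}")
    return Path(upload_dir) / name
```

Rejecting unsupported extensions at this point avoids handing parser.sh a file it has no grammar for.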

🎯 Supported Languages

| Language   | Extension |
|------------|-----------|
| C          | .c        |
| C++        | .cpp      |
| Java       | .java     |
| JavaScript | .js       |
| Python     | .py       |

⚠️ Common Issues & Fixes

❌ Tree-sitter not found

sudo npm install -g tree-sitter-cli

❌ Node not found

node -v

If the command is not found, install Node.js and npm (see System Requirements).


❌ Flask not found

source venv/bin/activate
pip install flask

❌ Image not opening (WSL)

Use Windows Explorer:

explorer.exe tree.png

🎤 Viva Explanation

You can explain the project like this:

"This project uses Tree-sitter to parse multiple programming languages. It selects the grammar dynamically based on the file extension, generates an abstract syntax tree, and visualizes it with Graphviz to make the code structure easier to understand."


🚀 Future Improvements

  • Interactive tree visualization
  • Syntax highlighting in UI
  • Real-time parsing
  • Support for more languages

🙌 Conclusion

This project demonstrates:

  • Compiler design concepts
  • Multi-language parsing
  • AST visualization
  • Real-world tool usage
