| title | ForgeSight | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| emoji | 🏗️ | ||||||||
| colorFrom | red | ||||||||
| colorTo | gray | ||||||||
| sdk | docker | ||||||||
| pinned | true | ||||||||
| license | mit | ||||||||
| short_description | Multimodal Civil QC Copilot on AMD MI300X + ROCm | ||||||||
| tags |
|
ForgeSight is a production-grade Agentic Quality Control (QC) Pipeline designed for civil engineering, construction, and infrastructure projects. Built exclusively for the AMD + lablab.ai Developer Hackathon, it leverages the massive 192GB VRAM of the AMD Instinct MI300X to run a state-of-the-art multimodal multi-agent workflow.
ForgeSight was explicitly designed to conquer the core objectives of this hackathon, working end-to-end and showing what AMD's compute stack can unlock:
- 🤖 Track 1: AI Agents & Agentic Workflows: We moved far beyond simple RAG. ForgeSight implements a sophisticated, coordinated 4-agent workflow (Inspector, Diagnostician, Action, Reporter) that automates the complex task of infrastructure quality control, reasoning sequentially to deliver concrete work orders.
- 🎨 Track 3: Vision & Multimodal AI: We process and understand complex high-resolution visual data using the massive memory bandwidth of AMD GPUs. ForgeSight is a true high-throughput industrial inspection application using
Qwen2-VL-7Boptimized for ROCm™. - 🚢 Extra Challenge: Ship It + Build in Public: Not only did we build in public, but we also built an agent for it. The pipeline features a 5th silent agent (the Social Agent) that automatically generates punchy, hashtag-ready X and LinkedIn posts for every inspection, tagging
@lablaband@AIatAMD.
ForgeSight is built on a distributed "Console-Agent-Compute" architecture, optimized for production stability:
- ForgeSight Console (Frontend): A React-based industrial dashboard built with Tailwind CSS and Radix UI. Hosted on Vercel for high availability and global edge performance.
- Agentic Backend (Orchestration): A hybrid FastAPI service.
- Vercel Serverless: Handles metadata, telemetry, and reporting.
- Hugging Face Spaces: Orchestrates the heavy 4-agent pipeline with longer timeouts.
- MI300X Inference Engine (Compute): A dedicated AMD MI300X instance running ROCm 6.2 and vLLM. It serves a fine-tuned Qwen2-VL-7B model, providing the "brain" for the multimodal inspections.
During development, we prioritized "Failure-Proof" engineering to handle real-world data inconsistencies:
- Defensive API Layer: Implemented global exception handling and "Graceful Fallbacks" (e.g., in-memory persistence) to ensure the UI stays up even if DB connections are interrupted.
- Robust Schema Mapping: Designed the pipeline to handle malformed agent outputs or missing visual data without crashing, ensuring site managers always get a response.
- SSL/TLS Hardening: Custom connection handlers for MongoDB Atlas to ensure secure, stable connectivity from serverless environments.
To make the agents responsive, we deployed the model using vLLM on the ROCm 6.2 stack.
- We utilized PagedAttention to handle the high VRAM requirements of the model.
- The massive 192GB VRAM of the MI300X allowed us to serve the full model without sharding, maximizing throughput for our concurrent agent calls.
- ROCm Tuning: Optimized the engine by enforcing eager execution and disabling chunked prefill to ensure flawless stability on the MI300X.
We implemented a 4-stage sequential pipeline in Python to ensure industrial-grade auditability:
- Inspector Agent: Performs the initial multimodal analysis of the image.
- Diagnostician Agent: Determines the root cause (e.g., thermal expansion).
- Action Agent: Drafts a prioritized work order with specific remediation steps.
- Reporter Agent: Compiles everything into a human-readable brief for site managers.
- Live Telemetry: Real-time visualization of GPU utilization and power consumption directly from the MI300X node.
- Agentic Transcripts: A dynamic UI that displays the "thought process" and JSON hand-offs of each agent.
- Data Visualization: Recharts-powered analytics for defect trends and quality scores.
- Hardware: AMD Instinct MI300X (192GB HBM3).
- Software Stack: ROCm 6.2, PyTorch, vLLM.
- Backend: FastAPI, Gradio, Python.
- Frontend: React, Tailwind CSS, Radix UI, Recharts.
- Persistence: MongoDB Atlas (via Motor).
- Clone the Repo:
git clone https://github.com/rasali535/hans.git - Install Frontend:
cd frontend && npm install - Install Backend:
pip install -r backend/requirements.txt - Configure Environment: Add the following to your
.envor Vercel settings:AMD_INFERENCE_URLAMD_INFERENCE_TOKENMONGO_URL
- Launch Local Dev:
- Backend:
python backend/server.py - Frontend:
npm start
- Backend:
The MI300X's 5.3 TB/s bandwidth allows ForgeSight to maintain >2500 tokens/sec throughput, enabling real-time visual inspection of massive infrastructure projects without the latency typical of cloud-based VLM APIs.
Built by Ras Ali Labs for the AMD Developer Hackathon.