Official repository for the (submitted) APMS 2026 paper: "FlowExtract: Procedural Knowledge Extraction from Maintenance Flowcharts".
Maintenance procedures in manufacturing facilities are often documented as flowcharts in static PDFs or scanned images. These documents encode procedural knowledge essential for asset lifecycle management but remain inaccessible to modern operator support systems. While Vision-Language Models (VLMs) struggle to reconstruct complex connection topologies from such diagrams, FlowExtract offers a robust, hybrid alternative.
FlowExtract is a pipeline that deliberately separates element detection from connectivity reconstruction:
- Node Detection: A single-stage object detector (YOLOv8s) localizes and classifies flowchart symbols.
- Text Extraction: Deep-learning OCR (EasyOCR) extracts node content.
- Edge Extraction: Classical line-tracing (Hough Transform) derives directed graphs from detected arrowheads.
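The connectivity-reconstruction step can be illustrated with a minimal sketch. The data structures and the nearest-box snapping heuristic below are illustrative assumptions, not the repository's actual API: given detected node bounding boxes and traced arrow segments (with the arrowhead end known), each endpoint is snapped to the nearest node to yield directed edges.

```python
from dataclasses import dataclass

@dataclass
class Node:
    node_id: str
    box: tuple  # bounding box as (x_min, y_min, x_max, y_max)

def center(box):
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)

def nearest_node(point, nodes):
    """Snap an arrow endpoint to the node with the closest box centre."""
    px, py = point
    return min(
        nodes,
        key=lambda n: (center(n.box)[0] - px) ** 2 + (center(n.box)[1] - py) ** 2,
    )

def build_edges(nodes, arrows):
    """arrows: list of (tail_point, head_point) pairs from line tracing,
    where head_point is the detected arrowhead location."""
    edges = []
    for tail, head in arrows:
        src = nearest_node(tail, nodes)
        dst = nearest_node(head, nodes)
        if src.node_id != dst.node_id:  # drop degenerate self-loops
            edges.append((src.node_id, dst.node_id))
    return edges

# Example: one arrow from a "start" box down to a "check" box
nodes = [Node("start", (0, 0, 10, 10)), Node("check", (0, 40, 10, 50))]
arrows = [((5, 10), (5, 40))]
print(build_edges(nodes, arrows))  # → [('start', 'check')]
```

The repository's `src/utils/` spatial heuristics are likely more involved (e.g. snapping to box borders rather than centres); this sketch only conveys the decoupling of detection from connectivity.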
By prioritizing high precision over forced recall, FlowExtract is explicitly designed for Human-in-the-Loop (HITL) workflows. The system provides a highly reliable structural skeleton of the standard operating procedure, allowing human validators to efficiently restore completeness without having to untangle hallucinated cross-links.
Evaluated on a dataset of real-world ISO 5807-standardized industrial troubleshooting guides, FlowExtract substantially outperforms state-of-the-art vision-language model baselines (such as Qwen2-VL-7B and Pixtral-12B) on graph extraction tasks.
- Node Detection (F1): 98.8% (vs. best VLM: 34.0%)
- Edge Detection (F1): 66.7% (vs. best VLM: 10.7%)
- Edge Precision: 85.5%
The pipeline successfully handles dense technical terminology, tightly spaced nodes, and overlapping edges, tracing multi-branching procedural paths accurately.
The original textual content within the nodes has been computationally redacted to anonymize proprietary procedural data, while preserving the structural morphology.
- Python 3.9+
- Tesseract (may be required by EasyOCR on some operating systems)
- macOS on Apple Silicon (M-series) or a CUDA-compatible GPU is recommended for YOLO inference.
1. Clone the repository:

   ```shell
   git clone https://github.com/guille-gil/FlowExtract.git
   cd FlowExtract
   ```

2. Create a virtual environment and install dependencies:

   ```shell
   python -m venv .venv
   source .venv/bin/activate  # On Windows: .venv\Scripts\activate
   pip install -r requirements.txt
   ```

3. Download the pre-trained model weights (if hosted externally) and place them in:

   ```
   runs/detect/train/weights/best.pt
   ```
```
FlowExtract/
├── docs/                  # Auxiliary documentation and paper figures
├── data/
│   ├── input/             # Raw legacy PDFs/images and YOLO annotations
│   ├── intermediate/      # Output of intermediate pipeline stages
│   └── output/            # Final JSON graphs and metric charts
├── scripts/
│   ├── train_yolo.py      # Script for fine-tuning YOLOv8s
│   └── generate_figure.py # Qualitative validation chart generation
├── src/
│   ├── pipeline/          # Modularized extraction pipeline (Stages 1-3)
│   ├── utils/             # Bounding-box spatial heuristics & visualization
│   ├── main.py            # Main operational script
│   └── evaluate.py        # End-to-end ground-truth metric evaluation
└── README.md
```
To extract a directed graph from a raw flowchart image, run the main entry point:
```shell
python src/main.py
```

This will parse the files in `data/input/images/test/` and output the structural JSON graphs to `data/intermediate/arrows/`.
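Downstream tooling can consume the extracted graphs directly. The snippet below is a generic sketch for loading such a graph into an adjacency list; the `nodes`/`edges`/`id`/`source`/`target` field names are assumptions, so check them against the JSON the pipeline actually emits.

```python
import json

def load_graph(path):
    """Load an extracted flowchart graph into an adjacency list.
    NOTE: the key names used here are illustrative, not guaranteed
    to match FlowExtract's actual output schema."""
    with open(path) as f:
        data = json.load(f)
    adjacency = {node["id"]: [] for node in data["nodes"]}
    for edge in data["edges"]:
        adjacency[edge["source"]].append(edge["target"])
    return adjacency
```

From there, standard graph libraries (e.g. networkx) can be used to traverse or query the procedure.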
To replicate the evaluation results found in the paper, execute the evaluation script. This will compare the extracted JSON graphs against the data/input/final_annotations ground truth:
```shell
python src/evaluate.py --charts
```

Evaluation metrics will be printed to stdout, and publication-ready charts (like the ones generated for APMS) will be saved to `data/output/charts/`.
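For reference, the edge-level F1 reported above follows the standard precision/recall definition over predicted versus ground-truth edge sets. This is a generic sketch of that computation, not the repository's `evaluate.py`:

```python
def edge_f1(predicted, ground_truth):
    """F1 over directed edge sets, with edges as (source, target) pairs."""
    predicted, ground_truth = set(predicted), set(ground_truth)
    tp = len(predicted & ground_truth)          # correctly recovered edges
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

A precision-biased extractor like FlowExtract keeps the `predicted` set small and clean, which is why edge precision (85.5%) exceeds edge F1 (66.7%).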
Note: The Vision-Language Model (VLM) baseline results reported in our paper were obtained on the same dataset in our prior work. If you reference those comparisons, please cite:
```bibtex
@article{gilavalle2026procedural,
  title={Procedural Knowledge Extraction from Industrial Troubleshooting Guides Using Vision Language Models},
  author={Gil de Avalle, Guillermo and Maruster, Laura and Emmanouilidis, Christos},
  journal={arXiv preprint arXiv:2601.22754},
  year={2026}
}
```

This project's source code is licensed under the MIT License, which permits commercial use and modification provided that original attribution is retained. See the LICENSE file for full details.

