MineSight

Infrastructure Visibility & Predictive Maintenance for Mine Site Operations

MineSight simulates the field communications infrastructure of an open-pit gold mine (~250 assets across 7 operational zones) and applies machine learning to predict equipment failures before they happen. It covers the full technician workflow: asset registry, telemetry monitoring, commissioning close-out, spares management, and automated PDF reporting. It runs entirely offline on a single machine.

Screenshot Gallery

TODO: Run make run, open http://localhost:8501, and capture screenshots into docs/screenshots/. Pages to capture: Site Map (🗺), Asset Registry (📦), Telemetry (📊), Predictions (🔮), Commissioning (🛠), Spares & RMA (🧰), Reports (📄).

Architecture

graph TB
    subgraph UI["Streamlit UI :8501"]
        P1[🗺 Site Map]
        P2[📦 Asset Registry]
        P3[📊 Telemetry]
        P4[🔮 Predictions]
        P5[🛠 Commissioning]
        P6[🧰 Spares & RMA]
        P7[📄 Reports]
    end

    subgraph API["FastAPI :8000"]
        A1[/api/assets]
        A2[/api/telemetry]
        A3[/api/commissioning]
        A4[/api/predictions]
        A5[/api/parts + /api/rma]
        A6[/api/reports]
    end

    subgraph ML["ML Pipeline"]
        M1[Isolation Forest\nAnomaly Detector]
        M2[Gradient Boosting\nRUL Regressor]
        M3[Health Score\n0-100]
    end

    subgraph DB["SQLite Database"]
        D1[(assets)]
        D2[(telemetry ~6.5M rows)]
        D3[(events)]
        D4[(commissioning_records)]
        D5[(parts + rma_records)]
        D6[(predictions)]
    end

    subgraph GEN["Synthetic Generator"]
        G1[Asset Factory\n250 assets]
        G2[Telemetry Sim\n5-min intervals + failures]
        G3[Photo / PDF\nPlaceholders]
    end

    UI -->|HTTP REST| API
    API --> DB
    ML --> D6
    GEN --> DB

Quickstart

Windows (Recommended)

Step 1 — Install Python 3.11+

Download from https://www.python.org/downloads/ and install. During installation, check "Add Python to PATH".

Step 2 — Set up the project

Open PowerShell in the mineguard-ot/ folder:

# Allow script execution (first time only)
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

# Run the setup script (installs Python if needed, creates venv, installs packages)
.\setup_windows.ps1

Step 3 — Run the full stack

.\run.ps1 seed     # Generate ~250 assets + ~6.5M telemetry rows (~3-5 min)
.\run.ps1 train    # Train ML models and score all assets (~2-4 min)
.\run.ps1 run      # Start API (:8000) and UI (:8501) in separate windows

Open:

Dashboard: http://localhost:8501
API docs: http://localhost:8000/docs

macOS / Linux (make)

make install    # Create venv and install pinned dependencies
make seed       # Generate ~250 assets + ~6.5M telemetry rows
make train      # Train ML models and score all assets
make run        # Start API on :8000 and UI on :8501

Individual commands

| Task | Windows | macOS/Linux |
| --- | --- | --- |
| Seed database | `.\run.ps1 seed` | `make seed` |
| Train ML | `.\run.ps1 train` | `make train` |
| Start API | `.\run.ps1 api` | `make api` |
| Start UI | `.\run.ps1 ui` | `make ui` |
| Run both | `.\run.ps1 run` | `make run` |
| Run tests | `.\run.ps1 test` | `make test` |

Feature List

Asset Management

Asset registry with ~250 OT assets across 7 mine zones (PIT, MIL, HAU, GAT, OFF, WRK, CAM)
Label convention enforcement: <ZONE>-<SUBZONE>-<TYPE>-<SEQ> (e.g. PIT-BENCH12-AP-007)
Printable PNG labels with QR code encoding mineguard://asset/{id}
Full asset CRUD with topology (upstream/downstream links)
Parent/child enclosure relationships
Status lifecycle: PLANNED → INSTALLED → COMMISSIONED → OPERATIONAL → DEGRADED → FAULTED → DECOMMISSIONED
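
A minimal sketch of how the label convention and QR payload above could be checked, not the project's actual validator. The zone codes and the `mineguard://asset/{id}` scheme come from this README; the character rules for subzone and type, the three-digit sequence, and the function names `parse_label` and `qr_payload` are assumptions inferred from the single example label.

```python
import re

# Zone codes are listed in the README. The subzone/type/sequence patterns are
# assumptions based on the one example label, PIT-BENCH12-AP-007.
ZONES = ("PIT", "MIL", "HAU", "GAT", "OFF", "WRK", "CAM")
LABEL_RE = re.compile(
    r"^(?P<zone>" + "|".join(ZONES) + r")"
    r"-(?P<subzone>[A-Z0-9]+)"
    r"-(?P<type>[A-Z]{2,4})"
    r"-(?P<seq>\d{3})$"
)


def parse_label(label: str) -> dict:
    """Split a label into its four parts, or raise ValueError if it is malformed."""
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"label does not follow ZONE-SUBZONE-TYPE-SEQ: {label!r}")
    return match.groupdict()


def qr_payload(asset_id: str) -> str:
    """Build the string encoded in the printable QR code."""
    return f"mineguard://asset/{asset_id}"


print(parse_label("PIT-BENCH12-AP-007"))
print(qr_payload("a1b2c3"))  # "a1b2c3" is a made-up asset id
```

Anchoring the pattern with `^` and `$` rejects labels with trailing text, which matters when labels are typed by hand in the field.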

Telemetry & Monitoring

90 days of telemetry per asset at 5-minute intervals (~6.5M rows in total)
Asset-type-appropriate metrics (RSSI/SNR for radios, fps/storage for cameras, etc.)
Environmental correlations (pit temp spikes in afternoon, workshop AP client spikes on day shift)
Event log: LINK_DOWN, HIGH_TEMP, LOW_RSSI, CAMERA_OFFLINE, POWER_LOSS, REPAIR, etc.
Interactive Plotly time-series charts with event overlays
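
The ~6.5M row figure can be reproduced with simple arithmetic. The sketch below assumes one telemetry row per asset per 5-minute interval; if the generator writes one row per metric instead, the total would scale with the number of metrics per asset.

```python
ASSETS = 250        # approximate asset count from this README
DAYS = 90           # telemetry history per asset
INTERVAL_MIN = 5    # sampling interval in minutes

samples_per_day = 24 * 60 // INTERVAL_MIN     # 288 samples per asset per day
samples_per_asset = DAYS * samples_per_day    # 25,920 samples per asset
total = ASSETS * samples_per_asset            # 6,480,000 rows

print(f"{total:,}")  # 6,480,000
```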

Predictive ML

Isolation Forest anomaly detection (one model per asset type)
Gradient Boosting remaining-useful-life (RUL) regressor (predicted days until failure, clipped to 60)
Feature engineering: rolling stats (1h/24h/7d), rate-of-change, peer z-scores, event history, asset age
Composite health score 0–100 with risk tiers (LOW/MEDIUM/HIGH/CRITICAL)
Fleet health heatmap and top-20 at-risk table
"Run predictions now" on-demand rescoring
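
One way a composite health score and risk tier could be derived from the two model outputs is sketched below. The 60-day clip, the 0 to 100 range, and the tier names are from this README; the weighting, the assumption that the anomaly score is normalised to 0..1, and the tier thresholds are all hypothetical and are not taken from the project code.

```python
RUL_CAP_DAYS = 60  # README: predicted days to failure are clipped to 60


def health_score(anomaly_score: float, rul_days: float, rul_weight: float = 0.6) -> float:
    """Blend a 0..1 anomaly score and a clipped RUL into a 0..100 score.

    The linear blend and the default weight are illustrative assumptions.
    """
    rul = min(max(rul_days, 0.0), RUL_CAP_DAYS)
    anomaly = min(max(anomaly_score, 0.0), 1.0)
    score = 100.0 * (rul_weight * rul / RUL_CAP_DAYS + (1.0 - rul_weight) * (1.0 - anomaly))
    return round(score, 1)


def risk_tier(score: float) -> str:
    """Map a health score to a tier; the cut-offs are hypothetical."""
    if score >= 75:
        return "LOW"
    if score >= 50:
        return "MEDIUM"
    if score >= 25:
        return "HIGH"
    return "CRITICAL"


score = health_score(anomaly_score=0.5, rul_days=30)
print(score, risk_tier(score))
```

Clipping both inputs before blending keeps the score inside 0 to 100 even when a model emits an out-of-range value.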

Commissioning

Full commissioning checklist (power-up, connectivity, labeling, P2P test, OTDR, RF signal)
Photo upload (minimum 3 required)
Deficiency capture with severity
Redline notes and close-out notes (required)
PDF close-out package generation (ReportLab)
Asset status transition on sign-off
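
The sign-off rules above (minimum 3 photos, required close-out notes, status transition) could be enforced with a gate like the following. This is a sketch under stated assumptions: `power_up_ok` and `connectivity_ok` mirror the ER diagram, while the `CloseOut` class, the `photos` and `close_out_notes` field names, and the choice of INSTALLED to COMMISSIONED as the sign-off transition are hypothetical.

```python
from dataclasses import dataclass, field

MIN_PHOTOS = 3  # README: minimum 3 photos required


@dataclass
class CloseOut:
    """Hypothetical close-out payload; only the *_ok flags mirror the ER diagram."""
    power_up_ok: bool
    connectivity_ok: bool
    close_out_notes: str
    photos: list[str] = field(default_factory=list)


def sign_off_blockers(rec: CloseOut) -> list[str]:
    """Return the reasons a record cannot be signed off (empty list means ready)."""
    blockers = []
    if not (rec.power_up_ok and rec.connectivity_ok):
        blockers.append("checklist incomplete")
    if len(rec.photos) < MIN_PHOTOS:
        blockers.append(f"need at least {MIN_PHOTOS} photos, have {len(rec.photos)}")
    if not rec.close_out_notes.strip():
        blockers.append("close-out notes are required")
    return blockers


def sign_off(rec: CloseOut, current_status: str) -> str:
    """Return the new asset status, or raise ValueError listing every blocker."""
    blockers = sign_off_blockers(rec)
    if current_status != "INSTALLED":  # assumed precondition for commissioning
        blockers.append(f"cannot commission from status {current_status}")
    if blockers:
        raise ValueError("; ".join(blockers))
    return "COMMISSIONED"


record = CloseOut(True, True, "Installed per drawing.", ["a.jpg", "b.jpg", "c.jpg"])
print(sign_off(record, "INSTALLED"))
```

Collecting all blockers before raising lets the UI show a technician everything that is missing in one pass.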

Spares & RMA

Spare parts inventory with reorder thresholds
ML-driven spares advisor: recommends reorders based on predicted failures in configurable horizon
RMA pipeline: SENT → IN_REPAIR → REPAIRED/REPLACED/SCRAPPED
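
The spares advisor idea can be illustrated with a small function that counts failures predicted inside the horizon and compares projected stock with the reorder threshold. The field names `quantity_on_hand` and `reorder_threshold` follow the ER diagram; the function name, the one-spare-per-failure rule, keying parts by asset type, and the 14-day default horizon are assumptions made for illustration.

```python
from collections import Counter


def recommend_reorders(predictions, parts, horizon_days=14):
    """Suggest reorder quantities per asset type.

    predictions: iterable of (asset_type, predicted_days_to_failure)
    parts: dict of asset_type -> {"quantity_on_hand": int, "reorder_threshold": int}
    Assumes each predicted failure consumes exactly one spare of that type.
    """
    demand = Counter(t for t, days in predictions if days <= horizon_days)
    recommendations = {}
    for asset_type, stock in parts.items():
        projected = stock["quantity_on_hand"] - demand[asset_type]
        shortfall = stock["reorder_threshold"] - projected
        if shortfall > 0:
            recommendations[asset_type] = shortfall
    return recommendations


# Made-up inputs: two AP failures and one FIB failure fall inside 14 days.
predictions = [("AP", 5.0), ("AP", 12.0), ("AP", 40.0), ("FIB", 3.0)]
parts = {
    "AP": {"quantity_on_hand": 3, "reorder_threshold": 2},
    "FIB": {"quantity_on_hand": 5, "reorder_threshold": 1},
}
print(recommend_reorders(predictions, parts))  # {'AP': 1}
```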

Reporting

Site health summary PDF
Asset commissioning report PDF
Weekly technician handover PDF (open deficiencies + predicted failures)

Mapping to OT Technician Responsibilities

| OT Technician JD Requirement | MineSight Feature |
| --- | --- |
| Install, commission, and test field devices | Commissioning page with full checklist, OTDR/RF capture, sign-off workflow |
| Meticulous field labeling | Label generator enforcing `ZONE-SUBZONE-TYPE-SEQ` convention, printable PNG label with QR code |
| Redlines and as-built drawings | Redline notes field + as-built PDF generation on commissioning |
| Close-out notes and handover reports | Required close-out notes field; weekly handover PDF with open deficiencies |
| Photo documentation | Photo upload (min 3), placeholder generation, stored per asset |
| Deficiency tracking until resolved | First-class deficiency entity in commissioning records, listed in Reports page |
| Parts lists and spare kit management | Parts inventory per asset type, reorder alerts, spares advisor panel |
| RMA management for failed hardware | Full RMA pipeline with status tracking |
| Troubleshoot network and field faults | Event log (LINK_DOWN, LOW_RSSI, HIGH_TEMP, etc.), telemetry drill-down |
| Monitor asset health across site | Fleet health heatmap, top-20 at-risk assets, site map with risk-tier colour coding |
| Understand network topology | Upstream/downstream topology view per asset |
| Work across pit, mill, haul-road environments | 7 mine zones with realistic subzones and asset type distributions |
| Fiber OTDR testing | Fiber OTDR result captured per commissioning, FIB asset type with loss/splice data |
| Point-to-point radio commissioning | P2P test checkbox + RF signal capture (dBm) |
| Predictive vs reactive maintenance | ML RUL regressor predicts failure days in advance; spares pre-positioned |

Tech Stack

| Layer | Technology |
| --- | --- |
| Language | Python 3.11+ |
| Backend | FastAPI + Uvicorn |
| Database | SQLite via SQLAlchemy 2.x ORM |
| UI | Streamlit |
| ML | scikit-learn 1.4+ (Isolation Forest, Gradient Boosting) |
| Data | pandas, numpy |
| Charts | Plotly |
| Maps | Folium + streamlit-folium |
| Synthetic data | Faker + custom generators |
| PDF export | ReportLab |
| Images | Pillow |
| Labels | qrcode |
| Config | Pydantic Settings + .env |
| Testing | pytest |
| Lint | ruff, black |

Data Model ER Diagram

erDiagram
    assets {
        string id PK
        string label UK
        enum asset_type
        enum zone
        string subzone
        string manufacturer
        string model
        string serial_number
        float latitude
        float longitude
        enum status
        string upstream_asset_id FK
    }
    fiber_segments {
        string id PK
        string asset_id FK
        enum fiber_type
        float length_m
        float measured_loss_db
        int splice_count
    }
    telemetry {
        bigint id PK
        string asset_id FK
        datetime timestamp
        string metric_name
        float value
    }
    events {
        string id PK
        string asset_id FK
        datetime timestamp
        enum event_type
        enum severity
    }
    commissioning_records {
        string id PK
        string asset_id FK
        string technician_name
        bool power_up_ok
        bool connectivity_ok
        float fiber_otdr_result_db
        float rf_signal_dbm
    }
    parts {
        string id PK
        string part_number UK
        int quantity_on_hand
        int reorder_threshold
        float unit_cost_cad
    }
    rma_records {
        string id PK
        string asset_id FK
        string part_id FK
        string rma_number UK
        enum status
    }
    predictions {
        string id PK
        string asset_id FK
        float health_score
        float anomaly_score
        float predicted_days_to_failure
        enum risk_tier
    }

    assets ||--o{ telemetry : "has"
    assets ||--o{ events : "generates"
    assets ||--o{ commissioning_records : "has"
    assets ||--o{ predictions : "scored by"
    assets ||--o| fiber_segments : "is"
    assets ||--o{ rma_records : "has"
    parts ||--o{ rma_records : "in"
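
To make the long-format telemetry table concrete, here is a sketch of two of the tables above using the standard-library `sqlite3` module rather than the project's SQLAlchemy models. Column names follow the ER diagram (a subset only); the column types, the composite index, the metric name `rssi_dbm`, and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE assets (
    id TEXT PRIMARY KEY,
    label TEXT NOT NULL UNIQUE,
    status TEXT NOT NULL,
    upstream_asset_id TEXT REFERENCES assets(id)
);
CREATE TABLE telemetry (
    id INTEGER PRIMARY KEY,
    asset_id TEXT NOT NULL REFERENCES assets(id),
    timestamp TEXT NOT NULL,
    metric_name TEXT NOT NULL,
    value REAL NOT NULL
);
-- Assumed index: time-series reads filter by asset and metric, then order by time.
CREATE INDEX ix_telemetry_asset_metric_ts
    ON telemetry (asset_id, metric_name, timestamp);
""")

conn.execute(
    "INSERT INTO assets VALUES (?, ?, ?, ?)",
    ("a1", "PIT-BENCH12-AP-007", "OPERATIONAL", None),
)
conn.executemany(
    "INSERT INTO telemetry (asset_id, timestamp, metric_name, value) VALUES (?, ?, ?, ?)",
    [
        ("a1", "2024-01-01T00:00:00", "rssi_dbm", -61.0),
        ("a1", "2024-01-01T00:05:00", "rssi_dbm", -63.0),
    ],
)

rows = conn.execute(
    "SELECT timestamp, value FROM telemetry "
    "WHERE asset_id = ? AND metric_name = ? ORDER BY timestamp",
    ("a1", "rssi_dbm"),
).fetchall()
print(rows)
```

Storing one metric per row keeps the schema identical across asset types, at the cost of more rows than a wide table would need.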

ML Model Performance

Populated after running make train. Example values from a 250-asset, 90-day dataset:

| Metric | Value |
| --- | --- |
| RUL MAE | ~4.2 days |
| RUL RMSE | ~7.8 days |
| RUL R² | ~0.71 |
| Anomaly contamination | 5% |
| Isolation Forest trees | 100 per asset type |
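
For readers unfamiliar with the regression metrics in the table, the sketch below computes MAE, RMSE, and R² from their standard definitions in plain Python. The function name `rul_metrics` and the sample values are made up for illustration and are unrelated to the figures above.

```python
import math


def rul_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R²) for paired true and predicted RUL values in days."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Illustrative values only.
mae, rmse, r2 = rul_metrics([10, 20, 30, 40], [12, 18, 33, 37])
print(f"MAE={mae:.2f} days, RMSE={rmse:.2f} days, R2={r2:.3f}")
```

RMSE penalises large misses more heavily than MAE, which is why it is the larger of the two whenever the errors are not all the same size.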

Limitations & Future Work

Synthetic data only. Real-world integration would use SNMP polling (switches, APs), ONVIF for cameras, OSDP for access control, OTDR test-set APIs, and syslog/trap collectors — typically aggregated via a SCADA/DCIM middleware layer.
No OPC UA / Modbus integration. Process-area OT (PLCs, SCADA historians) is not simulated.
Single-user, no auth. A production deployment would add LDAP/AD integration, role-based access (technician vs supervisor vs read-only), and audit logging.
SQLite at scale. For >500 assets with real 5-minute SNMP polling, PostgreSQL with TimescaleDB would be appropriate.
ML trained on synthetic failures. Real anomaly detection requires transfer learning from historical fault data; the Isolation Forest here is a starting point.
No real-time streaming. A production system would use MQTT/Kafka for live telemetry ingest and WebSocket push to the dashboard.

Design Decisions

SQLite over PostgreSQL — zero-setup, file-portable, adequate for demo scale.
Streamlit over React — faster iteration for a portfolio demo; production would benefit from a proper SPA.
One Isolation Forest per asset type — avoids mixing switch and camera feature spaces; realistic for OT where device types have very different normal telemetry profiles.
Train/test split by asset — prevents data leakage from autocorrelated time series; the same approach used in industrial PHM literature.
ReportLab for PDFs — no browser dependency, works fully offline; LaTeX was considered but adds unnecessary complexity.
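
The split-by-asset decision can be sketched in plain Python as below. This is an illustration under assumptions, not the project's training code: the row shape, the function name, the 20% test fraction, and the seed are hypothetical. scikit-learn's `GroupShuffleSplit` provides comparable grouping behaviour for array inputs.

```python
import random


def split_by_asset(rows, test_fraction=0.2, seed=42):
    """Split rows so that every asset's rows land entirely in train or in test."""
    asset_ids = sorted({r["asset_id"] for r in rows})
    rng = random.Random(seed)          # seeded for a reproducible split
    rng.shuffle(asset_ids)
    n_test = max(1, round(len(asset_ids) * test_fraction))
    test_ids = set(asset_ids[:n_test])
    train = [r for r in rows if r["asset_id"] not in test_ids]
    test = [r for r in rows if r["asset_id"] in test_ids]
    return train, test


# Made-up feature rows: 10 assets with 5 time steps each.
rows = [{"asset_id": f"asset-{a}", "step": s} for a in range(10) for s in range(5)]
train, test = split_by_asset(rows)
print(len(train), len(test))
```

A random row-level split would place neighbouring 5-minute samples from the same asset on both sides, so the test score would mostly measure how well the model memorised that asset.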
