Mohsen Seyedkazemi Ardebili MSKazemi

Platform Engineer · AIOps · MLOps · LLM-Orchestrated Infrastructure
Research Fellow, University of Bologna · Bologna, Italy

I build autonomous AI systems that act on infrastructure, not just explain it. Seven years of hands-on ops in mission-critical industrial environments, followed by a PhD in HPC systems, give me a different lens: I care about correctness, observability, and production trust.

Featured Project

KubeIntellect — Autonomous Kubernetes Operations

LLM-orchestrated multi-agent framework for root cause analysis, diagnosis, and human-gated cluster operations across the full Kubernetes API surface.

LangGraph FSM supervisor with PostgreSQL checkpoints and human-in-the-loop approval gates
Dynamic Code-Generator agent: sandboxed tool synthesis and validation at runtime
Modular domain agents: logs, metrics, RBAC, lifecycle, scheduling, exec, proxy
93% tool synthesis success rate · 100% reliability across 200+ queries
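
The supervisor pattern described above (route a query to a domain agent, checkpoint each transition, and pause for human approval before a mutating action) can be sketched in plain Python. This is a minimal illustration, not KubeIntellect's implementation: it does not use LangGraph or PostgreSQL, the in-memory `checkpoints` list stands in for a persistent checkpoint store, and the `ROUTES` table, `MUTATING_AGENTS` set, and `approve` callback are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class State(Enum):
    ROUTE = "route"
    AWAIT_APPROVAL = "await_approval"
    EXECUTE = "execute"
    DONE = "done"
    REJECTED = "rejected"


@dataclass
class Run:
    query: str
    state: State = State.ROUTE
    agent: Optional[str] = None
    result: Optional[str] = None
    # Assumption: a real system would persist these rows in a database.
    checkpoints: List[dict] = field(default_factory=list)


# Hypothetical keyword-to-agent routing table (a real router would use an LLM).
ROUTES = {"log": "logs", "cpu": "metrics", "role": "rbac", "delete": "lifecycle"}
# Assumption: only these agents change cluster state and need a human gate.
MUTATING_AGENTS = {"lifecycle"}


def checkpoint(run: Run) -> None:
    """Record the current state so a run could be resumed or audited."""
    run.checkpoints.append({"state": run.state.value, "agent": run.agent})


def step(run: Run, approve: Callable[[Run], bool]) -> None:
    """Advance the finite-state machine by exactly one transition."""
    if run.state is State.ROUTE:
        query = run.query.lower()
        run.agent = next((a for k, a in ROUTES.items() if k in query), "logs")
        gated = run.agent in MUTATING_AGENTS
        run.state = State.AWAIT_APPROVAL if gated else State.EXECUTE
    elif run.state is State.AWAIT_APPROVAL:
        run.state = State.EXECUTE if approve(run) else State.REJECTED
    elif run.state is State.EXECUTE:
        # Placeholder for the domain agent's actual work.
        run.result = f"{run.agent} agent handled: {run.query}"
        run.state = State.DONE
    checkpoint(run)


def supervise(query: str, approve: Callable[[Run], bool]) -> Run:
    """Drive a query through the FSM until it finishes or is rejected."""
    run = Run(query)
    checkpoint(run)
    while run.state not in (State.DONE, State.REJECTED):
        step(run, approve)
    return run
```

In this sketch a read-only query such as `supervise("show pod logs", approve)` never calls `approve`, while `supervise("delete deployment web", approve)` stops in `REJECTED` without executing anything unless the callback returns `True`.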

Other Projects

Project	Description	Key Metrics	Stack
kube_q	CLI + Python SDK for KubeIntellect	Streaming responses, Rich TUI	Python
AOBench	Agent Operations Benchmark — role-aware, permission-enforced, trace-based HPC agent evaluation	80 tasks · 26 environments	Python, LLM Eval, MCP
GRAAFE	Graph anomaly anticipation for exascale HPC	AUC 0.91 · 1000+ nodes	Python, GCN
HazardNet	Thermal hazard prediction for datacenters	F1 0.99 · <100ms inference	Python, TCN/LSTM

Research

PhD: Design, Analysis, and Management of High-Performance Computing Systems · University of Bologna (2018–2022)

EU Projects: DECICE · Graph-Massivizer · EUROPEAN PILOT · REGALE · EPI SGA1 · SEANERGYS

Google Scholar:

Citations	h-index	i10-index
179 (154 since 2021)	7	6

Selected Publications

Title	Venue	Year	Citations
KubeIntellect: A Modular LLM-Orchestrated Agent Framework for Kubernetes Management	arXiv	2025	—
M100 ExaData: A Data Collection Campaign on CINECA's Marconi100 Tier-0 Supercomputer	Nature Scientific Data	2023	50
PM100: A Job Power Consumption Dataset of a Large-Scale Production HPC System	SC'23 Workshops	2023	21
GRAAFE: Graph Anomaly Anticipation Framework for Exascale HPC Systems	FGCS	2024	17
HazardNet: Thermal Hazard Prediction Framework for Datacenters	FGCS	2024	—
Multi-level Anomaly Prediction in Tier-0 Datacenter	ACM Computing Frontiers	2022	—

All publications →

Stack

Platform & Infrastructure

AI / ML

HPC

Observability

Academic Service

PC Member: PDP 2025 · PDP 2026 · AsHES 2026

Reviewer: IEEE TCAD · FGCS · Journal of Grid Computing · SC · ACM CF · DATE · PDP · AsHES

Supervision: 2 PhD co-advisees (ongoing) · 5 MSc theses completed · Lab of Big Data Architectures, UniBo (2020–2024)
