Evaluation framework for LLM-generated network configurations.

LLMs can generate network configurations, but how do you know if the output is
correct, secure, and follows best practices? configeval provides automated
syntax validation, security auditing, completeness checks, and compliance
scoring against industry standards.

When using LLMs to generate network device configurations (Cisco IOS, JunOS, etc.), common failure modes include:
- Syntax errors: Mismatched braces, broken indentation, orphaned lines
- Security gaps: Plaintext passwords, telnet enabled, weak crypto, permit-any ACLs
- Missing essentials: No hostname, no NTP, no logging, no login banners
- Best practice violations: No CoPP, unused interfaces left up, no VTY ACLs
- Internal inconsistency: ACLs referenced but not defined, VLANs used but not created

configeval checks for each of these failure modes automatically and reports a weighted score from 0.0 to 1.0.

```bash
pip install configeval
```

Or from source:

```bash
git clone https://github.com/cwccie/configeval.git
cd configeval
pip install -e ".[dev]"
```

```bash
# Evaluate a Cisco IOS config
configeval check router.cfg

# Evaluate with JSON output
configeval check router.cfg --format json

# Evaluate a JunOS config
configeval check firewall.cfg --vendor junos

# Run only security checks
configeval check router.cfg --checks security

# Markdown report
configeval check router.cfg --format md
```

```python
from configeval import ConfigEvaluator

evaluator = ConfigEvaluator()
config = open("router.cfg").read()
result = evaluator.evaluate(config, vendor="cisco")

print(f"Score: {result.score:.1%}")
print(f"Passed: {result.passed}")
print(f"Summary: {result.summary}")

for check in result.failed_checks:
    print(f"  [{check.severity.upper()}] {check.name}: {check.message}")
```

```python
# Only run security and completeness checks
evaluator = ConfigEvaluator(enabled_checks=["security", "completeness"])
result = evaluator.evaluate(config)
```

```python
# Make warnings count as much as errors in scoring
evaluator = ConfigEvaluator(severity_weights={
    "error": 1.0,
    "warning": 1.0,
    "info": 0.5,
})
```

Syntax checks:

| Check | Description |
|---|---|
| `syntax.junos_braces` | Matching braces in JunOS configs |
| `syntax.ios_indentation` | Consistent indentation in IOS configs |
| `syntax.orphaned_lines` | Trailing backslashes, misplaced separators |
| `syntax.section_closure` | Empty interface/router sections |
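
To make the brace check concrete, here is a minimal sketch of the kind of logic a check like `syntax.junos_braces` could use. It is not configeval's implementation: the function name `find_brace_errors` is hypothetical, and the sketch deliberately ignores braces inside quoted strings or comments.

```python
def find_brace_errors(config: str) -> list[str]:
    """Return human-readable brace errors; an empty list means balanced."""
    errors = []
    open_lines = []  # stack of line numbers where a '{' is still open
    for lineno, line in enumerate(config.splitlines(), start=1):
        for ch in line:
            if ch == "{":
                open_lines.append(lineno)
            elif ch == "}":
                if open_lines:
                    open_lines.pop()
                else:
                    errors.append(f"line {lineno}: unexpected '}}'")
    # Anything left on the stack was opened but never closed.
    errors.extend(f"line {n}: '{{' never closed" for n in open_lines)
    return errors


good = "system {\n    host-name fw1;\n}\n"
bad = "system {\n    services {\n        ssh;\n}\n"
print(find_brace_errors(good))
print(find_brace_errors(bad))
```

A stack of line numbers, rather than a simple counter, lets the report point at the opening brace that was left unclosed.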

Security checks:

| Check | Description |
|---|---|
| `security.plaintext_passwords` | Type 0 or unencrypted passwords |
| `security.password_encryption` | `service password-encryption` enabled |
| `security.weak_crypto` | DES/3DES, MD5, SSHv1, TLS 1.0 |
| `security.permit_any_acl` | `permit any any` ACL rules |
| `security.snmp_community` | SNMP v1/v2c community strings (especially `public`/`private`) |
| `security.telnet_enabled` | Telnet on VTY lines |
| `security.ssh_configured` | SSH is set up |
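
Two of these checks can be approximated with line-level regular expressions. The sketch below is an illustration only, assuming Cisco IOS syntax. The patterns and the `scan_security` function are hypothetical and not taken from configeval, and real configs have more password forms than these patterns cover.

```python
import re

# "password <secret>" or "password 0 <secret>". A different type digit
# followed by a value (for example 5 or 7) is treated as not plaintext.
PLAINTEXT_PASSWORD = re.compile(r"\bpassword\s+(?:0\s+\S+|(?!\d+\s)\S+)\s*$")

# "transport input" lines that allow telnet, either by name or via "all".
TELNET_TRANSPORT = re.compile(r"^\s*transport\s+input\s+.*\b(?:telnet|all)\b")


def scan_security(config: str) -> list[tuple[int, str]]:
    """Return (line number, issue) pairs for the two patterns above."""
    findings = []
    for lineno, line in enumerate(config.splitlines(), start=1):
        if PLAINTEXT_PASSWORD.search(line):
            findings.append((lineno, "plaintext password"))
        if TELNET_TRANSPORT.search(line):
            findings.append((lineno, "telnet allowed on line"))
    return findings


sample = """\
enable password cisco123
username admin password 7 0822455D0A16
line vty 0 4
 transport input telnet ssh
"""
for lineno, issue in scan_security(sample):
    print(f"line {lineno}: {issue}")
```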

Completeness checks:

| Check | Description |
|---|---|
| `completeness.hostname` | Hostname configured |
| `completeness.logging` | Logging host/buffer configured |
| `completeness.ntp` | NTP server configured |
| `completeness.banner` | Login banner present |
| `completeness.aaa` | AAA new-model enabled |
| `completeness.line_password` | Console/VTY authentication |
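
Completeness checks are presence tests: the config fails if an expected statement is absent. A rough sketch for IOS-style configs follows. The check names come from the table above, but the regex patterns and the `missing_essentials` helper are assumptions made for illustration.

```python
import re

# Hypothetical patterns; each must match at least one line of the config.
REQUIRED = {
    "completeness.hostname": r"^hostname \S+",
    "completeness.ntp": r"^ntp server \S+",
    "completeness.logging": r"^logging (?:host \S+|buffered|\d{1,3}\.)",
    "completeness.banner": r"^banner (?:login|motd) ",
    "completeness.aaa": r"^aaa new-model",
}


def missing_essentials(config: str) -> list[str]:
    """Return the names of required statements not found in the config."""
    return [
        name
        for name, pattern in REQUIRED.items()
        if not re.search(pattern, config, re.MULTILINE)
    ]


sample = "hostname edge-r1\nntp server 10.0.0.1\nlogging buffered 16384\n"
print(missing_essentials(sample))
```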

Best practice checks:

| Check | Description |
|---|---|
| `best_practice.control_plane_policing` | CoPP configured |
| `best_practice.unused_interfaces_shutdown` | Unused interfaces administratively down |
| `best_practice.logging_buffer_size` | Buffer >= 16384 bytes |
| `best_practice.vty_acl` | `access-class` on VTY lines |
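
The buffer-size rule is a threshold test rather than a presence test. The sketch below is one way it might look, assuming the IOS `logging buffered <size>` form. The `logging_buffer_ok` helper is hypothetical, and treating a line without an explicit size as a failure is an assumption.

```python
import re

MIN_BUFFER_BYTES = 16384  # threshold from the table above

# Captures the optional numeric size after "logging buffered".
LOGGING_BUFFERED = re.compile(r"^logging buffered(?:\s+(\d+))?", re.MULTILINE)


def logging_buffer_ok(config: str, minimum: int = MIN_BUFFER_BYTES) -> bool:
    """True if some 'logging buffered <size>' line sets size >= minimum."""
    sizes = [
        int(m.group(1))
        for m in LOGGING_BUFFERED.finditer(config)
        if m.group(1) is not None
    ]
    return any(size >= minimum for size in sizes)


print(logging_buffer_ok("logging buffered 4096\n"))
print(logging_buffer_ok("logging buffered 65536 informational\n"))
```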

Consistency checks:

| Check | Description |
|---|---|
| `consistency.acl_references` | Referenced ACLs exist |
| `consistency.route_map_references` | Prefix-lists in route-maps exist |
| `consistency.vlan_references` | Referenced VLANs are defined |
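
Consistency checks compare two sets: names that are referenced and names that are defined. As an illustration for ACLs in IOS syntax, the sketch below collects both sets and reports the difference. The regexes and the `undefined_acls` function are assumptions, not configeval internals.

```python
import re

# Named ACL definitions and numbered "access-list <n> ..." entries.
ACL_DEFINED = re.compile(
    r"^\s*(?:ip access-list (?:standard|extended) (\S+)|access-list (\d+)\s)",
    re.MULTILINE,
)
# ACLs applied to interfaces (access-group) or lines (access-class).
ACL_REFERENCED = re.compile(
    r"^\s*(?:ip access-group|access-class) (\S+)", re.MULTILINE
)


def undefined_acls(config: str) -> set[str]:
    """Return ACL names that are referenced but never defined."""
    defined = {m.group(1) or m.group(2) for m in ACL_DEFINED.finditer(config)}
    referenced = {m.group(1) for m in ACL_REFERENCED.finditer(config)}
    return referenced - defined


sample = """\
interface GigabitEthernet0/1
 ip access-group EDGE-IN in
ip access-list extended MGMT
 permit tcp 10.0.0.0 0.0.0.255 any eq 22
line vty 0 4
 access-class MGMT in
"""
print(sorted(undefined_acls(sample)))
```

The same referenced-minus-defined pattern would apply to prefix-lists and VLANs, with different regexes.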
Define your own rules with regex patterns or callables:

```python
from configeval import ConfigEvaluator, Rule, RuleSet

custom = RuleSet(name="my-rules")

# Flag configs with SNMP traps to a specific host
custom.add_rule(Rule(
    name="custom.snmp_trap_host",
    pattern=r"snmp-server host 10\.0\.0\.99",
    severity="warning",
    message="SNMP traps sent to deprecated monitoring server",
))

# Check that a specific feature IS present (invert=True fails if missing)
custom.add_rule(Rule(
    name="custom.requires_cdp_disabled",
    pattern=r"no cdp run",
    severity="warning",
    message="CDP should be disabled globally",
    invert=True,
))

# Use a callable for complex logic
custom.add_rule(Rule(
    name="custom.max_vty_lines",
    pattern=lambda cfg: cfg.count("line vty") > 2,
    severity="info",
    message="More than 2 VTY line groups defined",
))

evaluator = ConfigEvaluator(custom_rules=custom)
result = evaluator.evaluate(config)
```

```python
from configeval.reporters import TextReporter, JSONReporter, MarkdownReporter

# Plain text (default)
print(TextReporter().render(result))

# JSON (for CI/CD pipelines)
print(JSONReporter().render(result))

# Markdown (for documentation)
print(MarkdownReporter().render(result))
```

The score (0.0 to 1.0) is calculated as a weighted average of all checks:

| Severity | Default Weight |
|---|---|
| error | 1.0 |
| warning | 0.5 |
| info | 0.2 |

A config passes with a score >= 0.70 (70%).
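
One plausible reading of "weighted average" is the weight of the passed checks divided by the weight of all checks. The sketch below works through that arithmetic with the default weights. The exact formula configeval uses is not spelled out here, so treat `weighted_score` and the handling of an empty check list as assumptions.

```python
DEFAULT_WEIGHTS = {"error": 1.0, "warning": 0.5, "info": 0.2}
PASS_THRESHOLD = 0.70


def weighted_score(results, weights=DEFAULT_WEIGHTS):
    """results: iterable of (severity, passed) pairs, one per check."""
    results = list(results)
    total = sum(weights[severity] for severity, _ in results)
    if total == 0:
        return 1.0  # assumption: no applicable checks counts as a full score
    earned = sum(weights[severity] for severity, ok in results if ok)
    return earned / total


checks = [
    ("error", True),
    ("error", False),
    ("warning", True),
    ("info", False),
]
score = weighted_score(checks)  # (1.0 + 0.5) / (1.0 + 1.0 + 0.5 + 0.2)
print(f"Score: {score:.1%}, passed: {score >= PASS_THRESHOLD}")
```

Under this reading, a single failed `error` check costs as much as two failed `warning` checks or five failed `info` checks.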

```bash
docker build -t configeval .
docker run -v "$(pwd)":/data configeval check /data/router.cfg
```

```bash
pip install -e ".[dev]"
pytest -v --cov=configeval
ruff check src/ tests/
```

MIT License. Copyright (c) 2026 Corey Wade.