Pre-computed and on-demand vector embeddings for networking concepts — protocol specs, CLI commands, RFC summaries, and vendor documentation snippets — designed for RAG pipelines, similarity search, and ML applications.
General-purpose embedding models can treat "BGP" and "OSPF" as little more than unrelated tokens. Network engineers building AI-powered tools need embeddings that capture the semantic relationships between networking concepts: that BGP and OSPF are both routing protocols, that VXLAN and MPLS are both tunneling/overlay technologies, and that the "show ip route" command is related to routing tables.
netembeddings provides:
- A curated registry of 50+ networking concepts with metadata (category, related terms, RFCs)
- Pre-built datasets of protocol and CLI command descriptions
- A lightweight TF-IDF embedding generator (no API keys or GPU required)
- Cosine similarity search over embedding vectors using pure numpy
- A CLI for quick lookups and exploration

Install with pip:

    pip install netembeddings

Or install from source:

    git clone https://github.com/cwccie/netembeddings.git
    cd netembeddings
    pip install -e ".[dev]"

Load the built-in registry, then browse and search its concepts:

    from netembeddings import ConceptRegistry, EmbeddingStore, TFIDFGenerator
    from netembeddings.registry import build_default_registry

    # Load the built-in registry of networking concepts
    registry = build_default_registry()

    # Browse concepts
    bgp = registry.get("BGP")
    print(f"{bgp.name}: {bgp.description}")
    print(f"RFC: {bgp.rfc}")
    print(f"Related: {bgp.related_terms}")

    # List all protocols
    for concept in registry.by_category("protocol"):
        print(f" {concept.name}")

    # Fuzzy search across names, descriptions, and related terms
    results = registry.search("routing protocol")
    for concept in results[:5]:
        print(f"{concept.name} [{concept.category}]")

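How `registry.search` ranks its matches is not described here. As a rough illustration of searching across names, descriptions, and related terms, the sketch below scores stand-in records using substring hits and fuzzy ratios from the standard library. The `Concept` dataclass, the field weights, the `min_score` threshold, and `search_concepts` are all assumptions made for this example, not the library's implementation.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Concept:
    # Hypothetical record; field names mirror the attributes used above.
    name: str
    category: str
    description: str
    related_terms: list[str] = field(default_factory=list)


def _field_score(query: str, text: str) -> float:
    """Score one field: 1.0 for a substring hit, else a fuzzy ratio in [0, 1]."""
    query, text = query.lower(), text.lower()
    if not query or not text:
        return 0.0
    if query in text:
        return 1.0
    return SequenceMatcher(None, query, text).ratio()


def search_concepts(query: str, concepts: list[Concept], min_score: float = 0.5) -> list[Concept]:
    """Return concepts ranked by their best weighted field match."""
    scored = []
    for c in concepts:
        related = max((_field_score(query, t) for t in c.related_terms), default=0.0)
        # Illustrative weights: name matches count most, related terms least.
        score = max(
            1.0 * _field_score(query, c.name),
            0.8 * _field_score(query, c.description),
            0.6 * related,
        )
        if score >= min_score:
            scored.append((score, c))
    # Sort on the score only, so equal scores keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]


concepts = [
    Concept("BGP", "protocol", "Path-vector routing protocol used between autonomous systems", ["eBGP", "iBGP"]),
    Concept("OSPF", "protocol", "Link-state interior routing protocol organized into areas", ["LSA", "SPF"]),
    Concept("VXLAN", "overlay", "Layer 2 overlay that encapsulates Ethernet frames in UDP", ["VTEP", "VNI"]),
]
hits = search_concepts("routing protocol", concepts)
print([c.name for c in hits])
```

Taking the best field score rather than a sum keeps a long description from outranking an exact name match; a real implementation might weigh these differently.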

Generate TF-IDF embeddings, which need no API, and search them by similarity:

    concepts = registry.list_all()
    corpus = [c.description for c in concepts]
    generator = TFIDFGenerator(output_dim=128)
    generator.fit(corpus)

    # Create an embedding store
    store = EmbeddingStore(dimension=128)
    for concept in concepts:
        vec = generator.generate(concept.description)
        store.add(concept.name, vec)

    # Similarity search
    query_vec = generator.generate("link-state routing with areas")
    results = store.search(query_vec, top_k=5)
    for name, score in results:
        print(f" {name}: {score:.4f}")

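For readers curious what a TF-IDF plus cosine-similarity pipeline like the one above involves, here is a minimal numpy sketch. It is not the `TFIDFGenerator` or `EmbeddingStore` implementation: the tokenizer, the smoothed IDF formula, the sample descriptions, and the helper names (`tokenize`, `fit_tfidf`, `embed`, `top_k`) are assumptions, and it skips any projection to a fixed `output_dim`.

```python
import math
import re
from collections import Counter

import numpy as np


def tokenize(text: str) -> list[str]:
    """Lowercase and extract alphanumeric tokens, keeping hyphenated words whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def fit_tfidf(corpus: list[str]) -> tuple[dict[str, int], np.ndarray]:
    """Build a vocabulary index and a smoothed IDF weight per term."""
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))
    vocab = {term: i for i, term in enumerate(sorted(doc_freq))}
    n_docs = len(corpus)
    idf = np.zeros(len(vocab))
    for term, i in vocab.items():
        idf[i] = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
    return vocab, idf


def embed(text: str, vocab: dict[str, int], idf: np.ndarray) -> np.ndarray:
    """Return an L2-normalized TF-IDF vector; terms outside the vocabulary are ignored."""
    vec = np.zeros(len(vocab))
    for term, count in Counter(tokenize(text)).items():
        if term in vocab:
            vec[vocab[term]] = count * idf[vocab[term]]
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def top_k(query: np.ndarray, names: list[str], matrix: np.ndarray, k: int = 5):
    """Rank rows of `matrix` by dot product, which equals cosine similarity for unit vectors."""
    scores = matrix @ query
    order = np.argsort(-scores)[:k]
    return [(names[i], float(scores[i])) for i in order]


# Hypothetical descriptions, written for this example only.
docs = {
    "BGP": "path-vector routing protocol between autonomous systems",
    "OSPF": "link-state routing protocol that divides a network into areas",
    "VXLAN": "overlay that tunnels layer 2 frames over a layer 3 network",
}
vocab, idf = fit_tfidf(list(docs.values()))
names = list(docs)
matrix = np.stack([embed(text, vocab, idf) for text in docs.values()])
results = top_k(embed("link-state routing with areas", vocab, idf), names, matrix, k=3)
for name, score in results:
    print(f"{name}: {score:.4f}")
```

Normalizing every vector once at embedding time means a search is a single matrix-vector product. In this toy corpus the OSPF description shares three query terms, BGP shares one, and VXLAN shares none, so they rank in that order.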

Save the store to disk and load it again later:

    store.save("network_embeddings.npz")

    # Load later
    store = EmbeddingStore.load("network_embeddings.npz")

Compare two raw vectors, or two named concepts held in a store:

    from netembeddings import cosine_similarity, concept_similarity

    # Between two vectors (here generated with the fitted generator from above)
    vec_a = generator.generate("path-vector routing between autonomous systems")
    vec_b = generator.generate("link-state routing with areas")
    sim = cosine_similarity(vec_a, vec_b)

    # Between named concepts
    sim = concept_similarity("BGP", "OSPF", store)
    print(f"BGP-OSPF similarity: {sim:.4f}")

Load the bundled datasets:

    from netembeddings.datasets import get_protocol_concepts, get_command_concepts

    # 30 protocol descriptions
    protocols = get_protocol_concepts()
    for p in protocols[:3]:
        print(f"{p['name']}: {p['description'][:60]}...")

    # 20 CLI command descriptions
    commands = get_command_concepts()

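The protocol records above are read with `p['name']` and `p['description']`, so they can be split into parallel lists of names and texts for any embedding generator. The sketch below uses stand-in records with only those two keys. The sample entries and the `to_corpus` helper are illustrative assumptions, as is the idea that command records use the same keys; real records may carry additional fields.

```python
def to_corpus(records: list[dict]) -> tuple[list[str], list[str]]:
    """Split records into parallel lists of names and descriptions, skipping empty text."""
    names, texts = [], []
    for record in records:
        description = record.get("description", "").strip()
        if description:
            names.append(record["name"])
            texts.append(description)
    return names, texts


# Stand-in records shaped like the ones printed above (hypothetical content).
sample_protocols = [
    {"name": "BGP", "description": "Path-vector protocol for inter-domain routing."},
    {"name": "OSPF", "description": "Link-state interior gateway protocol."},
]
sample_commands = [
    {"name": "show ip route", "description": "Displays the IP routing table."},
    {"name": "placeholder", "description": ""},
]

names, texts = to_corpus(sample_protocols + sample_commands)
print(names)
```

The resulting `texts` list has the same shape as the `corpus` passed to `generator.fit(...)` earlier, and `names` can serve as the keys for `store.add(...)`.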

Search, list, and compare concepts from the command line:

    netembeddings search "routing protocol" --top-k 5

    # List all concepts or filter by category
    netembeddings list
    netembeddings list --category protocol

    # Find similar concepts
    netembeddings similar BGP --top-k 5

The TF-IDF generator is a lightweight fallback that needs no API keys or GPU. For production use, generate embeddings with your preferred model and store them:

    store = EmbeddingStore(dimension=3072)

    # Add vectors from any source (OpenAI, Gemini, sentence-transformers, etc.)
    # your_model is a placeholder for whichever embedding client you use
    store.add("BGP", your_model.embed("Border Gateway Protocol..."))
    store.add("OSPF", your_model.embed("Open Shortest Path First..."))
    store.save("custom_embeddings.npz")

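When vectors come from an external model, it can help to check their shape and normalize them before storing. The sketch below is independent of `EmbeddingStore`: `prepare_vector`, `save_vectors`, and `load_vectors` are hypothetical helpers, the random vectors stand in for real model output, and the on-disk layout (a `names` array plus a `vectors` matrix in one `.npz` file) is an assumption about one reasonable format, not necessarily the one the library writes.

```python
import os
import tempfile

import numpy as np


def prepare_vector(raw, dimension: int) -> np.ndarray:
    """Convert model output to a unit-length float32 vector of the expected size."""
    vec = np.asarray(raw, dtype=np.float32)
    if vec.shape != (dimension,):
        raise ValueError(f"expected shape ({dimension},), got {vec.shape}")
    norm = np.linalg.norm(vec)
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return vec / norm


def save_vectors(path: str, vectors: dict[str, np.ndarray]) -> None:
    """Write names and a stacked matrix into a single .npz archive."""
    names = np.array(list(vectors), dtype=str)
    matrix = np.stack(list(vectors.values()))
    np.savez(path, names=names, vectors=matrix)


def load_vectors(path: str) -> dict[str, np.ndarray]:
    """Read the archive back into a name -> vector mapping."""
    with np.load(path) as data:
        return {str(name): vec for name, vec in zip(data["names"], data["vectors"])}


rng = np.random.default_rng(0)  # stand-in for a real embedding model
dimension = 8
vectors = {
    "BGP": prepare_vector(rng.normal(size=dimension), dimension),
    "OSPF": prepare_vector(rng.normal(size=dimension), dimension),
}

path = os.path.join(tempfile.mkdtemp(), "custom_embeddings.npz")
save_vectors(path, vectors)
restored = load_vectors(path)
print(sorted(restored), restored["BGP"].shape)
```

Rejecting mismatched shapes up front surfaces a wrong model or a truncated response at insert time rather than as a confusing error during search.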

For development, install the dev dependencies, then run the tests and the linter:

    pip install -e ".[dev]"

    # Run tests
    pytest -v --cov=netembeddings

    # Lint
    ruff check src/ tests/

MIT License. Copyright (c) 2026 Corey Wade.