graphnlp-intel is an open-source Python library and REST API that turns unstructured documents into interactive knowledge graphs using named-entity recognition, relation extraction, and GNN-based sentiment propagation.
Install the library and download the required spaCy model:
pip install graphnlp-intel
python -m spacy download en_core_web_sm

Run the pipeline in 6 lines of code:
from graphnlp import Pipeline
pipe = Pipeline(domain="finance")
result = pipe.run(["Goldman Sachs acquired a 5% stake in Microsoft for $2.3 billion."])
# Visualize, export, and summarize
result.graph.visualize("output.html") # Generates a Pyvis interactive HTML graph
result.export_json("output.json") # Exports D3 compatible JSON
print(result.summary()) # Output stats on nodes, edges, sentiment, and communities

The system processes unstructured text through a 5-stage pipeline:
| Stage | Components |
|---|---|
| 1. Ingestion | DocumentLoader, TextChunker, EmailParser |
| 2. Extraction | NERExtractor, RelationExtractor, EmbeddingExtractor |
| 3. Graph Build | GraphBuilder, CommunityDetector |
| 4. GNN | GraphGNN |
| 5. Output | Pyvis HTML, D3 JSON, Neo4j / Redis |
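
As a rough illustration of how the five stages hand data to one another, the sketch below chains stand-in functions built only on the standard library. None of these functions belong to graphnlp: the regex "extractor", the adjacency-dict "graph", and the predicate sentiment table are hypothetical placeholders, and the GNN stage is replaced by a simple average over incident edges.

```python
import json
import re
from collections import defaultdict

# Hypothetical stand-ins for the five stages; NOT the graphnlp implementation.
TRIPLE = re.compile(
    r"([A-Z]\w+(?: [A-Z]\w+)*) (acquired|sued|praised) ([A-Z]\w+(?: [A-Z]\w+)*)"
)
EDGE_SENTIMENT = {"acquired": 0.0, "sued": -1.0, "praised": 1.0}


def ingest(documents):
    """Stage 1: split each document into sentence-sized chunks."""
    chunks = []
    for doc in documents:
        chunks.extend(s for s in re.split(r"(?<=[.!?])\s+", doc.strip()) if s)
    return chunks


def extract(chunks):
    """Stage 2: pull (subject, predicate, object) triples with a toy regex."""
    triples = []
    for chunk in chunks:
        match = TRIPLE.search(chunk)
        if match:
            triples.append(match.groups())
    return triples


def build_graph(triples):
    """Stage 3: store directed edges as {subject: {object: predicate}}."""
    graph = defaultdict(dict)
    for subj, pred, obj in triples:
        graph[subj][obj] = pred
        graph.setdefault(obj, {})  # make sure sink nodes appear too
    return dict(graph)


def score_nodes(graph):
    """Stage 4 (GNN stand-in): average the sentiment of each node's edges."""
    incident = defaultdict(list)
    for subj, edges in graph.items():
        for obj, pred in edges.items():
            incident[subj].append(EDGE_SENTIMENT[pred])
            incident[obj].append(EDGE_SENTIMENT[pred])
    scores = {}
    for node in graph:
        values = incident[node]
        scores[node] = sum(values) / len(values) if values else 0.0
    return scores


def to_d3_json(graph, scores):
    """Stage 5: serialize to a nodes/links layout of the kind D3 consumes."""
    nodes = [{"id": n, "sentiment": scores[n]} for n in graph]
    links = [
        {"source": s, "target": o, "predicate": p}
        for s, edges in graph.items()
        for o, p in edges.items()
    ]
    return json.dumps({"nodes": nodes, "links": links}, indent=2)


def run_pipeline(documents):
    graph = build_graph(extract(ingest(documents)))
    return to_d3_json(graph, score_nodes(graph))
```

Calling `run_pipeline(["Goldman Sachs acquired Microsoft. Microsoft praised OpenAI."])` returns a JSON string with three nodes and two links. The real pipeline swaps each stage for the components listed above.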
from graphnlp.extraction.ner import NERExtractor
from graphnlp.extraction.relations import RelationExtractor
ner = NERExtractor()
entities = ner.extract("Apple Inc reported revenue of $120 billion.")
rel_ext = RelationExtractor()
triples = rel_ext.extract("Apple Inc reported revenue of $120 billion.")

from graphnlp.graph.builder import GraphBuilder
from graphnlp.graph.community import CommunityDetector
import networkx as nx
builder = GraphBuilder()
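# triples and entities come from the extraction step; embeddings_dict is not
# defined in this snippet and is assumed to hold the extracted embeddings
graph = builder.build(triples, entities, embeddings_dict)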
detector = CommunityDetector()
communities = detector.detect(graph)

Domain adapters supply contextual logic like schema mappings, preprocessing, and post-processing steps tailored to specific industries.
| Adapter | Entity Types | Use Case |
|---|---|---|
| finance | TICKER, ORG, AMOUNT, DATE | Parse fund records, expand ticker symbols, build COMPETITOR_OF graphs |
| email | PERSON, MERCHANT, MONEY | Strip HTML/headers, parse invoices, generate PAID_TO expense clusters |
| feedback | PRODUCT, SCORE, FEATURE | Normalize 5-star ratings, cluster feature complaints, link reviews |
| incidents | SERVICE, ERROR, SEV | Standardize P0/P1 flags, deduplicate logs, build AFFECTS topological graphs |
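
To make the "Strip HTML/headers" step of the email adapter more concrete, here is a minimal sketch of what such preprocessing could look like using only the standard library. The helper name `strip_email` is hypothetical and this is not the EmailAdapter implementation; it only illustrates the general technique on a simple message.

```python
from email import message_from_string
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect text nodes and drop every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_email(raw: str) -> str:
    """Hypothetical helper: remove RFC 822 headers and HTML tags from an email."""
    message = message_from_string(raw)
    body = message.get_payload()
    if not isinstance(body, str):
        # Multipart message: keep only the first part, for brevity.
        body = body[0].get_payload()
    parser = _TextOnly()
    parser.feed(body)
    parser.close()
    # Join text fragments and collapse runs of whitespace.
    return " ".join(" ".join(parser.parts).split())
```

A real adapter would also need to handle nested multipart messages, transfer encodings, and quoted replies, all of which this sketch ignores.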
from graphnlp.adapters.base import get_adapter
from graphnlp.adapters.email import EmailAdapter
import networkx as nx
adapter = get_adapter("email")
clean_text = adapter.preprocess(raw_email_string)
# Graph integration
g = nx.DiGraph()
g.add_edge("$234.56", "Amazon", predicate="paid_to")
spend_clusters = EmailAdapter.monthly_spend_summary(g)

from graphnlp.adapters.base import DomainAdapter
class HealthcareAdapter(DomainAdapter):
    @property
    def domain(self) -> str:
        return "healthcare"

    @property
    def entity_types(self) -> list[str]:
        return ["PATIENT", "SYMPTOM", "DRUG"]

    def preprocess(self, text: str) -> str:
        return text.replace("Pt.", "Patient")

Deploy the multi-tenant REST API via Docker:
make docker-up

| Method | Path | Description |
|---|---|---|
| GET | /health | Check service health and system status. |
| POST | /v1/analyze | Submit documents for processing (sync or async). |
| GET | /v1/analyze/{job_id} | Poll the status of an async analysis job. |
| GET | /v1/graph/{graph_id} | Retrieve D3.js-compatible graph JSON by ID. |
| GET | /v1/graph/{graph_id}/summary | Retrieve summary statistics for the graph. |
| POST | /v1/webhooks | Register a new webhook endpoint for async completion events. |
| GET | /v1/webhooks | List registered webhooks for the given tenant. |
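
The async flow implied by the table (submit to /v1/analyze, then poll /v1/analyze/{job_id}) can be sketched with the standard library alone. The response schema is not documented here, so the `status` field and its `pending` / `running` values are assumptions, as are the helper names; treat this as a starting point rather than the client's actual behavior.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"  # same host as the curl examples


def build_request(method, path, api_key, payload=None):
    """Build an authenticated request for one of the endpoints in the table."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def send(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_job(job_id, api_key, fetch=send, interval=2.0, max_attempts=30):
    """Poll GET /v1/analyze/{job_id} until the job leaves the pending states.

    ASSUMPTION: the response carries a "status" field whose unfinished values
    are "pending" and "running"; the real field name and values may differ.
    """
    for _ in range(max_attempts):
        job = fetch(build_request("GET", f"/v1/analyze/{job_id}", api_key))
        if job.get("status") not in ("pending", "running"):
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")
```

The `fetch` parameter exists so the polling loop can be exercised without a running server, for example by passing a function that returns canned responses.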
# Submit Sync
curl -X POST http://localhost:8000/v1/analyze \
-H "Authorization: Bearer sk-your-api-key" \
-H "Content-Type: application/json" \
-d '{"documents": ["Invoice 123 for $500 to AWS"], "domain": "finance", "async": false}'
# Submit Async
curl -X POST http://localhost:8000/v1/analyze \
-H "Authorization: Bearer sk-your-api-key" \
-H "Content-Type: application/json" \
-d '{"documents": ["Massive batch 1...", "Massive batch 2..."], "async": true}'
# Poll Async Status
curl -X GET http://localhost:8000/v1/analyze/job-1234 \
-H "Authorization: Bearer sk-your-api-key"

pip install graphnlp-client

from graphnlp_client.client import GraphNLPClient
client = GraphNLPClient(api_key="sk-your-api-key", base_url="http://localhost:8000")
# Sync
result = client.analyze(["Azure bill $300"], domain="email")
print(result["graph_id"])
# Get Graph data
graph = client.get_graph(result["graph_id"])

npm install graphnlp-client

import { GraphNLPClient } from 'graphnlp-client';
const client = new GraphNLPClient({ apiKey: 'sk-your-api-key' });
async function analyze() {
  const result = await client.analyze(['Q4 earnings were up 12%'], { domain: 'finance' });
  const graph = await client.getGraph(result.graph_id);
  console.log(graph.nodes);
}

Register webhooks to receive JSON payloads upon async task completion.
curl -X POST http://localhost:8000/v1/webhooks \
-H "Authorization: Bearer sk-your-api-key" \
-H "Content-Type: application/json" \
-d '{"url": "https://yourapp.com/hook", "events": ["graph.ready"], "secret": "wh_sec_123"}'

Webhook Payload Example
{
  "event": "graph.ready",
  "job_id": "job-1234",
  "graph_id": "graph-5678",
  "tenant_id": "tenant-abc",
  "timestamp": "2026-04-18T10:00:00Z",
  "signature": "sha256=d2b8b9a..."
}

Configure the platform using config/default.yaml or environment variables:
# config/default.yaml
environment: production
neo4j:
  uri: bolt://localhost:7687
redis:
  url: redis://localhost:6379
api:
  rate_limit_per_minute: 100
nlp:
  ner_model: en_core_web_sm
  embedding_model: all-MiniLM-L6-v2

# .env
GRAPHNLP_ENVIRONMENT=production
GRAPHNLP_NEO4J_URI=bolt://neo4j:7687
GRAPHNLP_NEO4J_USER=neo4j
GRAPHNLP_NEO4J_PASSWORD=supersecret
GRAPHNLP_REDIS_URL=redis://redis:6379

Manage the platform using the built-in Typer CLI:
- graphnlp run --domain finance --file data.csv: Run the pipeline on a local file.
- graphnlp serve --port 8000 --reload: Start the FastAPI server.
- graphnlp worker --concurrency 4: Start the Celery async worker.
- graphnlp generate-key -t my-tenant: Generate a new API key for the specified tenant.
graphnlp-intel/
├── graphnlp/
│   ├── config.py          # Pydantic Settings
│   ├── pipeline.py        # Main Orchestrator
│   ├── ingestion/         # Loaders, Chunkers, Email Parsers
│   ├── extraction/        # NER, Relations, SBERT Embeddings
│   ├── graph/             # NetworkX Builder, PyG GNN, Diff, Louvain
│   ├── adapters/          # Domain-specific logic
│   ├── storage/           # Neo4j & Redis handlers
│   ├── api/               # FastAPI routes, Auth, Tenant Middleware
│   ├── queue/             # Celery workers & tasks
│   └── webhooks/          # HMAC Dispatcher
├── tests/
│   ├── unit/              # Isolated logic blocks
│   ├── integration/       # E2E API tests
│   └── fixtures/          # CSV/JSON samples
├── sdk/
│   ├── python/            # PyPI API wrapper
│   └── js/                # NPM API wrapper
├── docker/
│   ├── docker-compose.yml # Local orchestration
│   ├── Dockerfile         # API Container
│   └── Dockerfile.worker  # Celery Container
└── pyproject.toml         # Dependencies & metadata
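
The webhooks module is described as an HMAC dispatcher, and the webhook payload example carries a `sha256=` signature. A receiver could check such a signature roughly as follows. This is a sketch under stated assumptions: the signature is taken to be a hex HMAC-SHA256 digest keyed with the registered secret, and which bytes are signed is not specified in this document, so the caller passes them in explicitly. The function name is hypothetical.

```python
import hashlib
import hmac


def verify_signature(signed_bytes: bytes, signature: str, secret: str) -> bool:
    """Return True if signature equals "sha256=" + HMAC-SHA256(secret, signed_bytes).

    ASSUMPTION: hex digest with a "sha256=" prefix. Which bytes the dispatcher
    signs (raw body, or payload without the signature field) must be confirmed
    against the dispatcher source.
    """
    scheme, _, received = signature.partition("=")
    if scheme != "sha256" or not received:
        return False
    expected = hmac.new(secret.encode("utf-8"), signed_bytes, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

Using a constant-time comparison rather than `==` is the usual choice for signature checks, since it does not reveal how many leading characters matched.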
We stand on the shoulders of giants.
| Component | Library |
|---|---|
| NLP Base | spacy |
| Deep Learning | torch |
| Graph Neural Nets | torch-geometric |
| Language Models | transformers |
| Sentence Embeddings | sentence-transformers |
| Graph Analytics | networkx |
| Async Queue | celery |
| Web Framework | fastapi |
| Configuration | pydantic |
| Caching & Rate Limits | redis.asyncio |
| Graph Persistence | neo4j (async driver) |
| CLI Generation | typer |
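
The environment variables shown in the configuration section mirror the YAML keys with a GRAPHNLP_ prefix (for example, neo4j.uri corresponds to GRAPHNLP_NEO4J_URI). The stack lists pydantic for configuration; the sketch below only illustrates that naming rule with the standard library and is not how the project or pydantic actually resolves settings. The function name is hypothetical.

```python
import os


def apply_env_overrides(config, environ=None, prefix="GRAPHNLP"):
    """Return a copy of config where PREFIX_SECTION_KEY variables win over defaults.

    Illustrative only: values from the environment stay strings (no type
    coercion), and only keys already present in config are considered.
    """
    environ = os.environ if environ is None else environ
    merged = {}
    for key, value in config.items():
        name = f"{prefix}_{key.upper()}"
        if isinstance(value, dict):
            # Nested section: extend the prefix, e.g. GRAPHNLP_NEO4J
            merged[key] = apply_env_overrides(value, environ, name)
        else:
            merged[key] = environ.get(name, value)
    return merged
```

For instance, with defaults of `{"neo4j": {"uri": "bolt://localhost:7687"}}` and `GRAPHNLP_NEO4J_URI=bolt://neo4j:7687` set, the merged result uses the environment value.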
| Phase | Milestone | Expected |
|---|---|---|
| Phase 1 | Streaming Engine (Kafka integration, real-time diffing) | Q3 2026 |
| Phase 2 | Custom Model Fine-Tuning (LoRA automated pipeline) | Q4 2026 |
| Phase 3 | Visual Graph Dashboard (React SPA for interactive analytics) | Q1 2027 |
| Tier | Price | Features |
|---|---|---|
| Open Source | Free | Apache 2.0 · Self-hosted · All adapters · CLI |
| Custom NER | $800–2,000 | Fine-tune NER · HF model delivery · Eval report |
| Hosted API | $2,500 + $400/mo | FEATURED · AWS/GCP/Azure deploy · Docker + TF · SDK |
| Enterprise | $8,000+ | Streaming · Dashboard · Alerting SLA · White-label |
Interested in Hosted API or Enterprise tiers? Get a quote on our site.
We welcome contributions!
git clone https://github.com/samvardhan03/GraphNLP-Intel.git
cd GraphNLP-Intel
./setup_dev.sh
make test

This project is licensed under the Apache License 2.0.
@software{graphnlpintel2026,
author = {GraphNLP Team},
title = {graphnlp-intel: Hybrid Graph-NLP Intelligence Platform},
year = {2026},
url = {https://github.com/samvardhan03/GraphNLP-Intel}
}