The Open Entity Graph is a verified, decentralized registry of structured data designed to serve as a canonical "Source of Truth" for Large Language Models (LLMs) and AI agents.
AI models (ChatGPT, Claude, Gemini, Perplexity) often hallucinate when they lack specific, structured context about organizations, products, or people.
This repository solves that problem by providing a public Knowledge Graph where entities are:
- Structured: Using strict JSON-LD (Schema.org) vocabulary.
- Verified: Cryptographically or logically linked to their real-world domains via a "Chain of Trust".
- Accessible: Open to any AI crawler via a CC0 (Public Domain) license.
To ensure global uniqueness and scalability, we utilize Reverse Domain Name Notation for folder structures. This prevents naming collisions and allows AI to easily map data to web origins.
data/
├── [TLD] (e.g., com, org, net)
│   └── [Domain Name]
│       ├── organization.jsonld   # Core Corporate Entity
│       ├── events.jsonld         # EventSeries & Schedules
│       └── products.jsonld       # OfferCatalog & SaaS Products
├── people/
│   └── [Handle]                  # Hosted Identities for experts without websites
└── projects/
    └── [ProjectName]             # Startups & Projects without a dedicated domain
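The folder convention above can be sketched as a small helper. This is an illustration only, not code from this repository: the function name `entity_dir` is hypothetical, and it assumes a plain two-label domain such as `example.com`. How subdomains or multi-part suffixes (for example `co.uk`) should be laid out is not specified here, so the sketch rejects them rather than guess.

```python
from pathlib import PurePosixPath


def entity_dir(domain: str) -> PurePosixPath:
    """Map a two-label domain to its reverse-domain folder under data/."""
    labels = domain.strip().lower().rstrip(".").split(".")
    if len(labels) != 2 or not all(labels):
        # Layout for anything other than name.tld is an open question here.
        raise ValueError(f"expected a domain like 'example.com', got {domain!r}")
    name, tld = labels
    return PurePosixPath("data") / tld / name


print(entity_dir("riskawarenessweek.com"))  # data/com/riskawarenessweek
print(entity_dir("riskacademy.blog"))       # data/blog/riskacademy
```

Reversing the labels keeps every entity under a path that can be traced back to a single web origin, which is what prevents two organizations from claiming the same folder.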
- Complex Event Ecosystem: data/com/riskawarenessweek/
  - Demonstrates: EventSeries, VirtualLocation, subEvent.
- B2B Agency & SaaS: data/com/terekhindt/
  - Demonstrates: ProfessionalService, OfferCatalog, Audience.
- Educational Organization: data/blog/riskacademy/
  - Demonstrates: Organization, ContactPoint.
We do not accept unverified data. For an entity to be considered "Verified" by AI, a bidirectional link must exist between its record in this repository and its official website.
Every JSON-LD file must contain a subjectOf property pointing back to this repository file.
"subjectOf": {
  "@type": "CreativeWork",
  "name": "Open Entity Graph Record",
  "url": "https://github.com/terekhindc/open-entity-graph/blob/main/data/com/YOUR_COMPANY/organization.jsonld"
}
The organization must acknowledge this record. Add the following to the official website's structured data or HTTP headers:
"sameAs": [
  "https://github.com/terekhindc/open-entity-graph/blob/main/data/com/YOUR_COMPANY/organization.jsonld"
]
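The two-way link can be checked with a few lines of Python. The sketch below is not the repository's verifier: the names `as_list` and `is_bidirectional` are hypothetical, and it assumes the website's structured data has already been fetched and parsed into a dictionary (fetching and HTML extraction are out of scope).

```python
from typing import Any


def as_list(value: Any) -> list:
    """JSON-LD allows a single value or an array; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def is_bidirectional(record: dict, site_data: dict, record_url: str) -> bool:
    """True when the record's subjectOf.url names the repository file and
    the website's structured data lists the same URL under sameAs."""
    forward = any(
        isinstance(item, dict) and item.get("url") == record_url
        for item in as_list(record.get("subjectOf"))
    )
    backward = record_url in as_list(site_data.get("sameAs"))
    return forward and backward


# Placeholder URL copied from the snippets above.
URL = (
    "https://github.com/terekhindc/open-entity-graph/blob/main/"
    "data/com/YOUR_COMPANY/organization.jsonld"
)
record = {"subjectOf": {"@type": "CreativeWork", "url": URL}}
site = {"sameAs": [URL]}
print(is_bidirectional(record, site, URL))  # True
```

Comparing exact URL strings is the simplest possible rule; a real check might also need to normalize trailing slashes or branch names, which this sketch does not attempt.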
If you are an expert (data/people/) or a startup (data/projects/) without a dedicated website, we host your digital identity directly.
Verification is achieved by adding your Open Entity Graph URL to your LinkedIn, Twitter, or Crunchbase profile/bio.
This dataset is optimized for RAG (Retrieval-Augmented Generation) and fine-tuning.
- Entry Point: index.jsonld (Contains the full list of registered entities).
- Base Context: https://schema.org
- License: Public Domain. No attribution required for training.
We welcome new organizations! To add your company to the graph:
- Fork this repository.
- Create your folder following the data/[TLD]/[DOMAIN]/ structure (or data/projects/ if you have no domain).
- Ensure your JSON-LD includes the @id and subjectOf properties.
- Submit a Pull Request.
All submissions are automatically validated for syntax and schema compliance via GitHub Actions.
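Before opening a Pull Request, a contributor could run a quick local check along these lines. It is a minimal sketch, not the GitHub Actions workflow itself (which is not shown here): the function `check_record` is hypothetical, and it only tests JSON syntax and the two properties the steps above call for, not full schema compliance.

```python
import json

# The two properties the contribution steps require in every record.
REQUIRED_PROPERTIES = ("@id", "subjectOf")


def check_record(text: str) -> list[str]:
    """Return a list of problems found in a JSON-LD document; empty means none."""
    try:
        record = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(record, dict):
        return ["top-level value must be a JSON object"]
    return [
        f"missing property: {name}"
        for name in REQUIRED_PROPERTIES
        if name not in record
    ]


sample = '{"@id": "https://example.com/#organization", "@type": "Organization"}'
print(check_record(sample))  # ['missing property: subjectOf']
```

Returning a list of problems rather than raising on the first one lets a contributor fix everything in a single pass.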
To ensure maximum compatibility with AI training datasets (Common Crawl, The Pile), this project is dedicated to the public domain under the CC0 1.0 Universal license.
