- Extracting the entities from the dataset using its tabular data.
- Extracting the entities from the metadata using the language model Babelscape/rebel-large. The code walkthrough is in the research folder.
- The extracted entities and the defined SVO triplets are in the Entities Library.
- Populating the Neo4j graph database with the extracted SVO triplets.
- Embeddings are not added because some of the extracted entities are not in the Wikidata or DBpedia open knowledge bases.

Note: spaCy is used for NER on the extracted entities. However, given more resources, this could be done with a custom fine-tuned NER model.
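
To make the SVO extraction step concrete, here is a minimal sketch of turning REBEL-style linearized output into (subject, relation, object) tuples. It assumes the `<triplet> subject <subj> object <obj> relation` layout described on the Babelscape/rebel-large model card; the function name `parse_rebel_output` and the sample string are hypothetical and are not taken from this repository's research folder.

```python
def parse_rebel_output(text):
    """Parse REBEL-style linearized text into (subject, relation, object) tuples.

    Assumed layout: <triplet> SUBJECT <subj> OBJECT <obj> RELATION,
    where further "<subj> OBJECT <obj> RELATION" groups reuse the same subject.
    """
    triplets = []
    parts = {"subject": [], "object": [], "relation": []}
    current = None  # which part the next plain token belongs to

    def flush():
        # Emit a triplet only when all three parts have been collected.
        if all(parts.values()):
            triplets.append(
                tuple(" ".join(parts[k]) for k in ("subject", "relation", "object"))
            )

    # Drop sequence markers, then walk the whitespace-separated tokens.
    cleaned = text.replace("<s>", " ").replace("</s>", " ").replace("<pad>", " ")
    for token in cleaned.split():
        if token == "<triplet>":
            flush()
            parts["subject"], parts["object"], parts["relation"] = [], [], []
            current = "subject"
        elif token == "<subj>":
            flush()
            parts["object"], parts["relation"] = [], []
            current = "object"
        elif token == "<obj>":
            parts["relation"] = []
            current = "relation"
        elif current is not None:
            parts[current].append(token)
    flush()
    return triplets


# Hypothetical decoded model output, written by hand for illustration only.
sample = (
    "<s><triplet> Cat Ear Headphones <subj> Bluetooth <obj> uses "
    "<subj> LED lights <obj> has part</s>"
)
triplets = parse_rebel_output(sample)
print(triplets)
```

In practice the input string would come from decoding the model's generated tokens with special tokens kept, and the resulting tuples could then be labelled with NER before loading them into the graph.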
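// Import the deduplicated SVO triplets from the hosted JSON file.
CALL apoc.import.json("https://raw.githubusercontent.com/ideepankarsharma2003/KnowledgeGraphs/main/json_files/svo_new_cat_ear_headphones_deduped.json");
// Apply the labels stored in each node's node_labels property.
MATCH (n:Node)
CALL apoc.create.addLabels(n, n.node_labels)
YIELD node
RETURN node;
// Inspect the whole graph.
MATCH (n) RETURN n;
// Delete every node and relationship. Use only to reset the database.
MATCH (n) DETACH DELETE n;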
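
The queries above import a JSON file and then read a `node_labels` property from each `:Node`. The sketch below shows one way such a file could be produced from SVO triplets. It assumes the JSON Lines layout that `apoc.export.json` emits (one `node` or `relationship` record per line); the actual layout of the repository's file is not shown here, and the function name `triplets_to_jsonl`, the `name` property, and the default `Entity` label are hypothetical. Check the APOC documentation for the exact format and any constraints the import requires.

```python
import json


def triplets_to_jsonl(triplets, labels=None):
    """Serialize (subject, verb, object) triplets as JSON Lines.

    Assumed record layout (modelled on apoc.export.json output):
      node:         {"type", "id", "labels", "properties"}
      relationship: {"type", "id", "label", "properties", "start", "end"}
    """
    labels = labels or {}   # hypothetical map: entity name -> list of NER labels
    node_ids = {}           # entity name -> generated id
    seen = set()            # triplets already written, for deduplication
    lines = []

    def node_ref(name):
        # Write each entity once, before any relationship that points at it.
        if name not in node_ids:
            node_ids[name] = str(len(node_ids))
            lines.append(json.dumps({
                "type": "node",
                "id": node_ids[name],
                "labels": ["Node"],
                "properties": {
                    "name": name,
                    "node_labels": labels.get(name, ["Entity"]),
                },
            }))
        return {"id": node_ids[name], "labels": ["Node"]}

    for subject, verb, obj in triplets:
        if (subject, verb, obj) in seen:
            continue  # skip duplicate triplets
        seen.add((subject, verb, obj))
        start, end = node_ref(subject), node_ref(obj)
        lines.append(json.dumps({
            "type": "relationship",
            "id": str(len(seen) - 1),
            "label": verb.upper().replace(" ", "_"),
            "properties": {},
            "start": start,
            "end": end,
        }))
    return "\n".join(lines)


# Hand-written example triplets, including one duplicate.
example = [
    ("Cat Ear Headphones", "uses", "Bluetooth"),
    ("Cat Ear Headphones", "has part", "LED lights"),
    ("Cat Ear Headphones", "uses", "Bluetooth"),
]
output = triplets_to_jsonl(example, labels={"Bluetooth": ["Technology"]})
print(output)
```

Keeping the NER labels in a `node_labels` property, rather than as real labels, matches the `apoc.create.addLabels(n, n.node_labels)` step, which promotes them to Neo4j labels after import.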
- Code Walkthrough
- Entity Extraction using Language Model
- Final Generated Entities Guide
- Labelling the SVO triples
● World Bank Projects
● SAM.gov tenders
● Multi-modal data (images, videos, and textual descriptions)



