Backend - Checking on multiple LLMs for entity and relations extraction and populating KG #16

aashipandya · 2024-01-30T07:23:49Z

No description provided.

jexp · 2024-02-01T14:33:23Z

Comments from our discussion:

Thanks for putting the notebooks together. My main feedback points, summarized:

Add a summary / conclusion for each notebook to the README (including findings and learnings)
Use the same PDF documents for each example
Generate the same machine-processable (JSON) output from each (plus the human baseline) for comparison and further analysis
Use dotenv in the notebooks to load environment variables from a .env file, so we don't have to remember to add / remove credentials
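
A minimal sketch of the dotenv point, assuming the python-dotenv package and hypothetical variable names (NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD); the notebooks may use different names. If the package is not installed, the sketch falls back to whatever is already set in the environment.

```python
import os

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
except ImportError:
    load_dotenv = None  # fall back to variables already present in the environment


def load_settings(env_file=".env"):
    """Return connection settings, loading env_file first when python-dotenv is available."""
    if load_dotenv is not None:
        # By default this does not override variables that are already set.
        load_dotenv(env_file)
    return {
        "uri": os.getenv("NEO4J_URI", "bolt://localhost:7687"),  # assumed default
        "username": os.getenv("NEO4J_USERNAME", "neo4j"),        # assumed default
        "password": os.getenv("NEO4J_PASSWORD"),                 # None if not configured
    }
```

With this in the first cell, the .env file can stay out of version control and the notebooks themselves never contain credentials.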

JSON is actually a better choice than CSV, because it lets us handle multiple properties per entity and align the output with the structure we get from Diffbot (nodes / relationships).
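
To make the comparison concrete, here is a sketch of what a shared JSON output could look like. The field names (source, nodes, relationships, id, type, properties, start, end) are assumptions for illustration, not Diffbot's actual schema, and the data is made up.

```python
import json


def make_graph_document(source, nodes, relationships):
    """Bundle one extraction result into a single JSON-serializable dict."""
    return {"source": source, "nodes": nodes, "relationships": relationships}


def dangling_endpoints(doc):
    """Return relationship endpoints that do not match any node id."""
    known = {node["id"] for node in doc["nodes"]}
    return [
        endpoint
        for rel in doc["relationships"]
        for endpoint in (rel["start"], rel["end"])
        if endpoint not in known
    ]


doc = make_graph_document(
    source="example.pdf",  # hypothetical file name
    nodes=[
        {"id": "alice", "type": "Person",
         "properties": {"name": "Alice", "role": "Engineer"}},
        {"id": "acme", "type": "Organization",
         "properties": {"name": "Acme"}},
    ],
    relationships=[
        {"start": "alice", "end": "acme", "type": "WORKS_AT", "properties": {}},
    ],
)

# sort_keys keeps the output stable, which makes diffs between extractors easier to read
serialized = json.dumps(doc, indent=2, sort_keys=True)
```

If every notebook (and the human baseline) writes this shape, the results can be diffed or scored with one script instead of one per extractor.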

Just comments, no action needed here:

For the triple-based ones (like Rebel / llama-index), the challenge is that we can't use the results out of the box. We would have to either:

modify them to output property graph nodes and relationships
or post-process the triples to aggregate all entity-attribute triples into properties, keeping only the triples that represent actual semantic relationships
or do this during insertion of the data into the graph, aggregating as we insert: e.g. first create/merge the nodes by their id, then merge on id and add each property; for the relationships, find the start and end nodes by label and id and create the relationship
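
The post-processing option could be sketched roughly like this. The rule used to tell attributes from relationships (a triple is a relationship only when its object is a known entity) is an assumption for illustration; the real extractors may need a different rule, and the example data is made up.

```python
def triples_to_property_graph(triples, entity_ids):
    """Fold attribute triples into node properties and keep entity-to-entity triples."""
    entity_ids = set(entity_ids)
    # Sorted so the node order is stable across runs.
    nodes = {eid: {"id": eid, "properties": {}} for eid in sorted(entity_ids)}
    relationships = []
    for subject, predicate, obj in triples:
        node = nodes.setdefault(subject, {"id": subject, "properties": {}})
        if obj in entity_ids:
            # Object is itself an entity: keep the triple as a relationship.
            relationships.append({"start": subject, "type": predicate, "end": obj})
        else:
            # Object is a literal value: store it as a property on the subject.
            node["properties"][predicate] = obj
    return {"nodes": list(nodes.values()), "relationships": relationships}


example_triples = [
    ("Alice", "works_at", "Acme"),
    ("Alice", "age", "34"),
    ("Acme", "founded", "1999"),
]
graph = triples_to_property_graph(example_triples, entity_ids={"Alice", "Acme"})
```

Note that a repeated predicate on the same subject overwrites the earlier value here; collecting values into a list would be the alternative if multi-valued properties matter.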

For the Rebel one, in def create_triplets(tx, triplet): if we want to pursue this approach in the future, we should see whether we can carry the entity type over, so that in addition to the generic :Node label we can also use a type label like :Person or :Organization,
and then do the attribute aggregation there as well.
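
A rough sketch of how that could look, building only the Cypher text and parameters so it can be tried without a database. The triplet keys head_type, tail_type, head_props and tail_props are hypothetical (the notebook's triplets may not carry them), and this is not the existing create_triplets implementation. Labels and relationship types cannot be passed as Cypher parameters, so they are sanitized and backtick-quoted instead.

```python
import re


def safe_name(raw, default):
    """Reduce a label or relationship type to letters, digits and underscores."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", raw or "").strip("_")
    return cleaned or default


def build_triplet_query(triplet):
    """Return (cypher, params) that merge both nodes with a type label and link them."""
    head_label = safe_name(triplet.get("head_type"), "Node")
    tail_label = safe_name(triplet.get("tail_type"), "Node")
    rel_type = safe_name(triplet.get("type"), "RELATED_TO").upper()
    cypher = (
        f"MERGE (h:Node {{id: $head}}) SET h:`{head_label}` SET h += $head_props "
        f"MERGE (t:Node {{id: $tail}}) SET t:`{tail_label}` SET t += $tail_props "
        f"MERGE (h)-[:`{rel_type}`]->(t)"
    )
    params = {
        "head": triplet["head"],
        "tail": triplet["tail"],
        "head_props": triplet.get("head_props", {}),  # aggregated attributes, if any
        "tail_props": triplet.get("tail_props", {}),
    }
    return cypher, params


cypher, params = build_triplet_query({
    "head": "Alice", "head_type": "Person",
    "type": "works at",
    "tail": "Acme", "tail_type": "Organization",
    "head_props": {"age": "34"},
})
```

Inside create_triplets this would presumably be run as tx.run(cypher, **params); merging on :Node plus id keeps the node identity stable even when the type label is missing or changes between runs.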

aashipandya assigned aashipandya and rakshita-arora Feb 1, 2024

rakshita-arora removed their assignment Feb 19, 2024

aashipandya closed this as completed Jul 12, 2024
