This issue was moved to a discussion.

how to use pytextrank for entity linking #70

Closed
chikubee opened this issue Sep 21, 2020 · 3 comments
Labels: kg (features based on kglab integration and use of knowledge graph approaches), question

@chikubee

The README states that pytextrank can be used for three tasks: 1. phrase extraction, 2. summarization, 3. entity linking.
I see that examples and usage are available for 1 and 2 but not for 3.
Can someone share a reference on how it can be used for entity linking, how its lemma graph can be enriched with domain knowledge, etc.?

@ceteri ceteri self-assigned this Sep 21, 2020
@ceteri
Collaborator

ceteri commented Sep 21, 2020

Thank you @chikubee, that's a good catch. The README.md describes entity linking as one of the motivations for this project, although those features are still a WIP. The approach has been explored in some tutorials, and there's WIP code under active development in a private repo, though it is not exposed as features here yet. We tentatively have a knowledge graph tutorial in collab with https://www.knowledgegraph.tech/ and https://connected-data.london/ scheduled for early December 2020 where that work will be presented.

I've updated the README.md to try to be clearer, as of cb51ba3, and you've been added to the kudos for that.

As a simple example, the WordnetAnnotator section of https://github.com/DerwenAI/spaCy_tuTorial/blob/master/spaCy_tuTorial.ipynb gives at least a sketch of how entity linking could work:

  • make use of a KG -- in the spaCy tutorial above, WordNet supplies the semantic relations
  • use domain knowledge to constrain the search space for synsets
  • search the KG's graph neighborhood of a given entity to link hypernyms and hyponyms into the PTR lemma graph
  • benefits:
    • this enhances the centrality measures used to rank keyphrases
    • entity linking of keyphrases => KG is performed in the process
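
The steps above can be sketched roughly as follows. This is a minimal, stdlib-only illustration under stated assumptions, not PyTextRank's API: `TOY_KG` is a made-up stand-in for WordNet hypernym relations, the helper names `build_lemma_graph`, `enrich_with_kg`, and `pagerank` are hypothetical, and the `allowed` set plays the role of domain knowledge that constrains the search.

```python
from collections import defaultdict

# Toy stand-in for a KG such as WordNet: lemma -> hypernyms.
# All entries are illustrative assumptions, not real WordNet data.
TOY_KG = {
    "pizza": ["dish"],
    "taco": ["dish"],
    "dish": ["food"],
    "coke": ["beverage"],
    "beverage": ["food"],
}


def build_lemma_graph(lemmas, window=2):
    """Build an undirected co-occurrence graph over a lemma sequence."""
    graph = defaultdict(set)
    for i, lemma in enumerate(lemmas):
        for other in lemmas[i + 1 : i + 1 + window]:
            if other != lemma:
                graph[lemma].add(other)
                graph[other].add(lemma)
    return graph


def enrich_with_kg(graph, kg, allowed=None, max_hops=2):
    """Add hypernym edges from the KG neighborhood of each lemma node.

    Returns a dict mapping each lemma to the KG nodes it was linked to.
    """
    links = {}
    for lemma in list(graph):  # snapshot: new KG nodes are added while looping
        frontier, linked = [lemma], []
        for _ in range(max_hops):
            next_frontier = []
            for node in frontier:
                for hyper in kg.get(node, []):
                    if allowed is not None and hyper not in allowed:
                        continue  # domain knowledge prunes the search space
                    graph[node].add(hyper)
                    graph[hyper].add(node)
                    linked.append(hyper)
                    next_frontier.append(hyper)
            frontier = next_frontier
        if linked:
            links[lemma] = linked
    return links


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank on an undirected graph."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        rank = {
            n: (1 - damping) / len(nodes)
            + damping * sum(rank[m] / len(graph[m]) for m in graph[n])
            for n in nodes
        }
    return rank


lemmas = ["want", "pizza", "cheese", "taco", "coke"]
graph = build_lemma_graph(lemmas)
links = enrich_with_kg(graph, TOY_KG, allowed={"dish", "beverage", "food"})
ranks = pagerank(graph)
```

Here `links` records which KG nodes each lemma was attached to (the entity-linking side effect), while `ranks` is computed over the enriched graph, so shared hypernyms such as "dish" now connect "pizza" and "taco" when centrality is measured.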

A couple questions for you:

  1. What kind of use cases do you have for entity linking features?
  2. How would you want to have the lemma graph exposed?

@chikubee
Author

chikubee commented Sep 22, 2020

@ceteri thanks for your quick and detailed response.
Looking forward to the release. It's a great problem to solve, cheers.

I am trying to build a multi-tenant domain intelligence system.
Intent classification and entity recognition are solved problems. But understanding the utterance to identify links and map them to real-world entities is challenging.

I was looking at elegant ways to identify entity groups and links:
a. within the text
"I want to have pizza with extra cheese, a taco, and 2 diet cokes."
-> (1 pizza, other: extra cheese), (1 taco), (2 diet cokes)
"Who is the manager of Mike? What is his salary?"
-> here, if the graph was enriched with coref resolution, salary would get attributed to the manager of Mike.
b. outside the text, i.e. mapped to real-world entities from the domain KG and custom-fed entities/attributes,
i.e. recognize ootb entities: company names, positions, status, food, etc.

@ceteri
Collaborator

ceteri commented Dec 6, 2020

FYI, there are some more related notes and discussion in #78 (comment), with an introduction to kglab, which is intended to provide this kind of KG support in PyTextRank.

To your point above @chikubee, using a KG in the TextRank pipeline would:

  1. enrich the pipeline's internal lemma graph by importing nodes and edges from the KG, leading to better keyphrase ranking
  2. provide entity linking into the KG as a side effect
  3. let you then query via SPARQL, SHACL, or perhaps even PSL and other probabilistic methods to achieve what you wanted (Mike, his salary, etc.)
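
As a rough illustration of the third point, once mentions are linked into a KG, the "Mike, his salary" question becomes a two-hop graph pattern. The sketch below uses a tiny in-memory triple set and a hypothetical `match` helper in place of a real SPARQL engine; the names `Dana`, `hasManager`, and `hasSalary` and all values are invented for the example. In SPARQL the same pattern would look roughly like `SELECT ?manager ?salary WHERE { ex:Mike ex:hasManager ?manager . ?manager ex:hasSalary ?salary }`.

```python
# Toy KG as (subject, predicate, object) triples; all data is invented.
TRIPLES = {
    ("Mike", "hasManager", "Dana"),
    ("Dana", "hasSalary", 120000),
    ("Mike", "hasSalary", 90000),
}


def match(triples, s=None, p=None, o=None):
    """Yield triples matching a pattern; None acts as a wildcard."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


def manager_salary(triples, employee):
    """Two-hop pattern: employee -> manager -> manager's salary."""
    results = []
    for _, _, manager in match(triples, s=employee, p="hasManager"):
        for _, _, salary in match(triples, s=manager, p="hasSalary"):
            results.append((manager, salary))
    return results


answer = manager_salary(TRIPLES, "Mike")
```

The point of the sketch is that the salary is attached to the manager node rather than to Mike, which is the attribution that coref resolution plus entity linking would need to supply.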

@ceteri ceteri added this to To do in pytextrank Feb 14, 2021
@ceteri ceteri added this to the Entity Linking milestone Feb 15, 2021
@ceteri ceteri added the kg features based on kglab integration and use of knowledge graph approaches label Feb 15, 2021
@ceteri ceteri moved this from To do to In progress in pytextrank Jul 25, 2022
@DerwenAI DerwenAI locked and limited conversation to collaborators Aug 22, 2023
@ceteri ceteri converted this issue into discussion #246 Aug 22, 2023
