Implemented TextRank #54

Merged
merged 6 commits into from Oct 17, 2018

@ygorg ygorg commented Oct 11, 2018

textrank.py implements the method described in the paper, including its post-processing. The paper does not describe any candidate selection step before the TextRank algorithm runs.
pketextrank.py implements only the core method described in the paper. It uses the candidate selection from pke, so that the weighting of the candidates can be compared.

Keeping only the comparable version.
The candidate generation detailed in the paper is used when the `top` parameter is passed to candidate_weighting.
The graph creation now follows the paper more closely.
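
For reference, the graph construction in the TextRank paper can be sketched roughly as follows: words that pass a syntactic filter (nouns and adjectives) become nodes, two words are linked when they co-occur within a window of N words, and PageRank is run on the resulting undirected, unweighted graph. This is a minimal standalone sketch, not the code in this PR. The names `build_graph` and `pagerank`, the POS tag labels, and the choice to measure the window on the unfiltered text are assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(tokens, window=2, keep_pos=("NOUN", "ADJ")):
    """Build an undirected co-occurrence graph.

    tokens: list of (word, pos) pairs in document order.
    Returns a dict mapping each kept word to the set of its neighbours.
    """
    graph = defaultdict(set)
    for i, (word, pos) in enumerate(tokens):
        if pos not in keep_pos:
            continue
        graph[word]  # touch the key so isolated words are still nodes
        # Link to kept words among the next (window - 1) tokens.
        for other, other_pos in tokens[i + 1:i + window]:
            if other_pos in keep_pos and other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, tol=1e-4, max_iter=100):
    """Iterate S(v) = (1 - d) + d * sum(S(u) / degree(u)) over neighbours u."""
    scores = {node: 1.0 for node in graph}
    for _ in range(max_iter):
        new = {}
        for node in graph:
            rank = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new[node] = (1 - damping) + damping * rank
        converged = all(abs(new[n] - scores[n]) < tol for n in graph)
        scores = new
        if converged:
            break
    return scores
```

The damping factor of 0.85 and the convergence threshold of 0.0001 are the values given in the paper; the window size is left as a parameter.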

ygorg commented Oct 16, 2018

The warning was added because, for some algorithms, returning the n best candidates by adjusting another parameter is an approach in itself.
For example, in TextRank the parameter for candidate generation is the number of terms to keep in the graph (T). If fewer than n candidates are generated, T could either be increased gradually until enough candidates are generated, or always set to 1, but the result would no longer be the algorithm described in the TextRank paper.
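
To make the trade-off concrete, here is a rough sketch of the paper's post-processing (keep the T best-scored words, then merge adjacent kept words into phrases) followed by an n-best selection that warns instead of silently changing T. The function names, the integer `top_t` argument, and scoring a phrase by the sum of its word scores are assumptions for illustration, not the code merged here.

```python
import logging


def collapse_top_words(words, scores, top_t):
    """Keep the top_t best-scored words and merge adjacent ones into phrases.

    words: document words in order; scores: dict mapping word to score.
    """
    keep = set(sorted(scores, key=scores.get, reverse=True)[:top_t])
    phrases, current = {}, []
    for word in list(words) + [None]:  # the None sentinel flushes the last run
        if word in keep:
            current.append(word)
        elif current:
            # Assumption: a phrase is scored by the sum of its word scores.
            phrases[" ".join(current)] = sum(scores[w] for w in current)
            current = []
    return phrases


def n_best(phrases, n):
    """Return up to n phrases by descending score, warning if there are fewer."""
    if len(phrases) < n:
        logging.warning(
            "Only %d candidates available, fewer than the %d requested.",
            len(phrases), n,
        )
    ranked = sorted(phrases.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]
```

When T is small, fewer than n phrases may come back and the warning fires. Raising `top_t` would produce more candidates, but the output would then depend on a T that the caller did not choose.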


ygorg commented Oct 16, 2018

The candidate generation described in the paper is implemented in the candidate_weighting function. If the candidates were generated and weighted in get_n_best instead, the weighting would happen in two different functions depending on the parameters, which does not fit the way the package is meant to be used.
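
The resulting split of responsibilities might look like the sketch below: candidate_weighting is the only place where weights are computed, whichever mode is used, and get_n_best only sorts. `TextRankSketch` is a hypothetical stand-in rather than the pke class, and treating `top` as a word count is an assumption made to keep the example short.

```python
from itertools import groupby


class TextRankSketch:
    """Hypothetical stand-in for a pke-style model; not the pke class."""

    def __init__(self, words, word_scores):
        self.words = words              # document words, in order
        self.word_scores = word_scores  # word -> graph score
        self.candidates = []            # filled by candidate_selection
        self.weights = {}               # candidate -> weight

    def candidate_selection(self, phrases):
        # Comparable mode: candidates come from an external selection step.
        self.candidates = list(phrases)

    def candidate_weighting(self, top=None):
        if top is not None:
            # Paper mode: candidates are generated here from the top words,
            # by merging runs of adjacent kept words.
            ranked = sorted(self.word_scores, key=self.word_scores.get,
                            reverse=True)
            keep = set(ranked[:top])
            runs = groupby(self.words, key=lambda w: w in keep)
            self.candidates = [" ".join(run) for kept, run in runs if kept]
        # Weighting happens in this function only, in both modes.
        self.weights = {
            c: sum(self.word_scores.get(w, 0.0) for w in c.split())
            for c in self.candidates
        }

    def get_n_best(self, n):
        # No generation or weighting here, only ranking.
        ranked = sorted(self.weights.items(), key=lambda kv: kv[1],
                        reverse=True)
        return ranked[:n]
```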

@boudinfl boudinfl merged commit 129c44f into boudinfl:python3 Oct 17, 2018