damping arg ignored in lexRankFromSimil #15

Closed
jwijffels opened this issue Feb 8, 2018 · 5 comments

Comments

@jwijffels

jwijffels commented Feb 8, 2018

I just had a look at this package because I recently created a similar one called textrank (https://cran.r-project.org/web/packages/textrank/index.html). This package seems to follow the same approach, although textrank starts from something that looks like the output of udpipe, which already contains the sentences and the tokenised words.
While skimming the code, I noticed that the damping argument of lexRankFromSimil is never used; maybe that is something to fix.
I'm also interested to hear whether you have found a way to reduce the computational burden of the many sentence-to-sentence similarity calculations.

@AdamSpannbauer
Owner

Thanks for pointing out the issue with damping; I will fix it.

The only step I have taken to reduce the computation time is to write a minimal Rcpp function to perform the similarity calculation. Some parallelization could also provide some benefit here.

Another possibility would be to get creative, deviate from the methodologies laid out in lexrank/textrank, and think of a new way to get valid sentence ranks without the cumbersome pairwise comparisons. Of course, I haven't thought of a valid way, but I'd be interested in learning more about a method if there is one out there.

@jwijffels
Author

There is the minhash algorithm, which I tried to wrap in the textrank package: https://cran.r-project.org/web/packages/textrank/vignettes/textrank.html#minhash but using it requires the user to be familiar with that algorithm. If you know of another way, feel free to let me know :).
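For readers unfamiliar with the idea, here is a rough sketch of how MinHash with banding can cut down the number of pairwise comparisons. It is written in plain Python only to stay dependency-free; it is not the code of textrank or lexRankr, and the names `minhash_signature`, `candidate_pairs`, `num_hashes` and `bands` are made up for this illustration. Sentences whose signatures agree on at least one band become candidate pairs, and only those pairs would then get the exact similarity calculation.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def minhash_signature(tokens, num_hashes=32):
    """Return a MinHash signature: one minimum hash value per seeded hash."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a different salt acts as a different hash function
        signature.append(
            min(
                int.from_bytes(
                    hashlib.blake2b(tok.encode("utf-8"), digest_size=8, salt=salt).digest(),
                    "big",
                )
                for tok in set(tokens)
            )
        )
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Share of signature positions that agree; approximates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def candidate_pairs(signatures, bands=8):
    """Pair up sentences that share at least one identical band of their signature."""
    rows = len(signatures[0]) // bands
    pairs = set()
    for band in range(bands):
        buckets = defaultdict(list)
        for idx, sig in enumerate(signatures):
            buckets[tuple(sig[band * rows:(band + 1) * rows])].append(idx)
        for members in buckets.values():
            pairs.update(combinations(members, 2))
    return pairs


# Hypothetical, already tokenised sentences.
sentences = [
    ["the", "cat", "sat"],
    ["sat", "the", "cat"],
    ["dogs", "bark", "loudly"],
]
signatures = [minhash_signature(tokens) for tokens in sentences]
pairs = candidate_pairs(signatures)
print(pairs)  # only these pairs need an exact similarity score
```

More bands with fewer rows each make the filter more permissive (more candidates, fewer missed pairs); fewer bands make it stricter. Choosing that trade-off is the part that requires knowing the algorithm.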

@AdamSpannbauer
Owner

FYI: I'm going to rename this issue to focus on the damping problem and will close it once that is fixed.

@AdamSpannbauer changed the title from "damping / al lot of comparisons" to "damping arg ignored in lexRankFromSimil" on Feb 12, 2018
@wenyi-tay

Hi! Thank you for maintaining this package. It saved me LOADS of time! I cannot thank you enough. I may have found the line in the code that is causing the issue, so I thought I would share it with you. The original is:

    sentRank <- igraph::page_rank(sentGraph, directed = FALSE)$vector

I think that, to take in the damping argument, it should be:

    sentRank <- igraph::page_rank(sentGraph, directed = FALSE, damping = damping)$vector

Hope it helps you =) Thanks!
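To illustrate why the dropped argument matters, here is a small sketch of PageRank by power iteration on an undirected similarity graph. It is plain Python chosen only to stay dependency-free; it is not the package's R code and not igraph's implementation, and the function name `pagerank`, its `tol` and `max_iter` parameters, and the toy similarity matrix are assumptions made for this example. It also assumes every sentence has at least one edge.

```python
def pagerank(weights, damping=0.85, tol=1e-10, max_iter=1000):
    """Power-iteration PageRank on a symmetric weight matrix (list of lists).

    Assumes every node has at least one edge, so no dangling-node handling.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]  # total edge weight per node
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new_rank = []
        for i in range(n):
            # Rank flowing into i from each neighbour j, split by j's edge weights.
            incoming = sum(rank[j] * weights[j][i] / strength[j] for j in range(n))
            new_rank.append((1.0 - damping) / n + damping * incoming)
        converged = sum(abs(a - b) for a, b in zip(new_rank, rank)) < tol
        rank = new_rank
        if converged:
            break
    return rank


# Toy graph: sentence 1 is similar to sentences 0 and 2, which are unrelated.
similarity = [
    [0.0, 0.6, 0.0],
    [0.6, 0.0, 0.6],
    [0.0, 0.6, 0.0],
]

ranks_default = pagerank(similarity, damping=0.85)
ranks_low = pagerank(similarity, damping=0.5)
print(ranks_default)
print(ranks_low)  # the central sentence's lead shrinks as damping decreases
```

When the argument is not passed through, every call behaves like the default case regardless of what the user requested, so the two results above would be indistinguishable from the caller's side.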

@AdamSpannbauer
Owner

Thanks for re-pinging this issue @wenyi-tay. I've committed the fix (which was exactly as you outlined) and will re-publish to CRAN assuming all tests are 👍.

If you do find any other issues and see the correct fix (as you did this time), please feel free to submit a PR. I haven't been very active with this project lately, so I'll rely on feedback/PRs from any users.
