damping arg ignored in lexRankFromSimil #15
Comments
Thanks for pointing out the issue with damping. The only step I have taken to reduce the computation time is to write a minimal Rcpp function to perform the similarity calculation. Some parallelization could also provide some benefit here. Another possibility would be to get creative, deviate from the methodologies laid out in lexrank/textrank, and think of a new way to get valid sentence ranks without the cumbersome pairwise comparisons. Of course, I haven't thought of a valid way, but I'd be interested in learning more about such a method if there is one out there.
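To make the cost of those pairwise comparisons concrete, here is a minimal Python sketch. It is not the package's Rcpp routine: it uses plain term-frequency cosine similarity rather than the idf-modified cosine from the LexRank paper, and the function names `cosine` and `pairwise_similarities` are made up for illustration. The point is only that n sentences require n(n-1)/2 comparisons.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def pairwise_similarities(sentences):
    """Similarity for every unordered sentence pair: n * (n - 1) / 2 entries."""
    # Naive whitespace tokenisation; a real pipeline would tokenise properly.
    vectors = [Counter(s.lower().split()) for s in sentences]
    return {
        (i, j): cosine(vectors[i], vectors[j])
        for i, j in combinations(range(len(vectors)), 2)
    }


sentences = ["the cat sat", "the cat sat", "dogs bark loudly", "the dog sat"]
sims = pairwise_similarities(sentences)
```

Doubling the number of sentences roughly quadruples the number of pairs, which is why both a faster inner loop and parallelization only help by a constant factor.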
There is the minhash algorithm, which I tried to wrap in the textrank package: https://cran.r-project.org/web/packages/textrank/vignettes/textrank.html#minhash but that requires the user to have some knowledge of that algorithm. If you know of another way, also feel free to let me know :).
FYI: I'm going to rename this issue to focus on the damping problem and close it once that is fixed.
Hi! Thank you for maintaining this package. This package saved me LOADS of time! I cannot thank you enough. I may have found the line in the code that is potentially causing the issue, so I thought I would share it with you. The original is
Thanks for re-pinging this issue @wenyi-tay. I've committed the fix (which was exactly as you outlined) and will re-publish to CRAN assuming all tests are 👍. If you do find any other issues and see the correct fix (as you did this time), please feel free to submit a PR. I haven't been very active with this project lately, so I'll rely on feedback/PRs from any users.
I just had a look at this package because I recently created a similar package called textrank (https://cran.r-project.org/web/packages/textrank/index.html). This package seems to follow the same approach, although the textrank package starts from something that looks like the output of udpipe, which already contains the sentences and all the tokenised words.
While skimming the code, I noticed that you are not using the damping argument in lexRankFromSimil; maybe that is something to fix.
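To illustrate why an ignored damping argument matters, here is a minimal power-iteration PageRank over a similarity matrix, written in Python. It is a sketch under stated assumptions and not the package's code: `pagerank_from_similarity` is a hypothetical name, and `damping` is taken to be the probability of following a similarity link rather than jumping to a random sentence, which is the convention many PageRank implementations use.

```python
def pagerank_from_similarity(sim, damping=0.85, tol=1e-10, max_iter=1000):
    """Power-iteration PageRank on a square similarity matrix (list of lists)."""
    n = len(sim)
    # Row-normalise into transition probabilities; isolated rows jump uniformly.
    trans = []
    for row in sim:
        total = sum(row)
        trans.append([v / total for v in row] if total > 0 else [1.0 / n] * n)

    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new = [
            (1.0 - damping) / n
            + damping * sum(rank[j] * trans[j][i] for j in range(n))
            for i in range(n)
        ]
        # Stop once the scores have stopped changing (L1 distance below tol).
        if sum(abs(a - b) for a, b in zip(new, rank)) < tol:
            return new
        rank = new
    return rank


# Toy similarity matrix: sentence 0 is similar to both of the others.
sim = [
    [0.0, 0.8, 0.3],
    [0.8, 0.0, 0.0],
    [0.3, 0.0, 0.0],
]
ranks_085 = pagerank_from_similarity(sim, damping=0.85)
ranks_050 = pagerank_from_similarity(sim, damping=0.50)
```

The two calls return different scores for the same matrix. If a wrapper accepted `damping` but never forwarded it to the ranking step, both calls would silently return the same result, which is the symptom described in this issue.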
I'm also interested to hear whether you have found a way to reduce the computational burden of doing many sentence-to-sentence similarity calculations.