Option for returning all sentences with ranked order #154
Comments
Hi, unfortunately, there is no unified method to get this. For the LSA method it's at Line 47 in c948bc7
You will find it in the __call__ method of every summarizer.
Thanks for the suggestion, but I don't think I want to add boolean flags to the API. Maybe a new method, but can you please describe your use case so I could maybe help you in a better way? What are you trying to achieve?
Thanks for your answer! Sure, each method calculates the ranks in a different way, but that's not what I was talking about. I meant that after the sentences are sorted by the algorithm's rank, they are re-sorted in document order: sumy/sumy/summarizers/_summarizer.py Line 51 in c948bc7
which makes sense, of course. But this makes it impossible to retrieve all sentences in "rating" order, because the line above restores the original document order.
Examples:
# Sentence order in terms of code from AbstractSummarizer._get_best_sentences() with sentence_count = 2
text = "A bit important. Useless sentence. This is very important! But this too."
# After line 45 (method order)
infos = ["But this too.", "This is very important!", "A bit important.", "Useless sentence."]
# After line 49 (reduced to sentence_count)
infos = ["But this too.", "This is very important!"]
# After line 51 (document order)
infos = ["This is very important!", "But this too."]  # this is fine :)
Using a summarizer to return all sentences:
# Sentence order in terms of code from AbstractSummarizer._get_best_sentences() with sentence_count = 100%
text = "A bit important. Useless sentence. This is very important! But this too."
# After line 45 (method order)
infos = ["But this too.", "This is very important!", "A bit important.", "Useless sentence."]
# After line 49 (reduced to sentence_count)
infos = ["But this too.", "This is very important!", "A bit important.", "Useless sentence."] # That is the result I would want in this case
# After line 51 (document order)
infos = ["A bit important.", "Useless sentence.", "This is very important!", "But this too."]  # this is bad because it is just the document order
Goal: I'm trying to get back the whole document re-ordered with a specific method. I hope you understand what I mean :)
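The three steps described above (sort by rating, cut to the requested count, restore document order) can be sketched in plain Python. This is only an illustration, not sumy's actual code: the function name best_sentences, the document_order parameter, and the numeric ratings are all assumptions made up to reproduce the ordering in the example.

```python
from operator import itemgetter


def best_sentences(sentences, ratings, count, document_order=True):
    """Rank sentences, keep the `count` best, optionally restore document order.

    Hypothetical helper for illustration; not part of sumy.
    """
    # Pair each sentence with its position in the document and its rating.
    infos = [(s, pos, ratings[s]) for pos, s in enumerate(sentences)]
    # Step 1: sort by rating, best first ("method order").
    infos.sort(key=itemgetter(2), reverse=True)
    # Step 2: keep only the `count` best sentences.
    infos = infos[:count]
    # Step 3: re-sort by position, which discards the rating order.
    if document_order:
        infos.sort(key=itemgetter(1))
    return [s for s, _, _ in infos]


sentences = ["A bit important.", "Useless sentence.",
             "This is very important!", "But this too."]
# Made-up ratings chosen so the ranking matches the example above.
ratings = {"A bit important.": 2, "Useless sentence.": 1,
           "This is very important!": 3, "But this too.": 4}

summary = best_sentences(sentences, ratings, 2)
ranked_all = best_sentences(sentences, ratings, len(sentences), document_order=False)
```

With count equal to the number of sentences, step 3 always returns the input order, so the ranking is only visible when that step is skipped.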
Yeah, I know what you mean. By asking what you are trying to achieve, I wanted to find out why you need it. Like a business use case for this. What is your motivation for doing it? Because, as I said, there is currently no way to do it, but if you are doing something that could be a bigger use case (more people would find it useful), I could implement it somehow. Or you could send a PR. But if it's something just for you, maybe a better way is to inherit from the LSA summarizer and override the method.
I'm trying to list all sentences of a document according to importance :)
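The inherit-and-override idea could look roughly like the sketch below. BaseSummarizer here is a simplified stand-in written for this example, not sumy's real class, and the method signature is an assumption; the actual signature of _get_best_sentences should be checked in sumy/summarizers/_summarizer.py before adapting this.

```python
class BaseSummarizer:
    """Simplified stand-in for a summarizer base class (NOT sumy's code)."""

    def _rank(self, sentences, rating):
        # (position, sentence) pairs sorted by rating, best first.
        pairs = list(enumerate(sentences))
        return sorted(pairs, key=lambda p: rating[p[1]], reverse=True)

    def _get_best_sentences(self, sentences, count, rating):
        best = self._rank(sentences, rating)[:count]
        best.sort(key=lambda p: p[0])  # back to document order
        return tuple(s for _, s in best)


class RankedOrderSummarizer(BaseSummarizer):
    """Hypothetical subclass that keeps the rating order."""

    def _get_best_sentences(self, sentences, count, rating):
        # Same ranking and cut-off, but without the final re-sort.
        best = self._rank(sentences, rating)[:count]
        return tuple(s for _, s in best)


doc = ["A bit important.", "Useless sentence.",
       "This is very important!", "But this too."]
# Made-up ratings for illustration only.
rating = {doc[0]: 2, doc[1]: 1, doc[2]: 3, doc[3]: 4}

in_document_order = BaseSummarizer()._get_best_sentences(doc, 4, rating)
in_rating_order = RankedOrderSummarizer()._get_best_sentences(doc, 4, rating)
```

Overriding only this one method leaves the rating logic of the parent class untouched, so the API needs no extra flag.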
First, I really like your project :)
What I stumbled across is the following: I wanted to write a method that returns the top n (n = return_top) sentences using one of your summarizers, but also the "excluded" sentences:
Then I figured out that summary always contains the first return_top sentences in document order. This is due to this line in AbstractSummarizer where the resulting sentences get reordered by document order.
Would it be meaningful for you if an option ([True, False]) for reordering by document order were added somehow?
Thanks!