Discussion summarization is the process of condensing a text document that consists of a collection of discussion threads. We use a Cluster-Based Summarization (CBS) approach to create a relevant summary that lists most of the important points of the original thematic discussion, giving users a piece of information that is both concise and comprehensive. The summary outlines, in a single document, the opinions expressed from multiple perspectives. It is free of editorial bias, because the information is extracted from multiple sources by a designed algorithm, without subjective human intervention. The extractive methods used here select a subset of the existing words, phrases, or sentences of the original text to form the summary. An iterative ranking algorithm is used for clustering. Natural Language Processing (NLP) is used to process the human language data; specifically, it is applied when working with corpora, categorizing text, and analyzing linguistic structure. The resulting summary is intended to be salient, relevant, and non-redundant. The proposed model is validated by testing its ability to generate summaries of discussions on Yahoo Answers. Results show that the proposed model generates more relevant summaries than existing summarization techniques.
P.S.: The domain assumed for testing is 'Superstition', as there are many relevant discussions on this topic.
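
To make the extractive, cluster-based idea described above concrete, the following is a minimal sketch in Python. It is not the proposed model: the greedy threshold clustering and the pick-the-most-central-sentence step are simple stand-ins for the CBS and iterative ranking steps, whose details are not given here. All names (`vectorize`, `cosine`, `cluster`, `summarize`), the stopword list, and the threshold of 0.3 are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "is", "are", "it", "to", "of", "and", "in", "that", "i"}

def vectorize(sentence):
    """Bag-of-words term counts, lowercased, with stopwords removed."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

def cluster(vectors, threshold=0.3):
    """Greedy single-pass clustering: a sentence joins the first cluster
    whose seed (first member) is at least `threshold` similar to it."""
    clusters = []  # each cluster is a list of sentence indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def summarize(sentences, threshold=0.3):
    """Pick one representative sentence per cluster, in original order."""
    vectors = [vectorize(s) for s in sentences]
    picked = []
    for members in cluster(vectors, threshold):
        # Representative: the member most similar to the other members.
        best = max(members, key=lambda i: sum(
            cosine(vectors[i], vectors[j]) for j in members if j != i))
        picked.append(best)
    return [sentences[i] for i in sorted(picked)]

# Hypothetical posts from a 'Superstition' discussion thread.
posts = [
    "Black cats bring bad luck.",
    "I think black cats bring bad luck too.",
    "Breaking a mirror means seven years of misfortune.",
    "A broken mirror means seven years of misfortune, people say.",
]
summary = summarize(posts)
```

With these four posts, the sketch groups the two black-cat sentences and the two mirror sentences into separate clusters and keeps one sentence from each, which illustrates the non-redundancy goal. A real system would need proper sentence splitting, a weighting scheme such as TF-IDF, and the iterative ranking step in place of the greedy heuristic.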