Extraction and Analysis System of Topics for Software History Reports
Director: Dr. Daniela Godoy, Co-Director: Eng. Alejandro Corbellini.
Abstract: Topics are collections of words that frequently co-occur in a text corpus, and they have been found to be effective for describing the major themes that span a corpus. In this work I developed a Java tool that extracts topics from a dataset of Android bug reports by applying a topic model called Latent Dirichlet Allocation (LDA). The tool also computes various metrics on the identified topics, and I manually investigated how these metrics evolve over time. An additional feature is a full-text search engine built on Lucene, covering both indexing and full-text retrieval.