Mehvin edited this page Aug 2, 2018 · 20 revisions

Project Title

Exploration of Automatic Text Summarization Algorithms

Project Overview/Background

Even important documents can be too long to read from start to finish. It is therefore desirable for a computer program to automatically condense long passages of text into shorter, more digestible summaries.

There are three existing methods of shortening documents:

  1. Extraction of key sentences/phrases, where a program finds the most important phrases in the text and discards the rest.
  2. Document meaning abstraction, where a program interprets the text and paraphrases its gist.
  3. Human-aided summarization, which requires manual effort to read and rewrite a document.
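
To make the first method concrete, here is a minimal sketch of extractive summarization in Python. It is not one of the models explored in this project; it assumes a simple word-frequency heuristic, a naive sentence splitter, and a small illustrative stop-word list. The names `summarize`, `max_sentences`, and `STOP_WORDS` are hypothetical and chosen only for this example.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption); a real system would use a fuller one.
STOP_WORDS = {
    "the", "a", "an", "is", "are", "of", "to", "and", "in",
    "it", "that", "this", "for", "on", "with", "as",
}


def content_words(text):
    """Lowercase the text and return its words, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def summarize(text, max_sentences=2):
    """Keep the highest-scoring sentences, in their original order."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Word frequencies over the whole document act as importance weights.
    freq = Counter(content_words(text))

    def score(sentence):
        # Average frequency of the sentence's content words.
        tokens = content_words(sentence)
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

Calling `summarize(document_text, max_sentences=3)` would return up to three sentences copied verbatim from the input. This is what distinguishes extraction from abstraction: nothing is paraphrased, and the rest of the text is simply dropped.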

Only the first two methods are of interest in this project, as they are fully automated and therefore save the most human time and effort. In addition, this project will focus on recently developed deep-learning techniques, such as neural attention models and sequence-to-sequence learning.

Project Objective

The aim of this project is to explore the viability of computer-automated text summarization techniques through automated standardized testing.

Project Timeline

[Image: project timeline]

Internship Project

This project was done during Melvin and Joe's 6-month internship (semester 3.1) for Ngee Ann Polytechnic.

Other Resources

Presentation Slides

Digital Poster

English Models Results

Chinese Models Results