daviidli/summary

summary

Extractive text summarization using TextRank and RAKE.

Demo

API

Hosted for demo purposes only. Please host your own server if you require consistent performance.

Selection types:

enum TextRankSelections {
	Text = 'text',
	Sentences = 'sentences',
	Ranks = 'ranks',
}
enum RakeSelections {
	Text = 'text',
	Sentences = 'sentences',
	Ranks = 'ranks',
	Keywords = 'keywords',
}

Getting the summary for a URL:

Endpoint: https://summsumm.herokuapp.com/summary/url/ (POST)

Request body:

interface UrlRequest {
	url: string;
	selections: {
		textRank?: TextRankSelections[];
		rake?: RakeSelections[];
	}
}
  • API will return the requested selections only
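As a rough illustration of the request shape above, the sketch below builds the options for a POST to the URL endpoint. The helper name `buildJsonPost`, the placeholder article URL, and the `Content-Type: application/json` header are assumptions made for this example; they are not confirmed by the API description.

```typescript
// Mirrors the selection enums and UrlRequest shape documented above.
type TextRankSelection = 'text' | 'sentences' | 'ranks';
type RakeSelection = 'text' | 'sentences' | 'ranks' | 'keywords';

interface UrlRequestBody {
	url: string;
	selections: {
		textRank?: TextRankSelection[];
		rake?: RakeSelection[];
	};
}

interface JsonPostOptions {
	method: string;
	headers: Record<string, string>;
	body: string;
}

// Hypothetical helper: wraps a request body in options suitable for fetch().
function buildJsonPost(body: UrlRequestBody): JsonPostOptions {
	return {
		method: 'POST',
		// Assumption: the server expects a JSON-encoded body.
		headers: { 'Content-Type': 'application/json' },
		body: JSON.stringify(body),
	};
}

const exampleRequest = buildJsonPost({
	url: 'https://example.com/article', // placeholder URL
	selections: { textRank: ['sentences'], rake: ['keywords'] },
});

// Sending it (not executed here):
// const res = await fetch('https://summsumm.herokuapp.com/summary/url/', exampleRequest);
// const summary = await res.json();
```

The same options should work for the text endpoint if the `url` field is swapped for `text`, since the two request bodies otherwise share the same `selections` shape.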

Getting the summary for input text:

Endpoint: https://summsumm.herokuapp.com/summary/text/ (POST)

Request body:

interface TextRequest {
	text: string;
	selections: {
		textRank?: TextRankSelections[];
		rake?: RakeSelections[];
	}
}
  • API will return the requested selections only

Response

interface Response {
	textRank?: TextRankResponse;
	rake?: RakeResponse;
}
// response is based on selections made in request

interface TextRankResponse {
	text?: string;
	sentences?: string[];
	ranks?: Rank[];	// shown below
}

interface RakeResponse {
	text?: string;
	sentences?: string[];
	ranks?: Rank[]; // shown below
	keywords?: KeywordScore[]; // shown below
}
interface Rank {
	sentenceIndex: number;
	sentence: string;
	rank: number;
	keywords?: string[];
	score?: number;
}

interface KeywordScore {
	keyword: string;
	degree: number;
	frequency: number;
	degFreq: number;
}
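
A small sketch of how a client might consume these response types. The helper names are hypothetical, and the ordering rule for keywords (a higher `degFreq` means a more important keyword) is an assumption, since the meaning of the numeric fields is not spelled out above.

```typescript
// Same shapes as the Rank and KeywordScore interfaces above.
interface Rank {
	sentenceIndex: number;
	sentence: string;
	rank: number;
	keywords?: string[];
	score?: number;
}

interface KeywordScore {
	keyword: string;
	degree: number;
	frequency: number;
	degFreq: number;
}

// Puts ranked sentences back into their original document order,
// which usually reads better than showing them in rank order.
function inDocumentOrder(ranks: Rank[]): string[] {
	return ranks
		.slice() // copy so the caller's array is not mutated
		.sort((a, b) => a.sentenceIndex - b.sentenceIndex)
		.map((r) => r.sentence);
}

// Assumption: a higher degFreq marks a more important keyword.
function topKeywords(keywords: KeywordScore[], n: number): string[] {
	return keywords
		.slice()
		.sort((a, b) => b.degFreq - a.degFreq)
		.slice(0, n)
		.map((k) => k.keyword);
}
```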

Algorithms

TextRank

Uses Google's PageRank algorithm, but ranks sentences instead of web pages. Cosine similarity is used to measure how similar two sentences are.
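
The idea can be sketched roughly as follows. This is a minimal illustration, not the code used by this project: it assumes plain word-count vectors for the cosine similarity, a damping factor of 0.85, and a fixed number of iterations, and all function names are made up for the example.

```typescript
// Splits a sentence into lowercase word tokens.
function tokenize(sentence: string): string[] {
	return sentence.toLowerCase().match(/[a-z0-9']+/g) ?? [];
}

// Counts how often each token occurs (a bag-of-words vector).
function termFrequencies(tokens: string[]): Record<string, number> {
	const tf: Record<string, number> = Object.create(null);
	for (const t of tokens) tf[t] = (tf[t] ?? 0) + 1;
	return tf;
}

// Cosine similarity between two bag-of-words vectors.
function cosineSimilarity(a: Record<string, number>, b: Record<string, number>): number {
	let dot = 0;
	let normA = 0;
	let normB = 0;
	for (const term of Object.keys(a)) {
		dot += a[term] * (b[term] ?? 0);
		normA += a[term] * a[term];
	}
	for (const term of Object.keys(b)) normB += b[term] * b[term];
	return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// PageRank-style power iteration over the sentence similarity graph.
function textRank(sentences: string[], damping = 0.85, iterations = 50): number[] {
	const n = sentences.length;
	const vectors = sentences.map((s) => termFrequencies(tokenize(s)));
	// weights[i][j] is the similarity between sentences i and j (no self-links).
	const weights = vectors.map((vi, i) =>
		vectors.map((vj, j) => (i === j ? 0 : cosineSimilarity(vi, vj))));
	const outSums = weights.map((row) => row.reduce((sum, w) => sum + w, 0));
	let scores = sentences.map(() => 1 / n);
	for (let iter = 0; iter < iterations; iter++) {
		scores = scores.map((_, i) => {
			let incoming = 0;
			for (let j = 0; j < n; j++) {
				// Each neighbour j passes on its score in proportion to the edge weight.
				if (outSums[j] > 0) incoming += (weights[j][i] / outSums[j]) * scores[j];
			}
			return (1 - damping) / n + damping * incoming;
		});
	}
	return scores;
}
```

Sentences that share vocabulary with many other sentences end up with higher scores, while a sentence with no overlap keeps only the baseline `(1 - damping) / n`.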

RAKE (Rapid Automatic Keyword Extraction)

A summarization method that extracts keywords from the text and then computes a score for each word. Each sentence's score is the sum of its keywords' scores, and the sentences are then ranked by that score. It tends to favor longer keywords and longer sentences.
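
A rough sketch of this scoring, under stated assumptions: candidate keywords are runs of words between stopwords or punctuation, each word is scored as degree divided by frequency (the usual RAKE word score), and the stopword list is a tiny illustrative one. The function names are hypothetical, and this project's own scoring may differ.

```typescript
// Tiny illustrative stopword list (assumption); real lists are much longer.
const STOPWORDS = ['a', 'an', 'and', 'the', 'of', 'is', 'are', 'in', 'on', 'to', 'for', 'it', 'from'];

// Splits a sentence into candidate keywords: runs of words between
// stopwords or punctuation. Each keyword is returned as a list of words.
function candidatePhrases(sentence: string): string[][] {
	const phrases: string[][] = [];
	for (const chunk of sentence.toLowerCase().split(/[^a-z0-9'\s]+/)) {
		let current: string[] = [];
		for (const word of chunk.split(/\s+/)) {
			if (word === '' || STOPWORDS.indexOf(word) !== -1) {
				if (current.length > 0) phrases.push(current);
				current = [];
			} else {
				current.push(word);
			}
		}
		if (current.length > 0) phrases.push(current);
	}
	return phrases;
}

// Scores each sentence as the sum of the scores of its keywords' words.
function rakeSentenceScores(sentences: string[]): number[] {
	const phrasesPerSentence = sentences.map(candidatePhrases);
	const frequency: Record<string, number> = Object.create(null);
	const degree: Record<string, number> = Object.create(null);
	for (const phrases of phrasesPerSentence) {
		for (const phrase of phrases) {
			for (const word of phrase) {
				frequency[word] = (frequency[word] ?? 0) + 1;
				// Degree counts the words in the same keyword, including the word itself.
				degree[word] = (degree[word] ?? 0) + phrase.length;
			}
		}
	}
	const wordScore = (w: string): number => degree[w] / frequency[w];
	return phrasesPerSentence.map((phrases) =>
		phrases.reduce((sum, phrase) => sum + phrase.reduce((s, w) => s + wordScore(w), 0), 0));
}
```

Because a word's degree grows with the length of the keyword it appears in, long keywords (and sentences containing many of them) accumulate larger scores, which matches the bias noted above.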

Original v1 code