
Review_summarizer

Summarizes the reviews

To make crawler.py work, install lxml and requests (i.e. 'pip install lxml' and 'pip install requests').
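For reference, here is a minimal sketch of the kind of fetch-and-parse code these two packages enable (the URL and XPath below are placeholders, not the ones crawler.py actually uses):

import requests
from lxml import html

# Placeholder URL; crawler.py targets its own review pages.
page = requests.get("https://example.com/reviews")
tree = html.fromstring(page.content)

# Placeholder XPath; adapt it to the page being crawled.
for review in tree.xpath('//div[@class="review-text"]/text()'):
    print(review.strip())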

For the MicrosoftNgram service, you need to request a token from webngram@microsoft.com and place it in your .bashrc as follows:

export NGRAM_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

More information about MicrosoftNgram here: http://weblm.research.microsoft.com/info/
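From Python, the token can then be read back from the environment. The snippet below only shows that step; how the token is passed to the MicrosoftNgram client itself depends on that client's API and is not shown here:

import os

# NGRAM_TOKEN comes from the export line in .bashrc shown above.
ngram_token = os.environ.get('NGRAM_TOKEN')
if ngram_token is None:
    raise RuntimeError('NGRAM_TOKEN is not set; see the .bashrc instructions above')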

To run CoreNLP on input.txt, run:

java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -file input.txt

You can drop some of the annotators, for example:

java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos -file input.txt

For 'make run' to work, you need CoreNLP installed and the environment variable CORENLP_PATH set to the folder containing the CoreNLP sources (it should look something like ".../stanford-corenlp-full-...").

Also set the CORENLP_MEMORY environment variable, which is the heap size Java will run with:

export CORENLP_MEMORY=2g

More information about CoreNLP here: http://nlp.stanford.edu/software/corenlp.shtml

List of POS tags used by CoreNLP: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
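Running CoreNLP as above writes its result next to the input file (by default an XML file such as input.txt.xml). Below is a rough sketch of reading the tokens and POS tags back with lxml, assuming the default XML output format; element names can differ between CoreNLP versions, so inspect the generated file if this prints nothing:

from lxml import etree

# Parse the XML that CoreNLP produced for input.txt.
tree = etree.parse('input.txt.xml')

# Each <token> carries the word and its POS tag (among other annotations).
for token in tree.findall('.//token'):
    print(token.findtext('word'), token.findtext('POS'))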
