Distributed-CoreNLP

This infrastructure, built in Java on Stanford CoreNLP, MapReduce, and Spark, processes document annotations at large scale.

Build with Maven

  1. Make sure you have Maven installed; see https://maven.apache.org/ for details.
  2. From the Spark-CoreNLP directory, run: mvn clean package . This builds the jar file: target/project-1.0.jar

Run with MapReduce

Run a job using the following command:

hadoop jar target/project-1.0.jar ca.uwaterloo.cs651.MapReduce.CoreNLPMapReduce -input ${input path} -output ${output path} -functionality ${func1,func2,func3,...}
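The -functionality flag takes a comma-separated list of annotation tasks. As a rough illustration (the class and method names below are hypothetical, not taken from this repository), such a list might be split and rejoined into the "annotators" property string that a Stanford CoreNLP pipeline conventionally expects:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: turning a comma-separated -functionality argument
// (e.g. "tokenize,ssplit,pos") into a CoreNLP-style "annotators" property.
// This is illustrative only; the repository's actual parsing may differ.
public class FunctionalityArg {
    // Split on commas and rejoin with ", ", the separator CoreNLP's
    // "annotators" property conventionally uses.
    static String toAnnotatorsProperty(String functionality) {
        List<String> funcs = Arrays.asList(functionality.split(","));
        return String.join(", ", funcs);
    }

    public static void main(String[] args) {
        System.out.println(toAnnotatorsProperty("tokenize,ssplit,pos"));
        // prints "tokenize, ssplit, pos"
    }
}
```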

Run with Spark

Run a job using the following command:

spark-submit --class ca.uwaterloo.cs651.project.CoreNLP --num-executors ${num of mappers} --executor-cores ${num of mappers} --conf spark.executor.heartbeatInterval=10s --conf spark.network.timeout=20s --driver-memory 6G --executor-memory 20G target/project-1.0.jar -input ${input path} -output ${output path} -mappers $mappers -functionality ${func1,func2,func3,...} 
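Two things are worth noting about the flags above: --num-executors and --executor-cores are both set from the same ${num of mappers} value, and spark.network.timeout (20s) is kept larger than spark.executor.heartbeatInterval (10s), which Spark requires. A minimal sanity check for that timeout relationship might look like this (the class name is hypothetical, not part of this repository):

```java
// Hypothetical sketch: verify the relationship Spark enforces between
// spark.executor.heartbeatInterval and spark.network.timeout, using the
// values from the example command above (10s and 20s).
public class SparkConfCheck {
    // Spark requires the network timeout to exceed the heartbeat interval.
    static boolean timeoutsValid(int heartbeatSeconds, int networkTimeoutSeconds) {
        return networkTimeoutSeconds > heartbeatSeconds;
    }

    public static void main(String[] args) {
        System.out.println(timeoutsValid(10, 20)); // prints "true"
    }
}
```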
