This repository has been archived by the owner on Jul 7, 2022. It is now read-only.

An implementation of the PageRank algorithm built on the Apache Hadoop framework


danielepantaleone/hadoop-pagerank

Hadoop PageRank

An implementation of the PageRank algorithm built on the Apache Hadoop framework.

Execute the program

  • Install Hadoop on your machine [OSX], [Linux]
  • Pick a dataset from the Stanford web graphs collection
  • Copy the dataset into the Hadoop file system (HDFS)
  • Create the directory that will contain the output
  • Build a JAR from this source code and name it pagerank.jar
  • Launch the job with Hadoop: hadoop jar pagerank.jar --input <in> --output <out>
  • Browse the PageRank results, which are written to the Hadoop file system
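
As a rough aid for inspecting a dataset before uploading it, the sketch below reads an edge list in the layout commonly used by the Stanford web graph files: comment lines starting with `#`, followed by one whitespace-separated `FromNodeId ToNodeId` pair per line. This is only an illustration in Python, not code from this repository; the function name `parse_edge_list` and the sample lines are hypothetical, and whether the Hadoop job accepts this layout unchanged is an assumption.

```python
def parse_edge_list(lines):
    """Turn edge-list text lines into a list of (source, target) integer pairs.

    Assumed layout: lines beginning with '#' are comments; every other
    non-empty line holds two whitespace-separated node ids.
    """
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        src, dst = line.split()[:2]
        edges.append((int(src), int(dst)))
    return edges


# Hypothetical sample mimicking the header of a web graph file.
sample = [
    "# Directed graph: sample.txt",
    "# FromNodeId\tToNodeId",
    "0\t1",
    "0\t2",
    "1\t2",
]
edges = parse_edge_list(sample)
print(edges)
```

Running the parser over the first few lines of a downloaded file is a quick way to confirm that the file is a directed edge list before placing it in the Hadoop file system.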

Usage reference

  • --help (-h): display the help text
  • --damping (-d): the damping factor [OPTIONAL] [DEFAULT = 0.85]
  • --count (-c): the number of iterations [OPTIONAL] [DEFAULT = 2]
  • --input (-i): the directory of the input graph [REQUIRED]
  • --output (-o): the directory of the output result [REQUIRED]
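
To make the roles of the damping factor and the iteration count concrete, here is a minimal single-machine sketch in Python. It is not the MapReduce code in this repository. It assumes the normalized update rule, where each node receives (1 - d) / N plus d times the rank shared by its in-neighbours, and it assumes that nodes without outgoing links simply drop their rank; the repository may handle either point differently. The function name `pagerank` and the toy graph are hypothetical, and the defaults mirror the `--damping` and `--count` defaults listed above.

```python
def pagerank(edges, damping=0.85, iterations=2):
    """Iterative PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    ranks = {node: 1.0 / n for node in nodes}  # uniform starting ranks
    for _ in range(iterations):
        # Every node first receives the "teleport" share (1 - d) / N.
        new_ranks = {node: (1.0 - damping) / n for node in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue  # assumption: rank of dangling nodes is dropped
            share = damping * ranks[src] / len(targets)
            for dst in targets:
                new_ranks[dst] += share
        ranks = new_ranks
    return ranks


# Hypothetical toy graph: 1 -> 2, 1 -> 3, 2 -> 3, 3 -> 1
edges = [(1, 2), (1, 3), (2, 3), (3, 1)]
ranks = pagerank(edges)
for node, rank in sorted(ranks.items()):
    print(node, round(rank, 6))
```

A higher damping factor gives more weight to the link structure and less to the uniform share, while more iterations bring the ranks closer to their fixed point. With only two iterations the values are a coarse approximation.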
