Skip to content

puzzlef/pagerank-multiple-damping

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Comparing performance of single vs block multiple damping factor based ranks for PageRank algorithm .

The damping factor parameter is the PageRank algorithm controls the tendency of a user to follow one of the links on a given web page. A damping factor (α) of 1 implies that the user (or a web surfer) will always follow one of the given links on the given webpage (like a crawler bot). In contrast, a damping factor of 0 implies that the user will always jump to a random webpage. A common choice for damping factor is 0.85 which amounts to an average path (link follow) length of 6.67. This choice can make certain webpages seem too important, and leave the remainder of the webpages with negligible importance. A prudent approach could be to allow the average path length to vary from 2 to 10, which amounts to the damping factor parameter ranging from 0.50 to 0.90.

When ranking vertices with multiple different damping factors is desireable, the PageRank computation upon the graph can be peformed one after the other with each desired damping factor (single), or with multiple damping factors in one go (block). With the block-based approach, given D different damping factors, each vertex in the graph is mapped to a list of D ranks. This is implemented as a single-dimensional vector with a size of N*D, where N is the total number of vertices in the graph. The idea behind such a block-based arrangement of ranks is to allow for better usage of memory bandwidth by combining multiple accesses to the graph data structure (in case of single-based approach) to one, and by increasing the amount of sequential memory accesses into the rank vector (due to blocks of D ranks per vertex).


With Sequential implementation

Results on a sequential implementation (implementation-sequential) indicate that there is significant overhead when running the block-based approach with a block size of 1 (one damping factor only). This could be due to the additional logic which for some reason prevents the compiler from being able to optimize the code. However, with a block size of 32 (32 damping factors in descending order), the block-based approach achieves an average speedup of 1.67x (GM-RATIO) / 1.70x (AM-RATIO) with respect to single-based approach.



References



ORG DOI

About

Comparing performance of single vs block multiple damping factor based ranks for PageRank algorithm.

Resources

License

Stars

Watchers

Forks

Languages