Estimating the relative importance of individuals within a social network using Spark MLlib


The scenario here is to analyze a social graph built from email exchanges. We use a small extract from the Enron corpus (see enron.csv) listing the source and destination of each email. We apply the PageRank algorithm to rank Enron employees, treating an email from person A to person B as A in some way endorsing B. Estimating the relative importance of individuals within a social network is a key step in a number of applications, including fraud investigation and marketing.

PageRank measures the importance of each vertex in a graph, assuming that an edge from u to v represents an endorsement of v's importance by u.
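To make the endorsement idea concrete, here is a minimal pure-Python sketch of PageRank by power iteration. It is not the code used by this extension, which relies on Spark; the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the sample sender/recipient pairs are all assumptions chosen for illustration.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Rank the vertices of a directed graph by power iteration.

    edges is an iterable of (source, destination) pairs. The damping
    factor and iteration count are illustrative defaults.
    """
    nodes = sorted({v for edge in edges for v in edge})
    out_links = defaultdict(set)
    for src, dst in edges:
        out_links[src].add(dst)  # repeated emails count as one edge

    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by vertices without outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {
            v: (1.0 - damping) / n + damping * dangling / n for v in nodes
        }
        # Each vertex passes its rank in equal shares to the vertices it endorses.
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


# Hypothetical sender/recipient pairs, not rows taken from enron.csv.
sample_edges = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "dave"),
    ("alice", "dave"),
    ("dave", "bob"),
]
ranks = pagerank(sample_edges)
for person, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{person}: {score:.3f}")
```

In this toy graph, people who receive emails from well-ranked senders end up with higher scores than people who only send emails, which is the behavior the endorsement interpretation predicts.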



Requirements

  • IBM SPSS Modeler v17.1
  • IBM SPSS Analytic Server 2.1

More information here: IBM Predictive Extensions

Installation instructions

  1. Download the extension: Download
  2. Close IBM SPSS Modeler. Save the .cfe file in the CDB directory, located by default on Windows in "C:\ProgramData\IBM\SPSS\Modeler\17.1\CDB" or under your IBM SPSS Modeler installation directory.
  3. Restart IBM SPSS Modeler. The node will now appear in the Model palette.


License

Apache 2.0