This scenario analyzes a social graph built from e-mail exchanges. We use a small extract from the Enron corpus (see enron.csv) that lists the source and destination of each e-mail. We apply the PageRank algorithm to rank Enron employees, treating an e-mail from person A to person B as A in some way endorsing B. Estimating the relative importance of individuals within a social network is a key step in a number of applications, including fraud investigation and marketing.
PageRank measures the importance of each vertex in a graph, assuming that an edge from u to v represents an endorsement of v's importance by u.
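To make the endorsement idea concrete, here is a minimal pure-Python sketch of PageRank by power iteration over (sender, recipient) pairs. It is only an illustration and not the code used by the extension, which relies on Spark GraphX. The function name `pagerank`, the damping factor of 0.85, the iteration count, and the sample names are all assumptions made for this example, and the sample e-mails are invented rather than taken from enron.csv.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Rank vertices by power iteration over (source, destination) edges."""
    nodes = sorted({v for edge in edges for v in edge})
    out_links = {v: [] for v in nodes}
    for src, dst in edges:
        # Repeated e-mails between the same pair count as repeated endorsements.
        out_links[src].append(dst)

    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by vertices with no outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue
            # Each sender splits its rank equally among its recipients.
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical e-mails as (sender, recipient) pairs; not taken from enron.csv.
emails = [("alice", "carol"), ("bob", "carol"), ("carol", "dave"), ("dave", "carol")]
ranks = pagerank(emails)
for person, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{person}: {score:.3f}")
```

In this toy graph, carol receives mail from three different people and ends up with the highest score, while alice and bob, who receive nothing, share the lowest. The scores in this sketch sum to 1 and will not necessarily match the values produced by the extension.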
- Learn more about Spark GraphX
- Learn more about PageRank Algorithm
- IBM SPSS Modeler v17.1
- IBM SPSS Analytic Server 2.1
More information here: IBM Predictive Extensions
- Download the extension: Download
- Close IBM SPSS Modeler. Save the .cfe file in the CDB directory, located by default on Windows in "C:\ProgramData\IBM\SPSS\Modeler\17.1\CDB" or under your IBM SPSS Modeler installation directory.
- Restart IBM SPSS Modeler. The node will now appear in the Model palette.
- Nial McCarrol
- Armand Ruiz (armand_ruiz)
