Graph_Assignment

Assignment on graph processing using GraphX in Apache Spark

Generate a un-directional graph RDD from a given graph data.

Compute listed vertex-based similarity measures for all the pairs of nodes in label data file. These similarity measures are computed between two nodes by utilizing neighborhood and/or node information of both nodes. Common neighbors Jaccard coefficient Adamic/Adar Preferential Attachment

Bonus Question: Link Prediction Model Using the measures generated from the graph and labels from the labeled data to predict the possibility of new link formation. Please use the following steps.

Create a dataset by combining measures ( as features) and class labels from the labeled data.
Use decision tree based algorithm in SparkML to train the prediction model.
Split the dataset to generate the training data and testing data.
Use training data to build model and testing data to evaluate the model.
Present the model performance metrics: Accuracy, Recall, and Precision.

Instructions: Please follow the program submission instructions ( same as the previous assignments) Must use spark and GraphX for generating measures and use SparkML for bonus questions. More explanation on above graph measures here (https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-13-S3-S5)

Data Graph data Use this link (https://www.dropbox.com/s/ypcsynzo28fp8pt/graph_1965_1969.csv.zip? dl=0) to download graph data and file is formatted as shown below

Label data

click here (hhttps://www.dropbox.com/s/oehg91f7k9zy4bj/labeled_1965_1969_1970_1974.csv.zip ?dl=0) to download label data

Full solution code for eclipse with configurations: https://drive.google.com/open?id=1T6ttlJioo_LuqQG3jOY3X6QnfDbfma3Y

Code with image and results: http://adeolasiwoku.blogspot.com/2018/08/assignment-on-graph-processing-using.html

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
GraphPlot.scala		GraphPlot.scala
README.md		README.md
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Graph_Assignment

About

Uh oh!

Releases

Packages

Languages

Sobitan/Graph_Assignment

Folders and files

Latest commit

History

Repository files navigation

Graph_Assignment

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages