# Graph Processing (advanced)

Spark's "native language" is Scala, not Python, and some of its advanced libraries are only available from Scala.
One of the more interesting ones is GraphX, a library for running graph algorithms.
This example uses GraphX to find the length of the shortest path, counted in flights, between two airports, using a dataset of airports and the routes between them.

import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.graphx.lib
import org.apache.spark.rdd.RDD


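// Edges: column 0 is used as the source vertex ID, column 1 as the destination vertex ID; every route gets weight 1.0.
val edges: RDD[Edge[Double]] =
  sc.textFile("data/airline-edges.tsv").map(_.split("\t")).map(a => Edge(a(0).toLong, a(1).toLong, 1.0))
// Vertices: column 0 is the vertex ID, column 1 is the airport name.
val vertices: RDD[(VertexId, String)] =
  sc.textFile("data/vertices.tsv").map(_.split("\t")).map(a => (a(0).toLong, a(1)))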
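// Build the graph from the airport vertices and the route edges.
val airlinesGraph = Graph(vertices, edges)
// Run PageRank until the ranks converge to within a tolerance of 0.01.
val ranked = airlinesGraph.pageRank(0.01)
// Join the ranks with the airport names and keep the 10 highest-ranked airports
// (ordered by the (rank, name) pair, so ties in rank are broken by name).
val rankedWithNames = ranked.vertices.join(vertices).top(10)(Ordering.by(_._2))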
val sourceId: VertexId = 0 // The ultimate source

import org.apache.spark.graphx.lib.ShortestPaths

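val v1: VertexId = 3382L // Minneapolis (MSP)
val v2: VertexId = 385L  // Moscow (DME)
// For every vertex, compute the number of hops on a shortest path to the landmark v2.
val result = ShortestPaths.run(airlinesGraph, Seq(v2))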

// Pick out the vertex v1 and read its hop count to v2 (None if v2 cannot be reached from v1).
val shortestPath = result.vertices.filter { case (vId, _) => vId == v1 }.first()._2.get(v2)
shortestPath

Some(2)
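
The result `Some(2)` is a hop count: the shortest route found from `v1` to `v2` takes two flights. To make clearer what that number means, here is a minimal sketch in plain Scala, without Spark, that computes the same kind of hop count with a breadth-first search over a tiny route list. The vertex IDs, the `routes` data, and the `hopsTo` helper are all made up for illustration; this is not how GraphX implements `ShortestPaths` internally, and it assumes, as the example above appears to, that distances follow edge direction toward the landmark.

```scala
import scala.collection.mutable

// Toy directed routes as (source, destination) pairs. The IDs are hypothetical.
val routes: Seq[(Long, Long)] =
  Seq((1L, 2L), (2L, 3L), (1L, 4L), (4L, 5L), (5L, 3L))

// Hypothetical helper: for every vertex that can reach `landmark`, return the
// number of hops on a shortest path to it, following edge direction.
def hopsTo(landmark: Long, edges: Seq[(Long, Long)]): Map[Long, Int] = {
  // For each destination, the vertices that have an edge pointing into it.
  val incoming: Map[Long, Seq[Long]] =
    edges.groupBy(_._2).map { case (dst, es) => dst -> es.map(_._1) }

  val dist = mutable.Map(landmark -> 0)
  val queue = mutable.Queue(landmark)

  // Breadth-first search backwards from the landmark: the first time a vertex
  // is reached, the recorded distance is its shortest hop count.
  while (queue.nonEmpty) {
    val current = queue.dequeue()
    for (prev <- incoming.getOrElse(current, Seq.empty) if !dist.contains(prev)) {
      dist(prev) = dist(current) + 1
      queue.enqueue(prev)
    }
  }
  dist.toMap
}

val hops = hopsTo(3L, routes)
println(hops.get(1L)) // prints Some(2): 1 -> 2 -> 3
println(hops.get(6L)) // prints None: vertex 6 is not in the graph
```

As in the GraphX result, the lookup returns an `Option`, so an unreachable airport shows up as `None` rather than as an error.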