<br><br><br>
<h1>The Traveling Salesman Problem</h1>
<br><br>
<li>Given a graph, find the shortest circuit that goes through all the vertices exactly once</li>
<li>The TSP is NP-hard (its decision version is NP-complete), so large instances are typically solved with a heuristic algorithm</li>
<li>We'll solve the problem using <span style="color:blue">genetic algorithms</span> and GraphX's <span style="color:blue">Pregel</span> algorithm</li>

<h1>Digression: Serialization</h1>
<li>Serialization is the process of converting an object into a <b>byte stream</b> so that it can be sent from one machine to another</li>
<li>Since Spark runs on a cluster, with a master and workers, objects are constantly moving between nodes</li>
<li>If Spark has to ship an object that cannot be serialized, it throws a "Task not serializable" exception</li>
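To make "can be serialized" concrete, here is a minimal sketch that runs without Spark. It uses plain Java serialization to try to turn an object into a byte stream. The helper name `canSerialize` is hypothetical and is not part of Spark, which performs a similar check internally before shipping a task.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical helper (not a Spark API): returns true if Java serialization
// can turn `obj` into a byte stream, false if it throws NotSerializableException.
def canSerialize(obj: AnyRef): Boolean = {
  val stream = new ObjectOutputStream(new ByteArrayOutputStream())
  try {
    stream.writeObject(obj)
    true
  } catch {
    case _: NotSerializableException => false
  } finally {
    stream.close()
  }
}

val serializableValue = List(1.0, 2.0, 3.0) // Scala collections implement Serializable
val plainObject = new Object()              // java.lang.Object does not
```

Calling `canSerialize(serializableValue)` succeeds, while `canSerialize(plainObject)` fails for the same underlying reason that Spark reports a task as not serializable.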


<li>The following example can be serialized</li>
<li>The function <b>myFunc</b> has all the data it needs to execute</li>
<li>So only myFunc needs to be serialized</li>

In [1]:
val rdd = sc.parallelize(Array(1.0,2.0,3.0,4.0,5.0))
object Test {
    def myFunc = rdd.map(_ * 2)
}
Test.myFunc.collect

Intitializing Scala interpreter ...

Spark Web UI available at http://dyn-209-2-224-122.dyn.columbia.edu:4043
SparkContext available as 'sc' (version = 3.3.0, master = local[*], app id = local-1668009642139)
SparkSession available as 'spark'


rdd: org.apache.spark.rdd.RDD[Double] = ParallelCollectionRDD[0] at parallelize at <console>:24
defined object Test
res0: Array[Double] = Array(2.0, 4.0, 6.0, 8.0, 10.0)


<li>Now, instead of multiplying by 2, we want to multiply by a variable</li>
<li>We get a "Task not serializable" error</li>
<li>This is because myFunc needs access to multiplier, which lives outside the function, so myFunc cannot be serialized on its own</li>
<li>The entire Test object would need to be serialized, but it isn't serializable automatically</li>

In [2]:
val rdd = sc.parallelize(Array(1.0,2.0,3.0,4.0,5.0))
object Test {
    val multiplier = 2
    def myFunc = rdd.map(_ * multiplier) // the worker does not know what multiplier is
} // Test is an object: a class with only one instance
Test.myFunc.collect

org.apache.spark.SparkException:  Task not serializable

In [3]:
val rdd = sc.parallelize(Array(1.0,2.0,3.0,4.0,5.0))
val multiplier = 2
object Test {
    def myFunc = rdd.map(_ * multiplier) // the worker does not know what multiplier is
} // Test is an object: a class with only one instance
Test.myFunc.collect

org.apache.spark.SparkException:  Task not serializable

<li>We need to make it serializable</li>
<li>Add the trait (flavor) <b>Serializable</b> to the object definition</li>
<li>The downside is that we might be tempted to make everything serializable, which adds a lot of overhead to the application</li>

In [4]:
val rdd = sc.parallelize(Array(1.0,2.0,3.0,4.0,5.0))
object Test extends Serializable {
    val multiplier = 2
    def myFunc = rdd.map(_ * multiplier)
}//packaged
Test.myFunc.collect

rdd: org.apache.spark.rdd.RDD[Double] = ParallelCollectionRDD[4] at parallelize at <console>:26
defined object Test
res3: Array[Double] = Array(2.0, 4.0, 6.0, 8.0, 10.0)


In [5]:
val rdd = sc.parallelize(Array(1.0,2.0,3.0,4.0,5.0))
val multiplier = 2
object Test extends Serializable {
//     val multiplier = 2
    def myFunc = rdd.map(_ * multiplier)
}//packaged
Test.myFunc.collect

rdd: org.apache.spark.rdd.RDD[Double] = ParallelCollectionRDD[6] at parallelize at <console>:26
multiplier: Int = 2
defined object Test
res4: Array[Double] = Array(2.0, 4.0, 6.0, 8.0, 10.0)


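<li>Instead: make sure that every variable used inside the map is a local value of myFunc. Copying multiplier into the local val multi means that only that value, not the whole Test object, has to be shipped to the workers</li>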

In [6]:
val rdd = sc.parallelize(Array(1.0,2.0,3.0,4.0,5.0))
object Test {
    val multiplier = 2
    def myFunc = {
        val multi = multiplier
        rdd.map(_ * multi)
    }
}
Test.myFunc.collect

rdd: org.apache.spark.rdd.RDD[Double] = ParallelCollectionRDD[8] at parallelize at <console>:27
defined object Test
res5: Array[Double] = Array(2.0, 4.0, 6.0, 8.0, 10.0)


<br><br><br>
<span style="color:green;font-size:xx-large">The TSP dataset</span>
<br><br>
<li>Each "city" is a point (x,y) on a grid</li>
<li>Distances between cities are Euclidean distances</li>
<li>Each city is connected to every other city</li>
<li>Each city has a unique id</li>

<li>Each city is a point on the grid represented by a <b>Point</b> object</li>
<li>And we have an array containing all the cities</li>

<li>Generate random (x,y) coordinates for each city</li>
<li>And give each city a unique id (0 to 19, since there are 20 cities)</li>

<h4>The cost matrix</h4>
<li>We're using the Euclidean distance</li>
<li>So, the distance from city at $(x_1,y_1)$ to city at $(x_2,y_2)$ is:</li>
$ \sqrt{(x_1-x_2)^2 + (y_1-y_2)^2}$
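As a quick check of the formula, here is a small sketch with a hypothetical helper `euclideanDistance` (the name and signature are assumptions for illustration). Two cities at (0,0) and (3,4) form a 3-4-5 right triangle, so they are 5 units apart.

```scala
// Hypothetical helper mirroring the distance formula; takes raw grid coordinates.
def euclideanDistance(x1: Int, y1: Int, x2: Int, y2: Int): Double = {
  val dx = x1 - x2
  val dy = y1 - y2
  math.sqrt((dx * dx + dy * dy).toDouble)
}

// Cities at (0,0) and (3,4): sqrt(9 + 16) = 5.0
val exampleDistance = euclideanDistance(0, 0, 3, 4)
```

The distance is symmetric and the distance from a city to itself is 0, which is why the cost matrix has zeros on its diagonal.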

In [None]:
//Graph imports
import org.apache.spark.rdd.RDD
import org.apache.spark.graphx._

In [None]:
val number_of_cities = 20
val grid_size = 500 //500x500 2-d space. A Point will be (x,y) on this grid
val max_generations = 20
//Create a Point case class

//We'll make Point serializable because it will move between nodes multiple times
case class Point(var x: Int = 0, var y: Int = 0) extends Serializable

//val number_of_cities:Int = 20;

//Create an array of cities
val cities = new Array[Point](number_of_cities)

//And an array of ids for each city
val v_ids = new Array[Long](number_of_cities)
//val grid_size=500

//Randomly assign cities to points on the grid
val r = scala.util.Random
for (i <- 0 to number_of_cities-1){
    cities(i) = new Point(r.nextInt(grid_size)+10,r.nextInt(grid_size)+10)
    v_ids(i) = i.toLong;
    
}

//val citiesRDD = v_ids.zip(cities)
val costmatrix = Array.ofDim[Double](number_of_cities,number_of_cities);
//compute a Euclidean distance matrix, rounded to the nearest integer: round(sqrt((x2-x1)^2+(y2-y1)^2))
for (i <-0 to number_of_cities-1) {
    for (j <-0 to number_of_cities-1) {  
        val deltax = cities(i).x-cities(j).x;
        val deltay = cities(i).y-cities(j).y;
        costmatrix(i)(j) =  math.sqrt(deltax*deltax+deltay*deltay).round;
    }
}

 <br><br><br>
<span style="color:green;font-size:xx-large">Genetic Algorithms</span>
<br><br>
<li>mimics natural selection to arrive at a solution</li>
<ul>
    <li>problem is encoded in the form of a chromosome</li>
    <li>a population of chromosomes is generated</li>
    <li>a metric for "goodness" of the population is defined</li>
    <li>an evolutionary process moves the population in the direction of "better" solutions</li>
</ul>
<li>often used for optimization or search</li>
<li>uses heuristics (we cannot prove that the solution is optimal)</li>

<br><br><br>
<span style="color:blue;font-size:large">natural selection</span>
<li>Chromosomes from two parents are combined and a child chromosome is generated</li>
<li>The combination process uses two heuristics</li>
<ul>
    <li><span style="color:blue">crossover</span>: if a chromosome contains n genes, then the first m genes from parent 1 and the m+1 to n genes from parent 2 are combined into the new chromosome. The crossover process must ensure that the <b>resulting chromosome defines a consistent solution</b></li>
    <li><span style="color:blue">mutation</span>: some genes are arbitrarily changed when the offspring chromosome is created. Mutation rates are typically very low. The mutation process must ensure that the <b>resulting chromosome defines a consistent solution</b></li>
</ul>
<li><span style="color:blue">population</span>: A collection of individuals at a point in time</li>
<li>In natural selection, the environment decides which individuals are best able to survive in it and, over time, the chromosomes of the species evolve. As the environment changes, so do the characteristics of the species</li>

<img src="crossover.png">

<br><br><br>
<span style="color:green;font-size:xx-large">Rough methodology</span>
<br><br>
<li>Create a representation for the chromosome of an individual. This is problem specific but, usually, a chromosome is a candidate solution</li>
<li>For example, in linear regression, a chromosome may be the coefficients of the regression equation. Each beta corresponds to a gene</li>
<li>Decide on a metric for evaluating the "fitness" of an individual</li>
<li>For example, in linear regression, fitness may be the root mean square error between the actual values and the values predicted using the coefficients in the chromosome</li>
<li>define crossover and mutation methods</li>
<ul>
    <li>in regression, crossover could involve combining coefficients from chromosome 1 with coefficients from chromosome 2</li>
    <li>in regression, mutation could involve tweaking a coefficient in the offspring randomly (within preset parameters)</li>
    <li>more generally, as we'll see with the TSP, crossover and mutation may be more complicated because the resulting chromosome may need to satisfy certain constraints to be valid</li>
    </ul>
    <p>
    <b>Evolution</b>
    <li>create a population of individuals to form generation 0</li>
    <li>create a new generation using crossover and mutation to create new chromosomes</li>
    <li>the algorithm gives an edge to "fitter" chromosomes when selecting parents for crossover</li>
    <li>repeat for n-generations and, hopefully, the best chromosomes in later generations will be "fitter" than the best chromosomes in early generations</li>
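The evolution loop can be sketched on a toy problem that needs no Spark. Everything here is an assumption made for illustration: a chromosome is an array of bits, fitness is the number of 1s (so higher is better, unlike the TSP cost used later), parents are chosen by binary tournament, and the best individual is carried over unchanged (elitism), which the methodology above does not require.

```scala
import scala.util.Random

// Toy fitness (assumption): the number of 1s in the chromosome, higher is better.
def onesFitness(chromosome: Array[Int]): Int = chromosome.sum

// Binary tournament: pick two random individuals and keep the fitter one.
def toyTournament(population: Array[Array[Int]], rng: Random): Array[Int] = {
  val a = population(rng.nextInt(population.length))
  val b = population(rng.nextInt(population.length))
  if (onesFitness(a) >= onesFitness(b)) a else b
}

// Single-point crossover followed by a low-probability bit flip (mutation).
def toyOffspring(p1: Array[Int], p2: Array[Int], rng: Random): Array[Int] = {
  val cut = rng.nextInt(p1.length)
  val child = p1.take(cut) ++ p2.drop(cut) // a new array, parents are untouched
  if (rng.nextDouble() < 0.03) {
    val i = rng.nextInt(child.length)
    child(i) = 1 - child(i)
  }
  child
}

// Returns the best fitness of each generation, starting with generation 0.
def toyEvolve(genes: Int, popSize: Int, generations: Int, seed: Long): Seq[Int] = {
  val rng = new Random(seed)
  var population = Array.fill(popSize)(Array.fill(genes)(rng.nextInt(2)))
  val history = scala.collection.mutable.ArrayBuffer(population.map(onesFitness).max)
  for (_ <- 1 to generations) {
    val best = population.maxBy(onesFitness) // elitism: keep the current best
    population = best +: Array.fill(popSize - 1)(
      toyOffspring(toyTournament(population, rng), toyTournament(population, rng), rng))
    history += population.map(onesFitness).max
  }
  history.toSeq
}

val bestPerGeneration = toyEvolve(20, 30, 15, 42L)
```

Because of the elitism step, the best fitness in `bestPerGeneration` never decreases from one generation to the next. Without elitism, improvement is only likely, not guaranteed.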

<br><br><br>
<span style="color:green;font-size:xx-large">Genetic Algorithms and the TSP</span>
<br><br>
<span style="color:blue;font-size:large">Chromosomes</span>
<br>
<li><span style="color:red">Solution structure</span>: A candidate solution to the traveling salesman problem is a circuit that goes through every node exactly once</li>
<li>If there are 5 vertices (1,2,3,4,5), then (4,2,1,3,5); (2,4,1,3,5); (5,2,3,1,4), etc. are candidate solutions</li>
<li>Therefore, a <span style="color:red">chromosome</span> is represented as an n-gene array, where n is the number of cities</li>
<li>A <b>consistent</b> solution is one in which every city appears exactly once (no city is repeated or missing)</li>
<li>The <b>cost</b> associated with a solution is the sum of distances between adjacent cities in the circuit, including the leg from the last city back to the first. For example, if a solution is (3,2,1,4,5), then the cost associated with the solution is:
    $ c_{3,2} + c_{2,1} + c_{1,4} + c_{4,5} + c_{5,3}$ where $c_{i,j}$ is the distance between city i and city j</li>
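A consistency check is easy to write down. The sketch below assumes cities are numbered 0 until n, as in the notebook's code cells (the prose examples number them from 1). The helper name `isConsistent` is hypothetical.

```scala
// Hypothetical helper: a chromosome over cities 0 until numCities is consistent
// when it has the right length and contains every city exactly once.
def isConsistent(chromosome: Array[Int], numCities: Int): Boolean =
  chromosome.length == numCities &&
    chromosome.sorted.sameElements(0 until numCities)

val validTour = Array(3, 1, 0, 2)     // every city once
val repeatedCity = Array(0, 2, 1, 0)  // city 0 twice, city 3 missing
```

A check like this is handy as an assertion while developing crossover and mutation operators, since a single duplicated gene silently produces a tour that skips a city.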

<span style="color:blue;font-size:large"> Crossover</span>
<br><br>
<li>Crossovers are a little complicated</li>
<li>Example: c1 = (1,3,2,7,5,6,4); c2 = (4,2,3,1,6,7,5)</li>
<li>ordinary crossover would take the first 3 elements of c1 and the last 4 elements of c2 and combine them</li>
<li>but this gives us (1,3,2,1,6,7,5) which is not a valid solution (since 1 is repeated)</li>
<li>We need to crossover the two chromosomes while preserving the rules of a circuit</li>
<li>A method known as PMX Special (Partially Mapped Crossover special) attempts to pass on circuit knowledge to the child while also keeping to the circuit rules</li>
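The example can be worked through in code. This is a sketch with a fixed crossover point, not a definitive PMX implementation: `naiveCrossover` reproduces the invalid child from the bullets above, and `repairingCrossover` follows the swap-based idea used by create_offspring in this notebook, where the child takes its leading genes from the second parent and the first parent's genes are swapped out of the way. Both function names are hypothetical.

```scala
// Naive single-point crossover: first `cut` genes of p1, the rest from p2.
def naiveCrossover(p1: Array[Int], p2: Array[Int], cut: Int): Array[Int] =
  p1.take(cut) ++ p2.drop(cut)

// Swap-based crossover: the child starts as a copy of p1; for each of the first
// `cut` positions, the gene from p2 is swapped into place, so no gene is lost.
def repairingCrossover(p1: Array[Int], p2: Array[Int], cut: Int): Array[Int] = {
  val work = p1.clone()
  for (x <- 0 until cut) {
    val gene = p2(x)
    val y = work.indexOf(gene, x) // the gene is still somewhere at or after x
    work(y) = work(x)             // move the displaced gene to where `gene` was
    work(x) = gene
  }
  work
}

val exampleC1 = Array(1, 3, 2, 7, 5, 6, 4)
val exampleC2 = Array(4, 2, 3, 1, 6, 7, 5)

val invalidChild = naiveCrossover(exampleC1, exampleC2, 3)   // 1,3,2,1,6,7,5
val validChild = repairingCrossover(exampleC1, exampleC2, 3) // 4,2,3,7,5,6,1
```

The naive child repeats city 1 and drops city 4. The repaired child starts with the first three genes of c2, keeps the relative order of most of c1's remaining genes, and is still a permutation of the seven cities.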

<img src="crossover1.png">


<img src="crossover2.png">

<img src="crossover3.png">

<img src="crossover4.png">

<img src="crossover5.png">

<img src="crossover6.png">

<img src="crossover7.png">

<br><br><br>
<span style="color:blue;font-size:large">Mutation</span>
<br><br>
<li>Mutation also presents a problem: we can't just change a gene, because that would duplicate one city and drop another, breaking the circuit</li>
<li>Instead, we'll mutate a chromosome by exchanging genes in the chromosome</li>
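A swap mutation can be sketched in a few lines. The helper `swapGenes` is hypothetical and takes the two positions explicitly so the result is reproducible; in practice the positions would be drawn at random.

```scala
// Hypothetical helper: returns a copy of the chromosome with the genes at
// positions i and j exchanged. The input array is left unchanged.
def swapGenes(chromosome: Array[Int], i: Int, j: Int): Array[Int] = {
  val mutated = chromosome.clone()
  mutated(i) = chromosome(j)
  mutated(j) = chromosome(i)
  mutated
}

val beforeMutation = Array(4, 2, 3, 7, 5, 6, 1)
val afterMutation = swapGenes(beforeMutation, 1, 4) // 4,5,3,7,2,6,1
```

Because a swap only rearranges existing genes, the mutated chromosome contains exactly the same cities as the original and remains a valid circuit.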

<img src="mutation.png">

<br><br><br>
<span style="color:green;font-size:xx-large">Reproduction and evolutionary functions</span>
<br><br>
<li><span style="color:blue">create_offspring</span> creates a child from two parents using crossover and mutation</li>
<li><span style="color:blue">compute_fitness_of_chromosome</span> returns a (chromosome,fitness) tuple for a given chromosome</li>
<li>Note that both these functions will be unique to the problem being solved because they depend on the structure of the chromosome</li>

In [None]:
//create_offspring takes two chromosomes (Arrays of Int) as argument
//and returns a new chromosome (Array of Int)
def create_offspring(parent1:Array[Int],parent2:Array[Int]):Array[Int] = {
    val numGenes = parent1.length
    //Initialize a random number generator
    val r = scala.util.Random

    //Create copies of the parents
    //clone copies the array elements, so changes to the copy do not affect the parent
    val parent1_c = parent1.clone();
    val parent2_c = parent2.clone();
    
    //Create an empty offspring
    val offspring = new Array[Int](parent1_c.length);
    
    //Randomly choose a crossover point
    val crossoverpoint = r.nextInt(parent1_c.length-2);
    
    //PMX special CrossOver Method
    for (x <-0 to crossoverpoint){
      val gen = parent2_c(x);
      offspring(x) = gen 
      for (y <- (x+1) to parent1_c.length-1) {      
         if (parent1_c(y) == gen) { 
             parent1_c(y) = parent1_c(x) 
           }
      }
    }
    //copy remaining genes from parent 1 to offspring
    for (y <- crossoverpoint+1 to parent1_c.length-1)
        offspring(y) = parent1_c(y);

    //mutation: swap with a .03 probability
    if (r.nextInt(100) < 3) { 
        val m1 = r.nextInt(numGenes)
        val m2 = r.nextInt(numGenes)
        val gen1 = offspring(m1)
        val gen2 = offspring(m2)
        offspring(m1) = gen2
        offspring(m2) = gen1
        //println(m1,m2,gen1,gen2)
    }

    offspring
}


In [None]:
val c1 = Array(0,1,3,2,4,7,6,5)
val c2 = Array(3,7,1,5,0,4,6,2)
create_offspring(c1,c2)

<br><br><br>
<span style="color:green;font-size:xx-large">fitness</span>
<br><br>
<li>Given the circuit, the fitness is the travel distance of the circuit, so a lower value means a fitter chromosome</li>
<li>Add up the distances between consecutive cities and then add the distance from the last city back to the first city</li>
<li>We'll store the chromosome and its cost as a paired tuple</li>

In [None]:
def compute_fitness_of_chromosome(distanceMatrix: Array[Array[Double]], chromosome :Array[Int]) = {
    var cost = 0.0;
    val numGenes = chromosome.length
    for (j <-0 to numGenes-2) { 
        cost += distanceMatrix(chromosome(j))(chromosome(j+1))
    }
    cost += distanceMatrix(chromosome(numGenes-1))(chromosome(0))
    (chromosome,cost)
}

In [None]:
val ch = Array(0, 1, 3, 2, 4, 7, 6, 5)
compute_fitness_of_chromosome(costmatrix,ch)

<br><br><br>
<span style="color:green;font-size:xx-large">Populations</span>
<br><br>
<li>A population is a collection of chromosomes</li>
<li>We'll define a population as an array of (chromosome,fitness) pairs</li>
<li>A population will have the following data associated with it:</li>
<ol>
    <li><span style="color:blue">cost matrix</span>: Though this is constant, we need to store this with each population to ensure serializability</li>
    <li><span style="color:blue">chromosomes</span>: The collection of individuals in the population</li>
    <li><span style="color:blue">generation</span>: As the population evolves, the individuals change. Each generation has its own population and we need to include a generation identifier with a population</li>
    <li><span style="color:blue">fittest individuals</span>: the chromosomes that have the best fitness. Useful because the most fit chromosomes will correspond to the best solution at each generation</li>
</ol>

In [None]:
case class PopulationData (val CostMatrix: Array[Array[Double]], 
                           val GeneSequences: Array[(Array[Int],Double)], 
                           var GenerationNumber:Int,
                           var best_sequences:List[(Int,Array[Int],Int,Double)]) 
                            extends Serializable 
val r = scala.util.Random

val GeneSequences: Array[(Array[Int],Double)] = Array.ofDim[(Array[Int],Double)](5000);
val GenSequencesFilled = GeneSequences.map(x=> {(scala.util.Random.shuffle((0 to number_of_cities-1)).toArray,0.0)})

val best_sequences = List[(Int,Array[Int],Int,Double)]()
val PopData = new PopulationData(costmatrix,GenSequencesFilled,0,best_sequences);  


<br><br><br>
<span style="color:green;font-size:xx-large">Evolution using Pregel</span>
<br><br>
<li>Our evolutionary strategy is as follows:</li>
<li>create 8 (or any n) populations</li>
<li>each population is a vertex on a graph</li>
<li>some vertices are connected to others (the graph is connected, but not every pair of vertices shares an edge)</li>
<li>in each pregel superstep
    <ul>
        <li>send Msg: sends a subset of chromosomes (solutions) containing the fittest solutions along with the best chromosome history</li>
        <li>merge Msg: picks the best chromosome (from the arriving messages) and its history as the best chromosome</li>
        <li>vertex program: creates a new population through crossover and mutation from the arriving message</li>
    </ul>
<li>After n generations (n supersteps), the evolution halts and we'll search the best chromosomes in each of the 8 populations to find the best solution</li>
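To see how good solutions spread between populations, here is a Spark-free simulation of the superstep cycle. It is a deliberately simplified sketch: each vertex holds only the best (lowest) cost it knows instead of a whole population, every vertex sends on every superstep, and the function `simulateSupersteps` is hypothetical rather than a GraphX API.

```scala
// Simplified simulation of Pregel supersteps on a tiny graph.
// state: vertex id -> best cost known at that vertex
// edges: directed (source, destination) pairs
def simulateSupersteps(initial: Map[Long, Double],
                       edges: Seq[(Long, Long)],
                       supersteps: Int): Map[Long, Double] = {
  var state = initial
  for (_ <- 1 to supersteps) {
    // send: one message per edge, addressed to the destination vertex
    val sent = edges.map { case (src, dst) => (dst, state(src)) }
    // merge: combine all messages arriving at the same vertex (keep the lowest cost)
    val merged = sent.groupBy(_._1).map { case (dst, msgs) => dst -> msgs.map(_._2).min }
    // vertex program: keep the better of the vertex's own value and the merged message
    state = state.map { case (id, own) => id -> math.min(own, merged.getOrElse(id, own)) }
  }
  state
}

// A chain 1 - 2 - 3 - 4 with edges in both directions
val chainEdges = Seq((1L, 2L), (2L, 1L), (2L, 3L), (3L, 2L), (3L, 4L), (4L, 3L))
val startCosts = Map(1L -> 10.0, 2L -> 20.0, 3L -> 30.0, 4L -> 40.0)

val afterOne = simulateSupersteps(startCosts, chainEdges, 1)
val afterThree = simulateSupersteps(startCosts, chainEdges, 3)
```

After one superstep the best cost (10.0) has reached only vertex 2. After three supersteps it has reached vertex 4, the far end of the chain. The denser the graph, the fewer supersteps a good solution needs to reach every population.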

<br><br><br>
<span style="color:green;font-size:xx-large">The Graph</span>
<br><br>
<li>The code below constructs a connected graph (every vertex can reach every other vertex, though not always in one hop)</li>
<li>Every vertex in the first half is connected, in both directions, with every vertex in the second half</li>
<li>and I arbitrarily connect the first two and the last two vertices in both directions</li>
<li>Note: the code won't work with an odd number of communities!</li>
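The edge layout can also be written as plain (source, destination) pairs, which makes the edge count easy to verify. This is a sketch with a hypothetical helper `communityEdges`. It assumes an even number of communities with ids 1 to n, as in the bullets above.

```scala
// Hypothetical helper: the edge layout described above as (source, destination) pairs.
def communityEdges(n: Int): Seq[(Long, Long)] = {
  val half = n / 2
  // every vertex in the first half is linked, both ways, with every vertex in the second half
  val bipartite = for {
    i <- 1 to half
    j <- (half + 1) to n
    edge <- Seq((i.toLong, j.toLong), (j.toLong, i.toLong))
  } yield edge
  // plus the first two and the last two vertices, both ways
  val extra = Seq((1L, 2L), (2L, 1L), (n.toLong, (n - 1).toLong), ((n - 1).toLong, n.toLong))
  bipartite ++ extra
}

val eightCommunityEdges = communityEdges(8)
```

For 8 communities this gives 4 * 4 * 2 + 4 = 36 directed edges. Vertices 1 and 3 share no edge, but each can reach the other in two hops through any vertex in the second half.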

<img src="pregel graph.png">

In [None]:
import org.apache.spark.rdd.RDD
import org.apache.spark.graphx._

val number_of_communities = 8
val vertexArray = new Array[(Long,PopulationData)](number_of_communities)
for (i <- 0 to number_of_communities-1)
    vertexArray(i) = ((i+1).toLong,PopData)
val vertexRDD = sc.makeRDD(vertexArray)


val number_of_edges = vertexArray.length/2 * (vertexArray.length - vertexArray.length/2) *2 + 4
var index = 0
val edgeArray = new Array[Edge[Boolean]](number_of_edges)
for (i <- 0 to vertexArray.length/2-1)
    for (j <- vertexArray.length/2 to vertexArray.length-1) {
        edgeArray(index) = Edge(i+1.toLong,j+1.toLong,true)
        edgeArray(index+1) = Edge(j+1.toLong,i+1.toLong,true)
        index=index+2
    }
edgeArray(index) = Edge(1L,2L,true)
edgeArray(index+1) = Edge(2L,1L,true)
edgeArray(index+2) = Edge(vertexArray.length.toLong,vertexArray.length-1.toLong,true)
edgeArray(index+3) = Edge(vertexArray.length-1.toLong,vertexArray.length.toLong,true)
val edgeRDD = sc.makeRDD(edgeArray)  

val myGraph = Graph(vertexRDD, edgeRDD)



<img src="sendMsg.png">

In [None]:
def sendMsg(et: EdgeTriplet[PopulationData, Boolean]): Iterator[(VertexId, PopulationData)] = {
    //Sort once by cost (ascending), so the fittest chromosomes come first
    val sorted = et.srcAttr.GeneSequences.sortBy(x => x._2)
    val selection = sorted.take(1000)

    //Prepend (generation, chromosome, source vertex id, cost) of the fittest chromosome to the history
    val fittestsequence = sorted(0)
    val fittestsequencehistory = (et.srcAttr.GenerationNumber+1,fittestsequence._1,et.srcId.toInt,fittestsequence._2) :: et.srcAttr.best_sequences
    val evodat = new PopulationData(et.srcAttr.CostMatrix,selection,et.srcAttr.GenerationNumber+1,fittestsequencehistory)
    Iterator((et.dstId,evodat)) 
} 


<img src="mergeMsg.png">

In [None]:
def mergeMsg(msg1: PopulationData, msg2: PopulationData): PopulationData = {
    //println(msg1.GenerationNumber,msg2.GenerationNumber,msg1.GeneSequences.length,msg1.best_sequences.length)

    val x = if (msg1.best_sequences(0)._4 < msg2.best_sequences(0)._4) msg1.best_sequences else msg2.best_sequences
    val allparents = msg1.GeneSequences.union(msg2.GeneSequences)
    val ParentPopulation = new PopulationData(msg1.CostMatrix,allparents,msg1.GenerationNumber,x)
    ParentPopulation
} 

<img src="vertexProgram.png">

In [None]:
def vProg(v: VertexId, attr:  PopulationData, msg: PopulationData): PopulationData = { 
      //println(msg.GenerationNumber,v,msg.GeneSequences.length,msg.best_sequences.length)

    val r = scala.util.Random
    val newPopulation = Array.ofDim[(Array[Int],Double)](5000)
    for (i <-0 to 4999) 
    {
       //get random indexes (nextInt(n) returns a value from 0 to n-1, so every parent can be picked)
       val i1 = r.nextInt(msg.GeneSequences.length)
       val i2 = r.nextInt(msg.GeneSequences.length)
       val i3 = r.nextInt(msg.GeneSequences.length)
       val i4 = r.nextInt(msg.GeneSequences.length)
       val f1 = msg.GeneSequences(i1)._2
       val f2 = msg.GeneSequences(i2)._2
       val f3 = msg.GeneSequences(i3)._2
       val f4 = msg.GeneSequences(i4)._2              
       val p1 = if (f1 < f2) msg.GeneSequences(i1) else msg.GeneSequences(i2)
       val p2 = if (f3 < f4) msg.GeneSequences(i3) else msg.GeneSequences(i4)

       val child = create_offspring(p1._1,p2._1)
       newPopulation(i) = (child,0.0) //placeholder fitness; the real value is computed below

    }    
    val sequence_with_fitness = newPopulation.map(y => {compute_fitness_of_chromosome(msg.CostMatrix,y._1)})
    println(s"vertex: $v generation: ${attr.GenerationNumber} msg gen: ${msg.GenerationNumber}")

    val evodat = new PopulationData(attr.CostMatrix,sequence_with_fitness,msg.GenerationNumber,msg.best_sequences)
    evodat;
} 


<br><br><br>
<span style="color:green;font-size:xx-large">Running pregel</span>
<br><br>


In [None]:
java.time.LocalDateTime.now

In [None]:
val max_generations = 20
val starttime = java.time.LocalDateTime.now;
val t0 = System.nanoTime()
//println("Evolution start: " +   starttime); 
val resultgraph = myGraph.pregel(PopData, max_generations,EdgeDirection.Out)(vProg, sendMsg, mergeMsg)

val t1 = System.nanoTime()
//println("Elapsed time: " + (t1 - t0) + "ns")
val endtime = java.time.LocalDateTime.now;
//println("Evolution ends at: " +  endtime  + " Duration in seconds: " + (t1 - t0)*1.0/1000000000 ); 
    


<br><br><br>
<span style="color:green;font-size:xx-large">Results</span>
<br><br>
<li><span style="color:blue">printBestResults</span> prints the best chromosome from each of the vertices</li>
<li><span style="color:blue">bestChromosomeHistory</span> prints the evolutionary history of a chromosome</li>

In [None]:
def printBestResults(g: Graph[PopulationData,Boolean]) = {
    val v = g.vertices
    val fittest_chromosome = v.map(t => t._2.best_sequences)
        .map(s => s.sortBy(_._4)) //sort by fitness (lowest to highest); fitness is the 4th tuple element
        .map(s => s(0))//get fittest
        .collect
        .foreach( t => {
            print("Circuit: " + t._2.mkString(" "))
            print(" found in generation: " + t._1)
            println(" fitness: " + t._4 )
        })
}

In [None]:
printBestResults(resultgraph)

In [None]:
def bestChromosomeHistory(g: Graph[PopulationData,Boolean],vertex_id: Int) = {
    val bests = g.vertices
                    .filter(v=>v._1==vertex_id) //use the function's arguments, not the global graph and a fixed id
                    .map(t=>t._2.best_sequences)
                    .collect()(0)
                     .foreach( t => {
                                print("Circuit: " + t._2.mkString(" "))
                                print(" found in generation: " + t._1)
                                println(" fitness: " + t._4 )
                            })
}

In [None]:
bestChromosomeHistory(resultgraph,1)

In [None]:
import java.awt.image.BufferedImage
import java.awt.Font
import java.awt.Color
def path_to_image(vertices:Array[(Long,Point)], 
                  edges:Array[Edge[Int]],
                  generation:Int, 
                  fitness:Double, 
                  pregelGraphVertexID:Int) = {
    val img = new BufferedImage(550, 550, BufferedImage.TYPE_INT_RGB)
    val g = img.createGraphics();
    g.setColor(Color.WHITE)
    g.fillRect(0, 0, 550, 550)
    val myFont = new Font("Serif", Font.BOLD, 14);  
    g.setFont(myFont)
    //g.setRenderingHint(java.awt.RenderingHints.KEY_ANTIALIASING, java.awt.RenderingHints.VALUE_ANTIALIAS_ON)
    g.setColor(Color.BLACK)
    g.drawString("Generation:" + generation.toString() + " Populations Vertex ID:"+ pregelGraphVertexID +" Fitness:" + fitness.toString(), 0, 540)
     
    for (v <- vertices) 
        g.fillOval(v._2.x-5, v._2.y-5, 10, 10);
    for (e <- edges)
        g.drawLine(vertices(e.srcId.toInt)._2.x, 
                   vertices(e.srcId.toInt)._2.y, 
                   vertices(e.dstId.toInt)._2.x, 
                   vertices(e.dstId.toInt)._2.y) 
    img    
}

In [None]:
import java.io.ByteArrayOutputStream;
import javax.imageio.ImageIO
import org.apache.hadoop.fs.{FileSystem, Path}

def drawtrip(circuit: Array[Int], 
             cities: Array[(Long,Point)], 
             file:String,
             generation:Int,
             fitness:Double,
             pregelGraphVertexId:Int) :Unit = {
    val sequence = circuit.map(_.toLong);  //Convert int vertex ids to long  
    //Construct edges that correspond to the circuit
    val EdgeArray:Array[Edge[Int]] = new Array[Edge[Int]](number_of_cities);
    for (i <-0 to sequence.length-2)
        EdgeArray(i) = new Edge(sequence(i),sequence(i+1),1) //add an edge of 1 unit length
    EdgeArray(number_of_cities-1) = new Edge(sequence(number_of_cities-1),sequence(0),1); //close the trip
    //The image is drawn on the driver from plain arrays, so no RDDs are needed here
    val img = path_to_image(cities,EdgeArray,generation,fitness,pregelGraphVertexId)
    
    //ImageIO.write(img, "jpg", new File(file))
     
       
    val baos = new ByteArrayOutputStream();
    ImageIO.write(img, "jpg", baos);
     
       
    val pa = "results1/Generation_"+generation+"_Node_"+pregelGraphVertexId+".jpg";
    val fs = FileSystem.get(sc.hadoopConfiguration)
    val out = fs.create(new Path(pa))
    out.write(baos.toByteArray());
    out.close();      
   }

In [None]:
val example_history = resultgraph.vertices
                    .filter(v=>v._1==1)
                    .map(t=>t._2.best_sequences)
                    .collect()(0)

In [None]:
example_history.foreach(z => {
   println("Generation:" + z._1 + " Node:" + z._3 + " Sequence:" + z._2.mkString("-")+ " Fitness:" + z._4)
  drawtrip(z._2,v_ids.zip(cities),"Generation_"+z._1+"_Node_"+z._3+".jpg",z._1,z._4,z._3);
})