# 🍇 GraphX - Distance Challenge

All the details about the challenge are in the [repository](https://github.com/avcaliani/graphx-app).

<img src="https://raw.githubusercontent.com/avcaliani/graphx-app/main/.docs/basic-graph.png" height="400px">

## PySpark

Before starting, let's configure PySpark.

In [7]:
!pip install pyspark==3.5.0



In [8]:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .master('local[*]') \
    .appName('graphx-app') \
    .config('spark.jars.packages', "graphframes:graphframes:0.8.3-spark3.5-s_2.12") \
    .getOrCreate()

# 👇 Required to run Pregel
spark.sparkContext.setCheckpointDir("/tmp/checkpoints")

In [9]:
from graphframes import GraphFrame

In [10]:
vertices_data = [
    (1, 1.0),
    (2, 1.5),
    (3, 2.0),
    (4, 1.0),
]
# GraphFrame requires the vertex DataFrame to have a column named "id"
# https://graphframes.github.io/graphframes/docs/_site/api/python/graphframes.html#graphframes.GraphFrame
vertices = spark.createDataFrame(data=vertices_data, schema=["id", "multiplier"])

edges_data = [
    (1, 2, 5.0),
    (1, 3, 7.0),
    (2, 4, 12.0),
    (3, 4, 8.0),
]
# GraphFrame requires the edge DataFrame to have columns named "src" and "dst"
edges = spark.createDataFrame(data=edges_data, schema=["src", "dst", "distance"])

graph = GraphFrame(vertices, edges)
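
Before relying on the graph, it can help to confirm that every edge points at a vertex that actually exists. The sketch below is plain Python and does not need Spark. It repeats the sample data from the cell above so that it runs on its own. `find_dangling_edges` is a hypothetical helper, not part of GraphFrames, and the check rests on the assumption that `GraphFrame` validates column names but not edge endpoints.

```python
# Same sample data as the notebook cell above, repeated so the sketch is self-contained.
vertices_data = [(1, 1.0), (2, 1.5), (3, 2.0), (4, 1.0)]
edges_data = [(1, 2, 5.0), (1, 3, 7.0), (2, 4, 12.0), (3, 4, 8.0)]


def find_dangling_edges(vertices, edges):
    """Return the edges whose src or dst is not a known vertex id.

    Hypothetical helper for illustration; it is not a GraphFrames function.
    """
    known_ids = {vertex_id for vertex_id, _ in vertices}
    return [
        (src, dst, distance)
        for src, dst, distance in edges
        if src not in known_ids or dst not in known_ids
    ]


dangling = find_dangling_edges(vertices_data, edges_data)
print(dangling)  # [] for the sample data: every edge endpoint is a known vertex
```

An empty list means the edge list and the vertex list agree. For larger inputs the same idea could probably be expressed as an anti-join between the `edges` and `vertices` DataFrames, though that variant is not shown here.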

In [11]:
graph.vertices.show()

+---+----------+
| id|multiplier|
+---+----------+
|  1|       1.0|
|  2|       1.5|
|  3|       2.0|
|  4|       1.0|
+---+----------+



In [12]:
graph.edges.show()

+---+---+--------+
|src|dst|distance|
+---+---+--------+
|  1|  2|     5.0|
|  1|  3|     7.0|
|  2|  4|    12.0|
|  3|  4|     8.0|
+---+---+--------+

