In [3]:
%%markdown

### GitHub
* This GitHub repository: https://github.com/data-engineering-helpers/databricks-examples
  + This Jupyter notebook: https://github.com/data-engineering-helpers/databricks-examples/tree/main/ipython-notebooks/graph-simple-pyvis.ipynb

### Articles
* [Medium - Graphs with Python - Overview and best libraries](https://towardsdatascience.com/graphs-with-python-overview-and-best-libraries-a92aa485c2f8)
  + [NetworkX](https://networkx.org/) for general graph analysis;
  + [PyVis](https://pyvis.readthedocs.io/en/latest/) for interactive graph visualizations right in your browser;
  + [PyG](https://www.pyg.org/) and [DGL](https://www.dgl.ai/) for solving various graph machine learning tasks.

### Spark
* [Spark GraphX](https://spark.apache.org/docs/latest/graphx-programming-guide.html)
* [Spark - GraphFrames](https://graphframes.github.io/graphframes/docs/_site/index.html)
  + Included in Databricks Machine Learning (ML) clusters, _i.e._, clusters named like `datalead-ds-`
  + Latest open source version for Spark 3.2: `graphframes/graphframes:0.8.2-spark3.2-s_2.12`
  + Open source library: `pyspark --packages graphframes:graphframes:0.8.2-spark3.2-s_2.12`
* [StackOverflow - Spark GraphFrame](https://stackoverflow.com/a/36968166/798053)

#### GraphFrames
* [GraphFrames - Quick Start](https://graphframes.github.io/graphframes/docs/_site/quick-start.html)
* [StackOverflow - GraphFrame from edges only](https://stackoverflow.com/a/57791091/798053)
* [StackOverflow - GraphFrame to networkx](https://stackoverflow.com/questions/74636063/pyspark-graphframe-and-networkx-for-graphs-with-hierarchy)
> `import networkx as nx; employee_graph_nx = nx.from_pandas_edgelist(employee_graph.edges.toPandas(),'src','dst')`

To visualize a GraphFrame, one may export its edges into a NetworkX graph and,
from there, import that graph into a PyVis graph.
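
The export step quoted above can be sketched without a Spark cluster. This is a minimal sketch, assuming a small, made-up pandas DataFrame that stands in for `employee_graph.edges.toPandas()`; the `relationship` column and the row values are illustrative only. It also uses a directed graph, which is a choice made here rather than something the quoted answer does.

```python
import networkx as nx
import pandas as pd

# Stand-in for `employee_graph.edges.toPandas()`: GraphFrames edge DataFrames
# name their endpoint columns `src` and `dst`. The rows below are made up.
edges_pdf = pd.DataFrame(
    {
        "src": ["alice", "alice", "bob"],
        "dst": ["bob", "carol", "dave"],
        "relationship": ["manages", "manages", "manages"],
    }
)

# Build a directed NetworkX graph; edge_attr=True keeps the remaining
# columns (here `relationship`) as edge attributes.
employee_graph_nx = nx.from_pandas_edgelist(
    edges_pdf,
    source="src",
    target="dst",
    edge_attr=True,
    create_using=nx.DiGraph,
)

print(employee_graph_nx.number_of_nodes(), employee_graph_nx.number_of_edges())
```

The resulting graph can then be handed to PyVis with `Network.from_nx()`. Note that `toPandas()` collects all edges onto the driver, so this route is only suitable for graphs small enough to fit in memory.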

### NetworkX
* [NetworkX - Tutorial](https://networkx.org/documentation/stable/tutorial.html)
* [NetworkX - Reference documentation](https://networkx.org/documentation/stable/reference/index.html)
  + [NetworkX - Documentation - `Graph.add_node()`](https://networkx.org/documentation/stable/reference/classes/generated/networkx.Graph.add_node.html)
  + [NetworkX - Documentation - `Graph.add_nodes_from()`](https://networkx.org/documentation/stable/reference/classes/generated/networkx.Graph.add_nodes_from.html)
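
As a quick illustration of the two methods linked above, here is a minimal sketch of attaching attributes to nodes. The graph name `demo_nx` and the attribute values are made up for the example; `title` and `label` are chosen because PyVis reads them as the hover text and the display label.

```python
import networkx as nx

demo_nx = nx.Graph()

# add_node(): a single node, with attributes passed as keyword arguments
demo_nx.add_node(0, title="zero", label="Zero")

# add_nodes_from(): several nodes at once, each as a (node, attribute dict) pair
demo_nx.add_nodes_from(
    [
        (1, {"title": "one", "label": "One"}),
        (2, {"title": "two", "label": "Two"}),
    ]
)

# Adding an existing node again updates its attributes instead of duplicating it
demo_nx.add_node(0, label="Node zero")

print(demo_nx.nodes(data=True))
```

Because re-adding a node only updates its attributes, `add_nodes_from()` can also be called after the edges have been loaded, in order to decorate nodes that already exist.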

### PyVis
* PyVis: https://pyvis.readthedocs.io/en/latest/
  + [PyVis - Integration with NetworkX](https://pyvis.readthedocs.io/en/latest/tutorial.html#networkx-integration)
* [Databricks community - Displaying HTML output](https://community.databricks.com/s/question/0D53f00001HKHnuCAH/displaying-html-output)
> `graph_html = open("/dbfs/tmp/graph.html", 'r').read(); displayHTML(graph_html)`
* [GitHub - Holoviz - Use PyViz on Azure Databricks](https://github.com/holoviz/holoviz/issues/170):
> run `write_html()` function and use `display_html(Network.html)`
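
The two quoted snippets can be combined into a small helper. This is a sketch only: `show_graph_html` is a hypothetical name, and it assumes that `displayHTML` is available as a notebook built-in on Databricks, as in the community answer quoted above. Outside Databricks the helper just returns the HTML string.

```python
from pathlib import Path


def show_graph_html(html_path: str) -> str:
    """Read an HTML file written by PyVis and display it when on Databricks.

    Hypothetical helper: the HTML content is returned in every case, so that
    it can be inspected or displayed by other means.
    """
    graph_html = Path(html_path).read_text(encoding="utf-8")
    try:
        # `displayHTML` is assumed to be provided by the Databricks notebook
        # runtime; it is not defined in a plain Python or Jupyter session.
        displayHTML(graph_html)  # noqa: F821
    except NameError:
        pass
    return graph_html
```

On Databricks, a typical call would follow `write_html()`, for example `show_graph_html("/dbfs/tmp/graph.html")`, using the path from the quoted answer.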




In [1]:
from pyvis.network import Network

sample_graph = Network(notebook=True, cdn_resources="remote")  # create graph object

# add nodes
sample_graph.add_nodes(
    [1, 2, 3, 4, 5],  # node ids
    label=['Node #1', 'Node #2', 'Node #3', 'Node #4', 'Node #5'],  # node labels
    # node titles (display on mouse hover)
    title=['Main node', 'Just node', 'Just node', 'Just node', 'Node with self-loop'],
    color=['#d47415', '#22b512', '#42adf5', '#4a21b0', '#e627a3']  # node colors (HEX)
)
# add a list of edges; (5, 5) is a self-loop on node 5
sample_graph.add_edges([(1, 2), (1, 3), (2, 3), (2, 4), (3, 5), (5, 5)])

# Write the graph to an HTML file, then render that file inline in the notebook
sample_graph_filepath: str = "graph-simple-pyvis-sample-graph.html"
sample_graph.write_html(name=sample_graph_filepath)
sample_graph.show(sample_graph_filepath)

graph-simple-pyvis-sample-graph.html


In [2]:
#import pyspark.pandas as pd
import pandas as pd
import networkx as nx

edge_pdf: pd.DataFrame = pd.DataFrame(
    {
        "source": [0, 1, 2],
        "target": [2, 2, 3],
        "weight": [3, 4, 5],
        "title": ["1", "2", "3"],
        "color": ["red", "blue", "blue"],
    }
)
display(edge_pdf)

node_list = [
    (0, {"title": "zero", "label": "Zero"}),
    (1, {"title": "one", "label": "One"}),
    (2, {"title": "two", "label": "Two"}),
    (3, {"title": "three", "label": "Three"}),
]

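# Build the graph from the edge list; edge_attr=True keeps weight, title and color as edge attributes
simple_nx = nx.from_pandas_edgelist(df=edge_pdf, source="source", target="target", edge_attr=True)
# Attach the title and label attributes to the nodes created from the edge list
simple_nx.add_nodes_from(node_list)
# Sanity check: the (0, 2) edge carries the color given in the first row of edge_pdf
assert simple_nx[0][2]["color"] == "red"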

# Create PyVis graph object
simple_net = Network(notebook=True, cdn_resources="remote")
simple_net.from_nx(simple_nx)

# Write the graph to an HTML file, then render that file as embedded HTML
simple_graph_filepath: str = "graph-simple-pyvis-simple-graph.html"
simple_net.write_html(name=simple_graph_filepath)
simple_net.show(simple_graph_filepath)

   source  target  weight title color
0       0       2       3     1   red
1       1       2       4     2  blue
2       2       3       5     3  blue


graph-simple-pyvis-simple-graph.html
