# Example Visualizations

This notebook provides example queries and visualizations.

## Setup
The cell below
* imports the required libraries
* sets up the connection to the Neo4j database
* defines the D3-based HTML template for custom visualizations

In [None]:
import pandas as pd 
import plotly.express as px
import pygal as pg
from string import Template
from IPython.display import HTML, Javascript, display

neo4j_url=%env NEO4J_URL

%reload_ext cypher
%config CypherMagic.uri=neo4j_url + "/db/data"

def configure_d3():
    """Tell require where to get d3 from in `require(['d3'])`"""
    display(Javascript("""
    require.config({ 
      paths: {
        lodash: "/notebooks/vis/lib/lodash.min",  
        d3: "/notebooks/vis/lib/d3.v4.min"
      }
    })"""))

configure_d3()

base_html = """
<!DOCTYPE html>
<html>
  <head>
  <script type="text/javascript" src="/notebooks/vis/lib/svg.jquery.js"></script>
  <script type="text/javascript" src="/notebooks/vis/lib/pygal-tooltips.min.js"></script>
  </head>
  <body>
    <figure>
      {rendered_chart}
    </figure>
  </body>
</html>
"""

## Table
The simplest visualization is a table: its rows and columns are rendered directly from the result returned by the query.

In [None]:
%cypher MATCH (a:Artifact)-[:CONTAINS]->(n:Type) RETURN a.fqn as Artifact, count(n) as TypesPerArtifact

## Pie Chart
A pie chart illustrates proportions, e.g. artifact sizes. The query therefore returns one row per artifact, each containing the artifact name and the number of contained types.

In [None]:
artifactSizes = %cypher MATCH (artifact:Artifact)-[:CONTAINS]->(type:Type) \
                        RETURN coalesce(artifact.name, artifact.fileName) as Artifact, count(type) AS Types

df = artifactSizes.get_dataframe()
fig = px.pie(df, values='Types', names='Artifact', title='Artifact Size')
fig.show()

## Bar Chart and Stacked Bar Chart
Bar charts are another way to visualize proportions. The example query below returns one row per artifact, each containing the name of the artifact and the number of contained types.

In [None]:
artifactSizes = %cypher MATCH (artifact:Artifact)-[:CONTAINS]->(type:Type) \
                        RETURN coalesce(artifact.name, artifact.fileName) as Artifact, count(type) as Types \
                        ORDER BY Types desc

df = artifactSizes.get_dataframe()
fig = px.bar(df, x='Artifact', y='Types', title='Artifact Size')
fig.show()    

Bar charts may be stacked, e.g. to visualize the different Java types (i.e. class, interface, enum or annotation) per artifact. The query is therefore extended by a column `JavaType`, which determines the color.

In [None]:
artifactSizes = %cypher MATCH (artifact:Artifact)-[:CONTAINS]->(type:Type) \
                        RETURN coalesce(artifact.name, artifact.fileName) as Artifact, case \
                          when type:Class then 'Class' \
                          when type:Interface then 'Interface' \
                          when type:Enum then 'Enum' \
                          when type:Annotation then 'Annotation' \
                        end as JavaType, count(type) as Types \
                        ORDER BY Types desc

df = artifactSizes.get_dataframe()
fig = px.bar(df, x='Artifact', color='JavaType', y='Types', title='Artifact Size')
fig.show()    
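
As a rough illustration of the result shape behind the stacked chart: the query yields one row per artifact and Java type, and rows sharing the same `Artifact` value end up in the same bar, one segment per `JavaType`. The sketch below uses made-up rows (not real query results) and plain Python to group them in that way; the names `sample_rows`, `stacks` and `bar_heights` are purely illustrative.

```python
from collections import defaultdict

# Made-up rows in the shape (Artifact, JavaType, Types); not real query output.
sample_rows = [
    ("core.jar", "Class", 40),
    ("core.jar", "Interface", 12),
    ("api.jar", "Interface", 20),
    ("api.jar", "Enum", 3),
]

# Group the segments per artifact: each artifact becomes one bar,
# each (JavaType, Types) pair one colored segment of that bar.
stacks = defaultdict(dict)
for artifact, java_type, types in sample_rows:
    stacks[artifact][java_type] = types

# The total height of a bar is the sum of its segments.
bar_heights = {artifact: sum(segments.values()) for artifact, segments in stacks.items()}
print(bar_heights)
```

Keeping the data in this long format (one row per segment) is what allows `color='JavaType'` to split each bar without any reshaping.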

## Circle Packing
A circle packing diagram can be used to illustrate hierarchical structures, e.g. packages and their children. The query returns a flattened tree structure containing one row per parent/child combination with five columns:
* *Parent_Fqn*: the fully qualified name of the parent (e.g. type name including package name)
* *Parent_Name*: the name of the parent (e.g. type name without package name)
* *Child_Fqn*: the fully qualified name of the child
* *Child_Name*: the name of the child
* *Child_Is_Leaf*: a boolean value that, if true, indicates that the child has no further children (e.g. true for a type, false for a package)

In [None]:
packageHierarchy = %cypher MATCH (package:Package) \
                           MATCH (package)-[:CONTAINS]->(element) \
                           WHERE (package)-[:CONTAINS*]->(:Type) and exists(element.fqn) \
                           WITH package, element, element:Type as leaf \
                           RETURN DISTINCT package.fqn AS Parent_Fqn, package.name AS Parent_Name, element.fqn AS Child_Fqn, element.name AS Child_Name, leaf AS Child_Is_Leaf


In [None]:
package_hierarchy_df = packageHierarchy.get_dataframe()
# Read the HTML template and close the file handle afterwards
with open('../vis/circle-packing/circle-packing-diagram.html', 'r') as template_file:
    template = Template(template_file.read().replace("\n",""))

text = template.substitute({
    'circle_data': package_hierarchy_df.to_csv(index = False).replace("\r\n","\\n").replace("\n","\\n"),
    'container': 'type-packing-diagram'
})

HTML(text)
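
The rendering cell above combines two steps: the CSV is flattened to a single line by replacing real line breaks with the two-character sequence `\n`, and the result is injected into the HTML via `string.Template`. The sketch below replays both steps using only the standard library. The content of `circle-packing-diagram.html` is not shown in this notebook, so `demo_template` is a hypothetical stand-in that merely assumes the real file uses `$circle_data` and `$container` placeholders; the rows are made up as well.

```python
import csv
import io
from string import Template

# Made-up rows standing in for the query result (not real data).
rows = [
    ("com.example", "example", "com.example.api", "api", False),
    ("com.example.api", "api", "com.example.api.Service", "Service", True),
]

buffer = io.StringIO()
writer = csv.writer(buffer)  # csv.writer ends each row with "\r\n" by default
writer.writerow(["Parent_Fqn", "Parent_Name", "Child_Fqn", "Child_Name", "Child_Is_Leaf"])
writer.writerows(rows)

# Replace real line breaks with backslash + "n" so the CSV becomes one line.
escaped_csv = buffer.getvalue().replace("\r\n", "\\n").replace("\n", "\\n")

# Hypothetical stand-in for the real template file (assumption, see above).
demo_template = Template('<div id="$container"></div><script>var csv = "$circle_data";</script>')

demo_html = demo_template.substitute({
    "circle_data": escaped_csv,
    "container": "type-packing-diagram",
})
print(demo_html)
```

Note that `substitute` raises a `KeyError` if the template contains a placeholder that is missing from the mapping, which makes typos in placeholder names easy to spot.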

## Treemap

A treemap is another way of visualizing hierarchical structures. Each element is represented by a rectangle whose size and color represent metrics of that element. The example query returns a flattened tree containing one row per package:

* *Element*: The name of the element to be displayed as rectangle
* *Parent*: The name of the element's parent (optional for root elements)
* *Size*: Determines the relative size of the rectangle
* *Color*: Determines the color of the rectangle

In [None]:
packageTree = %cypher MATCH (package:Package) \
                      OPTIONAL MATCH (parent:Package)-[:CONTAINS]->(package) \
                      OPTIONAL MATCH (package)-[:CONTAINS]->(type:Type) \
                      OPTIONAL MATCH (type)-[:DECLARES]->(method:Method) \
                      RETURN package.fqn as Element, parent.fqn as Parent, count(type) as Size, sum(method.effectiveLineCount) as Color
df = packageTree.get_dataframe()
fig = px.treemap(df, names = 'Element', parents = 'Parent', values = 'Size', color= 'Color')
fig.show()
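
Treemaps tend to be sensitive to inconsistent hierarchies: if a row names a `Parent` that never occurs as an `Element`, the chart may render incompletely or not at all. This can happen, for example, once the query is restricted by an additional filter. The following sketch is a simple sanity check in plain Python, run on made-up rows rather than on the real result; `sample_rows` and `dangling_parents` are illustrative names only.

```python
# Made-up (Element, Parent) pairs; None marks a root package.
sample_rows = [
    ("com", None),
    ("com.example", "com"),
    ("com.example.api", "com.example"),
    ("org.other.util", "org.other"),  # the row for "org.other" is missing on purpose
]

# All names that occur as an element of the tree.
elements = {element for element, _ in sample_rows}

# Parents that are referenced but never returned as an element themselves.
dangling_parents = sorted(
    {parent for _, parent in sample_rows if parent is not None and parent not in elements}
)
print(dangling_parents)
```

An empty list would indicate that every referenced parent is part of the result.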

## Chord Diagram
A chord diagram illustrates dependencies between elements, e.g. packages. For each dependency, the query returns
* *Source*: The name of the dependent element (e.g. source package)
* *Target*: The name of the element's dependency (e.g. target package)
* *X_Count*: The weight of the dependency (e.g. the coupling between both packages)

In [None]:
packageDependencies = %cypher MATCH (p1:Package)-[:CONTAINS]->(t1:Type), \
                                    (p2:Package)-[:CONTAINS]->(t2:Type), \
                                    (t1)-[dep:DEPENDS_ON]->(t2) \
                             WHERE  p1 <> p2 \
                             RETURN p1.name AS Source, \
                                    p2.name AS Target, \
                                    COUNT(dep) AS X_Count

In [None]:
packageDependenciesData = packageDependencies.get_dataframe().to_csv(index = False).replace("\r\n","\\n").replace("\n","\\n")
# Read the HTML template and close the file handle afterwards
with open('../vis/chord/chord-diagram.html', 'r') as template_file:
    template = Template(template_file.read().replace("\n",""))

text = template.substitute({
    'chord_data': packageDependenciesData,
    'container': 'module-chord-diagram'})

HTML(text)
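
D3's chord layout operates on a square matrix of flows between groups, so edge rows like the ones returned by the query have to be turned into such a matrix at some point. How the bundled `chord-diagram.html` template does this is not shown here; the sketch below only illustrates the general idea in plain Python, with made-up rows and illustrative names (`sample_rows`, `names`, `matrix`).

```python
# Made-up dependency rows in the shape (Source, Target, X_Count).
sample_rows = [
    ("api", "model", 5),
    ("service", "api", 3),
    ("service", "model", 7),
]

# Fix an order for the packages; index i in the matrix refers to names[i].
names = sorted({name for source, target, _ in sample_rows for name in (source, target)})
index = {name: i for i, name in enumerate(names)}

# matrix[i][j] holds the weight of the dependency from names[i] to names[j].
matrix = [[0] * len(names) for _ in names]
for source, target, weight in sample_rows:
    matrix[index[source]][index[target]] = weight

for name, row in zip(names, matrix):
    print(name, row)
```

Because the query excludes dependencies within the same package (`p1 <> p2`), the diagonal of such a matrix stays zero.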