# Accessing and querying the JanusGraph database

This notebook demonstrates how to connect to the graph database, how to query it, and how to visualize the query results.


Let's start with the imports and a connection to the JanusGraph database:

In [1]:
from thoth.lab import obtain_location
from thoth.storages import GraphDatabase

# Let's obtain location of our JanusGraph instance.
JANUSGRAPH_LOCATION = obtain_location(
    "thoth-test-core-janusgraph",  # name of JanusGraph instance
    verify=False,            # do not verify TLS on internal network
    only_netloc=True         # do not use http scheme, use netloc instead
)

# Instantiate the adapter and connect to the JanusGraph database
adapter = GraphDatabase.create(JANUSGRAPH_LOCATION, port=80)
adapter.connect()

adapter.is_connected()

True

Next, let's run some generic queries. We will use the `GraphQueryResult` wrapper, which wraps the result of a graph query and exposes some useful methods:

In [2]:
from thoth.lab import GraphQueryResult as gqr

query = adapter.g.V().has('__label__', 'python_package_version').groupCount().by('package_name').next()
query_result = gqr(query)
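
To make the query above easier to read: it selects vertices labelled `python_package_version`, groups them by their `package_name` property, and counts the members of each group. The following is a rough plain-Python analogue of that grouping, not how JanusGraph evaluates it. The sample vertices, the `package_version` property, the `rpm_package_version` label, and the `group_count_by` helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical stand-in for vertices stored in the graph. The '__label__' and
# 'package_name' keys mirror the query above; all values are invented.
vertices = [
    {"__label__": "python_package_version", "package_name": "flask", "package_version": "0.12.2"},
    {"__label__": "python_package_version", "package_name": "flask", "package_version": "1.0.2"},
    {"__label__": "python_package_version", "package_name": "click", "package_version": "6.7"},
    {"__label__": "rpm_package_version", "package_name": "glibc", "package_version": "2.26"},
]


def group_count_by(items, label, key):
    """Count the items carrying `label`, grouped by the value stored under `key`."""
    return dict(Counter(v[key] for v in items if v.get("__label__") == label))


counts = group_count_by(vertices, "python_package_version", "package_name")
print(counts)  # {'flask': 2, 'click': 1}
```

The resulting dictionary maps each package name to the number of matching vertices, which is the same shape as the raw response shown in the next cell.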

To access the raw response, use the `result` attribute:

In [3]:
query_result.result

{'asn1crypto': 2,
 'attrs': 1,
 'babel': 1,
 'backcall': 1,
 'backports.shutil-get-terminal-size': 1,
 'backports.ssl-match-hostname': 1,
 'bleach': 1,
 'certifi': 32,
 'cffi': 2,
 'chardet': 2,
 'click': 24,
 'complex-dist': 1,
 'configparser': 1,
 'cryptography': 2,
 'cycler': 1,
 'cython': 1,
 'decorator': 22,
 'docopt': 11,
 'docutils': 10,
 'flask': 12,
 'flask-socketio': 1,
 'funcsigs': 1,
 'gevent': 45,
 'gevent-websocket': 1,
 'gitdb2': 1,
 'gitpython': 1,
 'glob2': 3,
 'greenlet': 4,
 'gyp': 1,
 'headers.dist': 1,
 'idna': 1,
 'iniparse': 1,
 'ipykernel': 23,
 'ipython': 86,
 'ipython-genutils': 2,
 'itsdangerous': 4,
 'jasmine-core': 1,
 'javapackages': 1,
 'jedi': 12,
 'jinja2': 25,
 'jsonschema': 1,
 'jupyter': 1,
 'jupyter-client': 16,
 'jupyter-console': 1,
 'jupyter-core': 14,
 'jupyterlab': 2,
 'jupyterlab-launcher': 1,
 'keras': 1,
 'kitchen': 1,
 'lxml': 1,
 'markupsafe': 18,
 'matplotlib': 2,
 'mistune': 2,
 'mock': 1,
 'more-itertools': 3,
 'nbconvert': 2,
 'nbforma

We can also get the results as a pandas DataFrame:

In [4]:
import pandas as pd

r = query_result.result
df = pd.DataFrame(list(r.items()), columns=['Package Name', '# Versions'])
df

|   | Package Name | # Versions |
|---|---|---|
| 0 | greenlet | 4 |
| 1 | testpath | 1 |
| 2 | scikit-learn | 2 |
| 3 | cython | 1 |
| 4 | pickleshare | 11 |
| 5 | prometheus-client-model | 1 |
| 6 | psycopg2 | 1 |
| 7 | py | 3 |
| 8 | pycparser | 1 |
| 9 | markupsafe | 18 |
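
Once the counts are in a DataFrame, the usual pandas operations apply. The sketch below rebuilds a small frame from a few of the rows shown above (so it runs without a database connection), keeps the packages that have more than one version, and sorts them by version count. The variable name `multi` is only illustrative.

```python
import pandas as pd

# A handful of rows copied from the output above; the real result has more.
r = {
    "greenlet": 4,
    "testpath": 1,
    "scikit-learn": 2,
    "cython": 1,
    "pickleshare": 11,
    "markupsafe": 18,
}
df = pd.DataFrame(list(r.items()), columns=["Package Name", "# Versions"])

# Keep packages with more than one version, most versions first.
multi = (
    df[df["# Versions"] > 1]
    .sort_values("# Versions", ascending=False)
    .reset_index(drop=True)
)
print(multi)
```

With the real data, replacing the hard-coded dictionary by `query_result.result` should give the same kind of ranking for the whole graph.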


When the result is a dictionary of values, we can easily plot it as a chart:

In [5]:
query = adapter.g.V().has('__label__', 'python_package_version').groupCount().by('package_name').next()
query_result = gqr(query)

In [6]:
query_result.plot_bar()

In [7]:
# Or a pie chart

query_result.plot_pie()

For more options on how to visualize and interact with data retrieved from the graph database, use the `help` function to access the up-to-date `GraphQueryResult` documentation:

In [8]:
help(gqr)

Help on class GraphQueryResult in module thoth.lab.graph:

class GraphQueryResult(builtins.object)
 |  Methods defined here:
 |  
 |  __init__(self, result)
 |      Initialization.
 |      
 |      :param result: the result to be used as a query result, can be directly coroutine that is fired.
 |  
 |  plot_bar(self)
 |      Plot histogram of results obtained.
 |  
 |  plot_pie(self)
 |      Plot a pie of results into Jupyter notebook.
 |  
 |  serialize(self)
 |      Serialize the output of graph query.
 |  
 |  to_dataframe(self)
 |      Construct a panda's dataframe on results.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)



For more information on how to construct queries to the graph database, see [Practical Gremlin: An Apache TinkerPop Tutorial](http://kelvinlawrence.net/book/Gremlin-Graph-Guide.html). Also see the [pandas.DataFrame](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html#pandas.DataFrame) documentation on how to use dataframes and the [plotly documentation](https://plot.ly/api/) for creating interactive figures. To visualize data available in the graph, use the [graph explorer](https://url.corp.redhat.com/thoth-sbu-graphexp).