Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
390 changes: 390 additions & 0 deletions demos/demos_databases_apis/sql/postgres.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,390 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# SQL Example\n",
"\n",
"**Get & plot data from the first table in a SQL DB**\n",
"\n",
"* Uses SQL ODBC drivers prepacked with Graphistry\n",
"* Default: visualizes the schema migration table used within Graphistry\n",
"* Shows several viz modes + a convenience function for sql->interactive viz\n",
"* Try: Modify the indicated lines to change to visualize any other table\n",
"\n",
"Further docs\n",
" - [UI Guide](https://labs.graphistry.com/graphistry/ui.html)\n",
" - [More demos: database connectors, ...](/notebook/tree/demos/demos_databases_apis)\n",
" - [CSV upload notebook app](/notebook/tree/demos/upload_csv_miniapp.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"#### Graphistry"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import graphistry\n",
"\n",
"#pip install graphistry -q\n",
"#graphistry.register(protocol='http', server='my.server.com', key='MY_API_KEY')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### SQL connection string\n",
"* Modify with your own db connection strig\n",
"* For heavier use and sharing, see sample for hiding creds from the notebook while still reusing them across sessions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"user = \"graphistry\"\n",
"pwd = \"password\"\n",
"server = \"postgres:5432\"\n",
"\n",
"##OPTIONAL: Mount in installation's ${PWD}/.notebooks/db_secrets.json and read in\n",
"#import json\n",
"#with open('/home/graphistry/notebooks/db_secrets.json') as json_file:\n",
"# cfg = json.load(json_file)\n",
"# user = cfg['user']\n",
"# pwd = cfg['pwd']\n",
"# server = cfg['server']\n",
"#\n",
"## .. The first time you run this notebook, save the secret cfg to the system's persistent notebook folder:\n",
"#import json\n",
"#with open('/home/graphistry/notebooks/db_secrets.json', 'w') as outfile:\n",
"# json.dump({\n",
"# \"user\": \"graphistry\",\n",
"# \"pwd\": \"password\",\n",
"# \"server\": \"postgres:5432\"\n",
"# }, outfile)\n",
"## Delete ^^^ after use\n",
"\n",
"db_string = \"postgres://\" + user + \":\" + pwd + \"@\" + server\n",
"### Take care not to save a print of the result"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# OPTIONAL: Install ODBC drivers in other environments:\n",
"# ! apt-get update\n",
"# ! apt-get install -y g++ unixodbc unixodbc-dev\n",
"# ! conda install -c anaconda pyodbc=4.0.26 sqlalchemy=1.3.5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Connect to DB"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import sqlalchemy as db\n",
"from sqlalchemy import create_engine, MetaData, Table, Column, Integer, Float, String\n",
"from sqlalchemy.orm import sessionmaker"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"engine = create_engine(db_string)\n",
"Session = sessionmaker(bind=engine)\n",
"session = Session()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Inspect available tables"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"table_names = engine.table_names()\n",
"\", \".join(table_names)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Optional: Modify to pick your own table!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if 'django_migrations' in table_names:\n",
" table = 'django_migrations'\n",
"else:\n",
" table = table_names[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize viz: Get data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"result = engine.execute(\"SELECT * FROM \\\"\" + table + \"\\\" LIMIT 1000\")\n",
"df = pd.DataFrame(result.fetchall(), columns=result.keys())\n",
"print(\"table\", table, '# rows', len(df))\n",
"df.sample(min(3, len(df)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Plot\n",
"\n",
"Several variants:\n",
"1. Treat each row & cell value as a node, and connect row<>cell values\n",
"2. Treat each cell value as a node, and connect all cell values together when they occur on the same row\n",
"3. Treat each cell value as an edge, and specify which columns to to connect values together on\n",
"4. Use explict node/edge tables\n",
"\n",
"#### 1. Treat each row & cell value as a node, and connect row<>cell values"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"graphistry.hypergraph(df)['graph'].plot()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2. Treat each cell value as a node, and connect all cell values together when they occur on the same row"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"graphistry.hypergraph(df, direct=True)['graph'].plot()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 3. Treat each cell value as an edge, and specify which columns to to connect values together on"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"graphistry.hypergraph(df, direct=True,\n",
" opts={\n",
" 'EDGES': {\n",
" 'id': ['name'],\n",
" 'applied': ['name'],\n",
" 'name': ['app']\n",
" }\n",
" })['graph'].plot()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 4. Use explict node/edge tables"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"g = graphistry.bind(source='name', destination='app').edges(df.assign(name=df['name'].apply(lambda x: 'id_' + x)))\n",
"g.plot()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Add node bindings..\n",
"nodes_df = pd.concat([\n",
" df[['name', 'id', 'applied']],\n",
" df[['app']].drop_duplicates().assign(\\\n",
" id = df[['app']].drop_duplicates()['app'], \\\n",
" name = df[['app']].drop_duplicates()['app'])\n",
"], ignore_index=True, sort=False)\n",
"\n",
"g = g.bind(node='id', point_title='name').nodes(nodes_df)\n",
"\n",
"g.plot()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Convenience function"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def explore(sql, *args, **kvargs):\n",
" result = engine.execute(sql)\n",
" df = pd.DataFrame(result.fetchall(), columns=result.keys())\n",
" print('# rows', len(df))\n",
" g = graphistry.hypergraph(df, *args, **kvargs)['graph']\n",
" return g"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Simple use"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"explore(\"SELECT * FROM django_migrations LIMIT 1000\").plot()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Pass in graphistry.hypergraph() options"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"explore(\"SELECT * FROM django_migrations LIMIT 1000\", direct=True).plot()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Get data back"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"explore(\"SELECT * FROM django_migrations LIMIT 1000\")._nodes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Further docs\n",
" - [UI Guide](https://labs.graphistry.com/graphistry/ui.html)\n",
" - [More demos: database connectors, ...](/notebook/tree/demos/demos_databases_apis)\n",
" - [CSV upload notebook app](/notebook/tree/demos/upload_csv_miniapp.ipynb)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Loading