Lux extends its visualization exploration operations to data stored in SQL databases. Currently we support connections to PostgreSQL databases, and we plan to support more systems in the future. By using the SQL Executor, users can connect database tables and views to the LuxSQLTable object, gaining access to all of Lux's recommendation capabilities without having to pull all of the data locally.
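To make "without having to pull all of the data locally" concrete, here is a minimal sketch of the general idea, not Lux's actual implementation. It assumes an in-memory SQLite database as a stand-in for PostgreSQL and a handful of invented rows; the point is only that the grouping runs inside the database and a small summary comes back to Python.

```python
import sqlite3

# Stand-in database: in-memory SQLite instead of PostgreSQL, with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE college ("FundingModel" TEXT, "AverageCost" REAL)')
toy_rows = [
    ("Public", 15000.0),
    ("Public", 17000.0),
    ("Private Not-For-Profit", 42000.0),
    ("Private Not-For-Profit", 38000.0),
    ("Private For-Profit", 25000.0),
]
conn.executemany("INSERT INTO college VALUES (?, ?)", toy_rows)

# The database does the grouping; only one row per category is returned.
summary = conn.execute(
    'SELECT "FundingModel", COUNT(*), AVG("AverageCost") '
    'FROM college GROUP BY "FundingModel" ORDER BY "FundingModel"'
).fetchall()

for funding_model, n_colleges, mean_cost in summary:
    print(f"{funding_model}: {n_colleges} colleges, mean cost {mean_cost:.0f}")
```

A bar chart of counts per category needs only these summary rows, so the full table never has to leave the database.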

In this environment we have already set up a Postgres database and populated it with two database tables used in this demo. The script used to upload the data can be found within the scripts folder.

Once we have created a PostgreSQL connection, we can create a LuxSQLTable and connect it to our database. We can then take advantage of all of Lux's visualization recommendations without having to pull the table locally.

In [None]:
import pandas as pd
import lux
from sqlalchemy import create_engine

engine = create_engine("postgresql://testuser:testpass@localhost:5432/testdb")
lux.config.set_SQL_connection(engine)

sql_tbl = lux.LuxSQLTable(table_name="college")
sql_tbl
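The connection string in the cell above hardcodes the demo credentials. Outside a demo environment, one option is to assemble the URL from environment variables. The sketch below assumes hypothetical variable names (`LUX_DB_USER`, `LUX_DB_PASSWORD`, `LUX_DB_HOST`, `LUX_DB_PORT`, `LUX_DB_NAME`), which are not defined by Lux or SQLAlchemy, and falls back to the demo values when they are unset.

```python
import os
from urllib.parse import quote_plus


def build_postgres_url(env=os.environ):
    """Assemble a PostgreSQL URL from (hypothetical) environment variables."""
    # Percent-encode user and password so characters such as "@" or "/"
    # cannot be mistaken for URL delimiters.
    user = quote_plus(env.get("LUX_DB_USER", "testuser"))
    password = quote_plus(env.get("LUX_DB_PASSWORD", "testpass"))
    host = env.get("LUX_DB_HOST", "localhost")
    port = env.get("LUX_DB_PORT", "5432")
    database = env.get("LUX_DB_NAME", "testdb")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


# With no variables set, this reproduces the demo connection string.
demo_url = build_postgres_url({})
print(demo_url)

# The result would then be used exactly as in the cell above (not run here):
#   engine = create_engine(build_postgres_url())
#   lux.config.set_SQL_connection(engine)
```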

Looking at Lux's recommendations, we see that ACTMedian and SATAverage are very strongly correlated. From the Category tab, we see that there are few records where PredominantDegree is "Certificate". In addition, there are not many colleges with "Private For-Profit" as FundingModel.

We are interested in picking a college to attend and want to understand the AverageCost of attending different colleges and how that relates to other information in the dataset.

In [None]:
sql_tbl.intent = ["AverageCost"]
sql_tbl

We see that a large number of colleges cost around $20,000 per year. Scrolling through the Enhance tab, we also see that Bachelor's degree colleges and colleges in New England and large cities tend to have a higher AverageCost than their counterparts.

We are interested in the trend of AverageCost vs. SATAverage, since there is a rough upward relationship above an AverageCost of $30,000, but below that the trend is less clear.

In [None]:
sql_tbl.intent = ["AverageCost","SATAverage"]
sql_tbl

By adding FundingModel, we see that the cluster of points on the left can clearly be attributed to public colleges, whereas private colleges more or less follow a trend in which colleges with a higher SATAverage tend to have a higher AverageCost.
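One way to put a number on that visual impression is to ask the database for a Pearson correlation per FundingModel group, again without pulling the rows locally. The sketch below only builds the query text. It assumes the table stores its columns under the mixed-case names shown in Lux (hence the quoted identifiers) and relies on PostgreSQL's `corr(Y, X)` aggregate; the helper names `quote_ident` and `correlation_by_group_sql` are made up for this illustration.

```python
def quote_ident(name: str) -> str:
    """Double-quote a SQL identifier so mixed-case names keep their case."""
    return '"' + name.replace('"', '""') + '"'


def correlation_by_group_sql(table: str, x: str, y: str, group: str) -> str:
    """Build a query returning one Pearson correlation per group value."""
    return (
        f"SELECT {quote_ident(group)}, "
        f"corr({quote_ident(y)}, {quote_ident(x)}) AS pearson_r, "
        f"COUNT(*) AS n "
        f"FROM {quote_ident(table)} "
        f"GROUP BY {quote_ident(group)} "
        f"ORDER BY {quote_ident(group)}"
    )


query = correlation_by_group_sql(
    table="college", x="SATAverage", y="AverageCost", group="FundingModel"
)
print(query)

# To run it against the engine created earlier (not executed here):
#   pd.read_sql(query, engine)
```

If the private groups show a clearly higher coefficient than the public group, that would support the reading of the scatterplot given above.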

We can also leverage Lux's vis library to quickly create visualizations from our database data.

In [None]:
from lux.vis.Vis import Vis, Clause

# Map each attribute to a visual channel
x_clause = Clause(attribute="AdmissionRate", channel="x")
y_clause = Clause(attribute="AverageCost", channel="y")
color_clause = Clause(attribute="FundingModel", channel="color")

# Build the visualization against the SQL-backed table
new_vis = Vis([x_clause, y_clause, color_clause], sql_tbl)
new_vis

We can also create a new LuxSQLTable object and connect it to a different database table. This lets us explore both datasets at once.

In [None]:
sql_tbl2 = lux.LuxSQLTable(table_name="car")
sql_tbl2

In [None]:
sql_tbl