## Lux now supports PostgreSQL databases (March 2021)

In this tutorial, we describe how Lux can be used with data stored in a Postgres database. Postgres is a popular relational database used by data professionals. By the end of the tutorial, you should be able to:

- query data using Lux directly on a PostgreSQL backend
- get Lux's recommended visualizations based on selected variables
- gain an overview of the data inside your Postgres database
- analyze relationships between attributes
- create custom visualizations
- download your graph and code

**Note that SQL support is still experimental and under active development. If you are excited about the SQL feature and there is something that you would like to see, please let us know by submitting a [GitHub Issue here](https://github.com/lux-org/lux/issues/).**

## A. Setup in 3 steps:
- 1) Connect to the database
- 2) Create another LuxSQLTable object for multiple tables
- 3) Check installed packages

## 1) Connect to the database
First, we set up a Postgres database connection with ```psycopg2```, create a LuxSQLTable named ```tbl```, register the connection with Lux, and then point ```tbl``` at the ```car``` table used in this demo.

In [None]:
import lux
import pandas as pd
from sqlalchemy import create_engine
import psycopg2

connection = psycopg2.connect("host=localhost dbname=testdb user=testuser password=testpass")
tbl = lux.LuxSQLTable()
lux.config.set_SQL_connection(connection)
tbl.set_SQL_table("car")
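
The connection string above hard-codes the demo credentials. As a minimal sketch, assuming you would rather read them from the environment, the string passed to ```psycopg2.connect``` could be assembled as follows. The helper ```build_dsn``` and the ```LUX_DB_*``` variable names are hypothetical, not part of Lux or psycopg2, and the helper does not quote values that contain spaces.

```python
import os


def build_dsn(host, dbname, user, password):
    """Return a keyword/value connection string like the one used above."""
    parts = {"host": host, "dbname": dbname, "user": user, "password": password}
    return " ".join(f"{key}={value}" for key, value in parts.items())


# Hypothetical environment variable names; the defaults mirror the demo values.
dsn = build_dsn(
    host=os.environ.get("LUX_DB_HOST", "localhost"),
    dbname=os.environ.get("LUX_DB_NAME", "testdb"),
    user=os.environ.get("LUX_DB_USER", "testuser"),
    password=os.environ.get("LUX_DB_PASSWORD", "testpass"),
)
# connection = psycopg2.connect(dsn)  # then continue as in the cell above
```

Keeping credentials out of the notebook makes it safer to share.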

## 2) Create another LuxSQLTable object for multiple tables
To explore multiple datasets at the same time, simply create another LuxSQLTable object for Lux to operate on and specify its database table name. For example, we create another variable, ```sql_tbl2```, by constructing a new LuxSQLTable with ```lux.LuxSQLTable(table_name="college")```. This way, we can compare two tables side by side.

In [None]:
sql_tbl2 = lux.LuxSQLTable(table_name="college")
sql_tbl2

## 3) Check installed packages

Lux's recommendation capability has been extended to support PostgreSQL. Check that all the necessary packages are installed by running the imports below.

In [None]:
import pandas as pd
import lux
from sqlalchemy import create_engine
import psycopg2
from lux.vis.Vis import Vis
from lux.vis.VisList import VisList
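
If one of the imports above fails, the error names only the first missing package. As a rough sketch that uses only the standard library, the hypothetical helper ```missing_packages``` below lists every top-level module that cannot be found, without importing any of them.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Module names as they appear in the import statements above.
required = ["lux", "pandas", "sqlalchemy", "psycopg2"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Note that the name used with ```pip install``` can differ from the module name; Lux, for example, is published as ```lux-api```.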

Woot! Setup is now complete. Let's explore these datasets using Lux.

## B. Explore data in 4 steps:
- 1) Preview data table 
- 2) Get recommended visualizations using lux
- 3) Deep dive with intent or Clause
- 4) Create a visualization between multiple attributes with Vis or VisList

## 1) Preview data table 

By printing ```tbl```, you can preview the data table and access the Lux toggle. The preview shows the top 5 rows of the dataset.

Note: the table object does not support pandas functionality at the moment.

In [None]:
tbl

## 2) Get recommended visualizations using lux

Once you click on the ```Toggle Table/Lux```, you get a set of recommended visualizations. To gain an overall read of the data by category, we click on the ```Occurrence``` tab and see the top counts of records in the dataset by origin of carmaker, number of cylinders in the car, brand of car, and name of car model. 

From the dataset, we learned that:
- the majority of the cars are made by American manufacturers, with Japanese and European carmakers coming in a close second and third
- cars with 4 cylinders are the most common, followed by 8 and 6
- the top car brands by count are ford, chevrolet, and plymouth
- amc matador, ford pinto, and toyota corolla are the most common car models by count

To see the relationship between **two quantitative attributes in a scatterplot**, refer to the ```Correlation``` tab. To see the distribution of **a single quantitative attribute in a univariate histogram**, refer to the ```Distribution``` tab.

<img src="SQLtutorial_image1.png">
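
Conceptually, each chart in the ```Occurrence``` tab is a count of records per category. The sketch below illustrates that idea with a plain SQL ```GROUP BY```. It is not Lux's actual query: an in-memory SQLite database stands in for Postgres so the example runs anywhere, the rows are made-up toy data rather than the tutorial's dataset, and ```occurrence_counts``` is a hypothetical helper.

```python
import sqlite3

# In-memory SQLite stands in for Postgres so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE car ("Name" TEXT, "Origin" TEXT, "Cylinders" INTEGER)')
# Made-up toy rows, not the tutorial's data.
toy_rows = [
    ("car a", "USA", 8),
    ("car b", "USA", 6),
    ("car c", "USA", 4),
    ("car d", "Japan", 4),
    ("car e", "Europe", 4),
    ("car f", "Japan", 4),
]
conn.executemany("INSERT INTO car VALUES (?, ?, ?)", toy_rows)


def occurrence_counts(connection, table, attribute):
    """Count records per distinct value of `attribute`, most frequent first."""
    # Identifiers cannot be bound as query parameters, so they are quoted
    # inline; only pass trusted table and column names.
    query = (
        f'SELECT "{attribute}", COUNT(*) AS n FROM "{table}" '
        f'GROUP BY "{attribute}" ORDER BY n DESC, "{attribute}"'
    )
    return connection.execute(query).fetchall()


print(occurrence_counts(conn, "car", "Origin"))
print(occurrence_counts(conn, "car", "Cylinders"))
```

The same kind of query could be run through the ```psycopg2``` connection from the setup step to cross-check what a chart shows.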

## 3) Deep dive with intent or Clause

In the ```Correlation``` tab, we find that ```weight``` and ```milespergal``` are inversely related: as the weight of a car increases, its fuel efficiency decreases. We wonder how this relates to the carmaker's country of origin, the number of cylinders, or the car brand. As with Lux's existing capabilities, you can select one or more attributes and generate different recommended visualizations with ```intent```, ```Clause```, ```Vis```, and ```VisList```. Let's start with ```intent```.

In [None]:
tbl.intent = ["Weight"]
tbl.intent
tbl

In [None]:
tbl.intent = ["Weight","MilesPerGal"]
tbl.intent
tbl

We can also compare the countries of origin across a selected group of attributes. It turns out that Europe and Japan look similar in terms of horsepower, weight, milespergal, acceleration, and a concentration on 4-cylinder cars, whereas America looks quite different on these attributes. Diving deeper by cylinder and brand, we see that more American cars are equipped with 6 or 8 cylinders and that American carmakers offer a wider variety of brands than their European and Japanese counterparts.

In [None]:
selected_attributes = "Weight|MilesPerGal|Horsepower|Acceleration|Cylinders|Brand|Year"
tbl.intent = [selected_attributes,"Origin"]
tbl.intent
tbl
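
The ```|```-separated string in the cell above names several attributes at once. As a small sketch, assuming you keep the attribute names in a Python list, the string can be built with a join. The helper ```union_of``` is hypothetical and not part of Lux.

```python
def union_of(attributes):
    """Join attribute names into a single '|'-separated string."""
    for name in attributes:
        if "|" in name:
            raise ValueError(f"attribute name contains '|': {name!r}")
    return "|".join(attributes)


attributes = ["Weight", "MilesPerGal", "Horsepower", "Acceleration",
              "Cylinders", "Brand", "Year"]
selected_attributes = union_of(attributes)
print(selected_attributes)
# tbl.intent = [selected_attributes, "Origin"]  # as in the cell above
```

Keeping the names in a list makes it easier to add or drop an attribute between runs.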

After noticing the gap in average miles per gallon between American cars and European and Japanese cars, we can use ```Clause``` to dive deeper and see the breakdown of ```MilesPerGal``` by the car's country of origin.

In [None]:
tbl.intent = ['MilesPerGal',
            lux.Clause(attribute='Origin',filter_op='=', value=['Europe','Japan','USA'])]
tbl.intent
tbl

## 4) Create a visualization between multiple attributes with Vis or VisList

Using ```Vis``` or ```VisList```, we can create custom visualizations. For example, we might be interested in the distribution of horsepower in the dataset and how it differs by the car's country of origin. With ```Vis```, we can specify exactly that we are interested in the distribution of horsepower, by count of records, for cars from American carmakers.

In [None]:
from lux.vis.Vis import Vis
intent = ["Horsepower"]
vis = Vis(intent,tbl)
vis

In [None]:
new_intent = [lux.Clause("Horsepower",bin_size=50),"Origin=USA"]
vis.set_intent(new_intent)
vis

If we are unsure whether there are other relationships of interest involving ```Horsepower```, we can use the wildcard "?" symbol to create a vis collection of Horsepower with respect to all other attributes. ```VisList``` is helpful for getting an overview of relationships between multiple attributes.

In [None]:
from lux.vis.VisList import VisList
vc = VisList(["Horsepower","?"],tbl)
vc

Alternatively, we can also specify desired attributes via a ```VisList``` with respect to Horsepower:

In [None]:
vc = VisList(["Horsepower",['MilesPerGal','Year','Weight','Origin','Cylinders','Name']],tbl)
vc

## C. Download data:
- Select graph and download visualizations as graph
- Save as html, altair, vegalite for further editing or revisions

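## Select graph and download visualizations as graph

Select one or more visualizations in the Lux widget and click the export button. The selected visualizations then become available through ```tbl.exported```.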

In [None]:
vis = tbl.exported[0]
vis

## Save as html, altair, vegalite for further editing or revisions

In [None]:
tbl.save_as_html()
tbl

In [None]:
tbl1 = vis
tbl1

In [None]:
print(tbl1.to_Altair())

In [None]:
print(tbl1.to_VegaLite())


We hope the newly supported Postgres feature helps streamline your data exploration process with Lux. 

If you have any feedback, please let us know via [Slack](https://communityinviter.com/apps/lux-project/lux) or [GitHub](https://github.com/lux-org/lux/issues/)!

__For more information:__

- Sign up for the early-user [mailing list](https://forms.gle/XKv3ejrshkCi3FJE6) to stay tuned for upcoming releases, updates, or user studies. 
- Visit [Read the Docs](https://lux-api.readthedocs.io/en/latest/) for more detailed documentation.
- Try out these hands-on [exercises](https://mybinder.org/v2/gh/lux-org/lux-binder/master?urlpath=tree/exercise) or [tutorials](https://mybinder.org/v2/gh/lux-org/lux-binder/master?urlpath=tree/tutorial) on [Binder](https://mybinder.org/v2/gh/lux-org/lux-binder/master). Or clone and run [lux-binder](https://github.com/lux-org/lux-binder) locally.
- Join our community [Slack](https://communityinviter.com/apps/lux-project/lux) to discuss and ask questions.
- Report any bugs, issues, or requests through [GitHub Issues](https://github.com/lux-org/lux/issues).