Merge pull request #327 from thyneb19/Database-Executor
Merge in Master branch changes, add len() functionality to the LuxSQLTable
thyneb19 committed Mar 27, 2021
2 parents fcad97a + b5998c7 commit 75c5cae
Showing 35 changed files with 419 additions and 279 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -11,7 +11,7 @@
<a href='https://lux-api.readthedocs.io/en/latest/?badge=latest'>
<img src='https://readthedocs.org/projects/lux-api/badge/?version=latest' alt='Documentation Status' align="center"/>
</a>
<a href='https://lux-project.slack.com/join/shared_invite/zt-lilu4e87-TM4EDTq9HWzlDRycFsrkLg'>
<a href='https://communityinviter.com/apps/lux-project/lux'>
<img src='https://img.shields.io/static/v1?label=chat&logo=slack&message=Slack&color=brightgreen' alt='Slack' align="center"/>
</a>
<a href='https://forms.gle/XKv3ejrshkCi3FJE6'>
@@ -147,9 +147,9 @@ conda install -c conda-forge lux-api

Both the PyPI and conda installations include the Lux Jupyter widget frontend, [lux-widget](https://pypi.org/project/lux-widget/).

## Setup in Jupyter Notebook, VSCode
## Setup in Jupyter Notebook, VSCode, JupyterHub

To use Lux in [Jupyter notebook](https://github.com/jupyter/notebook) or [VSCode](https://code.visualstudio.com/docs/python/jupyter-support), activate the notebook extension:
To use Lux with any Jupyter notebook-based frontend (e.g., [Jupyter notebook](https://github.com/jupyter/notebook), [JupyterHub](https://github.com/jupyterhub/jupyterhub), or [VSCode](https://code.visualstudio.com/docs/python/jupyter-support)), activate the notebook extension:

```bash
jupyter nbextension install --py luxwidget
@@ -179,5 +179,5 @@ Other additional resources:
- Sign up for the early-user [mailing list](https://forms.gle/XKv3ejrshkCi3FJE6) to stay tuned for upcoming releases, updates, or user studies.
- Visit [ReadTheDoc](https://lux-api.readthedocs.io/en/latest/) for more detailed documentation.
- Try out these hands-on [exercises](https://mybinder.org/v2/gh/lux-org/lux-binder/master?urlpath=tree/exercise) or [tutorials](https://mybinder.org/v2/gh/lux-org/lux-binder/master?urlpath=tree/tutorial) on [Binder](https://mybinder.org/v2/gh/lux-org/lux-binder/master). Or clone and run [lux-binder](https://github.com/lux-org/lux-binder) locally.
- Join our community [Slack](https://lux-project.slack.com/join/shared_invite/zt-lilu4e87-TM4EDTq9HWzlDRycFsrkLg) to discuss and ask questions.
- Join our community [Slack](https://communityinviter.com/apps/lux-project/lux) to discuss and ask questions.
- Report any bugs, issues, or requests through [Github Issues](https://github.com/lux-org/lux/issues).
66 changes: 44 additions & 22 deletions doc/source/advanced/custom.rst
@@ -11,37 +11,51 @@ In this tutorial, we will look at how you can register custom recommendation act
df = pd.read_csv("https://raw.githubusercontent.com/lux-org/lux-datasets/master/data/hpi.csv")
df["G10"] = df["Country"].isin(["Belgium","Canada","France","Germany","Italy","Japan","Netherlands","United Kingdom","Switzerland","Sweden","United States"])
lux.config.default_display = "lux"
As we can see, Lux registers a set of default recommendations to display to users, such as Correlation, Distribution, etc.

.. code-block:: python
df
As we can see, Lux displays several recommendation actions, such as Correlation and Distribution, which are globally registered by default.
.. image:: https://github.com/lux-org/lux-resources/blob/master/doc_img/custom-3.png?raw=true
:width: 700
:align: center
:alt: Displays default actions after print df.

Registering Custom Actions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let's define a custom function to generate the recommendations on the dataframe. In this example, we register a custom action called `G10` to generate a collection of visualizations that showcases numerical measures that differs significantly across `G10 <https://en.wikipedia.org/wiki/Group_of_Ten_(economics)>`_ and non-G10 countries. In other words, we want to understand how the G10 and non-G10 countries differs based on the measures present in the dataframe.
Let's define a custom function to generate the recommendations on the dataframe. In this example, we register a custom action that showcases numerical measures that differ significantly across G10 and non-G10 countries. The `G10 <https://en.wikipedia.org/wiki/Group_of_Ten_(economics)>`_ is a group of eleven highly industrialized countries (despite its name), so comparing G10 and non-G10 countries allows us to understand how industrialized and non-industrialized economies differ based on the measures present in the dataframe.

Here, we first generate a VisList that looks at how various quantitative attributes break down between G10 and non-G10 countries. Then, we score and rank these visualizations by calculating the percentage difference in means across G10 vs. non-G10 countries.

.. code-block:: python
from lux.vis.VisList import VisList
# Create a VisList containing G10 with respect to all possible quantitative columns in the dataframe
intent = [lux.Clause("?",data_type="quantitative"),lux.Clause("G10")]
vlist = VisList(intent,df)
for vis in vlist:
# Percentage Change Between G10 v.s. non-G10 countries
# Percentage Change Between G10 v.s. non-G10 countries
a = vis.data.iloc[0,1]
b = vis.data.iloc[1,1]
vis.score = (b-a)/a
lux.config.topK = 15
vlist = vlist.showK()
vlist.sort()
vlist.showK()
.. image:: https://github.com/lux-org/lux-resources/blob/master/doc_img/custom-0.png?raw=true
:width: 700
:align: center
:alt: Custom VisList of G10 v.s. non G10 countries
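
The scoring step above can be sketched without Lux. The snippet below is a minimal illustration in plain Python, assuming each visualization reduces to a pair of group means (as read from ``vis.data.iloc[0,1]`` and ``vis.data.iloc[1,1]`` in the tutorial). The helper name ``percentage_change``, the measure names, and the numbers are hypothetical and are not part of Lux or the HPI dataset.

```python
def percentage_change(a, b):
    """Relative change from a to b, i.e. (b - a) / a, as in the tutorial's scoring loop."""
    if a == 0:
        # The tutorial code does not guard against this case; the guard is an addition here.
        raise ZeroDivisionError("baseline mean is zero; relative change is undefined")
    return (b - a) / a


# Hypothetical (first-row mean, second-row mean) pairs, for illustration only.
means = {
    "MeasureA": (50.0, 75.0),
    "MeasureB": (40.0, 38.0),
    "MeasureC": (10.0, 30.0),
}

# Score every measure, then rank from the largest relative increase downwards.
scores = {name: percentage_change(a, b) for name, (a, b) in means.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Note that a signed score ranks large decreases last; ranking by ``abs(score)`` instead would surface large differences in either direction.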

Let's define a custom function to generate the recommendations on the dataframe. In this example, we will use G10 to generate a VisList to calculate the percentage change of means Between G10 v.s. non-G10 countries.
To define a custom action, we simply wrap our earlier VisList example into a function. We can even use short texts and emojis as the title to display on the tabs for the custom recommendation.

.. code-block:: python
def G10_mean_difference(ldf):
# Define a VisList of quantitative distribution between G10 and non-G10 countries
# Define a VisList of quantitative distribution between G10 and non-G10 countries
intent = [lux.Clause("?",data_type="quantitative"),lux.Clause("G10")]
vlist = VisList(intent,ldf)
@@ -50,11 +64,13 @@ Let's define a custom function to generate the recommendations on the dataframe.
a = vis.data.iloc[0,1]
b = vis.data.iloc[1,1]
vis.score = (b-a)/a
lux.config.topK = 15
vlist = vlist.showK()
return {"action":"G10", "description": "Percentage Change of Means Between G10 v.s. non-G10 countries", "collection": vlist}
vlist.sort()
vlist.showK()
return {"action":"Compare 🏭🏦🌎",
"description": "Percentage Change of Means Between G10 v.s. non-G10 countries",
"collection": vlist}
In the code below, we define a display condition function to determine whether or not we want to generate recommendations for the custom action. In this example, we simply check if we are using the HPI dataset to generate recommendations for the custom action `G10`.
In the code below, we define a display condition function to determine whether or not we want to generate recommendations for the custom action. In this example, we simply check if we are using the HPI dataset to generate recommendations for the `Compare industrialized` action.

.. code-block:: python
@@ -68,13 +84,13 @@ In the code below, we define a display condition function to determine whether o
except:
return False
To register the `G10` action in Lux, we apply the `register_action` function, which takes a name and action as inputs, as well as a display condition and additional arguments as optional parameters.
To register the `Compare industrialized` action in Lux, we apply the :code:`register_action` function, which takes a name and action as inputs, as well as a display condition and additional arguments as optional parameters.

.. code-block:: python
lux.config.register_action("G10", G10_mean_difference, is_G10_hpi_dataset)
lux.config.register_action("Compare industrialized", G10_mean_difference, is_G10_hpi_dataset)
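
To make the roles of the name, the action function, and the display condition concrete, here is a toy registry in plain Python. It is only a sketch of the general pattern and not Lux's implementation: the class ``ActionRegistry``, its ``applicable`` method, and the condition ``has_g10_column`` are hypothetical names introduced for illustration.

```python
class ActionRegistry:
    """Toy stand-in for an action manager; not Lux's actual implementation."""

    def __init__(self):
        self._actions = {}

    def register_action(self, name, action, display_condition=None):
        if not callable(action):
            raise ValueError("action must be callable")
        self._actions[name] = (action, display_condition)

    def remove_action(self, name):
        if name not in self._actions:
            raise ValueError(f"no action named {name!r}")
        del self._actions[name]

    def applicable(self, data):
        """Return names of actions whose display condition holds for `data`."""
        return [
            name
            for name, (_, condition) in self._actions.items()
            if condition is None or condition(data)
        ]


def has_g10_column(columns):
    # Hypothetical display condition: offer the action only when a "G10" column exists.
    return "G10" in columns


registry = ActionRegistry()
registry.register_action(
    "Compare industrialized",
    lambda columns: "vislist placeholder",  # stands in for a real action function
    has_g10_column,
)
shown = registry.applicable(["Country", "G10"])
hidden = registry.applicable(["Country"])
print(shown, hidden)
```

The point of the pattern is that the condition is evaluated each time data is displayed, so a globally registered action appears only for data it applies to.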
After registering the action, the G10 recomendation action is automatically generated when we display the Lux dataframe again.
After registering the action, the custom action is automatically generated when we display the Lux dataframe again.

.. code-block:: python
@@ -93,7 +109,7 @@ Since the registered action is globally defined, the G10 action is displayed whe
df[df["GDPPerCapita"]>40000]
.. image:: https://github.com/lux-org/lux-resources/blob/master/doc_img/custom-1.png?raw=true
.. image:: https://github.com/lux-org/lux-resources/blob/master/doc_img/custom-1-filtered.png?raw=true
:width: 700
:align: center
:alt: Displays countries with GDPPerCapita > 40000 to compare G10 results.
@@ -103,17 +119,22 @@ As we can see, there is a less of a distinction between G10 and non-G10 countrie
Navigating the Action Manager
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can inspect a list of actions that are currently registered in the Lux Action Manager. The following code displays both default and user-defined actions.
You can inspect a list of actions that are currently registered in Lux's Action Manager. The following code displays both default and user-defined actions.

.. code-block:: python
lux.config.actions
.. image:: https://github.com/lux-org/lux-resources/blob/master/doc_img/custom-5.png?raw=true
:width: 700
:align: center
:alt: Retrieves a list of actions from Lux's action manager.

You can also get a single action attribute by calling this function with the action's name.

.. code-block:: python
lux.config.actions.get("G10")
lux.config.actions.get("Compare industrialized")
.. image:: https://github.com/lux-org/lux-resources/blob/master/doc_img/custom-2.png?raw=true
:width: 700
@@ -123,19 +144,20 @@ You can also get a single action attribute by calling this function with the act
Removing Custom Actions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let's say that we are no longer in looking at the `G10` action, the `remove_action` function allows you to remove from Lux's action manager an action with its id. The action will no longer display with the Lux dataframe.
Suppose we are no longer interested in the `Compare industrialized` action. The `remove_action` function removes an action from Lux's action manager by its id, after which the action is no longer displayed with the Lux dataframe.

.. code-block:: python
lux.config.remove_action("G10")
lux.config.remove_action("Compare industrialized")
After removing the action, when we print the dataframe again, the `Compare industrialized` action is no longer displayed.

After removing the action, when we print the dataframe again, the `G10` action is no longer displayed.

.. code-block:: python
df
.. image:: https://github.com/lux-org/lux-resources/blob/master/doc_img/custom-4.png?raw=true
.. image:: https://github.com/lux-org/lux-resources/blob/master/doc_img/custom-3.png?raw=true
:width: 700
:align: center
:alt: Demonstrates removing custom action from Lux Action Manager.
34 changes: 25 additions & 9 deletions doc/source/advanced/executor.rst
@@ -11,41 +11,57 @@ Please refer to :mod:`lux.executor.Executor`, if you are interested in extending
SQL Executor
=============

Lux extends its visualization exploration operations to data within SQL databases. By using the SQL Executor, users can specify a SQL database to connect a Lux Dataframe for generating all the visualizations recommended in Lux.
Lux extends its visualization exploration operations to data within SQL databases. By using the SQL Executor, users can connect a LuxSQLTable to a SQL database and generate all the visualizations that Lux recommends.

Connecting Lux to a Database
----------------------------

Before Lux can operate on data within a Postgresql database, users have to connect their Lux Dataframe to their database.
Before Lux can operate on data within a Postgresql database, users have to connect their LuxSQLTable to their database.
To do this, users first need to specify a connection to their SQL database. This can be done using the psycopg2 package's functionality.

.. code-block:: python
import psycopg2
connection = psycopg2.connect("dbname=example_database user=example_user password=example_password")
Once this connection is created, users can connect their Lux Dataframe to the database using the Lux Dataframe's set_SQL_connection command.
Once this connection is created, users can pass it to Lux's config using the set_SQL_connection function.

.. code-block:: python
lux_df.set_SQL_connection(connection, "my_table")
lux.config.set_SQL_connection(connection)
When the set_SQL_connection function is called, Lux will then populate the Dataframe with all the metadata it needs to run its intent from the database table.
When the set_SQL_connection function is called, Lux will then populate the LuxSQLTable with all the metadata it needs to run its intent from the database table.

Connecting a LuxSQLTable to a Table/View
----------------------------------------

LuxSQLTables can be connected to individual tables or views created within your Postgresql database. This can be done by specifying the table/view name in the constructor.

.. code-block:: python
sql_tbl = LuxSQLTable(table_name = "my_table")
You can also connect a LuxSQLTable to a table/view by using the set_SQL_table function.

.. code-block:: python
sql_tbl = LuxSQLTable()
sql_tbl.set_SQL_table("my_table")
Choosing an Executor
--------------------------

Once a user has created a connection to their Postgresql database, they need to change Lux's execution engine so that the system can collect and process the data properly.
By default Lux uses the Pandas executor to process local data in the Lux Dataframe, but users need to use the SQL executor when their Lux Dataframe is connected to a database.
Users can specify the executor that a Lux Dataframe will use via the set_executor_type function as follows:
By default Lux uses the Pandas executor to process local data in the LuxDataframe, but users will use the SQL executor when their LuxSQLTable is connected to a database.
Users can specify the executor that Lux will use via the set_executor_type function as follows:

.. code-block:: python
lux.config.set_executor_type("SQL")
Once a Lux Dataframe has been connected to a Postgresql table and set to use the SQL Executor, users can take full advantage of Lux's visual exploration capabilities as-is. Users can set their intent to specify which variables they are most interested in and discover insightful visualizations from their database.
Once a LuxSQLTable has been connected to a Postgresql table and set to use the SQL Executor, users can take full advantage of Lux's visual exploration capabilities as-is. Users can set their intent to specify which variables they are most interested in and discover insightful visualizations from their database.

SQL Executor Limitations
--------------------------

While users can make full use of Lux's functionalities on data within a database table, they will not be able to use any of Pandas' Dataframe functions to manipulate the data. Since the Lux SQL Executor delegates most data processing to the Postgresql database, it does not pull in the entire dataset into the Lux Dataframe. As such there is no actual data within the Lux Dataframe to manipulate, only the relevant metadata required to for Lux to manage its intent. Thus, if users are interested in manipulating or querying their data, this needs to be done through SQL or an alternative RDBMS interface.
While users can make full use of Lux's functionalities on data within a database table, they will not be able to use any of Pandas' Dataframe functions to manipulate the data in the LuxSQLTable object. Since the Lux SQL Executor delegates most data processing to the Postgresql database, it does not pull the entire dataset into the LuxSQLTable. As such, there is no actual data within the LuxSQLTable to manipulate, only the relevant metadata required for Lux to manage its intent. Thus, if users are interested in manipulating or querying their data, this needs to be done through SQL or an alternative RDBMS interface.
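
The idea of delegating processing to the database can be illustrated with a small, self-contained sketch. It uses Python's built-in ``sqlite3`` module as a stand-in for Postgresql so that it runs anywhere; the table contents and queries are made up for illustration and are not the queries Lux's SQL Executor issues.

```python
import sqlite3

# In-memory SQLite database standing in for a Postgresql server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (region TEXT, value REAL)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("north", 10.0), ("north", 20.0), ("south", 5.0)],
)

# The aggregation runs inside the database; only one row per group comes back.
rows = conn.execute(
    "SELECT region, AVG(value) FROM my_table GROUP BY region ORDER BY region"
).fetchall()

# The row count is also computed by the database rather than by loading the table.
(n_rows,) = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()

print(rows, n_rows)
conn.close()
```

Only the aggregated rows and the count cross into Python, which is why a table object built this way has no local data for Pandas functions to operate on.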
4 changes: 2 additions & 2 deletions doc/source/guide/FAQ.rst
@@ -12,7 +12,7 @@ Note that you must perform :code:`import lux` before you load in or create the d

What if my data is stored in a relational database?
""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Lux has `some limited support <https://lux-api.readthedocs.io/en/latest/source/advanced/executor.html#sql-executor>`__ for SQL (currently only tested for Postgres). We are actively working on extending Lux to databases. If you are interested in using this feature, please `contact us <http://lux-project.slack.com/>`_ for more information.
Lux has `some limited support <https://lux-api.readthedocs.io/en/latest/source/advanced/executor.html#sql-executor>`__ for SQL (currently only tested for Postgres). We are actively working on extending Lux to databases. If you are interested in using this feature, please `contact us <https://communityinviter.com/apps/lux-project/lux>`_ for more information.

What do I do with date-related attributes in my dataset?
""""""""""""""""""""""""""""""""""""""""""""""""""""""""
@@ -153,7 +153,7 @@ I'm not able to export my visualizations via the :code:`exported` property.
I have an issue that is not addressed by any of the FAQs.
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Please submit a `Github Issue <https://github.com/lux-org/lux/issues>`__ or ask a question on `Slack <http://lux-project.slack.com/>`__.
Please submit a `Github Issue <https://github.com/lux-org/lux/issues>`__ or ask a question on `Slack <https://communityinviter.com/apps/lux-project/lux>`__.

.. Not Currently Supported
.. - What do I do if I want to change the data type of an attribute?
2 changes: 2 additions & 0 deletions lux/_config/config.py
@@ -364,6 +364,8 @@ def set_executor_type(self, exe):

self.SQLconnection = ""
self.executor = PandasExecutor()
else:
raise ValueError("Executor type must be either 'Pandas' or 'SQL'")


def warning_format(message, category, filename, lineno, file=None, line=None):
2 changes: 1 addition & 1 deletion lux/action/correlation.py
@@ -62,7 +62,7 @@ def correlation(ldf: LuxDataFrame, ignore_transpose: bool = True):
}
ignore_rec_flag = False
# Doesn't make sense to compute correlation if fewer than 5 data values
if ldf.length < 5:
if ldf._length < 5:
ignore_rec_flag = True
# Then use the data populated in the vis list to compute score
for vis in vlist: