"Lux is built on the philosophy that users should always be able to visualize anything they want without having to think about how the visualization should look like".

-- In Lux, you don’t explicitly create plots; you simply specify your analysis intent, i.e., which attributes or subsets interest you, and Lux takes care of the rest. Lux is also tightly integrated with Pandas and can be used without modifying any existing code, with just one import statement. It preserves Pandas data frame semantics, so all commands from the Pandas API work in Lux as expected.

-- Lux is a Python API for intelligent visual discovery, which comes with a built-in interactive Jupyter widget.

-- Lux could be your intelligent assistant which can automate the visual aspects of the exploratory data analysis.

-- It provides powerful abstractions over visualizations: once a data frame has been displayed in the Jupyter notebook, its visualizations are a single click away.

-- Lux offers a rich, intent-based language for describing what the user wants to see.

-- The main goal of the Lux library is to make visualization as simple as loading data.

-- The interactive Lux widget assists the user to quickly browse through the data and view important trends and patterns.

-- It provides recommendations for the user to analyze further. Lux can also create visualizations for those sections of the data you have no clear idea about.



# Installing Libraries and Dependencies

In [None]:
#### Run the installation commands from a terminal, or prefix them with ! to run them from a notebook cell

In [None]:
#Install Lux from PyPI

!pip install lux-api

In [None]:
#Install and activate the Lux notebook extension (lux-widget) included in the package.
!jupyter nbextension install --py luxwidget
!jupyter nbextension enable --py luxwidget
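Before moving on, it can help to confirm that both packages are importable from the kernel's environment. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical, and the module names `lux` and `luxwidget` are taken from the import statement and the extension commands in this notebook.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located on the Python path."""
    return importlib.util.find_spec(module_name) is not None


# Report whether each package is visible to the current interpreter.
for name in ("lux", "luxwidget"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either package is reported as missing, the notebook kernel is probably running in a different environment from the one where pip installed Lux.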

# Importing Libraries and Dependencies

In [1]:
import lux
import pandas as pd

# Reading Dataset

In [2]:
df = pd.read_spss('DataManPower.sav')
df = df[0:500]

In [3]:
#hide
df['BIRTH_DATE'] = pd.to_datetime(df['BIRTH_DATE'], errors = 'coerce')
df['BIRTH_DATE']

Button(description='Toggle Pandas/Lux', layout=Layout(top='5px', width='140px'), style=ButtonStyle())

Output()

In [4]:
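#hide
from datetime import datetime

def from_dob_to_age(born):
    # Approximate age in years; 365.25 accounts for leap years on average
    today = datetime.today()
    return (today - born).days / 365.25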
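Because the function above depends on today's date, its output changes from day to day. As a sanity check, the same arithmetic can be run against a fixed reference date so the result is reproducible. This is only an illustrative sketch: the name `age_on` and the reference date are assumptions, not part of the original notebook.

```python
from datetime import datetime


def age_on(born, reference):
    """Approximate age in years at `reference`, using 365.25 days per year."""
    return (reference - born).days / 365.25


# A fixed reference date makes the result repeatable.
reference = datetime(2021, 6, 1)
print(round(age_on(datetime(1979, 1, 1), reference), 2))
```

Passing the reference date explicitly also makes it easy to compute ages as of the date the data was collected rather than the date the notebook is run.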

In [5]:
#hide
from datetime import datetime
df['Age'] = df['BIRTH_DATE'].apply(from_dob_to_age)
df['Age'] 

Button(description='Toggle Pandas/Lux', layout=Layout(top='5px', width='140px'), style=ButtonStyle())

Output()

In [6]:
df['Age'] = df['Age'].round(2)
df['Age']

Button(description='Toggle Pandas/Lux', layout=Layout(top='5px', width='140px'), style=ButtonStyle())

Output()

In [7]:
#hide
# Convert the code and ID columns to numeric dtypes
numeric_columns = [
    'CIVIL_ID', 'COUNTRY_CODE', 'GENDER_CODE', 'RLGION_CODE', 'JOB_CODE',
    'ECONOMIC_ACT_CODE', 'EDUCATION_CODE', 'MAJOR_CODE', 'SALARY',
    'SALARY_TYPE', 'ONR_GVRN_CODE', 'MARITAL_STATUS_CODE',
    'ADDRESS_AUTO_NO', 'ONR_ID', 'جنسية',
]
for column in numeric_columns:
    df[column] = pd.to_numeric(df[column])

#hide
# Parse the date columns; values that cannot be parsed become NaT
df['HIRE_DATE'] = pd.to_datetime(df['HIRE_DATE'], errors='coerce')
df['BIRTH_DATE'] = pd.to_datetime(df['BIRTH_DATE'], errors='coerce')

#hide
# Store the descriptive columns as strings
string_columns = [
    'COUNTRY_DESC', 'GENDER_DESC', 'RLGION_DESC', 'JOB_DESC', 'SECTOR',
    'ECONOMIC_ACT_DESC', 'EDUCATION_DESC', 'GOVERNORATE_DESC',
    'MARITAL_STATUS_DESC', 'COMPANY_NAME',
]
for column in string_columns:
    df[column] = df[column].astype('str')

In [8]:
# After a data frame is printed, a toggle button lets you switch between the Pandas table and the Lux visualizations

In [9]:
df

Button(description='Toggle Pandas/Lux', layout=Layout(top='5px', width='140px'), style=ButtonStyle())

Output()

This creates several plots divided into three tabs:

-- **Correlation:** 

Visualizes the relationships between two quantitative attributes.

The plots are arranged from the highest to the lowest correlated pair of attributes.

-- **Distribution:** 

Shows histogram distributions of different quantitative attributes, ranked from the most to least skewed. 

-- **Occurrence:** 

Displays bar chart distributions of different categorical attributes, ranked from the most to the least uneven.
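The ranking idea behind the Correlation tab can be sketched without Lux. The snippet below is only an illustration under assumptions: it scores every pair of numeric columns by the absolute Pearson correlation and sorts the pairs, which may differ from the scoring Lux actually uses. The column names and values are made up, and the helpers `pearson` and `rank_pairs` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_pairs(columns):
    """Score every column pair by |correlation|, highest first."""
    scored = [
        (abs(pearson(columns[a], columns[b])), a, b)
        for a, b in combinations(columns, 2)
    ]
    return sorted(scored, reverse=True)


# Made-up sample data, not taken from the dataset used in this notebook.
columns = {
    "age": [25, 32, 41, 50, 58],
    "salary": [300, 380, 500, 590, 700],
    "tenure": [3, 1, 4, 1, 5],
}
for score, a, b in rank_pairs(columns):
    print(f"{a} vs {b}: {score:.2f}")
```

The Distribution and Occurrence tabs follow the same pattern with a different score per chart (skewness and unevenness, respectively).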

# Intent

In addition to dataframe visualizations at every step in the exploration, attributes and values of interest can be specified in Lux.

Based on this intent, Lux guides users towards potential next steps in their exploration.

In [10]:
df.columns

Index(['CIVIL_ID', 'BIRTH_DATE', 'COUNTRY_CODE', 'COUNTRY_DESC', 'GENDER_CODE',
       'GENDER_DESC', 'RLGION_CODE', 'RLGION_DESC', 'JOB_CODE', 'JOB_DESC',
       'SECTOR', 'ECONOMIC_ACT_CODE', 'ECONOMIC_ACT_DESC', 'EDUCATION_CODE',
       'EDUCATION_DESC', 'MAJOR_CODE', 'SALARY', 'SALARY_TYPE',
       'ONR_GVRN_CODE', 'GOVERNORATE_DESC', 'MARITAL_STATUS_CODE',
       'MARITAL_STATUS_DESC', 'COMPANY_NAME', 'HIRE_DATE', 'ADDRESS_AUTO_NO',
       'ONR_ID', 'جنسية', 'Age'],
      dtype='object')

In [11]:
df.intent = ["SALARY","JOB_CODE"]
df

Button(description='Toggle Pandas/Lux', layout=Layout(top='5px', width='140px'), style=ButtonStyle())

Output()

In [12]:
df["JOB_CODE"]

Button(description='Toggle Pandas/Lux', layout=Layout(top='5px', width='140px'), style=ButtonStyle())

Output()

In [16]:
df[df["JOB_CODE"]==21930].to_pandas()

Unnamed: 0,CIVIL_ID,BIRTH_DATE,COUNTRY_CODE,COUNTRY_DESC,GENDER_CODE,GENDER_DESC,RLGION_CODE,RLGION_DESC,JOB_CODE,JOB_DESC,...,ONR_GVRN_CODE,GOVERNORATE_DESC,MARITAL_STATUS_CODE,MARITAL_STATUS_DESC,COMPANY_NAME,HIRE_DATE,ADDRESS_AUTO_NO,ONR_ID,جنسية,Age
48,279100200406,1979-01-01,101.0,الكويت,2,انثى,,,21930,مدير مبيعات,...,5,محافظة الفروانية,2,متزوج,مصنع نسيم الكويت لتصنيع الزجاج المشغول والمراي...,2020-04-01,13161911,293436000000.0,1.0,42.4
117,272051201434,1972-01-01,721.0,باكستان,1,ذكر,1.0,مسلم,21930,مدير مبيعات,...,1,محافظة العاصمة,2,متزوج,المشاعل لصياغة الحلى الذهبية والفضية,2008-12-16,10051709,250042700000.0,2.0,49.4
498,262122800233,1962-01-01,106.0,الأردن,1,ذكر,1.0,مسلم,21930,مدير مبيعات,...,1,محافظة العاصمة,2,متزوج,شركة الانارة العالمية لبيع الادوات الكهربائية ...,2010-02-21,10527279,282999700000.0,2.0,59.39


In [15]:
df[df["JOB_CODE"]==32125].to_pandas()

Unnamed: 0,CIVIL_ID,BIRTH_DATE,COUNTRY_CODE,COUNTRY_DESC,GENDER_CODE,GENDER_DESC,RLGION_CODE,RLGION_DESC,JOB_CODE,JOB_DESC,...,ONR_GVRN_CODE,GOVERNORATE_DESC,MARITAL_STATUS_CODE,MARITAL_STATUS_DESC,COMPANY_NAME,HIRE_DATE,ADDRESS_AUTO_NO,ONR_ID,جنسية,Age


In [17]:
df = df[(df["JOB_CODE"]!=43390) & (df["JOB_CODE"]!=21930)]
df

Button(description='Toggle Pandas/Lux', layout=Layout(top='5px', width='140px'), style=ButtonStyle())

Output()

The left-hand side of the widget shows the current visualization, generated from the attributes the user has marked as being of interest. On the right, Lux generates three sets of recommendations, organized as separate tabs on the widget:

**Enhance** adds an additional attribute to the current selection, highlighting how other variables affect the relationship between JOB_CODE and SALARY.

**Filter** adds a filter to the current selection while keeping the attributes on the X and Y axes fixed. These visualizations show how the relationship between JOB_CODE and SALARY changes for different subsets of the data.

**Generalize** removes an attribute to display a more general trend, showing the distributions of SALARY and JOB_CODE on their own.
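The three recommendation types described above can be thought of as simple transformations of the intent. The sketch below is a conceptual illustration in plain Python, not Lux's implementation: the helper names `enhance`, `filter_by`, and `generalize` are hypothetical, and the filter values "A" and "B" are placeholders rather than real SECTOR values.

```python
def enhance(intent, all_attributes):
    """Add one attribute that is not already part of the intent."""
    return [intent + [attr] for attr in all_attributes if attr not in intent]


def filter_by(intent, attribute, values):
    """Keep the intent fixed and pair it with one filter per value."""
    return [(intent, f"{attribute}={value}") for value in values]


def generalize(intent):
    """Drop one attribute at a time from the intent."""
    return [[a for a in intent if a != dropped] for dropped in intent]


intent = ["SALARY", "JOB_CODE"]
attributes = ["SALARY", "JOB_CODE", "SECTOR", "GENDER_DESC"]

print(enhance(intent, attributes))
print(filter_by(intent, "SECTOR", ["A", "B"]))  # placeholder filter values
print(generalize(intent))
```

Each resulting candidate would then be scored and ranked before being shown in its tab.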

# Vis

Visualizations are represented as Vis objects in Lux. These Vis objects can be translated into Altair or Vega-Lite code.

Lux is built on the philosophy that users should always be able to visualize anything they want, without having to think about what the visualization should look like. Lux automatically determines the mark and channel mappings based on a set of best practices from Tableau. The visualizations are rendered via Altair into Vega-Lite specifications.

In [18]:
from lux.vis.Vis import Vis
job_salary = Vis(["JOB_CODE","SALARY"],df)
job_salary

LuxWidget(current_vis={'config': {'view': {'continuousWidth': 400, 'continuousHeight': 300}, 'axis': {'labelCo…

# Vis List

We can create a set of visualizations of GENDER_CODE with respect to all other attributes, using the wildcard “?” symbol.

In [19]:
from lux.vis.VisList import VisList
VisList(["GENDER_CODE","?"],df)

LuxWidget(recommendations=[{'action': 'Vis List', 'description': 'Shows a vis list defined by the intent', 'vs…

In [22]:
VisList(["ONR_GVRN_CODE","?"],df)

LuxWidget(recommendations=[{'action': 'Vis List', 'description': 'Shows a vis list defined by the intent', 'vs…
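Conceptually, the “?” wildcard stands for "every other attribute", so one partial intent expands into a list of fully specified ones. The following plain-Python sketch shows that expansion for a single wildcard; it is an assumption-laden illustration rather than how VisList works internally, and `expand_wildcard` is a hypothetical helper.

```python
def expand_wildcard(intent, all_attributes):
    """Replace a single "?" in the intent with each attribute not already used."""
    if "?" not in intent:
        return [list(intent)]
    fixed = [attr for attr in intent if attr != "?"]
    return [
        [candidate if attr == "?" else attr for attr in intent]
        for candidate in all_attributes
        if candidate not in fixed
    ]


# A small subset of the column names from this notebook.
attributes = ["GENDER_CODE", "SALARY", "Age", "SECTOR"]
for spec in expand_wildcard(["GENDER_CODE", "?"], attributes):
    print(spec)
```

Each expanded specification corresponds to one chart in the resulting list.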

# Recommendation

We can also access the set of recommendations generated for the data frame via the recommendation property. The output is a dictionary, keyed by the name of the recommendation category.

In [23]:
df.recommendation

{'Enhance': [<Vis  (x: SALARY, y: JOB_CODE, color: SECTOR) mark: scatter, score: 1.00 >,
  <Vis  (x: SALARY, y: JOB_CODE, color: SALARY_TYPE) mark: scatter, score: 1.00 >,
  <Vis  (x: SALARY, y: JOB_CODE, color: جنسية) mark: scatter, score: 0.50 >,
  <Vis  (x: SALARY, y: JOB_CODE, color: GENDER_CODE) mark: scatter, score: 0.50 >,
  <Vis  (x: SALARY, y: JOB_CODE, color: GENDER_DESC) mark: scatter, score: 0.50 >,
  <Vis  (x: SALARY, y: JOB_CODE, color: MARITAL_STATUS_CODE) mark: scatter, score: 0.25 >,
  <Vis  (x: SALARY, y: JOB_CODE, color: MARITAL_STATUS_DESC) mark: scatter, score: 0.25 >,
  <Vis  (x: SALARY, y: JOB_CODE, color: RLGION_CODE) mark: scatter, score: 0.20 >,
  <Vis  (x: SALARY, y: JOB_CODE, color: RLGION_DESC) mark: scatter, score: 0.20 >,
  <Vis  (x: SALARY, y: JOB_CODE, color: ONR_GVRN_CODE) mark: scatter, score: 0.17 >,
  <Vis  (x: SALARY, y: JOB_CODE, color: GOVERNORATE_DESC) mark: scatter, score: 0.17 >,
  <Vis  (x: SALARY, y: JOB_CODE, color: EDUCATION_DESC) mark: sc

# Exporting Visualizations as Code

In [24]:
vis = df.recommendation["Enhance"][1]
vis

LuxWidget(current_vis={'config': {'view': {'continuousWidth': 400, 'continuousHeight': 300}, 'axis': {'labelCo…

In [25]:

print(vis.to_altair())

import altair as alt

chart = alt.Chart(df).mark_circle().encode(
    x=alt.X('SALARY',scale=alt.Scale(domain=(60, 5000)),type='quantitative', axis=alt.Axis(title='SALARY')),
    y=alt.Y('JOB_CODE',scale=alt.Scale(domain=(2120, 99982)),type='quantitative', axis=alt.Axis(title='JOB_CODE'))
)
chart = chart.configure_mark(tooltip=alt.TooltipContent('encoding')) # Setting tooltip as non-null
chart = chart.interactive() # Enable Zooming and Panning
chart = chart.encode(color=alt.Color('SALARY_TYPE',type='nominal'))

chart = chart.configure_title(fontWeight=500,fontSize=13,font='Helvetica Neue')
chart = chart.configure_axis(titleFontWeight=500,titleFontSize=11,titleFont='Helvetica Neue',
			labelFontWeight=400,labelFontSize=8,labelFont='Helvetica Neue',labelColor='#505050')
chart = chart.configure_legend(titleFontWeight=500,titleFontSize=10,titleFont='Helvetica Neue',
			labelFontWeight=400,labelFontSize=8,labelFont='Helvetica Neue')
chart = chart.properties(width=160,height=150)

chart


In [29]:
import altair as alt

chart = alt.Chart(df).mark_circle().encode(
    x=alt.X('SALARY',scale=alt.Scale(domain=(60, 5000)),type='quantitative', axis=alt.Axis(title='SALARY')),
    y=alt.Y('JOB_CODE',scale=alt.Scale(domain=(2120, 99982)),type='quantitative', axis=alt.Axis(title='JOB_CODE'))
)
chart = chart.configure_mark(tooltip=alt.TooltipContent('encoding')) # Setting tooltip as non-null
chart = chart.interactive() # Enable Zooming and Panning
chart = chart.encode(color=alt.Color('SALARY_TYPE',type='nominal'))

chart = chart.configure_title(fontWeight=500,fontSize=13,font='Helvetica Neue')
chart = chart.configure_axis(titleFontWeight=500,titleFontSize=11,titleFont='Helvetica Neue',
			labelFontWeight=400,labelFontSize=8,labelFont='Helvetica Neue',labelColor='#505050')
chart = chart.configure_legend(titleFontWeight=500,titleFontSize=10,titleFont='Helvetica Neue',
			labelFontWeight=400,labelFontSize=8,labelFont='Helvetica Neue')
chart = chart.properties(width=160,height=150)

chart


In [26]:
print(vis.to_vegalite())

** Remove this comment -- Copy Text Below to Vega Editor(vega.github.io/editor) to visualize and edit **
{
  "config": {
    "view": {
      "continuousWidth": 400,
      "continuousHeight": 300
    },
    "axis": {
      "labelColor": "#505050",
      "labelFont": "Helvetica Neue",
      "labelFontSize": 9,
      "labelFontWeight": 400,
      "titleFont": "Helvetica Neue",
      "titleFontSize": 11,
      "titleFontWeight": 500
    },
    "legend": {
      "labelFont": "Helvetica Neue",
      "labelFontSize": 9,
      "labelFontWeight": 400,
      "titleFont": "Helvetica Neue",
      "titleFontSize": 10,
      "titleFontWeight": 500
    },
    "mark": {
      "tooltip": {
        "content": "encoding"
      }
    },
    "title": {
      "font": "Helvetica Neue",
      "fontSize": 13,
      "fontWeight": 500
    }
  },
  "data": {
    "name": "data-ad34afc27c89be127544f878c799b6ec"
  },
  "mark": "circle",
  "encoding": {
    "color": {
      "type": "nominal",
      "field": "SALARY_T

Lux provides a powerful abstraction for working with collections of visualizations based on partially specified queries. Users can provide a list or a wildcard to iterate over combinations of filter or attribute values and quickly browse through large numbers of visualizations. The partial specification is inspired by existing work on intent languages for visualization.

# lux.Clause

There’s only so much one can accomplish with string-based intent specifications; lux.Clause offers a more expressive way of specifying intent. Additionally, it allows us to override auto-inferred details about the plots, such as the attribute’s default axis or the aggregation function used for quantitative attributes.

In [33]:
df.intent = [lux.Clause(attribute='SALARY', channel='y')]
df

Button(description='Toggle Pandas/Lux', layout=Layout(top='5px', width='140px'), style=ButtonStyle())

Output()

In [34]:
df.intent = ["SALARY",lux.Clause("SALARY",aggregation="sum")]
df

Button(description='Toggle Pandas/Lux', layout=Layout(top='5px', width='140px'), style=ButtonStyle())

Output()
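To see what overriding the aggregation function changes, the plain-Python sketch below groups a few records by a category and compares the mean with the sum. It does not call Lux: the records, the SECTOR values, and the `aggregate` helper are all made up for illustration, and the comparison against the mean assumes that a mean-style aggregate is what a chart would otherwise show.

```python
from collections import defaultdict
from statistics import mean


def aggregate(records, key, value, func):
    """Group records by `key` and reduce each group's `value` with `func`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[value])
    return {group: func(values) for group, values in groups.items()}


# Made-up records, not taken from the dataset used in this notebook.
records = [
    {"SECTOR": "private", "SALARY": 200},
    {"SECTOR": "private", "SALARY": 400},
    {"SECTOR": "public", "SALARY": 900},
]

print(aggregate(records, "SECTOR", "SALARY", mean))
print(aggregate(records, "SECTOR", "SALARY", sum))
```

With a sum, groups that contain more rows grow taller even when individual salaries are similar, so the choice of aggregation changes what the bar heights mean.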

Lux offers data scientists a quick way to easily explore patterns and profile their data through automated visualizations inside of their Jupyter notebook. The ability to quickly slice and dice datasets without the need for extensive code provides efficiency and helps speed up the end-to-end process of analyzing new datasets.

The library can also export graphs in a number of ways, including as code that can be run outside of the Lux widget itself, giving data scientists a quick and easy way to share their analysis.