# LUX: A Python Library for Exploratory Data Analysis
* LUX is an open-source Python library for accelerating and simplifying the process of data exploration.
* LUX recommends interesting visualizations to guide users towards potential next steps in their data analysis.

![image](https://drive.google.com/uc?id=1YkYMsjKI3p56cnKi5K-SYOt__RoWxEkU)

# Importance of Data Visualization in Data Science

Many tools facilitate data visualization, but they are not always intelligent enough, so users end up writing a considerable amount of code. This can shift the focus from establishing critical relationships within the data to the mechanics of visualization.

Therefore, a new library called LUX has been introduced. It simplifies exploratory data analysis by recommending relevant visualizations to users.

* LUX has features that automate much of the visualization process, saving time and effort.
* In LUX, we don't create plots explicitly; we simply specify the purpose of the analysis, i.e. which attributes interest us, and LUX takes care of the rest.

## Current challenges to efficient data exploration

The three major obstacles faced by data analysts are as follows:
* *Plotting requires a considerable amount of code and prior decisions:*
  To visualize the data, a user needs to make sure that the visualizations are up to the mark. Users have to decide how each visualization should look, with all its specifications, and then translate those specification details into code.
* *The disconnect between code and interactive tools:*
   Most programming tools provide flexibility but are inaccessible to beginners, whereas the tools that are easy for beginners are inflexible and hard to customize.
* *The trial-and-error procedure is tedious and overwhelming:*
  A user has to try multiple visualizations before settling on a final one. Every EDA requires a continuous cycle of trial and error, so data analysts can miss important insights that are present in their datasets. Users might also not know which operations to perform on their dataset, which can make them lose track of the analysis.
  

## How LUX Library helps to overcome these challenges in data exploration
* The goal of LUX is to make it easier for data scientists to explore their data even when they don't have a clear idea of what they're looking for.
* LUX brings the power of interactive visualizations directly into Jupyter Notebook, bridging the gap between code and interactive interfaces.
* It features a powerful intent language that lets users specify their analysis interests, lowering the programming cost.
* It automatically provides visualization recommendations for dataframes.

*Note: Lux does not work in Colab because Colab doesn’t support custom widgets yet.*

**Getting Started with LUX Installation and Configuration**

Lux requires SciPy, Altair, pandas and scikit-learn to be installed. Once these are in place, Lux can be installed with pip from your console.
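
Before installing, it can help to check which of these dependencies are already importable. The snippet below is a minimal sketch using only the standard library; the `REQUIRED` list and the `missing_packages` helper are illustrative names, not part of Lux.

```python
import importlib.util

# Import names, not pip names: "sklearn" is the import name of scikit-learn.
REQUIRED = ["scipy", "altair", "pandas", "sklearn"]


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report anything that still needs to be installed.
print("Missing:", missing_packages(REQUIRED) or "none")
```

Any name reported as missing can be installed with pip before installing Lux itself.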

In [None]:
pip install lux-api

**Now we add one extra import statement alongside the pandas import.**

In [1]:
import lux
import pandas as pd

We will now load a dataset into a dataframe object and display it.
LUX appears as a toggle button once the dataframe object is displayed.

In [2]:
df = pd.read_csv("https://raw.githubusercontent.com/lux-org/lux-datasets/master/data/college.csv")
df


Button(description='Toggle Pandas/Lux', layout=Layout(top='5px', width='140px'), style=ButtonStyle())

Output()

**After calling the dataframe object we can see a Toggle button**
![image](https://drive.google.com/uc?id=1q4swtjHwzUxQfSWBiLKlBblpaK6BNKsx)

**After clicking on the Toggle button, we can see recommendations organized into three tabs, i.e.
Correlation, Distribution and Occurrence.**
<img src="https://drive.google.com/uc?id=1Ffv6FFibu9kow-724H7PaylkZBK1A-jA">

**By default, LUX creates correlation, distribution and occurrence charts across the dataframe object. These tabs can be used to identify interesting patterns within the datasets.**
* *Correlation shows a set of pairwise relationships between quantitative attributes, ranked from the most correlated to the least correlated.*
* *Distribution shows a set of univariate distributions, ranked from the most skewed to the least skewed.*
* *Occurrence shows a set of bar charts that can be generated from the dataset.*
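
The ranking idea behind the Correlation and Distribution tabs can be sketched in plain pandas. This is only an illustration of "most correlated first" and "most skewed first". It assumes Pearson correlation and pandas' sample skewness, which may differ from the scoring LUX uses internally, and the `toy` dataframe and helper names are made up for the example.

```python
import pandas as pd


def rank_correlations(df):
    """Return (col_a, col_b, r) tuples sorted by |r|, strongest first."""
    corr = df.select_dtypes(include="number").corr()  # Pearson by default
    cols = list(corr.columns)
    pairs = [
        (a, b, corr.loc[a, b])
        for i, a in enumerate(cols)
        for b in cols[i + 1:]
    ]
    return sorted(pairs, key=lambda pair: abs(pair[2]), reverse=True)


def rank_skewness(df):
    """Return absolute skewness per numeric column, most skewed first."""
    return df.select_dtypes(include="number").skew().abs().sort_values(ascending=False)


# Hypothetical toy data, not the college dataset.
toy = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],   # perfectly correlated with x
    "z": [5, 1, 4, 2, 3],    # only weakly related to x
    "w": [1, 1, 1, 1, 10],   # strongly skewed
})

top_pair = rank_correlations(toy)[0]
most_skewed = rank_skewness(toy).index[0]
print(top_pair[:2], most_skewed)
```

Here the pair `x`/`y` ranks first for correlation and `w` ranks first for skewness, mirroring the order in which such charts would appear in each tab.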



## Intent based visualization 
We can further customize the presentations of these charts by specifying the intent.

*Intent is defined as the attributes which we want to analyse.*
We specify AverageCost and SATAverage as the intent and display the dataframe again.



In [3]:
df.intent = ["AverageCost","SATAverage"]
df

Button(description='Toggle Pandas/Lux', layout=Layout(top='5px', width='140px'), style=ButtonStyle())

Output()

<img src="https://drive.google.com/uc?id=11yf1Al7H-_JDK3DI4HJrYdXWID9u7Hln">

* After clicking on the Toggle button, LUX now displays charts based on the intent we specified. In the **Enhance** tab, these graphs show how other attributes relate to that intent.
* LUX automatically adds each new attribute to the intent and generates the graphs from the resulting combinations.


**LUX recommends interesting plots in certain tabs which are as follows:**
* *Enhance tab:* It lets the user visualize the relationship between the specified intent and other attributes.
* *Filter tab:* It adds filters to the intended visualization, letting the user quickly browse through subsets of the data.
* *Generalize tab:* It removes the additional attributes and filters from the plots to display a more general form of the relationship between features.
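
Conceptually, these tabs can be read as small transformations of the intent list. The sketch below is a simplified model under that assumption, not LUX's internal implementation. The `enhance`, `add_filters` and `generalize` helpers and the example column values are hypothetical.

```python
def enhance(intent, all_columns):
    """Add one more attribute to the intent, once per unused column."""
    return [intent + [col] for col in all_columns if col not in intent]


def add_filters(intent, column, values):
    """Add a 'column=value' filter to the intent, once per value."""
    return [intent + [f"{column}={value}"] for value in values]


def generalize(intent):
    """Drop one attribute or filter from the intent at a time."""
    return [intent[:i] + intent[i + 1:] for i in range(len(intent))]


intent = ["AverageCost", "SATAverage"]
columns = ["AverageCost", "SATAverage", "Region"]

print(enhance(intent, columns))
print(add_filters(intent, "Region", ["Southeast"]))
print(generalize(intent))
```

Each returned list stands for one candidate chart: Enhance grows the intent, Filter narrows the data, and Generalize shrinks the intent.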

# Quick, on-demand visualizations with the help of automatic encoding
Now that we know LUX can be used to discover interesting visualizations, we can dig into these visualizations a bit more.

We have seen how visualizations are recommended to users. Users can also create their own visualizations by using a similar syntax and specifying the intent.

Visualizations are represented as ```Vis``` objects in LUX. These visualizations can further be edited.


In [4]:
from lux.vis.Vis import Vis

The user can also create their own visualization with the ```Vis``` class, passing the intent and the dataframe.

In [5]:
RegionIncome = Vis(["Region=Southeast","MedianFamilyIncome"],df)
RegionIncome

LuxWidget(current_vis={'config': {'view': {'continuousWidth': 400, 'continuousHeight': 300}, 'axis': {'labelCo…

![image](https://drive.google.com/uc?id=1od8VlpvI9OXLUTfVQ0gIPWZ53uMyJd6s)
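
To see what an intent such as `["Region=Southeast","MedianFamilyIncome"]` selects, it can be mimicked in plain pandas: entries containing `=` act as row filters and the remaining entries name the attributes to plot. This is a rough sketch of the idea, not LUX's own parser. The `split_intent` and `apply_intent` helpers and the toy values are assumptions made for illustration.

```python
import pandas as pd


def split_intent(intent):
    """Separate plain attribute names from 'column=value' filter strings."""
    attributes, filters = [], {}
    for item in intent:
        if "=" in item:
            column, value = item.split("=", 1)
            filters[column] = value
        else:
            attributes.append(item)
    return attributes, filters


def apply_intent(df, intent):
    """Keep rows matching every filter, then keep only the named attributes."""
    attributes, filters = split_intent(intent)
    subset = df
    for column, value in filters.items():
        subset = subset[subset[column] == value]
    return subset[attributes]


# Hypothetical rows, not taken from the college dataset.
toy = pd.DataFrame({
    "Region": ["Southeast", "Southeast", "Other"],
    "MedianFamilyIncome": [40000, 52000, 61000],
})

selected = apply_intent(toy, ["Region=Southeast", "MedianFamilyIncome"])
print(selected)
```

The result contains only the MedianFamilyIncome values for rows where Region is Southeast, which is the data the chart above is drawn from.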

From the tutorial above, we can conclude that the LUX library automates much of the visualization process, saving time and effort.