In [1]:
import pandas as pd
import lux

Load a dataset of 392 cars from 1970 to 1982:

In [2]:
df = pd.read_csv("lux/data/car.csv")
df["Year"] = pd.to_datetime(df["Year"], format='%Y') # convert the "Year" column to a pandas datetime dtype
df.head().toPandas()

Unnamed: 0,Name,MilesPerGal,Cylinders,Displacement,Horsepower,Weight,Acceleration,Year,Origin
0,chevrolet chevelle malibu,18.0,8,307.0,130,3504,12.0,1970-01-01,USA
1,buick skylark 320,15.0,8,350.0,165,3693,11.5,1970-01-01,USA
2,plymouth satellite,18.0,8,318.0,150,3436,11.0,1970-01-01,USA
3,amc rebel sst,16.0,8,304.0,150,3433,12.0,1970-01-01,USA
4,ford torino,17.0,8,302.0,140,3449,10.5,1970-01-01,USA
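The `1970-01-01` values in the Year column come from parsing a bare year with `format='%Y'`: pandas fills in January 1 for the missing month and day. A minimal sketch of that conversion in plain pandas follows. The three year values are illustrative stand-ins for the real column, and the cast to `str` is an assumption added here so the example does not depend on how integers are parsed.

```python
import pandas as pd

# Illustrative year values; the real column comes from car.csv.
years = pd.Series([1970, 1976, 1982], name="Year")

# Cast to str so "%Y" parsing does not depend on integer handling.
parsed = pd.to_datetime(years.astype(str), format="%Y")

print(parsed.dtype)             # a datetime64 dtype
print(parsed.dt.year.tolist())  # the original years, recovered via .dt
```

The year is still available through the `.dt.year` accessor after the conversion.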


In [3]:
df

LuxWidget(recommendations=[{'action': 'Correlation', 'description': 'Show relationships between two quantitati…



Intuitively, we would expect cars with more horsepower to have higher acceleration, but we actually see the opposite trend.
Let's investigate whether additional factors are affecting this relationship.

In [8]:
df.setContext([lux.Spec(attribute = "Acceleration"),lux.Spec(attribute = "Horsepower")])
df

LuxWidget(current_view={'config': {'view': {'continuousWidth': 400, 'continuousHeight': 300}, 'mark': {'toolti…
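The direction of the Acceleration vs. Horsepower relationship can also be checked numerically with a Pearson correlation. The sketch below uses plain pandas on only the 12 rows that appear in this notebook's table outputs, so the coefficient it prints is not the value for the full 392-car dataset. The name `sample` is introduced here for illustration.

```python
import pandas as pd

# Horsepower and Acceleration for the 12 rows printed in this notebook
# (the five head() rows plus the 3- and 5-cylinder cars).
sample = pd.DataFrame({
    "Horsepower": [130, 165, 150, 150, 140, 97, 90, 110, 100, 103, 77, 67],
    "Acceleration": [12.0, 11.5, 11.0, 12.0, 10.5, 13.5, 13.5, 13.5,
                     12.5, 15.9, 20.1, 19.9],
})

# Series.corr defaults to the Pearson coefficient.
r = sample["Horsepower"].corr(sample["Acceleration"])
print(f"Pearson r on the 12-row sample: {r:.2f}")
```

A negative coefficient matches the downward trend in the scatterplot. Assuming the Lux dataframe keeps the pandas Series API, the same call on `df["Horsepower"]` and `df["Acceleration"]` would give the figure for the whole dataset.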



In the Enhance tab, every added variable (encoded as color) except MilesPerGal shows the same trend: values are highest toward the upper left and decrease toward the bottom right.
Given these three other variables, let's look at how Displacement and Weight vary across cars with different numbers of cylinders.

In [14]:
# df.setContext([lux.Spec(attribute= ["Weight","Displacement"]),lux.Spec(attribute = "Cylinders")])
df.setContext([lux.Spec(attribute = "Cylinders")])
df

LuxWidget(current_view={'config': {'view': {'continuousWidth': 400, 'continuousHeight': 300}, 'mark': {'toolti…



The Count distribution shows that there are few cars with 3 or 5 cylinders, so let's clean up the data by removing them.

In [15]:
df[df["Cylinders"]==3].toPandas()


Unnamed: 0,index,Name,MilesPerGal,Cylinders,Displacement,Horsepower,Weight,Acceleration,Year,Origin
70,70,mazda rx2 coupe,19.0,3,70.0,97,2330,13.5,1972-01-01,Japan
110,110,maxda rx3,18.0,3,70.0,90,2124,13.5,1973-01-01,Japan
241,241,mazda rx-4,21.5,3,80.0,110,2720,13.5,1977-01-01,Japan
331,331,mazda rx-7 gs,23.7,3,70.0,100,2420,12.5,1980-01-01,Japan


In [16]:
df[df["Cylinders"]==5].toPandas()


Unnamed: 0,index,Name,MilesPerGal,Cylinders,Displacement,Horsepower,Weight,Acceleration,Year,Origin
272,272,audi 5000,20.3,5,131.0,103,2830,15.9,1978-01-01,Europe
295,295,mercedes benz 300d,25.4,5,183.0,77,3530,20.1,1979-01-01,Europe
325,325,audi 5000s (diesel),36.4,5,121.0,67,2950,19.9,1980-01-01,Europe


In [17]:
newdf = df[(df["Cylinders"]!=3) & (df["Cylinders"]!=5)]
newdf

LuxWidget(current_view={'config': {'view': {'continuousWidth': 400, 'continuousHeight': 300}, 'mark': {'toolti…
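The same cleanup can be written with `value_counts` and `isin`, which scales better when more than two categories need to be dropped. This is a sketch in plain pandas, not part of the Lux API. It uses only the 12 rows printed in this notebook, so the 8-cylinder count below reflects just the five `head()` rows, not the full dataset. The names `sample`, `rare`, `kept`, and `same` are introduced here for illustration.

```python
import pandas as pd

# The 12 rows shown in this notebook's table outputs.
sample = pd.DataFrame({
    "Name": [
        "chevrolet chevelle malibu", "buick skylark 320",
        "plymouth satellite", "amc rebel sst", "ford torino",
        "mazda rx2 coupe", "maxda rx3", "mazda rx-4", "mazda rx-7 gs",
        "audi 5000", "mercedes benz 300d", "audi 5000s (diesel)",
    ],
    "Cylinders": [8, 8, 8, 8, 8, 3, 3, 3, 3, 5, 5, 5],
})

# Number of rows per cylinder count.
counts = sample["Cylinders"].value_counts()
print(counts)

# Drop every row whose cylinder count is in the list of rare values.
rare = [3, 5]
kept = sample[~sample["Cylinders"].isin(rare)]

# Equivalent to chaining != comparisons, as in the cell above.
same = sample[(sample["Cylinders"] != 3) & (sample["Cylinders"] != 5)]
print(kept.equals(same))
```

With `isin`, adding another category to exclude only means extending the `rare` list.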



<ViewCollection: [<View: Mark: bar, Specs: [Spec < description:,channel:x,attribute:Weight,aggregation:mean,value:>dataModel:measure,dataType:quantitative,binSize:0, Spec < description:,channel:y,attribute:Cylinders,aggregation:,value:>dataModel:dimension,dataType:nominal,binSize:0], Score:0.0>, <View: Mark: bar, Specs: [Spec < description:,channel:x,attribute:Displacement,aggregation:mean,value:>dataModel:measure,dataType:quantitative,binSize:0, Spec < description:,channel:y,attribute:Cylinders,aggregation:,value:>dataModel:dimension,dataType:nominal,binSize:0], Score:0.0>]>

[{'action': 'Enhance',
  'description': 'Shows possible visualizations when an additional attribute is added to the current view.',
  'collection': <ViewCollection: [<View: Mark: scatter, Specs: [Spec < description:,channel:x,attribute:Acceleration,aggregation:,value:>dataModel:measure,dataType:quantitative,binSize:0, Spec < description:,channel:y,attribute:Horsepower,aggregation:,value:>dataModel:measure,dataType:quantitative,binSize:0, Spec < description:,channel:color,attribute:index,aggregation:,value:>dataModel:measure,dataType:quantitative,binSize:0], Score:0.5>, <View: Mark: scatter, Specs: [Spec < description:,channel:x,attribute:Acceleration,aggregation:,value:>dataModel:measure,dataType:quantitative,binSize:0, Spec < description:,channel:y,attribute:Horsepower,aggregation:,value:>dataModel:measure,dataType:quantitative,binSize:0, Spec < description:,channel:color,attribute:Name,aggregation:,value:>dataModel:dimension,dataType:nominal,binSize:0], Score:0.5>, <View: Mark: scatt