[DOCS] Registering Custom Recommendations: (#256)
* Merge upstream

* Add new default action for geographic data types (longitude and latitude): geoshape

* Add 'country' as a new secondary geographical feature

* Reformat

* Update documentation for custom actions

* Reformat and address comments

* revert black changes

Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
micahtyong and dorisjlee committed Feb 9, 2021
1 parent 2e9aa8b commit b12bf54
Showing 11 changed files with 326 additions and 29 deletions.
50 changes: 26 additions & 24 deletions doc/source/advanced/custom.rst
@@ -33,44 +33,46 @@ Here, we first generate a VisList that looks at how various quantitative attribu
a = vis.data.iloc[0,1]
b = vis.data.iloc[1,1]
vis.score = (b-a)/a
vlist = vlist.topK(15)
lux.config.topK = 15
vlist = vlist.showK()
Let's define a custom function to generate the recommendations on the dataframe. In this example, we will use G10 to generate a VisList to calculate the percentage change of means Between G10 v.s. non-G10 countries.

.. code-block:: python
def G10_mean_difference(ldf):
# Define a VisList of quantitative distribution between G10 and non-G10 countries
intent = [lux.Clause("?",data_type="quantitative"),lux.Clause("G10")]
vlist = VisList(intent,df)
# Score each Vis based on the how different G10 and non-G10 bars are
for vis in vlist:
a = vis.data.iloc[0,1]
b = vis.data.iloc[1,1]
vis.score = (b-a)/a
vlist = vlist.topK(15)
return {"action":"G10", "description": "Percentage Change of Means Between G10 v.s. non-G10 countries", "collection": vlist}
# Define a VisList of quantitative distribution between G10 and non-G10 countries
intent = [lux.Clause("?",data_type="quantitative"),lux.Clause("G10")]
vlist = VisList(intent,ldf)
# Score each Vis based on the how different G10 and non-G10 bars are
for vis in vlist:
a = vis.data.iloc[0,1]
b = vis.data.iloc[1,1]
vis.score = (b-a)/a
lux.config.topK = 15
vlist = vlist.showK()
return {"action":"G10", "description": "Percentage Change of Means Between G10 v.s. non-G10 countries", "collection": vlist}
In the code below, we define a display condition function to determine whether or not we want to generate recommendations for the custom action. In this example, we simply check if we are using the HPI dataset to generate recommendations for the custom action `G10`.

.. code-block:: python
def is_G10_hpi_dataset(df):
try:
return all(df.columns == ['HPIRank', 'Country', 'SubRegion', 'AverageLifeExpectancy',
'AverageWellBeing', 'HappyLifeYears', 'Footprint',
'InequalityOfOutcomes', 'InequalityAdjustedLifeExpectancy',
'InequalityAdjustedWellbeing', 'HappyPlanetIndex', 'GDPPerCapita',
'Population', 'G10'])
except:
return False
try:
return all(df.columns == ['HPIRank', 'Country', 'SubRegion', 'AverageLifeExpectancy',
'AverageWellBeing', 'HappyLifeYears', 'Footprint',
'InequalityOfOutcomes', 'InequalityAdjustedLifeExpectancy',
'InequalityAdjustedWellbeing', 'HappyPlanetIndex', 'GDPPerCapita',
'Population', 'G10'])
except:
return False
To register the `G10` action in Lux, we apply the `register_action` function, which takes a name and action as inputs, as well as a display condition and additional arguments as optional parameters.

.. code-block:: python
lux.register_action("G10", G10_mean_difference, is_G10_hpi_dataset)
lux.config.register_action("G10", G10_mean_difference, is_G10_hpi_dataset)
After registering the action, the G10 recomendation action is automatically generated when we display the Lux dataframe again.

@@ -105,13 +107,13 @@ You can inspect a list of actions that are currently registered in the Lux Actio

.. code-block:: python
lux.actions.__getactions__()
lux.config.actions
You can also get a single action attribute by calling this function with the action's name.

.. code-block:: python
lux.actions.__getattr__("G10")
lux.config.actions.get("G10")
.. image:: https://github.com/lux-org/lux-resources/blob/master/doc_img/custom-2.png?raw=true
:width: 700
@@ -125,7 +127,7 @@ Let's say that we are no longer in looking at the `G10` action, the `remove_acti

.. code-block:: python
lux.remove_action("G10")
lux.config.remove_action("G10")
After removing the action, when we print the dataframe again, the `G10` action is no longer displayed.

2 changes: 2 additions & 0 deletions lux/action/default.py
@@ -6,6 +6,7 @@ def register_default_actions():
from lux.action.enhance import enhance
from lux.action.filter import add_filter
from lux.action.generalize import generalize
from lux.action.map import geomap

# display conditions for default actions
no_vis = lambda ldf: (ldf.current_vis is None) or (
@@ -19,6 +20,7 @@ def register_default_actions():
lux.config.register_action("distribution", univariate, no_vis, "quantitative")
lux.config.register_action("occurrence", univariate, no_vis, "nominal")
lux.config.register_action("temporal", univariate, no_vis, "temporal")
lux.config.register_action("geographical", geomap, no_vis, "geographical")

lux.config.register_action("Enhance", enhance, one_current_vis)
lux.config.register_action("Filter", add_filter, one_current_vis)
108 changes: 108 additions & 0 deletions lux/action/map.py
@@ -0,0 +1,108 @@
# Copyright 2019-2020 The Lux Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from lux.interestingness.interestingness import interestingness
from lux.vis.VisList import VisList
import lux
from lux.utils import utils


def geomap(ldf, ignore_transpose: bool = True):
"""
Generates map distributions of different attributes in the dataframe.
Parameters
----------
ldf : lux.core.frame
LuxDataFrame with underspecified intent.
data_type_constraint: str
Controls the type of distribution chart that will be rendered.
Returns
-------
recommendations : Dict[str,obj]
object with a collection of visualizations that result from the Distribution action.
"""
import numpy as np

ignore_rec_flag = False
possible_attributes = [
c
for c in ldf.columns
if ldf.data_type[c] == "geoshape" and ldf.cardinality[c] > 5 and c != "Number of Records"
]
recommendation = {
"action": "Geographic",
"description": "Show proportional symbol maps of <p class='highlight-descriptor'>geographic</p> attributes.",
}

if len(ldf) < 5 or len(possible_attributes) < 2:
ignore_rec_flag = True
if ignore_rec_flag or not valid_geoshape(possible_attributes):
recommendation["collection"] = []
return recommendation

intent = [lux.Clause("?", data_model="measure"), lux.Clause("?", data_model="measure")]
intent.append("?")

vlist = VisList(intent, ldf)
for i in range(len(vlist)):
vis = vlist[i]
if has_secondary_geographical_attribute(vis):
vis._mark = "geoshape"
measures = vis.get_attr_by_data_model("measure")
msr1, msr2 = measures[0].attribute, measures[1].attribute
check_transpose = (
check_transpose_not_computed(vlist, msr1, msr2) if ignore_transpose else True
)
vis.score = interestingness(vis, ldf) if check_transpose else -1
else:
vis.score = -1

vlist.sort()
recommendation["collection"] = vlist
return recommendation


def check_transpose_not_computed(vlist: VisList, a: str, b: str):
transpose_exist = list(
filter(
lambda x: (x._inferred_intent[0].attribute == b) and (x._inferred_intent[1].attribute == a),
vlist,
)
)
if len(transpose_exist) > 0:
return transpose_exist[0].score == -1
else:
return False


def has_secondary_geographical_attribute(vis):
assert len(vis.intent) == 3
secondary_attributes = {"state", "country"}
color = vis.intent[2].get_attr()
if color in secondary_attributes:
return True
return False


def valid_geoshape(possible_attributes):
lat, long = {"latitude", "lat"}, {"longitude", "long"}
possible_attributes = set(possible_attributes)
has_lat, has_long = (
len(lat.intersection(possible_attributes)) > 0,
len(long.intersection(possible_attributes)) > 0,
)
return True if has_lat and has_long else False
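
Review note: the sketch below copies the two name checks from this commit into standalone functions (no Lux import) to show how they interact. `is_geographical_name` and `geoshape_candidates` are hypothetical helpers introduced only for illustration; the first stands in for `PandasExecutor._is_geographical_attribute` but takes a column name rather than a Series. As the diff reads, detection lowercases the name and accepts only `longitude`/`latitude`, while `valid_geoshape` compares the raw names case-sensitively. Assuming the data type is not set some other way, the `lat`/`long` aliases would never reach `valid_geoshape`, and capitalized column names would pass detection but fail validation.

```python
def is_geographical_name(column_name):
    """Hypothetical stand-in for _is_geographical_attribute, keyed on the name only."""
    return str(column_name).lower() in ["longitude", "latitude"]


def valid_geoshape(possible_attributes):
    """Same check as lux/action/map.py: one latitude-like and one longitude-like name."""
    lat, long = {"latitude", "lat"}, {"longitude", "long"}
    possible_attributes = set(possible_attributes)
    return bool(lat & possible_attributes) and bool(long & possible_attributes)


def geoshape_candidates(columns):
    """Hypothetical helper: columns the detection step would tag as geoshape."""
    return [c for c in columns if is_geographical_name(c)]


# Three column layouts a user might plausibly have.
lowercase = geoshape_candidates(["latitude", "longitude", "state", "population"])
capitalized = geoshape_candidates(["Latitude", "Longitude", "state"])
aliases = geoshape_candidates(["lat", "long", "state"])
```

Only the all-lowercase layout passes both steps. Lowercasing the names inside `valid_geoshape`, or sharing one alias list between the two checks, would be one way to align them.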
4 changes: 2 additions & 2 deletions lux/executor/Executor.py
@@ -57,7 +57,7 @@ def compute_data_type(self):

def mapping(self, rmap):
group_map = {}
for val in ["quantitative", "id", "nominal", "temporal"]:
for val in ["quantitative", "id", "nominal", "temporal", "geoshape"]:
group_map[val] = list(filter(lambda x: rmap[x] == val, rmap))
return group_map

@@ -74,7 +74,7 @@ def invert_data_type(self, data_type):
def compute_data_model(self, data_type):
data_type_inverted = self.invert_data_type(data_type)
data_model = {
"measure": data_type_inverted["quantitative"],
"measure": data_type_inverted["quantitative"] + data_type_inverted["geoshape"],
"dimension": data_type_inverted["nominal"]
+ data_type_inverted["temporal"]
+ data_type_inverted["id"],
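
Review note: a minimal standalone sketch of what the `Executor.py` change does to the data model, written as free functions rather than methods. It assumes `invert_data_type` (not shown in this diff) groups attributes the same way `mapping` does; the sample `data_type` dictionary is made up for illustration.

```python
DATA_TYPES = ["quantitative", "id", "nominal", "temporal", "geoshape"]


def mapping(rmap):
    """Group attribute names by data type, with "geoshape" included as in this commit."""
    return {val: [attr for attr in rmap if rmap[attr] == val] for val in DATA_TYPES}


def compute_data_model(data_type):
    """Geoshape attributes are appended to the quantitative ones under "measure"."""
    inverted = mapping(data_type)  # assumption: invert_data_type behaves like mapping
    return {
        "measure": inverted["quantitative"] + inverted["geoshape"],
        "dimension": inverted["nominal"] + inverted["temporal"] + inverted["id"],
    }


# Made-up attribute-to-type assignment for illustration.
data_type = {
    "latitude": "geoshape",
    "longitude": "geoshape",
    "population": "quantitative",
    "state": "nominal",
    "year": "temporal",
}
data_model = compute_data_model(data_type)
```

Because latitude and longitude land under `measure`, the two `lux.Clause("?", data_model="measure")` clauses in `geomap` can match them alongside ordinary quantitative columns.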
11 changes: 11 additions & 0 deletions lux/executor/PandasExecutor.py
@@ -413,6 +413,8 @@ def compute_data_type(self, ldf: LuxDataFrame):
ldf._data_type[attr] = "temporal"
elif self._is_datetime_number(ldf[attr]):
ldf._data_type[attr] = "temporal"
elif self._is_geographical_attribute(ldf[attr]):
ldf._data_type[attr] = "geoshape"
elif pd.api.types.is_float_dtype(ldf.dtypes[attr]):
# int columns gets coerced into floats if contain NaN
convertible2int = pd.api.types.is_integer_dtype(ldf[attr].convert_dtypes())
@@ -497,6 +499,15 @@ def _is_datetime_number(series):
return False
return False

@staticmethod
def _is_geographical_attribute(series):
# run detection algorithm
geographical_var_list = ["longitude", "latitude"]
name = str(series.name).lower()
if name in geographical_var_list:
return True
return False

def compute_stats(self, ldf: LuxDataFrame):
# precompute statistics
ldf.unique_values = {}
3 changes: 2 additions & 1 deletion lux/interestingness/interestingness.py
@@ -74,7 +74,6 @@ def interestingness(vis: Vis, ldf: LuxDataFrame) -> int:
return 1 - euclidean_dist(query_vis, vis)

# Line/Bar Chart
# print("r:", n_record, "m:", n_msr, "d:",n_dim)
if n_dim == 1 and (n_msr == 0 or n_msr == 1):
if v_size < 2:
return -1
@@ -113,6 +112,8 @@ def interestingness(vis: Vis, ldf: LuxDataFrame) -> int:
if v_size < 10:
return -1
color_attr = vis.get_attr_by_channel("color")[0].attribute
if vis.mark == "geoshape":
return vis.data[dimension_lst[0].get_attr()].nunique()

C = ldf.cardinality[color_attr]
if C < 40:
2 changes: 2 additions & 0 deletions lux/processor/Compiler.py
@@ -172,6 +172,8 @@ def populate_data_type_model(ldf, vlist):
clause.data_type = ldf.data_type[clause.attribute]
if clause.data_type == "id":
clause.data_type = "nominal"
if clause.data_type == "geoshape":
clause.data_type = "quantitative"
if clause.data_model == "":
clause.data_model = data_model_lookup[clause.attribute]
if clause.value != "":
3 changes: 3 additions & 0 deletions lux/vislib/altair/AltairRenderer.py
@@ -20,6 +20,7 @@
from lux.vislib.altair.LineChart import LineChart
from lux.vislib.altair.Histogram import Histogram
from lux.vislib.altair.Heatmap import Heatmap
from lux.vislib.altair.SymbolMap import SymbolMap


class AltairRenderer:
@@ -82,6 +83,8 @@ def create_vis(self, vis, standalone=True):
chart = LineChart(vis)
elif vis.mark == "heatmap":
chart = Heatmap(vis)
elif vis.mark == "geoshape":
chart = SymbolMap(vis)
else:
chart = None
