# How to use Facets for interactive visualization of data

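[Facets](https://pair-code.github.io/facets/) is an open-source data visualization tool from Google's [People + AI Research initiative (PAIR)](https://ai.google/pair). This notebook uses its two components: Facets Overview, which summarizes the distribution of each column in a dataframe, and Facets Dive, which lets you explore individual records interactively.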

Note: As an alternative to this notebook, you can explore the data with the [1000 Genomes Data Explorer](https://test-data-explorer.appspot.com). For other datasets, check [whether there is a Data Explorer](https://app.terra.bio/#library/datasets) for your dataset.

# Setup

In [None]:
!pip3 install facets-overview

In [None]:
import base64
import os

import pandas as pd
from facets_overview.generic_feature_statistics_generator import \
    GenericFeatureStatisticsGenerator

## Add the wrapper code

In [None]:
FACETS_DEPENDENCIES = {
    "facets_html": "https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html",
    "webcomponents_js": "https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js",
}

# The Terra notebook Content Security Policy prohibits loading these files
# from a remote location, so download each file once and refer to the local
# copy by a path relative to the notebook.
for dep, url in FACETS_DEPENDENCIES.items():
    if not os.path.exists(os.path.basename(url)):
        os.system("wget --no-clobber " + url)
    # Update the dictionary to replace the absolute URL with the relative path.
    FACETS_DEPENDENCIES[dep] = os.path.basename(url)


class FacetsOverview(object):
    """Methods for Facets Overview notebook integration."""

    def __init__(self, data):
        # Take the dataframe and compute all the inputs to the Facets
        # Overview plots, such as:
        # - numeric variables: histogram bins, mean, min, median, max, etc.
        # - categorical variables: number of unique values, counts per
        #     category for the bar chart, top category, etc.
        gfsg = GenericFeatureStatisticsGenerator()
        self._proto = gfsg.ProtoFromDataFrames(
            [{"name": "data", "table": data}],
        )

    def _repr_html_(self):
        """HTML representation of Facets Overview for use in a Jupyter notebook."""
        protostr = base64.b64encode(self._proto.SerializeToString()).decode("utf-8")
        html_template = """
        <script src="{webcomponents_js}"></script>
        <link rel="import" href="{facets_html}">
        <facets-overview id="overview_elem"></facets-overview>
        <script>
          document.querySelector("#overview_elem").protoInput = "{protostr}";
        </script>"""
        html = html_template.format(
            facets_html=FACETS_DEPENDENCIES["facets_html"],
            webcomponents_js=FACETS_DEPENDENCIES["webcomponents_js"],
            protostr=protostr,
        )
        return html


class FacetsDive(object):
    """Methods for Facets Dive notebook integration."""

    def __init__(self, data, height=1000):
        self._data = data
        self.height = height

    def _repr_html_(self):
        """HTML representation of Facets Dive for use in a Jupyter notebook."""
        html_template = """
        <script src="{webcomponents_js}"></script>
        <link rel="import" href="{facets_html}">
        <facets-dive id="dive_elem" height="{height}"></facets-dive>
        <script>
          document.querySelector("#dive_elem").data = {data};
        </script>"""
        html = html_template.format(
            facets_html=FACETS_DEPENDENCIES["facets_html"],
            webcomponents_js=FACETS_DEPENDENCIES["webcomponents_js"],
            data=self._data.to_json(orient="records"),
            height=self.height,
        )
        return html
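
Both wrapper classes rely on the `_repr_html_` hook: when an object is the last expression in a cell, Jupyter calls this method, if it exists, and renders the returned HTML. The following is a minimal sketch of that hook in isolation. `HtmlNote` is a hypothetical class used only for illustration and is not part of Facets.

```python
import html


class HtmlNote:
    """Hypothetical example class: renders a short message as bold HTML."""

    def __init__(self, message):
        self._message = message

    def _repr_html_(self):
        # Escape the message so that arbitrary text cannot inject markup.
        return "<b>{}</b>".format(html.escape(self._message))


note = HtmlNote("5 < 7")
print(note._repr_html_())
```

In a notebook, evaluating `note` as the last line of a cell would display the rendered HTML rather than the default text representation.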

# Load some public data from BigQuery

In [None]:
df = pd.read_gbq(
    """
    SELECT
      *
    FROM
      `genomics-public-data.1000_genomes.sample_info`
    """,
    # Backtick-quoted table names require the standard SQL dialect.
    dialect="standard",
)

df.shape

In [None]:
df.head()

# Facets Overview

**If you do not see Facets Overview**, click the 'Not Trusted' button in the upper-right corner of the screen and change it to 'Trusted'.

See https://ipython.org/ipython-doc/3/notebook/security.html for more detail about 'trusted' and 'untrusted' notebooks.

In [None]:
FacetsOverview(df)
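
`FacetsOverview` hands the statistics to the browser as a base64 string because the serialized protocol buffer is raw bytes, which cannot be pasted directly into the HTML template. The sketch below shows only that encoding step. The `payload` bytes are made up for illustration and are not a real Facets proto.

```python
import base64

# Stand-in for self._proto.SerializeToString(); not a real Facets proto.
payload = bytes([0, 255, 16, 32])

# Encode to ASCII-safe text, as the wrapper does before templating.
protostr = base64.b64encode(payload).decode("utf-8")

# Decoding recovers the original bytes exactly.
roundtrip = base64.b64decode(protostr)
print(protostr, roundtrip == payload)
```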

# Facets Dive

**If you do not see Facets Dive**, click the 'Not Trusted' button in the upper-right corner of the screen and change it to 'Trusted'.

See https://ipython.org/ipython-doc/3/notebook/security.html for more detail about 'trusted' and 'untrusted' notebooks.

In [None]:
FacetsDive(df)
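
`FacetsDive` passes the data to the browser as a JSON array with one object per row, which is the shape that `to_json(orient="records")` produces. The sketch below builds the same shape with the standard library only. The rows and column names are invented for illustration and do not come from the 1000 Genomes table.

```python
import json

# Invented rows standing in for a dataframe: one dict per row.
rows = [
    {"sample": "S1", "population": "A", "coverage": 4.5},
    {"sample": "S2", "population": "B", "coverage": 7.0},
]

# Records orientation: a JSON array of per-row objects.
data_json = json.dumps(rows)
print(data_json)
```

Because every row is embedded in the cell output, selecting only the columns of interest or sampling rows before calling `FacetsDive` keeps the notebook smaller.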

# Provenance

In [None]:
import datetime

print(datetime.datetime.now())

In [None]:
!pip3 freeze
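
`pip3 freeze` lists every installed package. To record only the versions that matter for this notebook, one option is the standard library's `importlib.metadata` module (Python 3.8 and later). This is a sketch, and the helper name `package_version` is an assumption made for illustration.

```python
import sys
from importlib import metadata


def package_version(name):
    """Return the installed version of a distribution, or 'not installed'."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


print("python", sys.version.split()[0])
for pkg in ("pandas", "facets-overview"):
    print(pkg, package_version(pkg))
```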

Copyright 2018 The Broad Institute, Inc., Verily Life Sciences, LLC All rights reserved.

This software may be modified and distributed under the terms of the BSD license. See the LICENSE file for details.