# 2021 July 14 Victoria Analytics Meetup Talk

Welcome!  This document has been prepared for the 2021 July 14 Victoria Analytics Meetup Talk.  Topics addressed include:

* CanvasXpress for Python
* Providing charts with data in different formats
* Chart configuration options in different formats
* Chart Javascript functions

In [1]:
import os
import json
import pandas as pd

# The essential CanvasXpress chart
from canvasxpress.canvas import CanvasXpress

# Data acquisition and reference assistance 
from canvasxpress.data.url import CXUrlData
from canvasxpress.data.text import CXTextData
from canvasxpress.data.keypair import CXJSONData
from canvasxpress.data.matrix import CXDataframeData
from canvasxpress.data.profile import CXStandardProfile

# Configuration assistance
from canvasxpress.config.collection import CXConfigs
from canvasxpress.config.type import CXString, CXInt, CXBool, CXList, CXGraphType, CXGraphTypeOptions

# Event definition assistance
from canvasxpress.js.collection import CXEvents
from canvasxpress.js.function import CXEvent

# Visualization assistance
from canvasxpress.render.jupyter import CXNoteBook

# Data and Formats for CanvasXpress Charts

## The JSON Data Object

CanvasXpress makes use of JSON data objects for describing chart data and annotations.  There are four significant kinds of JSON data objects:

1. Standard
2. Venn
3. Network
4. Genome

The [CanvasXpress Documentation](https://www.canvasxpress.org/docs.html#data) provides an extensive discussion about each type, so today we will focus on the most common format.

### The Standard (XYZ) Object

Most charts make use of the Standard JSON data object.  It's structured as follows:

```text
{
    "y": {
        "smps": [Column Names as a list],
        "vars": [Row names as a list],
        "data": [
            [per Row, one list item per Column]
        ]
    },
    "x": {
        "Topic": [List of annotations per Column]
    },
    "z": {
        "Topic": [List of annotations per Row]
    }
}
```

The `y` attribute holds the data at the core of the chart.  The data is a matrix, with each column representing a sample and each row representing a variable.  For example:

```javascript
{
  "y": {
    "smps": ["Melting Point (C)", "Exploding Point (C)"],
    "vars": ["Computer", "Phone", "Car"],
    "data": [
      [200, 400],  // Computer
      [100, 200],  // Phone
      [500, 2000]  // Car
    ]
  }
}
```
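
As a minimal sketch of the same structure in Python, the `y` portion can be assembled from plain lists while checking that the matrix shape matches the names.  The `build_y` helper is hypothetical and written only for illustration; it is not part of the CanvasXpress package.

```python
def build_y(smps, vars_, rows):
    """Assemble the "y" part of a Standard object, checking the matrix shape.

    Hypothetical helper for illustration; not part of the canvasxpress package.
    """
    if len(rows) != len(vars_):
        raise ValueError("expected one data row per variable")
    for name, row in zip(vars_, rows):
        if len(row) != len(smps):
            raise ValueError(
                f"row {name!r} needs {len(smps)} values, got {len(row)}"
            )
    return {
        "y": {
            "smps": list(smps),
            "vars": list(vars_),
            "data": [list(row) for row in rows],
        }
    }


standard = build_y(
    smps=["Melting Point (C)", "Exploding Point (C)"],
    vars_=["Computer", "Phone", "Car"],
    rows=[[200, 400], [100, 200], [500, 2000]],
)

# Each row lines up with one variable; the third row belongs to "Car".
print(standard["y"]["data"][2])
```

A row with the wrong number of values raises a `ValueError` here, which is the kind of mistake that is easier to catch before the data reaches the chart.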

The `x` and `z` attributes permit descriptions or supplemental values (AKA annotations) to be provided for the column or row perspectives, respectively.  As an example:

```javascript
{
  "y": {
    "vars": [ "Variable1" ],
    "smps": [ "Sample1", "Sample2", "Sample3" ],
    "data": [ [ 10, 20, 30 ] ]
  },
  "x": {
    //          Sample1   Sample2   Sample3
    "Tissue": [ "Kidney", "Lung",   "Heart" ],
    "Donor":  [ "D1",     "D1",     "D2" ]
  },
  "z": {
    //           Variable1
    "Symbol":  [ "AAA" ],
    "Pathway": [ "P1" ]
  }
}
```
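
Because each `x` topic carries one value per sample and each `z` topic carries one value per variable, the lengths can be checked before a chart is built.  The following is a rough sketch using a hypothetical `check_annotations` helper (not a CanvasXpress function) applied to the example above:

```python
def check_annotations(obj):
    """Return a list of length problems found in the x and z annotations.

    Hypothetical helper for illustration; not part of the canvasxpress package.
    """
    y = obj["y"]
    # x annotates samples (columns); z annotates variables (rows).
    expected = {"x": len(y["smps"]), "z": len(y["vars"])}
    problems = []
    for axis, size in expected.items():
        for topic, values in obj.get(axis, {}).items():
            if len(values) != size:
                problems.append(
                    f"{axis}.{topic}: expected {size} values, got {len(values)}"
                )
    return problems


example = {
    "y": {
        "vars": ["Variable1"],
        "smps": ["Sample1", "Sample2", "Sample3"],
        "data": [[10, 20, 30]],
    },
    "x": {
        "Tissue": ["Kidney", "Lung", "Heart"],
        "Donor": ["D1", "D1", "D2"],
    },
    "z": {"Symbol": ["AAA"], "Pathway": ["P1"]},
}

print(check_annotations(example))  # the example is consistent, so the list is empty

# Dropping one tissue leaves the "Tissue" topic one value short.
broken = dict(example, x={"Tissue": ["Kidney", "Lung"]})
print(check_annotations(broken))
```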

# Native Data Wrangling

CanvasXpress for Python can work with JSON-like data using a `dict` object, which will be automatically converted to JSON.  For a heatmap example, sample data is available at the CanvasXpress site:

In [2]:
import pandas as pd

y_data_url = "https://www.canvasxpress.org/data/cX-heatmapR-dat.txt"
y_data = pd.read_csv(y_data_url, sep='\t', index_col=0)
y_data.head(3)

Unnamed: 0,S1,S2,S3,S4,S5,S6,S7,S8,S9,S10,...,S16,S17,S18,S19,S20,S21,S22,S23,S24,S25
V1,0.784,1.036,-0.641,1.606,2.208,3.879,0.333,2.265,-1.55,1.678,...,1.013,0.928,0.812,0.072,3.564,0.47,1.836,0.351,3.139,-2.207
V2,0.222,0.716,0.993,-0.913,0.996,1.235,1.396,1.817,0.162,1.137,...,0.696,0.777,1.6,0.175,2.423,0.044,3.881,-0.757,1.486,0.01
V3,0.486,2.15,-0.069,-0.468,0.402,0.725,-1.697,0.653,0.101,2.852,...,2.511,0.07,0.244,-0.41,2.345,2.401,-0.033,0.951,2.053,0.725


In [3]:
x_data_url = "https://www.canvasxpress.org/data/cX-heatmapR-smp.txt"
x_data = pd.read_csv(x_data_url, sep='\t', index_col=0)
x_data.head(3)

Unnamed: 0,Dose,Dose-Type,Site,Treatment
S1,,,Site1,Control
S2,,,Site1,Control
S3,,,Site1,Control


In [4]:
z_data_url = "https://www.canvasxpress.org/data/cX-heatmapR-var.txt"
z_data = pd.read_csv(z_data_url, sep='\t', index_col=0)
z_data.head(3)

Unnamed: 0,Sens,Type
V1,1,Pro
V2,2,Tyr
V3,3,Pho


We can process the above tables into a Standard format JSON data structure:

In [5]:
hm_data = {
    # Data
    "y": {
        # Samples, which can be considered the columns
        "smps": y_data.columns.to_list(),
        # Variables, which can be considered the index
        "vars": y_data.index.tolist(),
        # The data itself, split into rows per variable and ordered by samples
        "data": y_data.values.tolist(),
    },
    # Sample annotations
    "x": {col: x_data[col].to_list() for col in x_data.columns.to_list()},
    # Variable annotations
    "z": {col: z_data[col].to_list() for col in z_data.columns.to_list()}
}

hm_config = {
    "graphType": "Heatmap",
    "title": "Overlays in Heatmap",
    "colorSmpDendrogramBy": "Treatment",
    "colorSpectrum": ["magenta", "blue", "black", "red", "gold"],
    "colorSpectrumZeroValue": 0,
    "heatmapIndicatorHeight": 50,
    "heatmapIndicatorHistogram": True,
    "heatmapIndicatorPosition": "topLeft",
    "heatmapIndicatorWidth": 60,
    "heatmapSmpSeparateBy": "Treatment",
    "samplesClustered": True,
    "smpOverlays": ["Treatment", "Site"],
    "variablesClustered": True
}

chart = CanvasXpress(
    data=hm_data,
    config=hm_config
)

view = CXNoteBook(chart)
view.render(output_file="heatmap_dict_data")
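
To make the mapping from a tab-separated table to the `y` object explicit, here is a rough sketch that performs the same conversion with only the standard library.  The `tsv_to_y` helper is hypothetical, and the inline text is a small excerpt of the heatmap values displayed earlier rather than the full file.

```python
import csv
import io

# Two variables by three samples, copied from the table preview shown earlier.
tsv_text = (
    "\tS1\tS2\tS3\n"
    "V1\t0.784\t1.036\t-0.641\n"
    "V2\t0.222\t0.716\t0.993\n"
)


def tsv_to_y(text):
    """Convert tab-separated text with a header row and an index column
    into the "y" part of a Standard object (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header, body = rows[0], rows[1:]
    return {
        "smps": header[1:],                  # column names, minus the index column
        "vars": [row[0] for row in body],    # first cell of each row
        "data": [[float(v) for v in row[1:]] for row in body],
    }


y = tsv_to_y(tsv_text)
print(y["smps"])
print(y["vars"])
print(y["data"][0])
```

The header supplies `smps`, the first column supplies `vars`, and everything else becomes `data`, which is what the pandas expressions in the cell above produce as well.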

# Using Data Classes

CanvasXpress for Python also includes a series of data classes.

## Passing in a URL

If the data is already structured we can use a URL to a Web location.  Let's assume that the `y` data on the Web is sufficient:

In [6]:
chart = CanvasXpress(
    data=CXUrlData("https://www.canvasxpress.org/data/cX-heatmapR-dat.txt"),
    config=hm_config
)

view = CXNoteBook(chart)
view.render(output_file="heatmap_url_data")

## Passing in raw JSON

If the data is available as JSON, it can be passed in either as raw text or as a CX JSON object; the latter allows further data management as a `dict`:

In [7]:
json_data_form = json.dumps(hm_data)

chart = CanvasXpress(
    data=CXTextData(json_data_form), # All processing and error handling is pushed to the JS tier
    config=hm_config
)

view = CXNoteBook(chart)
view.render(output_file="heatmap_rawtext_data")

## Passing in Evaluated JSON

Here the JSON is first processed at the Python tier, which gives immediate feedback on its structure and allows the data to be manipulated before the chart is rendered.

In [8]:
json_data_form = json.dumps(hm_data, indent=2)

cx_json_data = CXJSONData(json_data_form)
print(cx_json_data.data['y']['vars'])

print()

cx_json_data.data['y']['vars'] = ["touched - " + value for value in cx_json_data.data['y']['vars']]
print(cx_json_data.data['y']['vars'])

chart = CanvasXpress(
    data=cx_json_data,
    config=hm_config
)

view = CXNoteBook(chart)
view.render(output_file="heatmap_json_data")

['V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10', 'V11', 'V12', 'V13', 'V14', 'V15', 'V16', 'V17', 'V18', 'V19', 'V20', 'V21', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27', 'V28', 'V29', 'V30', 'V31', 'V32', 'V33', 'V34', 'V35', 'V36', 'V37', 'V38', 'V39', 'V40']

['touched - V1', 'touched - V2', 'touched - V3', 'touched - V4', 'touched - V5', 'touched - V6', 'touched - V7', 'touched - V8', 'touched - V9', 'touched - V10', 'touched - V11', 'touched - V12', 'touched - V13', 'touched - V14', 'touched - V15', 'touched - V16', 'touched - V17', 'touched - V18', 'touched - V19', 'touched - V20', 'touched - V21', 'touched - V22', 'touched - V23', 'touched - V24', 'touched - V25', 'touched - V26', 'touched - V27', 'touched - V28', 'touched - V29', 'touched - V30', 'touched - V31', 'touched - V32', 'touched - V33', 'touched - V34', 'touched - V35', 'touched - V36', 'touched - V37', 'touched - V38', 'touched - V39', 'touched - V40']
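
The benefit of evaluating JSON at the Python tier can be illustrated with the standard `json` module alone.  This sketch does not show how `CXJSONData` works internally; the `parse_or_report` helper and the tiny sample strings are assumptions made for the illustration.

```python
import json

good_text = '{"y": {"vars": ["V1"], "smps": ["S1", "S2"], "data": [[0.784, 1.036]]}}'
# Same text with the two closing braces missing.
bad_text = '{"y": {"vars": ["V1"], "smps": ["S1", "S2"], "data": [[0.784, 1.036]]'


def parse_or_report(text):
    """Return (data, None) for valid JSON or (None, message) otherwise."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        return None, f"invalid JSON: {err.msg}"


# Valid JSON becomes a dict that can be edited before it reaches the chart.
data, error = parse_or_report(good_text)
data["y"]["vars"] = ["touched - " + value for value in data["y"]["vars"]]
print(data["y"]["vars"])

# Malformed JSON is reported in Python instead of failing later in the browser.
bad_data, bad_error = parse_or_report(bad_text)
print(bad_data is None, bad_error is not None)
```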


## Passing in a Pandas DataFrame

We can also work with DataFrames.  Assuming that only the `y` data was loaded from the Web, we could write:

In [9]:
matrix_data = CXDataframeData(
    pd.read_csv(
        "https://www.canvasxpress.org/data/cX-heatmapR-dat.txt", 
        sep='\t', 
        index_col=0
    )
)

matrix_data.dataframe.head(3)

Unnamed: 0,S1,S2,S3,S4,S5,S6,S7,S8,S9,S10,...,S16,S17,S18,S19,S20,S21,S22,S23,S24,S25
V1,0.784,1.036,-0.641,1.606,2.208,3.879,0.333,2.265,-1.55,1.678,...,1.013,0.928,0.812,0.072,3.564,0.47,1.836,0.351,3.139,-2.207
V2,0.222,0.716,0.993,-0.913,0.996,1.235,1.396,1.817,0.162,1.137,...,0.696,0.777,1.6,0.175,2.423,0.044,3.881,-0.757,1.486,0.01
V3,0.486,2.15,-0.069,-0.468,0.402,0.725,-1.697,0.653,0.101,2.852,...,2.511,0.07,0.244,-0.41,2.345,2.401,-0.033,0.951,2.053,0.725


In [10]:
chart = CanvasXpress(
    data=matrix_data,
    config=hm_config
)

view = CXNoteBook(chart)
view.render(output_file="heatmap_matrix_data")

# Working with Profiles

Given the matrix example, we can add a data profile object to help out with annotations.  The `x` and `z` properties can accept `dict` or `DataFrame` values.

In [11]:
matrix_data.profile = CXStandardProfile()

x_data_url = "https://www.canvasxpress.org/data/cX-heatmapR-smp.txt"
x_data = pd.read_csv(x_data_url, sep='\t', index_col=0)

# Assign X to the dataframe itself, which will be properly converted
matrix_data.profile.x = x_data

z_data_url = "https://www.canvasxpress.org/data/cX-heatmapR-var.txt"
z_data = pd.read_csv(z_data_url, sep='\t', index_col=0)

# Assign z to a prepared dict, which will be used as-is
matrix_data.profile.z = {col: z_data[col].to_list() for col in z_data.columns.to_list()}

print("X Values:", '\n', matrix_data.profile.x, '\n')
print("Z Values:", '\n', matrix_data.profile.z)

X Values: 
 {'Dose': [nan, nan, nan, nan, nan, 5.0, 5.0, 10.0, 10.0, 15.0, 15.0, 20.0, 20.0, 25.0, 25.0, 5.0, 5.0, 10.0, 10.0, 15.0, 15.0, 20.0, 20.0, 25.0, 25.0], 'Dose-Type': [nan, nan, nan, nan, nan, 'Dose1', 'Dose1', 'Dose2', 'Dose2', 'Dose3', 'Dose3', 'Dose4', 'Dose4', 'Dose5', 'Dose5', 'Dose1', 'Dose1', 'Dose2', 'Dose2', 'Dose3', 'Dose3', 'Dose4', 'Dose4', 'Dose5', 'Dose5'], 'Site': ['Site1', 'Site1', 'Site1', 'Site1', 'Site1', 'Site2', 'Site2', 'Site2', 'Site2', 'Site2', 'Site2', 'Site2', 'Site2', 'Site2', 'Site2', 'Site3', 'Site3', 'Site3', 'Site3', 'Site3', 'Site3', 'Site3', 'Site3', 'Site3', 'Site3'], 'Treatment': ['Control', 'Control', 'Control', 'Control', 'Control', 'TreatmentA', 'TreatmentB', 'TreatmentA', 'TreatmentB', 'TreatmentA', 'TreatmentB', 'TreatmentA', 'TreatmentB', 'TreatmentA', 'TreatmentB', 'TreatmentA', 'TreatmentB', 'TreatmentA', 'TreatmentB', 'TreatmentA', 'TreatmentB', 'TreatmentA', 'TreatmentB', 'TreatmentA', 'TreatmentB']} 

Z Values: 
 {'Sens': [1, 2, 3

In [12]:
chart = CanvasXpress(
    data=matrix_data,
    config=hm_config
)

view = CXNoteBook(chart)
view.render(output_file="heatmap_matrix_xyz_data")

# Configuration Options

## The Configuration Object

CanvasXpress configurations are provided as key-value pairs via a JSON object.  In the Python tier a `dict` is used for most configurations, but equivalents such as `list` and `tuple` can be used instead.  Advanced control is provided by `CXConfigs` and `CXType` objects.

## dict, list, and tuple

Up until now, all of the examples have made use of `dict` objects to provide configuration options.  To recap for the heatmap:

```python
{
    "graphType": "Heatmap",
    "title": "Overlays in Heatmap",
    "colorSmpDendrogramBy": "Treatment",
    "colorSpectrum": ["magenta", "blue", "black", "red", "gold"],
    "colorSpectrumZeroValue": 0,
    "heatmapIndicatorHeight": 50,
    "heatmapIndicatorHistogram": True,
    "heatmapIndicatorPosition": "topLeft",
    "heatmapIndicatorWidth": 60,
    "heatmapSmpSeparateBy": "Treatment",
    "samplesClustered": True,
    "smpOverlays": ["Treatment", "Site"],
    "variablesClustered": True
}
```

All of the values are JSON-compliant types.  A list of `list` or `tuple` pairs can also be used:

```python
# Brief list example:
[
    ["graphType", "Heatmap"],
    ["title", "Overlays in Heatmap"],
    ...
]

# Brief tuple example:
[
    ("graphType", "Heatmap"),
    ("title", "Overlays in Heatmap"),
    ...
]
```

The most important use of lists and tuples is when multiple instances of the same attribute might need to be declared, such as with the `after_render` property and init parameter of the `CanvasXpress` object.  `after_render` accepts function names and their parameter lists, and it is sometimes desirable to call a function repeatedly in the sequence.  The `setDimensions` function available in CanvasXpress for Javascript looks like this in Python:

In [13]:
chart = CanvasXpress(
    data=hm_data,
    config=hm_config,
    after_render=[
        ["setDimensions", [300, 500, True]]
    ]
)

view = CXNoteBook(chart)
view.render(output_file="heatmap_after_render_example")
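
The reason the list form matters for repeated calls comes down to plain Python behaviour: a `dict` can hold a given key only once.  A minimal sketch, in which the second set of dimensions is made up purely for the illustration:

```python
# A dict literal silently keeps only the last value for a repeated key ...
as_dict = {
    "setDimensions": [300, 500, True],
    "setDimensions": [600, 400, True],  # hypothetical second call
}

# ... whereas a list of pairs keeps every entry, in order.
as_pairs = [
    ["setDimensions", [300, 500, True]],
    ["setDimensions", [600, 400, True]],  # hypothetical second call
]

print(len(as_dict))   # 1: the first call was overwritten
print(len(as_pairs))  # 2: both calls survive
```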

## CXConfigs and CXType

CanvasXpress internally converts the key-value pairs into `CXType` objects, such as `CXString`.  These objects are usually straightforward, but they help the Python tier ensure that conversions make sense in Javascript.  Some advanced types are provided for additional error checking and feedback at the Python tier, such as `CXRGBAColor`.  `CXConfigs` provides collection functionality, including comparison and merging; `CXType` provides key-value functionality.

Here is the heatmap rendered using `CXConfigs` and `CXType` objects:

In [14]:
hm_cx_config = CXConfigs(
    CXList("colorSpectrum", ["magenta", "blue", "black", "red", "gold"]),
    CXInt("colorSpectrumZeroValue", 0),
    CXInt("heatmapIndicatorHeight", 50),
    CXBool("heatmapIndicatorHistogram", True),
    CXString("heatmapIndicatorPosition", "topLeft"),
    CXInt("heatmapIndicatorWidth", 60),
    CXString("heatmapSmpSeparateBy", "Treatment"),
    CXBool("samplesClustered", True),
    CXList("smpOverlays", ["Treatment", "Site"]),
    CXBool("variablesClustered", True)
)

hm_cx_config.add(
    CXGraphType(CXGraphTypeOptions.Heatmap)
)

hm_cx_config \
    .set_param("title", "Overlays in Heatmap") \
    .set_param("colorSmpDendrogramBy", "Treatment")

chart = CanvasXpress(
    data=hm_data,
    config=hm_cx_config
)

view = CXNoteBook(chart)
view.render(output_file="heatmap_cxconfigs_example")
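
Why the Python type of a value matters can be seen with the standard `json` module.  This is only an illustration of the conversion concern, not a description of how the `CXType` classes are implemented:

```python
import json

# A Python bool becomes a Javascript boolean literal ...
typed = json.dumps({"samplesClustered": True})
print(typed)  # {"samplesClustered": true}

# ... but the string "True" stays a string, which Javascript would not
# treat as the boolean value that was probably intended.
untyped = json.dumps({"samplesClustered": "True"})
print(untyped)  # {"samplesClustered": "True"}

# Lists and integers map directly onto their JSON equivalents.
print(json.dumps({"smpOverlays": ["Treatment", "Site"], "heatmapIndicatorWidth": 60}))
```

Declaring an option with an explicit type, as in `CXBool("samplesClustered", True)`, makes the intended type visible in the Python code.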

# Events

CanvasXpress supports Javascript functions for enhanced chart functionality.  CanvasXpress for Python facilitates assignment of function code to Javascript events for each chart.

## Event Format

An event takes the following form:

```javascript
function(o, e, t) {
    // Do something ...
}
```

The parameters are as follows (at the time of the function's execution):

* **o** represents the data for the portion of the chart under the mouse (e.g., `{'y': {'data': ['a', ...], ...}}`)
* **e** represents the event object being invoked (e.g., a MouseEvent)
* **t** represents the DOM object relevant to the event, such as the chart object
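
To connect the function form with the three parameters, the sketch below wraps a Javascript body in that `function(o, e, t)` shell as a Python string.  The `wrap_event_body` helper is hypothetical and exists only to show how a body relates to the full function; it is not part of the CanvasXpress package.

```python
def wrap_event_body(body):
    """Wrap a Javascript function body in the function(o, e, t) form.

    Hypothetical helper for illustration; not part of the canvasxpress package.
    """
    indented = "\n".join("    " + line for line in body.strip().splitlines())
    return "function(o, e, t) {\n" + indented + "\n}"


# The body refers to o (data under the mouse), e (the event), and t (the chart).
body = "t.showInfoSpan(e, 'var: ' + o.y.vars[0]);"
wrapped = wrap_event_body(body)
print(wrapped)
```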

## Working Example

The Python package allows the body of each function to be declared as a `str` with the help of `CXEvents` and `CXEvent`.  Similar to the configuration classes, the former facilitates collection activities whereas the latter tracks specific data.  Here's an example:

In [15]:
chart_plain = CanvasXpress(
    data=hm_data,
    config=hm_config
)

js_mousemove_script = """
var selectionInfo = \
'<b>Custom Data Display</b>' + '<br>' 
+ 'var: ' + o.y.vars[0] + '<br>' 
+ 'smp: ' + o.y.smps[0] + '<br>' 
+ 'data: ' + o.y.data[0] + '<br>' 
+ 'dose: ' + o.x.Dose[0];

t.showInfoSpan(e, selectionInfo);
"""

event_mousemove = CXEvent(
    id="mousemove",
    script=js_mousemove_script
)

hm_cx_config.set_param("title", "Custom Event Edition")

chart_mousemove_event = CanvasXpress(
    data=hm_data,
    config=hm_cx_config,
    events=CXEvents(
        event_mousemove
    )
)

view = CXNoteBook(chart_plain, chart_mousemove_event)
view.render(columns=2, output_file="heatmap_event_example")