# Weave PanelPlot: Visualize Point, Bar, and Line Charts

In this tutorial, we will visualize data with **Weave PanelPlot**:
* load a Pandas DataFrame or other tabular data into Weave 
* visualize your data as a 2D plot of (x, y) points with *Weave Plot*
* customize the plot to filter or annotate the data
* convert to a **Bar Chart** or **Line Plot**

We provide a sample dataset of [notable machine learning models](https://docs.google.com/spreadsheets/d/1AAIebjNsnJj_uKALHbXNfn3_YsT6sHXtCU0q7OIPuc4/edit#gid=0) to get started—you can try your own CSV file, Pandas DataFrame, or any public Google Sheets url.

# 0: Setup
Import dependencies and log in to W&B to save your work.

In [None]:
!pip install -qqq weave
import pandas as pd

In [None]:
import wandb
wandb.login()
import weave

# 1: Load some data

Load a fascinating sample dataset of notable ML publications (560 rows, 33 columns). Feel free to
* load in your own Pandas DataFrame
* load a different CSV file
* modify the Google Sheets URL and sheet id to a different public spreadsheet

In [None]:
GOOGLE_SHEETS_URL = "https://docs.google.com/spreadsheets/d/1AAIebjNsnJj_uKALHbXNfn3_YsT6sHXtCU0q7OIPuc4"
SHEET_ID = "0"
CSV_SOURCE = f"{GOOGLE_SHEETS_URL}/export?format=csv&gid={SHEET_ID}"

In [None]:
df = pd.read_csv(CSV_SOURCE)
df.head()

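To point the notebook at a different public spreadsheet, the export URL can be assembled from the spreadsheet URL and the sheet id. The helper below is a minimal sketch: `sheets_csv_url` is a hypothetical name (not part of pandas or Weave), and it assumes the sheet is publicly readable.

```python
def sheets_csv_url(spreadsheet_url: str, sheet_id: str = "0") -> str:
    """Build a CSV export URL for one tab of a public Google Sheet.

    Hypothetical helper that mirrors the CSV_SOURCE f-string used above.
    """
    # Drop a trailing "/edit..." suffix or slash if the URL was copied from the browser.
    base = spreadsheet_url.split("/edit")[0].rstrip("/")
    return f"{base}/export?format=csv&gid={sheet_id}"


url = sheets_csv_url(
    "https://docs.google.com/spreadsheets/d/1AAIebjNsnJj_uKALHbXNfn3_YsT6sHXtCU0q7OIPuc4/edit#gid=0"
)
print(url)
```

The resulting string can be passed to `pd.read_csv` in the same way as `CSV_SOURCE`.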
# 2: View your data in Weave

View an interactive panel with your data in one line (see end note [1]):

In [None]:
weave.show(df)

# 3: View as PanelPlot

You can convert any Weave Panel of type `Table` to one of type `Plot` to make a **Weave PanelPlot**. Change the Weave Panel type in the expression at the top of the panel from `table` to `plot`:

<img src="https://raw.githubusercontent.com/wandb/weave/panel_plot_ref/docs/assets/panelplot_usage/in_notebook_plot_convert.gif">

Weave infers a reasonable view of your data based on the column types:
* two numerical columns x and y become the scatter plot axes
* each row is rendered as an (x, y) point on the resulting 2D grid, with a tooltip showing details on hover
* if available, the first string-type column becomes the label / the color shown in the legend.

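To anticipate which columns the default view is likely to pick, you can inspect the column types in pandas first. This is only a sketch of the heuristics listed above, applied to a made-up table with illustrative column names; it is not Weave's actual inference code.

```python
import pandas as pd

# Tiny stand-in table; the values are made up for illustration.
df_demo = pd.DataFrame(
    {
        "System": ["A", "B", "C"],
        "Domain": ["Language", "Vision", "Games"],
        "Parameters": [1.0e6, 2.5e8, 7.0e9],
        "Citations": [120, 45, 300],
    }
)

# Numeric columns are candidates for the x and y axes.
numeric_cols = df_demo.select_dtypes(include="number").columns.tolist()
# Non-numeric (string) columns are candidates for the label / color.
string_cols = df_demo.select_dtypes(exclude="number").columns.tolist()

x_col, y_col = numeric_cols[:2]
label_col = string_cols[0] if string_cols else None
print(x_col, y_col, label_col)
```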
Before we dive into the detailed customization of a PanelPlot, how can we build with and iterate on this starter PanelPlot?

## Full-screen a plot: Open as a Weave Board

If you want more screen real estate to explore (zoom into details, zoom out for context, or iterate on multiple panel views in parallel), open any Weave Panel in a new browser tab. Mouse over the right-hand side of the panel and select "Open in a new tab":
<img src="https://raw.githubusercontent.com/wandb/weave/panel_plot_ref/docs/assets/panelplot_usage/open_nb_new_tab.png" width="75%"/>

## Multiple views: Refer to source data as you customize the plot

Duplicate a panel so you can keep one copy in the `table` state and convert the second into a `plot`: click on the horizontal three-dot icon in the top right corner and select "Duplicate".

<img src="https://raw.githubusercontent.com/wandb/weave/panel_plot_ref/docs/assets/panelplot_usage/small_dup_to_plot.png" width="75%"/>

## Resize one panel to adjust layout & iterate incrementally

Combine these UX moves to iterate quickly on a neat layout&mdash;duplicate panels, resize one panel from a corner to a smaller portion of the grid to accommodate more panels, and independently modify individual panels until you're happy with the next version.
<img src="https://raw.githubusercontent.com/wandb/weave/panel_plot_ref/docs/assets/panelplot_usage/open_plot_split.gif">

Try the whole process with the panel below:

In [None]:
weave.show(df)

# 4: PanelPlot Visual Exploration: Zoom and Selection

Interactively explore PanelPlot data with zoom and region highlighting (subset selection):

## Zoom level: Click + drag to zoom into selected rectangle, double-click to reset
To zoom into a region of the PanelPlot:
* click on the magnifying glass icon in the bottom right corner
* click, hold, and drag to select a set of points&mdash;a gray rectangle shows the active selected region
* repeat until you find the points of interest
* double-click anywhere in the plot to reset to the original zoom level

<img src="https://raw.githubusercontent.com/wandb/weave/panel_plot_ref/docs/assets/panelplot_usage/zoom_in.gif">

## Selected data: Use .selected_data op to show point details
To see the full row details for selected points:
* given one plot panel named `panelN`, create another panel, `panelN+1`, and enter `panelN.selected_data` in the new panel's expression. `panelN+1` will now show any points highlighted in `panelN`
* to select points from `panelN`, click on the pointer icon in the bottom right corner of `panelN`
* click, hold, and drag to select a set of points in `panelN`; a gray rectangle shows the active selected region
* view the full details for those points in `panelN+1`

<img src="https://raw.githubusercontent.com/wandb/weave/panel_plot_ref/docs/assets/panelplot_usage/selected_data_horiz.png">

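Conceptually, a rectangular selection returns the rows whose x and y values fall inside the dragged region. The pandas sketch below illustrates that idea on made-up data and made-up rectangle bounds; it is not how `selected_data` is implemented in Weave.

```python
import pandas as pd

# Made-up points standing in for the plotted rows.
points = pd.DataFrame(
    {
        "System": ["A", "B", "C", "D"],
        "x": [1.0, 2.0, 3.0, 4.0],
        "y": [10.0, 20.0, 30.0, 40.0],
    }
)

# Bounds of the dragged rectangle (illustrative values).
x_min, x_max = 1.5, 3.5
y_min, y_max = 15.0, 35.0

# Keep only the rows whose (x, y) falls inside the rectangle.
inside = points["x"].between(x_min, x_max) & points["y"].between(y_min, y_max)
selected = points[inside]
print(selected)
```

The rows in `selected` play the role of the second panel: full details for just the highlighted points.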
# 5: Customizing Plot Panels: Basic Configuration

Plot panels (whether scatter, line, or bar) share most of their configuration parameters. To customize a PanelPlot, click on the pencil icon in the top right corner to open the settings menu and edit the configuration.

## X axis & Y axis: Choose columns or compose a Weave expression

Define the X and Y dimensions of a plot:
* choose a column from your source data: `row["your column name here"]` (with helpful suggestions in the dropdown showing available column names as you type)
* customize the Weave expression using arithmetic (`row["x"] * 2`), combined columns (`row["a"] + row["b"]`), or more advanced Weave ops (`row["cost_per_month"].avg`)

Try modifying the X and Y of the starter plot: for example, is there a correlation between publication impact (citations) and compute costs? Try setting X = `row["Training compute cost (2020 USD)"]` and Y = `row["Citations"]`
 
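The arithmetic and combined-column expressions above have direct pandas analogues, which can be handy for sanity-checking what an axis should show. This sketch uses made-up columns `x`, `a`, and `b` to match the placeholder names in the examples; it runs on a DataFrame, not inside Weave.

```python
import pandas as pd

# Made-up table with the placeholder column names used above.
table = pd.DataFrame({"x": [1, 2, 3], "a": [10, 20, 30], "b": [0.5, 1.5, 2.5]})

# Analogue of row["x"] * 2: scale one column.
doubled = table["x"] * 2
# Analogue of row["a"] + row["b"]: combine two columns element-wise.
combined = table["a"] + table["b"]

print(doubled.tolist())
print(combined.tolist())
```

Comparing a few values computed this way against the plotted points is a quick check that an axis expression does what you intended.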
## Color

Point color is set by one of two input methods:
* default `Enter a Weave Expression` method: takes the [tableau10 color palette](https://vega.github.io/vega/docs/schemes/#tableau10) and enumerates it over the distinct values of the result of the Weave Expression, e.g. the values of a string column. In our sample plot, the `Color` field defaults to `row["System"]`, and the legend in the top right of the plot shows the [tableau10 colors](https://vega.github.io/vega/docs/schemes/#tableau10) repeating over the full list of unique values in the "System" column. Try editing this to `row["Domain"]` to see publication trends by field (the Games and Language models seem to have the highest compute costs).
* `Select via Dropdown` -> `Encode from series` : this option defaults to one blue color for a single series, and enumerates over the tableau10 color palette for multiple series

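The "repeating" behavior of a ten-color palette can be sketched in a few lines of Python. The hex values below are copied from the Vega tableau10 scheme and are worth checking against the linked page. The `assign_colors` helper is hypothetical, and assigning colors in first-seen order is an assumption, not a statement about how Weave orders values.

```python
from itertools import cycle

# Hex codes of the Vega "tableau10" scheme, in order.
TABLEAU10 = [
    "#4c78a8", "#f58518", "#e45756", "#72b7b2", "#54a24b",
    "#eeca3b", "#b279a2", "#ff9da6", "#9d755d", "#bab0ac",
]


def assign_colors(values):
    """Map each distinct value to a palette color, repeating after ten."""
    distinct = list(dict.fromkeys(values))  # de-duplicate, keep first-seen order
    return dict(zip(distinct, cycle(TABLEAU10)))


domains = ["Language", "Vision", "Games", "Language", "Speech"]
colors = assign_colors(domains)
print(colors)
```

With more than ten distinct values, the eleventh value receives the same color as the first. This is why a high-cardinality column such as "System" produces a legend with repeated colors, while a low-cardinality column such as "Domain" stays readable.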
## Tooltip

The tooltip field determines the content displayed when the mouse hovers over a data point. This defaults to a subset of columns from the Table and can be configured via Weave Expression to select one or more columns and optionally link them with strings for readability/formatting. The following expression might be a useful summary for our sample data to show the authors and date for each model in addition to the system name: `row["System"] + " - " + row["Authors"] + ", " + row["Publication date"]`

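To preview what such a tooltip string looks like before configuring the panel, the same concatenation can be done in pandas. The rows below are made up for illustration, and the `astype(str)` casts are an assumption for safety in case a column is not already a string.

```python
import pandas as pd

# Made-up rows using the same column names as the tooltip expression.
models = pd.DataFrame(
    {
        "System": ["Model A", "Model B"],
        "Authors": ["Doe et al.", "Roe and Poe"],
        "Publication date": ["2020-05-28", "2017-06-12"],
    }
)

# Cast to str first so that non-string columns (e.g. dates) concatenate cleanly.
tooltip = (
    models["System"].astype(str)
    + " - "
    + models["Authors"].astype(str)
    + ", "
    + models["Publication date"].astype(str)
)
print(tooltip.tolist())
```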
## Labels

This menu (at the very bottom of the PanelPlot settings) optionally sets the titles of the X axis, Y axis, and Color legend to the provided text. In our sample, we might condense the X axis title to "Compute cost", expand the Color legend title as "ML task type", etc.

Here is a [sample side-by-side Weave Board](https://weave.wandb.ai/?exp=get%28%0A++++%22wandb-artifact%3A%2F%2F%2Fstacey%2Fpivot%2Fdefault_plot_with_mods%3Alatest%2Fobj%22%29) of the default PanelPlot and the final state with all the above modifications. Note that the tooltip appears to the bottom right of the cursor: it describes the point at the upper left corner of the textbox.
<img src="https://raw.githubusercontent.com/wandb/weave/panel_plot_ref/docs/assets/panelplot_usage/side_by_side_plot.png">

## Multiple series

Use the "New Series +" button to add one or more series of points. You can treat each series as an independent collection of settings with all the specifications described above, which will be overlaid on the same PanelPlot area. With our sample dataset, we may want a different series for each domain: language models in one series, computer vision in another, etc.

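One way to think about per-domain series is as one sub-table per distinct domain value. The pandas sketch below splits a made-up table this way. It only illustrates the grouping idea: in the PanelPlot UI each series is configured through the settings menu, not through this code.

```python
import pandas as pd

# Made-up rows; the real dataset has many more columns.
models = pd.DataFrame(
    {
        "System": ["A", "B", "C", "D"],
        "Domain": ["Language", "Vision", "Language", "Games"],
        "Parameters": [1e9, 5e7, 2e11, 3e8],
    }
)

# One sub-table per domain; each could back its own series.
series_by_domain = {domain: group for domain, group in models.groupby("Domain")}

for domain, group in series_by_domain.items():
    print(domain, len(group))
```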
## Advanced configuration

### Log scale instead of linear scale axes

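When the range of an axis is too large to capture detail at linear scale, compress it mathematically with a power transform that approximates the effect of a log scale: e.g. convert `row["Parameters"]` to `row["Parameters"] ** 0.1`. Note that raising to the power 0.1 takes a tenth root. It is not a true base-10 logarithm, so equal ratios are not evenly spaced on the resulting axis: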
<img src="https://raw.githubusercontent.com/wandb/weave/panel_plot_ref/docs/assets/panelplot_usage/log_scale_example.png">

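To see how the `** 0.1` transform compares with a true base-10 logarithm, the short sketch below evaluates both on a few magnitudes. The values are illustrative parameter counts, not rows from the dataset.

```python
import math

# Illustrative magnitudes spanning nine orders of magnitude.
values = [1e3, 1e6, 1e9, 1e12]

# (value, true log10, tenth-root transform) for each magnitude.
rows = [(v, math.log10(v), v ** 0.1) for v in values]

for v, log10_v, power_v in rows:
    print(f"{v:.0e}  log10={log10_v:.1f}  v**0.1={power_v:.2f}")
```

The logarithm grows by the same amount for every factor of 1000, while the tenth root roughly doubles each time. Both compress the axis, but only the logarithm spaces equal ratios evenly.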
### Filter source data to remove outliers

Sometimes a log scale or zooming in is still insufficient. You can remove outliers by filtering the input data based on a range of column values. Use the `.filter()` Weave op on the `Input` field to plot only the points that meet the specified condition. For example, compare the default starter PanelPlot with a version that drops any models that have >= 1e12 parameters:
<img src="https://raw.githubusercontent.com/wandb/weave/panel_plot_ref/docs/assets/panelplot_usage/advanced_filter.png">

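The same outlier cutoff can be checked in pandas before applying it in the panel. This sketch uses a made-up table and shows the pandas equivalent of the condition only, not the Weave expression syntax.

```python
import pandas as pd

# Made-up rows; model "C" plays the role of the outlier.
models = pd.DataFrame(
    {"System": ["A", "B", "C"], "Parameters": [1.0e8, 5.0e10, 1.6e12]}
)

# Keep models below the cutoff, i.e. drop those with >= 1e12 parameters.
filtered = models[models["Parameters"] < 1e12]
print(filtered)
```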
Try some of these or some of your own explorations from the Table below:

In [None]:
weave.show(df)

# 6: Scatter plot only: Point shape and size

The `Mark` setting initially determines the plot style: scatter plot, line plot, bar chart, etc. This defaults to "auto" and picks the best option based on the incoming data types. If `Mark` is set explicitly to `point`, this reveals controls for the shape and size of the points.

## Point shape via explicit assignment or enumeration over fixed option list

* default "Enter a Weave Expression": as in the other PanelPlot config fields, write a Weave Expression that returns a list, where each distinct value in that list is assigned a shape by rotating through the built-in list of shape options
* "Select via dropdown" -> "circle", "square", "cross", "diamond", etc: this input method sets a specific literal point shape from the list of available options
* "Select via dropdown" -> "Encode from series": enumerate the shape options over the multiple series in the PanelPlot

In our example scenario, we could look at how compute sponsorship compares for notable papers: set `Shape` to `row["Compute Sponsor Categorization"]` and observe circles for academia and squares for industry.

## Point size via explicit value or bucketed enumeration

* default 100 / user-specified number: point size for scatter plots defaults to 100 and can be set to any other numerical value [2]
* Weave expression: as in other settings, the list of distinct values resulting from the user-specified Weave expression will enumerate over five perceptually-distinguishable point sizes from smallest diameter to largest diameter of point

In this sample plot, try scaling point size with compute costs using a power transform (a tenth root, which compresses large values much as a log scale would): set `Size` to `row["Training compute cost (2020 USD)"] ** 0.1`. Here's a view of the original PanelPlot with the modifications described:
<img src="https://raw.githubusercontent.com/wandb/weave/panel_plot_ref/docs/assets/panelplot_usage/shape_size_change.png">

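The idea of mapping many distinct values onto five point sizes can be sketched as a simple bucketing step. Everything here is an assumption made for illustration: the `bucket_sizes` helper is hypothetical, and equal-width buckets are just one plausible scheme. This tutorial does not specify how Weave actually clamps sizes.

```python
import bisect

# The five perceptual size classes named in end note [2].
SIZE_NAMES = ["tiny", "small", "medium", "large", "largest"]


def bucket_sizes(values):
    """Assign each value to one of five equal-width size buckets (illustrative only)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are equal: nothing to distinguish, so use the middle size.
        return ["medium"] * len(values)
    width = (hi - lo) / len(SIZE_NAMES)
    # Upper edges of the first four buckets; the fifth bucket takes everything above.
    edges = [lo + width * i for i in range(1, len(SIZE_NAMES))]
    return [SIZE_NAMES[bisect.bisect_right(edges, v)] for v in values]


sizes = bucket_sizes([0, 10, 25, 50, 75, 100])
print(sizes)
```

With raw compute costs spanning many orders of magnitude, equal-width buckets would place almost every point in the smallest class. This is one reason to compress the values (for example with the `** 0.1` transform) before mapping them to size.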
# 7: Inspiration for many possible data exploration workflows

Weave is a maximally general toolkit, and the path of your interactive visual exploration will depend on your data, context, interests, and goals. We've described the main options and useful features for PanelPlot to both illustrate concrete steps and hopefully spark your own questions and insights. In follow-up tutorials, we will cover settings specific to line plots and bar charts. Here's [one more Board](https://weave.wandb.ai/?exp=get%28%0A++++%22wandb-artifact%3A%2F%2F%2Fstacey%2Fweave%2Fdefense_system_mvp%3Alatest%2Fobj%22%29) leveraging the sample dataset&mdash;we'd love to hear if you discover something interesting in this or your own iterations.

# 8: End notes

## [1] Viewing data in Weave

When starting with a Pandas DataFrame, you have two options for getting data into Weave:

### weave.show(my_dataframe)

`weave.show(my_dataframe)` returns an interactive Weave Panel with the Pandas DataFrame as a Table. This is the fastest and simplest way to load an interactive panel with your data.

### weave.save(weave.from_pandas(my_dataframe), name="my_dataframe")

If you'd like to save the DataFrame as a local object, first wrap it in a format Weave can parse using the `weave.from_pandas` op:

```python
# Wrap the DataFrame for Weave, then save it under a name.
my_data = weave.save(weave.from_pandas(my_dataframe), name="my_dataframe")
my_data  # display the saved object
```

## [2] Configuring point size

Point size is currently "perceptually clamped" to around five distinguishable sizes: tiny, small, medium, large, largest. Increasing/decreasing the literal number will not perceptually increase the biggest points or add more perceptible gradations of size. It may make the smallest points effectively invisible.