Getting Started

This page walks through installing Guidepost and building your first visualization in five steps. For details on what data is accepted, see Data Requirements and Type Detection; for the full set of encoding options, see Configuration.

Install

Guidepost is on PyPI:

pip install guidepost

It is designed for Jupyter notebooks (classic, JupyterLab, or VS Code notebooks) — the visualization renders inline in a notebook output cell via anywidget.

1. Import and initialize

from guidepost import Guidepost

gp = Guidepost()

2. Load your data

Guidepost takes a pandas DataFrame. You need at least three numeric and two categorical columns; datetime columns are also supported (typically on the x-axis).

import pandas as pd

jobs_data = pd.read_parquet("data/jobs_data.parquet")
gp.load_data(jobs_data)

A representative HPC scheduling dataset:

job_id	start_time	queue_wait	nodes_requested	partition	status	user
12345	2023-11-01 21:19:33	5.2	10	short	Complete	User1
12346	2023-11-01 21:20:01	12.0	20	long	Running	User2
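
If you do not have a Parquet file on hand, a small DataFrame shaped like the table above is enough to try the API. The sketch below rebuilds the two sample rows in pandas and counts numeric, datetime, and categorical columns by pandas dtype. That dtype-based count is an assumption made for illustration and may not match Guidepost's own type detection (see Data Requirements and Type Detection).

```python
import pandas as pd

# The two sample rows from the table above, rebuilt in memory.
sample_jobs = pd.DataFrame({
    "job_id": [12345, 12346],
    "start_time": pd.to_datetime(["2023-11-01 21:19:33", "2023-11-01 21:20:01"]),
    "queue_wait": [5.2, 12.0],
    "nodes_requested": [10, 20],
    "partition": ["short", "long"],
    "status": ["Complete", "Running"],
    "user": ["User1", "User2"],
})

# Rough dtype-based grouping (an approximation, not Guidepost's detector).
numeric_cols = sample_jobs.select_dtypes(include="number").columns.tolist()
datetime_cols = sample_jobs.select_dtypes(include="datetime").columns.tolist()
categorical_cols = [
    c for c in sample_jobs.columns
    if c not in numeric_cols and c not in datetime_cols
]

print("numeric:", numeric_cols)
print("datetime:", datetime_cols)
print("categorical:", categorical_cols)
```

A DataFrame built this way can stand in for jobs_data in the gp.load_data(...) call shown earlier.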

load_data() formats and validates the DataFrame and reports any columns or rows it drops or converts, for example all-NaN columns, timedelta columns (converted to seconds), or array-valued cells. If the visualization is already displayed, it is refreshed as well. See Data Requirements and Type Detection for the full set of rules. To quiet the report, set gp.suppress_warnings = True before loading.
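
To make two of those rules concrete, here is a rough pandas sketch that finds all-NaN columns and converts timedelta columns to seconds. It only illustrates the rules named above and is not Guidepost's implementation; the toy DataFrame and its run_time and notes columns are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one timedelta column and one all-NaN column.
raw = pd.DataFrame({
    "queue_wait": [5.2, 12.0, 7.5],
    "run_time": pd.to_timedelta(["00:10:00", "01:00:00", "00:00:30"]),
    "notes": [np.nan, np.nan, np.nan],
})

# Columns that contain no usable values at all.
all_nan_cols = [c for c in raw.columns if raw[c].isna().all()]
cleaned = raw.drop(columns=all_nan_cols)

# Remaining timedelta columns, identified by dtype.
timedelta_cols = [
    c for c in cleaned.columns
    if pd.api.types.is_timedelta64_dtype(cleaned[c])
]
for col in timedelta_cols:
    # Express each duration as a float number of seconds.
    cleaned[col] = cleaned[col].dt.total_seconds()

print("dropped:", all_nan_cols)
print("converted to seconds:", timedelta_cols)
print(cleaned)
```

Running a check like this on your own data beforehand gives a preview of what the load report is likely to mention.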

You can also load in one line at construction time: gp = Guidepost(records=jobs_data).

3. Configure the visualization

Tell Guidepost which columns map to which part of the chart:

gp.vis_configs = {
    'x':          'start_time',       # x-axis (numeric or datetime)
    'y':          'queue_wait',       # y-axis (numeric)
    'color':      'nodes_requested',  # cell color (numeric)
    'color_agg':  'avg',              # how color is aggregated per cell
    'categorical':'user',             # bar chart / filter
    'facet_by':   'partition'         # splits the data into groups
}

Every field is explained in Configuration. You can also change these mappings from the dropdown menus inside the widget — you don't have to set vis_configs from Python at all.
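
Because most values in vis_configs name a column in the loaded DataFrame, a quick check before assigning the dictionary can catch typos. The helper below is a hypothetical convenience and not part of Guidepost; it assumes 'color_agg' is the only key in this example whose value is not a column name.

```python
# Hypothetical helper (not part of Guidepost): report config entries that
# point at columns missing from the data.
NON_COLUMN_KEYS = {"color_agg"}  # assumption: names an aggregation, not a column


def missing_columns(vis_configs, columns):
    """Return {config_key: column_name} for entries whose column is absent."""
    available = set(columns)
    return {
        key: value
        for key, value in vis_configs.items()
        if key not in NON_COLUMN_KEYS and value not in available
    }


columns = ["job_id", "start_time", "queue_wait", "nodes_requested",
           "partition", "status", "user"]

config = {
    "x": "start_time",
    "y": "queue_wait",
    "color": "nodes_requested",
    "color_agg": "avg",
    "categorical": "usr",  # deliberate typo for the demonstration
    "facet_by": "partition",
}

print(missing_columns(config, columns))  # → {'categorical': 'usr'}
```

With a real DataFrame, you would pass jobs_data.columns as the second argument and assign gp.vis_configs only when the result is empty.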

4. Display the visualization

In a notebook cell, evaluate the widget:

gp

This renders one panel of charts per facet_by group. To learn what each panel contains and how the panels link together, read Understanding the Views.

5. Retrieve your selection

Brush over the heatmap or its framing histograms (see Selecting and Exporting Data) to select records, then pull them back into Python:

# Either of these returns a pandas DataFrame of the selected rows:
df = gp.retrieve_selected_data()
df = gp.selection.dataframe

Both return the selected rows from your original dataset. gp.selection is a small Selection wrapper object whose .dataframe attribute holds the DataFrame.
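
Since the result is an ordinary DataFrame, the usual pandas tools apply to it. As a sketch, the snippet below uses a hand-built stand-in for a selection (the values are invented; with a live widget you would start from gp.selection.dataframe instead) and summarizes queue wait per partition.

```python
import pandas as pd

# Stand-in for a retrieved selection; values are invented for illustration.
selected = pd.DataFrame({
    "partition": ["short", "short", "long"],
    "queue_wait": [5.0, 7.0, 12.0],
    "nodes_requested": [10, 4, 20],
})

# Mean queue wait and job count for each partition in the selection.
summary = (
    selected.groupby("partition")["queue_wait"]
    .agg(["mean", "count"])
    .sort_index()
)
print(summary)
```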

Next: Data Requirements and Type Detection · Configuration · Understanding the Views
