Getting Started

Connor Scully-Allison edited this page Jun 15, 2026 · 1 revision

This page walks through installing Guidepost and building your first visualization in five steps. For the kinds of data Guidepost accepts, see Data Requirements and Type Detection; for the full set of encoding options, see Configuration.

Install

Guidepost is on PyPI:

pip install guidepost

It is designed for Jupyter notebooks (classic, JupyterLab, or VS Code notebooks) — the visualization renders inline in a notebook output cell via anywidget.

1. Import and initialize

from guidepost import Guidepost

gp = Guidepost()

2. Load your data

Guidepost takes a pandas DataFrame. You need at least three numeric and two categorical columns; datetime columns are also supported (typically on the x-axis).

import pandas as pd

jobs_data = pd.read_parquet("data/jobs_data.parquet")
gp.load_data(jobs_data)

A representative HPC scheduling dataset:

job_id  start_time           queue_wait  nodes_requested  partition  status    user
12345   2023-11-01 21:19:33  5.2         10               short      Complete  User1
12346   2023-11-01 21:20:01  12.0        20               long       Running   User2
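
If you do not have the parquet file at hand, the two sample rows can be built directly in pandas. The sketch below also counts columns by pandas dtype as a rough pre-check of the "three numeric, two categorical" requirement. It assumes that non-numeric, non-datetime columns are the ones treated as categorical; Guidepost's own type detection may differ (see Data Requirements and Type Detection).

```python
import pandas as pd

# The two rows from the sample table, typed the way pandas would infer them.
jobs_data = pd.DataFrame(
    {
        "job_id": [12345, 12346],
        "start_time": pd.to_datetime(
            ["2023-11-01 21:19:33", "2023-11-01 21:20:01"]
        ),
        "queue_wait": [5.2, 12.0],
        "nodes_requested": [10, 20],
        "partition": ["short", "long"],
        "status": ["Complete", "Running"],
        "user": ["User1", "User2"],
    }
)

# Group columns by pandas dtype (an approximation, not Guidepost's rules).
numeric_cols = jobs_data.select_dtypes(include="number").columns.tolist()
datetime_cols = jobs_data.select_dtypes(include="datetime").columns.tolist()
other_cols = jobs_data.select_dtypes(
    exclude=["number", "datetime"]
).columns.tolist()

print("numeric: ", numeric_cols)
print("datetime:", datetime_cols)
print("other:   ", other_cols)
```

A frame built this way can be passed to gp.load_data() in place of the parquet-backed one.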

load_data() formats and validates the DataFrame and reports any columns or rows it drops — for example all-NaN columns, timedelta columns (converted to seconds), or array-valued cells. It also refreshes the visualization if it is already displayed. See Data Requirements and Type Detection for the full set of rules; set gp.suppress_warnings = True before loading to quiet the report.
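
To see ahead of time which columns fall into the first two categories, you can inspect the DataFrame with plain pandas. This is a rough sketch, not Guidepost's own validation: the runtime and notes columns are made up for illustration, and load_data() may apply additional rules.

```python
import pandas as pd

# Hypothetical frame with one timedelta column and one all-NaN column.
df = pd.DataFrame(
    {
        "queue_wait": [5.2, 12.0],
        "runtime": pd.to_timedelta(["00:10:00", "01:30:00"]),
        "notes": [float("nan"), float("nan")],
    }
)

# Columns in which every value is missing.
all_nan_cols = [col for col in df.columns if df[col].isna().all()]

# Columns holding durations rather than plain numbers.
timedelta_cols = df.select_dtypes(include="timedelta").columns.tolist()

# Preview of a cleaned frame: drop the empty columns and express
# durations as seconds.
preview = df.drop(columns=all_nan_cols)
for col in timedelta_cols:
    preview[col] = preview[col].dt.total_seconds()

print("all-NaN columns:  ", all_nan_cols)
print("timedelta columns:", timedelta_cols)
print(preview)
```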

You can also load the data in one line at construction time: gp = Guidepost(records=jobs_data).

3. Configure the visualization

Tell Guidepost which columns map to which part of the chart:

gp.vis_configs = {
    'x':          'start_time',       # x-axis (numeric or datetime)
    'y':          'queue_wait',       # y-axis (numeric)
    'color':      'nodes_requested',  # cell color (numeric)
    'color_agg':  'avg',              # how color is aggregated per cell
    'categorical':'user',             # bar chart / filter
    'facet_by':   'partition'         # splits the data into groups
}

Every field is explained in Configuration. You can also change these mappings from the dropdown menus inside the widget — you don't have to set vis_configs from Python at all.
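
One way to catch a mistyped column name before displaying the widget is to compare the mapping against the DataFrame's columns. The sketch below is plain Python and not part of Guidepost. The helper name find_missing_columns is hypothetical, and it assumes color_agg is the only key in the example above whose value is not a column name.

```python
def find_missing_columns(configs, columns, non_column_keys=("color_agg",)):
    """Return {config key: value} for entries that name an unknown column."""
    known = set(columns)
    return {
        key: value
        for key, value in configs.items()
        if key not in non_column_keys and value not in known
    }


vis_configs = {
    "x": "start_time",
    "y": "queue_wait",
    "color": "nodes_requested",
    "color_agg": "avg",
    "categorical": "user",
    "facet_by": "partition",
}

# In a notebook this would be list(jobs_data.columns).
columns = [
    "job_id", "start_time", "queue_wait", "nodes_requested",
    "partition", "status", "user",
]

# Empty dict: every mapped column exists.
missing = find_missing_columns(vis_configs, columns)

# A deliberately misspelled column name is reported under its config key.
typo = find_missing_columns({**vis_configs, "y": "queue_wiat"}, columns)

print(missing)
print(typo)
```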

4. Display the visualization

In a notebook cell, evaluate the widget:

gp

This renders one panel of charts per facet_by group. To learn what each panel contains and how the panels link together, read Understanding the Views.

5. Retrieve your selection

Brush over the heatmap or its framing histograms (see Selecting and Exporting Data) to select records, then pull them back into Python:

# Either of these returns a pandas DataFrame of the selected rows:
df = gp.retrieve_selected_data()
df = gp.selection.dataframe

Both return the selected rows from your original dataset. gp.selection returns a small Selection wrapper object whose .dataframe attribute holds the DataFrame.
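
Because the result is an ordinary pandas DataFrame, the usual pandas tools apply. The sketch below uses a hand-built stand-in for the selection (the third row is invented) so that it runs without the widget; in a notebook you would assign selected = gp.retrieve_selected_data() instead.

```python
import pandas as pd

# Stand-in for the DataFrame returned by gp.retrieve_selected_data().
selected = pd.DataFrame(
    {
        "user": ["User1", "User2", "User1"],
        "partition": ["short", "long", "short"],
        "queue_wait": [5.2, 12.0, 6.8],
    }
)

# Number of selected jobs and mean queue wait per user.
per_user = selected.groupby("user")["queue_wait"].agg(["count", "mean"])
print(per_user)
```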


Next: Data Requirements and Type Detection · Configuration · Understanding the Views
