Getting Started
This page walks through installing Guidepost and building your first visualization in five steps. For details on what data is accepted see Data Requirements and Type Detection; for the full encoding options see Configuration.
Guidepost is on PyPI:
pip install guidepost
It is designed for Jupyter notebooks (classic, JupyterLab, or VS Code notebooks); the visualization renders inline in a notebook output cell via anywidget.
from guidepost import Guidepost
gp = Guidepost()
Guidepost takes a pandas DataFrame. You need at least three numeric and two categorical columns; datetime columns are also supported (typically on the x-axis).
import pandas as pd
jobs_data = pd.read_parquet("data/jobs_data.parquet")
gp.load_data(jobs_data)
A representative HPC scheduling dataset:
| job_id | start_time | queue_wait | nodes_requested | partition | status | user |
|---|---|---|---|---|---|---|
| 12345 | 2023-11-01 21:19:33 | 5.2 | 10 | short | Complete | User1 |
| 12346 | 2023-11-01 21:20:01 | 12.0 | 20 | long | Running | User2 |
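As a quick sanity check before loading, the two sample rows above can be rebuilt as a DataFrame and its column types counted against the stated minimum (three numeric, two categorical). This is a minimal sketch in plain pandas; the helper name `count_column_kinds` is hypothetical, and Guidepost's own type detection may classify columns differently.

```python
import pandas as pd
from pandas.api import types as ptypes

# The two sample rows from the table above.
jobs_data = pd.DataFrame({
    "job_id": [12345, 12346],
    "start_time": pd.to_datetime(["2023-11-01 21:19:33", "2023-11-01 21:20:01"]),
    "queue_wait": [5.2, 12.0],
    "nodes_requested": [10, 20],
    "partition": ["short", "long"],
    "status": ["Complete", "Running"],
    "user": ["User1", "User2"],
})

def count_column_kinds(df):
    """Hypothetical helper: count datetime, numeric, and remaining columns."""
    counts = {"numeric": 0, "datetime": 0, "categorical": 0}
    for name in df.columns:
        if ptypes.is_datetime64_any_dtype(df[name]):
            counts["datetime"] += 1
        elif ptypes.is_numeric_dtype(df[name]):
            counts["numeric"] += 1
        else:
            # Assumption: anything else (e.g. strings) is treated as categorical.
            counts["categorical"] += 1
    return counts

counts = count_column_kinds(jobs_data)
print(counts)
```

Note that an identifier column such as job_id counts as numeric here even though it is rarely useful on an axis.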
load_data() formats and validates the DataFrame and reports any columns or rows it drops or converts: for example all-NaN columns, timedelta columns (converted to seconds), or array-valued cells. It also refreshes the visualization if it is already displayed. See Data Requirements and Type Detection for the full set of rules; set gp.suppress_warnings = True before loading to quiet the report.
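To anticipate what the report might mention, two of the cases named above can be checked in plain pandas before calling load_data(). The sketch below is an assumption about what such cleanup looks like, not Guidepost's implementation; `preview_cleanup` is a hypothetical helper and the sample columns are made up for illustration.

```python
import pandas as pd

def preview_cleanup(df):
    """Hypothetical pre-check: drop all-NaN columns, convert timedeltas to seconds."""
    # Columns in which every value is missing.
    all_nan = [c for c in df.columns if df[c].isna().all()]
    cleaned = df.drop(columns=all_nan)
    # Express timedelta columns as float seconds.
    for c in cleaned.select_dtypes(include="timedelta").columns:
        cleaned[c] = cleaned[c].dt.total_seconds()
    return cleaned, all_nan

# Illustrative input: one timedelta column and one all-NaN column.
raw = pd.DataFrame({
    "queue_wait": pd.to_timedelta([5, 12], unit="s"),
    "nodes_requested": [10, 20],
    "notes": [float("nan"), float("nan")],
})

cleaned, dropped = preview_cleanup(raw)
print(dropped)
print(cleaned.dtypes)
```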
You can also load in one line at construction time:
gp = Guidepost(records=jobs_data)
Tell Guidepost which columns map to which part of the chart:
gp.vis_configs = {
    'x': 'start_time',           # x-axis (numeric or datetime)
    'y': 'queue_wait',           # y-axis (numeric)
    'color': 'nodes_requested',  # cell color (numeric)
    'color_agg': 'avg',          # how color is aggregated per cell
    'categorical': 'user',       # bar chart / filter
    'facet_by': 'partition'      # splits the data into groups
}
Every field is explained in Configuration. You can also change these mappings from the dropdown menus inside the widget, so you don't have to set vis_configs from Python at all.
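A mistyped column name in this dictionary is easy to miss, so it can help to check the mapping against the DataFrame before assigning it. The following is a rough sketch under stated assumptions: `check_vis_configs` is a hypothetical helper, not part of Guidepost, and the numeric requirement for 'y' and 'color' is taken from the comments in the configuration above.

```python
import pandas as pd
from pandas.api import types as ptypes

def check_vis_configs(df, configs):
    """Hypothetical helper: list problems found in a vis_configs-style dict."""
    problems = []
    # Keys whose values name columns; 'color_agg' names an aggregation instead.
    for key in ("x", "y", "color", "categorical", "facet_by"):
        col = configs.get(key)
        if col not in df.columns:
            problems.append(f"{key}: column {col!r} not found")
    # 'y' and 'color' are described as numeric in the configuration comments.
    for key in ("y", "color"):
        col = configs.get(key)
        if col in df.columns and not ptypes.is_numeric_dtype(df[col]):
            problems.append(f"{key}: column {col!r} is not numeric")
    return problems

df = pd.DataFrame({
    "start_time": pd.to_datetime(["2023-11-01 21:19:33", "2023-11-01 21:20:01"]),
    "queue_wait": [5.2, 12.0],
    "nodes_requested": [10, 20],
    "partition": ["short", "long"],
    "status": ["Complete", "Running"],
    "user": ["User1", "User2"],
})

good = {"x": "start_time", "y": "queue_wait", "color": "nodes_requested",
        "color_agg": "avg", "categorical": "user", "facet_by": "partition"}
bad = dict(good, y="status", facet_by="queue")

good_problems = check_vis_configs(df, good)
bad_problems = check_vis_configs(df, bad)
print(good_problems)
print(bad_problems)
```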
In a notebook cell, evaluate the widget:
gp
This renders one panel of charts per facet_by group. To learn what each panel contains and how the panels link together, read Understanding the Views.
Brush over the heatmap or its framing histograms (see Selecting and Exporting Data) to select records, then pull them back into Python:
# Either of these returns a pandas DataFrame of the selected rows:
df = gp.retrieve_selected_data()
df = gp.selection.dataframe
Both return the selected rows from your original dataset. gp.selection returns a small Selection wrapper object whose .dataframe attribute holds the DataFrame.
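Once the selection is a DataFrame, ordinary pandas applies. The sketch below uses a hand-built stand-in for the selected rows (a real selection requires brushing in the widget) and summarizes queue wait per user. The column names follow the sample dataset; the values are illustrative only.

```python
import pandas as pd

# Stand-in for the DataFrame returned by gp.retrieve_selected_data().
selected = pd.DataFrame({
    "user": ["User1", "User2", "User1"],
    "queue_wait": [5.0, 12.0, 7.0],
    "nodes_requested": [10, 20, 30],
})

# Number of selected jobs and mean queue wait per user, longest waits first.
summary = (
    selected.groupby("user")["queue_wait"]
    .agg(["count", "mean"])
    .sort_values("mean", ascending=False)
)
print(summary)
```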
Next: Data Requirements and Type Detection · Configuration · Understanding the Views