107 changes: 107 additions & 0 deletions AGENTS.md
@@ -0,0 +1,107 @@
# Querybook

Querybook is Pinterest's open-source Big Data IDE for discovering, creating, and sharing data analyses. It combines a rich-text editor, SQL query engine, charting, scheduling, and table documentation in a single web app.

## Tech Stack

- **Backend:** Python 3.10, Flask, SQLAlchemy (MySQL), Celery (Redis broker), Elasticsearch/OpenSearch, gevent + Flask-SocketIO (WebSockets), uWSGI (production)
- **Frontend:** React 17, TypeScript, Redux, Webpack 5, CodeMirror (SQL editor), Draft.js (rich text), Chart.js/D3/ReactFlow

## Directory Layout

- `querybook/server/` — Flask backend
    - `app/` — app setup
    - `datasources/` — REST API endpoints
    - `logic/` — business logic
    - `models/` — SQLAlchemy models
    - `tasks/` — Celery tasks
    - `lib/` — utilities, executors, metastores
    - `env.py` — `QuerybookSettings` configuration
- `querybook/webapp/` — React/TypeScript frontend
    - `components/` — React components
    - `hooks/` — custom React hooks
    - `redux/` — Redux store, actions, reducers
    - `lib/` — frontend utilities
    - `ui/` — reusable UI primitives
    - `resource/` — API client layer
- `querybook/config/` — YAML config files
- `plugins/` — plugin stubs (extension point for custom behavior)
- `requirements/` — pip requirements (`base.txt`, `prod.txt`, `engine/*.txt`, `auth/*.txt`)
- `containers/` — Docker Compose files (dev, prod, test)
- `docs_website/` — Docusaurus documentation site
- `helm/` / `k8s/` — Kubernetes deployment manifests

## Plugin System

Querybook can be extended through plugins without forking the repository. The environment variable `QUERYBOOK_PLUGIN` (default `./plugins`) points to a directory whose plugin modules are discovered by `lib.utils.import_helper.import_module_with_default()`.

Each plugin module exports a well-known variable (e.g. `ALL_PLUGIN_EXECUTORS`) that the server merges with built-in defaults.

Key plugin types: `executor_plugin`, `metastore_plugin`, `auth_plugin`, `api_plugin`, `exporter_plugin`, `result_store_plugin`, `notifier_plugin`, `event_logger_plugin`, `stats_logger_plugin`, `job_plugin`, `tasks_plugin`, `dag_exporter_plugin`, `ai_assistant_plugin`, `vector_store_plugin`, `webpage_plugin`, `monkey_patch_plugin`, `query_validation_plugin`, `query_transpilation_plugin`, `engine_status_checker_plugin`, `table_uploader_plugin`.

## Configuration

Priority (highest to lowest): **env vars > `querybook_config.yaml` > `querybook_default_config.yaml`**.

Key settings live in `querybook/server/env.py` (`QuerybookSettings`).

## Running Locally

Start the full stack (web server, worker, scheduler, and all dependencies) with Docker Compose:

```bash
make
```

This brings up everything and serves the app at http://localhost:10001. This is the primary command for local development.

To restart individual services without bouncing the full stack:

```bash
make web # web server only
make worker # celery worker
make scheduler # celery beat
```

## Making Commits

When preparing a PR, run the relevant checks locally. CI runs all of the following via GitHub Actions (`.github/workflows/`), but the workflow run must be manually triggered by a maintainer.

Always run tests via `make test`, which builds a `querybook-test` Docker image and runs checks inside it. This ensures an isolated, reproducible environment. Do not run test commands (pytest, yarn, webpack) directly on the host.

`make test` runs both backend and frontend checks:
- **Backend** (anything under `querybook/server/`): pytest
- **Frontend** (anything under `querybook/webapp/`): TypeScript type checking, Jest unit tests, ESLint, and production build verification

**Formatting (all changes) — common CI failure:**

`make test` does **not** run Prettier. CI runs Prettier separately via `pre-commit`, so formatting issues are a frequent cause of CI failures. After running `make test`, also run Prettier on changed files before pushing:

```bash
npx prettier --write <files>
```

For a full formatting and lint pass (Black for Python, Prettier for JS/TS, flake8 for Python linting):

```bash
pre-commit run --all-files
```

## Maintaining This File

**Include:**
- Repo purpose, tech stack, and high-level architecture
- Directory layout (key paths only)
- How to run, test, and lint locally
- Commit and PR workflow expectations
- Plugin system overview and extension points

**Do not include:**
- Detailed API docs or function-level documentation
- Inline code examples longer than 5 lines
- Deployment runbooks or operational procedures (keep in README or docs/)
- Credentials, secrets, or internal URLs
- Information that changes frequently (version numbers, dependency lists)
- Content already covered in README.md
- Content that can be easily derived by AI agents (e.g. reading file trees, package.json)
- References to internal/proprietary repos — this is an open-source project
1 change: 1 addition & 0 deletions CLAUDE.md
@@ -0,0 +1 @@
@AGENTS.md
2 changes: 1 addition & 1 deletion README.md
@@ -106,4 +106,4 @@ Lineage & Analytics

# Contributing Back

See [CONTRIBUTING](CONTRIBUTING.md).
See [CONTRIBUTING](CONTRIBUTING.md) for the full guide, including [how to run tests locally](docs_website/docs/developer_guide/contributing.mdx#testing).
31 changes: 31 additions & 0 deletions docs_website/docs/developer_guide/contributing.mdx
@@ -65,6 +65,37 @@ To increase the chances that your pull request will be accepted:
- Write tests for your changes
- Write a good commit message

### Testing

CI runs these checks on every pull request via GitHub Actions (a maintainer may need to approve the workflow run for external contributors). You should reproduce them locally before pushing.

The simplest way to run the full test suite locally is:

```sh
make test
```

This builds a Docker test image and runs both Python and Node tests in parallel, mirroring CI.

If you only changed one side and want a faster feedback loop:

```sh
# Backend only (changes under querybook/server/)
PYTHONPATH=querybook/server:plugins ./querybook/scripts/run_test --python

# Frontend only (changes under querybook/webapp/)
./querybook/scripts/run_test --node
```

**Linting (all changes):**

Pre-commit hooks (Black, Prettier, flake8) run automatically on `git commit` if installed, and in CI on every PR. To set them up locally:

```sh
pip install pre-commit
pre-commit install
```

### Pull Request Format

When you create a new pull request, please make sure its title follows the specification at https://www.conventionalcommits.org/en/v1.0.0/. This ensures automatic versioning.
@@ -0,0 +1,37 @@
import { addRunInputSnapshot } from 'hooks/queryEditor/useExecutionSnapshots';

describe('addRunInputSnapshot', () => {
test('adds a snapshot to an empty record', () => {
const result = addRunInputSnapshot({}, 1, 'SELECT 1');
expect(result).toEqual({ 1: 'SELECT 1' });
});

test('adds a new execution to existing snapshots', () => {
const prev = { 1: 'SELECT 1' };
const result = addRunInputSnapshot(prev, 2, 'SELECT 2');
expect(result).toEqual({ 1: 'SELECT 1', 2: 'SELECT 2' });
});

test('overwrites an existing execution snapshot', () => {
const prev = { 1: 'SELECT old' };
const result = addRunInputSnapshot(prev, 1, 'SELECT new');
expect(result).toEqual({ 1: 'SELECT new' });
});

test('does not mutate the original record', () => {
const prev = { 1: 'SELECT 1' };
const result = addRunInputSnapshot(prev, 2, 'SELECT 2');
expect(prev).toEqual({ 1: 'SELECT 1' });
expect(result).not.toBe(prev);
});

test('handles many snapshots without pruning', () => {
let record: Record<number, string> = {};
for (let i = 0; i < 50; i++) {
record = addRunInputSnapshot(record, i, `SELECT ${i}`);
}
expect(Object.keys(record)).toHaveLength(50);
expect(record[0]).toBe('SELECT 0');
expect(record[49]).toBe('SELECT 49');
});
});
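
The implementation of `addRunInputSnapshot` is not shown in this diff. As a reading aid, here is a minimal sketch that would satisfy the tests above, assuming the function is a pure helper that returns a new record keyed by execution id. The type alias `RunInputSnapshots` and the parameter names `executionId` and `query` are hypothetical, and the real code in `hooks/queryEditor/useExecutionSnapshots` may differ.

```typescript
// Hypothetical sketch only: not the actual Querybook implementation.
// Maps an execution id to the query text that was submitted for that run.
type RunInputSnapshots = Record<number, string>;

function addRunInputSnapshot(
    prev: RunInputSnapshots,
    executionId: number,
    query: string
): RunInputSnapshots {
    // Copy into a fresh object so the caller's record is never mutated.
    // An existing entry for the same executionId is overwritten, and
    // nothing is pruned, matching the behaviors the tests exercise.
    return { ...prev, [executionId]: query };
}
```

Returning a fresh object on every call is what makes the `not.toBe(prev)` assertion hold, and it also suits React state updates, where a changed reference signals that the value changed.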