File uploads

The fastest path from "I have a CSV" to "I'm querying it."

Drag a file anywhere on the app → it lands as a queryable DuckDB table. Drop customers.csv → SELECT * FROM customers works immediately, no read_csv_auto('…') calls required.

What's supported

CSV · TSV · TXT · JSON · JSONL · NDJSON · Parquet

The connector picks the right DuckDB reader based on file extension:

| Extension | Reader |
| --- | --- |
| `.csv` / `.tsv` / `.txt` | `read_csv_auto` |
| `.json` / `.jsonl` / `.ndjson` | `read_json_auto` |
| `.parquet` | `read_parquet` |
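
As a rough illustration of the extension lookup in the table above, here is a minimal sketch. The names `READERS` and `reader_for` are hypothetical, and the case-insensitive match is an assumption; the real connector may differ.

```python
from pathlib import Path

# Mapping copied from the table above. READERS and reader_for are
# hypothetical names used only for this sketch.
READERS = {
    ".csv": "read_csv_auto",
    ".tsv": "read_csv_auto",
    ".txt": "read_csv_auto",
    ".json": "read_json_auto",
    ".jsonl": "read_json_auto",
    ".ndjson": "read_json_auto",
    ".parquet": "read_parquet",
}


def reader_for(filename: str) -> str:
    """Return the DuckDB reader function name for a file, chosen by extension."""
    # Assumption: extensions are matched case-insensitively.
    ext = Path(filename).suffix.lower()
    if ext not in READERS:
        raise ValueError(f"unsupported file type: {filename!r}")
    return READERS[ext]
```

With this sketch, `reader_for("customers.csv")` returns `"read_csv_auto"`, while an `.xlsx` file raises `ValueError`.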

Excel (.xlsx) is not supported yet — DuckDB needs its excel extension autoloaded. See Roadmap.

How it works under the hood

You drop a file. The frontend POSTs it to /api/files/upload as multipart form data.
The backend (rednotebook/uploads/store.py) saves it under local_data/uploads/<user-id>/<uuid>.<ext> and adds a manifest entry.
The table name is sanitised: lowercase, alnum + underscore, leading digits prefixed, collisions resolved (customers_2, customers_3).
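On every subsequent query, the DuckDB connector reads the manifest and, for each file, emits CREATE OR REPLACE VIEW <table_name> AS SELECT * FROM <reader>('/abs/path') before running the user's SQL, where <reader> is the function chosen by file extension (read_csv_auto, read_json_auto or read_parquet).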
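
The table-name rules in the third step could look roughly like the sketch below. The function name `sanitise_table_name`, the `t_` prefix for leading digits, and the `file` fallback for empty names are assumptions made for illustration; only the lowercase / alnum + underscore / collision-suffix behaviour comes from the description above.

```python
import re
from pathlib import Path


def sanitise_table_name(filename: str, taken: set[str]) -> str:
    """Derive a table name from a filename, avoiding names already in `taken`."""
    stem = Path(filename).stem.lower()
    # Keep only lowercase letters, digits and underscores.
    name = re.sub(r"[^a-z0-9_]", "_", stem)
    if not name:
        name = "file"  # assumption: fallback for an empty stem
    if name[0].isdigit():
        name = f"t_{name}"  # assumption: the actual prefix is not documented
    # Resolve collisions as name_2, name_3, ...
    candidate, n = name, 2
    while candidate in taken:
        candidate = f"{name}_{n}"
        n += 1
    return candidate
```

For example, uploading a second `customers.csv` when `customers` is already taken yields `customers_2`.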

The views are session-local to each query, so:

They're always fresh.
A renamed / deleted file disappears on the next query without a reconnect.
The connector layer never holds a long-lived connection.
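
To make the per-query registration concrete, the sketch below builds the view statements from a manifest. It only generates SQL strings; the manifest shape (a list of dicts with `table_name` and `path` keys) and the function name `view_statements` are assumptions, not the actual schema in rednotebook/uploads/store.py.

```python
import os

# Reader lookup by extension, as in the "What's supported" table.
READERS = {
    ".csv": "read_csv_auto", ".tsv": "read_csv_auto", ".txt": "read_csv_auto",
    ".json": "read_json_auto", ".jsonl": "read_json_auto", ".ndjson": "read_json_auto",
    ".parquet": "read_parquet",
}


def view_statements(manifest: list[dict]) -> list[str]:
    """Build one CREATE OR REPLACE VIEW statement per manifest entry."""
    statements = []
    for entry in manifest:
        path = entry["path"]
        # Assumes upload validation already rejected unsupported extensions.
        reader = READERS[os.path.splitext(path)[1].lower()]
        # Escape quotes so the identifier and the path literal stay well-formed.
        table = entry["table_name"].replace('"', '""')
        literal = path.replace("'", "''")
        statements.append(
            f'CREATE OR REPLACE VIEW "{table}" AS '
            f"SELECT * FROM {reader}('{literal}')"
        )
    return statements
```

Because the statements are rebuilt from the manifest before every query, a file that has been removed from the manifest simply produces no view the next time around.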

Limits

200 MB per file. The upload streams in 1 MiB chunks, so the server does not buffer a second full copy of the file in memory.
Per-user. The manifest is scoped to the request's user; one user's uploads are not visible to another.
DuckDB connections only. Postgres / Snowflake / etc. don't auto-register the views — those engines can't query a local file the way DuckDB can.
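
The size limit and chunked streaming could be enforced along these lines. This is a sketch under stated assumptions: `stream_to_disk` and `FileTooLarge` are hypothetical names, "200 MB" is read as 200 MiB, and deleting the partial file on overflow is a design choice of this example rather than documented behaviour.

```python
import os
from typing import BinaryIO

CHUNK_SIZE = 1024 * 1024        # 1 MiB, as stated above
MAX_BYTES = 200 * 1024 * 1024   # assumption: "200 MB" interpreted as 200 MiB


class FileTooLarge(Exception):
    """Raised when an upload exceeds the per-file limit."""


def stream_to_disk(src: BinaryIO, dest_path: str,
                   max_bytes: int = MAX_BYTES,
                   chunk_size: int = CHUNK_SIZE) -> int:
    """Copy src to dest_path chunk by chunk; return the number of bytes written."""
    written = 0
    try:
        with open(dest_path, "wb") as dest:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                written += len(chunk)
                if written > max_bytes:
                    raise FileTooLarge(f"upload exceeds {max_bytes} bytes")
                dest.write(chunk)
    except FileTooLarge:
        # Assumption: a partial file is discarded rather than kept.
        os.remove(dest_path)
        raise
    return written
```

Only one chunk is held in memory at a time, so peak memory stays near `chunk_size` regardless of the file size.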

Renaming and removing

The Files panel in the left sidebar lists every uploaded file with its current table name + original filename + size.
Hover a row → click the trash icon to delete.
A rename endpoint exists (PATCH /api/files/<id>) — UI for it is on the Issue #X follow-up list.

Example flow

Drag orders.csv onto the canvas. Toast confirms Ready: `orders`.

In a SQL cell:

SELECT
  date_trunc('week', order_date)::date AS week,
  region,
  COUNT(*)                              AS orders,
  ROUND(SUM(revenue), 2)                AS revenue
FROM orders
GROUP BY 1, 2
ORDER BY 1, 2

Hit Run. Result populates. Click Profile for distributions.
Click Summarize result — AI brief grounded in the actual rows.
Click Publish to share the notebook + result snapshot publicly.

That's the analyst-favourite five-minute loop.
