basiklabs/tensek-api

Tensek

Voice-to-document API — Turn voice or text into structured drafts (invoices, quotes), render PDFs, and share via link. Built for handymen, artisans, and field workers who want to capture work and send professional documents from their phone.


Overview

Tensek (this repo) is the Go backend that powers:

  • Speech-to-text — Transcribe audio with OpenAI Whisper
  • Structured extraction — LLM turns transcript into invoice/quote JSON (client, line items, totals)
  • PDF generation — Render documents with your logo and branding
  • Storage & share — Upload to Supabase Storage (or S3-compatible), optional short links for sharing (e.g. WhatsApp, iMessage)

The API is stateless where possible; optional Postgres + Supabase give you persistence, documents, and short-link redirects.
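To make the structured-extraction step more concrete, here is a minimal sketch of what an invoice draft could look like in Go. The type and field names (`Invoice`, `LineItem`, `unit_price`, and so on) are illustrative assumptions, not the project's actual schema; the real types live in pkg/schema and may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LineItem is a hypothetical billed entry; the real type lives in pkg/schema.
type LineItem struct {
	Description string  `json:"description"`
	Quantity    float64 `json:"quantity"`
	UnitPrice   float64 `json:"unit_price"`
}

// Invoice is a hypothetical draft shape: client, line items, total.
type Invoice struct {
	Type      string     `json:"type"`
	Client    string     `json:"client"`
	LineItems []LineItem `json:"line_items"`
	Total     float64    `json:"total"`
}

// ComputeTotal sums quantity * unit price over all line items.
func ComputeTotal(items []LineItem) float64 {
	var total float64
	for _, it := range items {
		total += it.Quantity * it.UnitPrice
	}
	return total
}

func main() {
	inv := Invoice{
		Type:   "invoice",
		Client: "Acme Corp",
		LineItems: []LineItem{
			{Description: "Plumbing (hours)", Quantity: 2, UnitPrice: 80},
		},
	}
	inv.Total = ComputeTotal(inv.LineItems)

	out, err := json.MarshalIndent(inv, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Recomputing the total from the line items, rather than trusting a number produced by the LLM, is one way to catch extraction mistakes. The sketch uses float64 for brevity; integer minor units (cents) are a common alternative for money.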


Features

Feature       Description
Transcribe    POST /api/v1/voice/transcribe — Audio (base64 or multipart) → text
Drafts        POST /api/v1/drafts — Voice or text → structured invoice/quote JSON
Render        POST /api/v1/documents/render — Document JSON → PDF, upload, return URL
Short links   Optional short codes; frontend redirects /l/:code → storage URL
Invoices      List, get, mark paid, edit (voice/text) with LLM
Auth          JWT (Supabase or custom secret); optional DISABLE_AUTH for local dev
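The short-link feature can be pictured as a code-to-URL lookup followed by a redirect. The sketch below is a standard-library illustration only: the in-memory map stands in for the Postgres store, and `NewCode`, `LinkHandler`, `Resolve`, the code length, the alphabet, and the 302 status are all assumptions rather than the project's actual implementation (which uses Gin).

```go
package main

import (
	"crypto/rand"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"

// NewCode returns a random code of length n drawn from alphabet.
// The modulo introduces a slight bias, which is acceptable for a sketch.
func NewCode(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	for i, b := range buf {
		buf[i] = alphabet[int(b)%len(alphabet)]
	}
	return string(buf), nil
}

// LinkHandler redirects /api/v1/links/<code> to the stored URL, or returns 404.
func LinkHandler(links map[string]string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		code := strings.TrimPrefix(r.URL.Path, "/api/v1/links/")
		target, ok := links[code]
		if !ok {
			http.NotFound(w, r)
			return
		}
		http.Redirect(w, r, target, http.StatusFound)
	})
}

// Resolve runs the handler in-process and reports the status and Location header.
func Resolve(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	code, err := NewCode(6)
	if err != nil {
		panic(err)
	}
	// Hypothetical storage URL for a rendered PDF.
	links := map[string]string{code: "https://storage.example.com/invoices/123.pdf"}

	status, location := Resolve(LinkHandler(links), "/api/v1/links/"+code)
	fmt.Println(status, location)
}
```

A persistent store would also need to handle code collisions on insert, for example with a unique constraint and a retry.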

Tech stack

  • Go 1.22+ — HTTP server, business logic
  • Gin — Router, middleware
  • OpenAI — Whisper (STT), GPT-4o-mini (extraction, edits)
  • Postgres — Optional (Supabase Postgres or any DATABASE_URL)
  • Supabase — Optional storage bucket + auth
  • gofpdf — PDF generation (invoice layout, logo, font)

Prerequisites

  • Go 1.22+
  • OpenAI API key (required) — for STT and extraction
  • Supabase (optional) — for storage, auth, and DB
  • PNG logo (optional) — e.g. tensek.png at project root for PDF header
  • TTF font (optional) — e.g. LibreFranklin-Regular.ttf for PDF body text

Quick start

1. Clone and install

git clone https://github.com/basiklabs/tensek-api.git
cd tensek-api
go mod download

2. Environment

Copy the example env and set at least your OpenAI key:

cp .env.example .env
# Edit .env: set OPENAI_API_KEY=sk-...

For local dev without auth:

DISABLE_AUTH=true
OPENAI_API_KEY=sk-...
# Optional: DATABASE_URL, SUPABASE_* for storage and DB

3. Run

go run ./cmd/server

The server listens on :8080 by default (set PORT to change it). Health check: http://localhost:8080/health.

4. Try the API

# Transcribe audio (body: base64 audio or multipart)
curl -X POST http://localhost:8080/api/v1/voice/transcribe \
  -H "Content-Type: application/json" \
  -d '{"audio":"<base64>"}'

# Create draft from text (no auth when DISABLE_AUTH=true)
curl -X POST http://localhost:8080/api/v1/drafts \
  -H "Content-Type: application/json" \
  -d '{"type":"invoice","text":"Client: Acme Corp. Two hours plumbing, 80 per hour. Total 160."}'

You can also import the Postman collection to explore all endpoints: postman/Whisper-API.postman_collection.json (in Postman: Import → upload that file).
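The same two calls can be made from Go. The sketch below builds the request bodies with the standard library; the helper names (`TranscribeBody`, `NewDraftRequest`) are hypothetical, and sending the JWT as an `Authorization: Bearer` header is an assumption, since this README does not say how the token is passed.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// TranscribeBody builds the JSON payload for POST /api/v1/voice/transcribe.
func TranscribeBody(audio []byte) ([]byte, error) {
	return json.Marshal(map[string]string{
		"audio": base64.StdEncoding.EncodeToString(audio),
	})
}

// NewDraftRequest builds a POST to /api/v1/drafts.
// token may be empty when the server runs with DISABLE_AUTH=true.
func NewDraftRequest(baseURL, docType, text, token string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"type": docType, "text": text})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/drafts", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if token != "" {
		// Assumption: the API expects a bearer token header.
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	req, err := NewDraftRequest("http://localhost:8080", "invoice",
		"Client: Acme Corp. Two hours plumbing, 80 per hour. Total 160.", "")
	if err != nil {
		panic(err)
	}

	client := &http.Client{Timeout: 30 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("request failed (is the server running?):", err)
		return
	}
	defer resp.Body.Close()

	out, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Status)
	fmt.Println(string(out))
}
```

The response is printed as raw JSON because the exact response shape is defined in pkg/schema, not here.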


Configuration

Variable              Required   Description
PORT                  No         Server port (default 8080)
OPENAI_API_KEY        Yes        OpenAI API key for Whisper + GPT
CORS_ALLOW_ORIGINS    No         Comma-separated origins or * (default *)
DISABLE_AUTH          No         true = skip JWT (testing only)
SUPABASE_URL          No*        Supabase project URL
SUPABASE_KEY          No*        Supabase anon or service key
SUPABASE_JWT_SECRET   No*        JWT secret for verifying access tokens
DATABASE_URL          No         Postgres connection string (Supabase or other)
FONT_PATH             No         Path to TTF font (default LibreFranklin-Regular.ttf)
LOGO_PATH             No         Path to PNG logo (default tensek.png)
FRONTEND_BASE_URL     No         Base URL for short links (e.g. https://app.example.com)
SMTP_*                No         Optional email sending (host, port, user, password, from)

* Auth needs either SUPABASE_JWT_SECRET or both SUPABASE_URL + SUPABASE_KEY unless DISABLE_AUTH=true.
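The defaults and the auth rule above can be expressed as a small loader. This is a sketch under stated assumptions, not the code in internal/config: the `Config` struct, the `Load` function, and the error messages are hypothetical, and only a subset of the variables is covered.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// Config mirrors a subset of the documented variables (hypothetical struct).
type Config struct {
	Port        string
	OpenAIKey   string
	CORSOrigins []string
	DisableAuth bool
	FontPath    string
	LogoPath    string
}

// Load reads settings through lookup; pass os.LookupEnv outside of tests.
func Load(lookup func(string) (string, bool)) (Config, error) {
	get := func(key, def string) string {
		if v, ok := lookup(key); ok && v != "" {
			return v
		}
		return def
	}

	cfg := Config{
		Port:        get("PORT", "8080"),
		OpenAIKey:   get("OPENAI_API_KEY", ""),
		DisableAuth: get("DISABLE_AUTH", "") == "true",
		FontPath:    get("FONT_PATH", "LibreFranklin-Regular.ttf"),
		LogoPath:    get("LOGO_PATH", "tensek.png"),
	}
	// CORS_ALLOW_ORIGINS is a comma-separated list; the default is "*".
	for _, origin := range strings.Split(get("CORS_ALLOW_ORIGINS", "*"), ",") {
		if origin = strings.TrimSpace(origin); origin != "" {
			cfg.CORSOrigins = append(cfg.CORSOrigins, origin)
		}
	}

	if cfg.OpenAIKey == "" {
		return Config{}, errors.New("OPENAI_API_KEY is required")
	}
	// Auth needs a JWT secret, or both Supabase URL and key, unless disabled.
	hasSecret := get("SUPABASE_JWT_SECRET", "") != ""
	hasProject := get("SUPABASE_URL", "") != "" && get("SUPABASE_KEY", "") != ""
	if !cfg.DisableAuth && !hasSecret && !hasProject {
		return Config{}, errors.New("set SUPABASE_JWT_SECRET, or SUPABASE_URL and SUPABASE_KEY, or DISABLE_AUTH=true")
	}
	return cfg, nil
}

func main() {
	cfg, err := Load(os.LookupEnv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("would listen on :" + cfg.Port)
}
```

Passing the lookup function in, instead of calling os.Getenv directly, keeps the loader easy to test with a plain map.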


Project layout

whisper/
├── cmd/server/          # Entrypoint
├── internal/
│   ├── config/          # Env config
│   ├── extract/         # LLM extraction (invoice/quote)
│   ├── handler/         # HTTP handlers + auth
│   ├── ingest/          # Request parsing
│   ├── logger/          # Structured logging
│   ├── pdf/             # PDF layout (gofpdf)
│   ├── render/          # Render by type, storage upload
│   ├── send/            # Optional email
│   ├── storage/         # Supabase Storage client
│   ├── store/           # Postgres (documents, short links)
│   └── stt/             # OpenAI Whisper client
├── pkg/schema/          # Shared types (invoice, document types)
├── k8s/                 # Kubernetes deployment
├── .github/workflows/   # CI (e.g. build + deploy)
├── docs/                # Prompts and integration notes
├── Dockerfile
├── .env.example
└── README.md

Development

Run locally

go run ./cmd/server

Or use air for live reload.

Tests

go test ./...

Docker build

docker build -t whisper .
docker run -p 8080:8080 -e OPENAI_API_KEY=sk-... -e DISABLE_AUTH=true whisper

The Dockerfile expects tensek.png and LibreFranklin-Regular.ttf in the build context (project root). Omit or replace for your own branding.


API overview

Method   Path                        Auth   Description
GET      /health                     No     Health check
GET      /api/v1/links/:code         No     Resolve short link → redirect
POST     /api/v1/voice/transcribe    Yes*   Audio → text
POST     /api/v1/drafts              Yes*   Voice/text → structured document
POST     /api/v1/documents/render    Yes*   Document → PDF, upload, URL
GET      /api/v1/invoices/stats      Yes*   Invoice stats
GET      /api/v1/invoices            Yes*   List invoices
GET      /api/v1/invoices/:id        Yes*   Get invoice
POST     /api/v1/invoices/:id/paid   Yes*   Mark paid
POST     /api/v1/invoices/:id/edit   Yes*   Edit invoice (voice/text)

* Not required when DISABLE_AUTH=true.
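As an illustration of the auth rule, the sketch below verifies an HS256-signed token against a shared secret and skips the check when auth is disabled. It is a standard-library approximation only: it assumes HS256 and a bearer header, checks nothing but the signature (no exp, aud, or alg validation), and every name in it (`RequireAuth`, `ValidSignature`, `MakeToken`, `StatusFor`) is hypothetical. The actual middleware lives in internal/handler and most likely relies on Gin and a JWT library.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// sign returns the base64url-encoded HMAC-SHA256 of msg under secret.
func sign(msg, secret string) string {
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(msg))
	return base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
}

// MakeToken builds an HS256 token from a raw JSON claims string (demo helper).
func MakeToken(claimsJSON, secret string) string {
	enc := base64.RawURLEncoding.EncodeToString
	msg := enc([]byte(`{"alg":"HS256","typ":"JWT"}`)) + "." + enc([]byte(claimsJSON))
	return msg + "." + sign(msg, secret)
}

// ValidSignature checks only the HS256 signature, not the claims.
func ValidSignature(token, secret string) bool {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return false
	}
	want := sign(parts[0]+"."+parts[1], secret)
	return hmac.Equal([]byte(parts[2]), []byte(want))
}

// RequireAuth wraps next and skips verification when disableAuth is true.
func RequireAuth(secret string, disableAuth bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !disableAuth {
			token, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
			if !ok || !ValidSignature(token, secret) {
				http.Error(w, "unauthorized", http.StatusUnauthorized)
				return
			}
		}
		next.ServeHTTP(w, r)
	})
}

// okHandler stands in for a protected endpoint.
var okHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
})

// StatusFor sends an in-process GET with the given token and returns the status code.
func StatusFor(h http.Handler, token string) int {
	req := httptest.NewRequest(http.MethodGet, "/api/v1/invoices", nil)
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := RequireAuth("dev-secret", false, okHandler)
	fmt.Println(StatusFor(h, MakeToken(`{"sub":"user-1"}`, "dev-secret"))) // 200
	fmt.Println(StatusFor(h, ""))                                          // 401
}
```

A production check would also validate expiry and the expected algorithm, which is why a maintained JWT library is usually the better choice.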

Request/response shapes: see pkg/schema and the handlers in internal/handler. Frontend integration notes: docs/NEXTJS_PROMPT.md.


Deployment

  • Kubernetes — See k8s/deployment.yaml; inject env (e.g. Secret for OPENAI_API_KEY, DATABASE_URL).
  • CI — .github/workflows/kube.yaml shows an example build-and-deploy workflow; set IMAGE_NAME, APP_NAME, DOMAIN, and the secrets for your registry and cluster.

Logo and branding

The Tensek logo at the top of this README is tensek.png in the repo root. Generated PDFs use the same file by default; set LOGO_PATH to point at a different logo. For your own fork:

  • Replace tensek.png with your logo (PNG recommended).
  • Or set LOGO_PATH to another path and ensure it’s available in the container if you deploy with Docker/Kubernetes.

Contributing

Contributions are welcome. Please open an issue or a PR. For large changes, open an issue first to align on direction.

  • Run go test ./... and keep the build green.
  • Follow existing style (Go standard layout, structured logging in internal/logger).

License

This project is open source under the MIT License.
