ops: structured application log aggregation #131

@GitAddRemote

Tech Story

As a solo engineer, I want application logs aggregated into a searchable web UI so that I can investigate silent failures, slow queries, and unexpected behaviour without SSH-ing into the server and running docker logs — especially when I am not at my desk.

ELI5 Context

What is log aggregation?
Your NestJS app writes log lines to stdout. By default, those logs live only inside the Docker container — to read them you must SSH into the VPS and run docker logs station-backend-1. Log aggregation means shipping those log lines to a hosted service where they are stored, indexed, and searchable from a web browser. You can filter by log level, search for a specific request ID, or set up alerts for error spikes — all without SSH.

What is Vector?
Vector is an open-source log and metrics pipeline. It runs as a lightweight sidecar container, watches the Docker socket for new log lines from your backend container, and forwards them to Logtail. No changes to your NestJS app are required for basic log shipping — Vector handles it.

What is nestjs-pino?
Pino is a fast Node.js JSON logger. nestjs-pino wraps it as a NestJS module, replacing the default NestJS logger with one that outputs structured JSON (one JSON object per log line). JSON logs are far easier for Logtail to parse, search, and alert on than free-form text.

What is a request ID?
A UUID generated at the start of each HTTP request and included in every log line produced during that request. If a user reports an error at 2:14 PM, you can search Logtail for that timestamp, find the request ID, then see every log line from that specific request — database queries, auth checks, everything. Without request IDs, correlating log lines to a specific request is guesswork.
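
One possible way to produce such an ID is sketched below: reuse an incoming `x-request-id` header when a proxy has already set one, otherwise generate a fresh UUID. The function name `requestIdFor`, the `RequestLike` shape, and the header name are assumptions for illustration, not part of this issue. pino-http accepts a `genReqId` option where a function along these lines could be plugged in.

```typescript
import { randomUUID } from "node:crypto";

// Minimal shape of the request we care about (hypothetical; a real
// IncomingMessage carries many more fields).
interface RequestLike {
  headers: Record<string, string | string[] | undefined>;
}

// Reuse an upstream request ID if one is present, otherwise mint a new UUID.
function requestIdFor(req: RequestLike): string {
  const incoming = req.headers["x-request-id"];
  const candidate = Array.isArray(incoming) ? incoming[0] : incoming;
  return candidate && candidate.trim() !== "" ? candidate : randomUUID();
}

console.log(requestIdFor({ headers: { "x-request-id": "abc-123" } })); // abc-123
console.log(requestIdFor({ headers: {} })); // a freshly generated UUID
```

Reusing an upstream ID keeps one identifier across the reverse proxy and the backend. If no proxy sets the header, every request simply gets a new UUID.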

What is the 1 GB/month Logtail limit?
Logtail's free tier ingests 1 GB of log data per month. A JSON log line is roughly 200–500 bytes, so to stay under the limit, ship only warn level and above from production (not log/info or debug). At low traffic this keeps you well under 1 GB. You can temporarily lower the level threshold (for example to debug) when you need more detail for debugging.
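
As a rough back-of-the-envelope check of that budget, the sketch below converts the 1 GB limit into a line count. It assumes 1 GB means 10^9 bytes and a 30-day month; the function names are made up for this example.

```typescript
const FREE_TIER_BYTES = 1_000_000_000; // assumption: 1 GB counted as 10^9 bytes

// How many log lines of a given average size fit into the monthly budget.
function maxLinesPerMonth(avgBytesPerLine: number, budgetBytes = FREE_TIER_BYTES): number {
  return Math.floor(budgetBytes / avgBytesPerLine);
}

// The same budget spread evenly over a 30-day month.
function maxLinesPerDay(avgBytesPerLine: number, budgetBytes = FREE_TIER_BYTES): number {
  return Math.floor(maxLinesPerMonth(avgBytesPerLine, budgetBytes) / 30);
}

console.log(maxLinesPerMonth(500)); // 2000000
console.log(maxLinesPerDay(500));   // 66666
console.log(maxLinesPerDay(200));   // 166666
```

At 500 bytes per line the budget is about two million lines a month, which is far more than a low-traffic app should produce at warn level.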

Technical Elaboration

Part 1: Structured logging with nestjs-pino

Install:

cd backend
pnpm add nestjs-pino pino-http
pnpm add -D pino-pretty   # pretty-print for local development only

Update backend/src/app.module.ts:

import { LoggerModule } from 'nestjs-pino';

@Module({
  imports: [
    LoggerModule.forRoot({
      pinoHttp: {
        level: process.env['NODE_ENV'] === 'production' ? 'warn' : 'debug',
        transport: process.env['NODE_ENV'] !== 'production'
          ? { target: 'pino-pretty' }  // human-readable in development
          : undefined,                  // raw JSON in production (Vector reads this)
        autoLogging: true,              // logs each HTTP request (at info by default, so hidden when level is 'warn')
        redact: ['req.headers.authorization', 'req.body.password'],  // never log these
      },
    }),
    // ... other modules
  ],
})
export class AppModule {}
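
To illustrate what the `redact` paths above are meant to achieve, here is a simplified stand-in written from scratch. It is not Pino's implementation (Pino has its own redaction engine with wildcard support); `redactPaths` and the sample entry are hypothetical. The placeholder `[Redacted]` matches Pino's default censor value.

```typescript
type Json = { [key: string]: unknown };

// Replace the value at each dotted path with a placeholder, leaving the
// input untouched. Paths that do not exist are ignored.
function redactPaths(input: Json, paths: string[], censor = "[Redacted]"): Json {
  const copy = JSON.parse(JSON.stringify(input)) as Json; // deep copy of plain JSON data
  for (const path of paths) {
    const keys = path.split(".");
    const last = keys.pop();
    if (last === undefined) continue;
    let node: unknown = copy;
    for (const key of keys) {
      if (typeof node !== "object" || node === null) {
        node = undefined;
        break;
      }
      node = (node as Json)[key];
    }
    if (typeof node === "object" && node !== null && last in (node as Json)) {
      (node as Json)[last] = censor;
    }
  }
  return copy;
}

const entry = {
  req: { headers: { authorization: "Bearer secret", host: "example.test" } },
  msg: "login",
};
const safe = redactPaths(entry, ["req.headers.authorization", "req.body.password"]);
console.log(JSON.stringify(safe));
```

The `authorization` header is replaced while `host` is left alone, and the missing `req.body.password` path is skipped without error.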

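Update each service to receive the Pino logger through constructor injection instead of instantiating the default NestJS Logger. Also register it as the application logger in main.ts with app.useLogger(app.get(Logger)), so that framework messages go through Pino as well:

// Before:
import { Logger } from '@nestjs/common';
private readonly logger = new Logger(MyService.name);

// After:
import { Logger } from 'nestjs-pino';
constructor(private readonly logger: Logger) {}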

Each log line in production will look like:

{
  "level": "warn",
  "time": 1715000000000,
  "pid": 1,
  "hostname": "station-backend-1",
  "req": { "id": "abc-123", "method": "POST", "url": "/auth/login" },
  "msg": "Failed login attempt for unknown user"
}

Part 2: Vector sidecar container

New file: infra/vector.toml

[sources.docker_logs]
type = "docker_logs"
include_containers = ["station-backend-1"]   # only ship backend logs

[transforms.filter_level]
type = "filter"
inputs = ["docker_logs"]
# Only forward warn and above; keeps volume under the free tier limit.
# Pino level numbers: 10=trace, 20=debug, 30=info, 40=warn, 50=error, 60=fatal
condition = '''
msg = string!(.message)
contains(msg, "\"level\":40") || contains(msg, "\"level\":50") || contains(msg, "\"level\":60")
'''

[sinks.logtail]
type = "http"
inputs = ["filter_level"]
uri = "https://in.logtail.com"
encoding.codec = "json"
auth.strategy = "bearer"
auth.token = "${LOGTAIL_SOURCE_TOKEN}"
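
The filter's intent can be checked in a unit test with a small TypeScript mirror of the same rule. This is only a sketch: `shouldShip` is a hypothetical helper, and unlike a substring match it parses each line as JSON. The level numbers are Pino's defaults (30 is info, 40 is warn, 50 is error, 60 is fatal).

```typescript
// Pino's default numeric levels.
const PINO_LEVELS = { trace: 10, debug: 20, info: 30, warn: 40, error: 50, fatal: 60 } as const;

// Decide whether a raw stdout line would pass a "warn and above" filter.
// Lines that are not JSON objects, or have no numeric level, are dropped.
function shouldShip(line: string, minLevel: number = PINO_LEVELS.warn): boolean {
  try {
    const parsed: unknown = JSON.parse(line);
    if (typeof parsed !== "object" || parsed === null) return false;
    const level = (parsed as { level?: unknown }).level;
    return typeof level === "number" && level >= minLevel;
  } catch {
    return false; // not JSON, e.g. a stack trace fragment or a startup banner
  }
}

console.log(shouldShip('{"level":30,"msg":"request completed"}'));    // false
console.log(shouldShip('{"level":40,"msg":"Failed login attempt"}')); // true
console.log(shouldShip("plain text line"));                           // false
```

Comparing against a numeric threshold also makes it easy to tighten the rule later, for example by passing `PINO_LEVELS.error` as `minLevel`.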

Add Vector to docker-compose.prod.yml:

vector:
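  image: timberio/vector:latest-alpine
  command: ["--config", "/etc/vector/vector.toml"]   # explicit path; newer images may default to vector.yaml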
  restart: unless-stopped
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro   # read-only access to Docker socket
    - ./infra/vector.toml:/etc/vector/vector.toml:ro
  environment:
    LOGTAIL_SOURCE_TOKEN: ${LOGTAIL_SOURCE_TOKEN}
  depends_on:
    - backend

Add to .env.production.example:

LOGTAIL_SOURCE_TOKEN=   # from Logtail source settings — leave blank to disable

Add to env.validation.ts:

LOGTAIL_SOURCE_TOKEN: Joi.string().optional().allow(''),

Logtail setup (manual, one-time):

  1. Sign up at betterstack.com/logtail
  2. Create a new Source: type "HTTP" (Vector will push to it)
  3. Copy the Source Token into GitHub environment secrets as LOGTAIL_SOURCE_TOKEN
  4. Verify logs appear in the Logtail UI after first deploy

Part 3: Logtail alert

In the Logtail UI:

  • Create alert: query level >= error, threshold > 10 occurrences in 5 minutes
  • Notification: email to your address
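
The alert rule itself lives in the Logtail UI, but its logic can be sketched in a few lines for documentation or testing purposes. The helper below is hypothetical and only illustrates the "more than 10 errors in 5 minutes" condition; it says nothing about how Logtail evaluates alerts internally.

```typescript
// True when more than `threshold` error timestamps fall inside the window
// that ends at `nowMs`. All times are in milliseconds since the epoch.
function isErrorSpike(
  errorTimesMs: number[],
  nowMs: number,
  windowMs: number = 5 * 60_000,
  threshold: number = 10,
): boolean {
  const windowStart = nowMs - windowMs;
  const recent = errorTimesMs.filter((t) => t > windowStart && t <= nowMs);
  return recent.length > threshold;
}

const now = 1_715_000_000_000;
const elevenErrors = Array.from({ length: 11 }, (_, i) => now - i * 1_000);
console.log(isErrorSpike(elevenErrors, now));              // true
console.log(isErrorSpike(elevenErrors.slice(0, 10), now)); // false
```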

New file: infra/docs/logging.md

Document:

  1. Architecture — NestJS (pino-http) -> Docker stdout -> Vector -> Logtail
  2. Log levels — what gets shipped in production (warn+) vs development (debug+)
  3. How to search logs — Logtail query syntax examples: filter by level, by URL, by time range
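  4. How to temporarily increase verbosity — lower the Pino level in AppModule (production emits only warn and above, so Vector never sees debug lines), relax the filter in vector.toml to match, then redeploy the backend and Vector containers
  5. What to do if Logtail is full — free tier is 1 GB/month; if approaching the limit, tighten the filter to error and above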
  6. Retention — free tier keeps 3 days; anything older is gone
  7. Alert setup — current alert configuration

Definition of Done

  • nestjs-pino and pino-http installed; LoggerModule registered in AppModule
  • Log level set to warn in production, debug in development
  • req.headers.authorization and req.body.password redacted from all logs
  • All services use the injected Pino logger instead of new Logger(ServiceName.name)
  • infra/vector.toml written filtering to warn+ level and forwarding to Logtail
  • vector service added to docker-compose.prod.yml with read-only Docker socket mount
  • LOGTAIL_SOURCE_TOKEN is optional — Vector container starts without error when token is absent (it will fail to connect to Logtail but not crash the app)
  • LOGTAIL_SOURCE_TOKEN added to GitHub environment secrets for production
  • Logs verified appearing in Logtail UI after deploy
  • Error spike alert configured in Logtail UI (>10 errors in 5 minutes -> email)
  • infra/docs/logging.md written
  • pnpm test passes — nestjs-pino must not break unit tests (use LoggerModule.forRoot({ pinoHttp: { level: 'silent' } }) in test module setup if needed)

Dependencies
