Granular in-service Node.js monitoring that tells you which subsystem is causing high memory or CPU — not just that something is wrong.
Standard observability tools (APM platforms, default prom-client metrics) are excellent at alerting you that memory is high or that the process is under load. What they rarely tell you is which subsystem inside your service is responsible. During an incident you need to know: is it the logger pipeline backing up? The Kafka consumer batch processor? Mongoose operations on a hot collection? The external API call pool?
deepscope adds a targeted instrumentation layer that sits inside your process and exposes exactly that signal. It tracks HTTP request queue depth, Kafka consumer lag and batch timing, MongoDB per-collection operation latency, outbound HTTP call duration by host, event-loop time attribution per async subsystem, and your logger pipeline's queue depth and drop rate — all surfaced as Prometheus metrics that any scraper can collect.
The library was extracted from a production monitoring layer built to triage a real incident: a downstream log-shipping agent began exerting back-pressure on writes, the in-process logger queue grew without bound, the heap was exhausted, and the service eventually crashed with an OOM. Neither the APM tool nor the default prom-client Node.js metrics could localize the cause — they showed "memory is rising" but not "the logger queue has 400,000 pending messages." deepscope was built to fill that gap. It is now a standalone, generic library you can drop into any Node.js service.
- 7 first-party monitors: `process`, `eventLoop`, `httpQueue`, `mongo`, `kafka`, `externalCall`, `logger`
- Tiered model: Tier 1 metrics are always-on and cheap; Tier 2 metrics are togglable at runtime for deeper per-operation instrumentation during an incident
- Pluggable architecture: extend [BaseMonitor](src/core/baseMonitor.logic.ts) for lifecycle scaffolding, or implement `IMonitorPlugin` from scratch
- Pull-based via prom-client: metrics are exposed through a standard `Registry` you own — wire it to any Prometheus scraper or your existing `/metrics` endpoint
- Optional debug endpoint: opt-in worker-thread HTTP server for heap snapshots, CPU profiles, and runtime tier toggling — off by default, localhost-bound
- No global state: all state lives on a `Deepscope` instance; safe to run multiple instances in the same process
- TypeScript-first: strict mode, ESM output, full type exports
- Peer-dep optional: `prom-client` is the only required peer; `express`, `mongoose`, `kafkajs`, and `axios` are optional and only activated when you call `.use()` with their monitor
```bash
npm install deepscope prom-client

# Optional peers — install only the ones you need:
npm install express   # for httpQueueMonitor
npm install mongoose  # for mongoMonitor
npm install kafkajs   # for kafkaMonitor
npm install axios     # for externalCallMonitor
```

`prom-client` ^15.0.0 is the only required peer dependency. The framework/library peers are optional — deepscope never imports them at module load time; it receives the instance you pass into the factory function.
```ts
import express from 'express';
import { Registry } from 'prom-client';
import {
  Deepscope,
  processMonitor,
  httpQueueMonitor,
  kafkaMonitor,
  loggerMonitor,
} from 'deepscope';
import type { ILoggerStatsProvider } from 'deepscope';

// Your express app and Kafka consumer (created elsewhere)
const app = express();
const consumer = getKafkaConsumer(); // kafkajs Consumer

// Adapt your logger to the ILoggerStatsProvider interface
const loggerAdapter: ILoggerStatsProvider = {
  getQueueDepth: () => myLogger.internalQueue.length,
  getDroppedCount: () => myLogger.droppedCount,
  getEnqueueRate: () => myLogger.enqueueRate,
};

const registry = new Registry();

const scope = new Deepscope({ registry, serviceName: 'my-service', tier: 'tier1' })
  .use(processMonitor())
  .use(httpQueueMonitor({ app }))
  .use(kafkaMonitor({ consumer }))
  .use(loggerMonitor({ provider: loggerAdapter, pollIntervalMs: 5000 }))
  .start();

// Expose metrics on your existing express app
app.get('/metrics', async (_req, res) => {
  res.set('Content-Type', registry.contentType);
  res.end(await registry.metrics());
});

// Upgrade to tier2 during an incident for deeper instrumentation
scope.setTier('tier2');

// Graceful shutdown
process.on('SIGTERM', async () => {
  await scope.stop();
});
```

```ts
import { Deepscope } from 'deepscope';
import type { IDeepscopeOptions } from 'deepscope';
```

**Constructor**

```ts
new Deepscope(opts: IDeepscopeOptions)
```

```ts
interface IDeepscopeOptions {
  readonly registry: Registry;   // prom-client Registry you own
  readonly serviceName: string;  // injected as the `service` label on every metric
  readonly tier?: Tier;          // 'tier1' (default) | 'tier2'
}
```

**Methods**
| Method | Signature | Description |
|---|---|---|
| `.use()` | `use(plugin: IMonitorPlugin): this` | Register a monitor. Must be called before `.start()`. Returns `this` for chaining. |
| `.start()` | `start(): this` | Register and start all monitors for the current tier. |
| `.stop()` | `stop(): Promise<void>` | Stop all monitors and shut down the debug endpoint if running. |
| `.setTier()` | `setTier(tier: Tier): void` | Toggle between `'tier1'` and `'tier2'` at runtime. Starts/stops tier2 monitors accordingly. |
| `.getTier()` | `getTier(): Tier` | Returns the current active tier. |
| `.takeHeapSnapshot()` | `takeHeapSnapshot(): Readable` | Returns a readable stream of a V8 heap snapshot. Drain it to a `.heapsnapshot` file. |
| `.startCpuProfile()` | `startCpuProfile(durationMs: number): Promise<string>` | Runs a CPU profile for `durationMs` milliseconds and resolves with the profile JSON string. |
| `.enableDebugEndpoint()` | `enableDebugEndpoint(opts: IDebugEndpointOptions): void` | Starts the opt-in debug HTTP server in a worker thread. |
| `.debugPort()` | `debugPort(): number \| null` | Returns the debug server's port, or `null` when the endpoint is not running. |
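
For instance, `takeHeapSnapshot()` can be drained to disk with a standard stream pipeline. A minimal sketch — `scope` is the `Deepscope` instance from the quick start, and the file name is illustrative:

```ts
import { createWriteStream } from 'node:fs';
import { pipeline } from 'node:stream/promises';

// Stream the V8 heap snapshot straight to disk without buffering it in memory
await pipeline(
  scope.takeHeapSnapshot(),
  createWriteStream(`heap-${Date.now()}.heapsnapshot`),
);
```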
**BaseMonitor** — [src/core/baseMonitor.logic.ts](src/core/baseMonitor.logic.ts)
```ts
import { BaseMonitor } from 'deepscope';
import type { IPluginContext, Tier } from 'deepscope';

abstract class BaseMonitor implements IMonitorPlugin {
  abstract readonly name: string;
  abstract readonly tier: Tier;

  // Final lifecycle methods (do not override):
  register(ctx: IPluginContext): void { /* calls onRegister */ }
  start(): void { /* calls onStart */ }
  stop(): void { /* calls onStop */ }

  // Override these in your subclass:
  protected abstract onRegister(ctx: IPluginContext): void;
  protected abstract onStart(): void;
  protected abstract onStop(): void;
}
```

Lifecycle order: `register(ctx)` → `start()` → `stop()`. Create metrics in `onRegister`, attach listeners in `onStart`, and detach them in `onStop`. `stop()` is safe to call multiple times.
```ts
import type { IMonitorPlugin, IPluginContext, Tier } from 'deepscope';

interface IMonitorPlugin {
  readonly name: string;
  readonly tier: Tier;
  register(ctx: IPluginContext): void;
  start(): void;
  stop(): void;
}
```

Implement this directly if you don't want the `BaseMonitor` lifecycle scaffolding.
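
As a rough sketch of a direct implementation — the uptime metric, the 15-second poll interval, and the label-less `IGauge.set(value)` overload (assumed to mirror prom-client) are all illustrative:

```ts
import type { IMonitorPlugin, IPluginContext, Tier, IGauge } from 'deepscope';

class UptimeMonitor implements IMonitorPlugin {
  readonly name = 'uptime';
  readonly tier: Tier = 'tier1';
  private gauge!: IGauge;
  private timer: NodeJS.Timeout | null = null;

  register(ctx: IPluginContext): void {
    // Create metrics in register(), per the lifecycle rules above
    this.gauge = ctx.metrics.gauge({
      name: 'process_uptime_seconds',
      help: 'Seconds since the process started',
    });
  }

  start(): void {
    // Assumes IGauge.set(value) without labels, as in prom-client
    this.timer = setInterval(() => this.gauge.set(process.uptime()), 15_000);
  }

  stop(): void {
    if (this.timer) clearInterval(this.timer);
    this.timer = null; // idempotent: calling stop() twice is a no-op
  }
}
```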
```ts
interface IMetricFactory {
  counter(opts: ICounterOpts): ICounter;
  gauge(opts: IGaugeOpts): IGauge;
  histogram(opts: IHistogramOpts): IHistogram;
}
```

Available through `ctx.metrics` inside `onRegister`. Never import prom-client directly in your monitors — use this factory so the backend remains swappable.
| Monitor | Factory | Tier | Peer dep | Key metrics |
|---|---|---|---|---|
| `ProcessMonitor` | `processMonitor()` | tier1 | — | `nodejs_heap_space_used_bytes`, `nodejs_rss_bytes`, `nodejs_heap_used_vs_total_ratio`, `nodejs_gc_runs_total`, `nodejs_gc_pause_seconds`, `nodejs_event_loop_lag_seconds`, `nodejs_event_loop_utilization_ratio`, `nodejs_active_handles` |
| `EventLoopMonitor` | `eventLoopMonitor()` | tier2 | — | `nodejs_eventloop_time_share_ratio`, `nodejs_eventloop_time_seconds_total`, `nodejs_async_resource_active`, `nodejs_async_resource_created_total` |
| `HttpQueueMonitor` | `httpQueueMonitor({ app })` | tier1 | `express` | `http_request_queue_depth`, `http_request_queue_seconds`, `http_active_requests` |
| `MongoMonitor` | `mongoMonitor({ mongoose })` | tier2 | `mongoose` | `mongodb_op_seconds`, `mongodb_op_errors_total` |
| `KafkaMonitor` | `kafkaMonitor({ consumer })` | tier1 | `kafkajs` | `kafka_consumer_lag`, `kafka_consumer_batch_size`, `kafka_consumer_batch_seconds`, `kafka_consumer_messages_total`, `kafka_consumer_errors_total` |
| `ExternalCallMonitor` | `externalCallMonitor({ axios })` | tier2 | `axios` | `external_http_seconds`, `external_http_total`, `external_http_errors_total` |
| `LoggerMonitor` | `loggerMonitor({ provider })` | tier1 | — | `logger_queue_depth`, `logger_dropped_total`, `logger_enqueue_rate`\* |

\*`logger_enqueue_rate` is only emitted when the provider implements `getEnqueueRate()`.
For the full metric catalog including label names, types, and bucket definitions, see [docs/metrics.md](docs/metrics.md).
deepscope uses a two-tier model so you can keep production overhead low and escalate instrumentation depth only when needed.
**Tier 1 — always-on**

Tier 1 monitors run continuously. They are designed to be cheap: polling intervals are coarse, they use `perf_hooks` and `process.memoryUsage()` rather than async hooks, and they do not intercept every operation. These are the metrics you leave on in production at all times.

**Tier 2 — togglable**

Tier 2 monitors add per-operation granularity: Mongoose per-collection query timing, Axios per-host call latency, and `async_hooks`-based event-loop attribution by subsystem. These have higher overhead and are intended to be enabled on demand — either when an incident is in progress or in non-production environments.
**Switching tiers**

```ts
// At construction time (default is 'tier1')
const scope = new Deepscope({ registry, serviceName: 'svc', tier: 'tier2' }).use(...).start();

// At runtime — programmatic
scope.setTier('tier2'); // enables tier2 monitors
scope.setTier('tier1'); // stops tier2 monitors

// At runtime — via the debug endpoint (see below)
// POST http://127.0.0.1:9091/tier?value=tier2
```

Tier transitions are safe to call while the process is under load. Tier 2 monitors are stopped cleanly (listeners detached) when you downgrade back to tier 1.
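
One pattern this enables (illustrative, not built into the library) is a timed escalation that reverts on its own:

```ts
// Escalate to tier2 for 15 minutes during an incident, then drop back to tier1
scope.setTier('tier2');
const revert = setTimeout(() => scope.setTier('tier1'), 15 * 60 * 1000);
revert.unref(); // don't let the revert timer keep the process alive
```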
The debug endpoint is off by default. Enable it explicitly:
```ts
scope.enableDebugEndpoint({ port: 9091 });                  // binds to 127.0.0.1 by default
scope.enableDebugEndpoint({ port: 9091, host: '0.0.0.0' }); // explicit external bind
```

The endpoint runs in a worker thread so it remains responsive even when the main thread is under load — which is exactly the scenario where you need it most.
**Routes**

| Method | Path | Description |
|---|---|---|
| GET | `/healthz` | Returns `ok`. Use it to confirm the worker is alive. |
| GET | `/heap` | Triggers a V8 heap snapshot on the main thread and streams the `.heapsnapshot` file as a download. Allow up to 2 minutes. |
| POST | `/cpu?ms=<duration>` | Runs a CPU profile for `ms` milliseconds (default 10 000) and returns the `.cpuprofile` JSON as a download. |
| POST | `/tier?value=<tier>` | Sets the active tier. `value` must be `tier1` or `tier2`. |
Security note: there is no built-in authentication. The default bind address is `127.0.0.1`. Do not bind to `0.0.0.0` in production without placing the port behind a reverse proxy, VPN, or SSH tunnel that provides its own auth layer. This endpoint is a diagnostics surface — treat it accordingly.
The programmatic API (`takeHeapSnapshot()`, `startCpuProfile()`, `setTier()`) works independently of the HTTP server and is the preferred approach when you can deploy a code change.
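
A minimal sketch using only the documented methods (`scope` is your `Deepscope` instance; the file name is illustrative):

```ts
import { writeFile } from 'node:fs/promises';

// Profile the main thread for 10 seconds and write the result to disk
const profile = await scope.startCpuProfile(10_000);
await writeFile(`cpu-${Date.now()}.cpuprofile`, profile);
```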
Any subsystem not covered by the built-in monitors can be instrumented by extending BaseMonitor. The extension contract is the same whether you are adding a first-party monitor or a user-written one — Deepscope treats them identically.
**Example: a BullMQ job queue monitor**
```ts
import { BaseMonitor } from 'deepscope';
import type { IPluginContext, Tier, IGauge, ICounter } from 'deepscope';
import type { Queue, QueueEvents } from 'bullmq'; // type-only imports; instances passed at runtime

interface IBullMQMonitorOptions {
  queue: Queue;
  queueEvents: QueueEvents; // BullMQ emits job-level 'failed' events on QueueEvents, not on Queue
  pollIntervalMs?: number;
}

class BullMQMonitor extends BaseMonitor {
  readonly name = 'bullmq';
  readonly tier: Tier = 'tier1';

  private waitingGauge!: IGauge;
  private activeGauge!: IGauge;
  private failedTotal!: ICounter;
  private timer: NodeJS.Timeout | null = null;

  // Kept as a field so onStop can detach exactly this listener
  private readonly onFailed = (): void => {
    this.failedTotal.inc({ queue: this.opts.queue.name }, 1);
  };

  constructor(private readonly opts: IBullMQMonitorOptions) {
    super();
  }

  protected onRegister(ctx: IPluginContext): void {
    const f = ctx.metrics;
    this.waitingGauge = f.gauge({
      name: 'bullmq_jobs_waiting',
      help: 'Jobs currently waiting in the queue',
      labelNames: ['queue'],
    });
    this.activeGauge = f.gauge({
      name: 'bullmq_jobs_active',
      help: 'Jobs currently being processed',
      labelNames: ['queue'],
    });
    this.failedTotal = f.counter({
      name: 'bullmq_jobs_failed_total',
      help: 'Total failed jobs since start',
      labelNames: ['queue'],
    });
  }

  protected onStart(): void {
    const interval = this.opts.pollIntervalMs ?? 5000;
    this.timer = setInterval(async () => {
      try {
        const counts = await this.opts.queue.getJobCounts();
        const name = this.opts.queue.name;
        this.waitingGauge.set({ queue: name }, counts.waiting ?? 0);
        this.activeGauge.set({ queue: name }, counts.active ?? 0);
      } catch {
        // Swallow polling errors — a monitor must never crash the host process
      }
    }, interval);
    this.opts.queueEvents.on('failed', this.onFailed);
  }

  protected onStop(): void {
    if (this.timer) clearInterval(this.timer);
    this.timer = null;
    // Detach only our own listener so other consumers of the emitter are unaffected
    this.opts.queueEvents.off('failed', this.onFailed);
  }
}

// Export a factory function (same pattern as first-party monitors)
export function bullmqMonitor(opts: IBullMQMonitorOptions): BullMQMonitor {
  return new BullMQMonitor(opts);
}
```

Register it the same way as any other monitor:
```ts
scope.use(bullmqMonitor({ queue: myQueue, queueEvents: myQueueEvents }));
```

Rules for user-written monitors:

- Declare all metrics in `onRegister`, never in `onStart` or hot paths
- Use only `ctx.metrics` (the `IMetricFactory`) — never import `prom-client` directly
- Accept peer-dep instances via constructor options — never import library packages at module top level
- `name` must be unique within a `Deepscope` instance
- `onStop` must be idempotent
`LoggerMonitor` is decoupled from any specific logger via the `ILoggerStatsProvider` interface:
```ts
interface ILoggerStatsProvider {
  getQueueDepth(): number;    // required: current queue size
  getDroppedCount(): number;  // required: cumulative dropped messages since process start
  getEnqueueRate?(): number;  // optional: msgs/sec; omit if your logger doesn't track it
}
```

Write a thin adapter for your logger and pass it to `loggerMonitor`:
```ts
import type { ILoggerStatsProvider } from 'deepscope';

// Hypothetical logger with an internal stats object
class MyLoggerAdapter implements ILoggerStatsProvider {
  constructor(private readonly logger: MyLogger) {}

  getQueueDepth(): number {
    return this.logger.stats.pendingCount;
  }

  getDroppedCount(): number {
    return this.logger.stats.totalDropped;
  }

  getEnqueueRate(): number {
    return this.logger.stats.enqueuePerSecond;
  }
}

scope.use(loggerMonitor({ provider: new MyLoggerAdapter(myLogger), pollIntervalMs: 5000 }));
```

The adapter pattern means deepscope has no compile-time dependency on any specific logger. Any logger that can expose these three numbers — even via a custom stats scraper — can be monitored.
Open an issue first for any non-trivial change so the approach can be agreed on before implementation. Follow the conventions in `CLAUDE.md` strictly — in particular the file-naming convention, the OOP/SOLID requirements, and the rule that every metric change (add, rename, remove) must update `docs/metrics.md` in the same commit and keep the reference Grafana dashboard in `docs/grafana/` consistent.
Bug fixes and monitor additions are welcome. For new monitors, follow the pattern in `src/monitors/` and add coverage in both `test/unit/monitors/` and `test/integration/`.
Items out of scope for the current release, but tracked for future work:

- Fastify, Koa, and Hapi adapters for `httpQueueMonitor`
- OpenTelemetry and StatsD / Datadog metric backends (the `IMetricFactory` chokepoint makes this a one-file swap)
- Auto-detection of installed peer dependencies (deliberately excluded — explicit `.use()` calls only)
- Hosting the `/metrics` scrape endpoint (the host application owns this)
MIT