A high-performance, developer-friendly cache proxy engine for Node.js designed to handle the complexity of HTTP response caching in data-intensive applications.
In scenarios like high-concurrency API proxies, web crawlers, or microservices, cache management often requires compromises between "speed" and "capacity". @isdk/proxy solves this with its unique architecture:
- Two-Pass Pipeline: Features a decoupled pipeline for "Gatekeeping" (determining cacheability) and "Fingerprinting" (generating cache keys). Both stages use the same configuration logic, achieving full semantic orthogonality.
- Metadata Residency: Metadata (Headers, Status, Policy) always resides in memory, ensuring nanosecond-level cache validity assessment regardless of response body size.
- Request Coalescing: Prevents cache stampedes by ensuring only one concurrent request is sent to the origin when a hot cache expires.
- Environment Agnostic: Built on Web Standard `Request`/`Response` objects. Works anywhere.
- 🚀 Hybrid Multi-tier Cache: L1 (LRU memory) for instant response, L2 (content-addressable disk `cacache`) for persistent storage.
- 📥 HTTP POST & Multi-method Support: Full support for caching POST, PUT, and other non-GET methods with intelligent body fingerprinting.
- 🎯 Granular Interception: Surgical precision in cache control via `rules` for specific paths or fields.
- 🌊 Native Streaming: Built entirely on stream pipelines to prevent OOM when proxying large files.
- 🧠 Intelligent Metadata Residency: Metadata (Headers, Status, Policy) stays in memory for instant policy decisions.
- 🔄 Stale-While-Revalidate (SWR): Returns stale data instantly while updating the cache in the background.
- 🛡️ Request Coalescing: Merges concurrent requests for the same resource to protect upstream servers.
- 🚑 High Resiliency & STALE_RESCUE: Automatically returns stale cache on backend failure (`staleIfError`). When WAF challenges, dirty data (via `minLength` or `body` patterns), or 403/429 blocks are detected, it protects the valid old cache and returns it as `STALE_RESCUE`.
- 🛡️ Built-in WAF Presets: Pre-integrated presets for Cloudflare, AWS WAF, and others, ready to use out of the box.
- 🕵️ Transparent Status: Injects an `x-proxy-cache` header (`HIT`, `STALE`, `MISS`, `STALE_RESCUE`, `STALE_IF_ERROR`) for easy debugging.
```bash
pnpm add @isdk/proxy
```

The primary way to use @isdk/proxy is through the `fetchWithCache` function, which can wrap any HTTP request logic.

```typescript
import { SmartCache, createCachedFetch } from '@isdk/proxy';

// 1. Initialize hybrid cache instance
const cache = new SmartCache({
  storagePath: './.cache',
  maxMemorySize: 1024 * 1024 // 1MB memory threshold
});

// 2. Create a pre-configured cached fetcher
const myFetch = createCachedFetch({
  cache,
  config: {
    staleIfError: true,
  },
  backgroundUpdate: true // Enable SWR
});

// 3. Use it!
const response = await myFetch(new Request('https://api.example.com/data'), (req) => fetch(req));
console.log(response.headers.get('x-proxy-cache'));
```

Configure `methods` to enable POST caching and use field filters to ensure cache key stability.
```typescript
const myPostFetch = createCachedFetch({
  cache,
  config: {
    methods: ['GET', 'POST'],
    // Query filtering: defaults to all, here excluding 'timestamp'
    query: ['*', '!timestamp'],
    // Body filtering: field-level matching for JSON
    body: {
      match: { 'action': 'query', 'version': true },
      maxLength: 1024 // Limit body read length
    },
    rules: [
      { methods: ['POST'], path: '/api/v1/query' }
    ],
    forceCache: true
  }
});
```

| Option | Type | Description |
|---|---|---|
| `path` | `MatchPatterns` | Path gatekeeping. Supports Glob, Regex, or Negation. |
| `methods` | `MatchPatterns` | Allowed HTTP methods. Default `['GET', 'HEAD']`. |
| `rules` | `ProxyCacheRule[]` | Granular rules. Matched rules are deeply merged with site-level config. |
| `query` | `FieldConfig` | Query parameter filtering. Defaults to all. |
| `headers` | `FieldConfig` | Header filtering. Defaults to none. |
| `cookies` | `FieldConfig` | Cookie filtering. Defaults to none. |
| `body` | `BodyConfig` | Body matching & extraction. Supports gatekeeping via `match` and fingerprinting via `extract`. |
| `staleIfError` | `boolean` | Return stale cache on backend errors. |
| `forceCache` | `boolean` | Force caching regardless of origin directives. |
| `offline` | `boolean` | Strict offline mode: Read-only cache, returns 512 on cache miss. |
| `response` | `ResponseConfig` | Response-side cacheability validation. Supports status, headers, and body matching. |
Define "what is valid and cacheable content" to automatically filter out WAF challenge pages.
| Option | Type | Description |
|---|---|---|
| `statuses` | `MatchPatterns` | Allowed HTTP status codes. Defaults to common cacheable statuses (200, 404, etc.). |
| `headers` | `FieldConfig` | Required or forbidden response headers. |
| `body` | `MatchPatterns` | Response body matching. Supports Glob negation (e.g., `!*Challenge*`) to exclude dirty data. |
| `minLength` | `number` | Minimum content length. Shorter responses will be intercepted (triggers `STALE_RESCUE`). |
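As a rough illustration of how these options could combine, the sketch below checks a response against an allowed status list, a minimum length, and forbidden body keywords, and returns a reason string modelled on the `{REASON}` suffixes. The names `validateResponse` and `SketchResponseConfig`, the simplified config shape, and the order of the checks are assumptions made for illustration; this is not the library's implementation.

```typescript
// Hypothetical, simplified shape; the real ResponseConfig accepts richer patterns.
interface SketchResponseConfig {
  statuses?: number[];      // allowed status codes
  minLength?: number;       // minimum body length in characters
  bodyForbidden?: string[]; // keywords that mark "dirty" content
}

// Returns null when the response looks cacheable, otherwise a reason string.
function validateResponse(
  status: number,
  body: string,
  config: SketchResponseConfig
): string | null {
  if (config.statuses && !config.statuses.includes(status)) {
    return `STATUS_MISMATCH_${status}`;
  }
  if (config.minLength !== undefined && body.length < config.minLength) {
    return 'TOO_SHORT';
  }
  const lower = body.toLowerCase();
  const forbidden = config.bodyForbidden ?? [];
  if (forbidden.some((keyword) => lower.includes(keyword.toLowerCase()))) {
    return 'BODY_MATCH_FAILED';
  }
  return null;
}

const sketchConfig: SketchResponseConfig = {
  statuses: [200, 404],
  minLength: 20,
  bodyForbidden: ['challenge'],
};

console.log(validateResponse(503, 'Service Unavailable', sketchConfig)); // STATUS_MISMATCH_503
console.log(validateResponse(200, 'ok', sketchConfig)); // TOO_SHORT
console.log(validateResponse(200, '<html>Security Challenge page</html>', sketchConfig)); // BODY_MATCH_FAILED
console.log(validateResponse(200, '{"items":[1,2,3],"total":3}', sketchConfig)); // null
```

A non-null result is the point at which a cache layer could decide to keep serving the previous valid entry instead of storing the new response.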
For complex bodies, @isdk/proxy supports a clean separation of concerns:
| Option | Type | Description |
|---|---|---|
| `type` | `'json' \| 'text' \| 'binary'` | Body type. Automatically determined by `Content-Type` if omitted. |
| `match` | `FieldConfig \| MatchPatterns` | Gatekeeping. Field-level validation for JSON or Pattern matching for Text. |
| `extract` | `FieldConfig \| MatchPatterns` | Fingerprinting. Priority over `match`. Supports field filtering for JSON fingerprints. |
| `maxLength` | `number` | Maximum read limit during validation/extraction. |
| `sort` | `boolean` | Sort JSON keys to ensure fingerprint stability. Defaults to `true`. |
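To show why sorting keys matters for fingerprint stability, here is a minimal sketch that canonicalizes a JSON body before it is turned into a cache key. The helper names `canonicalize` and `stableStringify` are hypothetical, and the library's actual fingerprinting (for example, any hashing applied afterwards) may differ.

```typescript
// Recursively sort object keys so that key order does not affect the output.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(canonicalize); // array order is meaningful, so keep it
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = canonicalize(source[key]);
    }
    return sorted;
  }
  return value;
}

function stableStringify(value: unknown): string {
  return JSON.stringify(canonicalize(value));
}

const bodyA = stableStringify({ action: 'query', filters: { b: 2, a: 1 } });
const bodyB = stableStringify({ filters: { a: 1, b: 2 }, action: 'query' });

console.log(bodyA === bodyB); // true
console.log(bodyA); // {"action":"query","filters":{"a":1,"b":2}}
```

Two requests that differ only in key order then map to the same string, so they would share one cache entry.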
| Status | Description |
|---|---|
| `HIT` | Cache hit, fresh content within TTL. |
| `OFFLINE_HIT` | Cache hit successfully in `offline: true` mode. |
| `STALE` | Cache hit but expired, SWR background update triggered. |
| `MISS` | Cache miss, request sent to origin and result cached. |
| `STALE_IF_ERROR` | Origin request failed (network error or 5xx), returned expired stale cache. |
| `STALE_RESCUE_{REASON}` | Disaster recovery protection. Served valid old cache when origin returned invalid data (e.g., `WAF_CHALLENGE` or `TOO_SHORT`). |
| `MISS_EXCLUDED_REQUEST` | Request excluded from caching by configuration rules (method, path, etc.). |
| `OFFLINE_MISS_EXCLUDED_REQUEST` | Offline mode, request excluded by rules and no local cache available. |
| `MISS_UNSTORABLE` | Response not storable (e.g., `no-store` directive and `forceCache` off). |
| `MISS_EXCLUDED_{REASON}` | Response validation failed (e.g., body too short or WAF challenge detected). |
| `MISS_EXCLUDED_WAF_CHALLENGE` | Explicitly detected WAF challenge page and no old cache available. |
@isdk/proxy includes built-in detection rules for major WAF providers (e.g., Cloudflare, AWS WAF), enabled by default. These rules are defined as Positive Signatures, meaning if a response matches any of the defined features (status code, header, or body keyword), it's identified as a WAF challenge.
You can dynamically manage WAF presets via the following APIs:
```typescript
import {
  registerWAFPreset,
  unregisterWAFPreset,
  isWAFChallenge,
  CLOUDFLARE_WAF_PRESET
} from '@isdk/proxy';

// 1. Register a custom WAF signature
registerWAFPreset({
  response: {
    statuses: ['418'],
    body: ['*I am a teapot*']
  }
});

// 2. Programmatic Detection (Manual check in code)
// This function automatically handles clone(), so it won't consume the original stream
if (await isWAFChallenge(response)) {
  console.log('WAF Challenge detected, intervention required');
}

// 3. Unregister a specific preset
unregisterWAFPreset(CLOUDFLARE_WAF_PRESET);
```

| Function | Description |
|---|---|
| `isWAFChallenge(res, presets?)` | Determines if a response is a WAF challenge. Supports optional custom presets. |
| `getWAFPresets()` | Retrieves all currently registered WAF preset rules. |
| `registerWAFPreset(rule)` | Registers a new WAF signature rule. |
| `unregisterWAFPreset(rule)` | Unregisters an existing rule. |
| `clearWAFPresets()` | Clears all registered WAF presets. |
Note: `fetchWithCache` automatically calls `isWAFChallenge` when processing responses. If a WAF challenge is detected and a valid old cache exists, it triggers `STALE_RESCUE_WAF_CHALLENGE` to prevent your clean data from being overwritten by "dirty" data.
@isdk/proxy provides powerful matching capabilities with negation support:
| Type | Example | Description |
|---|---|---|
| Negation | `['*', '!/api/private/**']` | Exclude matching (starts with `!`). |
| Glob | `/**/*.json` | Path-style glob matching. |
| Regex String | `"/^api\\/v\\d/"` | Automatically converted to RegExp object. |
| Boolean (Field) | `true` / `false` | Required / forbidden. |
Tip: Exclusion Priority. In a `MatchPatterns` array, if any `!` pattern matches, the overall result is `false`.
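The exclusion-priority rule can be sketched with a small matcher. This is an illustrative approximation only: `wildcardToRegExp` and `matchesPatterns` are hypothetical helpers, every run of `*` is treated as "match anything" (a simplification of real glob semantics), and regex strings are not handled. It is not the library's `isMatch`.

```typescript
// Escape regex metacharacters, then let any run of `*` match any characters.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`^${escaped.replace(/\*+/g, '.*')}$`);
}

function matchesPatterns(patterns: string[], value: string): boolean {
  const negatives = patterns.filter((p) => p.startsWith('!')).map((p) => p.slice(1));
  const positives = patterns.filter((p) => !p.startsWith('!'));

  // Exclusion priority: any matching negation vetoes the result.
  if (negatives.some((p) => wildcardToRegExp(p).test(value))) return false;

  // Assumption: with no positive patterns at all, the value is allowed.
  if (positives.length === 0) return true;

  return positives.some((p) => wildcardToRegExp(p).test(value));
}

console.log(matchesPatterns(['*', '!/api/private/**'], '/api/data')); // true
console.log(matchesPatterns(['*', '!/api/private/**'], '/api/private/keys')); // false
console.log(matchesPatterns(['/**/*.json'], '/api/v1/data.json')); // true
```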
@isdk/proxy distinguishes between two complementary matching modes: MatchPatterns Mode and Record Mode. Understanding their differences is crucial for robust configuration.
MatchPatterns Mode:
- Types: `string | RegExp | Array`
- Semantic: Distinction between Single Value (String/Regex) and List Mode (Array).
| Form | Matching Rule (Blocking) | Description | Typical Use Case |
|---|---|---|---|
| Single Value (String/Regex) | Strict (All keys) | Every key in the request must satisfy this rule. | Strict Exclusion/Access. e.g., !id strictly forbids 'id' entirely; id only allows requests with 'id' alone. |
| List Mode (Array) | Lenient (Any key) | Matches if any key satisfies the rule; negations are ignored during matching. | Parameter Filtering. e.g., ['*', '!sid'] ignores 'sid' for the cache key without blocking the request. |
| Config Example | Is Request Blocked? | What's in the Cache Key? |
|---|---|---|
| `query: '!id'` | ❌ Blocked if `id` is present | All params except `id` |
| `query: ['*', '!id']` | ✅ Not blocked even if only `id` exists | All params except `id` |
| `query: 'id'` | ✅ Only allowed if `id` is the only key | Only `id` field |
| `query: ['id']` | ✅ Allowed if `id` is present | Only `id` field |
Tip: Simple Rule. Use a single value for "Strict format constraints"; use an array for "Excluding parameters from the cache key".
Record Mode:
- Types: `Record<string, ProxyMatchPatterns | boolean>`
- Semantic: Field Validation. Logic declarations for specific keys.
- Logic: Based on `AND` logic.
| Config Example | Semantic | Result for Empty Request |
|---|---|---|
| `{}` (Empty Object) | No validation rules. | ✅ Pass |
| `{ sid: true }` | Explicitly requires 'sid' to exist. | ❌ Blocked |
| `{ sid: false }` | Explicitly requires 'sid' to NOT exist. | ✅ Pass |
| `{ lang: 'en' }` | Requires 'lang' to exist and match 'en'. | ❌ Blocked |
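The Record Mode table can be reproduced with a few lines of AND logic. In this sketch, `validateFields` and `FieldRule` are hypothetical names, and string rules are compared by exact equality rather than by the full pattern matching the library supports.

```typescript
type FieldRule = boolean | string;

// Every rule must hold (AND logic); keys without a rule are not checked.
function validateFields(
  fields: Record<string, string>,
  rules: Record<string, FieldRule>
): boolean {
  return Object.entries(rules).every(([key, rule]) => {
    const present = Object.prototype.hasOwnProperty.call(fields, key);
    if (rule === true) return present;      // must exist
    if (rule === false) return !present;    // must not exist
    return present && fields[key] === rule; // must exist and equal the value
  });
}

const emptyRequest: Record<string, string> = {};

console.log(validateFields(emptyRequest, {})); // true
console.log(validateFields(emptyRequest, { sid: true })); // false
console.log(validateFields(emptyRequest, { sid: false })); // true
console.log(validateFields(emptyRequest, { lang: 'en' })); // false
console.log(validateFields({ lang: 'en', sid: '1' }, { lang: 'en', sid: true })); // true
```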
| Configuration | Matching Result | Recommended Usage |
|---|---|---|
| `undefined` | Pass (Ignored) | Default: no gatekeeping check for this category. |
| `{}` (Empty Object) | Pass (Valid) | No rules defined means everything is allowed. |
| `[]` (Empty Array) | Fail (Invalid) | No key can pass an empty matching set. |
| Empty Request | Depends on Category | `query` passes by default; `headers`/`cookies` fail by default. |
@isdk/proxy allows you to attach an isdkProxy property directly to the Request object. This is the highest priority configuration method, enabling you to adjust cache behavior dynamically based on business logic at the moment of the request.
```typescript
const req = new Request('https://api.example.com/data');

// Attach runtime instructions
(req as any).isdkProxy = {
  refresh: true,            // Bypass cache and force a "healing" update
  forceCache: true,         // Force caching even if origin says no-store
  onBackgroundUpdate: (res) => { /* ... */ }, // Override global SWR callback
  generateKey: async (req) => 'custom_key', // Override hashing logic
  config: {                 // Temporary rule overrides
    offline: true,          // Dynamic offline mode
    body: {
      match: ['*'],         // Gatekeeping: allow all
      extract: ['id', '!ts'] // Extraction: exclude 'ts' field from fingerprint
    }
  }
};

const res = await fetchWithCache(req, fetcher, { cache, config: siteConfig });
```

Priority Order:
1. `Request.isdkProxy` (Runtime) - Top-level override.
2. Matched Rule (Rule Level) - Specific rule matching the URL/Body.
3. Site Config (Site Level) - Domain-based configuration.
4. Global Config (Global Level) - System defaults.
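One way to picture this priority order is as a chain of deep merges in which later layers win. The sketch below is an assumption-laden illustration: `deepMerge` and `resolveConfig` are hypothetical helpers, and replacing arrays wholesale (rather than concatenating them) is a choice made here, not confirmed library behavior.

```typescript
type PlainObject = { [key: string]: unknown };

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Later layers win; nested objects are merged, arrays and scalars are replaced.
function deepMerge(base: PlainObject, override: PlainObject): PlainObject {
  const result: PlainObject = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = result[key];
    result[key] = isPlainObject(existing) && isPlainObject(value)
      ? deepMerge(existing, value)
      : value;
  }
  return result;
}

// Lowest priority first: global -> site -> matched rule -> runtime.
function resolveConfig(...layers: PlainObject[]): PlainObject {
  return layers.reduce<PlainObject>((acc, layer) => deepMerge(acc, layer), {});
}

const effective = resolveConfig(
  { methods: ['GET', 'HEAD'], staleIfError: true },            // global
  { forceCache: true, body: { maxLength: 1024 } },             // site
  { methods: ['POST'], body: { match: { action: 'query' } } }, // matched rule
  { offline: true }                                            // Request.isdkProxy
);

console.log(JSON.stringify(effective, null, 2));
```

In the result, `methods` comes from the matched rule, `body` combines the site and rule settings, and `offline` comes from the runtime layer.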
- First Pass: Gatekeeping
  Uses `path`, `methods`, etc., to determine if the request is eligible for caching.
- Second Pass: Fingerprinting
  Uses the same field configurations to implicitly perform data extraction. For example, if `query` is `['*', '!token']`, the extraction phase automatically strips `token` before hashing.
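The two passes can be sketched as two small functions driven by one config object. Everything named here (`SketchConfig`, `isCacheable`, `buildCacheKey`) is hypothetical, only `methods` and `query` are modelled, and the key is left as a readable string where the real engine would hash it.

```typescript
interface SketchConfig {
  methods: string[]; // allowed HTTP methods
  query: string[];   // e.g. ['*', '!token']
}

// First pass: gatekeeping. Is this request eligible for caching at all?
function isCacheable(method: string, config: SketchConfig): boolean {
  return config.methods.includes(method.toUpperCase());
}

// Second pass: fingerprinting. The same `query` setting decides which
// parameters contribute to the cache key.
function buildCacheKey(method: string, rawUrl: string, config: SketchConfig): string {
  const url = new URL(rawUrl);
  const excluded = new Set(
    config.query.filter((p) => p.startsWith('!')).map((p) => p.slice(1))
  );
  const allowAll = config.query.includes('*');
  const kept: string[] = [];
  url.searchParams.forEach((value, name) => {
    if (excluded.has(name)) return;
    if (allowAll || config.query.includes(name)) kept.push(`${name}=${value}`);
  });
  kept.sort(); // parameter order should not change the key
  return `${method.toUpperCase()} ${url.origin}${url.pathname}?${kept.join('&')}`;
}

const pipelineConfig: SketchConfig = { methods: ['GET', 'HEAD'], query: ['*', '!token'] };

const keyA = buildCacheKey('GET', 'https://api.example.com/data?page=2&token=abc', pipelineConfig);
const keyB = buildCacheKey('GET', 'https://api.example.com/data?token=xyz&page=2', pipelineConfig);

console.log(isCacheable('POST', pipelineConfig)); // false
console.log(keyA); // GET https://api.example.com/data?page=2
console.log(keyA === keyB); // true
```

Because `token` never reaches the key, requests carrying different tokens resolve to the same cache entry.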
- Node.js Server: @isdk/proxy-server.
- Crawlee Adapter: @isdk/proxy-crawlee.
- MSW Adapter: @isdk/proxy-msw.
`createCachedFetch(options)` is a high-level factory function for end users. It automatically maintains the concurrency tracker in an internal closure and returns a production-ready fetch instance with built-in cache stampede protection.
- `options.cache`: A `SmartCache` instance.
- `options.config`: Global configuration object (`ProxyConfig`).
- `options.backgroundUpdate`: Whether to enable SWR (Stale-While-Revalidate). Defaults to `true`.
- `options.onBackgroundUpdate`: Callback that receives the background update Promise when triggered.
- `options.refresh`: Force Refresh. Bypasses cache reading to force an origin request. If a valid response is received, it automatically "heals" and updates the cache. Often used to "pierce" through WAF challenges.
- `options.activeCacheWrites`: Optional. Shared concurrency tracker Map.
- Returns: A wrapped fetch function `(request, fetcher) => Promise<Response>`.
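The stampede protection mentioned above relies on a shared tracker of in-flight work. As a simplified sketch of that general technique (not the library's code; `inFlight`, `coalesce`, and `demo` are hypothetical names, and strings stand in for responses), concurrent callers for the same key can share one promise:

```typescript
// In-flight loads keyed by cache key; concurrent callers share one promise.
const inFlight = new Map<string, Promise<string>>();

function coalesce(key: string, load: () => Promise<string>): Promise<string> {
  const pending = inFlight.get(key);
  if (pending) return pending; // join the request that is already running

  const started = load().finally(() => {
    inFlight.delete(key); // allow a fresh load once this one settles
  });
  inFlight.set(key, started);
  return started;
}

async function demo(): Promise<number> {
  let originCalls = 0;
  const load = async (): Promise<string> => {
    originCalls += 1;
    await Promise.resolve(); // stand-in for network latency
    return 'payload';
  };
  // Three concurrent callers, one origin request.
  await Promise.all([
    coalesce('GET /data', load),
    coalesce('GET /data', load),
    coalesce('GET /data', load),
  ]);
  return originCalls;
}

demo().then((calls) => console.log(`origin calls: ${calls}`)); // origin calls: 1
```

Keeping the map outside the function is what lets several wrappers share one tracker, which is presumably why `activeCacheWrites` can be passed in explicitly.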
A single-responsibility utility for isolating the activeCacheWrites concurrency tracker. Use this if you are building middleware and want to avoid manual tracker management.
- `activeCacheWrites`: Optional. An external `Map<string, Promise<void>>`.
- Returns: A `fetchWithCache` variant bound to the tracker.
`fetchWithCache(request, fetcher, options)` is the low-level core coordination function.
- `request`: Web Standard `Request` object.
- `fetcher`: Origin request callback `(req: Request) => Promise<Response>`.
- `options.activeCacheWrites`: Required. Shared lock state Map.
- `options.cache`: `SmartCache` instance.
- `options.config`: `ProxySiteConfig` configuration.
- `options.backgroundUpdate`: Whether to enable SWR.
`SmartCache` is the core engine managing multi-tier hybrid storage.
- `new SmartCache(options)`
  - `options.maxMemorySize`: Threshold (in bytes) for storing response bodies in memory (L1). Bodies larger than this stream directly to disk (default `1048576`, i.e., 1MB).
  - `options.storagePath`: Physical path for the disk L2 cache (`cacache`).
`isMatch` is a universal matching function supporting Regex, Glob, Negation patterns, and simple strings.
- `pattern`: `string | RegExp | (string | RegExp)[]`
- `value`: The string to test.
- `usePrefix`: Whether to use prefix matching for simple strings (default: `false`).
- `defaultIfNoPositives`: Return value when no positive patterns match (default: `true`).
- `ignoreCase`: Whether to perform case-insensitive matching (default: `true`).

```typescript
import { isMatch } from '@isdk/proxy';

isMatch('/api/v[12]/.*', '/api/v1/users'); // Regex
isMatch('/api/**/*.json', '/api/v1/data.json'); // Glob
isMatch(['*', '!/private/**'], '/api/data'); // Negation: Allow all except private
isMatch(['!id'], 'id', false, false); // Returns false (no positive match)
```

`isGlob` checks if a string is a Glob pattern.
- `pattern`: `string`
- Returns: `boolean`

```typescript
import { isGlob } from '@isdk/proxy';

isGlob('/api/*.json'); // true
isGlob('/api/v1'); // false
```

`getSiteConfig` retrieves site-specific cache configuration based on the URL. It first tries to match hostnames or path prefixes in `sites`, otherwise falls back to `proxyConfig`.
- `urlString`: The complete request URL.
- `proxyConfig`: The `ProxyConfig` object containing `sites` and global rules.
- Returns: A `ProxySiteConfig` object.

```typescript
import { getSiteConfig } from '@isdk/proxy';

const config = getSiteConfig('https://api.example.com/data', {
  methods: ['GET'],
  sites: {
    'api.example.com': { forceCache: true }, // Hostname match
    '/internal/': { offline: true }          // Path prefix match
  }
});
```

`isAllowed` determines if a specific key (e.g., header name) is allowed in fingerprinting.
- `key`: The key name.
- `config`: `ProxyMatchPatterns` configuration.
- `defaultAllowed`: Default policy if no patterns match.

```typescript
import { isAllowed } from '@isdk/proxy';

isAllowed('id', ['id', 'name']); // true (Whitelist)
isAllowed('auth', ['*', '!auth']); // false (Blacklist)
isAllowed('other', ['!id'], false, false); // false (Default denied)
```

`extractData` extracts and normalizes data from source objects. Used for fingerprinting.
- `source`: The original data object.
- `config`: `ProxyFieldConfig` or `ProxyMatchPatterns`.
- `defaultAllowed`: Default extraction policy.

```typescript
import { extractData } from '@isdk/proxy';

const headers = { 'Content-Type': 'application/json', 'X-Token': 'abc' };

// Array mode: filter Keys
extractData(headers, ['content-type']); // { 'content-type': ['application/json'] }

// Object mode: precise Value matching
extractData(headers, {
  'content-type': '/^application\/.*/'
}); // { 'content-type': ['application/json'] }
```

`prefetch` populates the cache with a specified list of URLs.
- `options.urls`: `PrefetchRequest[]`.
- `options.config`: Full `ProxyConfig`.
- `options.cache`: `SmartCache` instance.
- `options.concurrency`: Concurrency limit (default `3`).
- `options.onProgress`: Progress callback `(completed, total, url) => void`.
- Returns: `Promise<PrefetchResult>`
  - `succeeded`: Number of successfully prefetched requests.
  - `failed`: Number of failures.
  - `errors`: List of failure details `{ url, error }[]`.

```typescript
import { prefetch } from '@isdk/proxy';

const result = await prefetch({
  urls: [{ url: 'https://api.com/page1' }],
  config,
  cache,
  onProgress: (c, t, url) => console.log(`${c}/${t}: ${url}`)
});
console.log(`Succeeded: ${result.succeeded}, Failed: ${result.failed}`);
```

When `offline: true` is enabled and the request misses the cache, a `Response` with status `512` is returned instead of throwing an error.
- `status`: `512` (Custom status code `OfflineCacheMissErrorCode`)
- `statusText`: `Offline mode: No cached response`

```typescript
const response = await myFetch(request, (req) => fetch(req));
if (response.status === OfflineCacheMissErrorCode) {
  // Handle cache miss
}
```

All `Response` objects returned by @isdk/proxy include an `x-proxy-cache` header for observability. This header provides granular status information:
- Core Hits:
  - `HIT`: Cache hit. Data served from L1 (memory) or L2 (disk).
  - `OFFLINE_HIT`: Served from cache in offline mode.
- Fetching & Updates:
  - `MISS`: Cache miss. Fetched from origin and successfully cached.
  - `STALE`: Stale hit. Served from cache while a background SWR update is triggered.
- Failovers:
  - `STALE_IF_ERROR`: Backend failed; serving stale cache as a fallback.
  - `STALE_RESCUE_{REASON}`: Disaster recovery protection. Served valid old cache when origin returned invalid data.
- Exclusion Reasons:
  - `MISS_EXCLUDED_REQUEST`: Request excluded by configuration rules.
  - `OFFLINE_MISS_EXCLUDED_REQUEST`: Offline mode, request excluded and no cache found.
  - `MISS_UNSTORABLE`: Response not cacheable (e.g., `Cache-Control: no-store`).
  - `MISS_EXCLUDED_{REASON}`: Response validation failed; data fetched but not cached.
Common {REASON} Suffixes:
| Suffix | Meaning |
|---|---|
| `WAF_CHALLENGE` | Explicitly detected WAF challenge page (via built-in or custom rules). |
| `TOO_SHORT` | Content length is less than the configured `minLength`. |
| `BODY_MATCH_FAILED` | Content failed body keyword matching (negation hit or positive miss). |
| `STATUS_MISMATCH_{CODE}` | Status code not in the allowed cache list (e.g., `STATUS_MISMATCH_503`). |
| `HEADERS_MISMATCH` | Response headers do not meet configuration requirements. |
| `BODY_READ_ERROR` | Error occurred while reading response body for analysis. |
| `UNKNOWN` | Other unspecified validation failure. |
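For logging or metrics it can help to split a header value into its base status and `{REASON}` suffix. The parser below is a hypothetical helper written against the status names listed above; `parseCacheStatus` is not part of the library's API.

```typescript
interface ParsedCacheStatus {
  status: string;  // base status, e.g. 'STALE_RESCUE'
  reason?: string; // optional {REASON} suffix, e.g. 'WAF_CHALLENGE'
}

// Statuses that begin with a reason-bearing prefix but are complete on their own.
const EXACT_STATUSES = new Set(['MISS_EXCLUDED_REQUEST', 'OFFLINE_MISS_EXCLUDED_REQUEST']);
const REASON_PREFIXES = ['STALE_RESCUE', 'MISS_EXCLUDED'];

function parseCacheStatus(header: string | null): ParsedCacheStatus | null {
  if (!header) return null;
  const value = header.trim();
  if (EXACT_STATUSES.has(value)) return { status: value };
  for (const prefix of REASON_PREFIXES) {
    if (value.startsWith(`${prefix}_`)) {
      return { status: prefix, reason: value.slice(prefix.length + 1) };
    }
  }
  return { status: value };
}

console.log(parseCacheStatus('HIT'));
console.log(parseCacheStatus('STALE_RESCUE_WAF_CHALLENGE'));
console.log(parseCacheStatus('MISS_EXCLUDED_STATUS_MISMATCH_503'));
console.log(parseCacheStatus('MISS_EXCLUDED_REQUEST'));
```

In application code the input would typically be `response.headers.get('x-proxy-cache')`, which is why `null` is accepted.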
To ensure consistency for downstream consumers, responses returned by fetchWithCache feature:
- URL Preservation: The `response.url` property correctly reflects the original request URL, even when served from cache.
- Clone Compatibility: Custom properties and headers are preserved when calling `response.clone()`.
This library uses the debug package. Enable internal tracing by setting the DEBUG environment variable:
```bash
# Trace all cache logic for fetch operations
DEBUG=@isdk/proxy:fetchWithCache node app.js

# Trace everything
DEBUG=@isdk/proxy:* node app.js
```

Logs cover configuration merging, fingerprinting, policy evaluation, SWR tasks, and response validation.
MIT