
myProjectsRavi/API

Production-Ready USDA Nutritional Data API

This project provides a robust, high-performance, and production-grade serverless API for accessing nutritional data from the USDA's FoodData Central. Built on Cloudflare Workers and TypeScript, it features a resilient architecture with a Cloudflare D1-powered caching layer, structured logging, and a full suite of tests.

🚀 New Features: Advanced Natural Language Processing

Now featuring a zero-cost, highly efficient natural language processing system:

🎯 Intelligent Query Parsing

  • Parse complex food queries with quantity, units, and preparation methods
  • Smart food recognition with local fuzzy matching
  • Context-aware processing for preparation methods

📏 Advanced Unit Handling

  • Standard measurements (g, kg, lb, oz)
  • Informal measurements (pinch, dash, handful)
  • Fraction support (1/2, quarter, half)
  • Range handling (2-3 tablespoons)
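As a rough illustration of the fraction and range handling listed above, the sketch below parses a single quantity token. The function name `parseQuantity`, the word table, and the choice to collapse a range to its midpoint are assumptions made for this example (the midpoint matches the 2.5 shown for "2-3 tablespoons" elsewhere in this README); the project's actual parser may work differently.

```typescript
// Hypothetical word-to-number table; the project's real vocabulary is not shown.
const FRACTION_WORDS = new Map<string, number>([
  ["half", 0.5],
  ["quarter", 0.25],
]);

/** Parses "2-3", "1/2", "half", or "100" into a single number, or null. */
function parseQuantity(token: string): number | null {
  const cleaned = token.trim().toLowerCase();

  const word = FRACTION_WORDS.get(cleaned);
  if (word !== undefined) return word;

  // Range such as "2-3": assume the midpoint is a reasonable single value.
  const range = cleaned.match(/^(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)$/);
  if (range) return (parseFloat(range[1]) + parseFloat(range[2])) / 2;

  // Fraction such as "1/2"; a zero denominator is rejected.
  const fraction = cleaned.match(/^(\d+)\s*\/\s*(\d+)$/);
  if (fraction) {
    const denominator = parseInt(fraction[2], 10);
    return denominator === 0 ? null : parseInt(fraction[1], 10) / denominator;
  }

  // Plain integer or decimal.
  return /^\d+(?:\.\d+)?$/.test(cleaned) ? parseFloat(cleaned) : null;
}
```

Returning null for unrecognized tokens lets the caller decide whether to fall back to a default amount or report an error.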

🧠 Smart Features (Zero-Cost Implementation)

  • Local fuzzy string matching for food recognition
  • Food substitution suggestions
  • Preparation method impact analysis
  • Nutritional context awareness
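One common way to do local fuzzy matching without any external service is Levenshtein edit distance. The sketch below is an illustration under that assumption; the README does not say which algorithm the project uses, and the names `levenshtein`, `similarity`, `bestMatch`, and the default threshold of 70 are hypothetical.

```typescript
/** Classic dynamic-programming Levenshtein edit distance. */
function levenshtein(a: string, b: string): number {
  let previous = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current = [i];
    for (let j = 1; j <= b.length; j++) {
      const substitution = previous[j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1);
      current.push(Math.min(previous[j] + 1, current[j - 1] + 1, substitution));
    }
    previous = current;
  }
  return previous[b.length];
}

/** Similarity as a 0-100 score; 100 means the strings are identical. */
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 100;
  return Math.round((1 - levenshtein(a, b) / longest) * 100);
}

/** Returns the closest known food name, or null if nothing reaches the threshold. */
function bestMatch(query: string, knownFoods: string[], threshold = 70): string | null {
  let best: string | null = null;
  let bestScore = threshold;
  for (const food of knownFoods) {
    const score = similarity(query.toLowerCase(), food.toLowerCase());
    if (score >= bestScore) {
      best = food;
      bestScore = score;
    }
  }
  return best;
}
```

With this scoring, "chedar cheese" is one edit away from "cheddar cheese", so a small list of known foods is enough to recover the intended name.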

💡 Example Queries

"100g of chedar cheese" 
→ Suggests "cheddar cheese" with alternatives

"2-3 tablespoons olive oil" 
→ Handles range and provides context

"grilled chicken breast" 
→ Includes preparation method impact

🚀 New Features: Zero-Cost Smart Natural Language Processing

Now featuring a sophisticated yet cost-efficient natural language processing system:

🎯 Intelligent Query Parsing

  • Parse complex food queries with smart entity recognition
  • Handle informal measurements and fractions
  • Support for preparation methods and modifiers
  • Efficient local fuzzy matching

📊 Smart Nutritional Context

  • Preparation method impact analysis
  • Food category recognition
  • Intelligent substitution suggestions
  • Serving recommendations

🔍 Enhanced Error Handling

  • Smart typo detection
  • Helpful suggestions for invalid queries
  • Context-aware error messages
  • Alternative recommendations

All of these features are implemented with zero external dependencies and no ongoing costs.

⚡ Phase 2: Performance Multipliers (NEW!)

Dramatic performance improvements with minimal setup:

🚀 USDA Batch API Service

  • Up to 20 foods in a single API call instead of 20 separate calls
  • Automatic request batching with intelligent queuing
  • 90% reduction in API calls for multi-item queries
  • Zero configuration required - works automatically
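The batching idea above can be sketched as splitting a list of FDC IDs into groups of at most 20. This is only an illustration: the helper name `chunkFdcIds` and the de-duplication step are assumptions, and the project's queuing logic is not shown in this README.

```typescript
// Batch size taken from the "up to 20 foods in a single API call" statement above.
const MAX_BATCH_SIZE = 20;

/** Splits FDC IDs into de-duplicated batches of at most `size` entries. */
function chunkFdcIds(fdcIds: number[], size: number = MAX_BATCH_SIZE): number[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  // Assumption: requesting the same ID twice in one batch is wasteful, so drop repeats.
  const unique = Array.from(new Set(fdcIds));
  const batches: number[][] = [];
  for (let i = 0; i < unique.length; i += size) {
    batches.push(unique.slice(i, i + size));
  }
  return batches;
}
```

For example, 45 distinct IDs would become three upstream calls (20, 20, and 5 IDs) instead of 45 separate ones.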

🔥 Hot Cache for Top 100 Foods

  • <5ms response time for most common queries
  • ~80% cache hit rate with just 100 entries
  • One-time seeding of popular foods
  • Automatic query frequency tracking
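Query frequency tracking can be pictured as a counter keyed by normalized query text, from which the most frequent entries are picked for the hot cache. The class below is a minimal in-memory sketch; `QueryFrequencyTracker` and its methods are hypothetical names, and the real project presumably persists these counts rather than keeping them in memory.

```typescript
/** Counts how often each normalized query is seen and reports the most frequent ones. */
class QueryFrequencyTracker {
  private counts = new Map<string, number>();

  /** Records one occurrence of a query (case and surrounding whitespace are ignored). */
  record(query: string): void {
    const key = query.trim().toLowerCase();
    this.counts.set(key, (this.counts.get(key) ?? 0) + 1);
  }

  /** Returns the `n` most frequent queries, ties broken alphabetically. */
  top(n: number): string[] {
    return Array.from(this.counts.entries())
      .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
      .slice(0, n)
      .map(([query]) => query);
  }
}
```

Calling `top(100)` periodically would give a candidate list for seeding or refreshing a 100-entry hot cache.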

📊 Performance Impact

  • Before Phase 2: 150ms avg, 2-3 API calls per request
  • After Phase 2: <10ms for 80% of queries, 88% fewer API calls
  • Cost Savings: Fewer upstream USDA API calls and less compute time per request

See PHASE_2_QUICKSTART.md for deployment instructions.

Example Queries & Responses

Basic Query

POST /api/natural-language-search
Content-Type: application/json

{
  "query": "100g of chiken brest"
}

Response:

{
  "parsed": {
    "quantity": 100,
    "unit": "g",
    "foodName": "chicken breast",
    "quantityInGrams": 100
  },
  "suggestions": [
    {
      "word": "chicken breast",
      "similarity": 85,
      "category": "meat",
      "alternatives": ["turkey breast", "tofu"]
    }
  ],
  "nutritionalContext": {
    "category": "meat",
    "preparation": {
      "suggested": ["grilled", "baked", "pan-fried"],
      "impact": {
        "grilled": {
          "calories": -5,
          "notes": ["Reduced fat content", "Minimal nutrient loss"]
        }
      }
    }
  }
}

Complex Query with Preparation

POST /api/natural-language-search
Content-Type: application/json

{
  "query": "2-3 tablespoons of extra virgin olive oil for cooking"
}

Response:

{
  "parsed": {
    "quantity": 2.5,
    "unit": "tablespoons",
    "foodName": "extra virgin olive oil",
    "quantityInGrams": 37.5,
    "preparation": "cooking"
  },
  "nutritionalContext": {
    "category": "oils",
    "preparation": {
      "method": "cooking",
      "notes": [
        "Better alternatives for high-heat cooking: regular olive oil, avocado oil",
        "Extra virgin olive oil best used unheated for dressings and finishing"
      ]
    }
  }
}
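The `quantityInGrams` field in the responses above implies a unit-to-gram conversion step. A minimal sketch follows. The mass factors are standard conversions, but the 15 g per tablespoon figure is only inferred from the example (2.5 tablespoons reported as 37.5 g); real gram weights for volume units depend on the food, and `toGrams` and the table are hypothetical names.

```typescript
// Grams per unit. Mass units are exact enough for illustration; the tablespoon
// entry is an assumption inferred from the example response above.
const GRAMS_PER_UNIT = new Map<string, number>([
  ["g", 1],
  ["kg", 1000],
  ["oz", 28.3495],
  ["lb", 453.592],
  ["tablespoons", 15],
]);

/** Converts a quantity in a known unit to grams, or returns null for unknown units. */
function toGrams(quantity: number, unit: string): number | null {
  const factor = GRAMS_PER_UNIT.get(unit.trim().toLowerCase());
  return factor === undefined ? null : quantity * factor;
}
```

Informal units such as "pinch" or "handful" would need their own, food-specific entries; returning null keeps that gap explicit.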

Architecture Overview

The API is designed for high availability and low latency by leveraging a serverless architecture on Cloudflare Workers and a persistent caching layer with Cloudflare D1.

Core Components

  • Cloudflare Worker (TypeScript): The core application logic runs on Cloudflare's global network, ensuring requests are handled close to the user.
  • itty-router: A lightweight, high-performance router for handling API endpoints within the worker.
  • Cloudflare D1: Serves as a persistent, external cache to store responses from the USDA API. This dramatically reduces latency for repeated requests and lessens the load on the upstream API.
  • Structured Logging: All log output is in a machine-readable JSON format, which is essential for effective monitoring and debugging in a production environment.

Request Lifecycle & Caching Strategy

The caching logic is central to the API's performance and resilience. It implements a stale-while-revalidate strategy.

  1. Incoming Request: A user requests data for a specific food_id.
  2. Cache Check (Read): The worker first queries the D1 database using the food_id as the cache key.
  3. Cache Hit: If a fresh (not expired) record is found, the worker immediately returns the cached data. This is indicated by an X-Cache-Status: HIT header.
  4. Cache Stale: If the data is found but has passed its ttl (Time-to-Live), it is considered "stale." The worker returns the stale data immediately (X-Cache-Status: STALE) and simultaneously triggers a background fetch to the USDA API to refresh the cache. This ensures the user gets a fast response while the cache is updated asynchronously.
  5. Cache Miss: If no record is found, the worker calls the external USDA FoodData Central API.
  6. Fetch & Parse: The worker fetches the raw data, validates it against a Zod schema, and transforms it into a clean, standardized JSON format.
  7. Cache Write: The newly fetched data is written to the D1 database with a ttl and a stale_while_revalidate period.
  8. Response: The worker returns the freshly fetched data to the user with an X-Cache-Status: MISS header.
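The freshness decision in the steps above can be sketched as a pure function that maps a cached record to an X-Cache-Status value. The field names (`storedAt`, `ttl`, `staleWhileRevalidate`) and the rule that a record older than `ttl + staleWhileRevalidate` is treated as a miss are assumptions for illustration; the actual D1 schema is defined in schema.sql and is not shown here.

```typescript
type CacheStatus = "HIT" | "STALE" | "MISS";

// Hypothetical shape of a cached row.
interface CacheRecord {
  storedAt: number; // epoch milliseconds when the record was written
  ttl: number; // seconds the record counts as fresh
  staleWhileRevalidate: number; // extra seconds it may be served while refreshing
}

/** Decides how a cached record (or its absence) should be treated at time `now`. */
function classifyCacheRecord(record: CacheRecord | null, now: number): CacheStatus {
  if (record === null) return "MISS";
  const ageSeconds = (now - record.storedAt) / 1000;
  if (ageSeconds <= record.ttl) return "HIT";
  // Past the TTL but inside the revalidation window: serve it, refresh in background.
  if (ageSeconds <= record.ttl + record.staleWhileRevalidate) return "STALE";
  // Assumption: anything older is too old to serve and is handled as a miss.
  return "MISS";
}
```

In a Worker, the STALE branch would typically return the cached body right away and hand the refresh to `ctx.waitUntil(...)` so that it completes after the response has been sent.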

Architecture Diagram


API Documentation

Health Check

A comprehensive endpoint to verify that the worker and all its dependencies (USDA API, D1) are running and responsive.

  • Endpoint: GET /health
  • Success Response (200 OK):
    {
      "status": "ok",
      "checks": {
        "usdaApi": { "status": "ok", "message": "USDA API is reachable." },
        "d1": { "status": "ok", "message": "D1 is reachable." },
        "apiKeyDb": { "status": "ok", "message": "API key D1 database is reachable (Cloudflare D1)." }
      }
    }
  • Error Response (503 Service Unavailable):
    {
      "status": "error",
      "checks": {
        "usdaApi": { "status": "error", "message": "USDA API is unreachable." },
        "d1": { "status": "ok", "message": "D1 is reachable." },
        "apiKeyDb": { "status": "ok", "message": "API key D1 database is reachable." }
      }
    }
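Judging from the two responses above, the overall status appears to be "ok" only when every individual check is "ok", with 503 returned otherwise. The sketch below shows that aggregation under this assumption; `buildHealthReport` is a hypothetical helper, not the project's actual handler.

```typescript
type CheckStatus = "ok" | "error";

interface HealthCheck {
  status: CheckStatus;
  message: string;
}

interface HealthReport {
  status: CheckStatus;
  checks: Record<string, HealthCheck>;
}

/** Combines individual dependency checks into a response body and HTTP status code. */
function buildHealthReport(
  checks: Record<string, HealthCheck>
): { httpStatus: number; body: HealthReport } {
  // Assumption: a single failing dependency marks the whole service as unavailable.
  const allOk = Object.keys(checks).every((key) => checks[key].status === "ok");
  return {
    httpStatus: allOk ? 200 : 503,
    body: { status: allOk ? "ok" : "error", checks },
  };
}
```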

Get Food Data

Retrieves detailed nutritional information for a specific food item by its FDC ID.

  • Endpoint: GET /food/:id
  • URL Parameters:
    • id (required): The FoodData Central ID of the food item.
  • Success Response (200 OK):
    • The response is a structured JSON object containing the most essential nutrients.
    • Example (GET /food/746782):
      {
        "fdcId": 746782,
        "description": "Cheese, cheddar, sharp",
        "calories": {
          "value": 404,
          "unit": "KCAL"
        },
        "protein": {
          "value": 24.9,
          "unit": "G"
        },
        "fat": {
          "value": 33.14,
          "unit": "G"
        },
        "carbohydrates": {
          "value": 1.28,
          "unit": "G"
        }
      }
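For TypeScript clients, the response above can be described with a small interface and checked at runtime before use. The type names and the guard functions below are illustrative assumptions based only on the example payload; the API may return additional fields.

```typescript
interface NutrientValue {
  value: number;
  unit: string;
}

// Shape inferred from the example response for GET /food/746782.
interface FoodSummary {
  fdcId: number;
  description: string;
  calories: NutrientValue;
  protein: NutrientValue;
  fat: NutrientValue;
  carbohydrates: NutrientValue;
}

function isNutrientValue(candidate: unknown): candidate is NutrientValue {
  if (typeof candidate !== "object" || candidate === null) return false;
  const fields = candidate as Record<string, unknown>;
  return typeof fields["value"] === "number" && typeof fields["unit"] === "string";
}

/** Runtime check that a parsed JSON body has the fields shown in the example. */
function isFoodSummary(candidate: unknown): candidate is FoodSummary {
  if (typeof candidate !== "object" || candidate === null) return false;
  const fields = candidate as Record<string, unknown>;
  return (
    typeof fields["fdcId"] === "number" &&
    typeof fields["description"] === "string" &&
    ["calories", "protein", "fat", "carbohydrates"].every((key) =>
      isNutrientValue(fields[key])
    )
  );
}
```

A client would call `isFoodSummary(await response.json())` and only read nutrient fields when the guard passes.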

Search for Foods

Searches for foods based on a query string. This endpoint is useful for finding foods by name or brand.

  • Endpoint: GET /v1/search
  • Authentication: Required (API Key)
  • Query Parameters:
    • query (required): The search term (e.g., "cheddar cheese").
    • dataType (optional): The type of food data (e.g., "Branded", "Foundation").
    • pageSize (optional): The number of results to return (default: 10).

API Response Structure

Understanding the /v1/search Response Structure

The /v1/search endpoint returns detailed nutritional information. The primaryFood object contains two main sets of data regarding serving size and nutrients:

  1. Reference Data (Based on USDA Standard):

    • referenceServing: This object always describes the standard 100g serving size used by the USDA FoodData Central database.
      • size: Always 100.
      • unit: Always "g".
    • referenceNutrients: This object contains the detailed nutritional values (protein, fat, calories, vitamins, etc.) corresponding exactly to the 100g referenceServing. This provides a consistent baseline for comparison across different foods.
  2. Calculated Data (Based on Your Query):

    • calculatedAmount: This object provides details about the specific amount calculated based on your input query (quantity, unit, totalGramWeight).
      • If your query included a quantity and unit (e.g., "3 apples", "200g rice"), this section details how the total gram weight was determined (e.g., which portion size was matched, the weight per unit, and the final totalGramWeight).
      • If your query did not include a quantity and unit (e.g., "apple"), this section defaults to reflecting the 100g reference amount (totalGramWeight: 100).
    • calculatedNutrients: This object contains the nutritional values scaled to match the totalGramWeight shown in calculatedAmount.
      • For a query like "3 apples", these nutrients will reflect the total for ~600g (or whatever the calculated weight is).
      • For a query like "apple", these nutrients will be identical to referenceNutrients (reflecting the 100g default).

Why Both? This structure gives you flexibility:

  • Use referenceNutrients if you always need data per 100g for comparisons.
  • Use calculatedNutrients if you need the nutritional information for the specific amount requested in the user's query.

Example 1: Query apple

{
  "query": "apple",
  "parsed": { "quantity": null, "unit": null, "food": "apple" },
  "primaryFood": {
    // ... other fields
    "referenceServing": { "size": 100, "unit": "g" },
    "referenceNutrients": { "calories": { "value": 61, /* ... */ } },
    "calculatedAmount": { "totalGramWeight": 100, /* ... */ },
    "calculatedNutrients": { "calories": { "value": 61, /* ... */ } } // Same as reference
    // ...
  }
}

Example 2: Query 3 apples

{
  "query": "3 apples",
  "parsed": { "quantity": 3, "unit": "apple", "food": "apple" },
  "primaryFood": {
    // ... other fields
    "referenceServing": { "size": 100, "unit": "g" },
    "referenceNutrients": { "calories": { "value": 61, /* ... */ } }, // Per 100g
    "calculatedAmount": { "totalGramWeight": 600, /* based on 3 * 200g/apple */ },
    "calculatedNutrients": { "calories": { "value": 366, /* Scaled: 61 * 6 */ } } // Scaled to 600g
    // ...
  }
}
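The relationship between referenceNutrients and calculatedNutrients in the two examples is a linear scaling by totalGramWeight / 100. A minimal sketch, assuming each nutrient is a `{ value, unit }` pair and that rounding to two decimals is acceptable (both are assumptions; `scaleNutrients` is a hypothetical helper):

```typescript
interface NutrientAmount {
  value: number;
  unit: string;
}

/** Scales per-100g nutrient values to the given total gram weight. */
function scaleNutrients(
  referenceNutrients: Record<string, NutrientAmount>,
  totalGramWeight: number
): Record<string, NutrientAmount> {
  const factor = totalGramWeight / 100; // reference values are per 100 g
  const scaled: Record<string, NutrientAmount> = {};
  for (const key of Object.keys(referenceNutrients)) {
    const nutrient = referenceNutrients[key];
    scaled[key] = {
      value: Math.round(nutrient.value * factor * 100) / 100,
      unit: nutrient.unit,
    };
  }
  return scaled;
}
```

With 61 kcal per 100 g and a totalGramWeight of 600, this gives 366 kcal, matching Example 2; with the default 100 g it returns the reference values unchanged, matching Example 1.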

Natural Language Search

Performs a search using a natural language query to identify a food and its quantity.

  • Endpoint: POST /v1/natural-language-search
  • Authentication: Required (API Key – Free or Pro)
  • Body:
    • text (string, required): A natural language query (e.g., "100g of cheddar cheese").
    • maxResults, confidence, filterForSuggestions (optional): Advanced controls for USDA lookups.
  • Success Response (200 OK):
    {
      "query": "100g of cheddar cheese",
      "foods": [
          {
              "description": "Cheese, cheddar, sharp",
              "category": "Branded",
              "nutrients": {
                  "Protein": {
                      "value": 22.87,
                      "unit": "G"
                  },
                  "Fat": {
                      "value": 33.82,
                      "unit": "G"
                  },
                  "Carbohydrates": {
                      "value": 2.77,
                      "unit": "G"
                  },
                  "Energy": {
                      "value": 411,
                      "unit": "KCAL"
                  }
              }
          }
      ]
    }

Premium AI Natural Language Search (Pro Tier)

Unlock the Workers AI-powered parser for more nuanced, multi-item meal descriptions.

  • Endpoint: POST /v2/ai-natural-language-search
  • Authentication: Requires a Pro tier API key
  • Body:
    • text (string, required): Meal description (max 500 characters)
    • Optional knobs: maxResults, confidence, filterForSuggestions
  • What you get:
    • AI-interpreted items with unit normalization and gram estimates
    • USDA-backed search results with confidence scores
    • Response meta showing cache status and model identifier (@cf/meta/llama-2-7b-chat-int8)
  • Generate a Pro key: GET /_admin/generate-key?tier=pro

Getting Started & Deployment

Follow these steps to set up and deploy the worker.

Prerequisites

Step 1: Set Up the D1 Database

  1. Create the D1 Database:
    • In the Cloudflare dashboard, create a new D1 database.
    • Bind it to your worker in wrangler.toml with the binding name DB.
  2. Run the Schema:
    • Use Wrangler to execute the schema.sql file, which creates the tables for caching and API key management in Cloudflare D1.
# Example: apply schema.sql to your D1 database.
# Replace <DATABASE_NAME> with the database name configured in wrangler.toml;
# on recent Wrangler versions, add --remote to target the deployed database.
wrangler d1 execute <DATABASE_NAME> --file=schema.sql

Step 2: Configure Cloudflare Secrets

Secrets are used to store sensitive data like API keys and credentials. They are encrypted and cannot be viewed after being set.

# 1. USDA API Key (get one from https://api.nal.usda.gov/)
wrangler secret put USDA_API_KEY

# 2. Cloudflare D1 for API key management (configuration, not a secret)
#    Create the D1 database and bind it in wrangler.toml as API_KEYS_DB.
#    The project stores API key metadata and validation data in Cloudflare D1.
#    Optionally, create a KV namespace called API_KEY_CACHE_KV for short-lived API key lookup caching.

# 3. Admin token for protected endpoints
wrangler secret put ADMIN_TOKEN

For local development, create a .dev.vars file in the project root and add your secrets there.

Step 3: Local Development & Deployment

  1. Install Dependencies:
    npm install
  2. Run Locally:
    npm run dev
  3. Deploy to Cloudflare:
    npm run deploy

Testing

The project includes a comprehensive test suite using vitest.

  • Unit & Integration Tests: Located in the tests/ directory, they cover individual functions and the complete request/response flow by mocking external services.

To run the full test suite:

npm test

Code Quality

Input Validation & Security

Comprehensive Input Validation

All API endpoints validate incoming data using zod schemas. This ensures:

  • Type safety (e.g., string, number, object)
  • Required fields are present
  • Length and format constraints
  • Consistent error responses

Example validation (TypeScript):

import { z } from 'zod';
const NaturalLanguageSearchSchema = z.object({
  query: z.string().min(1).max(100),
});

All validation errors return a structured JSON error response with details.

Input Sanitization & SQL Injection Protection

All user-supplied inputs used in database queries are sanitized using a strict allowlist of safe characters. Because Cloudflare D1 is a SQL (SQLite-based) database, this helps prevent SQL injection attacks.

Example sanitization:

import { sanitize } from './utils/sanitizer';
const safeKeyId = sanitize(keyId);

Sanitization is applied before any database query, including API key lookups, quota/rate checks, and admin actions.
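An allowlist sanitizer of the kind described above can be as small as one regular expression. The implementation of the project's `sanitize` helper is not shown in this README, so the function below is a hypothetical stand-in: the name `sanitizeKeyId`, the allowed character set, and the 64-character cap are all assumptions.

```typescript
/**
 * Keeps only characters from a strict allowlist (letters, digits, "_" and "-")
 * and caps the length. Everything else is dropped rather than escaped.
 */
function sanitizeKeyId(input: string, maxLength = 64): string {
  return input.replace(/[^A-Za-z0-9_-]/g, "").slice(0, maxLength);
}
```

Sanitization works best as a second line of defense: D1 also supports parameterized statements via `prepare(...)` and `bind(...)`, which keep user input out of the SQL text entirely.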

Security Best Practices

  • All secrets and credentials are stored using Cloudflare Secrets (never in code or committed files; the local .dev.vars file is for development only and should stay out of version control).
  • Structured logging redacts sensitive headers and tokens.
  • All error responses use a consistent ErrorResponse model.
  • Rate limiting and quota enforcement are applied to all endpoints.

Linting & Formatting

  • ESLint: Enforces code quality and best practices.
  • Prettier: Ensures consistent code formatting.

To check for linting errors:

npm run lint

To automatically format all code:

npm run format

Logging Privacy & Retention

This project emits structured JSON logs intended for machine parsing by observability systems. When deploying to production, follow these guidelines to protect user privacy and to control costs:

  • Redact sensitive headers and tokens before emitting logs (the worker uses sanitizeHeaders to redact Authorization, cookie headers, and similar values).
  • Avoid logging full request bodies unless strictly necessary; if you must log request bodies, mask PII (emails, phone numbers, SSNs) and truncate long content.
  • Implement log retention policies in your logging backend (for example: keep detailed logs for 30 days, aggregated metrics for 365 days).
  • Consider sampling high-volume, low-value logs (such as repeated 400-level client errors) to reduce cost and noise.
  • Ensure logs are transmitted over TLS and stored encrypted at rest in your logging backend.

These guidelines reduce the risk of accidental PII exposure and help maintain cost-effective observability.
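Header redaction of the kind mentioned in the first guideline can be sketched as follows. This is not the project's `sanitizeHeaders` implementation, which is not shown here; `redactHeaders`, the list of sensitive header names, and the "[REDACTED]" placeholder are assumptions for illustration.

```typescript
// Header names treated as sensitive in this sketch (compared case-insensitively).
const SENSITIVE_HEADERS = new Set(["authorization", "cookie", "set-cookie", "x-api-key"]);

/** Returns a copy of the headers that is safe to include in a log entry. */
function redactHeaders(headers: Record<string, string>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const key of Object.keys(headers)) {
    safe[key] = SENSITIVE_HEADERS.has(key.toLowerCase()) ? "[REDACTED]" : headers[key];
  }
  return safe;
}
```

Returning a copy instead of mutating the input keeps the original headers available for request handling while the redacted version goes to the logger.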


API Usage Examples

Here are some examples of how to use the API in different programming languages.

JavaScript (Node.js)

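// Requires Node.js 18+, which ships a global fetch (no node-fetch import needed).
const apiKey = 'YOUR_API_KEY';
const foodId = '746782'; // Example: Cheddar Cheese

fetch(`https://your-worker.your-domain.workers.dev/food/${foodId}`, {
  headers: {
    'x-api-key': apiKey,
  },
})
  .then(response => {
    // Surface non-2xx responses (e.g., 401, 429) instead of parsing them as data.
    if (!response.ok) {
      throw new Error(`Request failed: ${response.status}`);
    }
    return response.json();
  })
  .then(data => console.log(data))
  .catch(error => console.error('Error:', error));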

Python

import requests

api_key = 'YOUR_API_KEY'
food_id = '746782' # Example: Cheddar Cheese
url = f'https://your-worker.your-domain.workers.dev/food/{food_id}'

headers = {
    'x-api-key': api_key
}

response = requests.get(url, headers=headers)

if response.status_code == 200:
    print(response.json())
else:
    print(f"Error: {response.status_code}, {response.text}")

Getting Started Guide

  1. Obtain an API Key: Contact our sales team at sales@example.com to get your API key.
  2. Making Requests: All requests must include your API key in the x-api-key header.
  3. Response Format: All successful responses will be in JSON format. Errors will also be returned as JSON with an appropriate status code.

Pricing Model

We offer the following tiers for our API:

Tier       | Price     | Requests/Month
Free       | $0/month  | 1,000
Pro        | $50/month | 100,000
Enterprise | Custom    | Custom
  • Premium Features: The POST /v2/ai-natural-language-search endpoint and future AI add-ons are available to Pro keys (and above) only. Requests from Free keys return 403 Forbidden.

For more details, please visit our pricing page at example.com/pricing.


Rate Limiting

This API enforces both global and endpoint-specific rate limits based on your API key tier (e.g., free, pro).

How It Works

  • Global Tier Limit: Each API key tier has a default global limit (e.g., 100 requests/min for free tier).
  • Endpoint-Specific Limit: Some endpoints (e.g., /food/search) may have stricter limits (e.g., 20 requests/min for free tier).
  • The middleware checks for an endpoint-specific limit first; if none is set, it falls back to the global tier limit.

Example Rate Limit Config

rateLimits: {
  free: {
    global: { maxRequests: 100, windowMs: 60000 },
    endpoints: {
      '/food/search': { maxRequests: 20, windowMs: 60000 },
      '/admin/stats': { maxRequests: 5, windowMs: 60000 }
    }
  },
  pro: {
    global: { maxRequests: 1000, windowMs: 60000 },
    endpoints: {
      '/food/search': { maxRequests: 200, windowMs: 60000 }
    }
  }
}
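The lookup order described under "How It Works" (endpoint-specific limit first, then the tier's global limit) can be sketched against the config above. The function name `resolveRateLimit` and the decision to return null for an unknown tier are assumptions; the actual middleware is not shown in this README.

```typescript
interface RateLimit {
  maxRequests: number;
  windowMs: number;
}

interface TierLimits {
  global: RateLimit;
  endpoints: Record<string, RateLimit>;
}

// Same values as the example config above.
const rateLimits: Record<string, TierLimits> = {
  free: {
    global: { maxRequests: 100, windowMs: 60000 },
    endpoints: {
      "/food/search": { maxRequests: 20, windowMs: 60000 },
      "/admin/stats": { maxRequests: 5, windowMs: 60000 },
    },
  },
  pro: {
    global: { maxRequests: 1000, windowMs: 60000 },
    endpoints: {
      "/food/search": { maxRequests: 200, windowMs: 60000 },
    },
  },
};

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

/** Returns the endpoint-specific limit if one exists, else the tier's global limit. */
function resolveRateLimit(tier: string, path: string): RateLimit | null {
  if (!hasOwn(rateLimits, tier)) return null; // unknown tier: caller decides what to do
  const tierLimits = rateLimits[tier];
  return hasOwn(tierLimits.endpoints, path) ? tierLimits.endpoints[path] : tierLimits.global;
}
```

For instance, a Free key calling /food/search resolves to 20 requests per minute, while a Pro key calling /admin/stats falls back to the Pro global limit of 1000.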

Rate Limit Headers

Every response includes headers to help you track your usage:

  • X-RateLimit-Limit: Maximum requests allowed in the window
  • X-RateLimit-Remaining: Requests remaining in the current window
  • X-RateLimit-Reset: Time (in seconds) until the window resets
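As an illustration of how these three headers relate to each other, the sketch below derives them from a simple fixed-window counter. The fixed-window model, the function name `rateLimitHeaders`, and its parameters are assumptions; the README does not state which windowing algorithm the middleware uses.

```typescript
/**
 * Builds the rate limit headers for a fixed window that started at `windowStartMs`.
 * `used` is the number of requests already counted in the current window.
 */
function rateLimitHeaders(
  maxRequests: number,
  used: number,
  windowStartMs: number,
  windowMs: number,
  nowMs: number
): Record<string, string> {
  const remaining = Math.max(0, maxRequests - used);
  // Seconds until the window resets, rounded up and never negative.
  const resetSeconds = Math.max(0, Math.ceil((windowStartMs + windowMs - nowMs) / 1000));
  return {
    "X-RateLimit-Limit": String(maxRequests),
    "X-RateLimit-Remaining": String(remaining),
    "X-RateLimit-Reset": String(resetSeconds),
  };
}
```

Clients can use X-RateLimit-Remaining to slow down before hitting the limit and X-RateLimit-Reset to schedule the next attempt.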

Error Response (429 Too Many Requests)

If you exceed your rate limit, you will receive:

{
  "statusCode": 429,
  "error": "Rate limit exceeded. Please try again in 30 seconds.",
  "details": [
    { "field": "Retry-After", "value": "30" }
  ]
}

