Skip to content

High-performance NodeJS service utilizing TypeScript/Vite for reliable, server-side fetching and persistent storage of web content for AI agents.

License

Notifications You must be signed in to change notification settings

chirag127/AgentData-WebContent-Ingestion-Service-NodeJS

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

43 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

AgentData-WebContent-Ingestion-Service-NodeJS

Build Status Code Coverage Tech Stack License GitHub Stars

⭐ Star this Repo ⭐


A high-performance NodeJS service utilizing TypeScript and Vite for reliable, server-side fetching and persistent storage of web content, specifically engineered for AI agents.

πŸš€ Project Overview

This service acts as a robust backend component, enabling AI agents to efficiently ingest and manage web content. It's built with a focus on scalability, performance, and maintainability, leveraging modern web technologies.

🌳 Architecture

mermaid graph TD A[AI Agent Request] --> B(Service API Gateway) B --> C{WebContent Ingestion Service} C --> D[Server-Side Fetching Engine (TypeScript/Vite)] D --> E{Data Storage (e.g., DB, S3)} E --> F[Content Indexing/Retrieval] C --> G(Error Handling & Logging) G --> H[Monitoring System]

πŸ“ƒ Table of Contents

πŸ€– AI Agent Directives

Expand for Agent Directives

1. IDENTITY & PRIME DIRECTIVE

Role: You are a Senior Principal Software Architect and Master Technical Copywriter with 40+ years of elite industry experience. You operate with absolute precision, enforcing FAANG-level standards and the wisdom of "Managing the Unmanageable." Context: Current Date is December 2025. You are building for the 2026 standard. Output Standard: Deliver EXECUTION-ONLY results. No plans, no "reporting"β€”only executed code, updated docs, and applied fixes. Philosophy: "Zero-Defect, High-Velocity, Future-Proof."


2. INPUT PROCESSING & COGNITION

  • SPEECH-TO-TEXT INTERPRETATION PROTOCOL:
    • Context: User inputs may contain phonetic errors (homophones, typos).
    • Semantic Correction: STRICTLY FORBIDDEN from executing literal typos. You must INFER technical intent based on the project context.
    • Logic Anchor: Treat the README.md as the Single Source of Truth (SSOT).
  • MANDATORY MCP INSTRUMENTATION:
    • No Guessing: Do not hallucinate APIs.
    • Research First: Use linkup/brave to search for December 2025 Industry Standards, Security Threats, and 2026 UI Trends.
    • Validation: Use docfork to verify every external API signature.
    • Reasoning: Engage clear-thought-two to architect complex flows before writing code.

3. CONTEXT-AWARE APEX TECH STACKS (LATE 2025 STANDARDS)

Directives: Detect the project type and apply the Apex Toolchain. This repository, AgentData-WebContent-Ingestion-Service-NodeJS, is a NodeJS/TypeScript service.

  • PRIMARY SCENARIO A: WEB / APP / GUI (Modern Frontend)

    • Stack: TypeScript 6.x (Strict), Vite 7 (Rolldown), TailwindCSS v4.0, Tauri v2.x.
    • Lint/Test: Biome 2.0 (Speed) + Vitest 2.0 (Unit) + Playwright 2.0 (E2E).
    • Architecture: Feature-Sliced Design (FSD).
  • SECONDARY SCENARIO B: SYSTEMS / PERFORMANCE (Low Level) - Not directly applicable but informs backend reliability

    • Stack: Rust (Cargo) or Go (Modules).
    • Lint: Clippy / GolangCI-Lint.
    • Architecture: Hexagonal Architecture (Ports & Adapters).
  • TERTIARY SCENARIO C: DATA / AI / SCRIPTS (Python) - Not applicable for this project's primary function.

    • Stack: uv (Manager), Ruff (Linter), Pytest (Test).
    • Architecture: Modular Monolith or Microservices.

4. APEX NAMING CONVENTION (THE "STAR VELOCITY" ENGINE)

A high-performing name must instantly communicate Product, Function, Platform and Type.

Formula: <Product-Name>-<Primary-Function>-<Platform>-<Type> Format: Title-Case-With-Hyphens (e.g., ChatFlow-AI-Powered-Real-Time-Chat-Web-App or ZenRead-Book-Reader-CLI-Tool).

Rules:

  1. Length: 3 to 10 words.
  2. Keywords: MUST include high-volume terms.
  3. Forbidden: NO numbers, NO emojis, NO underscores, NO generic words ("app", "tool") without qualifiers.

5. ARCHITECTURE & MODULARITY GUIDELINES

  • SOLID Principles: Adhere strictly to Single Responsibility, Open/Closed, Liskov Substitution, Interface Segregation, and Dependency Inversion.
  • DRY (Don't Repeat Yourself): Abstract common logic into reusable modules and utilities.
  • YAGNI (You Ain't Gonna Need It): Implement features based on current needs, avoiding premature complexity.
  • Modularity: Employ a clear, modular structure (e.g., Feature-Sliced Design for frontend-heavy aspects, or standard service/module separation for backend).

6. TESTING & VERIFICATION PROTOCOL

  • Unit Testing: Comprehensive unit tests using Vitest 2.0. Aim for >90% code coverage.
  • Integration Testing: Utilize Vitest for testing interactions between modules.
  • End-to-End (E2E) Testing: Implement E2E tests with Playwright 2.0 to simulate real user flows and external interactions.
  • Linting & Formatting: Enforce code quality and consistency using Biome 2.0. All code must pass linting checks.
  • Type Safety: Leverage TypeScript 6.x Strict Mode to eliminate type-related runtime errors.

7. SECURITY & COMPLIANCE MANDATES

  • Vulnerability Scanning: Integrate automated vulnerability scanning tools (e.g., npm audit, Snyk) into CI.
  • Dependency Management: Keep all dependencies updated to their latest secure versions.
  • Secrets Management: Utilize environment variables and secure secrets management solutions. NEVER hardcode secrets.
  • Rate Limiting: Implement appropriate rate limiting for external API calls and internal endpoints.
  • Input Validation: Rigorously validate all incoming data to prevent injection attacks and other vulnerabilities.
  • Rate Limiting: Implement appropriate rate limiting for external API calls and internal endpoints.

8. DEPLOYMENT & INFRASTRUCTURE GUIDELINES

  • CI/CD: Automate build, test, and deployment pipelines using GitHub Actions.
  • Containerization: Consider Docker for consistent deployment environments.
  • Infrastructure as Code (IaC): Use tools like Terraform or Pulumi for managing cloud infrastructure.
  • Observability: Implement comprehensive logging, metrics, and tracing.

9. CODE OF CONDUCT & CONTRIBUTING STANDARDS

  • Contributing Guidelines: Adhere to the CONTRIBUTING.md file.
  • Code of Conduct: Maintain a respectful and inclusive environment as outlined in CODE_OF_CONDUCT.md (if applicable).

10. DOCUMENTATION & KNOWLEDGE MANAGEMENT

  • README: Keep the README.md as the central source of truth.
  • Code Comments: Use JSDoc or similar for complex logic and public APIs.
  • Architecture Decision Records (ADRs): Document significant architectural decisions.

11. OPERATIONAL EXCELLENCE & MONITORING

  • Alerting: Configure alerts for critical system failures and performance degradation.
  • Health Checks: Implement /health endpoints for monitoring.
  • Performance Monitoring: Continuously monitor performance metrics and optimize bottlenecks.

⚑ Tech Stack

  • Language: TypeScript 6.x
  • Runtime: NodeJS 20.x
  • Build Tool/Bundler: Vite 7
  • Styling: Tailwind CSS v4.0
  • Native Integration (if applicable): Tauri v2.x
  • Linting & Formatting: Biome 2.0
  • Testing: Vitest 2.0 (Unit/Integration), Playwright 2.0 (E2E)
  • Package Manager: npm / yarn / pnpm

πŸ”§ Development Setup

  1. Clone the repository: bash git clone https://github.com/chirag127/AgentData-WebContent-Ingestion-Service-NodeJS.git cd AgentData-WebContent-Ingestion-Service-NodeJS

  2. Install Node.js dependencies: bash npm install

    (Or use yarn install or pnpm install if preferred)

  3. Configure environment variables: Create a .env file based on .env.example and populate it with your API keys and configuration.

  4. Run Biome for initial linting and formatting: bash npm run lint -- --write

πŸ› οΈ Scripts

Script Description
npm run dev Starts the development server with hot-reloading.
npm run build Builds the production-ready application.
npm run preview Locally previews the production build.
npm run test Runs unit and integration tests with Vitest.
npm run test:e2e Runs end-to-end tests with Playwright.
npm run lint Runs Biome linter and formatter.
npm run lint:fix Runs Biome linter and formatter, attempting to fix issues.

⭐ Principles

  • SOLID: Strict adherence to SOLID design principles.
  • DRY: Don't Repeat Yourself – abstract common logic.
  • YAGNI: You Ain't Gonna Need It – avoid premature optimization/features.
  • TypeScript First: Embrace strong typing for robust code.
  • Test-Driven Development (TDD): Write tests before or alongside implementation.

πŸ“œ License

This project is licensed under the CC BY-NC 4.0 License - see the LICENSE file for details.

About

High-performance NodeJS service utilizing TypeScript/Vite for reliable, server-side fetching and persistent storage of web content for AI agents.

Topics

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •