mocra

A distributed, event-driven crawling and data collection framework for Rust.

Simple

Documentation

  • version 0.2.1

  • Major updates (v0.2.1):

    • Rebuilt a unified node-level metrics system with canonical mocra_* metric families.
    • Integrated key runtime paths (engine, downloader, queue, DAG scheduler, monitor, runner) into unified throughput/latency/error/backlog/inflight reporting.
    • A Prometheus endpoint is available at GET /metrics, and a local Prometheus + Grafana stack can be started with docker-compose.monitoring.yml.
    • Added a real-request benchmark entry in tests: httpbin_prometheus_benchmark (default: 1000 requests at a 1-second interval) for validating metrics against observable traffic.
  • Major updates (v0.2.0):

    • Distributed DAG execution is now a first-class runtime capability.
    • ModuleTrait now supports DAG-oriented execution, with linear compatibility kept for existing modules.
  • Distributed DAG runtime (v0.2.0):

    • Added distributed DAG scheduler capabilities for parallel and resilient execution.
    • Added remote dispatch and worker model for cross-node DAG node execution.
    • Added run guard and fencing token protection for safer distributed consistency.
    • Added run-state checkpoint and resume support for failure recovery.
    • Added singleflight/idempotency controls to reduce duplicate remote execution.
  • ModuleTrait DAG support (v0.2.0):

    • Added DAG-oriented module orchestration integration in engine task flow.
    • Added linear-compatible DAG compile/execute path for incremental migration.
    • Added shadow/preview/cutover governance hooks for gradual rollout.
  • Validation and rollout confidence (v0.2.0):

    • Added end-to-end coverage for Engine-driven and task-queue-driven DAG + ModuleTrait flows.
    • Added dual-node and single-node runtime regression coverage for critical execution paths.
  • Previous release notes:

    • Config precedence fix (v0.1.5):

      • Effective config now consistently follows: ORM/module config > config.toml > hardcoded default.
      • Fixed fallback behavior for module_locker and wss_timeout when ORM value is missing.
      • Fixed boolean fallback in download config loading to inherit from config.toml when module value is absent.
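The precedence rule from v0.1.5 can be sketched as a small layered lookup. This is a minimal illustration, not mocra's actual loader: the `Layer` and `Effective` structs, the `resolve` helper, the field types (`u64`, `bool`), and the default values (`30`, `false`) are all assumptions made for the example.

```rust
/// Resolve one setting using the documented precedence:
/// ORM/module config > config.toml > hardcoded default.
fn resolve<T>(orm: Option<T>, toml: Option<T>, default: T) -> T {
    orm.or(toml).unwrap_or(default)
}

/// Hypothetical view of one configuration layer. Every field is optional so
/// that "not set" can be told apart from an explicit value such as `false`.
#[derive(Default)]
struct Layer {
    wss_timeout: Option<u64>,    // type is an assumption
    module_locker: Option<bool>, // type is an assumption
}

/// Hypothetical fully resolved settings.
struct Effective {
    wss_timeout: u64,
    module_locker: bool,
}

/// Merge the ORM layer over the config.toml layer over placeholder defaults.
fn effective(orm: &Layer, toml: &Layer) -> Effective {
    Effective {
        // 30 and false are placeholder defaults, not mocra's real ones.
        wss_timeout: resolve(orm.wss_timeout, toml.wss_timeout, 30),
        module_locker: resolve(orm.module_locker, toml.module_locker, false),
    }
}
```

Modelling booleans as `Option<bool>` rather than `bool` is what lets a missing module value fall through to config.toml instead of being read as `false`.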
    • Middleware interface update (v0.1.2):

      • DownloadMiddleware / DataMiddleware / DataStoreMiddleware methods now use &mut self.
      • Added store lifecycle hooks in DataStoreMiddleware:
        • before_store(&mut self, _config: &Option<ModuleConfig>) -> Result<()>
        • after_store(&mut self, _config: &Option<ModuleConfig>) -> Result<()>
      • Store execution order is now unified as:
        1. before_store
        2. store_data
        3. after_store
      • If before_store fails, store_data and after_store are skipped and the middleware enters the existing error/retry flow.
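The store ordering described for v0.1.2 can be sketched with a simplified stand-in trait. Only the `before_store` and `after_store` signatures come from the notes above; the `StoreHooks` trait, the `store_data` signature, the `String` error type, the `run_store` driver, and the `Recorder` type are assumptions for illustration, and the real middleware may well be async. The sketch also skips `after_store` when `store_data` fails, which the notes do not specify.

```rust
/// Stand-ins for mocra's own types (assumptions for this sketch).
struct ModuleConfig;
type Result<T> = std::result::Result<T, String>;

/// Hypothetical, simplified stand-in for `DataStoreMiddleware`.
trait StoreHooks {
    fn before_store(&mut self, _config: &Option<ModuleConfig>) -> Result<()> {
        Ok(())
    }
    // The real `store_data` signature is not given in the notes; assumed here.
    fn store_data(&mut self, _config: &Option<ModuleConfig>) -> Result<()>;
    fn after_store(&mut self, _config: &Option<ModuleConfig>) -> Result<()> {
        Ok(())
    }
}

/// Runs the three phases in order. `?` returns at the first failure, so a
/// failing `before_store` means `store_data` and `after_store` never run.
fn run_store<M: StoreHooks>(mw: &mut M, config: &Option<ModuleConfig>) -> Result<()> {
    mw.before_store(config)?;
    mw.store_data(config)?;
    mw.after_store(config)
}

/// Test double that records which hooks were called.
struct Recorder {
    calls: Vec<&'static str>,
    fail_before: bool,
}

impl StoreHooks for Recorder {
    fn before_store(&mut self, _config: &Option<ModuleConfig>) -> Result<()> {
        self.calls.push("before_store");
        if self.fail_before {
            Err("before_store failed".to_string())
        } else {
            Ok(())
        }
    }
    fn store_data(&mut self, _config: &Option<ModuleConfig>) -> Result<()> {
        self.calls.push("store_data");
        Ok(())
    }
    fn after_store(&mut self, _config: &Option<ModuleConfig>) -> Result<()> {
        self.calls.push("after_store");
        Ok(())
    }
}
```

With `fail_before` set, `Recorder` records only `before_store`; the returned `Err` is what a caller would hand to its error/retry flow.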
    • ParserData update (v0.1.1): ParserData.parser_task changed from Option<ParserTaskModel> to Vec<ParserTaskModel>.

      • Parser can now return multiple next parser tasks in a single round.
      • Existing with_task(...) remains available (now appends to the vector).
      • Recommended check for "has next task": use !parser_task.is_empty().
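The v0.1.1 change from `Option<ParserTaskModel>` to `Vec<ParserTaskModel>` can be illustrated with stand-in types. The struct layouts, the `name` field, the builder-style `with_task` signature, and the `has_next_task` helper below are assumptions for the example; only the field name `parser_task`, its `Vec` type, the appending behaviour, and the `is_empty` check come from the notes above.

```rust
/// Stand-in for mocra's `ParserTaskModel`; the `name` field is hypothetical.
#[derive(Debug, Clone, PartialEq)]
struct ParserTaskModel {
    name: String,
}

/// Stand-in for `ParserData` after v0.1.1: a vector instead of an `Option`.
#[derive(Debug, Default)]
struct ParserData {
    parser_task: Vec<ParserTaskModel>,
}

impl ParserData {
    /// Assumed builder-style signature; each call appends rather than replaces.
    fn with_task(mut self, task: ParserTaskModel) -> Self {
        self.parser_task.push(task);
        self
    }
}

/// The recommended "has next task" check, replacing the old `is_some()`.
fn has_next_task(data: &ParserData) -> bool {
    !data.parser_task.is_empty()
}
```

Code that previously matched on `Some(task)` would typically iterate over `parser_task` instead, handling zero, one, or many follow-up tasks uniformly.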
  • Project docs: docs/README.md

  • Unified metrics docs: docs/unified_metrics.md

  • DAG docs index: docs/dag/DAG_GUIDE.md

  • DAG API reference: docs/dag/DAG_API_REFERENCE.md

  • DAG runbook: docs/dag/DAG_RUNBOOK.md

  • API docs (after publish): https://docs.rs/mocra

Monitoring Quick Start

  1. Start Prometheus and Grafana:
       docker compose -f docker-compose.monitoring.yml up -d
  2. Run the real HTTP benchmark to generate enough events (PowerShell syntax shown):
       $env:BENCH_TOTAL_REQUESTS='1000'
       $env:BENCH_INTERVAL_SECS='1'
       $env:BENCH_HOLD_SECS='600'
       cargo run --manifest-path tests/Cargo.toml --bin httpbin_prometheus_benchmark
  3. Verify scraping:
       http://localhost:8905/metrics (mocra metrics endpoint)
       http://localhost:9090 (Prometheus)
       http://localhost:3000 (Grafana)

License

Licensed under either of:

  • MIT license
  • Apache License, Version 2.0
