feat: add comprehensive logging across scan pipeline#2

Closed
Microck wants to merge 1 commit into main from feat/logging-audit

@Microck Microck commented Mar 5, 2026

Summary

Added structured logging throughout the Jarspect codebase to improve observability and make scans easier to debug. The audit behind this change found logging gaps in external service calls, configuration loading, and the scan pipeline stages.

Changes

  • MalwareBazaar integration (src/malwarebazaar.rs)

    • Added logging for hash lookup attempts with duration tracking
    • Structured logging for API responses, matches, and non-matches
    • Captures malware family, tags count, and timing
  • Configuration loading (src/lib.rs)

    • Added logging for signature file loading with per-pack counts
    • YARA rule compilation logging with rule counts
    • Total signature and rulepack counts at startup
  • YARA scanning (src/analysis/yara.rs)

    • Added scan start/end logging with entry counts
    • Debug logging for each rule match with rule ID, severity, and file path
    • Total findings count at completion
  • Profile building (src/profile.rs)

    • Added debug logging for detected mod loader type
    • Metadata extraction logging (mod_id, name where available)
    • Tracks which metadata format was found (fabric/forge/spigot/legacy)
  • Scan pipeline (src/scan.rs)

    • Added scan start logging with identifiers
    • Upload loading logging with JAR size and SHA256
    • Archive intake logging with file and class counts
    • Static analysis progress logging with match counts
    • Verdict determination logging for AI, heuristic, and override paths
    • Final scan completion logging with result, method, confidence, and risk score
  • HTTP endpoints (src/main.rs)

    • Added upload request logging with generated upload_id
    • Upload persistence logging with filename and size
    • Scan request logging with upload_id
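The MalwareBazaar bullets above (duration tracking, match vs. non-match, family and tags count) can be sketched roughly as follows. This is a minimal std-only illustration, not the code in src/malwarebazaar.rs: the diff is not shown here, so the logging crate is unknown, and `LookupOutcome`, `timed`, and `lookup_log_line` are hypothetical names. Only the field names (`sha256`, `family`, `tags_count`, `duration_ms`) come from this PR.

```rust
use std::time::Instant;

/// Hypothetical outcome of a hash lookup; the real type is not shown in this PR.
#[derive(Debug, PartialEq)]
pub enum LookupOutcome {
    Match { family: String, tags_count: usize },
    NoMatch,
}

/// Runs `op` and returns its result together with the elapsed time in milliseconds.
pub fn timed<T>(op: impl FnOnce() -> T) -> (T, u128) {
    let start = Instant::now();
    let value = op();
    (value, start.elapsed().as_millis())
}

/// Renders one key=value log line for a lookup, using the PR's field names.
pub fn lookup_log_line(sha256: &str, outcome: &LookupOutcome, duration_ms: u128) -> String {
    match outcome {
        LookupOutcome::Match { family, tags_count } => format!(
            "level=info msg=\"malwarebazaar match\" sha256={sha256} family={family} tags_count={tags_count} duration_ms={duration_ms}"
        ),
        LookupOutcome::NoMatch => format!(
            "level=info msg=\"malwarebazaar no match\" sha256={sha256} duration_ms={duration_ms}"
        ),
    }
}
```

A caller would wrap the actual request in `timed(|| ...)` and pass the returned duration to `lookup_log_line`, so that matches and non-matches both carry the same timing field.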

Structured Fields

All logging uses consistent structured fields for queryability:

  • upload_id: Upload identifier
  • scan_id: Scan operation identifier
  • sha256: File hash
  • jar_size_bytes: File size
  • file_count, class_count: Archive statistics
  • result, confidence, risk_score: Verdict details
  • duration_ms: Operation timing
  • family, tags_count: MalwareBazaar metadata
  • rule_id, severity, pack: YARA rule details
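One way to keep these field names consistent across call sites is to route them through a single type. The sketch below is an assumption for illustration only: `ScanFields` and `render` are hypothetical and do not appear in this PR, and the value types (`u64`, `usize`, `u128`) are guesses. It covers a subset of the fields listed above and uses only the standard library.

```rust
/// Hypothetical carrier for a subset of the structured fields listed above.
#[derive(Debug, Default, Clone)]
pub struct ScanFields {
    pub upload_id: Option<String>,
    pub scan_id: Option<String>,
    pub sha256: Option<String>,
    pub jar_size_bytes: Option<u64>,
    pub file_count: Option<usize>,
    pub class_count: Option<usize>,
    pub duration_ms: Option<u128>,
}

impl ScanFields {
    /// Renders the fields that are set as `key=value` pairs in a fixed order,
    /// skipping unset ones, so every log line spells the keys the same way.
    pub fn render(&self) -> String {
        let mut parts: Vec<String> = Vec::new();
        if let Some(v) = &self.upload_id {
            parts.push(format!("upload_id={v}"));
        }
        if let Some(v) = &self.scan_id {
            parts.push(format!("scan_id={v}"));
        }
        if let Some(v) = &self.sha256 {
            parts.push(format!("sha256={v}"));
        }
        if let Some(v) = self.jar_size_bytes {
            parts.push(format!("jar_size_bytes={v}"));
        }
        if let Some(v) = self.file_count {
            parts.push(format!("file_count={v}"));
        }
        if let Some(v) = self.class_count {
            parts.push(format!("class_count={v}"));
        }
        if let Some(v) = self.duration_ms {
            parts.push(format!("duration_ms={v}"));
        }
        parts.join(" ")
    }
}
```

Centralizing the key names this way means a typo such as `uploadId` cannot slip into one module while the others use `upload_id`, which is what makes the logs queryable by field.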

Log Level Guidelines

  • info: High-level operations, external service calls, scan outcomes
  • warn: Retryable failures with backoff, graceful degradation
  • debug: Detailed progress, individual detector/YARA runs
  • trace: Very detailed diagnostics (future use)
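The guidelines above amount to a mapping from event kind to level, plus a verbosity threshold. A minimal std-only sketch of that idea follows; `Level`, `Event`, `level_for`, and `enabled` are hypothetical names chosen for illustration, and the event categories are paraphrased from the list above rather than taken from the code.

```rust
/// Levels named in the guidelines, declared from least to most verbose so that
/// the derived ordering gives Warn < Info < Debug < Trace.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Warn,
    Info,
    Debug,
    Trace,
}

/// Hypothetical event categories drawn from the guideline wording.
#[derive(Debug, Clone, Copy)]
pub enum Event {
    ScanOutcome,
    ExternalServiceCall,
    RetryableFailure,
    RuleMatch,
}

/// Maps an event category to the level the guidelines assign to it.
pub fn level_for(event: Event) -> Level {
    match event {
        Event::ScanOutcome | Event::ExternalServiceCall => Level::Info,
        Event::RetryableFailure => Level::Warn,
        Event::RuleMatch => Level::Debug,
    }
}

/// An event is emitted when its level is no more verbose than the configured maximum.
pub fn enabled(level: Level, max: Level) -> bool {
    level <= max
}
```

With a maximum of `Level::Info`, scan outcomes and retry warnings are emitted while per-rule match lines are suppressed; raising the maximum to `Level::Debug` turns the latter on.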

Documentation

Created LOGGING_AUDIT.md documenting:

  • Current logging state analysis
  • Identified gaps and priority levels
  • Structured logging field standards
  • Log level usage guidelines

Added structured logging to improve observability and debugging:

- MalwareBazaar: hash lookups, API responses, matches
- Configuration: signature/YARA rule loading with counts
- YARA scanning: rule matches with severity and paths
- Profile building: metadata extraction and loader detection
- Scan pipeline: stage transitions, verdicts, overrides
- HTTP endpoints: upload/scan request tracking

All logging uses structured fields (upload_id, scan_id, etc.)
for consistent querying and analysis.

Nightshift-Task: logging-audit
Nightshift-Ref: https://github.com/marcus/nightshift
@Microck Microck closed this Mar 5, 2026
@Microck Microck deleted the feat/logging-audit branch March 5, 2026 21:40