[REVIEW] log-analysis: add timestamp and entity normalization evidence #1142

@Peter7896

Skill Being Reviewed

Skill name: log-analysis
Skill path: skills/secops/log-analysis/

False Positive Analysis

Benign log sequence that can be misclassified when timestamps are compared naively:

[
  {
    "source": "vpn",
    "event_time": "2026-06-05T09:59:58-07:00",
    "ingested_at": "2026-06-05T17:00:07Z",
    "user": "admin@example.com",
    "event": "vpn_login"
  },
  {
    "source": "windows-security",
    "event_time": "2026-06-05T17:00:02Z",
    "ingested_at": "2026-06-05T17:00:04Z",
    "user": "admin@example.com",
    "event_id": 4672,
    "event": "special_privilege_logon"
  }
]

Why this is a false positive:

The two events can look like an impossible sequence if the analyst compares mixed local time, UTC, and ingestion time without normalization. The VPN event at 09:59:58-07:00 is 16:59:58Z, four seconds before the privileged Windows session at 17:00:02Z, so the real order is VPN login followed by privileged logon. Ordering by ingestion time reverses the pair (17:00:04Z for the Windows event, 17:00:07Z for the VPN event), and comparing raw timestamp strings without applying the offset puts the two events about seven hours apart. Either mistake can turn a benign login sequence into a suspected privilege escalation or lateral movement chain.

The current skill asks for a time window and builds a timeline, but it does not require the report to state which timestamp field was used, whether all times were normalized to UTC, whether event time and ingestion time differ materially, or whether the source host clock is trusted.
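The difference between the two orderings can be checked with a minimal sketch. It assumes every timestamp carries an explicit offset, as in the sample above; the helper names `to_utc` and `order_by` are illustrative and not part of the skill.

```python
from datetime import datetime, timezone

def to_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries an offset and convert it to UTC."""
    # Older Python versions reject a trailing "Z" in fromisoformat, so rewrite it.
    parsed = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no offset: {ts!r}")
    return parsed.astimezone(timezone.utc)

# The two events from the false-positive sample, reduced to the relevant fields.
events = [
    {"event": "vpn_login",
     "event_time": "2026-06-05T09:59:58-07:00",
     "ingested_at": "2026-06-05T17:00:07Z"},
    {"event": "special_privilege_logon",
     "event_time": "2026-06-05T17:00:02Z",
     "ingested_at": "2026-06-05T17:00:04Z"},
]

def order_by(field: str) -> list:
    """Return event names sorted by the given timestamp field after UTC normalization."""
    return [e["event"] for e in sorted(events, key=lambda e: to_utc(e[field]))]

by_event_time = order_by("event_time")   # VPN login first
by_ingestion = order_by("ingested_at")   # privileged logon first
```

Sorting on normalized `event_time` gives the benign order, while sorting on `ingested_at` reverses it, which is why the report should state which field was used.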

Coverage Gaps

Missed variant 1: event time vs ingestion time changes the attack sequence

Event A: endpoint process creation
  event_time:   2026-06-05T10:01:00Z
  ingested_at:  2026-06-05T10:19:30Z

Event B: proxy connection
  event_time:   2026-06-05T10:04:00Z
  ingested_at:  2026-06-05T10:04:05Z

Naive timeline by ingestion time:
  proxy connection -> process creation

Correct timeline by event time:
  process creation -> proxy connection

Why it should be caught:

The skill's timeline output has one Timestamp (UTC) column, but real log sources often expose multiple time fields: event creation time, device time, collector receipt time, SIEM ingestion time, and normalization time. During an outage, backlog replay, mobile/offline endpoint sync, or cloud audit delivery delay, ingestion time can be minutes or hours after event time. A security log analysis report should record the selected canonical timestamp, the fallback order, and any material ingestion lag before reconstructing kill-chain order.
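One possible way to record the canonical timestamp, the fallback order, and the ingestion lag is sketched below. The field name `collector_time`, the fallback order, and the five-minute threshold are assumptions made for illustration; the skill does not define them.

```python
from datetime import datetime, timedelta, timezone

# Assumed fallback order: prefer when the event happened over pipeline times.
FALLBACK_ORDER = ("event_time", "collector_time", "ingested_at")
MATERIAL_LAG = timedelta(minutes=5)  # assumed threshold, tune per source

def parse_utc(ts: str) -> datetime:
    """Parse an offset-bearing ISO 8601 timestamp into UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00")).astimezone(timezone.utc)

def canonical_time(record: dict):
    """Return (field_used, utc_time) for the first populated field in FALLBACK_ORDER."""
    for field in FALLBACK_ORDER:
        if record.get(field):
            return field, parse_utc(record[field])
    raise ValueError("record has no usable timestamp")

def ingestion_lag(record: dict):
    """Return ingested_at minus event_time, or None if either field is missing."""
    if record.get("event_time") and record.get("ingested_at"):
        return parse_utc(record["ingested_at"]) - parse_utc(record["event_time"])
    return None

# Events A and B from missed variant 1.
records = [
    {"name": "process creation",
     "event_time": "2026-06-05T10:01:00Z", "ingested_at": "2026-06-05T10:19:30Z"},
    {"name": "proxy connection",
     "event_time": "2026-06-05T10:04:00Z", "ingested_at": "2026-06-05T10:04:05Z"},
]

timeline = sorted(records, key=lambda r: canonical_time(r)[1])
lagging = [r["name"] for r in records
           if (ingestion_lag(r) or timedelta(0)) > MATERIAL_LAG]
```

Here `timeline` keeps process creation ahead of the proxy connection, and `lagging` flags the endpoint event, whose ingestion lag is 18 minutes 30 seconds.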

Missed variant 2: source clock skew creates false lateral movement or impossible travel

dc-01 Windows Security:
  event_time: 2026-06-05T12:00:04Z
  clock_skew: +4s

workstation-17 Sysmon:
  event_time: 2026-06-05T11:54:58Z
  clock_skew: -5m02s

SIEM correlation:
  "user accessed workstation before authenticating to domain controller"

Why it should be caught:

The skill recommends temporal joins and +/- 30 minute correlation, but it does not require analysts to verify source clock health or document skew tolerance. Domain controllers, hypervisors, EDR buffers, cloud audit services, network devices, and containers can disagree on time. Without a clock-skew field, a timeline can create false causality, hide the real first event, or incorrectly scope incident start time.
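A skew check along these lines could run before any ordering claim. This is a sketch under two assumptions: `clock_skew` means source clock minus reference clock (the sample does not define the sign), and the two-second tolerance is an arbitrary placeholder.

```python
from datetime import datetime, timedelta, timezone

def parse_utc(ts: str) -> datetime:
    """Parse an offset-bearing ISO 8601 timestamp into UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00")).astimezone(timezone.utc)

def corrected_time(event_time: str, clock_skew: timedelta) -> datetime:
    """Subtract the measured skew (source clock minus reference) from the reported time."""
    return parse_utc(event_time) - clock_skew

def order_is_reliable(t_a: datetime, t_b: datetime, tolerance: timedelta) -> bool:
    """Only claim an order when the corrected gap exceeds the documented tolerance."""
    return abs(t_a - t_b) > tolerance

TOLERANCE = timedelta(seconds=2)  # assumed skew tolerance

# Values from missed variant 2.
dc = corrected_time("2026-06-05T12:00:04Z", timedelta(seconds=4))
ws = corrected_time("2026-06-05T11:54:58Z", -timedelta(minutes=5, seconds=2))

raw_gap = parse_utc("2026-06-05T12:00:04Z") - parse_utc("2026-06-05T11:54:58Z")
claim_allowed = order_is_reliable(dc, ws, TOLERANCE)
```

Under that sign convention both events correct to 12:00:00Z, so the raw gap of 5 minutes 6 seconds disappears and `claim_allowed` is `False`: the "workstation before domain controller" order is not supported.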

Missed variant 3: field normalization mismatch breaks entity pivots

[
  {"source": "windows", "Account": "ACME\\alice", "Computer": "WS-17", "EventID": 4624},
  {"source": "azuread", "UserPrincipalName": "alice@acme.example", "DeviceId": "aad-device-123"},
  {"source": "edr", "user.name": "alice", "host.hostname": "ws-17.acme.example"}
]

Why it should be caught:

The current skill tells analysts to pivot on users, hosts, IPs, and IOCs, but it does not require a normalization table showing how identities and hosts are joined across log schemas. A Windows DOMAIN\user, an Entra ID UPN, an EDR short username, a host NetBIOS name, and an FQDN may represent the same entity or different entities. Without normalized entity keys and confidence, correlation can both over-link unrelated events and miss real cross-source activity.
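A normalization table of this kind might be produced by something like the following sketch. The `NETBIOS_TO_DNS` mapping, the confidence labels, and the rule that a mapped DOMAIN\user becomes `user@dns-suffix` are all assumptions; in practice the mapping should come from the directory, and a UPN can differ from the sAMAccountName.

```python
# Hypothetical NetBIOS-domain to DNS-suffix mapping; source it from the directory.
NETBIOS_TO_DNS = {"ACME": "acme.example"}

def normalize_user(raw: str):
    """Return (normalized_key, confidence) for a user string from any schema."""
    if "\\" in raw:  # Windows DOMAIN\user form
        domain, name = raw.split("\\", 1)
        suffix = NETBIOS_TO_DNS.get(domain.upper())
        if suffix:
            # Assumed equivalence between account name and UPN prefix.
            return f"{name.lower()}@{suffix}", "medium"
        return name.lower(), "low"
    if "@" in raw:  # UPN form
        return raw.lower(), "high"
    return raw.lower(), "low"  # bare short name, domain unknown

def normalize_host(raw: str) -> str:
    """Return the lowercase short hostname (first DNS label)."""
    return raw.split(".", 1)[0].lower()

# The three records from missed variant 3, reduced to user and host strings.
users = [normalize_user(u) for u in ("ACME\\alice", "alice@acme.example", "alice")]
hosts = [normalize_host(h) for h in ("WS-17", "ws-17.acme.example")]
```

The Windows and Entra ID records land on the same key with different confidence, while the EDR short name stays a low-confidence match. Collapsing hosts to the short name can over-link machines that share a name across domains, so the join confidence belongs in the report alongside the key.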

Edge Cases

  • Some sources only have ingestion time. That should be allowed, but the report should mark timeline confidence as lower and avoid second-level causality claims.
  • Network devices and SaaS audit APIs may report local time without an offset. The reviewer should require the source timezone or collector parsing rule before converting to UTC.
  • Daylight saving transitions can duplicate or skip local clock hours; UTC normalization avoids most of this, but only if the original offset is preserved.
  • Backfilled cloud audit logs can be valid evidence even with long ingestion delays; the delay is not suspicious by itself.
  • Clock skew can itself be a finding when caused by disabled NTP, host tampering, or log pipeline failure, but it should not automatically prove attacker activity.
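The offset-less case in the list above can be handled by refusing to convert until an offset is supplied, as in this sketch. The function name, the `offset_source` labels, and the use of a fixed offset are assumptions; a fixed offset does not model daylight saving transitions, which need a named time zone.

```python
from datetime import datetime, timedelta, timezone

def to_utc_with_provenance(raw: str, source_offset: timedelta = None):
    """Return (utc_time, offset_source); never guess an offset for naive input."""
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is not None:
        # The original offset was preserved in the record itself.
        return parsed.astimezone(timezone.utc), "embedded"
    if source_offset is None:
        raise ValueError(f"no offset in {raw!r} and no source timezone configured")
    # Offset comes from a documented source timezone or collector parsing rule.
    localized = parsed.replace(tzinfo=timezone(source_offset))
    return localized.astimezone(timezone.utc), "configured"

embedded = to_utc_with_provenance("2026-06-05T09:59:58-07:00")
configured = to_utc_with_provenance("2026-06-05T09:59:58", timedelta(hours=-7))
```

Both calls resolve to 16:59:58Z, but the second is labelled `configured`, which lets the report show that the conversion depends on a parsing rule rather than on the record.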

Remediation Quality

  • Fix resolves the vulnerability
  • Fix doesn't introduce new security issues
  • Fix doesn't break functionality
  • Issues found: Add timestamp provenance, timezone normalization, clock-skew, ingestion-lag, and entity-normalization evidence fields to the log-analysis process and output template. This is intentionally narrower than issue #208 ("[REVIEW] log-analysis: refresh ATT&CK v19 defensive evidence model"), which focuses on ATT&CK version/model drift; this review is about preventing incorrect timelines and entity pivots even when the framework mapping is current.

Recommended changes:

  1. Add a preflight step: Timestamp and Entity Normalization.
  2. Require a source-quality table:
    • log source
    • canonical event-time field
    • ingestion/collector-time field
    • timezone or offset source
    • UTC normalization status
    • observed ingestion lag
    • known clock skew or time-sync status
    • parser/schema used
    • normalized user key
    • normalized host key
    • confidence / Not Evaluable reason
  3. Extend the timeline with:
    • Event Time (UTC)
    • Ingested At (UTC)
    • Source Clock Confidence
    • Entity Join Confidence
  4. Add severity guidance:
    • suspicious activity confirmed after normalized event-time ordering: keep normal severity
    • suspicious only under ingestion-time ordering: downgrade or mark not evaluable
    • material clock skew or missing timezone in a critical source: visibility gap or separate logging-quality finding
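The source-quality row and the severity guidance above could be expressed roughly as follows. This is a sketch, not the skill's template: the class holds only a subset of the listed columns, and the `critical` flag and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SourceQuality:
    """One row of the proposed source-quality table (subset of the columns)."""
    log_source: str
    event_time_field: Optional[str]      # None when only ingestion time exists
    ingestion_time_field: Optional[str]
    offset_source: Optional[str]         # e.g. "embedded", "collector rule", or None
    utc_normalized: bool
    material_clock_skew: bool
    critical: bool = False               # hypothetical: the case depends on this source

def severity_guidance(by_event_time: bool, by_ingestion_time: bool,
                      source: SourceQuality) -> List[str]:
    """Map the three guidance rules onto report notes for one finding."""
    notes = []
    if source.critical and (source.material_clock_skew or source.offset_source is None):
        notes.append("visibility gap or separate logging-quality finding")
    if by_event_time:
        notes.append("keep normal severity")
    elif by_ingestion_time:
        notes.append("downgrade or mark not evaluable")
    return notes

vpn = SourceQuality("vpn", "event_time", "ingested_at", "embedded",
                    True, False, critical=True)
switch = SourceQuality("core-switch", "device_time", None, None,
                       False, False, critical=True)

only_ingestion = severity_guidance(False, True, vpn)
missing_offset = severity_guidance(False, True, switch)
```

A finding that is suspicious only under ingestion-time ordering is downgraded, and a critical source without a known offset adds a logging-quality note on top of that.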

Comparison to Other Tools

| Tool | Catches this? | Notes |
| --- | --- | --- |
| Splunk CIM / Elastic ECS / OCSF mapping | Partial | These schemas help standardize fields, but the analyst still needs to prove the parser chose the correct timestamp and entity keys. |
| Microsoft Sentinel / KQL | Partial | Provides TimeGenerated and ingestion-time functions, but cross-source correlation still needs explicit event-time vs ingestion-time reasoning. |
| SIEM correlation engines | Partial | Can correlate by time windows, but wrong timezone parsing, clock skew, or schema joins can still produce misleading timelines. |
| Manual log analysis | Yes | A disciplined analyst can validate source clocks, time fields, parser behavior, and entity joins before making causality claims. |

Overall Assessment

Strengths:

  • The skill has a useful log-source taxonomy and practical Windows, Sysmon, Linux, cloud, DNS, proxy, and network examples.
  • The correlation workflow is clear and maps well to real investigation pivots.
  • The output template already includes a timeline and visibility gaps section, which is the right place to add timestamp and normalization evidence.

Needs improvement:

  • The process assumes a single trustworthy Timestamp (UTC) after collection.
  • It does not distinguish event time, device time, collector time, ingestion time, and normalized SIEM time.
  • It does not ask for timezone/offset evidence or clock-skew tolerance before making temporal claims.
  • It lacks an entity-normalization table for cross-source pivots across users, hosts, IPs, devices, and cloud principals.

Priority recommendations:

  1. Add timestamp provenance and UTC-normalization checks before timeline construction.
  2. Add clock-skew and ingestion-lag fields to the source-quality/visibility-gap sections.
  3. Add entity-normalization evidence for every cross-source pivot.
  4. Add a pitfall warning: "Treating ingestion order or raw local timestamps as attack order."

Sources Checked

Bounty Info

  • I have read and agree to the CONTRIBUTING.md bounty terms
  • Preferred payment method: Crypto; details can be provided privately after acceptance.
