[REVIEW] log-analysis: add timestamp and entity normalization evidence #1142

## Skill Being Reviewed
**Skill name:** `log-analysis`
**Skill path:** `skills/secops/log-analysis/`

## False Positive Analysis

**Benign log sequence that can be misclassified when timestamps are compared naively:**
```json
[
  {
    "source": "vpn",
    "event_time": "2026-06-05T09:59:58-07:00",
    "ingested_at": "2026-06-05T17:00:07Z",
    "user": "admin@example.com",
    "event": "vpn_login"
  },
  {
    "source": "windows-security",
    "event_time": "2026-06-05T17:00:02Z",
    "ingested_at": "2026-06-05T17:00:04Z",
    "user": "admin@example.com",
    "event_id": 4672,
    "event": "special_privilege_logon"
  }
]
```

**Why this is a false positive:**

The two events can look like an impossible sequence if the analyst compares mixed local time, UTC time, and ingestion time without normalization. The VPN event at `09:59:58-07:00` is `16:59:58Z`, four seconds before the privileged Windows session at `17:00:02Z`, which is a plausible login sequence. Sorted by ingestion time, the order reverses (`17:00:04Z` for the Windows event, `17:00:07Z` for the VPN event); compared as raw strings, the two events appear about seven hours apart. Either reading can turn a benign login sequence into a suspected privilege escalation or lateral movement chain.

The current skill asks for a time window and builds a timeline, but it does not require the report to state which timestamp field was used, whether all times were normalized to UTC, whether event time and ingestion time differ materially, or whether the source host clock is trusted.
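A minimal sketch of the comparison described above, using the two events from the JSON sample. The helper names `parse_utc` and `order_by` are illustrative and not part of the skill; the point is only that ordering by normalized event time and ordering by ingestion time disagree for this sample.

```python
from datetime import datetime, timezone

def parse_utc(raw: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries an offset and return it in UTC."""
    # Older Python versions reject a trailing "Z" in fromisoformat().
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no offset: {raw!r}")
    return parsed.astimezone(timezone.utc)

# The two events from the sample above (fields trimmed to what is needed here).
events = [
    {"event": "vpn_login",
     "event_time": "2026-06-05T09:59:58-07:00",
     "ingested_at": "2026-06-05T17:00:07Z"},
    {"event": "special_privilege_logon",
     "event_time": "2026-06-05T17:00:02Z",
     "ingested_at": "2026-06-05T17:00:04Z"},
]

def order_by(field: str) -> list:
    """Return event names sorted by the given time field, normalized to UTC."""
    return [e["event"] for e in sorted(events, key=lambda e: parse_utc(e[field]))]

by_event_time = order_by("event_time")   # VPN login first
by_ingestion = order_by("ingested_at")   # privileged logon first

# Gap between the two events once both are expressed in UTC.
gap_seconds = (
    parse_utc(events[1]["event_time"]) - parse_utc(events[0]["event_time"])
).total_seconds()

print(by_event_time, by_ingestion, gap_seconds)
```

A report that records which of these two orderings it used lets a reviewer spot the reversal immediately.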

## Coverage Gaps

**Missed variant 1: event time vs ingestion time changes the attack sequence**
```text
Event A: endpoint process creation
  event_time:   2026-06-05T10:01:00Z
  ingested_at:  2026-06-05T10:19:30Z

Event B: proxy connection
  event_time:   2026-06-05T10:04:00Z
  ingested_at:  2026-06-05T10:04:05Z

Naive timeline by ingestion time:
  proxy connection -> process creation

Correct timeline by event time:
  process creation -> proxy connection
```
**Why it should be caught:**

The skill's timeline output has one `Timestamp (UTC)` column, but real log sources often expose multiple time fields: event creation time, device time, collector receipt time, SIEM ingestion time, and normalization time. During an outage, backlog replay, mobile/offline endpoint sync, or cloud audit delivery delay, ingestion time can be minutes or hours after event time. A security log analysis report should record the selected canonical timestamp, the fallback order, and any material ingestion lag before reconstructing kill-chain order.
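One way to record the canonical timestamp, the fallback order, and the ingestion lag is sketched below. The field names in `FALLBACK_ORDER` and the five-minute `MATERIAL_LAG` threshold are assumptions made for illustration; a real report would state its own.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fallback order and threshold, chosen only for this example.
FALLBACK_ORDER = ("event_time", "device_time", "collector_time", "ingested_at")
MATERIAL_LAG = timedelta(minutes=5)

def parse_utc(raw: str) -> datetime:
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no offset: {raw!r}")
    return parsed.astimezone(timezone.utc)

def describe(record: dict) -> dict:
    """Pick the canonical time field and compute ingestion lag when possible."""
    field = next((f for f in FALLBACK_ORDER if record.get(f)), None)
    if field is None:
        raise ValueError("record has no usable time field")
    canonical = parse_utc(record[field])
    lag = None
    if field != "ingested_at" and record.get("ingested_at"):
        lag = parse_utc(record["ingested_at"]) - canonical
    return {
        "name": record["name"],
        "canonical_field": field,
        "canonical_utc": canonical,
        "ingestion_lag": lag,
        "material_lag": lag is not None and lag > MATERIAL_LAG,
    }

records = [
    {"name": "proxy connection",
     "event_time": "2026-06-05T10:04:00Z", "ingested_at": "2026-06-05T10:04:05Z"},
    {"name": "endpoint process creation",
     "event_time": "2026-06-05T10:01:00Z", "ingested_at": "2026-06-05T10:19:30Z"},
]

# Timeline ordered by the canonical timestamp, not by ingestion time.
timeline = sorted((describe(r) for r in records), key=lambda row: row["canonical_utc"])
for row in timeline:
    print(row["name"], row["canonical_field"], row["ingestion_lag"], row["material_lag"])
```

With the variant 1 data, the process creation sorts first and is the only row flagged for material lag (18 minutes 30 seconds).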

**Missed variant 2: source clock skew creates false lateral movement or impossible travel**
```text
dc-01 Windows Security:
  event_time: 2026-06-05T12:00:04Z
  clock_skew: +4s

workstation-17 Sysmon:
  event_time: 2026-06-05T11:54:58Z
  clock_skew: -5m02s

siem correlation:
  "user accessed workstation before authenticating to domain controller"
```
**Why it should be caught:**

The skill recommends temporal joins and +/- 30 minute correlation, but it does not require analysts to verify source clock health or document skew tolerance. Domain controllers, hypervisors, EDR buffers, cloud audit services, network devices, and containers can disagree on time. Without a clock-skew field, a timeline can create false causality, hide the real first event, or incorrectly scope incident start time.
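The sketch below applies the skew values from variant 2 before comparing the two events. It assumes `clock_skew` means reported time minus true time, so the correction subtracts the skew; the sample does not state that convention, and the two-second `TOLERANCE` is likewise an assumed value.

```python
from datetime import datetime, timedelta, timezone

TOLERANCE = timedelta(seconds=2)  # assumed skew tolerance for this example

def parse_utc(raw: str) -> datetime:
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no offset: {raw!r}")
    return parsed.astimezone(timezone.utc)

def corrected(reported: str, skew: timedelta) -> datetime:
    """Remove a known skew, assuming skew = reported time - true time."""
    return parse_utc(reported) - skew

def relative_order(a_label, a_time, b_label, b_time, tolerance):
    """State which event came first, or decline when the gap is within tolerance."""
    if abs(b_time - a_time) <= tolerance:
        return "order not evaluable"
    earlier, later = (a_label, b_label) if a_time < b_time else (b_label, a_label)
    return f"{earlier} before {later}"

dc_reported = "2026-06-05T12:00:04Z"
ws_reported = "2026-06-05T11:54:58Z"

# Ordering as the SIEM sees it, using reported times only.
raw_order = relative_order(
    "dc-01", parse_utc(dc_reported),
    "workstation-17", parse_utc(ws_reported),
    TOLERANCE,
)

# Ordering after removing the documented skew from each source.
dc_true = corrected(dc_reported, timedelta(seconds=4))
ws_true = corrected(ws_reported, -timedelta(minutes=5, seconds=2))
fixed_order = relative_order("dc-01", dc_true, "workstation-17", ws_true, TOLERANCE)

print(raw_order)
print(fixed_order)
```

Under these assumptions both corrected times land on the same second, so the "workstation before domain controller" claim does not survive correction and the order should be reported as not evaluable.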

**Missed variant 3: field normalization mismatch breaks entity pivots**
```json
[
  {"source": "windows", "Account": "ACME\\alice", "Computer": "WS-17", "EventID": 4624},
  {"source": "azuread", "UserPrincipalName": "alice@acme.example", "DeviceId": "aad-device-123"},
  {"source": "edr", "user.name": "alice", "host.hostname": "ws-17.acme.example"}
]
```
**Why it should be caught:**

The current skill tells analysts to pivot on users, hosts, IPs, and IOCs, but it does not require a normalization table showing how identities and hosts are joined across log schemas. A Windows `DOMAIN\user`, an Entra ID UPN, an EDR short username, a host NetBIOS name, and an FQDN may represent the same entity or different entities. Without normalized entity keys and confidence, correlation can both over-link unrelated events and miss real cross-source activity.
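A normalization table could be produced by something like the following sketch. The `UserKey` type, the parsing rules, and the three confidence labels are assumptions for illustration; mapping a NetBIOS domain such as `acme` to a DNS domain such as `acme.example` would need directory data that the sample does not contain, so the sketch deliberately leaves that join at low confidence.

```python
from typing import NamedTuple, Optional

class UserKey(NamedTuple):
    name: str
    realm: Optional[str]  # NetBIOS domain or UPN suffix, when present

def normalize_user(raw: str) -> UserKey:
    value = raw.strip().lower()
    if "\\" in value:                      # DOMAIN\user
        realm, name = value.split("\\", 1)
        return UserKey(name, realm)
    if "@" in value:                       # user@domain (UPN)
        name, realm = value.split("@", 1)
        return UserKey(name, realm)
    return UserKey(value, None)            # bare short username

def normalize_host(raw: str) -> str:
    """Lowercase and keep only the first DNS label (short host name)."""
    return raw.strip().lower().split(".", 1)[0]

def join_confidence(a: UserKey, b: UserKey) -> str:
    if a.name != b.name:
        return "none"
    if a.realm is not None and a.realm == b.realm:
        return "high"
    # Same short name, but the realm differs or is missing on one side.
    return "low"

windows_user = normalize_user("ACME\\alice")
azuread_user = normalize_user("alice@acme.example")
edr_user = normalize_user("alice")

print(windows_user, azuread_user, edr_user)
print(join_confidence(windows_user, azuread_user))
print(normalize_host("WS-17") == normalize_host("ws-17.acme.example"))
```

The host comparison matches on short name only, which is itself a candidate join rather than proof; the Entra ID `DeviceId` in the sample cannot be joined by string normalization at all.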

## Edge Cases

- Some sources only have ingestion time. That should be allowed, but the report should mark timeline confidence as lower and avoid second-level causality claims.
- Network devices and SaaS audit APIs may report local time without an offset. The reviewer should require the source timezone or collector parsing rule before converting to UTC.
- Daylight saving transitions can duplicate or skip local clock hours; UTC normalization avoids most of this, but only if the original offset is preserved.
- Backfilled cloud audit logs can be valid evidence even with long ingestion delays; the delay is not suspicious by itself.
- Clock skew can itself be a finding when caused by disabled NTP, host tampering, or log pipeline failure, but it should not automatically prove attacker activity.
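The offset-less case above can be enforced in a parser rather than left to analyst discipline. The sketch below refuses to convert a timestamp without an offset unless the caller supplies one, and it keeps the original string alongside the UTC value. The `to_utc` function and its `declared_offset` parameter are hypothetical names; a fixed offset does not model daylight saving rules, so a real pipeline would more likely use a named timezone.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def to_utc(raw: str, declared_offset: Optional[timedelta] = None) -> dict:
    """Convert to UTC, recording where the offset came from."""
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is not None:
        offset_source = "embedded"
    elif declared_offset is not None:
        # Offset comes from source documentation or a collector parsing rule.
        parsed = parsed.replace(tzinfo=timezone(declared_offset))
        offset_source = "declared"
    else:
        raise ValueError(f"no offset in {raw!r} and none declared for the source")
    return {
        "original": raw,                      # preserved for later review
        "utc": parsed.astimezone(timezone.utc),
        "offset_source": offset_source,
    }

with_offset = to_utc("2026-06-05T09:59:58-07:00")
declared = to_utc("2026-06-05T09:59:58", declared_offset=timedelta(hours=-7))

try:
    to_utc("2026-06-05T09:59:58")
    rejected = False
except ValueError:
    rejected = True

print(with_offset["utc"], declared["offset_source"], rejected)
```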

## Remediation Quality

- [x] Fix resolves the vulnerability
- [x] Fix doesn't introduce new security issues
- [x] Fix doesn't break functionality
- **Issues found:** Add timestamp provenance, timezone normalization, clock-skew, ingestion-lag, and entity-normalization evidence fields to the `log-analysis` process and output template. This is intentionally narrower than issue #208, which focuses on ATT&CK version/model drift; this review is about preventing incorrect timelines and entity pivots even when the framework mapping is current.

Recommended changes:

1. Add a preflight step: **Timestamp and Entity Normalization**.
2. Require a source-quality table:
   - log source
   - canonical event-time field
   - ingestion/collector-time field
   - timezone or offset source
   - UTC normalization status
   - observed ingestion lag
   - known clock skew or time-sync status
   - parser/schema used
   - normalized user key
   - normalized host key
   - confidence / Not Evaluable reason
3. Extend the timeline with:
   - `Event Time (UTC)`
   - `Ingested At (UTC)`
   - `Source Clock Confidence`
   - `Entity Join Confidence`
4. Add severity guidance:
   - suspicious activity confirmed after normalized event-time ordering: keep normal severity
   - suspicious only under ingestion-time ordering: downgrade or mark not evaluable
   - material clock skew or missing timezone in a critical source: visibility gap or separate logging-quality finding

## Comparison to Other Tools

| Tool | Catches this? | Notes |
|------|:---:|-------|
| Splunk CIM / Elastic ECS / OCSF mapping | Partial | These schemas help standardize fields, but the analyst still needs to prove the parser chose the correct timestamp and entity keys. |
| Microsoft Sentinel / KQL | Partial | Provides `TimeGenerated` and ingestion-time functions, but cross-source correlation still needs explicit event-time vs ingestion-time reasoning. |
| SIEM correlation engines | Partial | Can correlate by time windows, but wrong timezone parsing, clock skew, or schema joins can still produce misleading timelines. |
| Manual log analysis | Yes | A disciplined analyst can validate source clocks, time fields, parser behavior, and entity joins before making causality claims. |

## Overall Assessment

**Strengths:**

- The skill has a useful log-source taxonomy and practical Windows, Sysmon, Linux, cloud, DNS, proxy, and network examples.
- The correlation workflow is clear and maps well to real investigation pivots.
- The output template already includes a timeline and visibility gaps section, which is the right place to add timestamp and normalization evidence.

**Needs improvement:**

- The process assumes a single trustworthy `Timestamp (UTC)` after collection.
- It does not distinguish event time, device time, collector time, ingestion time, and normalized SIEM time.
- It does not ask for timezone/offset evidence or clock-skew tolerance before making temporal claims.
- It lacks an entity-normalization table for cross-source pivots across users, hosts, IPs, devices, and cloud principals.

**Priority recommendations:**
1. Add timestamp provenance and UTC-normalization checks before timeline construction.
2. Add clock-skew and ingestion-lag fields to the source-quality/visibility-gap sections.
3. Add entity-normalization evidence for every cross-source pivot.
4. Add a pitfall warning: "Treating ingestion order or raw local timestamps as attack order."

## Sources Checked

- NIST SP 800-92, Guide to Computer Security Log Management: https://csrc.nist.gov/pubs/sp/800/92/final
- MITRE ATT&CK Data Components: https://attack.mitre.org/datacomponents/

## Bounty Info
- [x] I have read and agree to the [CONTRIBUTING.md](../../CONTRIBUTING.md) bounty terms
- **Preferred payment method:** Crypto; details can be provided privately after acceptance.

