Permalink
Find file Copy path
3efd542 May 22, 2017
3 contributors

Users who have contributed to this file

@joachimmetz @destijl @timevortex
510 lines (419 sloc) 16.1 KB

Artifact definition format and style guide

Summary

This guide contains a description of the forensics artifacts definitions. The artifacts definitions are YAML-based. The format is currently still under development and is likely to undergo some change. One of the goals of this guide is to ensure consistency and readbility of the artifacts definitions.

Revision history

Version Author Date Comments

0.0.1

G. Castle

November 2014

Initial version.

0.0.2

G. Castle

December 2014

Minor format changes.

0.0.3

J.B. Metz

April 2015

Merged style guide and artifact definitions wiki page.

0.0.3

J.B. Metz

September 2015

Additional label.

0.0.4

J.B. Metz

July 2016

Added information about a naming convention.

1. Background

The first version of the artifact definitions originated from the GRR project, where it is used to describe and quickly collect data of interest, e.g. specific files or Windows Registry keys. The goal of the format is to provide a way to describe the majority of forensic artifacts in a language that is readable by humans and machines.

The format is designed to be simple and straight forward, so that a digital forensic analysist is able to quickly write artifact definitions during an investigation without having to rely on complex standards or tooling.

The format is intended to describe forensically-relevant data on a machine, while being tool agnostic. In particular we intentionally avoided adding IOC-like logic, or describing how the data should be collected since this various between tools.

1.1. Terminology

The term artifact (or artefact) is widely used within computer (or digital) forensics, though there is no official definition of this term.

The definition closest to the meaning of the word within computer forensics is that of the word artifact within archaeology. The term should not be confused with the word artifact used within software development.

If archaeology defines an artifact as:

something made or given shape by man, such as a tool or
a work of art, esp an object of archaeological interest

The definition of artifact within computer forensics could be:

An object of digital archaeological interest.

Where digital archaeology roughly refers to computer forensics without the forensic (legal) context.

2. The artifact definition

The best way to show what an artifact definition is, is by example. The following example is the artifact definition for the Windows EVTX System Event Logs.

name: WindowsSystemEventLogEvtx
doc: Windows System Event log for Vista or later systems.
sources:
- type: FILE
  attributes: {paths: ['%%environ_systemroot%%\System32\winevt\Logs\System.evtx']}
conditions: [os_major_version >= 6]
labels: [Logs]
supported_os: [Windows]
urls: ['http://www.forensicswiki.org/wiki/Windows_XML_Event_Log_(EVTX)']

The artifact definition can have the following values:

Value Description

name

The name. An unique string that identifies the artifact definition.
Also see section: Name.

doc

The description (or documentation). A human readable string that describes the artifact definition.
Style note: Typically one line description of the artifact, mentioning important caveats.
If more description is necessary, use the Long docs form.

sources

A list of source definitions.
See section: Sources.

conditions

Optional list of conditions that describe when the artifact definition should apply.
See section: Conditions.

labels

Optional list of predefined labels. See section: Labels.

provides

Optional list of TODO

supported_os

Optional list that indicates which operating systems the artifact definition applies to. See section: Supported operating system.

urls

Optional list of URLs with more contextual information.
Ideally the artifact definition links to an article that discusses the artificat in more depth e.g. on Forensics Wiki

2.1. Name

Style note: The name of an artifact defintion should be in CamelCase name without spaces.

As of July 2016 we are migrating to the following naming convention:

  • Prefix platform specific artifact definitions with the name of the operating system using "Linux", "MacOS" or "Windows"

  • If not platform specific:

    • prefix with the application name, for example "ChromeHistory".

    • prefix with the name of the subsystem, for example "WMIComputerSystemProduct".

Style note: If the sole source of the artifact definition for example are files use "BrowserHistoryFiles" instead of "BrowserHistory" to reduce ambiguity.

2.2. Long docs form

Multi-line documentation should use the YAML Literal Style as indicated by the | character.

doc: |
  The Windows run keys.

  Note users.sid will currently only expand to SIDs with profiles on the system,
  not all SIDs.

Style note: the short description (first line) and the longer portion are separated by an empty line.

Style note: explicit newlines (\n) should not be used.

3. Sources

Every source definition starts with a type followed by arguments e.g.

sources:
- type: COMMAND
  attributes:
    args: [-qa]
    cmd: /bin/rpm
sources:
- type: FILE
  attributes:
    paths:
      - /root/.bashrc
      - /root/.cshrc
      - /root/.ksh
      - /root/.logout
      - /root/.profile
      - /root/.tcsh
      - /root/.zlogin
      - /root/.zlogout
      - /root/.zprofile
      - /root/.zprofile

Style note: where sources take a single argument with a single value, the one-line {} form should be used to save on line breaks as below:

- type: FILE
  attributes: {paths: ['%%environ_systemroot%%\System32\winevt\Logs\System.evtx']}
Value Description

attributes

A dictionary of keyword attributes specific to the type of source definition.

type

The source type.

conditions

Optional list of conditions to when the artifact definition should apply.
See section: Conditions.

returned_types

Optional list of returned artifact definition types.

supported_os

Optional list that indicates which operating systems the artifact definition applies to.
See section: Supported operating system.

3.1. Source types

Currently the following different source types are defined:

Value Description

ARTIFACT_GROUP

A source that consists of a group of other artifacts.

COMMAND

A source that consists of the output of a command.

FILE

A source that consists of the contents of files.

PATH

A source that consists of the contents of paths.

REGISTRY_KEY

A source that consists of the contents of Windows Registry keys.

REGISTRY_VALUE

A source that consists of the contents of Windows Registry values.

WMI

A source that consists of the output of Windows Management Instrumentation (WMI) queries.

The sources types are defined in definitions.py. as TYPE_INDICATOR constants.

3.2. Artifact group source

The artifact group source is a source that consists of a group of other artifacts e.g.

- type: ARTIFACT_GROUP
  attributes:
    names: [WindowsRunKeys, WindowsServices]
  returned_types: [PersistenceFile]

Where attributes can contain the following values:

Value Description

names

A list of artifact definition names that make up this "composite" artifact.
This can also be used to group multiple artifact definitions into one for convenience.

3.3. Command source

The command source is a source that consists of the output of a command e.g.

- type: COMMAND
  attributes:
    args: [-qa]
    cmd: /bin/rpm

Where attributes can contain the following values:

Value Description

args

A list arguments to pass to the command.

cmd

The path of the command.

3.4. File source

The file source is a source that consists of the contents of files e.g.

- type: FILE
  attributes:
    paths: ['%%environ_systemroot%%\System32\winevt\Logs\System.evtx']

Where attributes can contain the following values:

Value Description

paths

A list of file paths that can potentially be collected.
The paths can use parameter expansion e.g. %%environ_systemroot%%.
See section: Parameter expansion and globs

3.5. Path source

The path source is a source that consists of the contents of paths e.g.

- type: PATH
  attributes:
    paths: ['\Program Files']
    separator: '\'

Where attributes can contain the following values:

Value Description

paths

A list of file paths that can potentially be collected.
The paths can use parameter expansion e.g. %%environ_systemroot%%.
See section: Parameter expansion and globs

3.6. Windows Registry key source

The Windows Registry key source is a source that consists of the contents of Windows Registry keys e.g.

sources:
- type: REGISTRY_KEY
  attributes:
    keys:
    - 'HKEY_USERS\%%users.sid%%\Software\Microsoft\Internet Explorer\TypedURLs\*'

Where attributes can contain the following values:

Value Description

keys

A list of Windows Registry key paths that can potentially be collected.
The paths can use parameter expansion e.g. %%users.sid%%.
See section: Parameter expansion and globs

3.7. Windows Registry value source

The Windows Registry value source is a source that consists of the contents of Windows Registry values e.g.

- type: REGISTRY_VALUE
  attributes:
    key_value_pairs:
      - {key: 'HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Explorer\WindowsUpdate', value: 'CISCNF4654'}

Where attributes can contain the following values:

Value Description

key_value_pairs

A list of Windows Registry key paths and value names that can potentially be collected.
The key path can use parameter expansion e.g. %%users.sid%%.
See section: Parameter expansion and globs

3.8. Windows Management Instrumentation (WMI) query source

The Windows Management Instrumentation (WMI) query source is a source that consists of the output of Windows Management Instrumentation (WMI) queries e.g.

- type: WMI
  attributes:
    query: SELECT * FROM Win32_UserAccount WHERE name='%%users.username%%'

Where attributes can contain the following values:

Value Description

query

The Windows Management Instrumentation (WMI) query.
The query can use parameter expansion e.g. %%users.username%%.
See section: Parameter expansion and globs

4. Conditions

TODO: work is in progress to move this out of GRR into something more portable.

Artifact conditions are currently implemented using the objectfilter system that allows you to apply complex conditions to the attributes of an object. Artifacts can apply conditions to any of the Knowledge Base object attributes as defined in the GRR knowledge_base.proto.

Style note: single quotes should be used for strings when writing conditions.

conditions: [os_major_version >= 6 and time_zone == 'America/Los_Angeles']

4.1. Supported operating system

Since operating system (OS) conditions are a very common constraint, this has been provided as a separate option "supported_os" to simplify syntax. For supported_os no quotes are required. The currently supported operating systems are:

  • Darwin (also used for Mac OS X)

  • Linux

  • Windows

supported_os: [Darwin, Linux, Windows]

This can be translated to objectfilter as:

["os =='Darwin'" OR "os=='Linux'" OR "os == 'Windows'"]

5. Labels

Currently the following different labels are defined:

Value Description

Antivirus

Antivirus related artifacts, e.g. quarantine files.

Authentication

Authentication artifacts.

Browser

Web Browser artifacts.

Cloud Storage

Cloud Storage artifacts.

Configuration Files

Configuration files artifacts.

Execution

Contain execution events.

External Media

Contain external media data or events e.g. USB drives.

KnowledgeBase

Artifacts used in knowledge base generation.

Logs

Contain log files.

Memory

Artifacts retrieved from memory.

Network

Describe networking state.

Processes

Describe running processes.

Software

Installed software.

System

Core system artifacts.

Users

Information about users.

Rekall

Artifacts using the Rekall memory forensics framework.

The labes are defined in definitions.py.

6. Style notes

6.1. Artifact definition YAML files

Artifact definition YAML filenames should be of the form:

$FILENAME.yaml

Where $FILENAME is name of the file e.g. windows.yaml.

Each defintion file should have a comment at the top of the file with a one-line summary describing the type of artifact definitions contained in the file e.g.

# Windows specific artifacts.

6.2. Lists

Generally use the short [] format for single-item lists that fit inside 80 characters to save on unnecessary line breaks:

labels: [Logs]
supported_os: [Windows]
urls: ['http://www.forensicswiki.org/wiki/Windows_XML_Event_Log_(EVTX)']

and the bulleted list form for multi-item lists or long lines:

paths:
  - 'HKEY_USERS\%%users.sid%%\Software\Microsoft\Windows\CurrentVersion\Run\*'
  - 'HKEY_USERS\%%users.sid%%\Software\Microsoft\Windows\CurrentVersion\RunOnce\*'
  - 'HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Run\*'
  - 'HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\RunOnce\*'
  - 'HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\RunOnceEx\*'

6.3. Quotes

Quotes should not be used for doc strings, artifact names, and simple lists like labels and supported_os.

Paths and URLs should use single quotes to avoid the need for manual escaping.

paths: ['%%environ_temp%%\*.exe']
urls: ['http://www.forensicswiki.org/wiki/Windows_XML_Event_Log_(EVTX)']

Double quotes should be used where escaping causes problems, such as regular expressions:

content_regex_list: ["^%%users.username%%:[^:]*\n"]

6.4. Minimize the number of definitions by using multiple sources

To minimize the number of artifacts in the list, combine them using the supported_os and conditions attributes where it makes sense. e.g. rather than having FirefoxHistoryWindows, FirefoxHistoryLinux, FirefoxHistoryDarwin, do:

name: FirefoxHistory
doc: Firefox places.sqlite files.
sources:
- type: FILE
  attributes:
    paths:
      - %%users.localappdata%%\Mozilla\Firefox\Profiles\*\places.sqlite
      - %%users.appdata%%\Mozilla\Firefox\Profiles\*\places.sqlite
  supported_os: [Windows]
- type: FILE
  attributes:
    paths: [%%users.homedir%%/Library/Application Support/Firefox/Profiles/*/places.sqlite]
  supported_os: [Darwin]
- type: FILE
  attributes:
    paths: ['%%users.homedir%%/.mozilla/firefox/*/places.sqlite']
  supported_os: [Linux]
labels: [Browser]
supported_os: [Windows, Linux, Darwin]

7. Parameter expansion and globs

TODO