How to write a parser

Ryan Benson edited this page Sep 4, 2018 · 12 revisions

Introduction

This page is intended to give you an introduction into developing a parser for plaso.

  • First a step-by-step example is provided to create a simple binary parser for the Safari Cookies.binarycookies file.
  • At bottom are some common troubleshooting tips that others have run into before you.

This page assumes you have at least a basic understanding of programming in Python and use of git.

Format

Before you can write a binary file parser you will need to have a good understanding of the file format. A description of the Safari Cookies.binarycookies format can be found here.

Parsers vs. Plugins

Before starting work on a parser, check if Plaso already has a parser that handles the underlying format of the file you're parsing. Plaso currently supports plugins for the following file formats:

  • Bencode
  • Web Browser Cookies
  • ESEDB
  • OLECF
  • Plist
  • SQLite
  • Syslog
  • Windows Registry

If the artifact you're trying to parse is in one of these formats, you need to write a plugin of the appropriate type, rather than a parser.

For our example, however, the Safari Cookies.binarycookies file is in its own binary format, so a separate parser is appropriate.

Test data

First we make a representative test file and add it to the test_data/ directory, in our example:

test_data/Cookies.binarycookies

Make sure that the test file does not contain sensitive or copyrighted material.

Parsers, formatters, events and event data

  • parser; subclass of plaso.parsers.interface.FileObjectParser, that extracts events from the content of a file.
  • formatter (or event formatter); subclass of plaso.formatters.interface.EventFormatter, that generates a human readable description of the event data.
  • event; subclass of plaso.containers.events.EventObject, that represents an event
  • event data; subclass of plaso.containers.events.EventData, that represents data related to the event.

Writing the parser

Registering the parser

Add an import for the parser to:

plaso/parsers/__init__.py
from plaso.parsers import safari_cookies

When plaso.parsers is imported this will load the safari_cookies module (safari_cookies.py).

The parser class BinaryCookieParser is registered using manager.ParsersManager.RegisterParser(BinaryCookieParser).

plaso/parsers/safari_cookies.py
# -*- coding: utf-8 -*-
"""Parser for Safari Binary Cookie files."""

from plaso.parsers import interface
from plaso.parsers import manager


class BinaryCookieParser(interface.FileObjectParser):
  """Parser for Safari Binary Cookie files."""

  NAME = 'binary_cookies'
  DESCRIPTION = 'Parser for Safari Binary Cookie files.'

  def ParseFileObject(self, parser_mediator, file_object, **kwargs):
    """Parses a Safari binary cookie file-like object.

    Args:
      parser_mediator (ParserMediator): parser mediator.
      file_object (dfvfs.FileIO): file-like object to be parsed.

    Raises:
      UnableToParseFile: when the file cannot be parsed, this will signal
          the event extractor to apply other parsers.
    """
    ...


manager.ParsersManager.RegisterParser(BinaryCookieParser)
  • NAME
  • DESCRIPTION

Writing the event formatter

plaso/formatters/safari_cookies.py
Clone this wiki locally
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.
Press h to open a hovercard with more details.