How to write a parser
This page is intended to give you an introduction to developing a parser for plaso.
- First a step-by-step example is provided to create a simple binary parser for the Safari Cookies.binarycookies file.
- At the bottom are some common troubleshooting tips for problems that others have run into before you.
This page assumes you have at least a basic understanding of programming in Python and use of git.
Before you can write a binary file parser you will need to have a good understanding of the file format. A description of the Safari Cookies.binarycookies format can be found here.
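Reading the file header by hand is a quick way to check your understanding of the format before writing any plaso code. The sketch below assumes the header layout given in public descriptions of the format: a 4-byte signature `cook`, a big-endian 32-bit number of pages, then one big-endian 32-bit size per page. The function name `read_header` is illustrative and not part of plaso.

```python
import io
import struct


def read_header(file_object):
  """Reads the page count and page sizes from a binary cookies header.

  Assumed layout: 4-byte signature b'cook', big-endian uint32 number of
  pages, then one big-endian uint32 size per page.
  """
  header = file_object.read(8)
  if len(header) != 8:
    raise ValueError('File too small to contain a header.')

  signature, number_of_pages = struct.unpack('>4sI', header)
  if signature != b'cook':
    raise ValueError('Unsupported signature: {0!r}'.format(signature))

  sizes_data = file_object.read(4 * number_of_pages)
  if len(sizes_data) != 4 * number_of_pages:
    raise ValueError('Truncated page size table.')

  page_sizes = struct.unpack('>{0:d}I'.format(number_of_pages), sizes_data)
  return number_of_pages, list(page_sizes)


# Synthetic header with two pages of 64 and 128 bytes, built for illustration.
sample = io.BytesIO(struct.pack('>4sIII', b'cook', 2, 64, 128))
number_of_pages, page_sizes = read_header(sample)
```

A check like the signature test above is also what lets a parser reject files it does not understand early, before any events are produced.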
Parsers vs. Plugins
Before starting work on a parser, check if Plaso already has a parser that handles the underlying format of the file you're parsing. Plaso currently supports plugins for the following file formats:
- Web Browser Cookies
- Windows Registry
If the artifact you're trying to parse is in one of these formats, you need to write a plugin of the appropriate type, rather than a parser.
For our example, however, the Safari Cookies.binarycookies file is in its own binary format, so a separate parser is appropriate.
First we make a representative test file and add it to the test_data/ directory, in our example:
Make sure that the test file does not contain sensitive or copyrighted material.
Parsers, formatters, events and event data
- parser; subclass of plaso.parsers.interface.FileObjectParser that extracts events from the content of a file.
- formatter (or event formatter); subclass of plaso.formatters.interface.EventFormatter that generates a human-readable description of the event data.
- event; subclass of plaso.containers.events.EventObject that represents an event.
- event data; subclass of plaso.containers.events.EventData that represents data related to the event.
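How these four pieces relate can be sketched with plain Python stand-ins. None of the classes below are plaso's; the names `CookieEventData`, `CookieEvent`, `CookieFormatter` and `ParseCookies`, and the format string, are hypothetical and only illustrate the division of labour: the parser extracts values, the event carries the timestamp, the event data carries the extracted values, and the formatter turns event data into a readable message.

```python
class CookieEventData(object):
  """Stand-in for event data: the values extracted from one cookie."""

  def __init__(self, url, cookie_name):
    self.url = url
    self.cookie_name = cookie_name


class CookieEvent(object):
  """Stand-in for an event: a timestamp and what the timestamp means."""

  def __init__(self, timestamp, timestamp_description):
    self.timestamp = timestamp
    self.timestamp_description = timestamp_description


class CookieFormatter(object):
  """Stand-in for a formatter: describes event data in readable form."""

  FORMAT_STRING = 'Cookie {cookie_name} set by {url}'

  def GetMessage(self, event_data):
    return self.FORMAT_STRING.format(
        cookie_name=event_data.cookie_name, url=event_data.url)


def ParseCookies(records):
  """Stand-in for a parser: turns raw records into events and event data."""
  for timestamp, url, cookie_name in records:
    event = CookieEvent(timestamp, 'Cookie Expires')
    event_data = CookieEventData(url, cookie_name)
    yield event, event_data


# One made-up record: (timestamp, url, cookie name).
records = [(1530000000, 'example.com', 'session')]
event, event_data = next(ParseCookies(records))
message = CookieFormatter().GetMessage(event_data)
```

Keeping the formatter separate from the parser means the same extracted values can be presented differently without re-parsing the file.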
Writing the parser
Registering the parser
Add an import for the parser to:
from plaso.parsers import safari_cookies
When plaso.parsers is imported this will load the safari_cookies module (safari_cookies.py).
The parser class
BinaryCookieParser is registered using manager.ParsersManager.RegisterParser(), which is called at the end of the module:
# -*- coding: utf-8 -*-
"""Parser for Safari Binary Cookie files."""

from plaso.parsers import interface
from plaso.parsers import manager


class BinaryCookieParser(interface.FileObjectParser):
  """Parser for Safari Binary Cookie files."""

  NAME = 'binary_cookies'
  DESCRIPTION = 'Parser for Safari Binary Cookie files.'

  def ParseFileObject(self, parser_mediator, file_object, **kwargs):
    """Parses a Safari binary cookie file-like object.

    Args:
      parser_mediator (ParserMediator): parser mediator.
      file_object (dfvfs.FileIO): file-like object to be parsed.

    Raises:
      UnableToParseFile: when the file cannot be parsed, this will signal
          the event extractor to apply other parsers.
    """
    ...


manager.ParsersManager.RegisterParser(BinaryCookieParser)
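Registration at import time is an instance of the registry pattern: the module-level call runs once when the module is first imported, so importing the module is enough to make the parser known. The sketch below is a simplified stand-in and not plaso's actual manager; the `GetParserNames` helper and the duplicate-name check are assumptions made for illustration.

```python
class ParsersManager(object):
  """Stand-in registry that maps parser names to parser classes."""

  _parser_classes = {}

  @classmethod
  def RegisterParser(cls, parser_class):
    """Registers a parser class under its lower case NAME."""
    parser_name = parser_class.NAME.lower()
    if parser_name in cls._parser_classes:
      raise KeyError('Parser class already set for name: {0:s}.'.format(
          parser_class.NAME))
    cls._parser_classes[parser_name] = parser_class

  @classmethod
  def GetParserNames(cls):
    """Returns the sorted names of all registered parsers (hypothetical)."""
    return sorted(cls._parser_classes)


class BinaryCookieParser(object):
  """Stand-in parser class; only the attributes used for registration."""

  NAME = 'binary_cookies'
  DESCRIPTION = 'Parser for Safari Binary Cookie files.'


# Runs when the module is imported, which is what makes the import in
# the parsers package sufficient to register the parser.
ParsersManager.RegisterParser(BinaryCookieParser)
parser_names = ParsersManager.GetParserNames()
```

Because the registry is keyed on NAME, each parser needs a unique name; in this sketch, registering a second class with the same name raises KeyError.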
Writing the event formatter