a bit more doc
trolldbois committed Jul 3, 2017
1 parent 433be4f commit 7631baf
Showing 14 changed files with 252 additions and 41 deletions.
4 changes: 2 additions & 2 deletions docs/Haystack_basic_usage.ipynb
@@ -11,9 +11,9 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": null,
    "metadata": {
-    "collapsed": true
+    "collapsed": false
    },
    "outputs": [
     {
57 changes: 57 additions & 0 deletions docs/capture-a-memory-dump.rst
@@ -0,0 +1,57 @@
.. _capture-process-memory:

Capture a process memory to file
================================

First of all, be prepared to need elevated privileges.

On Windows, the most straightforward option is to get a Minidump. The Windows Task Manager
can capture a process's memory to a file. Alternatively, the Microsoft Sysinternals
suite of tools provides either a CLI (procdump.exe) or a GUI (Process Explorer).
Using one of these (with the full memory dump option), you will produce a file
that can be used with the ``haystack-xxx`` entry points using the ``dmp://``
file prefix.

While technically you could use many third-party tools, Haystack
needs memory mapping information to work with the raw memory data.
If nothing else, the python-haystack package includes a dumping tool that
leverages python-ptrace to capture a process's memory. See the ``haystack-live-dump`` tool:

.. code-block:: bash

    # haystack-live-dump <pid> myproc.dump

For live processes
------------------
- ``haystack-live-dump`` captures a process memory dump to a folder (haystack format)

For a Rekall memory dump
------------------------
- ``haystack-rekall-dump`` dumps a specific process to a haystack process dump

For a Volatility memory dump
----------------------------
- ``haystack-volatility-dump`` dumps a specific process to a haystack process dump

Note for Linux users: dumping the memory of a process owned by the same user is possible
if you downgrade the "security" of your system by allowing cross-process ptrace access::

    $ sudo sysctl kernel.yama.ptrace_scope=0
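
Before changing the setting, it can help to check its current value. The following is a minimal sketch, assuming the usual Yama procfs location ``/proc/sys/kernel/yama/ptrace_scope``. The helper name ``read_ptrace_scope`` is hypothetical and is not part of python-haystack.

```python
import os

# Assumption: the Yama LSM exposes its setting at this procfs path.
YAMA_PTRACE_SCOPE = "/proc/sys/kernel/yama/ptrace_scope"


def read_ptrace_scope(path=YAMA_PTRACE_SCOPE):
    """Return the ptrace_scope value as an int, or None if the file is absent."""
    if not os.path.exists(path):
        # Yama is not enabled, or this is not a Linux system.
        return None
    with open(path) as handle:
        return int(handle.read().strip())


if __name__ == "__main__":
    scope = read_ptrace_scope()
    if scope is None:
        print("ptrace_scope is not available on this system")
    elif scope == 0:
        print("cross-process ptrace between same-user processes is allowed")
    else:
        print("ptrace is restricted (ptrace_scope=%d)" % scope)
```

A value of 0 corresponds to the relaxed setting applied by the ``sysctl`` command above.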

Note for Windows users: the memory of most processes can be dumped to the Minidump format
using the Task Manager. (NB: I don't remember whether the process memory mappings are included in that case.)

Making your own memory mappings handler
=======================================

If you have a different technique to access a process's memory, you can implement the
``haystack.abc.IMemoryLoader`` and ``haystack.abc.IMemoryMapping`` interfaces for
your favorite technique.
Check out the `Frida plugin <https://github.com/trolldbois/python-haystack/blob/master/haystack/mappings/fridaprocess.py>`_
for an example.
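
The loader pattern used by the existing plugins in this commit (``FolderLoader``, ``FridaLoader``) can be sketched as below. This is only an illustration: the ``IMemoryLoader`` class here is a local stand-in so the snippet runs without haystack installed, and ``MyMapper``, ``MyLoader`` and the ``myscheme://`` URL are hypothetical names. It also assumes ``opts.target`` is a ``urlparse`` result, as the ``scheme``/``netloc``/``path`` attribute access in the commit suggests.

```python
import abc
from types import SimpleNamespace
from urllib.parse import urlparse


class IMemoryLoader(abc.ABC):
    """Local stand-in for haystack's loader interface (assumption)."""

    @abc.abstractmethod
    def make_memory_handler(self):
        """Return an object giving access to the process memory."""


class MyMapper(object):
    """Hypothetical mapper; a real one would enumerate the memory mappings."""

    def __init__(self, target):
        self.target = target

    def make_memory_handler(self):
        # Placeholder: a real mapper would build a haystack memory handler.
        return {"target": self.target}


class MyLoader(IMemoryLoader):
    desc = 'Load memory through my own technique'

    def __init__(self, opts):
        # Same shape as FridaLoader: the target comes from the URL netloc.
        self.loader = MyMapper(opts.target.netloc)

    def make_memory_handler(self):
        return self.loader.make_memory_handler()


opts = SimpleNamespace(target=urlparse("myscheme://4242"))
handler = MyLoader(opts).make_memory_handler()
```

The loader stays a thin wrapper, so the mapper can be tested on its own without command-line options.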

Alternatively, if you can copy the process's memory mappings to files, you can "interface"
with the basic, simple haystack memory dump file format.
The basic format is a folder containing each memory mapping in a separate file:

- the memory content in a file named after its start/end addresses (ex: 0x000700000-0x000800000)
- a file named 'mappings' containing the memory mappings metadata (ex: mappings)
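
As a rough sketch of the first bullet, the snippet below writes the content of one mapping to a file named after its start and end addresses. The function names and the zero-padding width are assumptions chosen to reproduce the example name above. The layout of the ``mappings`` metadata file is not described here, so the sketch deliberately leaves it out.

```python
import os


def mapping_filename(start, end, width=9):
    """Build a 'start-end' file name; the padding width is an assumption."""
    return "0x%0*x-0x%0*x" % (width, start, width, end)


def write_mapping(folder, start, end, content):
    """Write the raw bytes of one memory mapping into the dump folder."""
    if len(content) != end - start:
        raise ValueError("content size does not match the address range")
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, mapping_filename(start, end))
    with open(path, "wb") as handle:
        handle.write(content)
    return path
```

For instance, ``mapping_filename(0x700000, 0x800000)`` yields the example name given in the list.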
30 changes: 30 additions & 0 deletions docs/getting-started.rst
@@ -0,0 +1,30 @@
Getting started
===============

First you need to install python-haystack_. Please refer to the
:ref:`installation` section of the documentation.

Then you need a process memory dump. Please refer to the :ref:`capture-process-memory`
section of the documentation.
We will name the process memory dump ``memory.dmp`` for the rest of this documentation.

*What is it all about?*

Yeti is about organizing observables, indicators of compromise, TTPs, and
knowledge on threat actors in a single, unified repository. Ideally, this
repository should be queryable in an automated way by other tools (spoiler:
it is!)

Malware stolen data
-------------------

You just analyzed the latest Dridex sample and you figured out that it's using
a subdirectory in the user's ``Roaming`` directory to store its data, and you'd
like to document this. *(Whether this is a strong indicator or not is another
story)*.

You start by adding a new **Entity** of type **Malware** called Dridex. Navigate
to **New > Malware**, and populate the fields.

Creating a Malware Entity
^^^^^^^^^^^^^^^^^^^^^^^^^
54 changes: 54 additions & 0 deletions docs/index.rst
@@ -0,0 +1,54 @@
.. python haystack documentation master file

Welcome to Haystack's documentation!
====================================

**Useful links**

* `Code repository <https://github.com/trolldbois/python-haystack>`_

Summary:
--------

Haystack is a framework dedicated to process heap analysis. The general idea
is that process memory contains user data (the interesting stuff) allocated by
the process, and system metadata allocated by the kernel (in short) to manage
allocation and de-allocation of user data (as one of many kinds of metadata present in there).

This framework assists its user in a programmatic interpretation of the system
allocation metadata, so that the user can then concentrate on interpretation of
the user data itself.

This framework also provides a way to search user-allocated memory for specific
instances of user-defined types such as C records. That mechanism is used internally
to identify the system metadata records used by the memory allocator to manage
allocation of user memory.

The framework also provides a way to reverse engineer the types of memory structures
in use by a process. The reversed types take into account linked lists, pointers
and other value constraints to propose a list of type definitions.

Packages:
---------

The core package python-haystack_ provides the base modules and classes to
search for instances of C records in a process's memory.
Based on type definitions (using Python ctypes) and value constraints defined
by the user, the package can search a process's memory for such instances.
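
To make the idea concrete, here is a minimal sketch of searching a byte buffer for instances of a ctypes record that satisfy a value constraint. It does not use the haystack API: the ``Record`` type, the ``search`` helper and the magic value are all hypothetical, and a real search would walk allocated heap chunks rather than scan a flat buffer.

```python
import ctypes
import struct


class Record(ctypes.Structure):
    """Hypothetical C record: a magic marker followed by a length field."""
    _fields_ = [("magic", ctypes.c_uint32),
                ("length", ctypes.c_uint32)]


def search(buf, record_type, constraint, step=4):
    """Return (offset, instance) pairs for every aligned offset that matches."""
    size = ctypes.sizeof(record_type)
    hits = []
    for offset in range(0, len(buf) - size + 1, step):
        # Interpret the bytes at this offset as a candidate record.
        instance = record_type.from_buffer_copy(buf, offset)
        if constraint(instance):
            hits.append((offset, instance))
    return hits


# Fake "process memory": zeros, with one record planted at offset 24.
memory = bytearray(64)
struct.pack_into("=II", memory, 24, 0xCAFEBABE, 16)

hits = search(bytes(memory), Record,
              lambda r: r.magic == 0xCAFEBABE and 0 < r.length <= 64)
```

The constraint is what keeps the result set small: without it, every aligned offset would be a candidate instance.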

The additional package python-haystack-reverse_ provides a set of tools to
assist in reversing the types used by a process and recreating type definitions.

Contents:
---------

.. toctree::

    installation
    getting-started
    capture-a-memory-dump
    usage

.. _python-haystack: https://github.com/trolldbois/python-haystack/
.. _python-haystack-reverse: https://github.com/trolldbois/python-haystack-reverse/
.. _python-haystack-gui: https://github.com/trolldbois/python-haystack-gui/
.. _python-haystack-docs: https://github.com/trolldbois/python-haystack-docs/
46 changes: 46 additions & 0 deletions docs/installation.rst
@@ -0,0 +1,46 @@
.. _installation:

Installation
============

These procedures were tested on Ubuntu 16.04.

Install from PyPI
-----------------

Install a virtual environment::

    $ virtualenv v_haystack
    $ source v_haystack/bin/activate

Install python-haystack::

    (v_haystack) $ pip install haystack

Keeping it up to date ::

    (v_haystack) $ pip install haystack --upgrade

Clone+Install from GitHub
-------------------------

Clone python-haystack::

    $ git clone https://github.com/trolldbois/python-haystack.git

Setup a virtual environment::

    $ virtualenv v_haystack
    $ source v_haystack/bin/activate

Install python-haystack (won't work otherwise)::

    (v_haystack) $ cd python-haystack
    (v_haystack) ~/python-haystack$ pip install -r requirements
    (v_haystack) ~/python-haystack$ python setup.py install

Keeping it up to date ::

    (v_haystack) $ cd python-haystack
    (v_haystack) ~/python-haystack$ git pull

26 changes: 26 additions & 0 deletions docs/usage.rst
@@ -0,0 +1,26 @@
.. _command-line:

Command line usage
==================

A few entry points exist for different purposes:

- ``haystack-find-heap`` shows details on the Windows HEAP.
- ``haystack-search`` searches for instances of types.
- ``haystack-show`` shows the formatted values of a type instance at a specific memory address.

You can use the following URLs to designate your memory handler/dump:

- ``dir:///path/to/my/haystack/dump/folder`` to use the haystack dump format
- ``dmp:///path/to/my/minidump/file`` to use the Minidump format (Microsoft)
- ``frida://name_or_pid_of_process_to_attach_to`` to use Frida to access a live process's memory
- ``live://name_or_pid_of_process_to_attach_to`` to ptrace a live process
- ``rekall://`` to load a Rekall image
- ``volatility://`` to load a Volatility image
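
The way such a URL selects a loader can be sketched as follows, mirroring the ``make_memory_handler`` function in ``haystack/cli.py`` from this commit. This is a self-contained illustration, not the real code: ``StubLoader`` and ``parse_target`` are hypothetical, the registry only lists two schemes, and it assumes the target URL is parsed with ``urlparse``.

```python
from types import SimpleNamespace
from urllib.parse import urlparse


class StubLoader(object):
    """Hypothetical loader that only reports what it was asked to open."""

    def __init__(self, opts):
        self.opts = opts

    def make_memory_handler(self):
        target = self.opts.target
        # frida://name puts the name in netloc; dir:///path puts it in path.
        return (target.scheme, target.netloc or target.path)


# Stand-in registry; the real one maps each scheme to its own loader class.
SUPPORTED_DUMP_URI = {"dir": StubLoader, "frida": StubLoader}


def make_memory_handler(opts):
    dumptype = opts.target.scheme.lower()
    if dumptype not in SUPPORTED_DUMP_URI:
        raise TypeError('dump type has no case support. %s' % dumptype)
    loader = SUPPORTED_DUMP_URI[dumptype](opts)
    return loader.make_memory_handler()


def parse_target(url):
    """Build a minimal options object from a target URL."""
    return SimpleNamespace(target=urlparse(url))
```

Note the three slashes in ``dir:///...``: the location part is empty, so the folder ends up in ``path`` rather than ``netloc``.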

API usage
=========

.. automodule:: haystack.search.api
    :members:

10 changes: 5 additions & 5 deletions haystack/cli.py
@@ -84,12 +84,12 @@ class HaystackError(Exception):
     pass


-def get_memory_handler(opts):
+def make_memory_handler(opts):
     dumptype = opts.target.scheme.lower()
     if dumptype not in SUPPORTED_DUMP_URI.keys():
         raise TypeError('dump type has no case support. %s' % dumptype)
     loader = SUPPORTED_DUMP_URI[dumptype](opts)
-    return loader.get_memory_handler()
+    return loader.make_memory_handler()


 def get_output(memory_handler, results, rtype):
@@ -120,7 +120,7 @@ def dump_process(opts):
 def search_cmdline(args):
     """ Search for instance of a record_type in the allocated memory of a process. """
     # get the memory handler adequate for the type requested
-    memory_handler = get_memory_handler(args)
+    memory_handler = make_memory_handler(args)
     # try to load constraints
     my_constraints = None
     if args.constraints_file:
@@ -157,7 +157,7 @@ def show_cmdline(args):
     # we need an int
     memory_address = args.address
     # get the memory handler adequate for the type requested
-    memory_handler = get_memory_handler(args)
+    memory_handler = make_memory_handler(args)
     # check the validity of the address
     heap = memory_handler.is_valid_address_value(memory_address)
     if not heap:
@@ -231,7 +231,7 @@ def watch(args):
     refresh = args.refresh_rate
     varname = args.varname
     # get the memory handler adequate for the type requested
-    memory_handler = get_memory_handler(args)
+    memory_handler = make_memory_handler(args)
     # check the validity of the address
     heap = memory_handler.is_valid_address_value(memory_address)
     if not heap:
15 changes: 11 additions & 4 deletions haystack/mappings/cuckoo.py
@@ -39,6 +39,16 @@
 }


+class CuckooProcessLoader(interfaces.IMemoryLoader):
+    desc = 'Load a Cuckoo memory dump'
+
+    def __init__(self, opts):
+        self.loader = CuckooProcessMapper(opts.target.netloc)
+
+    def make_memory_handler(self):
+        return self.loader.make_memory_handler()
+
+
 class CuckooProcessMapper(interfaces.IMemoryLoader):

     def __init__(self, procdump_filename):
@@ -56,10 +66,7 @@ def __init__(self, procdump_filename):
     def _init_mappings(self):
         content_file = open(self.filename, 'rb')
         fsize = os.path.getsize(self.filename)
-        mmap_content = mmap.mmap(
-            content_file.fileno(),
-            fsize,
-            access=mmap.ACCESS_READ)
+        mmap_content = mmap.mmap(content_file.fileno(), fsize, access=mmap.ACCESS_READ)
         log.debug("fsize: %d", fsize)
         maps = []
         # BUG ?
6 changes: 3 additions & 3 deletions haystack/mappings/folder.py
@@ -255,7 +255,7 @@ def load(dumpname, cpu=None, os_name=None):
     elif os.path.isfile(dumpname):
         # try minidump
         from haystack.mappings import minidump
-        mapper = minidump.MDMP_Mapper(dumpname, cpu=cpu, os_name=os_name)
+        mapper = minidump.MinidumpLoader(dumpname, cpu=cpu, os_name=os_name)
     else:
         raise IOError('couldnt load %s' % dumpname)
     memory_handler = mapper.make_memory_handler()
@@ -264,14 +264,14 @@ def load(dumpname, cpu=None, os_name=None):
     return memory_handler


-class FolderLoader:
+class FolderLoader(interfaces.IMemoryLoader):
     desc = 'Load a basic haystack folder memory dump'

     def __init__(self, opts):
         opts.dump_folder_name = opts.target.path
         self.loader = ProcessMemoryDumpLoader(opts.dump_folder_name)

-    def get_memory_handler(self):
+    def make_memory_handler(self):
         return self.loader.make_memory_handler()


20 changes: 5 additions & 15 deletions haystack/mappings/fridaprocess.py
@@ -32,7 +32,7 @@ def _init_mappings(self):
             start = _range.base_address
             end = _range.base_address + _range.size
             perms = _range.protection
-            mappings.append(FridaMemoryMapping(self.session, start, end, perms, 0, 0, 0, 0, None))
+            mappings.append(FridaMemoryMapping(self.session, start, end, perms, None))
             if not is_64 and len(hex(start)) > 8:
                 is_64 = True
             #
@@ -82,18 +82,8 @@ class FridaMemoryMapping(AMemoryMapping):
     useful in list contexts
     """

-    def __init__(self, frida_session, start, end, permissions, offset,
-                 major_device, minor_device, inode, pathname):
-        AMemoryMapping.__init__(
-            self,
-            start,
-            end,
-            permissions,
-            offset,
-            major_device,
-            minor_device,
-            inode,
-            pathname)
+    def __init__(self, frida_session, start, end, permissions, pathname):
+        AMemoryMapping.__init__(self, start, end, permissions, 0, 0, 0, 0, pathname)
         self._session = frida_session

     def read_word(self, address):
@@ -125,13 +115,13 @@ def __getstate__(self):
         return d


-class FridaLoader:
+class FridaLoader(interfaces.IMemoryLoader):
     desc = 'Load a Minidump memory dump'

     def __init__(self, opts):
         self.loader = FridaMapper(opts.target.netloc)

-    def get_memory_handler(self):
+    def make_memory_handler(self):
         return self.loader.make_memory_handler()


