
Commit

Merge c9128c4 into 90f481e
Enteee committed Nov 25, 2018
2 parents 90f481e + c9128c4 commit 80f9026
Showing 23 changed files with 404 additions and 168 deletions.
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
doc/*.cast filter=lfs diff=lfs merge=lfs -text
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "pdml2flow/plugins/pdml2flow-elasticsearch"]
path = pdml2flow/plugins/pdml2flow-elasticsearch
url = https://github.com/Enteee/pdml2flow-elasticsearch.git
1 change: 1 addition & 0 deletions .travis.yml
@@ -26,6 +26,7 @@ before_deploy:
deploy:
provider: pypi
user: Ente
skip_existing: true
password:
secure: KU4gvRQOmXR0QIjkL+zj4ozl3ATJPErQQY8c4Qax2iOVL71R9oEbpz6maRo2iOyul5Vn5ofCVz1OQ3dwp0o/M9zzQCg8NLEPVmUpKCD4Vx8wZX+6OguQhC37lN3lBrrR6c/dtmMrtqck+FSfM5AgWWdv2GUohicPeP+ZevUCgb+GZaiJ+mVaOkq9CbZCNRph8UiOVW5IKMlHsWnRRT+I5xoxoJOKJt/Kx/S9fyMYw5V2nKbcjZACANRIkN9Oq7lG8eANE+IfevwahprWVBlrQ2b1BbiIvP3sG141YXdkiBljmr/7Uh7oR2u88iiQuk/Xgs8eLd5e8jF0NweYEy7k7jLQxfNobKDcwcjURJUsDPZqAFRubkvdeeU5/fX9aXo3ktcVCEjPpb+86LuuJ2r5ONq6NslP+GxWN6CzXSmQ5sXehBHFsSzR99MAwCkaGuxOozhfaZY70I3oZa4mHWFfF89aKTTmtMfNMrWQqoLJQXL810BgAgyoiBP8HCh3o7cv/d8fuOdggX+MBLB3rC+kNHYOQt4DXWAYz4ONvDdre58yxEPEXxioICqvYeutI5vilFmVQ2tP7Tb1mw876HfUUm+unB0OhqUTTxzydX0d237IroFbRgC8F4JEmFPA8+dfYsuBtp4VtsIvgf7/gI8vQFm47JZnMOm1YD8HmmPYnec=
on:
75 changes: 70 additions & 5 deletions README.md
@@ -1,12 +1,15 @@
# pdml2flow [![PyPI version](https://badge.fury.io/py/pdml2flow.svg)](https://badge.fury.io/py/pdml2flow)
_Aggregates wireshark pdml to flows, with plugins_

When analyzing network traffic, it is sometimes helpful to group captured frames, for example by port numbers to obtain network flows or by MAC addresses to obtain hardware flows. Doing this in [Wireshark][wireshark] or [tshark] is difficult, and `pdml2flow` was designed to solve this use case. `pdml2flow` reads [tshark] output in the [Packet Description Markup Language][pdml] and writes flows as either JSON or XML. These flows are also accessible from a Python plugin interface. If flow aggregation is not needed, `pdml2frame` can be used to process [pdml] with plugins.

| Branch | Build | Coverage |
| ------- | ------ | -------- |
| master | [![Build Status master]](https://travis-ci.org/Enteee/pdml2flow) | [![Coverage Status master]](https://coveralls.io/github/Enteee/pdml2flow?branch=master) |
| develop | [![Build Status develop]](https://travis-ci.org/Enteee/pdml2flow) | [![Coverage Status develop]](https://coveralls.io/github/Enteee/pdml2flow?branch=develop) |

## Prerequisites

* [python]:
- 3.4
- 3.5
@@ -18,15 +21,17 @@ _Aggregates wireshark pdml to flows, with plugins_
* [pip](https://pypi.python.org/pypi/pip)

## Installation

```shell
$ sudo pip install pdml2flow
```

## Usage

```shell
$ pdml2flow -h
usage: pdml2flow [-h] [--version] [-f FLOW_DEF_STR] [-t FLOW_BUFFER_TIME]
[-l DATA_MAXLEN] [-s] [-c] [-a] [-m] [-d] [+json [args]]
[-l DATA_MAXLEN] [-c] [-a] [-s] [-d] [+json [args]]
[+xml [args]]

Aggregates wireshark pdml to flows
@@ -41,13 +46,12 @@ optional arguments:
packets [default: 180]
-l DATA_MAXLEN Maximum lenght of data in tshark pdml-field [default:
200]
-s Extract show names, every data leaf will now look like
{ raw : [] , show: [] } [default: False]
-c Removes duplicate data when merging objects, will not
preserve order of leaves [default: False]
-a Instead of merging the frames will append them to an
array [default: False]
-m Appends flow metadata [default: False]
-s Extract show names, every data leaf will now look like
{ raw : [] , show: [] } [default: False]
-d Debug mode [default: False]

Plugins:
@@ -59,7 +63,14 @@ Plugins:
lines with null character
```
## Example
### Environment Variables
| Name | Description |
| ---- | ---------- |
| LOAD_PLUGINS | If set to `False`, skips loading of all plugins |
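
A minimal sketch of how such a flag could be read from the environment. In this commit `pdml2flow/conf.py` calls `boolify(environ.get('LOAD_PLUGINS', 'True'))`, but `boolify` itself is imported from `.utils` and not shown here, so the helper below is a hypothetical stand-in and the spellings it accepts are an assumption. `load_plugins_enabled` is likewise an illustrative name, not part of pdml2flow.

```python
import os


def boolify(value):
    """Hypothetical stand-in: map common textual spellings to a bool."""
    text = str(value).strip().lower()
    if text in ('true', '1', 'yes', 'on'):
        return True
    if text in ('false', '0', 'no', 'off', ''):
        return False
    raise ValueError('not a boolean: {!r}'.format(value))


def load_plugins_enabled(environ=os.environ):
    """Illustrative helper: an unset variable falls back to 'True'."""
    return boolify(environ.get('LOAD_PLUGINS', 'True'))
```

With this reading, plugins stay enabled when the variable is unset and are skipped only when it is explicitly set to a false value such as `False`.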
## Examples
Sniff from interface and write json:
```shell
$ tshark -i interface -Tpdml | pdml2flow +json
@@ -87,6 +98,47 @@ $ tshark -i interface -Tpdml | pdml2flow +json | fluentflow rules.js
## Plugins
* [Elasticsearch](https://github.com/Enteee/pdml2flow-elasticsearch)
* see [pdml2flow/plugins/](pdml2flow/plugins/) for a full list of supported plugins
### Interface
```python
# vim: set fenc=utf8 ts=4 sw=4 et :

class Plugin2(object): # pragma: no cover
"""Version 2 plugin interface."""

@staticmethod
def help():
"""Return a help string."""
pass

def __init__(self, *args):
"""Called once during startup."""
pass

def __deinit__(self):
"""Called once during shutdown."""
pass

def flow_new(self, flow, frame):
"""Called every time a new flow is opened."""
pass

def flow_expired(self, flow):
"""Called every time a flow expired, before printing the flow."""
pass

def flow_end(self, flow):
"""Called every time a flow ends, before printing the flow."""
pass

def frame_new(self, frame, flow):
"""Called for every new frame."""
pass
```
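
As an illustration, a plugin could subclass the interface and override only the hooks it needs. The sketch below is not taken from the repository: `FrameCounter` and its `summary` method are hypothetical, and the small `Plugin2` class is a stand-in for `pdml2flow.plugin.Plugin2` so that the example runs on its own. The hooks are invoked by hand here; in a real run pdml2flow decides when and in which order they are called.

```python
class Plugin2(object):
    """Stand-in for pdml2flow.plugin.Plugin2 (only the hooks used below)."""

    def flow_new(self, flow, frame):
        pass

    def frame_new(self, frame, flow):
        pass


class FrameCounter(Plugin2):
    """Hypothetical plugin that counts the flows and frames it is shown."""

    @staticmethod
    def help():
        return 'Counts flows and frames'

    def __init__(self, *args):
        # args would carry whatever was passed after +<plugin name>
        self.flows = 0
        self.frames = 0

    def flow_new(self, flow, frame):
        self.flows += 1

    def frame_new(self, frame, flow):
        self.frames += 1

    def summary(self):
        return '{} flows, {} frames'.format(self.flows, self.frames)

    def __deinit__(self):
        print(self.summary())


# Drive the hooks manually with empty placeholder objects.
plugin = FrameCounter()
plugin.flow_new({}, {})
plugin.frame_new({}, {})
plugin.frame_new({}, {})
```

According to `pdml2flow/conf.py`, plugins are discovered through the `pdml2flow.plugins` entry-point group, so a class like this would also need to be registered under that group by its package before pdml2flow could load it.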
### Create a New Plugin
[![asciicast](https://asciinema.org/a/208963.png)](https://asciinema.org/a/208963)
@@ -120,11 +172,24 @@ Plugins:
character
```
## Testing
* [Test documentation](test/README.md)
Running the tests:
```shell
$ python setup.py test
```
[python]: https://www.python.org/
[wireshark]: https://www.wireshark.org/
[tshark]: https://www.wireshark.org/docs/man-pages/tshark.html
[dict2xml]: https://github.com/delfick/python-dict2xml
[jq]: https://stedolan.github.io/jq/
[FluentFlow]: https://github.com/t-moe/FluentFlow
[pdml]: https://wiki.wireshark.org/PDML
[Build Status master]: https://travis-ci.org/Enteee/pdml2flow.svg?branch=master
[Coverage Status master]: https://coveralls.io/repos/github/Enteee/pdml2flow/badge.svg?branch=master
3 changes: 3 additions & 0 deletions doc/new_plugin.cast
Git LFS file not shown
14 changes: 13 additions & 1 deletion pdml2flow/autovivification.py
@@ -4,6 +4,18 @@

DEFAULT = object();

def getitem_by_path(d, path):
"""Access item in d using path.
a = { 0: { 1: 'item' } }
getitem_by_path(a, [0, 1]) == 'item'
"""
return reduce(
lambda d, k: d[k],
path,
d
)

class AutoVivification(dict):

def clean_empty(self, d=DEFAULT):
@@ -89,7 +101,7 @@ def __getitem__(self, item):
"""
# if the item is a list we autoexpand it
if type(item) is list:
return reduce(lambda d, k: d[k], item, self)
return getitem_by_path(self, item)
else:
try:
return dict.__getitem__(self, item)
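
The extracted `getitem_by_path` helper can be tried in isolation. The sketch below repeats its `reduce`-based body with the `functools` import that Python 3 requires, and applies it to a made-up nested frame using the path that `Conf.FRAME_TIME` holds in `pdml2flow/conf.py`. The sample frame and its timestamp value are assumptions for illustration only.

```python
from functools import reduce


def getitem_by_path(d, path):
    """Walk d one key (or list index) at a time along path."""
    return reduce(lambda node, key: node[key], path, d)


# Made-up frame; only the nesting mirrors Conf.FRAME_TIME.
frame = {'frame': {'time_epoch': {'raw': ['1543104000.0']}}}
frame_time = getitem_by_path(frame, ['frame', 'time_epoch', 'raw', 0])
```

An empty path returns the container itself, and with plain dicts and lists a missing key or index raises the usual `KeyError` or `IndexError`.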
60 changes: 20 additions & 40 deletions pdml2flow/cli.py
@@ -14,6 +14,24 @@
from .plugin import *
from .pdmlhandler import PdmlHandler

def _add_common_arguments(argparser):
argparser.add_argument(
'-s',
dest='EXTRACT_SHOW',
action='store_true',
help='Extract show names, every data leaf will now look like {{ raw : [] , show: [] }} [default: {}]'.format(
Conf.EXTRACT_SHOW
)
)
argparser.add_argument(
'-d',
dest='DEBUG',
action='store_true',
help='Debug mode [default: {}]'.format(
Conf.DEBUG
)
)

def pdml2flow():

def add_arguments_cb(argparser):
@@ -41,14 +59,6 @@ def add_arguments_cb(argparser):
Conf.DATA_MAXLEN
)
)
argparser.add_argument(
'-s',
dest='EXTRACT_SHOW',
action='store_true',
help='Extract show names, every data leaf will now look like {{ raw : [] , show: [] }} [default: {}]'.format(
Conf.EXTRACT_SHOW
)
)
argparser.add_argument(
'-c',
dest='COMPRESS_DATA',
@@ -65,22 +75,7 @@ def add_arguments_cb(argparser):
Conf.FRAMES_ARRAY
)
)
argparser.add_argument(
'-m',
dest='METADATA',
action='store_true',
help='Appends flow metadata [default: {}]'.format(
Conf.METADATA
)
)
argparser.add_argument(
'-d',
dest='DEBUG',
action='store_true',
help='Debug mode [default: {}]'.format(
Conf.DEBUG
)
)
_add_common_arguments(argparser)

def postprocess_conf_cb(conf):
"""Split each flowdef to a path."""
@@ -101,22 +96,7 @@ def postprocess_conf_cb(conf):
def pdml2frame():

def add_arguments_cb(argparser):
argparser.add_argument(
'-s',
dest='EXTRACT_SHOW',
action='store_true',
help='Extract show names, every data leaf will now look like {{ raw : [] , show: [] }} [default: {}]'.format(
Conf.EXTRACT_SHOW
)
)
argparser.add_argument(
'-d',
dest='DEBUG',
action='store_true',
help='Debug mode [default: {}]'.format(
Conf.DEBUG
)
)
_add_common_arguments(argparser)

def postprocess_conf_cb(conf):
conf['DATA_MAXLEN'] = sys.maxsize
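
The effect of moving `-s` and `-d` into `_add_common_arguments` can be sketched with plain `argparse`. This is a simplified illustration under stated assumptions, not the project's code: `build_parser` is a hypothetical stand-in for the parser construction that happens in `Conf.load`, the help strings are shortened, and only one flow-specific option (`-c`) is reproduced.

```python
import argparse


def _add_common_arguments(argparser):
    """Options shared by both command line tools."""
    argparser.add_argument('-s', dest='EXTRACT_SHOW', action='store_true',
                           help='Extract show names')
    argparser.add_argument('-d', dest='DEBUG', action='store_true',
                           help='Debug mode')


def build_parser(prog, add_arguments_cb):
    """Hypothetical stand-in for the parser setup done in Conf.load."""
    parser = argparse.ArgumentParser(prog=prog)
    add_arguments_cb(parser)
    return parser


def flow_arguments(argparser):
    """Flow-specific options first, then the shared ones."""
    argparser.add_argument('-c', dest='COMPRESS_DATA', action='store_true',
                           help='Remove duplicate data when merging')
    _add_common_arguments(argparser)


flow_parser = build_parser('pdml2flow', flow_arguments)
frame_parser = build_parser('pdml2frame', _add_common_arguments)

flow_conf = vars(flow_parser.parse_args(['-c', '-d']))
frame_conf = vars(frame_parser.parse_args(['-s']))
```

Both parsers now receive identical definitions for the shared flags from one place, so a change to a help text or default only has to be made once.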
65 changes: 37 additions & 28 deletions pdml2flow/conf.py
@@ -1,7 +1,7 @@
#!/usr/bin/env python3
# vim: set fenc=utf8 ts=4 sw=4 et :
import sys
from os import path
from os import path, environ
from shlex import split
from base64 import b32encode, b32decode
from pkg_resources import require, DistributionNotFound
@@ -10,7 +10,7 @@
from inspect import isclass

from .plugin import Plugin2
from .utils import call_plugin
from .utils import call_plugin, make_argparse_help_safe, boolify

DEFAULT = object()

@@ -58,10 +58,11 @@ def get_version():
FRAMES_ARRAY = False
FRAME_TIME = ['frame', 'time_epoch', 'raw', 0]
DEBUG = False
METADATA = False
PARSE_SOURCE = sys.stdin
SUPPORTED_PLUGIN_INTERFACES = [Plugin2]
LOAD_PLUGINS = boolify(environ.get('LOAD_PLUGINS', 'True'))
PLUGINS = []
PLUGIN_GROUP_BASE = 'pdml2flow.plugins.base'
PLUGIN_GROUP = 'pdml2flow.plugins'
PLUGIN_CONF_NAME = 'conf.ini'

@@ -106,33 +107,41 @@ def load(description, add_arguments_cb = lambda x: None, postprocess_conf_cb = l
plugin_argparser = argparser.add_argument_group('Plugins')

plugins = {}
for entry_point in iter_entry_points(group = Conf.PLUGIN_GROUP):
name = str(entry_point).split(' =',1)[0]
plugin = entry_point.load()
if isclass(plugin) \
and not plugin in Conf.SUPPORTED_PLUGIN_INTERFACES \
and any([
issubclass(plugin, supported_plugin_interface)
for supported_plugin_interface in Conf.SUPPORTED_PLUGIN_INTERFACES
]):

plugin_argparser.add_argument(
'+{}'.format(name),
dest = 'PLUGIN_{}'.format(name),
type = str,
nargs = '?',
default = DEFAULT,
metavar = 'args'.format(name),
help = call_plugin(
plugin,
'help'
def load_plugin_group(group):
"""Load all plugins from the given plugin_group."""
for entry_point in iter_entry_points(group = group):
name = str(entry_point).split(' =',1)[0]
plugin = entry_point.load()
if isclass(plugin) \
and not plugin in Conf.SUPPORTED_PLUGIN_INTERFACES \
and any([
issubclass(plugin, supported_plugin_interface)
for supported_plugin_interface in Conf.SUPPORTED_PLUGIN_INTERFACES
]):

plugin_argparser.add_argument(
'+{}'.format(name),
dest = 'PLUGIN_{}'.format(name),
type = str,
nargs = '?',
default = DEFAULT,
metavar = 'args'.format(name),
help = make_argparse_help_safe(
call_plugin(
plugin,
'help'
)
)
)
)

# register plugin
plugins[name] = plugin
else:
warning('Plugin not supported: {}'.format(name))
# register plugin
plugins[name] = plugin
else:
warning('Plugin not supported: {}'.format(name))

load_plugin_group(Conf.PLUGIN_GROUP_BASE)
if Conf.LOAD_PLUGINS:
load_plugin_group(Conf.PLUGIN_GROUP)

conf = vars(
argparser.parse_args([