Brian Trammell edited this page Oct 10, 2013 · 4 revisions

The pytools directory in QoF contains a qof.py module. Its functions support the analysis of QoF-produced IPFIX data using Pandas (http://pandas.pydata.org) in IPython notebooks, together with the python-ipfix module. This is how the data analysis for continued QoF development is done.

The IPFIX module and tools require Python 3.3; I generally work in a locally installed copy of Python 3 to remain independent of the system Python. To get the toolchain up and running:

  1. Install Python 3.3; I generally install from source from www.python.org to a subdirectory of my home directory, because the interaction between Debian packaged python modules and pip can be a little unpredictable. To make sure you get the right modules in the library, on Debian-like systems you’ll want to install libbz2-dev and libsqlite3-dev first.
  2. Install setuptools and pip; make sure you’re using the Python 3 you just installed when running setup.py install in these modules to install to the right paths.
  3. Install numpy: pip-3.3 install numpy
  4. Install scipy: pip-3.3 install scipy. This has some dependencies: on Debian-like systems, libblas-dev, liblapack-dev, and gfortran seem to be the necessary packages.
  5. Install matplotlib: pip-3.3 install matplotlib. There are also dependencies here: on Debian-like systems, libfreetype6-dev and libpng12-dev seem to be the necessary packages.
  6. Install pandas: pip-3.3 install pandas
  7. Install the IPython notebook server dependencies: pip-3.3 install tornado and pip-3.3 install pyzmq.
  8. Install IPython: pip-3.3 install ipython
  9. Install the IPFIX module: pip-3.3 install ipfix; alternatively, you can use the version of the module on GitHub by adding its checkout directory to the PYTHONPATH environment variable.
  10. Make sure the qof.py module is somewhere in PYTHONPATH.
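
As a quick sanity check after the steps above, the following minimal sketch tries to import each module the toolchain relies on and reports the ones that fail. The function name missing_modules and the REQUIRED tuple are illustrative and not part of QoF; note that pyzmq is imported as zmq and ipython as IPython.

```python
import importlib

# Import names for the packages installed in the steps above.
# pyzmq is imported as "zmq" and ipython as "IPython"; "qof" is the
# qof.py module from the pytools directory.
REQUIRED = ("numpy", "scipy", "matplotlib", "pandas",
            "tornado", "zmq", "IPython", "ipfix", "qof")


def missing_modules(names):
    """Return the module names that cannot be imported by this interpreter."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Run with the Python 3 you just installed, missing_modules(REQUIRED) should return an empty list; any name it does return points at the step that needs repeating, or at a PYTHONPATH entry that is missing.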

Now ipython3 notebook --pylab inline will start a notebook server for notebooks in the current directory, which you can connect to (by default) at http://localhost:8888. There are IPython notebooks for certain simple analyses in the pytools directory that you can get started with. You’ll need to use the develop branch of the python ipfix module with these notebooks; add it to PYTHONPATH as above.
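
If you would rather not export PYTHONPATH before starting the server, extending sys.path in the first cell of a notebook has the same effect for that session. This is a sketch only: prepend_to_path is a hypothetical helper, and both directory names are placeholders for wherever your QoF and python-ipfix checkouts actually live.

```python
import os
import sys


def prepend_to_path(*directories):
    """Put each directory at the front of sys.path unless it is already listed.

    Returns the absolute paths that were actually added.
    """
    added = []
    for directory in directories:
        path = os.path.abspath(os.path.expanduser(directory))
        if path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added


# Placeholder checkout locations (assumptions, not fixed paths): substitute your own.
prepend_to_path("~/src/qof/pytools", "~/src/python-ipfix")
```

After this cell has run, import ipfix and import qof resolve against those directories for the rest of the notebook session.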

Otherwise, you can use the pytools directly. These functions are purposely not documented to discourage their use in anything public, and may no longer work as outlined here.

  • Before using the IPFIX tools, import ipfix and then initialize the information model: ipfix.ie.use_iana_default(); ipfix.ie.use_5103_default(); ipfix.ie.use_specfile("/path/to/pytools/qof.iespec").
  • To read an IPFIX file into a dataframe: qof.dataframe_from_ipfix(filename, (ie-name, ie-name, ...)), where the second argument is a list or tuple of Information Element names to use as columns of the dataframe. The resulting dataframe will include only those records in the input file that contain all of the specified Information Elements. If you’re getting empty dataframes, a requested Information Element may not have been specified in the YAML configuration file.
  • qof.drop_lossy(df), when given a dataframe containing tcpSequenceLoss and/or reverseTcpSequenceLoss columns, will return a dataframe without lossy flows.
  • qof.derive_duration(df), when given a dataframe with flowStartMilliseconds and flowEndMilliseconds columns, adds a floating-point durationSeconds column.
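
To make the last two helpers concrete, here is a rough pandas sketch of the behaviour described above. It is not the code in qof.py. It assumes that a "lossy" flow is one with a nonzero value in either loss column, and that the two timestamp columns hold pandas datetime values; the function names ending in _sketch and the sample data are invented for illustration.

```python
import pandas as pd


def drop_lossy_sketch(df):
    """Keep only flows whose loss counters are zero (assumed meaning of 'lossy')."""
    keep = pd.Series(True, index=df.index)
    for col in ("tcpSequenceLoss", "reverseTcpSequenceLoss"):
        if col in df.columns:  # either column may be absent
            keep &= (df[col] == 0)
    return df[keep]


def derive_duration_sketch(df):
    """Return a copy of df with a floating-point durationSeconds column."""
    out = df.copy()
    # Assumes both columns are datetime64, so the difference is a timedelta.
    delta = out["flowEndMilliseconds"] - out["flowStartMilliseconds"]
    out["durationSeconds"] = delta.dt.total_seconds()
    return out


# Invented sample data: two flows, the second with forward sequence loss.
flows = pd.DataFrame({
    "flowStartMilliseconds": pd.to_datetime(
        ["2013-10-10 12:00:00.000", "2013-10-10 12:00:01.000"]),
    "flowEndMilliseconds": pd.to_datetime(
        ["2013-10-10 12:00:02.500", "2013-10-10 12:00:01.250"]),
    "tcpSequenceLoss": [0, 3],
    "reverseTcpSequenceLoss": [0, 0],
})

clean = derive_duration_sketch(drop_lossy_sketch(flows))
```

With real data, the dataframe returned by qof.dataframe_from_ipfix would take the place of the invented flows frame, and the actual qof functions would replace the sketches.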