
PyTables: hierarchical datasets in Python

PyTables is a package for managing hierarchical datasets and designed to efficiently cope with extremely large amounts of data.

It is built on top of the HDF5 library and the NumPy package. It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code (generated using Cython), makes it a fast yet extremely easy-to-use tool for interactively saving and retrieving very large amounts of data. One important feature of PyTables is that it optimizes memory and disk resources so that data takes much less space (by a factor of 3 to 5, and more if the data is compressible) than with other solutions such as relational or object-oriented databases.

Not an RDBMS replacement

PyTables is not designed to work as a relational database replacement, but rather as a teammate. If you want to work with large datasets of multidimensional data (for example, for multidimensional analysis), or just provide a categorized structure for some portions of your cluttered RDBMS, then give PyTables a try. It works well for storing data from data acquisition systems (DAS), simulation software, and network data monitoring systems (for example, traffic measurements of IP packets on routers), or as a centralized repository for system logs, to name only a few possible uses.


A table is defined as a collection of records whose values are stored in fixed-length fields. All records have the same structure and all values in each field have the same data type. "Fixed-length" fields and strict "data types" may seem a strange requirement for an interpreted language like Python, but they serve a useful function if the goal is to save very large quantities of data (such as those generated by many scientific applications) efficiently, reducing the demand on CPU time and I/O.
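To see why fixed-length fields help, consider the following minimal sketch. It uses only the standard-library `struct` module, not PyTables itself, and the record layout and helper names (`RECORD`, `pack_record`, `read_record`) are hypothetical, chosen for illustration. Because every record has the same size, the position of any record can be computed directly instead of being found by scanning.

```python
import struct

# Hypothetical record layout for illustration only (not a PyTables API):
# a 16-byte name, a 32-bit integer and a 64-bit float, little-endian, no padding.
RECORD = struct.Struct("<16sid")


def pack_record(name, count, value):
    """Encode one record; struct pads (or truncates) the name to 16 bytes."""
    return RECORD.pack(name.encode("ascii"), count, value)


def read_record(buffer, index):
    """Fetch record `index` directly: its offset is index * RECORD.size."""
    raw_name, count, value = RECORD.unpack_from(buffer, index * RECORD.size)
    return raw_name.rstrip(b"\x00").decode("ascii"), count, value


rows = [("alpha", 1, 0.5), ("beta", 2, 1.5), ("gamma", 3, 2.5)]
data = b"".join(pack_record(*row) for row in rows)

print(RECORD.size)           # bytes per record
print(read_record(data, 2))  # third record, located without scanning the others
```

The same idea, applied on disk and combined with typed columns, is what lets a table be read in large, predictable chunks rather than parsed value by value.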


There are other useful objects, such as arrays, enlargeable arrays and variable-length arrays, that can serve different needs in your project. Also, quite a bit of effort has been invested to make browsing the hierarchical data structure a pleasant experience. PyTables implements a few easy-to-use methods for browsing. See the documentation (located in the doc/ directory) for more details.
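As a rough illustration of how tables, arrays and browsing fit together, here is a small sketch. It assumes a PyTables 3.x installation (where the functions are named `open_file`, `create_table` and so on) together with NumPy; the file name, group name and the `Particle` description are made up for the example.

```python
import os
import tempfile

import numpy as np
import tables


class Particle(tables.IsDescription):
    # Every column has a fixed size and type.
    name = tables.StringCol(16)   # 16-byte string
    counts = tables.Int32Col()    # 32-bit integer
    energy = tables.Float64Col()  # double-precision float


# Throwaway file in a temporary directory (hypothetical name).
path = os.path.join(tempfile.mkdtemp(), "sketch.h5")

with tables.open_file(path, mode="w", title="Sketch") as h5file:
    group = h5file.create_group("/", "detector", "Detector data")
    table = h5file.create_table(group, "readout", Particle, "Readout table")

    # Fill the table one row at a time, then flush the buffers to disk.
    row = table.row
    for i in range(5):
        row["name"] = ("particle-%d" % i).encode("ascii")
        row["counts"] = i
        row["energy"] = float(i) ** 2
        row.append()
    table.flush()

    # A plain (non-enlargeable) array stored next to the table.
    h5file.create_array(group, "pressure", np.arange(4.0), "Pressure array")

    # Browse the hierarchy starting at the root group.
    node_paths = [node._v_pathname for node in h5file.walk_nodes("/")]

    # Select rows with a condition string evaluated over the table columns.
    high_energy = [int(r["counts"]) for r in table.where("energy > 3")]

print(node_paths)
print(high_energy)
```

Enlargeable and variable-length arrays are created in a similar way; consult the documentation in the doc/ directory for their exact arguments.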

Easy to use

One of the principal objectives of PyTables is to be user-friendly. To that end, special Python features like generators, slots and metaclasses in new-style classes have been used. In addition, iterators have been implemented where appropriate to make interactive work as productive as possible. For these reasons, you will need Python 2.6 or higher to take advantage of PyTables.


We use Linux on Intel32 and Intel64 boxes as the main development platforms, but PyTables should be easy to compile and install on other UNIX or Windows machines. Nonetheless, caveat emptor: more testing is needed to achieve complete portability, so we'd appreciate input on how it compiles and installs on your platform.


To compile PyTables you will need, at least, a recent version of the HDF5 (C flavor) library, the Zlib compression library, and the NumPy and Numexpr packages. If you want to take advantage of LZO and bzip2 compression support, you will also need recent versions of those libraries; both are optional.

We've tested this PyTables version with HDF5 1.8.11/1.8.12, NumPy 1.7.1/1.8.0 and Numexpr 2.2.2, and you need to use these versions, or higher, to make use of PyTables.
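When checking an installed version against the ones listed above, it helps to compare version numbers as tuples of integers rather than as strings. The sketch below is only an illustration: `version_tuple`, `meets_tested` and the `TESTED` mapping are hypothetical helpers, not part of PyTables, and the parser handles purely numeric dotted versions only.

```python
def version_tuple(version):
    """Turn a dotted version string such as '1.8.11' into (1, 8, 11)."""
    return tuple(int(part) for part in version.split("."))


# Lowest tested versions quoted in the paragraph above.
TESTED = {"HDF5": "1.8.11", "NumPy": "1.7.1", "Numexpr": "2.2.2"}


def meets_tested(name, installed):
    """Return True if `installed` is at least the tested version of `name`."""
    return version_tuple(installed) >= version_tuple(TESTED[name])


print(meets_tested("HDF5", "1.8.9"))   # False: 9 < 11 when compared as integers
print(meets_tested("HDF5", "1.8.12"))  # True
print("1.8.9" >= "1.8.11")             # True: naive string comparison misleads
```

The last line shows the pitfall: compared character by character, "1.8.9" sorts after "1.8.11", which is the wrong answer for version numbers.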


The Python Distutils are used to build and install PyTables, so it is fairly simple to get things ready to go. Following are very simple instructions on how to proceed. However, more detailed instructions, including a section on binary installation for Windows users, are available in Chapter 2 of the User's Manual (doc/usersguide.pdf).

  1. First, make sure that you have HDF5, NumPy and Numexpr installed (you will need at least HDF5 1.8.4, although HDF5 >= 1.8.7 is strongly recommended, NumPy 1.4.1 and Numexpr 2.0). If you don't, get them and compile/install them.

    Optionally, consider installing the excellent LZO compression library. You can also install the high-performance bzip2 compression library.

  2. From the main PyTables distribution directory, run this command (plus any extra flags needed as discussed above):

    $ python setup.py build_ext --inplace

  3. To run the test suite, set the PYTHONPATH environment variable to include the . directory, enter the Python interpreter and issue the commands:

    >>> import tables
    >>> tables.test()

    If any test fails, please send us the complete output of the test run.

  4. To install the entire PyTables Python package, run this command as the root user (remember to add any extra flags needed):

    $ python setup.py install

That's it! Good luck, and let us know of any bugs, suggestions, gripes, kudos, etc. you may have.

Enjoy data!

—The PyTables Team