===========================================
 PyTables: hierarchical datasets in Python
===========================================

:URL: http://www.pytables.org/


PyTables is a package for managing hierarchical datasets, designed to
cope efficiently and easily with extremely large amounts of data.

It is built on top of the HDF5 library and the NumPy package. It
features an object-oriented interface that, combined with C extensions
for the performance-critical parts of the code (generated using
Cython), makes it a fast yet extremely easy-to-use tool for
interactively saving and retrieving very large amounts of data. One
important feature of PyTables is that it optimizes memory and disk
resources so that data takes much less space (by a factor of 3 to 5,
and more if the data is compressible) than with other solutions such
as relational or object-oriented databases.

Not an RDBMS replacement
------------------------

PyTables is not designed to work as a relational database replacement,
but rather as a teammate. If you want to work with large datasets of
multidimensional data (for example, for multidimensional analysis), or
just provide a categorized structure for some portions of your
cluttered RDBMS, then give PyTables a try. It works well for storing
data from data acquisition systems (DAS), simulation software, network
data monitoring systems (for example, traffic measurements of IP
packets on routers), or as a centralized repository for system logs,
to name only a few possible uses.

Tables
------

A table is defined as a collection of records whose values are stored
in fixed-length fields. All records have the same structure and all
values in each field have the same data type. Requiring "fixed-length"
fields and strict "data types" may seem strange for an interpreted
language like Python, but these constraints serve a useful function if
the goal is to save very large quantities of data (such as those
generated by many scientific applications) in an efficient manner
that reduces demand on CPU time and I/O.
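
As a minimal sketch of what such a fixed-structure table can look like,
the following declares a record layout and then writes and queries a
few rows. It assumes the PyTables 3.x function names (``open_file``,
``create_table``); the ``Reading`` class, its column names and the file
name are made up for illustration only.

```python
import os
import tempfile

import tables


class Reading(tables.IsDescription):
    """Hypothetical record layout: every field has a fixed size and type."""
    sensor = tables.StringCol(16)    # string stored in a 16-byte field
    sample_id = tables.Int32Col()    # 32-bit signed integer
    value = tables.Float64Col()      # 64-bit floating point number


def write_and_query(path):
    """Write ten records, then return the ids of the rows with value > 6."""
    with tables.open_file(path, mode="w") as h5file:
        table = h5file.create_table("/", "readings", Reading, "Example table")
        row = table.row
        for i in range(10):
            row["sensor"] = "probe-%d" % i
            row["sample_id"] = i
            row["value"] = i * 1.5
            row.append()
        table.flush()  # make sure buffered rows reach the file
        # The condition string is evaluated in-kernel, row by row.
        return [int(r["sample_id"]) for r in table.where("value > 6.0")]


# Illustrative file location in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.h5")
selected = write_and_query(path)
print(selected)
```

Because every record has the same size, a query like the one above can
scan the table without first building a Python object for each row.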

Arrays
------

There are other useful objects, such as arrays, enlargeable arrays and
variable-length arrays, that can serve different needs in your
project. Also, quite a bit of effort has been invested to make
browsing the hierarchical data structure a pleasant
experience. PyTables implements a few easy-to-use methods for
browsing. See the documentation (located in the ``doc/`` directory)
for more details.
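
The three array flavors and the browsing facilities can be sketched
roughly as follows. This again assumes the PyTables 3.x function names
(``create_array``, ``create_earray``, ``create_vlarray``,
``walk_nodes``); the group and node names are hypothetical.

```python
import os
import tempfile

import numpy as np
import tables

# Illustrative file location in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "arrays.h5")

with tables.open_file(path, mode="w") as h5file:
    group = h5file.create_group("/", "demo", "Example group")

    # Array: the shape is fixed when the node is created.
    h5file.create_array(group, "fixed", np.arange(6).reshape(2, 3))

    # EArray: enlargeable along the dimension declared with length 0.
    earray = h5file.create_earray(group, "growing", tables.Float64Atom(),
                                  shape=(0, 3))
    earray.append(np.ones((2, 3)))
    earray.append(np.zeros((4, 3)))

    # VLArray: each row may hold a different number of elements.
    vlarray = h5file.create_vlarray(group, "ragged", tables.Int32Atom())
    vlarray.append([1, 2, 3])
    vlarray.append([4])

    grown_shape = tuple(int(n) for n in earray.shape)
    ragged_lengths = [len(row) for row in vlarray]

    # Browsing: visit every leaf (non-group node) below /demo.
    node_paths = sorted(node._v_pathname
                        for node in h5file.walk_nodes("/demo", classname="Leaf"))

print(grown_shape, ragged_lengths, node_paths)
```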

Easy to use
-----------

One of the principal objectives of PyTables is to be user-friendly.
To that end, special Python features like generators, slots and
metaclasses in new-style classes have been used. In addition,
iterators have been implemented where appropriate so as to
enable interactive work to be as productive as possible. For these
reasons, you will need to use Python 2.6 or higher to take advantage of
PyTables.

Platforms
---------

We are using Linux on top of Intel32 and Intel64 boxes as the main
development platforms, but PyTables should be easy to compile/install
on other UNIX or Windows machines.  Nonetheless, caveat emptor: more
testing is needed to achieve complete portability, so we'd appreciate
input on how it compiles and installs on your platform.

Compiling
---------

To compile PyTables you will need, at least, a recent version of the
HDF5 library (C flavor), the Zlib compression library and the NumPy
and Numexpr packages. In addition, if you want to take advantage of
LZO and bzip2 compression support, you will also need recent versions
of those libraries. Both are, however, optional.

We've tested this PyTables version with HDF5 1.8.11/1.8.12, NumPy 1.7.1/1.8.0
and Numexpr 2.2.2, and you *need* to use these versions, or higher, to
make use of PyTables.
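
Before building, it can be handy to check which versions of the Python
prerequisites are already installed. The sketch below is not part of
PyTables: ``version_tuple``, ``meets_minimum`` and ``check_installed``
are hypothetical helpers, and the minimum versions are the ones listed
in the installation steps of this file. It only covers the Python
packages; the version of the HDF5 C library has to be checked
separately.

```python
import importlib
import re

# Minimum versions of the Python prerequisites, as listed in this README.
MINIMUMS = {"numpy": "1.4.1", "numexpr": "2.0"}


def version_tuple(version):
    """Turn a version string such as '1.8.0' into a tuple like (1, 8, 0)."""
    parts = [int(p) for p in re.findall(r"\d+", version)[:3]]
    parts += [0] * (3 - len(parts))  # pad '2.0' to (2, 0, 0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_installed(minimums=MINIMUMS):
    """Map each package name to True, False, or None if it is missing."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
            continue
        report[name] = meets_minimum(module.__version__, minimum)
    return report


if __name__ == "__main__":
    for name, ok in sorted(check_installed().items()):
        print(name, {True: "ok", False: "too old", None: "not installed"}[ok])
```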

Installation
------------

The Python Distutils are used to build and install PyTables, so it is
fairly simple to get things ready to go. Very simple instructions on
how to proceed follow. More detailed instructions, including a
section on binary installation for Windows users, are available in
Chapter 2 of the User's Manual (``doc/usersguide.pdf`` or
http://www.pytables.org/moin/HowToUse).

1. First, make sure that you have HDF5, NumPy and Numexpr installed
   (you will need at least HDF5 1.8.4, NumPy 1.4.1 and Numexpr 2.0;
   HDF5 >= 1.8.7 is strongly recommended).
   If you don't, get them from http://www.hdfgroup.org/HDF5/,
   http://www.numpy.org and http://code.google.com/p/numexpr.
   Compile/install them.

   Optionally, consider installing the excellent LZO compression
   library from http://www.oberhumer.com/opensource/.  You can also
   install the high-performance bzip2 compression library, available
   at http://www.bzip.org/.

2. From the main PyTables distribution directory, run this command
   (plus any extra flags you need)::

    $ python setup.py build_ext --inplace

3. To run the test suite, set the PYTHONPATH environment variable to
   include the ``.`` directory, enter the Python interpreter and issue
   the commands::

    >>> import tables
    >>> tables.test()

   If any test fails, please send us the complete output of the
   test run.

4. To install the entire PyTables Python package, run this command as
   the root user (remember to add any extra flags needed)::

    $ python setup.py install


That's it!  Good luck, and let us know of any bugs, suggestions,
gripes, kudos, etc. you may have.

----

  **Enjoy data!**

  -- The PyTables Team

.. Local Variables:
.. mode: text
.. coding: utf-8
.. fill-column: 70
.. End: