Documentation updates for recent API changes (#174)
Add more documentation on using the new TdmsFile.open method and
prefer use of channel[:] to access data rather than channel.data.
adamreeve committed Mar 31, 2020
1 parent a957c07 commit 06bec71
Showing 7 changed files with 208 additions and 84 deletions.
38 changes: 27 additions & 11 deletions README.rst
@@ -14,31 +14,47 @@ npTDMS

npTDMS is a cross-platform Python package for reading and writing TDMS files as produced by LabVIEW,
and is built on top of the `numpy <http://www.numpy.org/>`__ package.
Data read from a TDMS file is stored in numpy arrays,
and numpy arrays are also used when writing TDMS file.
Data is read from TDMS files as numpy arrays,
and npTDMS also allows writing numpy arrays to TDMS files.

TDMS files are structured in a hierarchy of groups and channels.
A TDMS file can contain multiple groups, which may each contain multiple channels.
A file, group and channel may all have properties associated with them,
but only channels have array data.

TDMS files can contain multiple data channels organised into groups.
Typical usage when reading a TDMS file might look like::

from nptdms import TdmsFile

tdms_file = TdmsFile.read("path_to_file.tdms")
channel = tdms_file['Group']['Channel1']
data = channel.data
time = channel.time_track()
# do stuff with data
group = tdms_file['group name']
channel = group['channel name']
channel_data = channel[:]
channel_properties = channel.properties

The ``TdmsFile.read`` method reads all data into memory immediately.
When you are working with large TDMS files or don't need to read all channel data,
you can instead use ``TdmsFile.open``. This is more memory efficient but
accessing data can be slower::

with TdmsFile.open("path_to_file.tdms") as tdms_file:
    group = tdms_file['group name']
    channel = group['channel name']
    channel_data = channel[:]

And to write a file::
npTDMS also has rudimentary support for writing TDMS files.
Using npTDMS to write a TDMS file looks like::

from nptdms import TdmsWriter, ChannelObject
import numpy

with TdmsWriter("path_to_file.tdms") as tdms_writer:
    data_array = numpy.linspace(0, 1, 10)
    channel = ChannelObject('Group', 'Channel1', data_array)
    channel = ChannelObject('group name', 'channel name', data_array)
    tdms_writer.write_segment([channel])

For more information, see the `npTDMS documentation <http://nptdms.readthedocs.io>`__.
For more detailed documentation on reading and writing TDMS files,
see the `npTDMS documentation <http://nptdms.readthedocs.io>`__.

Installation
------------
@@ -72,7 +88,7 @@ Limitations
-----------

This module doesn't support TDMS files with XML headers or with
extended floating point data.
extended precision floating point data.

TDMS files support timestamps with a resolution of 2^-64 seconds but these
are read as numpy datetime64 values with microsecond resolution.
4 changes: 2 additions & 2 deletions docs/index.rst
@@ -3,8 +3,8 @@ Welcome to npTDMS's documentation

npTDMS is a cross-platform Python package for reading and writing TDMS files as produced by LabVIEW,
and is built on top of the `numpy <http://www.numpy.org/>`__ package.
Data read from a TDMS file is stored in numpy arrays,
and numpy arrays are also used when writing TDMS file.
Data is read from TDMS files as numpy arrays,
and npTDMS also allows writing numpy arrays to TDMS files.

Contents
========
4 changes: 2 additions & 2 deletions docs/limitations.rst
@@ -1,5 +1,5 @@
Limitations
===========

npTDMS currently doesn't support reading TDMS files with XML headers,
or files with extended floating point data.
npTDMS currently doesn't support reading TDMS files with XML headers (TDM files),
or files with extended precision floating point data.
20 changes: 16 additions & 4 deletions docs/quickstart.rst
@@ -28,14 +28,26 @@ Typical usage when reading a TDMS file might look like::
# Access dictionary of properties:
properties = channel.properties
# Access numpy array of data for channel:
data = channel.data
# do stuff with data and properties...
data = channel[:]
# Access a subset of data
data_subset = channel[100:200]

Or to access a channel by group name and channel name directly::

channel = tdms_file[group_name][channel_name]
group = tdms_file[group_name]
channel = group[channel_name]

And to write a TDMS file::
The ``TdmsFile.read`` method reads all data into memory immediately.
When you are working with large TDMS files or don't need to read all channel data,
you can instead use ``TdmsFile.open``. This is more memory efficient but
accessing data can be slower::

with TdmsFile.open("path_to_file.tdms") as tdms_file:
    channel = tdms_file[group_name][channel_name]
    channel_data = channel[:]

npTDMS also has rudimentary support for writing TDMS files.
Using npTDMS to write a TDMS file looks like::

from nptdms import TdmsWriter, ChannelObject
import numpy
106 changes: 63 additions & 43 deletions docs/reading.rst
@@ -2,30 +2,45 @@ Reading TDMS files
==================

To read a TDMS file, create an instance of the :py:class:`~nptdms.TdmsFile`
class using the static :py:meth:`~nptdms.TdmsFile.read` method, passing the path to the file, or an already opened file::
class using one of the static :py:meth:`nptdms.TdmsFile.read` or :py:meth:`nptdms.TdmsFile.open` methods,
passing the path to the file, or an already opened file.
The :py:meth:`~nptdms.TdmsFile.read` method will read all channel data immediately::

tdms_file = TdmsFile.read("my_file.tdms")

This will read all of the contents of the TDMS file, then groups within the file
can be accessed using the
:py:meth:`~nptdms.TdmsFile.groups` method, or by indexing into the file with a group name::
If using the :py:meth:`~nptdms.TdmsFile.open` method, only the file metadata will be read initially,
and the returned :py:class:`~nptdms.TdmsFile` object should be used as a context manager to keep
the file open and allow channel data to be read on demand::

with TdmsFile.open("my_file.tdms") as tdms_file:
    # Use tdms_file
    ...

Using an instance of :py:class:`~nptdms.TdmsFile`, groups within the file
can be accessed by indexing into the file with a group name, or all groups
can be retrieved as a list with the :py:meth:`~nptdms.TdmsFile.groups` method::

all_groups = tdms_file.groups()
group = tdms_file["group name"]
all_groups = tdms_file.groups()

A group is an instance of the :py:class:`~nptdms.TdmsGroup` class,
and can contain multiple channels of data. You can access channels in a group with the
:py:meth:`~nptdms.TdmsGroup.channels` method or by indexing into the group with a channel name::
and can contain multiple channels of data.
You can access channels in a group by indexing into the group with a channel name
or retrieve all channels as a list with the :py:meth:`~nptdms.TdmsGroup.channels` method::

all_group_channels = group.channels()
channel = group["channel name"]
all_group_channels = group.channels()

Channels are instances of the :py:class:`~nptdms.TdmsChannel` class
and have a ``data`` attribute for accessing the channel data as a numpy array::
and act like arrays. They can be indexed with an integer index to retrieve
a single value or with a slice to retrieve all data or a subset of data
as a numpy array::

data = channel.data
all_channel_data = channel[:]
data_subset = channel[100:200]
first_channel_value = channel[0]

If the array is waveform data and has the ``wf_start_offset`` and ``wf_increment``
If the channel contains waveform data and has the ``wf_start_offset`` and ``wf_increment``
properties, you can get an array of relative time values for the data using the
:py:meth:`~nptdms.TdmsChannel.time_track` method::

@@ -36,13 +51,13 @@ you can pass ``absolute_time=True`` to get an array of absolute times in UTC.
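
As a rough illustration of what the relative time values represent, the sketch below assumes that sample ``i`` of a waveform channel sits at ``wf_start_offset + i * wf_increment`` seconds. ``relative_time_track`` is a hypothetical helper written for this example, not part of npTDMS, and it uses plain Python lists rather than numpy arrays:

```python
def relative_time_track(num_samples, wf_start_offset, wf_increment):
    """Return the assumed relative time, in seconds, of each sample.

    Hypothetical helper: assumes evenly spaced samples starting at
    wf_start_offset and separated by wf_increment.
    """
    return [wf_start_offset + i * wf_increment for i in range(num_samples)]


# Four samples starting 0.5 s after the waveform start, spaced 0.25 s apart
times = relative_time_track(4, wf_start_offset=0.5, wf_increment=0.25)
```

With the values above, ``times`` holds one entry per sample, starting at the offset and increasing by the increment.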

A TDMS file, group and channel can all have properties associated with them, so each of the
:py:class:`~nptdms.TdmsFile`, :py:class:`~nptdms.TdmsGroup` and :py:class:`~nptdms.TdmsChannel`
classes provide access to these properties as a dictionary using their ``properties`` property::
classes provide access to these properties as a dictionary using their ``properties`` attribute::

# Iterate over all items in the file properties and print them
for name, value in tdms_file.properties.items():
    print("{0}: {1}".format(name, value))

# Get a single property value
# Get a single property value from the file
property_value = tdms_file.property("my_property_name")

# Get a group property
@@ -55,34 +70,21 @@ Reading large files
-------------------

TDMS files are often too large to easily fit in memory so npTDMS offers a few ways to deal with this.

If you want to work with all file data as if it was in memory,
you can pass the ``memmap_dir`` argument when reading a file.
This will read data into memory mapped numpy arrays on disk,
and your operating system will then page data in and out of memory as required.

If you have a large file with multiple channels but only need to read them individually,
you can open a TDMS file for reading without reading all the data immediately
A TDMS file can be opened for reading without reading all the data immediately
using the static :py:meth:`~nptdms.TdmsFile.open` method,
then read channel data as required::
then channel data is read as required::

with TdmsFile.open(tdms_file_path) as tdms_file:
    channel = tdms_file[group_name][channel_name]
    channel_data = channel.read_data()

You also have the option to read only a subset of the data.
For example, to read 200 data points, beginning at offset 1,000::

with TdmsFile.open(tdms_file_path) as tdms_file:
    channel = tdms_file[group_name][channel_name]
    offset = 1000
    length = 200
    channel_data = channel.read_data(offset, length)

Alternatively, you may have an application where you wish to stream all data chunk by chunk.
:py:meth:`~nptdms.TdmsFile.data_chunks` is a generator that produces instances of
:py:class:`~nptdms.DataChunk`, which can be used after opening a TDMS file with
:py:meth:`~nptdms.TdmsFile.open`.
    all_channel_data = channel[:]
    data_subset = channel[100:200]

TDMS files are written in multiple segments, where each segment can in turn have
multiple chunks of data.
When accessing a value or a slice of data in a channel, npTDMS will read whole chunks at a time.
npTDMS also allows streaming data from a file chunk by chunk using
:py:meth:`nptdms.TdmsFile.data_chunks`. This is a generator that produces instances of
:py:class:`~nptdms.DataChunk`.
For example, to compute the mean of a channel::

channel_sum = 0.0
@@ -94,19 +96,27 @@ For example, to compute the mean of a channel::
channel_sum += channel_chunk[:].sum()
channel_mean = channel_sum / channel_length
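
The accumulation pattern used here can be tried without a TDMS file. The following is a minimal sketch in which plain Python lists stand in for the per-chunk channel arrays; ``streaming_mean`` is a hypothetical helper written for this example, not an npTDMS function:

```python
def streaming_mean(chunks):
    """Compute a mean over an iterable of chunks, holding one chunk at a time.

    Hypothetical helper: each chunk is any sized iterable of numbers,
    standing in for the data of one channel in one chunk.
    """
    total = 0.0
    count = 0
    for chunk in chunks:
        total += sum(chunk)
        count += len(chunk)
    if count == 0:
        raise ValueError("cannot compute the mean of an empty channel")
    return total / count


# Three chunks of different lengths, as might be produced while streaming
example_chunks = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
mean_value = streaming_mean(example_chunks)
```

Only the running sum and count are retained between chunks, which is why this approach suits channels that are too large to load at once.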

This approach can also be useful to stream TDMS data to another format on disk or into a data store.
It's also possible to stream data chunks for a single channel using :py:meth:`~nptdms.TdmsChannel.data_chunks`::
This approach can be useful to stream TDMS data to another format on disk or into a data store.
It's also possible to stream data chunks for a single channel using :py:meth:`nptdms.TdmsChannel.data_chunks`::

with TdmsFile.open(tdms_file_path) as tdms_file:
    channel = tdms_file[group_name][channel_name]
    for chunk in channel.data_chunks():
        channel_chunk_data = chunk[:]

In cases where you don't need to read the file data and only need to read metadata, you can
If you don't need to read the channel data at all and only need to read metadata, you can
also use the static :py:meth:`~nptdms.TdmsFile.read_metadata` method::

tdms_file = TdmsFile.read_metadata(tdms_file_path)

In cases where you need to work with large arrays of channel data as if all data was in memory,
you can also pass the ``memmap_dir`` argument when reading a file.
This will read data into memory mapped numpy arrays on disk,
and your operating system will then page data in and out of memory as required::

import tempfile

with tempfile.TemporaryDirectory() as temp_memmap_dir:
    tdms_file = TdmsFile.read(tdms_file_path, memmap_dir=temp_memmap_dir)

Timestamps
----------

@@ -115,7 +125,7 @@ Note that TDMS files are capable of storing times with a precision of 2 :sup:`-6
so some precision is lost when reading them in npTDMS.
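
To make the precision loss concrete, here is a minimal sketch of converting a raw TDMS timestamp to microsecond resolution. A TDMS timestamp is stored as a count of whole seconds since 1904-01-01 00:00:00 UTC plus an unsigned 64-bit count of 2^-64 second fractions. The helper name is hypothetical, and truncating rather than rounding the fraction is an assumption made for this example; npTDMS's own conversion may differ:

```python
from datetime import datetime, timedelta, timezone

# TDMS timestamps count from 1904-01-01 00:00:00 UTC.
TDMS_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)


def tdms_timestamp_to_datetime(seconds, second_fractions):
    """Convert a (seconds, 2**-64 fractions) pair to a UTC datetime.

    Hypothetical helper: anything finer than one microsecond is
    discarded (truncation is an assumption of this sketch).
    """
    microseconds = (second_fractions * 1_000_000) >> 64
    return TDMS_EPOCH + timedelta(seconds=seconds, microseconds=microseconds)


# Exactly half a second past the epoch survives the conversion...
half_second = tdms_timestamp_to_datetime(0, 2**63)
# ...but a single 2**-64 second tick is below microsecond resolution
smallest_tick = tdms_timestamp_to_datetime(0, 1)
```

In this sketch ``smallest_tick`` is indistinguishable from the epoch itself, which is the kind of information that is lost at microsecond resolution.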

Timestamps in TDMS files are stored in UTC time and npTDMS does not do any timezone conversions.
If you would like to convert a time from a TDMS file to your local timezone,
If timestamps need to be converted to the local timezone,
the arrow package is recommended. For example::

import datetime
@@ -132,12 +142,22 @@ Scaled data
-----------

The TDMS format supports different ways of scaling data, and DAQmx raw data in particular is usually scaled.
The :py:attr:`~nptdms.TdmsChannel.data` property of the channel returns this scaled data.
You can additionally use the :py:attr:`~nptdms.TdmsChannel.raw_data` property to access the unscaled data.
The data retrieved from a :py:class:`~nptdms.TdmsChannel` has scaling applied.
If you have opened a TDMS file with :py:meth:`~nptdms.TdmsFile.read`,
you can access the raw unscaled data with the :py:attr:`~nptdms.TdmsChannel.raw_data` property of a channel.
Note that DAQmx channels may have multiple raw scalers rather than a single raw data channel,
in which case you need to use the :py:attr:`~nptdms.TdmsChannel.raw_scaler_data`
property to access the raw data as a dictionary of scaler id to raw data array.

When you've opened a TDMS file with :py:meth:`~nptdms.TdmsFile.open`, you instead need to use
:py:meth:`~nptdms.TdmsChannel.read_data`, passing ``scaled=False``::

with TdmsFile.open(tdms_file_path) as tdms_file:
    channel = tdms_file[group_name][channel_name]
    unscaled_data = channel.read_data(scaled=False)

This will return an array of raw data, or a dictionary of scaler id to raw scaler data for DAQmx data.

Conversion to other formats
---------------------------

9 changes: 7 additions & 2 deletions docs/writing.rst
@@ -31,15 +31,15 @@ instance of one of:
- :py:class:`nptdms.TdmsGroup` or :py:class:`nptdms.TdmsChannel`.
A TDMS object that was read from a TDMS file using :py:class:`nptdms.TdmsFile`.

Each of ``RootObject``, ``GroupObject`` and ``ChannelObject``
Each of :py:class:`~nptdms.RootObject`, :py:class:`~nptdms.GroupObject` and :py:class:`~nptdms.ChannelObject`
may optionally have properties associated with them, which
are passed into the ``__init__`` method as a dictionary.
The data types supported as property values are:

- Integers
- Floating point values
- Strings
- datetime objects
- datetime or numpy datetime64 objects
- Boolean values

For more control over the data type used to represent a property value, for example
@@ -63,10 +63,15 @@ is given below::
channel_object = ChannelObject("group_1", "channel_1", data, properties={})

with TdmsWriter("my_file.tdms") as tdms_writer:
    # Write first segment
    tdms_writer.write_segment([
        root_object,
        group_object,
        channel_object])
    # Write another segment with more data for the same channel
    more_data = np.array([6.0, 7.0, 8.0, 9.0, 10.0])
    channel_object = ChannelObject("group_1", "channel_1", more_data, properties={})
    tdms_writer.write_segment([channel_object])

You could also read a TDMS file and then re-write it by passing
:py:class:`~nptdms.TdmsGroup` and :py:class:`~nptdms.TdmsChannel`