# Binary Formats

Last time, we worked directly with byte-level data.  Reading and writing binary data is reasonably straightforward when the structure is known.  Commercial software commonly uses [proprietary data formats](https://en.wikipedia.org/wiki/Proprietary_software#Proprietary_file_formats_and_protocols), and [hundreds of formats](https://en.wikipedia.org/wiki/List_of_file_formats) exist for a wide variety of applications.  In the scientific community, a variety of [formats](https://en.wikipedia.org/wiki/List_of_file_formats#Scientific_data_(data_exchange)) were developed to facilitate the exchange of large scientific data sets, including the popular [HDF5 format](https://portal.hdfgroup.org/display/HDF5/HDF5).  Other formats were developed specifically to support visualization of large data sets; of these, we'll explore the [VTK formats](https://lorensen.github.io/VTKExamples/site/VTKFileFormats/).  Although the HDF5 and VTK formats serve very different purposes, their specifications are well developed, and they serve as good examples for anyone learning to work with such third-party formats.

## HDF5

Of the scientific data formats developed, the [Hierarchical Data Format](https://en.wikipedia.org/wiki/Hierarchical_Data_Format) (HDF) is perhaps the most widely used.

```python
import h5py
import numpy as np

name = 'mytestfile.hdf5'

# The context manager closes the file even if an error occurs,
# so the file can be reopened in write mode on a later run.
with h5py.File(name, 'w') as f:
    # some "attributes" for the "root" group
    f.attrs['file_name'] = name
    f.attrs['author'] = 'roberts'

    # make a new group and give it an attribute
    day1 = f.create_group('day1')
    day1.attrs['note'] = 'Experiments from day 1'

    # add a dataset of 10 random values to the group
    day1.create_dataset('array', data=np.random.rand(10))
```

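Note that reopening the file in write mode while an earlier handle is still open (for instance, when a notebook cell is rerun before the file was closed) fails with `OSError: Unable to create file (unable to truncate a file which is already open)`.  Closing every handle, or opening the file in a `with` block, avoids this.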
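Reading a file back mirrors writing it: groups behave like dictionaries, attributes live in `.attrs`, and datasets can be sliced like NumPy arrays.  The following is a minimal sketch rather than a definitive recipe.  It assumes a scratch file named `readback_demo.hdf5` in a temporary directory (a hypothetical name, chosen so that `mytestfile.hdf5` is left untouched) and uses `np.arange` instead of random data so the values are predictable.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch path; any writable location would do.
path = os.path.join(tempfile.mkdtemp(), 'readback_demo.hdf5')

# Write a small file with the same layout as the example above.
with h5py.File(path, 'w') as f:
    f.attrs['author'] = 'roberts'
    day1 = f.create_group('day1')
    day1.attrs['note'] = 'Experiments from day 1'
    day1.create_dataset('array', data=np.arange(10, dtype=float))

# Reopen the file read-only and walk the hierarchy.
with h5py.File(path, 'r') as f:
    author = f.attrs['author']        # attribute on the root group
    group_names = list(f.keys())      # names of the top-level members
    note = f['day1'].attrs['note']    # attribute on the 'day1' group
    dset = f['day1/array']            # path-style access to a dataset
    shape, dtype = dset.shape, dset.dtype
    values = dset[:]                  # slicing copies the data into a NumPy array

print(author, group_names, note, shape, dtype)
print(values)
```

Because `values` is copied out before the `with` block ends, it remains usable after the file is closed; the `dset` handle itself does not.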

## VTK