# Reading and Writing Files

## HDF5

Scipp supports writing variables, data arrays, and datasets to [HDF5](https://portal.hdfgroup.org/documentation/index.html) files.
Reading of HDF5 is supported *only* for these Scipp-specific files.
Other HDF5-based formats are not supported at this point.
For reading the HDF5-based [NeXus](https://www.nexusformat.org/) files, see [scippneutron](https://scipp.github.io/scippneutron/).

<div class="alert alert-warning">

**Warning**
    
We do not recommend using Scipp HDF5 files for archiving or as the sole means of storing valuable data.
The current Scipp HDF5 schema is not a standard and will likely change, because Scipp is still at an early stage of development.
**Future versions of Scipp may not be able to read older files.**
    
That said, the file format is simple and built on the HDF5 standard, so data could still be recovered from such files with generic HDF5 tools if that happens.
Note that the Scipp version is stored as an HDF5 attribute of the saved objects.
    
</div>

In [None]:
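import numpy as np
import scipp as sc

# 'x' has 10 values but the data has 9 points along 'x',
# so the coordinate holds bin edges.
x = sc.Variable(dims=['x'], values=np.arange(10))
var = sc.Variable(dims=['x', 'y'], values=np.random.rand(9, 3))
a = sc.DataArray(data=var, coords={'x': x})

# Write the data array, including its coordinate, to an HDF5 file.
a.save_hdf5(filename='test.hdf5')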

In [None]:
b = sc.io.load_hdf5(filename='test.hdf5')

In [None]:
b

## CSV

<div class="alert alert-info">

**Note**

CSV support requires [pandas](https://pandas.pydata.org/) which must be installed separately.
    
</div>

CSV files can be read into datasets with [scipp.io.load_csv](../generated/modules/scipp.io.csv.load_csv.rst).
For example, the following CSV-encoded data can be read into a dataset as shown:

In [None]:
csv_content = '''a [m],b [s],c
1,5,9
2,6,10
3,7,11
4,8,12'''

In [None]:
from io import StringIO

ds = sc.io.load_csv(StringIO(csv_content), header_parser='bracket')
ds

This example uses `StringIO` to load the data directly from a string.
`load_csv` can also load from a file on your hard drive or from a remote server:
pass the path or URL of the file as the first argument.
See also [pandas.read_csv](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html).

See [scipp.io.load_csv](../generated/modules/scipp.io.csv.load_csv.rst) for more options to customize how the data is structured in the dataset.
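
The examples above pass `header_parser='bracket'`, which relies on column headers of the form `name [unit]`. To illustrate that convention, here is a small pure-Python sketch that splits such headers into a name and a unit string. It is not Scipp's parser: the function `split_bracket_header` is hypothetical, and returning `None` for headers without brackets is a choice made for this sketch.

```python
import re

# Matches 'name [unit]' and captures both parts.
_BRACKET_HEADER = re.compile(r'^(?P<name>.*?)\s*\[(?P<unit>[^\]]*)\]\s*$')


def split_bracket_header(header):
    """Split 'name [unit]' into (name, unit); unit is None without brackets."""
    match = _BRACKET_HEADER.match(header)
    if match is None:
        return header.strip(), None
    return match['name'], match['unit'].strip()


parsed = [split_bracket_header(h) for h in ['a [m]', 'b [s]', 'c']]
print(parsed)  # [('a', 'm'), ('b', 's'), ('c', None)]
```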

## Using pandas

The CSV reader shown above is a wrapper around `pandas.read_csv` and provides commonly used functionality.
But pandas supports many more file readers for, among others, Excel, JSON, and XML files.
See [pandas IO tools](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html) for a complete list.

It is possible to use pandas manually to load these files and then convert the result to a Scipp dataset using [from_pandas](../generated/modules/scipp.compat.pandas_compat.from_pandas.rst).
For example, JSON can be read as follows:

In [None]:
json = '''{"A [m]": {"0": 1, "1": 3, "2": 5},
"B [m/s]": {"0": 2, "1": 4, "2": 6}}'''

In [None]:
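from io import StringIO

import pandas as pd

# Wrap the string in StringIO; recent pandas versions deprecate
# passing literal JSON text to read_json.
df = pd.read_json(StringIO(json))
df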

In [None]:
sc.compat.from_pandas(df, header_parser='bracket')

## NeXus

Scipp has no built-in support for loading [NeXus](https://www.nexusformat.org/) files.
However, the `scippneutron` package can internally use [Mantid](https://www.mantidproject.org) to load such files, as well as any other Mantid-supported file type. See [scippneutron](https://scipp.github.io/scippneutron/) and in particular [scippneutron.load_with_mantid](https://scipp.github.io/scippneutron/generated/functions/scippneutron.load_with_mantid.html).