# 2b. Using yadg in Python via `yadg.extractors.extract()`

The functionality of the [command line `yadg extract`](02a_yadg_extract.ipynb) is also available from within Python, via the `yadg.extractors.extract()` function. The [documentation of this function](https://dgbowl.github.io/yadg/5.1/apidoc/yadg.extractors.html#yadg.extractors.extract) is embedded below:

In [1]:
%%html 
<iframe src="https://dgbowl.github.io/yadg/5.1/apidoc/yadg.extractors.html#yadg.extractors.extract" width="100%" height="300"></iframe>

## 2b.2 Extracting data from a BioLogic `.mpr` file
In analogy with the command line example in the previous section of the tutorial, we extract some PEIS data from the `extract/peis.mpr` file. The returned object is a `DataTree`. Note that we let yadg use the default options for `timezone` (i.e. `localtime`), `encoding` (i.e. `utf-8`), and `locale` (i.e. `LC_NUMERIC` or `en_GB`, if unset). When extracting `mpr` files, only the `timezone` option has any effect. When using other *Extractors* (e.g. `eclab.mpt` or `basic.csv`), setting an appropriate `locale` and `encoding` may be necessary.

In [None]:
import yadg
yadg.extractors.extract(filetype="eclab.mpr", path="extract/peis.mpr")

We obtain an object equivalent to the one in our previous example, with one root node that contains one dimension (`uts`) and 48 data variables.
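The `timezone` default mentioned above can also be overridden explicitly. The following is a minimal sketch, assuming that `timezone` is accepted as a keyword argument of `yadg.extractors.extract()` and that an IANA name such as `Europe/Berlin` is a valid value; the time zone itself is an arbitrary choice for illustration. The guard only lets the cell run where yadg or the sample file is missing and is not part of yadg.

```python
from pathlib import Path

# Options for the same file as above. The explicit time zone is an
# illustrative assumption; by default yadg uses "localtime".
options = {
    "filetype": "eclab.mpr",
    "path": "extract/peis.mpr",
    "timezone": "Europe/Berlin",
}


def extract_if_available(**kwargs):
    """Call yadg.extractors.extract() if yadg and the input file are present.

    Returns None otherwise. This helper is hypothetical and exists only so
    that the sketch does not fail outside of the tutorial folder.
    """
    try:
        import yadg
    except ImportError:
        return None
    if not Path(kwargs["path"]).exists():
        return None
    return yadg.extractors.extract(**kwargs)


dt_tz = extract_if_available(**options)
```

Since only `timezone` affects `mpr` files, `encoding` and `locale` are left out here; for text-based formats they would presumably be passed in the same way.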

## 2b.3 Extracting metadata from a PANalytical `.xrdml` file
To extract data from PANalytical `.xrdml` files, we need to pass `panalytical.xrdml` as the `filetype` argument and the correct path:

In [None]:
dt = yadg.extractors.extract(filetype="panalytical.xrdml", path="extract/scans.xrdml")
dt

However, the above object contains not only the metadata, but also the data. Currently, there is no simple way to ask for just the metadata of a `DataTree` object. As a workaround, one can traverse the tree and call the `.to_dict(data=False)` method on each tree node (each of which contains an `xarray.Dataset`):

In [None]:
from IPython.display import JSON
metatree = {k: v.to_dict(data=False) for k, v in dt.to_dict().items()}
JSON(metatree)

The `metatree` object created here is a `dict` that is equivalent to the `JSON` created using `yadg extract --meta-only` in the previous chapter.
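Because `metatree` is a plain nested `dict`, it can be inspected with ordinary Python. The sketch below lists the names of the coordinates, data variables, and attributes of each node. The `summarize_metatree` helper is hypothetical, and `sample_metatree` is a hand-written stand-in whose node path, variable names, and attributes are invented for illustration; it only mimics the `coords` / `attrs` / `dims` / `data_vars` layout produced by `Dataset.to_dict(data=False)`.

```python
def summarize_metatree(metatree):
    """Map each node path to the sorted names found in that node."""
    summary = {}
    for node_path, node in metatree.items():
        summary[node_path] = {
            "coords": sorted(node.get("coords", {})),
            "data_vars": sorted(node.get("data_vars", {})),
            "attrs": sorted(node.get("attrs", {})),
        }
    return summary


# Hypothetical stand-in for a real metatree; all names and values are made up.
sample_metatree = {
    "/": {
        "coords": {
            "uts": {"dims": ("uts",), "attrs": {"units": "s"},
                    "dtype": "float64", "shape": (3,)},
        },
        "attrs": {"example_attr": "example value"},
        "dims": {"uts": 3},
        "data_vars": {
            "freq": {"dims": ("uts",), "attrs": {"units": "Hz"},
                     "dtype": "float64", "shape": (3,)},
            "Ewe": {"dims": ("uts",), "attrs": {"units": "V"},
                    "dtype": "float64", "shape": (3,)},
        },
    }
}

summary = summarize_metatree(sample_metatree)
print(summary)
```

Passing the real `metatree` from the cell above instead of `sample_metatree` should give the same kind of overview for the extracted file.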

[Back to index](index.ipynb)