In [21]:
from pathlib import Path

## exfor-parserpy

Several EXFOR parsers and tools exist, but I think the easiest one to use is [exfor-parserpy](https://github.com/IAEA-NDS/exfor-parserpy), a Python package that can parse EXFOR files into a JSON-like dictionary and write that dictionary back to the EXFOR format.

### installing the package

You should clone the repo to your local machine:

(SSH)
```bash
git clone git@github.com:IAEA-NDS/exfor-parserpy.git
```

(HTTPS)
```bash
git clone https://github.com/IAEA-NDS/exfor-parserpy.git
```

Then move inside the directory and install with pip:


```bash
pip install -e .
```

The `.` tells pip to install the package found in the current directory, and `-e` installs it in "editable mode": Python imports the package directly from these source files instead of copying it into a static installation somewhere else. With editable mode, any changes you make to the source files take effect wherever you use the package on your computer. Without editable mode, you need to re-install the package after each change to the source files.
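To confirm which copy of a package Python will actually import, you can ask the import system where it would load it from. The snippet below is a minimal sketch using only the standard library; `package_location` is a helper name made up for this example, and it prints `None` if the package is not installed in the current environment.

```python
import importlib.util


def package_location(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# With an editable install this should point inside your clone of the repo;
# with a regular install it should point inside your environment's site-packages.
print(package_location("exfor_parserpy"))
```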

You can also skip the clone and install the package straight from GitHub with pip, but then the source code will be tucked away in your Python environment's `site-packages` directory:

```bash
pip install git+https://github.com/iaea-nds/exfor-parserpy.git
```

### using the package

You should already have the EXFOR Master repo cloned on your computer. (If not, see [that tutorial](get_exfor_library.ipynb).) The `exfor_parserpy` package will read those files.

For file manipulation in Python, I prefer to use the `pathlib` library, but you can also build file names and paths as strings with the `os` library.
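As a quick comparison, here is the same path built both ways. The root directory is a placeholder, not a real location; substitute the path to your own clone.

```python
import os
from pathlib import Path

# Placeholder root; replace with the location of your own EXFOR Master clone.
root = "/path/to/exfor_master/exforall"

# pathlib: the "/" operator joins path components.
with_pathlib = Path(root) / "146" / "14686.x4"

# os.path: the same path assembled from plain strings.
with_os = os.path.join(root, "146", "14686.x4")

# Both approaches describe the same file.
print(Path(with_os) == with_pathlib)

# pathlib also gives convenient access to the pieces of the path.
print(with_pathlib.name)
print(with_pathlib.parent.name)
```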

In [43]:
# ** put your own path here **
exfor_master_path = Path("/home/amanda/libraries/exfor_master/exforall")

# grab the exfor entry for the paper we're looking at
exfor_entry_path = exfor_master_path / "146" / "14686.x4"

The package parses the EXFOR entry with the `read_exfor()` function.

In [23]:
from exfor_parserpy import read_exfor

In [28]:
parsed_file = read_exfor(exfor_entry_path)

A file can contain more than one entry, so the top-level keys of the dictionary are the entry numbers.

In [29]:
parsed_file.keys()

dict_keys(['14686'])

In this case there is only one entry in the file, so the top level adds nothing and we can step straight into the entry.
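If you would rather not hard-code the entry number, a small helper can unwrap a single-entry file and complain when the file holds more than one entry. This is only a sketch: `only_entry` is a made-up helper, not part of `exfor-parserpy`, and the dictionary below is a stand-in for the output of `read_exfor()`.

```python
def only_entry(parsed):
    """Return the single entry of a parsed file, or raise if there is not exactly one."""
    if len(parsed) != 1:
        raise ValueError(f"expected one entry, found {len(parsed)}: {list(parsed)}")
    return next(iter(parsed.values()))


# Stand-in for the dictionary returned by read_exfor(); the real values
# are full subentry dictionaries rather than empty ones.
stand_in = {"14686": {"14686001": {}, "14686002": {}}}

single = only_entry(stand_in)
print(list(single))
```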

In [30]:
entry = parsed_file['14686']
entry.keys()

dict_keys(['14686001', '14686002'])

The next level is subentries. Subentry 001 always holds only metadata about the experiment; it never contains data.

In [32]:
sub1 = entry['14686001'] 
sub1.keys()

dict_keys(['__entryid', '__subentid', 'BIB'])

The `exfor-parserpy` dictionary has two keys that aren't in the EXFOR file: `__entryid` and `__subentid`.

In [36]:
sub1['__entryid'], sub1['__subentid']

('14686', '14686001')

Otherwise, the keys are the sections of the EXFOR file, which can be `BIB`, `COMMON`, and `DATA`.

Subentry 001 always has `BIB` (bibliographic information about the experiment), never has `DATA`, and sometimes has `COMMON` (non-bibliographic data that is shared between the subentries).
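Because the parsed result is a plain nested dictionary, you can list the sections of every subentry with ordinary loops. The sketch below uses a stand-in dictionary with the same shape as the parser output; the contents of subentry 002 are hypothetical placeholders, and treating every key that starts with `__` as parser-added is an assumption based on the two keys shown above.

```python
# Stand-in with the same shape as the dictionary returned by read_exfor().
# The section contents are empty placeholders, not real EXFOR data.
parsed = {
    "14686": {
        "14686001": {"__entryid": "14686", "__subentid": "14686001", "BIB": {}},
        "14686002": {"__entryid": "14686", "__subentid": "14686002", "BIB": {}, "DATA": {}},
    }
}


def sections_by_subentry(parsed):
    """Map each subentry id to the list of EXFOR sections it contains."""
    result = {}
    for entry in parsed.values():
        for subent_id, subent in entry.items():
            # Assumption: keys starting with "__" were added by the parser.
            result[subent_id] = [key for key in subent if not key.startswith("__")]
    return result


for subent_id, sections in sections_by_subentry(parsed).items():
    print(subent_id, sections)
```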

Subentry `14686001` has only `BIB`:

In [37]:
sub1['BIB']

{'TITLE': 'Validation of unresolved neutron resonance parameters\nusing a thick-sample transmission measurement',
 'AUTHOR': '(J.M.Brown,R.C.Block,A.Youmans,H.Choun,A.Ney,E.Blain,\nD.P.Barry,M.J.Rapp,Y.Danon)',
 'INSTITUTE': '(1USARPI)\n(1USAUSA) Naval Nuclear Laboratory, Schenectady,\n          New York',
 'REFERENCE': '(J,NSE,194,221,2020)',
 'FACILITY': '(LINAC,1USARPI) The the experiment was performed at the\n50 MeV electron RPI Linac with a neutron production\ntarget',
 'HISTORY': '(20210525C) Compiled by S.H.'}

This is just the text from the EXFOR file put directly into a Python dictionary, as you can see by reading the first lines of the file itself.

In [44]:
file_text = exfor_entry_path.read_text().split("\n")
file_text[:15]

['ENTRY            14686   20210212   20220301   20220228       1488',
 'SUBENT        14686001   20210212   20220301   20220228       1488',
 'BIB                  6         12',
 'TITLE      Validation of unresolved neutron resonance parameters',
 '           using a thick-sample transmission measurement',
 'AUTHOR     (J.M.Brown,R.C.Block,A.Youmans,H.Choun,A.Ney,E.Blain,',
 '           D.P.Barry,M.J.Rapp,Y.Danon)',
 'INSTITUTE  (1USARPI)',
 '           (1USAUSA) Naval Nuclear Laboratory, Schenectady,',
 '                     New York',
 'REFERENCE  (J,NSE,194,221,2020)',
 'FACILITY   (LINAC,1USARPI) The the experiment was performed at the',
 '           50 MeV electron RPI Linac with a neutron production',
 '           target',
 'HISTORY    (20210525C) Compiled by S.H.']
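To make the correspondence between the raw lines and the `BIB` dictionary concrete, here is a rough sketch of how such lines could be grouped by keyword. It is not how `exfor-parserpy` is implemented; `bib_lines_to_dict` is a made-up helper, and it assumes the lines look like the ones printed above: a keyword in the first 10 columns, free text starting at column 12, and a blank keyword field on continuation lines.

```python
def bib_lines_to_dict(lines):
    """Group BIB lines by keyword, joining continuation lines with newlines."""
    bib = {}
    keyword = None
    for line in lines:
        field = line[:10].strip()  # keyword field; blank on continuation lines
        text = line[11:].rstrip()  # free text that follows the keyword columns
        if field:
            keyword = field
            bib[keyword] = text
        elif keyword is not None:
            bib[keyword] += "\n" + text
    return bib


sample = [
    "TITLE      Validation of unresolved neutron resonance parameters",
    "           using a thick-sample transmission measurement",
    "INSTITUTE  (1USARPI)",
    "           (1USAUSA) Naval Nuclear Laboratory, Schenectady,",
    "                     New York",
    "REFERENCE  (J,NSE,194,221,2020)",
]

bib = bib_lines_to_dict(sample)
print(repr(bib["INSTITUTE"]))
```

Under these assumptions the keyword columns are removed and everything after them is kept, so indentation inside the free text (such as the spaces before `New York`) survives in the dictionary value.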