## LIDOInspector Tutorial

### Authors
Rodion Lischnewski

### Introduction
NFDInspector is designed to facilitate the inspection of formal quality issues pertaining to research data. It is currently compatible with the LIDO and EAD metadata standards. The project has been funded by the “4Memory Incubator Funds” of the NFDI4Memory consortium and is being developed and maintained by the Montanhistorisches Dokumentationszentrum (montan.dok) of the Deutsches Bergbau-Museum Bochum.

### Target Group
This tutorial is aimed at new users of the NFDInspector. It walks through a use case to make getting started with the tool easier. A basic understanding of Python programming is required to use the tool independently. Because the tool is open source, it can be integrated into other programs, tools and applications. In its stand-alone form, the tool is used via a command line interface.

### Requirements
This tutorial assumes that a current version of Python is installed on the machine. The basic functions require no further libraries. For further use, however, the following packages are recommended:
+ __[pandas](https://pandas.pydata.org/)__
+ __[numpy](https://numpy.org/)__
+ __[lxml](https://lxml.de/)__
+ __[json](https://docs.python.org/3/library/json.html)__ (part of the Python standard library, no separate installation needed)

NFDInspector uses the following libraries:
+ os
+ json
+ csv
+ re
+ datetime
+ lxml

### Learning goals
* [Installation](#installation)
* [Read metadata records from different sources](#read-metadata-records-from-different-sources)
* [Customize inspection configuration](#configuration)
* [Carry out inspection](#inspection)
* Process or output the results

### Installation
By default, the NFDInspector package includes modules for inspecting the LIDO-XML and EAD-XML formats. To install NFDInspector using pip on macOS or Linux, run:

In [None]:
python3 -m pip install nfdinspector

To install with pip under Windows, run:

In [None]:
py -m pip install nfdinspector
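
To confirm that the installation worked, a small check like the following can be run in Python. This is a minimal sketch that relies only on the standard library; the helper name `is_installed` is introduced here for illustration and is not part of NFDInspector.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be found for import."""
    return importlib.util.find_spec(package_name) is not None


# "nfdinspector" is the import name used in the rest of this tutorial.
print(is_installed("nfdinspector"))
```

If this prints `False`, the package was probably installed for a different Python interpreter than the one running the check.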

### Import and initialize
You can import ``LIDOInspector()``, a class from the ``nfdinspector.lido_inspector`` module.
When initializing the LIDOInspector, you can specify the language for the error messages. Currently only ``'en'`` and ``'de'`` are available. In this tutorial we will stick to English output.

In [1]:
from nfdinspector.lido_inspector import LIDOInspector

lido_inspector = LIDOInspector(error_lang='en')

### Read metadata records from different sources

The easiest way to ingest metadata is from a standalone ``.xml`` file. An XML file can also contain more than one LIDO object. The NFDInspector can distinguish between the LIDO objects in one file.

In [2]:
file_path = '../nfdinspector_tutorials/LIDO_xml/23310.xml'
lido_inspector.read_lido_file(file_path)

### Read metadata from multiple sources

You can read several files at once by specifying the path of the folder that contains the XML files.

In [3]:
lido_inspector.read_lido_files('../nfdinspector_tutorials/LIDO_xml')
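
Before reading a whole folder, it can help to see which XML files it contains. The following is a minimal sketch using only the standard library; `list_xml_files` is a hypothetical helper, not an NFDInspector function. Whether ``.read_lido_files()`` applies the same file selection (for example regarding subfolders or file extensions) is not stated here, so treat the listing as a rough preview only.

```python
from pathlib import Path


def list_xml_files(folder: str) -> list[str]:
    """Return the sorted names of files directly inside `folder` that match *.xml."""
    path = Path(folder)
    if not path.is_dir():
        # A missing folder simply yields an empty listing.
        return []
    return sorted(p.name for p in path.glob("*.xml") if p.is_file())


# Folder path taken from the tutorial; prints [] if the folder does not exist.
print(list_xml_files("../nfdinspector_tutorials/LIDO_xml"))
```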

Alternatively, you can parse LIDO-XML directly from a string and pass it to the inspector with the ``.read_lido()`` function.
This is useful if you are integrating the NFDInspector functionality into a larger workflow.

In [None]:
lido_inspector.read_lido(lido_xml_string)
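
In a larger workflow, the string often comes from another system, so a quick well-formedness check before handing it over can be useful. The sketch below uses only the standard library. The helper `is_well_formed` and the tiny sample record are hypothetical and only illustrate the idea; the sample is well-formed XML but not a complete LIDO record.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_string: str) -> bool:
    """Return True if the string parses as XML (no schema validation is done)."""
    try:
        ET.fromstring(xml_string)
    except ET.ParseError:
        return False
    return True


# Hypothetical, heavily shortened record used only for illustration.
lido_xml_string = (
    '<lido:lido xmlns:lido="http://www.lido-schema.org">'
    '<lido:lidoRecID lido:type="local">example-1</lido:lidoRecID>'
    "</lido:lido>"
)

if is_well_formed(lido_xml_string):
    # At this point the string could be passed on, e.g.:
    # lido_inspector.read_lido(lido_xml_string)
    print("XML is well-formed")
```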

### Configuration
NFDInspector lets you customise the inspection to your specific needs. If no special configuration is specified, the built-in configuration is used.
Customisation is usually done via a configuration file. Configurations can be exported and imported in JSON format.
The default configuration is not fixed and may change with package updates. To view the current default configuration, read the ``.configuration`` property of the LIDOInspector. The returned dictionary lists the default configuration.

In [18]:
lido_config = lido_inspector.configuration

The following script will store the configuration in JSON format in the ``config_path`` location.

In [22]:
import json

config_path = 'lido_config.json'
with open(config_path, 'w') as outfile:
    json.dump(lido_config, outfile, indent=4)

The ``.config_file()`` function imports a configuration from a JSON file with compatible syntax.

In [20]:
lido_inspector.config_file(config_path)
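
A typical round trip is to export the default configuration, change a few values, and import the edited file. The following sketch covers only the file handling with the standard ``json`` module. The helper `write_modified_config` and the file name `lido_config_custom.json` are hypothetical, and the sketch assumes the configuration is a JSON object whose top-level entries can simply be overwritten. Actual option names must be taken from your exported file.

```python
import json


def write_modified_config(src_path, dst_path, changes):
    """Load a JSON config, overwrite top-level entries from `changes`, save to `dst_path`."""
    with open(src_path, encoding="utf-8") as infile:
        config = json.load(infile)
    # Shallow update: only top-level keys are replaced.
    config.update(changes)
    with open(dst_path, "w", encoding="utf-8") as outfile:
        json.dump(config, outfile, indent=4, ensure_ascii=False)
    return config


# Possible usage (key and value are placeholders, not real option names):
# write_modified_config("lido_config.json", "lido_config_custom.json", {"some_option": "new value"})
# lido_inspector.config_file("lido_config_custom.json")
```

Writing to a separate file keeps the exported default configuration untouched for later comparison.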

We will take a closer look at the configuration options later on. For now, let's take a look at the basic functions.

### Inspection
First, read the LIDO-XML files by passing the folder path to the ``.read_lido_files()`` function. The data is then inspected according to the settings in the configuration by calling the ``.inspect()`` function.

In [4]:
lido_inspector.inspect()

The results of the inspection are stored in ``LIDOInspector.inspections`` and can be accessed from there in JSON format.
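
To keep the results for later processing, they can be written to a file. The sketch below assumes that the inspection results can be serialized with the standard ``json`` module, which the description above suggests but which is not verified here. The helper `save_inspections` and the output file name are hypothetical.

```python
import json


def save_inspections(inspections, path):
    """Write inspection results to `path` as indented UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as outfile:
        json.dump(inspections, outfile, indent=4, ensure_ascii=False)


# Possible usage after calling lido_inspector.inspect():
# save_inspections(lido_inspector.inspections, "lido_inspections.json")
```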