# Introduction to `ELAN_Data`
***
[Alejandro Ciuba](https://alejandrociuba.github.io), alejandrociuba@pitt.edu
***
## Introduction
This is the first of several notebooks that familiarize the reader with the Python package `ELAN_Data` and its capabilities. For these tutorials, we use the Corpus of Regional African American Language ([CORAAL](http://lingtools.uoregon.edu/coraal/)); specifically, the `ATL_elanfiles_2020.05.tar.gz` archive. Please download this archive and extract it into the `datasets` folder, which is where the code below looks for the `.eaf` files.
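
If you would rather unpack the archive from Python than from the command line, the following is a minimal sketch using only the standard library. The helper name `unpack_dataset` and the location of the downloaded archive are assumptions for illustration; neither is part of `elan_data`.

```python
import tarfile
from pathlib import Path


def unpack_dataset(archive, dest="./datasets"):
    """Extract a .tar.gz archive into `dest` and list the .eaf files found.

    Hypothetical helper for this tutorial; not part of elan_data.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    # "r:gz" opens a gzip-compressed tar archive for reading
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)

    # Search recursively, since the archive may unpack into a subfolder
    return sorted(dest.rglob("*.eaf"))
```

Calling `unpack_dataset("ATL_elanfiles_2020.05.tar.gz")` from the folder holding the download should return the extracted `.eaf` paths. If they land in a subfolder of `datasets`, either move them up or adjust the file path used later in this notebook.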

## Table of Contents
1. [The Basics](#the-basics)
2. Loading in Files
3. `ELAN_Data`
4. Advanced `ELAN_Data`
5. `elan_utils`
***
## Necessary Imports

In [30]:
from elan_data import ELAN_Data
from pathlib import Path

import re

import pandas as pd

***
## The Basics
The `elan_data` package is made of two primary Python files structured as follows:

```
src
└── elan_data
    ├── elan_utils.py
    ├── __init__.py
    └── py.typed
```

### Files
| Name | Description |
| :--- | ----------- |
| `elan_utils.py` | Contains helper functions that work with `ELAN_Data`; they are planned to become more general-purpose tools as well. |
| `__init__.py` | Contains the main `ELAN_Data` code. |
| `py.typed` | An empty marker file (PEP 561) signaling that the package ships type hints. It does nothing at runtime but is needed so our testing framework won't be mad. >:( |

### Simple Code
To start using `elan_data`, simply import its main class, `ELAN_Data`, and either load or create an instance. To load a pre-existing file:

In [32]:
# Load in ATL_se0_ag1_f_01_1.eaf from the downloaded dataset
# We recommend using Path to store and modify file paths; pathlib is part of the Python standard library and very robust
FILE = "./datasets/ATL_se0_ag1_f_01_1.eaf"
PATH = Path(FILE)

print(f"Path to .eaf file is {PATH}")

# Load in as an ELAN_Data instance
eaf = ELAN_Data.from_file(file=PATH)

print(eaf)

Path to .eaf file is datasets/ATL_se0_ag1_f_01_1.eaf


AssertionError: 
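
The empty `AssertionError` above gives little to go on. One plausible cause (an assumption; the traceback does not confirm it) is that the `.eaf` file is not where `PATH` points, for example because the dataset was unpacked into a different folder. A minimal sketch for checking the path before loading is shown here; `find_eaf` is a hypothetical helper, not part of `elan_data`.

```python
from pathlib import Path


def find_eaf(path):
    """Return `path` as a Path if it is an existing .eaf file, else raise.

    Hypothetical helper for this tutorial; not part of elan_data.
    """
    path = Path(path)

    if path.suffix.lower() != ".eaf":
        raise ValueError(f"Expected an .eaf file, got: {path}")

    if not path.is_file():
        # List sibling .eaf files to help spot a typo or a misplaced dataset
        parent = path.parent
        nearby = sorted(p.name for p in parent.glob("*.eaf")) if parent.is_dir() else []
        raise FileNotFoundError(
            f"{path} not found; .eaf files in {parent}: {nearby}"
        )

    return path
```

With this in place, `ELAN_Data.from_file(file=find_eaf(FILE))` would fail early with a message naming the missing file, rather than with a bare assertion.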