# NMR Processing Overview

---

1. Split files into different categories.
    1. How many individual fids?
    2. How many array experiments?
    3. How are temperature sets stored?
    4. How are materials stored?
2. Develop / confirm metadata for those categories.
    + Cross reference with documentation provided by Trent.
    + Compare processing demo results to Trent's data. 
    + Meet with Trent to confirm assignments.
3. Prioritize subsets.
4. **Design Bokeh application**
5. Process subsets.
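For step 2, some metadata is already encoded in the file names, for example `27Al-3rd-2740Hz-25C-105C-10arrays-128ctF.fid` in this data set. The sketch below shows one way those tokens might be extracted with regular expressions. `parse_fid_name` and its dictionary keys are hypothetical names, and the physical meaning of each token still has to be confirmed against Trent's documentation.

```python
import os
import re


def parse_fid_name(path):
    """Pull numeric tokens out of a .fid file name.

    The keys are hypothetical labels for tokens such as `2740Hz`, `25C`,
    `10arrays` and `128ct`; they are assumptions, not confirmed metadata.
    """
    name = os.path.basename(path)
    hz = re.search(r'(\d+)Hz', name)
    arrays = re.search(r'(\d+)arrays', name)
    ct = re.search(r'(\d+)ct', name)
    return {
        'name': name,
        'hz': int(hz.group(1)) if hz else None,
        # Every stand-alone "<number>C" token, in the order it appears.
        'temps_c': [int(t) for t in re.findall(r'\b(\d+)C\b', name)],
        'n_arrays': int(arrays.group(1)) if arrays else None,
        'ct': int(ct.group(1)) if ct else None,
    }
```

Tokens that are missing from a name come back as `None` (or an empty list for temperatures), so the result can be loaded into a `pandas` DataFrame and checked by eye.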

#### Set Local Data Path

---

Since the total available data is around 2 GB, it may be stored in different locations on different machines. Defining a single base path to the data keeps the rest of the notebook independent of that location.

In [1]:
# data_folder = '/home/tylerbiggs/data/Sep-2016-23Na'
data_folder = '/home/tyler/data/Sep-2016-23Na'
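
One way to avoid editing this cell on each machine is to pick the first candidate path that actually exists. This is a minimal sketch and not part of the original workflow: `resolve_data_folder` and the `NMR_DATA` environment variable are hypothetical names.

```python
import os


def resolve_data_folder(candidates, env_var='NMR_DATA'):
    """Return the first existing directory among $env_var and `candidates`.

    Both this helper and the `NMR_DATA` variable are assumptions made for
    illustration; adapt them to the local setup.
    """
    override = os.environ.get(env_var)
    paths = ([override] if override else []) + list(candidates)
    for path in paths:
        if os.path.isdir(path):
            return path
    raise FileNotFoundError('No data folder found in: {}'.format(paths))


# Example call (commented out because the paths are machine-specific):
# data_folder = resolve_data_folder([
#     '/home/tylerbiggs/data/Sep-2016-23Na',
#     '/home/tyler/data/Sep-2016-23Na',
# ])
```

Raising an error when nothing matches makes a missing data directory visible immediately, rather than later as an empty glob result.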

In [2]:
import nmrglue as ng
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import itertools
import glob
import re
import os
%matplotlib inline

In [3]:
# from trentnmr import *

# File Structure

---

Output of `tree -I *.fid`, which lists all non-fid directories:

```bash
└── Sep-2016-23Na
    ├── 23Na
    │   └── 27Al
    │       ├── 0808G1-0p15M-AlOH3-3M-NaOH-D2O
    │       ├── 0808G1-0p5M-AlOH3-3M-NaOH-D2O
    │       ├── 0808G1-1M-AlOH3-3M-NaOH-D2O
    │       ├── 0819G1-0p1M-AlOH3-3M-LiOH-D2O
    │       ├── 0819G1-0p5M-AlOH3-3M-KOH-D2O
    │       ├── 0819G1-0p5M-AlOH3-3M-LiOH-D2O
    │       ├── 0819G1-1M-AlOH3-3M-NaOH-D2O
    │       ├── background
    │       └── standard
    └── VT

```

Nesting `27Al` inside `23Na` seems like an error. Reordering to:

```bash
└── Sep-2016-23Na
    ├── 23Na
    ├── 27Al
    │   ├── 0808G1-0p15M-AlOH3-3M-NaOH-D2O
    │   ├── 0808G1-0p5M-AlOH3-3M-NaOH-D2O
    │   ├── 0808G1-1M-AlOH3-3M-NaOH-D2O
    │   ├── 0819G1-0p1M-AlOH3-3M-LiOH-D2O
    │   ├── 0819G1-0p5M-AlOH3-3M-KOH-D2O
    │   ├── 0819G1-0p5M-AlOH3-3M-LiOH-D2O
    │   ├── 0819G1-1M-AlOH3-3M-NaOH-D2O
    │   ├── background
    │   └── standard
    └── VT

```
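
The reordering above can also be scripted rather than done by hand. The following is a sketch under the assumption that `27Al` simply needs to move up one level. `promote_subfolder` is a hypothetical helper, and it defaults to a dry run so nothing moves until the plan has been checked.

```python
import os
import shutil


def promote_subfolder(root, parent='23Na', child='27Al', dry_run=True):
    """Move root/parent/child up to root/child.

    Hypothetical helper. With dry_run=True it only returns the planned
    (source, destination) pair and leaves the file system untouched.
    """
    src = os.path.join(root, parent, child)
    dst = os.path.join(root, child)
    if not os.path.isdir(src):
        raise FileNotFoundError(src)
    if os.path.exists(dst):
        # Refuse to merge into or overwrite an existing folder.
        raise FileExistsError(dst)
    if not dry_run:
        shutil.move(src, dst)
    return src, dst
```

Calling `promote_subfolder(data_folder)` reports the planned move, and `promote_subfolder(data_folder, dry_run=False)` performs it.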

## Glob Parent Folders

---

In [4]:
# Sodium folders.
VT   = os.path.join(data_folder, 'VT')
Na23 = os.path.join(data_folder, '23Na')

# Aluminum folders.
Al27 = os.path.join(data_folder, '27Al')
# Aluminum sub-paths.
sub_paths_strings = [
    "0808G1-0p15M-AlOH3-3M-NaOH-D2O",
    "0808G1-0p5M-AlOH3-3M-NaOH-D2O",
    "0808G1-1M-AlOH3-3M-NaOH-D2O",
    "0819G1-0p1M-AlOH3-3M-LiOH-D2O",
    "0819G1-0p5M-AlOH3-3M-KOH-D2O",
    "0819G1-0p5M-AlOH3-3M-LiOH-D2O",
    "0819G1-1M-AlOH3-3M-NaOH-D2O",
    "background",
    "standard"
]

Al_sub_paths = [os.path.join(Al27, p) for p in sub_paths_strings]

In [5]:
Al_sub_paths.append(Al27)
sodium_paths = [VT, Na23]
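
The hard-coded list of aluminum sub-paths could instead be discovered from disk, which avoids typos if folders are added or renamed. This sketch assumes each `.fid` entry is itself a directory, as the `tree -I *.fid` call above suggests. `list_sample_folders` is a hypothetical helper name.

```python
import os


def list_sample_folders(parent, exclude_suffix='.fid'):
    """Return sorted paths of the sub-directories of `parent`.

    Hypothetical helper. Directories whose names end in `exclude_suffix`
    are skipped so that only sample folders remain.
    """
    return sorted(
        entry.path for entry in os.scandir(parent)
        if entry.is_dir() and not entry.name.endswith(exclude_suffix)
    )
```

With the layout shown earlier, `list_sample_folders(Al27)` should return the same folders as `sub_paths_strings` joined onto `Al27`, although that has not been verified against the actual data.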

## Glob Helper Functions

---

In [21]:
array_glob = '/*arrays*.fid'
fid_glob = '/*.fid'
# Substrings that mark reference or calibration runs rather than sample data.
special_files = ['reference', 'REF', 'calibration', 'pwX90', 'static',
                 'spin-up', 'without-liquid']

def nmr_glob(path):
    """Return [arrays, fids, other_fids] for the .fid entries directly under `path`."""
    arrays = set(glob.iglob(path + array_glob))
    # Note: '*.fid' also matches the array files, so `fids` still contains them.
    fids = set(glob.iglob(path + fid_glob))

    # Split off special runs; the substring test is applied to the full path.
    other_fids = {f for f in fids if any(sf in f for sf in special_files)}
    fids = fids - other_fids

    return [list(x) for x in [arrays, fids, other_fids]]


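def trim_folder(folders):
    # Keep only the components below `data_folder`. The index 5 assumes a
    # base path four levels deep, such as /home/tyler/data/Sep-2016-23Na.
    return ['/'.join(os.path.normpath(path).split(os.sep)[5:]) for path in folders]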


def process_group(path_list):
    """Glob each folder in `path_list` and collect the non-empty results per category."""
    array, fid, other = [], [], []
    for path in path_list:
        a, f, o = nmr_glob(path)
        if a:
            array.append(a)
        if f:
            fid.append(f)
        if o:
            other.append(o)

    return array, fid, other
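
Because `process_group` returns one list per folder inside each category, answering "how many of each?" needs a flattening step. The sketch below shows one possible way to do this. `summarize_groups` is a hypothetical helper, and it uses `os.path.relpath` so the trimmed paths do not depend on how deep the base folder sits.

```python
import itertools
import os


def summarize_groups(groups, base):
    """Flatten each category into a sorted list of paths relative to `base`.

    `groups` is an (array, fid, other) tuple of lists of lists, the shape
    returned by process_group. Hypothetical helper.
    """
    names = ('arrays', 'fids', 'other')
    summary = {}
    for name, nested in zip(names, groups):
        flat = itertools.chain.from_iterable(nested)
        summary[name] = sorted(os.path.relpath(p, base) for p in flat)
    return summary
```

For example, `{k: len(v) for k, v in summarize_groups(process_group(Al_sub_paths), data_folder).items()}` gives a count per category.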

## Running the Globs

---

In [22]:
process_group(Al_sub_paths)

([['/home/tyler/data/Sep-2016-23Na/27Al/0808G1-0p15M-AlOH3-3M-NaOH-D2O/27Al-5th-2660Hz-140C-3arrays-600ct.fid',
   '/home/tyler/data/Sep-2016-23Na/27Al/0808G1-0p15M-AlOH3-3M-NaOH-D2O/27Al-3rd-2740Hz-25C-105C-10arrays-128ctF.fid',
   '/home/tyler/data/Sep-2016-23Na/27Al/0808G1-0p15M-AlOH3-3M-NaOH-D2O/27Al-4th-2740Hz-105C-132C-10arrays-128ctF.fid',
   '/home/tyler/data/Sep-2016-23Na/27Al/0808G1-0p15M-AlOH3-3M-NaOH-D2O/27Al-concentrated-Gibbsite-2767Hz-25C-135C-10arrays-128ctF.fid',
   '/home/tyler/data/Sep-2016-23Na/27Al/0808G1-0p15M-AlOH3-3M-NaOH-D2O/27Al-5th-2762Hz-25C-73arrays-600ct.fid',
   '/home/tyler/data/Sep-2016-23Na/27Al/0808G1-0p15M-AlOH3-3M-NaOH-D2O/27Al-concentrated-Gibbsite-2767Hz-135C-170C-8arrays-128ctF.fid',
   '/home/tyler/data/Sep-2016-23Na/27Al/0808G1-0p15M-AlOH3-3M-NaOH-D2O/27Al-3rd-2740Hz-25C-105C-10arrays-128ct.fid',
   '/home/tyler/data/Sep-2016-23Na/27Al/0808G1-0p15M-AlOH3-3M-NaOH-D2O/27Al-5th-2762Hz-25C-74arrays-600ct.fid',
   '/home/tyler/data/Sep-2016-23Na/27A