# LOQ COMPLIANCE PROGRAM NOTEBOOK

Printing "Hello World" as a test (to check that the notebook exports usable code).

In [None]:
print("Hello World")

Exporting works, so the next step is to list the functions of the program to be created, in order from the most basic to all of the bells and whistles. The most basic options (points 1-XX) are what is necessary for a "successful" program. Anything beyond that counts as "bells and whistles".

### The "Must Haves":

Input by user:

1. Total Diet Study `.txt` file, which is tab-delimited; the program must format it to allow for analysis
   
2. "Analyte Type", which is the name of the column containing the "Analyte" that is desired for output, e.g. Element
   
3. "Analyte", which is what is being measured in the Total Diet Study
   
4. Optional: Food Number (ignore if not provided)
   
5. Optional: New Cut off concentration (default = 0)

The program must take all of the above files and parameters and output a new `.txt` file containing only rows that have the Analyte requested.
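
The input and output steps above can be sketched with the standard-library `csv` module. This is a minimal sketch under assumptions, not the final program: the function names `select_analyte_rows` and `write_rows`, the `Food Number` column name, and the three-row sample are all made up for illustration, and the real Total Diet Study files may be laid out differently.

```python
import csv
import io


def select_analyte_rows(lines, type_column, analyte, food_number=None,
                        food_column="Food Number"):
    """Return (fieldnames, rows) for rows whose type_column equals analyte.

    food_column is an assumed column name; adjust it to the real file.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    selected = []
    for row in reader:
        if row[type_column] != analyte:
            continue
        # The Food Number filter is optional and ignored when not provided.
        if food_number is not None and row.get(food_column) != str(food_number):
            continue
        selected.append(row)
    return reader.fieldnames, selected


def write_rows(handle, fieldnames, rows):
    """Write the selected rows back out as tab-delimited text."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample standing in for a Total Diet Study file.
sample = io.StringIO(
    "Food Number\tElement\tConc\n"
    "1\tArsenic\t0.010\n"
    "1\tLead\t0.020\n"
    "2\tArsenic\t0.000\n"
)
fieldnames, rows = select_analyte_rows(sample, "Element", "Arsenic")
out = io.StringIO()
write_rows(out, fieldnames, rows)
print(out.getvalue())
```

With a real file, the `io.StringIO` objects would be replaced by handles from `open(path, newline="")`.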

Additional "behind the scenes" processing:

1. Do NOT include any rows that contain "RAP" (generally in the "Sample Qualifier" column)
   
2. Check the provided LOQ (limit of quantitation) column and compare it with the "Conc" (concentration) observed column. If the LOQ is GREATER than the Conc, then the row is not included in the output, as any values below the LOQ cannot be certain.
   
3. If no new cut off concentration is provided (item #5 from user input), then the program will default to "zero tolerance", meaning any concentration that is above zero and not below the LOQ will be output. If a new cut off IS provided, then the program will output any concentration that is above the new cut off and not below the LOQ.
   
4. If the requested new cut off concentration is less than the LOQ, a warning will be printed: "Your requested concentration cut off is less than the LOQ provided for your requested analyte"
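
The four rules above could be combined into one filtering pass. The following is only a sketch under assumptions: the function name `apply_loq_rules` and the sample rows are hypothetical, "RAP" is looked for only in the `Sample Qualifier` column, the LOQ is read per row, and the warning is only raised when a cut off was explicitly requested (otherwise the default of zero would always trigger it).

```python
WARNING = ("Your requested concentration cut off is less than the LOQ "
           "provided for your requested analyte")


def apply_loq_rules(rows, cutoff=None, qualifier_column="Sample Qualifier"):
    """Return (kept_rows, cutoff_below_loq) after applying the four rules."""
    threshold = 0.0 if cutoff is None else float(cutoff)
    kept = []
    cutoff_below_loq = False
    for row in rows:
        # Rule 1: skip rows flagged "RAP" (assumed to sit in the qualifier column).
        if "RAP" in (row.get(qualifier_column) or ""):
            continue
        conc = float(row["Conc"])
        loq = float(row["LOQ"])
        # Rule 4: remember whether an explicitly requested cut off is below the LOQ.
        if cutoff is not None and threshold < loq:
            cutoff_below_loq = True
        # Rule 2: drop rows whose LOQ is greater than the observed concentration.
        if loq > conc:
            continue
        # Rule 3: keep only concentrations above the cut off (zero by default).
        if conc > threshold:
            kept.append(row)
    if cutoff_below_loq:
        print(WARNING)
    return kept, cutoff_below_loq


# Made-up rows for illustration only.
rows = [
    {"Food Number": "1", "Sample Qualifier": "", "Conc": "0.05", "LOQ": "0.01"},
    {"Food Number": "2", "Sample Qualifier": "RAP", "Conc": "0.05", "LOQ": "0.01"},
    {"Food Number": "3", "Sample Qualifier": "", "Conc": "0.005", "LOQ": "0.01"},
    {"Food Number": "4", "Sample Qualifier": "", "Conc": "0.02", "LOQ": "0.01"},
]

default_kept, default_warned = apply_loq_rules(rows)
strict_kept, strict_warned = apply_loq_rules(rows, cutoff=0.03)
low_kept, low_warned = apply_loq_rules(rows, cutoff=0.005)
```

In this sample, the default run keeps foods 1 and 4, a cut off of 0.03 keeps only food 1, and a cut off of 0.005 keeps foods 1 and 4 but prints the warning because 0.005 is below the LOQ of 0.01.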

### The "Bells and Whistles":

1. Have the option to disallow the progression of the run if the requested concentration cut off is less than the LOQ provided (item #4 in the "Behind the Scenes" processing).

2. Find and convert any units in the "Unit" column that are not `mg/kg`.
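
Both optional features can be sketched as small helpers. Everything here is an assumption made for illustration: the names `check_cutoff` and `to_mg_per_kg`, the `strict` switch, and the list of units in the conversion table (the units that actually occur in the "Unit" column have not been checked yet). Only the conversion factors themselves are standard.

```python
WARNING = ("Your requested concentration cut off is less than the LOQ "
           "provided for your requested analyte")

# Factors that convert a value in the given unit to mg/kg.
# Which units really appear in the data is an assumption.
TO_MG_PER_KG = {
    "mg/kg": 1.0,
    "ug/kg": 1e-3,
    "µg/kg": 1e-3,
    "ng/g": 1e-3,   # 1 ng/g equals 1 µg/kg
    "g/kg": 1e3,
}


def check_cutoff(cutoff, loq, strict=False):
    """Warn, or stop the run when strict is set, if cutoff is below the LOQ."""
    if float(cutoff) >= float(loq):
        return True
    if strict:
        # Hypothetical "disallow progression" behaviour.
        raise ValueError(WARNING)
    print(WARNING)
    return False


def to_mg_per_kg(value, unit):
    """Convert a concentration to mg/kg, failing loudly on unknown units."""
    try:
        factor = TO_MG_PER_KG[unit.strip()]
    except KeyError:
        raise ValueError(f"No conversion to mg/kg known for unit {unit!r}")
    return float(value) * factor


converted = to_mg_per_kg("5", "ug/kg")   # micrograms per kg to mg/kg
allowed = check_cutoff(0.005, 0.01)      # prints the warning, run continues
```

Raising an error for an unknown unit, rather than passing the value through unchanged, avoids silently comparing concentrations on different scales.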

In [3]:
# testing to see what this notebook is running
from platform import python_version
print(python_version())

3.7.3


## THE CODE BEGINS

#### Import packages and parse/define arguments

In [4]:
# import the necessary packages
import csv
from pandas import DataFrame
import argparse
import sys

In [7]:
# Parsing arguments with argparse
parser = argparse.ArgumentParser(description = 'This script allows data selection for various requested analytes from Total Diet Studies at the FDA.')
parser.add_argument('--file', required=True, help='The Total Diet Study file to be analyzed.')
parser.add_argument('--analyte', required=True, help='The analyte that is to be extracted, e.g. Arsenic.')
parser.add_argument('--type', required=True, help='The type of analyte that the Total Diet Study input file is measuring, e.g. Element.')
parser.add_argument('--out', required=True, help='The directory where you want output files written.')
parser.add_argument('--number', required=False, help='optional: The Food Number associated with a specific food.')
parser.add_argument('--cutoff', required=False, help='optional: Specify a new cut-off concentration, default=0.')
args = parser.parse_args()

usage: ipykernel_launcher.py [-h] --file FILE --analyte ANALYTE --type TYPE
                             --out OUT [--number NUMBER] [--cutoff CUTOFF]
ipykernel_launcher.py: error: the following arguments are required: --file, --analyte, --type, --out


SystemExit: 2

In [8]:
# Printing the arguments; this is strictly for testing to make sure argparse worked
print(args)

NameError: name 'args' is not defined

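Calling `parse_args()` with no arguments reads the command line of the running process, which inside the notebook belongs to `ipykernel_launcher.py` and does not contain the required options. Argparse was therefore tested by generating `test_argparse.py` in the `tests` folder of the project and running the following command: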


`python test_argparse.py --file ~/Desktop/Python_program/Individual\ Year\ Analytical\ Results_0/Elements_2003.txt --analyte Arsenic --type Element --out ~/Desktop/Python_program/LOQ_Compliance/tests/`


And retrieved the following output:

`Namespace(analyte='Arsenic', cutoff=None, file='/Users/brittany.ott/Desktop/Python_program/Individual Year Analytical Results_0/Elements_2003.txt', number=None, out='/Users/brittany.ott/Desktop/Python_program/LOQ_Compliance/tests/', type='Element')`
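
The same check can also be run inside the notebook, because `parse_args` accepts an explicit list of strings in place of the real command line. The sketch below repeats the parser definition so that it is self-contained; the helper name `build_parser` and the shortened file and directory names are placeholders, not the real paths.

```python
import argparse


def build_parser():
    """Build the same parser as above, wrapped in a function for reuse."""
    parser = argparse.ArgumentParser(
        description='This script allows data selection for various requested '
                    'analytes from Total Diet Studies at the FDA.')
    parser.add_argument('--file', required=True)
    parser.add_argument('--analyte', required=True)
    parser.add_argument('--type', required=True)
    parser.add_argument('--out', required=True)
    parser.add_argument('--number', required=False)
    parser.add_argument('--cutoff', required=False)
    return parser


# Passing a list means the notebook kernel's own arguments are never read.
args = build_parser().parse_args([
    '--file', 'Elements_2003.txt',   # placeholder file name
    '--analyte', 'Arsenic',
    '--type', 'Element',
    '--out', 'tests/',               # placeholder output directory
])
print(args)
```

As in the command-line test, `number` and `cutoff` come back as `None` when they are not supplied, so the "default = 0" behaviour still has to be handled later in the program.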

So, the next step is being able to process the `.txt` file containing our data.

#### Processing the input file of data