# Overview

This notebook walks you through getting the ECR Viewer running locally with a fully populated database of dummy LA County data. The steps are:

1. Clone the PHDI repo
2. Download the LAC zip file
3. Convert all of the XML data into FHIR
4. Start looking for errors

### 1. Clone the PHDI repo

If you have SSH keys set up with your GitHub account, run `git clone git@github.com:CDCgov/phdi.git`. Otherwise, run `git clone https://github.com/CDCgov/phdi.git`. Either way, you should end up with a directory called `phdi`.

**NOTE**: you should clean up the sample data that comes with the repo by deleting the contents of `phdi/containers/ecr-viewer/seed-scripts/baseECR`.
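If you would rather do this cleanup from Python than by hand, the following is a minimal sketch. The helper name `clear_directory` is hypothetical (it is not part of the PHDI repo), and it permanently deletes everything inside the directory you pass in, so check the path first.

```python
import shutil
from pathlib import Path


def clear_directory(directory: Path) -> int:
    """Delete everything inside `directory` but keep the directory itself.

    Hypothetical helper for illustration. Returns the number of
    top-level entries that were removed.
    """
    removed = 0
    for entry in directory.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a sub-folder and all of its contents
        else:
            entry.unlink()  # remove a regular file or a symlink
        removed += 1
    return removed
```

Assuming your working directory is the one that contains `phdi`, a call would look like `clear_directory(Path("phdi/containers/ecr-viewer/seed-scripts/baseECR"))`.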

### 2. Download the LAC zip file

The data is saved [here](https://drive.google.com/file/d/17d10TmhGHT9fMF5sXONsLOrZTF9F6WWN/view?usp=drive_link) on Google Drive.

### 3. Convert all of the XML data into FHIR

There are several steps to complete this process:

1. Extract all of the zip files from the `LAC_DATA` zip archive.
2. For each zip file in the `LAC_DATA` archive, extract all of the files into a folder named after the zip file.
3. Fire up the FHIR conversion Docker container.
4. Loop through each folder and use the eICR and RR XML files to create a FHIR JSON file.

#### 0. Import necessary libraries

```python
import pandas as pd
from lxml import etree
from pathlib import Path
from zipfile import ZipFile
```

#### 1. Extract all of the zip files from the `LAC_DATA` zip archive

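```python
# Path to the downloaded archive and to the folder it unpacks into.
infile = Path("./LAC_DATA.zip")
outfile = Path("./LAC_DATA")
with ZipFile(infile, "r") as zip_file:
    # Extracting to "." assumes the archive contains a top-level LAC_DATA folder.
    zip_file.extractall(path=".")
```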

#### 2. For each zip file in the archive, extract all of the other files into a folder named after the zip file

```python
# Assumes every file directly inside LAC_DATA is a zip archive.
all_files = list(outfile.glob("*.*"))
for f in all_files:
    with ZipFile(f, "r") as zip_file:
        # Assumes the current working directory is the phdi directory.
        zip_file.extractall(path=f"./containers/ecr-viewer/seed-scripts/baseECR/{f.stem}/")
```
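
Before moving on, it can help to confirm that every extracted folder contains the two files that the conversion loop in step 4 reads, `CDA_eICR.xml` and `CDA_RR.xml`. The sketch below is one way to do that; the function name `find_incomplete_folders` is hypothetical and not part of the repo.

```python
from pathlib import Path

# File names taken from the conversion loop in step 4.
REQUIRED_FILES = ("CDA_eICR.xml", "CDA_RR.xml")


def find_incomplete_folders(base_dir: Path) -> dict:
    """Map each sub-folder name to the list of required files it is missing.

    Folders that contain both required files are left out of the result.
    """
    incomplete = {}
    for folder in sorted(p for p in base_dir.iterdir() if p.is_dir()):
        missing = [name for name in REQUIRED_FILES if not (folder / name).is_file()]
        if missing:
            incomplete[folder.name] = missing
    return incomplete
```

From the `phdi` directory, `find_incomplete_folders(Path("./containers/ecr-viewer/seed-scripts/baseECR"))` returns an empty dictionary when every folder is complete.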

#### 3. Launch FHIR Conversion Docker container

Navigate to the `ecr-viewer` subdirectory in the `phdi` directory via `cd containers/ecr-viewer` in your terminal and run the following commands:

```bash
dockId=$(docker run --rm -d -it -p 8080:8080 "$(docker build -q ./../fhir-converter/)")
echo $dockId
```

If a container ID is printed to the screen, your container should be up and running.
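
For a slightly stronger check than a printed value, you can test whether anything is listening on the published port. This is a minimal sketch using only the standard library; the function name `port_is_open` is hypothetical, and a successful connection only shows that the port accepts TCP connections, not that the converter has finished starting.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8080, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `port_is_open()` with the defaults checks `localhost:8080`, the port published by the `docker run` command above.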

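#### 4. Loop through each folder and convert the eICR and RR XML files to FHIR JSON

Run the following `for` loop in your terminal from the `seed-scripts` directory (the relative paths `./baseECR` and `./fhir_data` assume it):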

```bash
for d in ./baseECR/* ; do
    # Escape " and /, turn each line break into a literal \n, then join everything onto one line
    rr=$(sed -e 's/"/\\"/g ; s=/=\\\/=g ; $!s/$/\\n/' "$d/CDA_RR.xml" | tr -d '\n')
    eicr=$(sed -e 's/"/\\"/g ; s=/=\\\/=g ; $!s/$/\\n/' "$d/CDA_eICR.xml" | tr -d '\n')
    resp=$(curl -l 'http://localhost:8080/convert-to-fhir' --header 'Content-Type: application/json' --data-raw '{"input_type":"ecr","root_template":"EICR","input_data": "'"$eicr"'","rr_data": "'"$rr"'"}')
    # Quote the variables so whitespace in the response or folder name is preserved
    echo "$resp" | jq '.response.FhirResource' > "./fhir_data/$(basename "$d").json"
done
```
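
The `sed` escaping in the loop above is easy to get wrong, so here is a hedged Python sketch of the same request that lets `json.dumps` handle the escaping. The endpoint and field names are copied from the `curl` command; the function names are hypothetical, reading the XML as UTF-8 is an assumption about the data, and this has not been run against the converter.

```python
import json
import urllib.request
from pathlib import Path

# Endpoint taken from the curl command above.
CONVERTER_URL = "http://localhost:8080/convert-to-fhir"


def build_payload(eicr_xml: str, rr_xml: str) -> str:
    """Serialize the request body; json.dumps escapes quotes and newlines."""
    return json.dumps(
        {
            "input_type": "ecr",
            "root_template": "EICR",
            "input_data": eicr_xml,
            "rr_data": rr_xml,
        }
    )


def payload_for_folder(folder: Path) -> str:
    """Read the two CDA files in `folder` (assumed UTF-8) and build the body."""
    eicr = (folder / "CDA_eICR.xml").read_text(encoding="utf-8")
    rr = (folder / "CDA_RR.xml").read_text(encoding="utf-8")
    return build_payload(eicr, rr)


def convert(payload: str, url: str = CONVERTER_URL) -> dict:
    """POST the payload to the converter and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the container running, `convert(payload_for_folder(folder))["response"]["FhirResource"]` would correspond to the value that the `jq` filter extracts in the shell loop.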

To check that everything worked as expected, run `ls -lh ./seed-scripts/fhir_data/ | wc -l` from the `ecr-viewer` directory, which should return `1282` (the count includes the `total` line that `ls -l` prints).

In principle, launching the ECR Viewer only requires running `docker compose up` from the `phdi/containers/ecr-viewer` directory. In practice, this currently fails: seeding the database fails because not all of the data converted successfully, so some of the generated files are erroneous.
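
As a possible starting point for finding the erroneous files, the sketch below scans the output folder for JSON files that are empty, unparseable, or contain only `null`. These checks are assumptions about what a failed conversion looks like (for example, `jq` prints `null` when `.response.FhirResource` is absent from the response); the source does not say how the bad files actually differ, and the function name `find_failed_conversions` is hypothetical.

```python
import json
from pathlib import Path


def find_failed_conversions(fhir_dir: Path) -> dict:
    """Map each suspect JSON file name to a short reason string."""
    failures = {}
    for path in sorted(fhir_dir.glob("*.json")):
        text = path.read_text(encoding="utf-8").strip()
        if not text:
            # jq wrote nothing, e.g. because the response was not valid JSON.
            failures[path.name] = "empty file"
            continue
        try:
            data = json.loads(text)
        except json.JSONDecodeError as exc:
            failures[path.name] = f"invalid JSON: {exc.msg}"
            continue
        if data is None:
            # jq prints `null` when the requested key is missing.
            failures[path.name] = "null (no FhirResource in response)"
    return failures
```

Running `find_failed_conversions(Path("./seed-scripts/fhir_data"))` from the `ecr-viewer` directory lists the files worth inspecting first.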

**This is where the work needs to be picked up.**