# Intro


## Goal
**WHAT**: Automatic report generation from Hamilton measurements.  
**WHY**: Speed up report generation and avoid human errors (copying data, subjective evaluation, etc.).

## Tools
Fast iteration in an agile way.  
Generic approach: different plate setups, parameters, etc. are all handled by the same code, with no changes needed.  

**Python** programming language.  
**Jupyter** notebook is currently used, with some functions split into small modules.  
**Visual Studio Code** IDE (Integrated Development Environment).  
**Markdown** (*.md) format for the generated report (simple, human-readable).  

## Input:
 - Worklist file path (*.xls) as used for Hamilton input.
   - Sample name
   - Dilution
   - Viscosity
 - Analysis measurement results file path (*.xls) as output from Hamilton.
 - Parameters; constants in code (file path *.json)
   - CV (Coefficient of variation) threshold
   - Reference value (1.7954e+10 cp/ml)
   - Dilutions [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
   - Decimal digits for output
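
The parameters listed above could be kept in a small JSON file. The sketch below only illustrates one possible layout and loader: the key names (`cv_threshold`, `reference_value`, `dilutions`, `decimal_digits`), the 20 % threshold and the `load_params` helper are assumptions, not the project's actual `read_params` implementation.

```python
import json

# Hypothetical parameters file content; key names and the CV threshold are assumptions.
EXAMPLE_PARAMS_JSON = """
{
    "cv_threshold": 20.0,
    "reference_value": 1.7954e+10,
    "dilutions": [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0],
    "decimal_digits": 2
}
"""

REQUIRED_KEYS = ("cv_threshold", "reference_value", "dilutions", "decimal_digits")


def load_params(text):
    """Parse the JSON text and check that all expected keys are present."""
    params = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in params]
    if missing:
        raise KeyError("Missing parameters: {}".format(", ".join(missing)))
    return params


params = load_params(EXAMPLE_PARAMS_JSON)
print(params["reference_value"], params["dilutions"])
```

Failing early on a missing key keeps a misconfigured run from producing a report with silently defaulted values.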

## Output:
  - Report (*.md, printable to pdf)
    - Could be manually edited
    - Image files
    - Result sheets
  - Estimated size <2kB (current)

## Done
  - Invalid sample:
    - CV >THRESHOLD
    - Only one point
  - Parameters file (*.csv, *.json)
  - Multiple plates (in worklist file)
  - Modules
  - Running modes
    - Python script - automatic run (command line with parameters)
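
The invalid-sample rule from the list above (CV above the threshold, or only one point) can be sketched as follows. This is a minimal illustration and not the project's code: it assumes the CV is the sample standard deviation divided by the mean, in percent, and the function names are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        # CV is undefined for a zero mean; report it as not-a-number.
        return float("nan")
    return statistics.stdev(values) / mean * 100.0


def is_valid_sample(values, cv_threshold):
    """A sample is invalid with fewer than two points or with CV above the threshold."""
    if len(values) < 2:
        return False
    # A NaN CV compares False here, so an undefined CV also marks the sample invalid.
    return coefficient_of_variation(values) <= cv_threshold


print(is_valid_sample([9.0, 11.0], cv_threshold=20.0))
print(is_valid_sample([1.0], cv_threshold=20.0))
```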

## TODO:
  - Finalize the report
    - 2 decimal places
  - Running modes
    - GUI; use modules to create an App (code remains the same, but used from GUI)
  - Tests (unit, integration)
  - checksum (*.sdax); put into report
  - Extensive testing...
  - Automatic print to *.pdf ?
  - md2pdf

## Conclusion
End-to-end evaluation time reduced from approximately 2 h to 20 min per measurement. (thx Felix)


# Generate report  - POC

[AAV9 data folder](<../../Users/hwn6193/OneDrive - Takeda/General - Gene Therapy Analytics (AD+PA)/3_Teams/3.1_Protein_Quantification/_AAV9 Capsid ELISA>)

## Review bugs
### TODO

### Fixed
- mask sample point(s) if `CV>CV_THRESHOLD` and `valid sample_points <= MIN_VALID_SAMPLE_POINTS` (Igor)
- `CV[%]` one `{:.1f}` decimal digit (Felix)
- `Result [cp/ml]` three `{:.3e}` (Felix)
- `nan` -> `NA` (Felix)
- control sample image line ending (Sebastian)
- `CV[%]` column format to 2 decimal digits with trailing zeroes (Sebastian/Robert)
- Fit parameter description https://teams.microsoft.com/l/message/19:4ba886dcae16442f802adcc65edc04bb@thread.v2/1688557386620?context=%7B%22contextType%22%3A%22chat%22%7D (Felix)
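
The formatting fixes above (`{:.3e}` for results, a fixed number of digits for `CV[%]`, `nan` shown as `NA`) can be illustrated with a small helper. `format_value` is a hypothetical name, and the format string is passed in because the notes above mention both one and two decimal digits for the CV column.

```python
import math


def format_value(value, fmt):
    """Format a number with the given format string; missing values become 'NA'."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return "NA"
    return fmt.format(value)


print(format_value(12.5, "{:.2f}"))          # CV[%] with trailing zeroes
print(format_value(1.7954e10, "{:.3e}"))     # Result [cp/ml]
print(format_value(float("nan"), "{:.3e}"))  # missing value
```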

## Imports

In [1]:
VERBOSE_NOTEBOOK = False
WARNING_DISABLE = True
DEBUG = False

In [2]:
from os import path
import warnings
from scipy.optimize import OptimizeWarning

if WARNING_DISABLE:
    warnings.simplefilter('ignore', RuntimeWarning)
    warnings.simplefilter('ignore', OptimizeWarning)
    warnings.filterwarnings('ignore', category=UserWarning, module='openpyxl')

In [3]:
from mkinout import make_input_paths
WORKING_DIR = './reports/230426_AAV9-ELISA_igi_GN004240-033'

input_files = make_input_paths(WORKING_DIR)
WORKLIST_FILE_PATH = input_files['worklist']
PARAMS_FILE_PATH = input_files['params']

DATA_DIR = './data'

## Layouts

In [4]:
from readdata import read_layouts

PLATE_LAYOUT_ID = 'plate_layout_ident.csv'
PLATE_LAYOUT_NUM = 'plate_layout_num.csv'
PLATE_LAYOUT_DIL_ID = 'plate_layout_dil_id.csv'


g_lay = read_layouts(path.join(DATA_DIR, PLATE_LAYOUT_ID),
                     path.join(DATA_DIR, PLATE_LAYOUT_NUM),
                     path.join(DATA_DIR, PLATE_LAYOUT_DIL_ID))

if VERBOSE_NOTEBOOK:
    display(g_lay)

## Worklist

In [5]:
from worklist import read_worklist, check_worklist
from readdata import read_params

g_wl_raw = read_worklist(WORKLIST_FILE_PATH)
g_valid_plates = check_worklist(g_wl_raw)
g_params = read_params(PARAMS_FILE_PATH)

## Dilution to Concentration

Define the dilution dataframe. The dataframe is indexed according to the plate layout; the index of the reference dataframe corresponds to the reference in `plate_layout_dil`.

In [6]:
# TODO: read reference value from parameters
REF_VAL_MAX = 1.7954e+10
DILUTIONS = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]

from sample import make_concentration
g_reference_conc = make_concentration(REF_VAL_MAX, DILUTIONS)

if VERBOSE_NOTEBOOK:
    display(g_reference_conc)
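
As a rough illustration of the dilution-to-concentration step, the sketch below assumes that the concentration at each dilution is simply the reference value divided by the dilution factor. `concentrations_from_dilutions` is a hypothetical stand-in that returns a plain list; the real `make_concentration` is not shown here and, per the description above, produces a dataframe.

```python
REF_VAL_MAX = 1.7954e+10
DILUTIONS = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]


def concentrations_from_dilutions(ref_val_max, dilutions):
    """Return one concentration per dilution factor (assumed: reference / dilution)."""
    return [ref_val_max / dilution for dilution in dilutions]


reference_conc = concentrations_from_dilutions(REF_VAL_MAX, DILUTIONS)
for dilution, conc in zip(DILUTIONS, reference_conc):
    print('{:>5.1f}  {:.3e}'.format(dilution, conc))
```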

## Report generation

In [7]:
from reportmain import report_plate, check_report_crc
from mkinout import make_output_paths, basename_from_inputdir

def gen_report(valid_plates, worklist, params, layout, reference_conc,
               working_dir, base_name):
    reports = []
    for plate in valid_plates:
        print('Processing plate {} of {}'.format(plate, len(valid_plates)))

        output_files = make_output_paths(working_dir, base_name, plate)
        analysis_file_path = output_files['analysis']
        report_file_path = output_files['report']
        report_dir = path.dirname(path.abspath(report_file_path))
        md = report_plate(plate, worklist, params, layout, reference_conc,
                          analysis_file_path, report_dir, report_file_path)
        reports.append({'md': md, 'path': report_file_path})
    return reports

reports = gen_report(g_valid_plates, g_wl_raw, g_params, g_lay, g_reference_conc,
    WORKING_DIR, basename_from_inputdir(WORKING_DIR))

Processing plate 1 of 2


100%|██████████| 21/21 [00:11<00:00,  1.86it/s]


Report for plate 1 saved as ./reports/230426_AAV9-ELISA_igi_GN004240-033\results_plate_1\230426_GN004240-033_-_report_plate_1.md
Generating Word ./reports/230426_AAV9-ELISA_igi_GN004240-033\results_plate_1\230426_GN004240-033_-_report_plate_1.docx for ./reports/230426_AAV9-ELISA_igi_GN004240-033\results_plate_1\230426_GN004240-033_-_report_plate_1.md
Processing plate 2 of 2


100%|██████████| 21/21 [00:10<00:00,  1.92it/s]


Report for plate 2 saved as ./reports/230426_AAV9-ELISA_igi_GN004240-033\results_plate_2\230426_GN004240-033_-_report_plate_2.md
Generating Word ./reports/230426_AAV9-ELISA_igi_GN004240-033\results_plate_2\230426_GN004240-033_-_report_plate_2.docx for ./reports/230426_AAV9-ELISA_igi_GN004240-033\results_plate_2\230426_GN004240-033_-_report_plate_2.md



In [13]:
CHECK_REPORT_CRC = True
REPORT_PLATES_CRC = [1094899247, 4030313479]
if CHECK_REPORT_CRC:
    for report, crc in zip(reports, REPORT_PLATES_CRC):
        try:
            check_report_crc(report['md'], crc)
        except Exception as e:
            print('{} for {}'.format(e, report['path']))
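
A CRC comparison of this kind can be sketched with the standard library. This is an assumption about how such a check might work, not the actual `check_report_crc` from `reportmain`: the helper names are hypothetical, and the choice of `zlib.crc32` over the UTF-8 encoded Markdown text is a guess based on the 32-bit values in `REPORT_PLATES_CRC`.

```python
import zlib


def report_crc(markdown_text):
    """Return the unsigned 32-bit CRC of the UTF-8 encoded report text."""
    return zlib.crc32(markdown_text.encode('utf-8')) & 0xFFFFFFFF


def verify_report_crc(markdown_text, expected_crc):
    """Raise ValueError when the report text does not match the expected CRC."""
    actual_crc = report_crc(markdown_text)
    if actual_crc != expected_crc:
        raise ValueError('CRC mismatch: expected {}, got {}'.format(
            expected_crc, actual_crc))


text = '# Report\n\nExample content.\n'
verify_report_crc(text, report_crc(text))  # passes silently
```

Such a check works as a regression test only if the report text is deterministic, so any timestamp written into the report would need to be excluded or fixed.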

Use pandoc to convert the Markdown reports to Word and PDF.

In [11]:
PDFLATEX_EXE = 'c:/Users/hwn6193/AppData/Local/Programs/MiKTeX/miktex/bin/x64/pdflatex.exe'
REFERENCE_DOCX = 'C:/work/report-gen/custom-reference.docx'
for report in reports:
    report_file_path = path.abspath(report['path'])
    report_dir = path.dirname(path.abspath(report_file_path))
    docx_path = path.splitext(report_file_path)[0] + '.docx'
    print('Generating Word {} for {}'.format(docx_path, report_file_path))
    ! c:/work/pandoc/pandoc -o {docx_path} -f markdown -t docx --resource-path {report_dir} --reference-doc {REFERENCE_DOCX} {report_file_path}

    pdf_path = path.splitext(report_file_path)[0] + '.pdf'
    print(f'Generating PDF {pdf_path} for {report_file_path}')
    ! c:/work/pandoc/pandoc -s -o {pdf_path} --resource-path {report_dir} --pdf-engine {PDFLATEX_EXE} {report_file_path}
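
Outside a notebook, where the `!` shell syntax is not available, the same pandoc call could be made with `subprocess`. The sketch below is a possible approach rather than the project's code: `pandoc_docx_command` and `run_pandoc` are hypothetical helpers that reuse the pandoc options from the cell above. Passing the arguments as a list avoids quoting problems with spaces in paths, and `check=True` turns a pandoc failure (such as a "permission denied" error on the output file) into a Python exception.

```python
import subprocess
from os import path


def pandoc_docx_command(pandoc_exe, report_file_path, reference_docx):
    """Build the pandoc argument list for a Markdown to Word conversion."""
    report_dir = path.dirname(report_file_path)
    docx_path = path.splitext(report_file_path)[0] + '.docx'
    return [pandoc_exe, '-o', docx_path, '-f', 'markdown', '-t', 'docx',
            '--resource-path', report_dir,
            '--reference-doc', reference_docx,
            report_file_path]


def run_pandoc(command):
    """Run pandoc and raise CalledProcessError if it exits with an error."""
    subprocess.run(command, check=True)


# Example paths are placeholders; call run_pandoc(command) to convert.
command = pandoc_docx_command('pandoc', '/tmp/reports/report_plate_1.md',
                              'custom-reference.docx')
print(command)
```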

Generating Word c:\work\report-gen\reports\230426_AAV9-ELISA_igi_GN004240-033\results_plate_1\230426_GN004240-033_-_report_plate_1.docx for c:\work\report-gen\reports\230426_AAV9-ELISA_igi_GN004240-033\results_plate_1\230426_GN004240-033_-_report_plate_1.md
Generating PDF c:\work\report-gen\reports\230426_AAV9-ELISA_igi_GN004240-033\results_plate_1\230426_GN004240-033_-_report_plate_1.pdf for c:\work\report-gen\reports\230426_AAV9-ELISA_igi_GN004240-033\results_plate_1\230426_GN004240-033_-_report_plate_1.md
Generating Word c:\work\report-gen\reports\230426_AAV9-ELISA_igi_GN004240-033\results_plate_2\230426_GN004240-033_-_report_plate_2.docx for c:\work\report-gen\reports\230426_AAV9-ELISA_igi_GN004240-033\results_plate_2\230426_GN004240-033_-_report_plate_2.md


pdflatex: major issue: So far, you have not checked for MiKTeX updates.
pdflatex: major issue: So far, you have not checked for MiKTeX updates.
pandoc: c:\work\report-gen\reports\230426_AAV9-ELISA_igi_GN004240-033\results_plate_1\230426_GN004240-033_-_report_plate_1.pdf: withBinaryFile: permission denied (Permission denied)


Generating PDF c:\work\report-gen\reports\230426_AAV9-ELISA_igi_GN004240-033\results_plate_2\230426_GN004240-033_-_report_plate_2.pdf for c:\work\report-gen\reports\230426_AAV9-ELISA_igi_GN004240-033\results_plate_2\230426_GN004240-033_-_report_plate_2.md


pdflatex: major issue: So far, you have not checked for MiKTeX updates.
pdflatex: major issue: So far, you have not checked for MiKTeX updates.


In [10]:
# ! c:/work/pandoc/pandoc -o custom-reference.docx --print-default-data-file reference.docx