<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## Basic File I/O

In [0]:
#| echo: false
#| output: asis
show_doc(read_text_file)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L11){target="_blank" style="float:right; font-size:smaller"}

### read_text_file

>      read_text_file (filename:str)

*generic, read any text file*

In [0]:
#| echo: false
#| output: asis
show_doc(read_recc_info)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L17){target="_blank" style="float:right; font-size:smaller"}

### read_recc_info

>      read_recc_info (info_file:str)

*read a text file of info on the recommender*

In [None]:
recc_info = read_recc_info("../example/recc_info.txt") 
print(recc_info)

Reccomender Name: Teacher Person 
Title: Professor of Cleverness 

Address: 
Department of Curiosities
Generic University 
1337 Generic Pl. 
Springfield, WA 31416 USA

Phone: 555-123-1337
Email: teacher.person@generic.edu



In [0]:
#| echo: false
#| output: asis
show_doc(read_urls_file)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L22){target="_blank" style="float:right; font-size:smaller"}

### read_urls_file

>      read_urls_file (urls_file:str)

*read a text file where each line is a url of a submission site*

In [None]:
urls = read_urls_file("../example/sample_urls.txt") 
print(f"{len(urls)} urls in list")
for i, url in enumerate(urls): 
    print(f"{i+1} of {len(urls)}: {url}")

1 urls in list
1 of 1: http://localhost:8000/sample_form.html
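
As a rough illustration of the "one URL per line" convention, here is a minimal sketch. The name `read_urls_sketch` is hypothetical, and the handling of blank lines and `#` comments is an assumption, not necessarily what `read_urls_file` does.

```python
from pathlib import Path


def read_urls_sketch(urls_file: str) -> list[str]:
    """Hypothetical stand-in for read_urls_file: return one URL per line."""
    lines = Path(urls_file).expanduser().read_text().splitlines()
    urls = []
    for line in lines:
        line = line.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if line and not line.startswith("#"):
            urls.append(line)
    return urls
```

Stripping whitespace up front avoids handing the browser a URL with a stray trailing newline or space.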


In [0]:
#| echo: false
#| output: asis
show_doc(read_pdf_text)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L31){target="_blank" style="float:right; font-size:smaller"}

### read_pdf_text

>      read_pdf_text (pdf_file)

In [None]:
letter_text = read_pdf_text("../example/sample_letter.pdf")
print(letter_text)

   Dear Graduate Admissions Committee,  I am writing to recommend Student Person for admission to your graduate program. Having worked closely with them for two years in both teaching and research capacities, I can say they are among the strongest students I have encountered in over a decade of academic work.  Student Person took several of my advanced courses — Quantum Rollercoasters, Physics of Impossible Machines, and a seminar on Neural Networks for Curious Minds. They also worked with me on an independent research project. In every setting, they showed sharp intellectual ability, creative thinking, and real persistence. Their coursework went beyond surface-level competence; they clearly grasped the deeper principles at play. As a researcher, they brought fresh perspectives while staying receptive to guidance.  What stands out most is their dependability. They consistently met deadlines and produced high-quality work. During our independent project, they actually moved ahead of sch

## Parsing HTML (Form) Page

In [0]:
#| echo: false
#| output: asis
show_doc(scrape_form_fields)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L40){target="_blank" style="float:right; font-size:smaller"}

### scrape_form_fields

>      scrape_form_fields (html)

*Extract all fillable form fields from HTML*

In [None]:
html = read_text_file("../example/sample_form.html") 
fields = scrape_form_fields(html) 
[f['id'] for f in fields][:20]

['applicant_name',
 'applicant_message',
 'ferpa_waiver',
 'ferpa_waiver',
 'program',
 'discipline',
 'prefix',
 'first_name',
 'middle_name',
 'last_name',
 'organization',
 'title',
 'phone',
 'email',
 'addr1',
 'addr2',
 'city',
 'state',
 'zip',
 'country']
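
To make the idea of "fillable fields" concrete, here is a minimal sketch built on the standard library's `html.parser`. It is not the code in `core.py`: the names `FieldCollector` and `scrape_form_fields_sketch`, the set of skipped input types, and the fallback from `id` to `name` are all assumptions. Under that fallback, a radio group that shares one `name` yields repeated entries, which is one possible reason an id such as `ferpa_waiver` could appear twice in a list like the one above.

```python
from html.parser import HTMLParser

FILLABLE_TAGS = {"input", "select", "textarea"}
# Assumption: these input types are never filled in by a person.
SKIPPED_INPUT_TYPES = {"hidden", "submit", "button", "reset", "image"}


class FieldCollector(HTMLParser):
    """Collect one dict per fillable form element, in document order."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag not in FILLABLE_TAGS:
            return
        attr = dict(attrs)
        # <input> without a type attribute defaults to a text field.
        input_type = (attr.get("type") or "text").lower() if tag == "input" else tag
        if tag == "input" and input_type in SKIPPED_INPUT_TYPES:
            return
        # Assumption: fall back to the name attribute when there is no id.
        field_id = attr.get("id") or attr.get("name")
        if field_id:
            self.fields.append({"id": field_id, "tag": tag, "type": input_type})


def scrape_form_fields_sketch(html: str) -> list[dict]:
    """Hypothetical stand-in for scrape_form_fields."""
    parser = FieldCollector()
    parser.feed(html)
    return parser.fields
```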

## LLM Usage
Next, we prompt the LLM to determine which form fields apply and what value each one should receive:

In [0]:
#| echo: false
#| output: asis
show_doc(make_prompt)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L65){target="_blank" style="float:right; font-size:smaller"}

### make_prompt

>      make_prompt (fields, recc_info, letter_text)

*build the prompt that will go to the LLM*
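
One way such a prompt could be assembled is sketched below. The wording of the instructions, the section headings, and the requested JSON reply format are all assumptions for illustration; the actual prompt text in `make_prompt` is not shown here, and `make_prompt_sketch` is a hypothetical name.

```python
import json


def make_prompt_sketch(fields: list[dict], recc_info: str, letter_text: str) -> str:
    """Hypothetical stand-in for make_prompt: combine all inputs into one string."""
    # Serialize the scraped fields so the model sees exact ids and types.
    field_list = json.dumps(fields, indent=2)
    return (
        "You are helping fill out a recommendation form.\n\n"
        f"## Form fields\n{field_list}\n\n"
        f"## Recommender info\n{recc_info}\n\n"
        f"## Letter text\n{letter_text}\n\n"
        # Assumption: asking for strict JSON makes the reply easier to parse.
        'Reply with only a JSON list of objects, each with keys "id" and '
        '"value", one per field you can fill. Omit fields you cannot determine.'
    )
```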

In [0]:
#| echo: false
#| output: asis
show_doc(get_field_mappings)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L84){target="_blank" style="float:right; font-size:smaller"}

### get_field_mappings

>      get_field_mappings (fields, recc_info, letter_text,
>                          model='ollama/qwen3:0.6b', debug=False)

*Use LLM to map recommender info and letter to form fields*

|    | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| fields |  |  | list of form fields |
| recc_info |  |  | info on recommending person |
| letter_text |  |  | text of recc letter |
| model | str | ollama/qwen3:0.6b | LLM choice, e.g. "claude-sonnet-4-20250514" |
| debug | bool | False |  |
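
Small local models often wrap their answer in extra text, so whatever comes back usually needs defensive parsing. The sketch below assumes the model was asked for a JSON list of `{"id": ..., "value": ...}` objects and may surround it with a code fence or a `<think>...</think>` block. Both the reply format and the name `parse_mappings_sketch` are assumptions, not the documented behavior of `get_field_mappings`.

```python
import json
import re


def parse_mappings_sketch(reply: str) -> list[dict]:
    """Hypothetical helper: pull a JSON list of field mappings out of an LLM reply."""
    # Assumption: reasoning text, if any, is enclosed in <think> tags.
    text = re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL)
    # Take everything from the first '[' to the last ']' as the candidate JSON.
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        data = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return []
    # Keep only well-formed entries that name a field and give it a value.
    return [m for m in data if isinstance(m, dict) and "id" in m and "value" in m]
```

Returning an empty list on failure, rather than raising, lets a caller move on to the next URL and leave that form untouched.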

## Filling in the Form

In [0]:
#| echo: false
#| output: asis
show_doc(get_element_info)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L108){target="_blank" style="float:right; font-size:smaller"}

### get_element_info

>      get_element_info (page, field_id)

*given an id or a name, find the element on the page and get its info*

In [0]:
#| echo: false
#| output: asis
show_doc(should_skip)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L117){target="_blank" style="float:right; font-size:smaller"}

### should_skip

>      should_skip (elem, tag, input_type, skip_prefilled)

*should we skip this element? Yes, if `skip_prefilled` is set and there's already a value there.*
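
The decision itself can be sketched without a browser. The version below takes the element's current value as a plain string instead of a Playwright element, so it is only an approximation of the logic. The name `should_skip_sketch` and the special-casing of checkboxes, radios, and selects are assumptions.

```python
def should_skip_sketch(current_value: str, tag: str, input_type: str,
                       skip_prefilled: bool) -> bool:
    """Hypothetical stand-in for should_skip, operating on a plain string value."""
    if not skip_prefilled:
        # The caller asked to overwrite existing values.
        return False
    # Assumption: these controls always report some value, so a non-empty
    # value is not evidence that a person already filled them in.
    if tag == "select" or input_type in {"checkbox", "radio"}:
        return False
    # Whitespace-only content counts as empty.
    return bool(current_value and current_value.strip())
```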

In [0]:
#| echo: false
#| output: asis
show_doc(fill_element)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L125){target="_blank" style="float:right; font-size:smaller"}

### fill_element

>      fill_element (elem, tag, input_type, field_id, value)

*actually fill in this element*

In [0]:
#| echo: false
#| output: asis
show_doc(fill_form)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L137){target="_blank" style="float:right; font-size:smaller"}

### fill_form

>      fill_form (page, mappings, skip_prefilled=True, debug=False)

*Fill form fields using Playwright*

In [0]:
#| echo: false
#| output: asis
show_doc(upload_recommendation)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L158){target="_blank" style="float:right; font-size:smaller"}

### upload_recommendation

>      upload_recommendation (page, file_path)

*Upload the recommendation PDF*

In [0]:
#| echo: false
#| output: asis
show_doc(process_url)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L164){target="_blank" style="float:right; font-size:smaller"}

### process_url

>      process_url (page, url, recc_info, letter_text, pdf_path, model,
>                   debug=False)

*Process a single recommendation URL*

## `formalyzer` CLI script

In [0]:
#| echo: false
#| output: asis
show_doc(read_info)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L188){target="_blank" style="float:right; font-size:smaller"}

### read_info

>      read_info (recc_info:str, pdf_path:str, urls:str)

*parse CLI args and read input files*

In [0]:
#| echo: false
#| output: asis
show_doc(setup_browser)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L204){target="_blank" style="float:right; font-size:smaller"}

### setup_browser

>      setup_browser ()

*Connect to Chrome with remote debugging*

In [0]:
#| echo: false
#| output: asis
show_doc(run_formalyzer)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L214){target="_blank" style="float:right; font-size:smaller"}

### run_formalyzer

>      run_formalyzer (recc_info, letter_text, urls, pdf_path, model,
>                      debug=False)

*Main async workflow*

In [0]:
#| echo: false
#| output: asis
show_doc(main)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L231){target="_blank" style="float:right; font-size:smaller"}

### main

>      main (recc_info:str, pdf_path:str, urls:str, model:str='ANTHROPIC',
>            debug:bool=False)

|    | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| recc_info | str |  | text file with recommender name, address, etc |
| pdf_path | str |  | path to the PDF recommendation letter |
| urls | str |  | text file with one URL per line |
| model | str | ANTHROPIC | 'ollama/qwen3:0.6b' for local model |
| debug | bool | False | best to always turn this on, actually |

In [None]:
# main("~/recc_info.txt", "~/recc_letter.pdf", "~/recc_urls.txt", debug=True)