<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

```markdown
formalyzer: 

Reads a PDF recommendation letter and fills in admissions form(s)

usage: 
  formalyzer <recc_info.txt> <recc_letter.pdf> <url_list.txt>

A single URL can be given in place of url_list.txt (useful for testing).

Description: 
Formalyzer will scrape the text from the PDF recc letter, 
and for each URL in url_list, it will: 
- launch a browser tab for that url 
- fill in the form using what the LLM has gleaned from the recc letter
- attach the PDF via the form's upload/attachment button
...and do no more. 
The user will need to review the page and press the Submit button manually.


Requirements: 
- Playwright 
- ANTHROPIC_API_KEY env var. (Could support other LLMs later)
- pypdf  

Author: Scott H. Hawley, @drscotthawley
```


In [0]:
#| echo: false
#| output: asis
show_doc(read_recc_info)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L10){target="_blank" style="float:right; font-size:smaller"}

### read_recc_info

>      read_recc_info (info_file:str)

*read a text file of info on the recommender*

In [None]:
recc_info = read_recc_info("~/recc_info.txt") 
recc_info

'Reccomender Name: Scott H. Hawley \nTitle: Professor of Physics \n\nAddress: \nBelmont University \n1900 Belmont Blvd \nNashville, TN 37211\n\nPhone: 615-460-6206\nEmail: scott.hawley@belmont.edu\n'

In [0]:
#| echo: false
#| output: asis
show_doc(read_urls_file)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L16){target="_blank" style="float:right; font-size:smaller"}

### read_urls_file

>      read_urls_file (urls_file:str)

*read a text file where each line is a url of a submission site*

In [None]:
urls = read_urls_file("~/recc_urls.txt") 
print(f"{len(urls)} urls in list")

11 urls in list
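
For orientation, a reader for this kind of file could look roughly like the sketch below. It is not the code in `core.py`: the name `read_urls_sketch` is hypothetical, and skipping blank lines and `#` comment lines is an assumption made for illustration.

```python
from pathlib import Path

def read_urls_sketch(urls_file: str) -> list[str]:
    """Return the URLs in a text file, one per line (hypothetical helper)."""
    path = Path(urls_file).expanduser()  # allow paths like "~/recc_urls.txt"
    urls = []
    for line in path.read_text().splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with '#' are ignored.
        if line and not line.startswith("#"):
            urls.append(line)
    return urls
```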


In [0]:
#| echo: false
#| output: asis
show_doc(read_pdf_text)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L26){target="_blank" style="float:right; font-size:smaller"}

### read_pdf_text

>      read_pdf_text (pdf_file)

In [None]:
letter_text = read_pdf_text("~/recc_letter.pdf")
#letter_text

In [0]:
#| echo: false
#| output: asis
show_doc(scrape_form_fields)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L35){target="_blank" style="float:right; font-size:smaller"}

### scrape_form_fields

>      scrape_form_fields (html)

*Extract all fillable form fields from HTML*
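
To give a feel for what "fillable form fields" might mean here, the following is a minimal sketch using only the standard library's `html.parser`. The class and function names, the returned dict layout, and the list of skipped input types are all assumptions; the actual `scrape_form_fields` may use a different parser and return a different structure.

```python
from html.parser import HTMLParser

class FieldCollector(HTMLParser):
    """Collect basic attributes of <input>, <select> and <textarea> tags."""
    FILLABLE = {"input", "select", "textarea"}
    # Assumption: these input types are not something a user types into.
    SKIP_TYPES = {"hidden", "submit", "button", "reset", "image"}

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.FILLABLE:
            return
        attrs = dict(attrs)
        # An <input> without a type attribute defaults to a text field.
        ftype = (attrs.get("type") or "text").lower() if tag == "input" else tag
        if ftype in self.SKIP_TYPES:
            return
        self.fields.append({"tag": tag, "type": ftype,
                            "name": attrs.get("name"), "id": attrs.get("id")})

def scrape_fields_sketch(html: str) -> list[dict]:
    """Return one dict per fillable field found in the HTML string."""
    parser = FieldCollector()
    parser.feed(html)
    return parser.fields
```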

In [0]:
#| echo: false
#| output: asis
show_doc(get_field_mappings)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L62){target="_blank" style="float:right; font-size:smaller"}

### get_field_mappings

>      get_field_mappings (fields, recc_info, letter_text, model='claude-
>                          sonnet-4-20250514', debug=False)

*Use LLM to map recommender info and letter to form fields*
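
The mapping step can be thought of as two halves: building a prompt from the scraped fields plus the recommender's text, and decoding the model's reply. The sketch below shows only those two halves and makes no API call. The function names, the prompt wording, and the choice of a JSON object as the reply format are assumptions, not the behavior of `get_field_mappings`.

```python
import json

def build_mapping_prompt(fields, recc_info, letter_text):
    """Assemble one prompt asking for a JSON object of field name -> value."""
    return (
        "Fill in the form fields below using the recommender info and letter.\n"
        "Reply with only a JSON object mapping each field name to its value.\n\n"
        f"FIELDS:\n{json.dumps(fields, indent=2)}\n\n"
        f"RECOMMENDER INFO:\n{recc_info}\n\n"
        f"LETTER:\n{letter_text}\n"
    )

def parse_mapping_reply(reply: str) -> dict:
    """Decode the text between the first '{' and the last '}' as JSON."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in reply")
    return json.loads(reply[start:end + 1])
```

Slicing between the outermost braces is a simple way to tolerate replies that wrap the JSON in extra prose; a stricter caller could reject such replies instead.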

In [0]:
#| echo: false
#| output: asis
show_doc(fill_form)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L86){target="_blank" style="float:right; font-size:smaller"}

### fill_form

>      fill_form (page, mappings, skip_prefilled=True, debug=False)

*Fill form fields using Playwright*
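
As an illustration of what `skip_prefilled` could mean, here is a minimal sketch written against Playwright's async `Page.input_value` and `Page.fill` methods. It assumes `mappings` is a dict of CSS selector to value, which the signature above does not confirm, and the function name is hypothetical.

```python
async def fill_form_sketch(page, mappings, skip_prefilled=True):
    """Fill each selector with its value; return the selectors actually filled."""
    filled = []
    for selector, value in mappings.items():
        # Assumption: a field counts as prefilled if it already has a non-empty value.
        if skip_prefilled and await page.input_value(selector):
            continue  # leave existing content untouched
        await page.fill(selector, str(value))
        filled.append(selector)
    return filled
```

Because the function only calls `input_value` and `fill`, it can be exercised with any object that provides those two coroutines, which makes it easy to try out without a live browser.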

In [0]:
#| echo: false
#| output: asis
show_doc(upload_recommendation)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L113){target="_blank" style="float:right; font-size:smaller"}

### upload_recommendation

>      upload_recommendation (page, file_path)

*Upload the recommendation PDF*
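
One plausible way to attach a file with Playwright is `Page.set_input_files`, sketched below. The function name and the default selector `input[type="file"]` are assumptions; real forms may hide the file input behind a custom button, and the actual `upload_recommendation` may handle that differently.

```python
from pathlib import Path

async def upload_sketch(page, file_path, selector='input[type="file"]'):
    """Attach a local file to the page's file input (hypothetical helper)."""
    path = Path(file_path).expanduser()
    if not path.is_file():
        # Fail early with a clear error rather than inside the browser call.
        raise FileNotFoundError(path)
    await page.set_input_files(selector, str(path))
    return str(path)
```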

In [0]:
#| echo: false
#| output: asis
show_doc(process_url)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L121){target="_blank" style="float:right; font-size:smaller"}

### process_url

>      process_url (page, url, recc_info, letter_text, pdf_path, debug=False)

*Process a single recommendation URL*

# `formalyzer` CLI script

In [0]:
#| echo: false
#| output: asis
show_doc(read_info)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L145){target="_blank" style="float:right; font-size:smaller"}

### read_info

>      read_info (recc_info:str, pdf_path:str, urls:str)

*parse CLI args and read input files*
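
The module description notes that a single URL may be passed where the URL list file is expected. A minimal sketch of that branch follows; the function name and the "starts with http:// or https://" test are assumptions about how the two cases could be told apart, not a description of `read_info`.

```python
from pathlib import Path

def resolve_urls_sketch(urls_arg: str) -> list[str]:
    """Treat the argument as one URL if it looks like one, else as a file of URLs."""
    if urls_arg.startswith(("http://", "https://")):
        return [urls_arg]
    # Otherwise read the file and keep its non-empty lines.
    lines = Path(urls_arg).expanduser().read_text().splitlines()
    return [ln.strip() for ln in lines if ln.strip()]
```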

In [0]:
#| echo: false
#| output: asis
show_doc(main)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L187){target="_blank" style="float:right; font-size:smaller"}

### main

>      main (recc_info:str, pdf_path:str, urls:str, debug:bool=False)

In [0]:
#| echo: false
#| output: asis
show_doc(run_formalyzer)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L173){target="_blank" style="float:right; font-size:smaller"}

### run_formalyzer

>      run_formalyzer (recc_info, letter_text, urls, pdf_path, debug=False)

*Main async workflow*

In [0]:
#| echo: false
#| output: asis
show_doc(setup_browser)

---

[source](https://github.com/drscotthawley/formalyzer/blob/main/formalyzer/core.py#L165){target="_blank" style="float:right; font-size:smaller"}

### setup_browser

>      setup_browser ()

*Connect to Chrome with remote debugging*
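
Playwright can attach to a Chrome instance that was started with the `--remote-debugging-port` flag, via `chromium.connect_over_cdp`. The sketch below shows that pattern under stated assumptions: port 9222, host `localhost`, and the helper names are illustrative choices, and the real `setup_browser` may return something different.

```python
def cdp_endpoint(port: int = 9222, host: str = "localhost") -> str:
    """Build the HTTP endpoint of a Chrome started with --remote-debugging-port."""
    return f"http://{host}:{port}"

async def connect_to_chrome_sketch(port: int = 9222):
    """Attach to an already-running Chrome and return (playwright, browser, context)."""
    # Imported here so this module still loads where Playwright is not installed.
    from playwright.async_api import async_playwright

    pw = await async_playwright().start()
    browser = await pw.chromium.connect_over_cdp(cdp_endpoint(port))
    # Reuse the browser's existing context when there is one, else create one.
    context = browser.contexts[0] if browser.contexts else await browser.new_context()
    return pw, browser, context
```

Attaching to an existing browser, rather than launching a fresh one, keeps any logins the user already has and leaves the filled-in tabs open for manual review.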

In [None]:
#main("~/recc_info.txt", "~/recc_letter.pdf","~/recc_urls.txt", debug=True)