# End-to-end handwriting-to-LaTeX Demo

## Summary

This notebook demos converting handwritten math to LaTeX using the top tools identified in the research phase. The handwriting input is captured with tkinter, with the drawing taking place in a separate window. The math OCR is done with MathPix's OCR API.

The first test of the full workflow produced a rendered equation from handwritten input with about 9 of 11 characters recognized correctly. I see two problems contributing to this inaccuracy. First, the drawing shown on the canvas is not the exact image being sent: a background PIL image silently keeps track of the drawing using line objects native to PIL, and these are not rendered as smoothly as their tkinter counterparts (see image 2 and image 3). The second source of error is inaccurate user input, i.e., bad handwriting.

Accuracy and bad handwriting aside, this shows that we can build an end-to-end solution for the product.

## Experiment start

In [None]:
# image manipulation
from tkinter import Tk, Canvas, ttk, Button
from tkinter import constants as con
from PIL import ImageGrab, ImageTk, ImageDraw
import PIL

# math
import string
import random
from IPython.display import Markdown as md

# system
import sys
import base64

# requests
import requests
import json

**Constants for requests**

In [None]:
app_id = "jaime_meriz13_gmail_com_0ae761_524ac2"
app_key = "8c504ea5335669f6a2c567f97fab91b34e6fee47f2f8ed849535dd2c2402bf24"
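Hard-coding credentials in a notebook makes them easy to leak when the notebook is shared. A minimal sketch of one alternative is to read them from environment variables. The variable names `MATHPIX_APP_ID` and `MATHPIX_APP_KEY` and the helper `load_mathpix_credentials` are assumptions made for this example; they are not defined anywhere in the notebook or by MathPix.

```python
import os

def load_mathpix_credentials(env=os.environ):
    """Return (app_id, app_key) read from environment variables.

    MATHPIX_APP_ID and MATHPIX_APP_KEY are hypothetical names chosen
    for this sketch; use whatever names suit your setup.
    """
    try:
        return env["MATHPIX_APP_ID"], env["MATHPIX_APP_KEY"]
    except KeyError as missing:
        # Fail early with a readable message instead of sending an empty key.
        raise RuntimeError(f"Set {missing.args[0]} before running the notebook") from None
```

With the variables set in the shell that launches Jupyter, the constants cell could become `app_id, app_key = load_mathpix_credentials()`.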

In [3]:
def rnd_image_filename(N=7):
    """
    N: Integer.
    How many random characters to append to the file name to make it unique.
    
    filename: String.
    Name of the file to be used.
    
    The tag is drawn from the 26 uppercase letters and 10 digits, so N=7 gives
    36^7 = 78,364,164,096 (about 7.8x10^10) possible tags. This makes it very
    unlikely that two files are named the same, even on back-to-back runs.
    """
    
    filetag=''.join(random.choices(string.ascii_uppercase + string.digits, k=N))
    filename="figures/canvas_img_"+filetag+".png"
    return filename

In [82]:
# unit test
for i in range(5):
    filename_test = rnd_image_filename()
    print(filename_test)

figures/canvas_img_WZAZWLZ.png
figures/canvas_img_K9N13DE.png
figures/canvas_img_Z0ZF6IF.png
figures/canvas_img_JT60NV4.png
figures/canvas_img_IVRJ0NT.png
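To put a number on how unlikely a name clash is, the sketch below counts the possible tags and estimates the chance that at least two files in a batch share a tag (a birthday-problem calculation). The helper names `tag_space` and `collision_probability` are made up for this example, and the calculation assumes tags are drawn uniformly, as `random.choices` does by default.

```python
import string

# The same 36 characters that rnd_image_filename draws from.
ALPHABET = string.ascii_uppercase + string.digits

def tag_space(n=7):
    """Number of distinct tags of length n."""
    return len(ALPHABET) ** n

def collision_probability(num_files, n=7):
    """Probability that at least two of num_files random tags coincide."""
    space = tag_space(n)
    p_all_unique = 1.0
    for i in range(num_files):
        p_all_unique *= (space - i) / space
    return 1.0 - p_all_unique

print(tag_space(7))                 # 78364164096
print(collision_probability(1000))  # chance of any clash among 1000 saved images
```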


## Phase 1: tkinter implementation

This code combines tkinter's tutorial example with a few lines of code from this video (https://www.youtube.com/watch?v=OdDCsxfI8S0).
The code block under 'Pilot Canvas Code' combines what happens in the code blocks under 'Simple Canvas code' and 'Canvas w Save Button'.

## Simple Canvas code

- no buttons or save feature, just drawing
- the lines marked `# diag` can be removed to get back the original version; I don't think they're necessary.
- **< Button-1 >** =  Mouse button 1 is pressed. Button 1 is the leftmost button, button 2 is the middle button (where available), and button 3 is the rightmost button.
- **< B1-Motion >** =  The mouse is moved while mouse button 1 is held down (use B2 for the middle button, B3 for the right button).
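The way these two events cooperate can be sketched without opening a window: the press event records a starting point, and each motion event connects the previous point to the current one. The `StrokeRecorder` class below is a hypothetical helper, not part of the notebook, and the events are simulated with `SimpleNamespace` objects that carry only the `x` and `y` attributes the handlers read.

```python
from types import SimpleNamespace

class StrokeRecorder:
    """Turns press/drag events into line segments, without global state."""

    def __init__(self):
        self.last = None      # most recent pointer position
        self.segments = []    # (x0, y0, x1, y1) tuples, one per drag event

    def on_press(self, event):
        # <Button-1>: remember where the stroke starts.
        self.last = (event.x, event.y)

    def on_drag(self, event):
        # <B1-Motion>: connect the previous position to the current one.
        if self.last is None:
            self.last = (event.x, event.y)
            return
        self.segments.append((*self.last, event.x, event.y))
        self.last = (event.x, event.y)

# Simulated events stand in for the objects tkinter passes to the handlers.
recorder = StrokeRecorder()
recorder.on_press(SimpleNamespace(x=10, y=10))
recorder.on_drag(SimpleNamespace(x=12, y=11))
recorder.on_drag(SimpleNamespace(x=15, y=13))
print(recorder.segments)  # [(10, 10, 12, 11), (12, 11, 15, 13)]
```

In a real canvas the same methods could be passed to `canvas.bind("<Button-1>", ...)` and `canvas.bind("<B1-Motion>", ...)`, with each recorded segment drawn as it arrives.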

In [None]:
# CANVAS 1
def savePosn(event):
    global lastx, lasty
    lastx, lasty = event.x, event.y

def addLine(event):
    canvas.create_line((lastx, lasty, event.x, event.y))
    savePosn(event)

root = Tk()

# diag: don't actually need this.
# root.columnconfigure(0, weight=1)
# root.rowconfigure(0, weight=1)

canvas = Canvas(root, bg="white")
# diag: if you add this line, you don't need the grid
canvas.pack()

# see comment @ diag above
# canvas.grid(column=0, row=0, sticky=(con.N, con.W, con.E ,con.S))
canvas.bind("<Button-1>", savePosn)
canvas.bind("<B1-Motion>", addLine)

root.mainloop(0)

## Canvas w Save Button

- contains the complete code from the YouTube video, which draws and saves the image using a button.
- the draw feature is rough (each mouse event paints a separate small dot rather than a connected line), which is why we are trying to merge the other code into it
- modified to use **rnd_image_filename**

In [None]:
# this is the pilot code

width = 600
height = 400
center = height//2
white = (255, 255, 255)
background = 'white'
green = (0,128,0)

def save():
    filename = rnd_image_filename()
    canvas_image.save(filename)

def paint(event):
    # python_green = "#476042"
    x1, y1 = (event.x - 1), (event.y - 1)
    x2, y2 = (event.x + 1), (event.y + 1)
    cv.create_oval(x1, y1, x2, y2,fill="black",width=2)
    draw.line([x1, y1, x2, y2],fill="black",width=2)

root = Tk()

# Tkinter create a canvas to draw on
cv = Canvas(root, width=width, height=height, bg=background)
cv.pack()

# PIL create an empty image and draw object to draw on
# memory only, not visible
canvas_image = PIL.Image.new("RGB", (width, height), white)
draw = ImageDraw.Draw(canvas_image)

# do the Tkinter canvas drawings (visible)
# cv.create_line([0, center, width, center], fill='green')

cv.pack(expand=True, fill="both")
cv.bind("<B1-Motion>", paint)

# do the PIL image/draw (in memory) drawings
# draw.line([0, center, width, center], green)

# PIL image can be saved as .png .jpg .gif or .bmp file (among others)
# filename = "my_drawing.png"
# image1.save(filename)
button=Button(text="save",command=save)
button.pack()
root.mainloop()

print("Complete canvas input and image save.")


## Pilot Canvas Code

- combining elements from prior code to create a canvas w a save button with smooth line drawing.

In [16]:
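def pilot_canvas():
    # NOTE: save() writes into a module-level dict named `filename`,
    # which must already exist (e.g. filename = {}) before this is called.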
    width=600
    height=400
    linewidth=3
    offset=(linewidth)/2
    white=(255,255,255)
    linecolor="BLACK"

    def save(N=5):
        filename["name"] = rnd_image_filename(N=N)
        canvas_image.save(filename["name"])
        print("File was saved as: ", filename["name"] )
        
    def savePosn(event):
        global lastx, lasty
        lastx, lasty = event.x, event.y

    def addLine(event):
        # the canvas call is what you see on screen
        canvas.create_line((lastx, lasty, event.x, event.y),
                            smooth=True,width=linewidth,fill=linecolor)
        # the draw call is in the background (invisible) capturing what will actually get converted.
        draw.line([lastx, lasty, event.x, event.y], fill=linecolor, width=linewidth,joint='curve')
        savePosn(event)

    root = Tk()

    # Tkinter create a canvas to draw on
    canvas = Canvas(root, bg="white", width=width, height=height)
    canvas.pack()

    # PIL create an empty image and draw object to draw on
    # memory only, not visible
    canvas_image = PIL.Image.new("RGB", (width, height), white)
    draw = ImageDraw.Draw(canvas_image)

    canvas.pack(expand=True, fill="both")
    canvas.bind("<Button-1>", savePosn)
    canvas.bind("<B1-Motion>", addLine)

    # Add a save button
    button=Button(text="Save Image",command=lambda: save(N=7))
    button.pack()

    # Add an exit button
    # later....

    root.mainloop()

    # print("File was saved as: ", filename["name"] )

## Phase 2: MathPix OCR API implementation

We reuse code from the 'top_ocr_tools_mathpix_snip' notebook to submit an API request to MathPix.


In [39]:
def ocr_request(filename):
    # Request options kept from the MathPix snip notebook.
    # NOTE: this dict is not sent; only 'src' is posted below.
    dict_request={
            "src": "data:image/png",
            "formats": ["text", "data", "html"],
            "data_options": {
            "include_asciimath": True,
            "include_latex": True
            }
        }

    # path of the image saved from the canvas earlier
    file_path = filename["name"]
    # read the image and encode it as a base64 data URI
    with open(file_path, "rb") as image_file:
        image_uri = "data:image/png;base64," + base64.b64encode(image_file.read()).decode()

    # send the request
    r = requests.post("https://api.mathpix.com/v3/text",
        data=json.dumps({'src': image_uri}),
        headers={"app_id": app_id, 
                 "app_key": app_key,
                 "Content-type": "application/json"})

    # parse the response once, then pretty-print it
    json_return = json.loads(r.text)
    print(json.dumps(json_return, indent=4, sort_keys=True))
    
    # the styled LaTeX string (None if the key is missing)
    latex_return = json_return.get("latex_styled")
    print(latex_return)
    print()
    return latex_return
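In `ocr_request`, the options in `dict_request` are defined but only `{'src': image_uri}` is actually posted. The sketch below shows one way the image and the options could be combined into a single request body. The helper name `build_ocr_payload` is hypothetical, the option values are copied from `dict_request`, and whether the endpoint still returns `latex_styled` with these `formats` is an assumption that should be checked against the MathPix documentation before relying on it.

```python
import base64
import json

def build_ocr_payload(image_bytes, formats=("text", "data", "html")):
    """Build the JSON body for the OCR request from raw PNG bytes."""
    image_uri = "data:image/png;base64," + base64.b64encode(image_bytes).decode()
    return {
        "src": image_uri,
        "formats": list(formats),
        # Options copied from dict_request in the notebook.
        "data_options": {"include_asciimath": True, "include_latex": True},
    }

# Placeholder bytes stand in for a saved canvas image.
payload = build_ocr_payload(b"\x89PNG placeholder bytes")
body = json.dumps(payload)  # the string that would be passed as data= to requests.post
```

Keeping the payload construction separate from the network call also makes it possible to inspect the request body without spending an API request.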

### Print the return request

We may also want to save this so that an API request doesn't have to be made every time.

In [84]:
print(json.dumps(json.loads(r.text), indent=4, sort_keys=True))

{
    "auto_rotate_confidence": 0.011462837151636762,
    "auto_rotate_degrees": 0,
    "confidence": 0.47379128643166557,
    "confidence_rate": 0.47379128643166557,
    "is_handwritten": true,
    "is_printed": false,
    "latex_styled": "\\sum_{m}\\left(_{j m}^{2}+\\tan \\left(\\phi_{m}\\right)\\right.",
    "request_id": "3b08353e2e65e4a05c5d68a3061032db",
    "text": "\\( \\sum_{m}\\left(_{j m}^{2}+\\tan \\left(\\phi_{m}\\right)\\right. \\)"
}
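One way to avoid repeating a request for an image that was already processed is to store each response on disk, keyed by a hash of the image bytes. This is only a sketch: `cached_ocr`, the `ocr_cache` directory name, and the `fetch` callable are assumptions for illustration, and `fetch` stands for any function that takes an image path and returns the parsed response as a dict.

```python
import hashlib
import json
import tempfile
from pathlib import Path

def cached_ocr(image_path, fetch, cache_dir="ocr_cache"):
    """Return the OCR response for image_path, calling fetch only on a cache miss."""
    digest = hashlib.sha256(Path(image_path).read_bytes()).hexdigest()
    cache_file = Path(cache_dir) / f"{digest}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    response = fetch(image_path)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(response, indent=4, sort_keys=True))
    return response

# Demo with a stand-in for the real request, so no network call is made.
calls = []

def fake_fetch(path):
    calls.append(path)
    return {"latex_styled": "x^{2}"}

with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "drawing.png"
    image.write_bytes(b"not a real PNG")
    first = cached_ocr(image, fake_fetch, cache_dir=Path(tmp) / "cache")
    second = cached_ocr(image, fake_fetch, cache_dir=Path(tmp) / "cache")

print(first == second, len(calls))  # True 1
```

Hashing the image contents rather than the file name means a redrawn image saved under the same name still triggers a fresh request.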


In [100]:
json_return = json.loads(r.text)
latex_return = json_return.get("latex_styled")

## Results

**How well did we do??**

In [19]:
# expected: \sum_{m}\left(_{j m}^{2}+\tan \left(\phi_{m}\right)\right.
print(latex_return)
print()
md("$ %s $"%(latex_return))

NameError: name 'latex_return' is not defined

<img src="figures/canvas_img_RB.png" alt="" title="" width="400" height="300" />

We see that almost every character except the 'y' was converted correctly. This is in part due to bad handwriting. Looking at how the y is actually written, with a partial break at the stem, we can see why the OCR read it as two separate characters: one closely resembling a j, and the second part of the curve resembling a parenthesis. Fortunately, the majority of the LaTeX was converted successfully, so a user could come back and fix this minor LaTeX issue.
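A cheap way to point the user at results like this one is to count opening and closing delimiters in the returned string and flag a mismatch. The helpers `delimiter_report` and `looks_unbalanced` are hypothetical, and the check is only a heuristic: valid LaTeX such as `\left( ... \right.` is also flagged, and escaped braces like `\{` are counted as ordinary braces.

```python
def delimiter_report(latex):
    """Count round, square, and curly delimiters in a LaTeX string."""
    pairs = {"(": ")", "[": "]", "{": "}"}
    return {opener + closer: (latex.count(opener), latex.count(closer))
            for opener, closer in pairs.items()}

def looks_unbalanced(latex):
    """True if any delimiter type has a different number of openers and closers."""
    return any(opened != closed
               for opened, closed in delimiter_report(latex).values())

# The latex_styled string returned for the first handwritten test.
returned = r"\sum_{m}\left(_{j m}^{2}+\tan \left(\phi_{m}\right)\right."
print(delimiter_report(returned))  # {'()': (2, 1), '[]': (0, 0), '{}': (4, 4)}
print(looks_unbalanced(returned))  # True
```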

<p float="left">
  <img src="figures/img_6V3SBX4_cv.png" alt="" title="" width="400" height="300" />
  <img src="figures/img_6V3SBX4_pil.png" alt="" title="" width="400" height="300" /> 
</p>

The image on the left is from the canvas itself. The image on the right is PIL's hidden rendition, which can be saved and sent to the OCR. 



## Test Case 1:

Hand-write and translate the following equation:

$$ \frac{\partial c}{\partial t} = \nabla \cdot (D \nabla c) - \nabla \cdot (\mathbf{v} c) + R $$

In [37]:
# call the pilot_canvas()
# filename["name"] = pilot_canvas()

filename["name"]='figures/canvas_img_LKNB0E0.png'

In [36]:
filename["name"]

'figures/canvas_img_6V3SBX4.png'

In [41]:
# send the API request
latex_return=ocr_request(filename)

{
    "auto_rotate_confidence": 0.006803493154929896,
    "auto_rotate_degrees": 0,
    "confidence": 0.21025172949225443,
    "confidence_rate": 0.21025172949225443,
    "is_handwritten": true,
    "is_printed": false,
    "latex_styled": "\\left.\\frac{\\partial c}{\\partial t}=\\nabla \\cdot \\mid D \\nabla c\\right)-\\nabla \\cdot(v c)+R",
    "request_id": "af4a7a92e4e05d08e0f6465860b2858c",
    "text": "\\( \\left.\\frac{\\partial c}{\\partial t}=\\nabla \\cdot \\mid D \\nabla c\\right)-\\nabla \\cdot(v c)+R \\)"
}
\left.\frac{\partial c}{\partial t}=\nabla \cdot \mid D \nabla c\right)-\nabla \cdot(v c)+R



In [67]:
# print the returned request
md("$ %s $"%(latex_return))

$ \left.\frac{\partial c}{\partial t}=\nabla \cdot \mid D \nabla c\right)-\nabla \cdot(v c)+R $

In [68]:
# compare to input image (PIL's image)
print(filename["name"])

figures/canvas_img_LKNB0E0.png


**PIL's Image**
<p float="left">
  <img src="figures/canvas_img_LKNB0E0.png" alt="" title="" width="400" height="300" />
</p>

**Original LaTeX**
$$ \frac{\partial c}{\partial t} = \nabla \cdot (D \nabla c) - \nabla \cdot (\mathbf{v} c) + R $$

### How did we do?

We see that only one character was translated incorrectly. The '(' is being confused for '|', partly because the handwriting is confusing the OCR. Another source of error is that we are using a mouse to write the equation, which is generally harder to use for the finer points of drawing.
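To compare runs more consistently than counting characters by eye, the expected and returned LaTeX can be scored with a string-similarity ratio. The sketch below uses `difflib.SequenceMatcher` from the standard library; `latex_similarity` is a hypothetical helper, and the ratio is a rough similarity measure over the LaTeX source, not a true per-symbol accuracy. Differences in markup such as `\left.` or `\mathbf{}` lower the score even when the rendered symbols look alike.

```python
from difflib import SequenceMatcher

def latex_similarity(expected, returned):
    """Ratio in [0, 1] of matching characters, ignoring whitespace."""
    def strip_spaces(text):
        return "".join(text.split())
    return SequenceMatcher(None, strip_spaces(expected), strip_spaces(returned)).ratio()

# Test Case 1: the target equation and the latex_styled string that came back.
expected = r"\frac{\partial c}{\partial t} = \nabla \cdot (D \nabla c) - \nabla \cdot (\mathbf{v} c) + R"
returned = r"\left.\frac{\partial c}{\partial t}=\nabla \cdot \mid D \nabla c\right)-\nabla \cdot(v c)+R"

print(round(latex_similarity(expected, returned), 2))
```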