> ## Make a copy of this notebook (File menu -> Make a Copy...)

### Homework Question 1

This question explores the difference between classical Gram-Schmidt (GS) and Modified Gram-Schmidt (MGS). We use two matrices: a random matrix generated by you, and a second matrix designed to highlight the difference.

1. In lab, you wrote code for MGS. Write a similar function for GS. Show a test on a small matrix. You should get the same result as your MGS function.<br><br>
1. The *$n\times n$ Hilbert matrix* has entry $\frac{1}{i+j-1}$ in the $(i,j)$ position (where $i$ and $j$ are numbered from 1 to $n$). Write a function `hilbert(n)` that returns this matrix. You can use loops if you like. Be careful to translate from NumPy's zero-based numbering (0 to $n-1$) into the one-based numbering used in the formula. Show that you get the correct matrix for $n=3$.<br><br>
1. GS and MGS produce orthogonal matrices. If $Q$ is an orthogonal matrix, what is $Q^TQ$?
> We will test how good our functions are by measuring how far the actual result is from the desired one. To do this, take the result $Q$, subtract $Q^TQ$ from the matrix we 'should' get (the answer to Part 3 of this question), and find the *matrix norm* of the difference. The matrix norm used here (the Frobenius norm) is the square root of the sum of the squares of the entries of the matrix, and can be computed using `np.linalg.norm(A)`. Call this number the *error*.
1. Generate a random $200\times 200$ matrix using `np.random.rand(200,200)`. Compute the errors we get using each of GS and MGS.<br><br>
1. Consider the matrix `0.00001*np.eye(n)+hilbert(n)`. Compute the errors for this matrix using each of GS and MGS.<br><br>
1. Comment briefly on the results in the previous two parts of this question.

**Note:** While a complete error analysis is beyond the scope of this class, see [here](https://www.math.uci.edu/~ttrogdon/105A/html/Lecture23.html) if you're interested in why MGS is so much better.
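
As a rough illustration of the error measure described in Question 1, the sketch below computes the Frobenius norm of the difference between the identity and $Q^TQ$. The helper name `orthogonality_error` is made up for this example, and `np.linalg.qr` is used only as a stand-in for your own GS or MGS function.

```python
import numpy as np

def orthogonality_error(Q):
    """Frobenius norm of I - Q^T Q (hypothetical helper, not part of the lab code)."""
    n = Q.shape[1]
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm:
    # the square root of the sum of the squared entries.
    return np.linalg.norm(np.eye(n) - Q.T @ Q)

rng = np.random.default_rng(0)
A = rng.random((5, 5))

# Stand-in for the output of GS or MGS: NumPy's built-in QR factorization.
Q, R = np.linalg.qr(A)

print(orthogonality_error(Q))
```

For a well-conditioned matrix the printed error should be close to machine precision; the interesting comparison is how the same measure behaves for your GS and MGS outputs.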

### Homework Question 2

Choose at least one of the following to extend your handwriting recognition work. Doing both well will get extra credit. If you do both, you will have created a handwritten digit recognition system that is about as good as it can be using only the tools we have.

#### Extending your code to compute binary classifiers for all ten digits
1. Compute binary classifiers for each of the digits between 0 and 9.<br><br>
1. Compute error rates for each digit, both on the training set and the test set.<br><br>
1. For each image in the test set, use all ten classifiers to see how many handwritten digits your classifiers give unique answers to. That is, if your classifiers determine that a particular image may be a 1 or a 2, then you cannot classify that particular image. How many images were not recognized as any digit at all?<br><br>
1. Compute your overall success rate on the test images. That is, compute how many images were correctly and uniquely classified. Also compute the rate of *false positives* (that is, images that were identified as digits, but whose label was wrong), and *false negatives* (images whose digit couldn't be assigned). This is how good your handwriting recognition is!


In [1]:
import numpy as np
import numpy.linalg as LA

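def findnum (imagefile,labelfile,num):
    """Fit a least-squares classifier for digit `num` and return its +1/-1 predictions."""
    from MNISTHandwriting import readimgs

    images = readimgs('./data/'+ imagefile)[0].astype('float')
    labels = readimgs('./data/'+ labelfile)[0].astype('float')
    total, pix, _ = np.shape(images)
    newimage = np.reshape(images,(total,pix*pix))
    # Keep only the pixels that are nonzero in at least one image, plus a constant column.
    thing = newimage.sum(axis=0)
    nozero = np.nonzero(thing)[0]
    A = np.ones((total,len(nozero)+1))
    A[:,:len(nozero)] = newimage[:,nozero]
    # +1 where the label equals `num`, -1 otherwise.
    newlabels = 2*(labels==num)-1
    sol = LA.lstsq(A,newlabels,rcond=10**(-10))[0]

    bhat = A @ sol
    modelresults = 2*(bhat >0)-1
    # Number of misclassified images (only needed for the alternative return below).
    error = total-np.sum(newlabels == modelresults)
    return(modelresults)
    #return(error*100/total)

# Sum the ten +1/-1 predictions for each image: a sum of -8 means exactly one
# classifier fired, more than -8 means two or more fired, and -10 means none fired.
summ = np.zeros((1,10000))
twonum = 0
nonum = 0
for i in range (0,10):
    summ+=(findnum('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte',i))
for i in range (0,10000):
    if (summ[0,i]>-8):
        twonum+=1
    if (summ[0,i]<-8):
        nonum+=1
print(twonum,nonum)
# Images claimed by exactly one classifier, as a count and as a fraction.
print(10000-twonum-nonum)
print((10000-twonum-nonum)/10000)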

143 2322
7535
0.7535
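
The comparison against -8 in the cell above relies on a small piece of arithmetic: ten predictions of ±1 sum to $-10 + 2k$ when $k$ classifiers fire. The toy example below is a sketch with made-up votes (the `votes` array is hypothetical, not MNIST data) showing how the column sums separate unique, ambiguous, and unassigned images.

```python
import numpy as np

# Hypothetical outputs of ten binary classifiers (rows) on four images (columns):
# +1 means "this image is my digit", -1 means "it is not".
votes = -np.ones((10, 4), dtype=int)
votes[3, 0] = 1          # image 0: only the "3" classifier fires
votes[[1, 7], 1] = 1     # image 1: both "1" and "7" fire
                         # image 2: no classifier fires
votes[9, 3] = 1          # image 3: only the "9" classifier fires

column_sum = votes.sum(axis=0)   # equals -10 + 2 * (number of classifiers that fired)

unique = column_sum == -8        # exactly one classifier fired
ambiguous = column_sum > -8      # two or more fired
unassigned = column_sum < -8     # none fired

# Predicted digit where the answer is unique, -1 otherwise.
predicted = np.where(unique, votes.argmax(axis=0), -1)

print(column_sum)   # [ -8  -6 -10  -8]
print(predicted)    # [ 3 -1 -1  9]
```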


Compute your overall success rate on the test images. That is, compute how many images were correctly and uniquely classified. Also compute the rate of *false positives* (that is, images that were identified as digits, but whose label was wrong), and *false negatives* (images whose digit couldn't be assigned). This is how good your handwriting recognition is!


In [5]:
def findmatrixnum (imagefile,labelfile):
    """Return the +1/-1 predictions and +1/-1 labels of all ten classifiers, one row per digit."""
    from MNISTHandwriting import readimgs
    images = readimgs('./data/'+ imagefile)[0].astype('float')
    labels = readimgs('./data/'+ labelfile)[0].astype('float')
    total, pix, _ = np.shape(images)
    allresults = np.zeros((10,total))
    alllabels = np.zeros((10,total))
    newimage = np.reshape(images,(total,pix*pix))
    # Keep only the pixels that are nonzero in at least one image, plus a constant column.
    thing = newimage.sum(axis=0)
    nozero = np.nonzero(thing)[0]
    A = np.ones((total,len(nozero)+1))
    A[:,:len(nozero)] = newimage[:,nozero]
    # The matrix A is shared; only the +1/-1 labels change from digit to digit.
    for i in range (0,10):
        newlabels = 2*(labels==i)-1
        sol = LA.lstsq(A,newlabels,rcond=10**(-10))[0]
        bhat = A @ sol
        modelresults = 2*(bhat >0)-1
        allresults[i] = modelresults
        alllabels[i] = newlabels
    return(allresults,alllabels)
allresults, alllabels = (findmatrixnum('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte'))


In [44]:

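# total[d, i] is True where classifier d agrees with the true label of image i.
total = (allresults==alllabels)
sumtotal =  np.sum(total, axis=0)
# Column sums of the +1/-1 predictions: -8 means exactly one classifier fired.
summ = np.sum(allresults, axis=0)

print(sumtotal)
print(summ)
print(total)
totalright = 0   # images assigned exactly one digit (correctly or not)
fpos = 0         # uniquely assigned images where some classifier disagrees with the label
fneg = 0
totalno = 0      # images that no classifier claimed

for i in range (0,10000):
    if summ[i]==-8:
        totalright+=1
        if (sumtotal[i]!=10):
            fpos+=1
            
    elif summ[i]<-8:
        totalno +=1
        # This condition is never true: if no classifier fired, the classifier for
        # the true digit disagrees with the label, so fneg stays 0. The question
        # defines false negatives as images whose digit could not be assigned,
        # which includes every image counted in totalno.
        if (sumtotal[i]==10):
            fneg+=1
print(fpos,totalright)
print(fpos/totalright)
print(fneg,totalno)
print(fneg/totalno)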


[10 10 10 ...  9 10 10]
[ -8.  -8.  -8. ... -10.  -8.  -8.]
[[ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]
 ...
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]
 [ True  True  True ...  True  True  True]]
280 7535
0.0371599203715992
0 2322
0.0
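
For comparison, here is a minimal sketch of the three rates as the question defines them, computed from a vector of predicted digits. The arrays `true_digit` and `predicted` are made-up toy data, with -1 used (as an assumption of this example) to mark an image that was not assigned a unique digit.

```python
import numpy as np

# Toy data, not MNIST: six images with their true digits and the system's answers.
true_digit = np.array([3, 1, 5, 9, 4, 2])
predicted = np.array([3, -1, -1, 4, 4, 2])   # -1 marks "no unique digit assigned"

assigned = predicted >= 0
correct = assigned & (predicted == true_digit)     # correctly and uniquely classified
false_pos = assigned & (predicted != true_digit)   # identified as a digit, but the wrong one
false_neg = ~assigned                              # digit could not be assigned

n = true_digit.size
success_rate = correct.sum() / n
false_pos_rate = false_pos.sum() / n
false_neg_rate = false_neg.sum() / n

print(success_rate, false_pos_rate, false_neg_rate)
```

Every image falls into exactly one of the three groups, so the three rates add up to 1, which is a handy sanity check on real data.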


#### Optimizing your classifier(s)
1. We decided that a negative result from the model means 'this image is not 0', and a non-negative result means 'this image is 0'. Write code that searches for a better threshold than 0. That is, find the threshold that gives you the lowest error rate on the test set.<br><br>
1. What error rate does your optimized threshold give on the test set?<br><br>
1. Why can't you use the test set to optimize the threshold?
    
##### Note: if you do choose to do both, the thresholds may well not be the same for each digit!
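
One possible shape for the threshold search is sketched below. It scans a grid of candidate thresholds and keeps the one with the lowest error rate. The function names, the candidate grid, and the synthetic `scores` and `labels` are all assumptions made for illustration; in your own code the scores would be the raw model outputs (the values of `A @ sol` before thresholding).

```python
import numpy as np

def error_rate(scores, labels, threshold):
    """Fraction of images misclassified when scores >= threshold means 'is the digit'."""
    predictions = np.where(scores >= threshold, 1, -1)
    return np.mean(predictions != labels)

def best_threshold(scores, labels, candidates):
    """Return the candidate threshold with the lowest error rate, and that error rate."""
    errors = np.array([error_rate(scores, labels, t) for t in candidates])
    best = int(np.argmin(errors))
    return candidates[best], errors[best]

# Synthetic stand-in for model outputs: about 10% positives, with scores that
# sit lower than a threshold of 0 would like.
rng = np.random.default_rng(0)
labels = np.where(rng.random(1000) < 0.1, 1, -1)
scores = 0.6 * labels - 0.3 + 0.4 * rng.standard_normal(1000)

candidates = np.linspace(-1.0, 1.0, 201)
t_best, err_best = best_threshold(scores, labels, candidates)

print(t_best, err_best, error_rate(scores, labels, 0.0))
```

A grid search is the simplest choice here; since the error rate only changes when the threshold crosses one of the scores, scanning the sorted scores themselves would be an exact alternative.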