<img src="https://news.illinois.edu/files/6367/543635/116641.jpg" alt="University of Illinois" width="250"/>

## HW: Deep Learning ##

HW submission by group (up to 4 people)
* John Doe <johndoe@illinois.edu>
* Jane Roe <janeroe@illinois.edu>

### imports and graphics configurations ###

In [None]:
import numpy
import pandas
import time
import random
import matplotlib
#%matplotlib notebook
import matplotlib.pyplot as plt
import scipy.stats
import matplotlib.offsetbox as offsetbox
from matplotlib.ticker import StrMethodFormatter

In [None]:
#for some reason, this needs to be in a separate cell
params={
    "font.size":15,
    "lines.linewidth":5,
}
plt.rcParams.update(params)

# **Technology** #

**Technology:** Compute $\cos(k\pi/10)$ for $k\in \{0,1,2,\dots,20\}$
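
One possible way to tabulate these values is sketched below with NumPy. The variable names `k` and `values` are illustrative choices, not part of the assignment.

```python
import numpy as np

# k = 0, 1, ..., 20 (21 values in total)
k = np.arange(21)

# Vectorized evaluation of cos(k*pi/10)
values = np.cos(k * np.pi / 10)

for ki, vi in zip(k, values):
    print(f"k={ki:2d}  cos(k*pi/10)={vi:+.6f}")
```

Since $k\pi/10$ sweeps one full period as $k$ runs from 0 to 20, the table should start and end at 1 and pass through $-1$ at $k=10$.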

# **Linear Regression** #

**Feature Importance:** Consider linear regression of price upon the feature set
* Square Feet
* number of Beds
* number of Baths
* Year built
* HOA/Month

One by one, remove (using sklearn if you like) each of these features and repeat linear regression.
* Rank features by how the loss (mean square error) changes as each of the features is removed
* Rank features by how the metric (mean absolute error) changes as each of the features is removed
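
The leave-one-feature-out procedure above can be sketched as follows. This is only an illustration on synthetic data: the column names (`SQUARE FEET`, `BEDS`, `BATHS`, `YEAR BUILT`, `HOA/MONTH`, `PRICE`), the made-up price formula, and the helper `ablation_table` are all assumptions standing in for the real dataset and for whatever workflow you prefer.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Synthetic stand-in for the housing data (assumed column names and coefficients)
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "SQUARE FEET": rng.uniform(500, 3000, n),
    "BEDS": rng.integers(1, 6, n),
    "BATHS": rng.integers(1, 4, n),
    "YEAR BUILT": rng.integers(1900, 2021, n),
    "HOA/MONTH": rng.uniform(0, 500, n),
})
df["PRICE"] = (200 * df["SQUARE FEET"] + 10000 * df["BEDS"] + 5000 * df["BATHS"]
               + 100 * (df["YEAR BUILT"] - 1900) + 50 * df["HOA/MONTH"]
               + rng.normal(0, 10000, n))

def ablation_table(data, features, target):
    """Refit linear regression with each feature removed; report change in MSE and MAE."""
    def scores(cols):
        model = LinearRegression().fit(data[cols], data[target])
        pred = model.predict(data[cols])
        return (mean_squared_error(data[target], pred),
                mean_absolute_error(data[target], pred))

    base_mse, base_mae = scores(features)
    rows = []
    for feat in features:
        kept = [c for c in features if c != feat]
        mse, mae = scores(kept)
        rows.append({"removed": feat,
                     "delta_mse": mse - base_mse,
                     "delta_mae": mae - base_mae})
    return pd.DataFrame(rows).set_index("removed")

features = ["SQUARE FEET", "BEDS", "BATHS", "YEAR BUILT", "HOA/MONTH"]
table = ablation_table(df, features, "PRICE")

# Two rankings: one by the loss (MSE), one by the metric (MAE)
print(table.sort_values("delta_mse", ascending=False))
print(table.sort_values("delta_mae", ascending=False))
```

The sketch scores on the training data itself; a held-out split (for example via `sklearn.model_selection.train_test_split`) would be a reasonable variation.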

# **Logistic Regression** #

**Categorical Complexity:** Using the data from the lecture, try to use Logistic Regression (you can use sklearn) to predict Townhouse vs Not-Townhouse.  What happens?  Discuss the loss, compared to what we did in lecture.
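
A minimal sketch of the binary setup is given below. It does not use the lecture data: the synthetic frame, its column names (`PROPERTY TYPE`, `SQUARE FEET`, `BEDS`), and the square-footage offsets per property type are assumptions chosen only to make the example run. It shows how to build the Townhouse vs. Not-Townhouse label and how to compare the fitted cross-entropy loss against a constant-probability baseline.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the lecture data (assumed columns and values)
rng = np.random.default_rng(0)
n = 300
types = rng.choice(["Condo/Co-op", "Townhouse", "Single Family Residential"], n)
typical_sqft = {"Condo/Co-op": 900, "Townhouse": 1600, "Single Family Residential": 2300}
df = pd.DataFrame({
    "PROPERTY TYPE": types,
    "SQUARE FEET": np.array([typical_sqft[t] for t in types]) + rng.normal(0, 300, n),
    "BEDS": rng.integers(1, 6, n),
})

# Binary target: 1 for Townhouse, 0 for everything else
X = df[["SQUARE FEET", "BEDS"]]
y = (df["PROPERTY TYPE"] == "Townhouse").astype(int)

# Standardize features, then fit logistic regression
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X, y)

proba = clf.predict_proba(X)[:, 1]
loss = log_loss(y, proba)

# Baseline: always predict the overall Townhouse frequency
baseline = log_loss(y, np.full(n, y.mean()))

print(f"fitted cross-entropy:   {loss:.4f}")
print(f"baseline cross-entropy: {baseline:.4f}")
```

Comparing the fitted loss with the baseline is one way to judge how much the features actually help separate the two classes.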


# **Backpropagation** #

**Numerical Backpropagation** Define
$\phi_n(x)= \cos(2^nx)$ for $x\in \mathbb{R}$ and $n\in \{1,2,3\}$.  Set $x=5$ and define
$$f_5(m_1,m_2,m_3)= \exp\left[\pi \phi_3(m_3\phi_2(m_2\phi_1(m_1x)))\right]$$
for $(m_1,m_2,m_3)\in \mathbb{R}^3$.
* Numerically compute $\lim_{\varepsilon\to 0}\{f_5(10,9,8+\varepsilon)-f_5(10,9,8)\}/\varepsilon$
* Compute $\frac{\partial f_5}{\partial m_3}(10,9,8)$ using pytorch
* Numerically compute $\lim_{\varepsilon\to 0}\{f_5(10,9+\varepsilon,8)-f_5(10,9,8)\}/\varepsilon$
* Compute $\frac{\partial f_5}{\partial m_2}(10,9,8)$ using pytorch
* Numerically compute $\lim_{\varepsilon\to 0}\{f_5(10+\varepsilon,9,8)-f_5(10,9,8)\}/\varepsilon$
* Compute $\frac{\partial f_5}{\partial m_1}(10,9,8)$ using pytorch.
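
As a cross-check for the numerical and PyTorch answers, the chain rule gives closed forms for the three partial derivatives. The intermediate names $a$ and $b$ below are shorthand introduced here, not part of the assignment. With $x=5$,

$$a=\phi_1(m_1x)=\cos(2m_1x),\qquad b=\phi_2(m_2a)=\cos(4m_2a),\qquad f_5=\exp\left[\pi\cos(8m_3b)\right],$$

differentiating from the outside in gives

$$\frac{\partial f_5}{\partial m_3}=-8\pi\, b\,\sin(8m_3b)\,f_5$$

$$\frac{\partial f_5}{\partial m_2}=32\pi\, m_3\, a\,\sin(8m_3b)\,\sin(4m_2a)\,f_5$$

$$\frac{\partial f_5}{\partial m_1}=-64\pi\, m_2 m_3 x\,\sin(8m_3b)\,\sin(4m_2a)\,\sin(2m_1x)\,f_5$$

Each additional layer contributes one more factor of the form $-\sin(\cdot)$ times the inner derivative, which is the pattern backpropagation accumulates.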

In [8]:
import os
import numpy as np
import pandas
import time
import random
import matplotlib
#%matplotlib notebook
import matplotlib.pyplot as plt
import scipy.stats
#from pandas.plotting import autocorrelation_plot
import matplotlib.offsetbox as offsetbox
from matplotlib.ticker import StrMethodFormatter
import sklearn.linear_model
import sklearn.model_selection
import itertools


def saver(fname):
    plt.savefig(fname+".png",bbox_inches="tight")

def legend(pos="bottom",ncol=3):
    if pos=="bottom":
        plt.legend(bbox_to_anchor=(0.5,-0.2), loc='upper center',facecolor="lightgray",ncol=ncol)
    elif pos=="side":
        plt.legend(bbox_to_anchor=(1.1,0.5), loc='center left',facecolor="lightgray",ncol=1)

def textbox(txt,fname=None):
    plt.figure(figsize=(1,1))
    plt.gca().add_artist(offsetbox.AnchoredText("\n".join(txt), loc="center",prop=dict(size=30)))
    plt.axis('off')
    if fname is not None:
        saver(fname)
    plt.show()
    plt.close()

In [2]:
import torch
import numpy

In [3]:
#for some reason, this needs to be in a separate cell
params={
    "font.size":15,
    "lines.linewidth":5
}
plt.rcParams.update(params)

In [4]:
pngfiles=[f for f in os.listdir(".") if f.endswith(".png")]
print("existing png files: "+str(pngfiles))
#print([os.remove(f) for f in pngfiles])

existing png files: ['dataset.png']


In [5]:
def getfile(location_pair,**kwargs): #tries to get local version and then defaults to google drive version
    (loc,gdrive)=location_pair
    try:
        out=pandas.read_csv(loc,**kwargs)
    except FileNotFoundError:
        print("local file not found; accessing Google Drive")
        loc = 'https://drive.google.com/uc?export=download&id='+gdrive.split('/')[-2]
        out=pandas.read_csv(loc,**kwargs)
    return out

In [7]:
#torch.autograd.Variable is deprecated; tensors track gradients directly
m1 = torch.tensor(10.0, requires_grad=True)
m2 = torch.tensor(9.0, requires_grad=True)
m3 = torch.tensor(8.0, requires_grad=True)
x = 5
f = torch.exp(torch.pi*torch.cos(8*m3*torch.cos(4*m2*torch.cos(2*x*m1))))
f.backward()
print("df/dm1={0:.3f}".format(m1.grad.item()))
print("df/dm2={0:.3f}".format(m2.grad.item()))
print("df/dm3={0:.3f}".format(m3.grad.item()))

df/dm1=-45.408
df/dm2=-0.859
df/dm3=-0.080


In [10]:
#simple way: forward differences with a shrinking step ep
m1 = 10
m2 = 9
m3 = 8
x = 5
f_base = np.exp(np.pi*np.cos(8*m3*np.cos(4*m2*np.cos(2*x*m1))))
for i in range(10):
    ep = 10**(-i)
    #perturb one parameter at a time, holding the other two fixed
    f_m3 = np.exp(np.pi*np.cos(8*(m3+ep)*np.cos(4*m2*np.cos(2*x*m1))))
    f_m2 = np.exp(np.pi*np.cos(8*m3*np.cos(4*(m2+ep)*np.cos(2*x*m1))))
    f_m1 = np.exp(np.pi*np.cos(8*m3*np.cos(4*m2*np.cos(2*x*(m1+ep)))))
    #columns: df/dm3, df/dm2, df/dm1
    print((f_m3 - f_base)/ep, (f_m2 - f_base)/ep, (f_m1 - f_base)/ep)

0.1903870049025721 4.942395421647324 0.13800803779648446
0.4110421763951727 36.28343078516956 34.499219074021
-0.041519183964310036 4.495466642259322 258.5440048977864
-0.07585113891179357 -0.415842832777194 23096.374457213762
-0.07932501762626665 -0.8141329631292799 78.91095290327867
-0.07967294623351107 -0.8548032975737306 -32.91356145962501
-0.07970774458521124 -0.8588806134565696 -44.15079576482878
-0.07971122407968156 -0.8592884444269533 -45.2885551043003
-0.0797115824735517 -0.8593292973035904 -45.40249018136766
-0.0797116400663711 -0.8593335973361427 -45.41391728646315
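
The forward differences above have error proportional to the step, which is why the $m_1$ column settles slowly. A central difference, whose error shrinks with the square of the step, is one possible refinement. The sketch below is an assumption-laden illustration: the helpers `f5` and `central_difference` and the step `h=1e-6` are choices made here, not part of the assignment.

```python
import numpy as np

def f5(m1, m2, m3, x=5.0):
    """f_5(m1, m2, m3) = exp(pi * phi_3(m3 * phi_2(m2 * phi_1(m1 * x))))."""
    return np.exp(np.pi * np.cos(8 * m3 * np.cos(4 * m2 * np.cos(2 * x * m1))))

def central_difference(func, point, index, h=1e-6):
    """Approximate the partial derivative of func at point along coordinate index."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)

point = (10.0, 9.0, 8.0)
grads = [central_difference(f5, point, i) for i in range(3)]

for name, g in zip(("m1", "m2", "m3"), grads):
    print(f"df/d{name} ~ {g:.4f}")
```

Note that this runs in float64, whereas the PyTorch cell uses float32 tensors, so small differences in the last printed digits between the two approaches are expected.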


In [26]:
#alternate way: take the limit symbolically with sympy
#(sympy functions are used throughout; numpy ufuncs cannot act on sympy expressions)
import sympy
m1 = 10
m2 = 9
m3 = 8
x = 5
e = sympy.symbols('e')
def f5(m1, m2, m3):
    return sympy.exp(sympy.pi*sympy.cos(8*m3*sympy.cos(4*m2*sympy.cos(2*x*m1))))
#difference quotient in m3, then its limit as e -> 0
y = sympy.limit((f5(m1, m2, m3+e) - f5(m1, m2, m3))/e, e, 0)
print(sympy.N(y))

In [22]:
# import only the sympy names used below
from sympy import symbols, sin, limit

x = symbols('x')
expr = sin(x)/x

print("Expression : {}".format(expr))

# Use sympy.limit() method
limit_expr = limit(expr, x, 0)

print("Limit of the expression tends to 0 : {}".format(limit_expr))

Expression : sin(x)/x
Limit of the expression tends to 0 : 1


In [21]:
sin(1)

sin(1)

# **FeedForward networks** #

**Redfin Price Prediction:** Download property data from Redfin <https://www.redfin.com/> for several neighborhoods of Chicago.  Use multilayer neural networks to predict price based upon the feature set
* Square Feet
* Property Type
* number of Beds
* number of Baths
* Year built
* HOA/Month
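
A possible starting point for the network is sketched below in PyTorch. Everything about the data is an assumption: the frame is synthetic, the column names mimic a Redfin-style export but are not confirmed by this notebook, and the layer sizes, learning rate, and epoch count are arbitrary illustrative choices. The sketch shows one-hot encoding of the categorical Property Type, standardization, and a small two-hidden-layer regressor.

```python
import numpy as np
import pandas as pd
import torch

torch.manual_seed(0)
rng = np.random.default_rng(0)

# Synthetic stand-in for downloaded Redfin data (assumed columns and values)
n = 256
df = pd.DataFrame({
    "SQUARE FEET": rng.uniform(500, 3000, n),
    "PROPERTY TYPE": rng.choice(
        ["Condo/Co-op", "Townhouse", "Single Family Residential"], n),
    "BEDS": rng.integers(1, 6, n),
    "BATHS": rng.integers(1, 4, n),
    "YEAR BUILT": rng.integers(1900, 2021, n),
    "HOA/MONTH": rng.uniform(0, 500, n),
})
df["PRICE"] = 200 * df["SQUARE FEET"] + 20000 * df["BEDS"] + rng.normal(0, 10000, n)

# One-hot encode the categorical feature, then standardize inputs and target
X = pd.get_dummies(df.drop(columns="PRICE"), columns=["PROPERTY TYPE"]).astype("float32")
X = (X - X.mean()) / X.std()
y = (df["PRICE"] - df["PRICE"].mean()) / df["PRICE"].std()

X_t = torch.tensor(X.to_numpy(), dtype=torch.float32)
y_t = torch.tensor(y.to_numpy(), dtype=torch.float32).unsqueeze(1)

# Small multilayer network: two hidden layers with ReLU activations
model = torch.nn.Sequential(
    torch.nn.Linear(X_t.shape[1], 16), torch.nn.ReLU(),
    torch.nn.Linear(16, 16), torch.nn.ReLU(),
    torch.nn.Linear(16, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = torch.nn.MSELoss()

# Full-batch training loop, recording the loss at every epoch
losses = []
for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X_t), y_t)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

With real data, a train/test split and handling of missing values (for example, listings with no HOA fee) would need to be added before this is a fair evaluation.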

---