# STA410 Week 6 Programming Assignment (5 points)

0. **Paired or individual assignment.** Create code solutions for these assignments either individually or in the context of a paired effort. 

   > Seek homework partners in class, on the course discussion board on Piazza, etc.
 
    
1. **Paired students each separately submit their (common) work, including (agreeing) contribution of work statements for each problem.**  
  
   > Students must work in accordance with the [University of Toronto’s Code of Behaviour on Academic Matters](https://governingcouncil.utoronto.ca/secretariat/policies/code-behaviour-academic-matters-july-1-2019) (and see also http://academicintegrity.utoronto.ca.); however, students working in pairs may share work without restriction within their pair. Getting and sharing "hints" from other classmates is encouraged; but, the eventual code creation work and submission must be your own individual or paired creation.
      
2. **Do not delete, replace, or rearrange cells** as this erases `cell ids` upon which automated code tests are based.

   > The "Edit > Undo Delete Cells" option in the notebook editor might be helpful; otherwise, redownload the notebook (so it has the correct required `cell ids`) and repopulate it with your answers (assuming you don't overwrite them when you redownload the notebook).
  >> ***If you are working in any environment other than*** [UofT JupyterHub](https://jupyter.utoronto.ca/hub/user-redirect/git-pull?repo=https://github.com/pointOfive/sta410hw0&branch=master), [Google Colab](https://colab.research.google.com/github/pointOfive/sta410hw0/blob/master/sta410hw0.ipynb), or [UofT JupyterLab](https://jupyter.utoronto.ca/hub/user-redirect/git-pull?repo=https://github.com/pointOfive/sta410hw0&branch=master&urlpath=/lab/tree/sta410hw0), your system must meet the following versioning requirements 
   >>
   >>   - [notebook format >=4.5](https://github.com/jupyterlab/jupyterlab/issues/9729) 
   >>   - jupyter [notebook](https://jupyter.org/install#jupyter-notebook) version [>=6.2](https://jupyter-notebook.readthedocs.io/en/stable/) for "classic" notebooks served by [jupyterhub](https://jupyterhub.readthedocs.io/en/stable/quickstart.html)
   >>   - [jupyterlab](https://jupyter.org/install) version [>=3.0.13](https://github.com/jupyterlab/jupyterlab/releases/tag/v3.0.13) for "jupyterlab" notebooks  
   >>    
   >> otherwise `cell ids` may not be supported and you will not get any credit for your submitted homework.
   >>
   >> You may check if `cell ids` are present and working by running the following command in a cell 
   >>
   >> `! grep '"id":' <path/to/notebook>.ipynb`
   >>
   >> and making sure the `cell ids` **do not change** when you save your notebook.
   
3. ***You may add cells for scratch work***, but required answers that are not submitted through the provided cells where they are requested may not be marked.

 
4. **No cells may have any runtime errors** because this causes subsequent automated code tests to fail and you will not get marks for tests which fail because of previous runtime errors. 

  > Runtime errors include, e.g., unassigned variables, mismatched parentheses, and any code which does not work when the notebook cells are sequentially run, even if it was provided for you as part of the starter code. ***It is best to restart and re-run the cells in your notebook to ensure there are no runtime errors before submitting your work.***
  >
  > - A `try`-`except` block catches exceptions raised at runtime so that they do not halt execution or cause subsequent automated code tests to fail.  


5. **No jupyter shortcut commands** such as `! python script.py 10` or `%%timeit` may be included in the final submission as they will cause subsequent automated code tests to fail.

   > ***Comment out ALL jupyter shortcut commands***, e.g., `# ! python script.py 10` or `# %%timeit` in submitted notebooks.


6. **Python library imports are limited** to only libraries imported in the starter code and the [standard python modules](https://docs.python.org/3/py-modindex.html). Importing additional libraries will cause subsequent automated code tests to fail.

  > Unless a problem instructs differently, you may use any functions available from the libraries imported in the starter code; otherwise, you are expected to create your own Python functionality based on the Python stdlib (standard library, i.e., base Python and standard Python modules).


7. You are encouraged to adapt code you find available online into your notebook; however, if you do so please provide a link to the utilized resource. ***If failure to cite such references is identified and confirmed, your mark will be immediately reduced to 0.***  
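
As a small illustration of the `try`-`except` syntax mentioned in rule 4, the sketch below wraps a risky operation so that a failure is handled instead of stopping the notebook. The function name `safe_divide` and the choice of `ZeroDivisionError` are only illustrative assumptions, not part of the assignment. Note that a syntax error, such as mismatched parentheses, is raised when a cell is parsed, so a `try`-`except` block in that same cell cannot catch it.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None if the division fails.

    Hypothetical helper used only to illustrate try/except.
    """
    try:
        return numerator / denominator
    except ZeroDivisionError as err:
        # The exception is handled here, so later cells still run.
        print(f"Handled error: {err}")
        return None


result_ok = safe_divide(1.0, 4.0)   # ordinary division
result_bad = safe_divide(1.0, 0.0)  # would raise without the try/except
```

Catching a specific exception type, rather than using a bare `except:`, keeps unrelated bugs visible while still protecting the cells that follow.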

In [None]:
# Unless a problem instructs differently, you may use any functions available from the following library imports
import numpy as np
import tensorflow as tf
import logging, os
import matplotlib.pyplot as plt

# Problem 0 (required)

Are you working with a partner to complete this assignment?  
- If not, assign  the value of `None` into the variable `Partner`.
- If so, assign the name of the person you worked with into the variable `Partner`.
    - Format the name as `"<First Name> <Last Name>"` as a `str` type, e.g., "Scott Schwartz".

In [None]:
# Required: earns no points when completed, but you will lose points if it is left incomplete
Partner = #None
# This cell will produce a runtime error until you assign a value to this variable

What was your contribution in completing the code for this assignment's problems? Assign one of the following into each of the `Problem_X` variables below.

- `"I worked alone"`
- `"I contributed more than my partner"`
- `"My partner and I contributed equally"`
- `"I contributed less than my partner"`
- `"I did not contribute"`

In [None]:
# Required: earns no points when completed, but you will lose points if it is left incomplete
Problem_1 = #"I worked alone"
Problem_2 = #"I worked alone"
# This cell will produce a runtime error until you assign a value to this variable

# Problem 1 (5 points)

Complete the function `newtons_method(f, x0, K=10, eps=1e-7)` for use with the $d$-variate [Schwefel function](https://www.sfu.ca/~ssurjano/schwef.html)

$$418.9829d - \sum_{i=1}^d x_i\sin\left(\sqrt{|x_i|}\right)$$


  *This problem draws upon the outstanding materials created by [Sonja Surjanovic and Derek Bingham](https://www.sfu.ca/~ssurjano/index.html) of the [Department of Statistics and Actuarial Science at Simon Fraser University](https://www.sfu.ca/stat-actsci.html); specifically, their [optimization resources](https://www.sfu.ca/~ssurjano/optimization.html), which include an extensive collection of multimodal functions.*  
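
To make the formula concrete, here is a minimal NumPy sketch of the same function. The name `schwefel_np` is a hypothetical helper introduced only for this check; it is not the TensorFlow `schwefel` used by the assignment. The second evaluation uses the global minimizer reported on the linked Schwefel page, which lies outside the $[-100,100]$ box considered later in this assignment.

```python
import numpy as np


def schwefel_np(x):
    """NumPy version of the d-variate Schwefel function (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    return 418.9829 * x.size - np.sum(x * np.sin(np.sqrt(np.abs(x))))


# At the origin every x_i * sin(sqrt(|x_i|)) term vanishes, leaving 418.9829 * d.
f_origin = schwefel_np([0.0, 0.0, 0.0])

# Global minimizer reported on the linked page; the value should be close to 0.
f_reference = schwefel_np([420.9687] * 3)
```

Comparing a hand-written version like this against the TensorFlow implementation at a few points is one way to sanity-check the starter code before differentiating it.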

In [None]:
import logging, os
logging.disable(logging.WARNING)
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

d = 3
@tf.function(input_signature=(tf.TensorSpec(shape=[d], dtype=tf.float32), ))
def schwefel(x):
    y = tf.math.reduce_sum(x*tf.math.sin(tf.math.sqrt(tf.math.abs(x))))
    return 418.9829*x.shape[0] - y

def newtons_method(f, x0, K=10, eps=1e-7):
    
    '''
    Newton's Method with TensorFlow
    
    f   : @tf.function(input_signature=(tf.TensorSpec(shape=[d], dtype=tf.float32), ))
    x0  : [x0_0, x0_1, ..., x0_(d-1)] list initialization 
    K   : (default 10) number of Newton Method steps
    eps : (default 1e-7) stopping criterion `||x_k - x_(k-1)||_2<eps`
    
    returns x_K.numpy().tolist()+[f(x_k).numpy()]
            where `_k` is the last update made on which a stopping criterion (based on K or eps) was met
    '''

    x_k = tf.Variable(x0)
    
    # <complete>
    # Note: Don't actually invert the matrix: X(k+1) = X(k) - (∇²f(X(k)))^-1 @ ∇f(X(k))
    # Solve for X(k+1) using tf.linalg.solve...
    
    return x_k.numpy(),f(x_k).numpy()

## Hints
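
The starter-code note about solving a linear system instead of inverting the Hessian can be sketched with NumPy on a toy problem. Everything here is an assumption made for illustration: the convex objective $\frac{1}{2}x^TAx - b^Tx + \frac{1}{4}\sum_i x_i^4$, the names `A`, `b`, `grad`, `hess`, and `newton_sketch`, and the use of hand-derived derivatives in place of TensorFlow automatic differentiation. It is not a solution to Problem 1.

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])


def grad(x):
    """Gradient of the toy objective 0.5*x@A@x - b@x + 0.25*sum(x**4)."""
    return A @ x - b + x**3


def hess(x):
    """Hessian of the same toy objective."""
    return A + np.diag(3.0 * x**2)


def newton_sketch(x0, K=10, eps=1e-7):
    """Take up to K Newton steps, stopping early when ||x_k - x_(k-1)||_2 < eps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(K):
        # Solve H @ step = g for the step rather than forming the inverse of H.
        step = np.linalg.solve(hess(x), grad(x))
        x_new = x - step
        converged = np.linalg.norm(x_new - x) < eps
        x = x_new
        if converged:
            break
    return x


x_star = newton_sketch([1.0, 1.0], K=50, eps=1e-10)
```

Solving the system directly is generally cheaper and numerically more stable than computing an explicit inverse and multiplying, which is why the starter code points to `tf.linalg.solve`.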

- Examples of how to use TensorFlow to compute higher order partial derivatives are given here: https://www.tensorflow.org/guide/advanced_autodiff
- You may ignore warning messages regarding "triggered tf.function retracing":
    - these indicate that the same function is being repeatedly placed into the automatic differentiation graph, which happens intentionally in ***Newton's method*** since partial derivatives are being recalculated at different locations for each ***Newton step*** inside `for k in range(K)`.
    - and the warnings may be silenced with 
    
    ```python
    import logging, os
    logging.disable(logging.WARNING)
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
    ```

- If the computation of the ***Hessian*** $H$ is not ***symmetric***, $(H + H^T)/2$ will be ***symmetric***. 
- If the computation of the ***Hessian*** $H$ has `NaN`s or `0` diagonal elements then "monkey patch" them on the basis of the following functions 
    - `tf.where(tf.math.is_nan(H) & (tf.eye(H.shape[0])==1), 1e-7, H)`
    - where, e.g., `H = np.ones((2,2)); H[0,0] = np.nan; H = tf.Variable(H)`
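
The last two hints can be tried out in plain NumPy before porting them to TensorFlow. This is a sketch under stated assumptions: the example matrix `H` is made up, `np.where` stands in for `tf.where`, and extending the mask to cover zero diagonal entries (not only `NaN`s) is one possible reading of the hint.

```python
import numpy as np

# Made-up 2x2 "Hessian" with a NaN and a zero on the diagonal and unequal off-diagonals.
H = np.array([[np.nan, 1.0],
              [3.0,    0.0]])

# Replace NaN or zero entries, on the diagonal only, with a small positive value.
on_diagonal = np.eye(H.shape[0], dtype=bool)
needs_patch = (np.isnan(H) | (H == 0.0)) & on_diagonal
H_patched = np.where(needs_patch, 1e-7, H)

# Average with the transpose to force symmetry.
H_sym = (H_patched + H_patched.T) / 2.0
```

Patching happens before symmetrizing because any `NaN` left in `H` would propagate through the average with the transpose.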

## Problem 1 Questions 0-1 (2 points)

Local minima will be found with your function for various initializations and parameter settings.

- You do not need to make any variable assignments: your function will be called based on the parameterization specified in the problem prompt.

In [None]:
# Cell for scratch work

# You are welcome to add as many new cells into this notebook as you would like.
# Just do not leave it in a state that will produce runtime errors when notebook cells are run sequentially.

# Any cells included for scratch work that are no longer needed may be deleted so long as 
# - all the required functions are still defined and available when called
# - no cells requiring variable assignments are deleted.

# None of this will cause problems with `cell ids` assuming your versioning supports `cell ids`
# (as UofT JupyterHub, UofT JupyterLab, and Google Colab will).


In [None]:
# Cell for scratch work


## Problem 1 Question 2 (2 points)

2. What is the location of the minimum value of the $d=3$ ***Schwefel function*** subject to the constraint $x_1, x_2, x_3 \in [-100,100]$ and what is that minimum value?

***Hint***: use the following code to find a good initial value for your `newtons_method` function.

```python
grid_n = 11
grid = np.meshgrid(*[np.linspace(100,-100,grid_n) for i in range(3)])
f = grid[0].copy()
f_min = grid[0].copy()
for i in range(grid_n):
    for j in range(grid_n):
        for k in range(grid_n):
            f[i,j,k] = schwefel(tf.Variable([grid[0][i,j,k],grid[1][i,j,k],grid[2][i,j,k]], dtype=tf.float32))

# some things to look at
#min(f.ravel())
#for i in range(grid_n):
#    print(min(f[i].ravel()))
# min(f.ravel()), min(f[i].ravel())
# plt.imshow(f[i])
# grid[0][f==min(f.ravel())],grid[1][f==min(f.ravel())],grid[2][f==min(f.ravel())]
            
# this code is more general for different d
#grid = np.meshgrid(*[np.linspace(100,-100,grid_n) for i in range(d)])
#f = grid[0].flatten()
#for i,x0 in enumerate(zip(*[g.ravel() for g in grid])):
#    f[i] = schwefel(tf.Variable(x0, dtype=tf.float32))
#min(f)      
```
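
The same grid search can also be written without explicit loops. The sketch below is an alternative under stated assumptions: `schwefel_grid` is a hypothetical NumPy reimplementation that evaluates the function over whole coordinate arrays, and it is not the TensorFlow `schwefel` defined earlier. The resulting `x_init` is only a candidate starting point for Newton's method, not the final answer.

```python
import numpy as np


def schwefel_grid(*coords):
    """Schwefel function evaluated elementwise over coordinate arrays (illustrative helper)."""
    terms = sum(c * np.sin(np.sqrt(np.abs(c))) for c in coords)
    return 418.9829 * len(coords) - terms


grid_n = 11
axes = [np.linspace(-100, 100, grid_n) for _ in range(3)]
grid = np.meshgrid(*axes, indexing="ij")      # three arrays of shape (grid_n, grid_n, grid_n)
f = schwefel_grid(*grid)                      # function value at every grid point

# Index of the smallest grid value, then the coordinates at that index.
best = np.unravel_index(np.argmin(f), f.shape)
x_init = [float(g[best]) for g in grid]
```

Evaluating all `grid_n**3` points in one vectorized call avoids building a TensorFlow variable per grid point, which is the slow part of the looped version.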

In [None]:
# 2 points [format: tuple of four numbers; or, the call to the function below
                                         # with a good `initial_value` choice]
p1q2 = # (x1,x2,x3,y) # newtons_method(schwefel, <initial_value>)

# This cell will produce a runtime error until you assign a value to this variable

In [None]:
# Cell for scratch work

# You are welcome to add as many new cells into this notebook as you would like.
# Just do not leave it in a state that will produce runtime errors when notebook cells are run sequentially.

# Any cells included for scratch work that are no longer needed may be deleted so long as 
# - all the required functions are still defined and available when called
# - no cells requiring variable assignments are deleted.

# None of this will cause problems with `cell ids` assuming your versioning supports `cell ids`
# (as UofT JupyterHub, UofT JupyterLab, and Google Colab will).


In [None]:
# Cell for scratch work


## Problem 1 Questions 3-4 (1 point)

3. (0.5 points) Why is the choice of the initial value important for finding a global optimum for a function like the `schwefel` function? 
    
    1. To increase the speed of convergence of the `newtons_method` function
    2. Because the `schwefel` function is not convex
    3. Because the `newtons_method` function won't converge for all initial values 
    4. It is not important


4. (0.5 points) What's wrong with running the `newtons_method` function for every initial value in the `grid` for the previous problem? 

    1. It takes a very long time
    2. The grid is not dense enough
    3. Nothing is wrong with it -- it is recommended
    4. It's cheating to find an optimum with a grid search



In [None]:
# 0.5 points each [format: `str` either "A" or "B" or "C" or "D" for choices 1, 2, 3, or 4 above, respectively]
p1q3 = #<"A"|"B"|"C"|"D"> 
p1q4 = #<"A"|"B"|"C"|"D"> 
# Uncomment the above and keep exactly one of "A", "B", "C", or "D" for each

# This cell will produce a runtime error until the `p1q3` and `p1q4` variables are assigned values