# <img src="https://github.com/JuliaLang/julia-logo-graphics/raw/master/images/julia-logo-color.png" height="100" /> _Colab Notebook Template_

## Instructions
1. Work on a copy of this notebook: _File_ > _Save a copy in Drive_ (you will need a Google account). Alternatively, you can download the notebook using _File_ > _Download .ipynb_, then upload it to [Colab](https://colab.research.google.com/).
2. If you need a GPU: _Runtime_ > _Change runtime type_ > _Hardware accelerator_ = _GPU_.
3. Execute the following cell (click on it and press Ctrl+Enter) to install Julia, IJulia and other packages (if needed, update `JULIA_VERSION` and the other parameters). This takes a couple of minutes.
4. Reload this page (press Ctrl+R, or ⌘+R, or the F5 key) and continue to the next section.

_Notes_:
* If your Colab Runtime gets reset (e.g., due to inactivity), repeat steps 2, 3 and 4.
* After installation, if you want to change the Julia version or activate/deactivate the GPU, you will need to reset the Runtime: _Runtime_ > _Factory reset runtime_ and repeat steps 3 and 4.

In [None]:
%%shell
set -e

#---------------------------------------------------#
JULIA_VERSION="1.6.0" # any version ≥ 0.7.0
JULIA_PACKAGES="IJulia BenchmarkTools Plots"
JULIA_PACKAGES_IF_GPU="CUDA" # or CuArrays for older Julia versions
JULIA_NUM_THREADS=2
#---------------------------------------------------#

if [ -n "$COLAB_GPU" ] && [ -z `which julia` ]; then
  # Install Julia
  JULIA_VER=`cut -d '.' -f -2 <<< "$JULIA_VERSION"`
  echo "Installing Julia $JULIA_VERSION on the current Colab Runtime..."
  BASE_URL="https://julialang-s3.julialang.org/bin/linux/x64"
  URL="$BASE_URL/$JULIA_VER/julia-$JULIA_VERSION-linux-x86_64.tar.gz"
  wget -nv $URL -O /tmp/julia.tar.gz # -nv means "not verbose"
  tar -x -f /tmp/julia.tar.gz -C /usr/local --strip-components 1
  rm /tmp/julia.tar.gz

  # Install Packages
  if [ "$COLAB_GPU" = "1" ]; then
      JULIA_PACKAGES="$JULIA_PACKAGES $JULIA_PACKAGES_IF_GPU"
  fi
  for PKG in `echo $JULIA_PACKAGES`; do
    echo "Installing Julia package $PKG..."
    julia -e 'using Pkg; pkg"add '$PKG'; precompile;"' &>/dev/null
  done

  # Install kernel and rename it to "julia"
  echo "Installing IJulia kernel..."
  julia -e 'using IJulia; IJulia.installkernel("julia", env=Dict(
      "JULIA_NUM_THREADS"=>"'"$JULIA_NUM_THREADS"'"))'
  KERNEL_DIR=`julia -e "using IJulia; print(IJulia.kerneldir())"`
  KERNEL_NAME=`ls -d "$KERNEL_DIR"/julia*`
  mv -f $KERNEL_NAME "$KERNEL_DIR"/julia  

  echo ''
  echo "Success! Please reload this page and jump to the next section."
fi
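
Once the page has been reloaded, a quick check along the following lines can confirm that the notebook is now running the Julia kernel. This is a minimal sketch and not part of the installation cell above; it only uses `VERSION` and `Threads.nthreads()` from Julia's standard library.

```julia
# Report the running Julia version and the number of threads available to the kernel.
println("Julia version: ", VERSION)
println("Threads: ", Threads.nthreads())
```

If the kernel was installed by the cell above, the thread count should match `JULIA_NUM_THREADS`.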



### **[An implementation of Hypergradient Descent on Stochastic Gradient Descent]**

&nbsp;

### **ARGS:**

* **x_guess**: initial guess for the optimum.
* **2x-2**: derivative of the function being optimized (hard-coded in the function as `fd`); for a univariate function this is the gradient.
* **learn_rate**: initial learning rate; it is updated at every iteration.
* **conv_threshold**: the loop stops once the current value of x is less than or equal to this threshold.
* **x_optimum**: current best guess after each hypergradient descent step.
* **max_iter**: iteration cap that stops the loop if the threshold is not reached.
* **part_deri**: derivative stored from the previous iteration, used as the partial derivative of the update with respect to the learning rate alpha.
* **beta**: hypergradient learning rate.

&nbsp;
### **Return:**

* **value**: the optimized value of x.
* **no. of iterations**: the number of iterations performed.
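
For reference, the update rule from the paper linked at the end of this notebook can be written as follows. The symbols follow the paper's notation rather than the variable names used in the code: $\theta$ is the parameter being optimized (x here), $\alpha_t$ is the learning rate at step $t$, $\beta$ is the hypergradient learning rate and $f$ is the objective.

$$
\alpha_t = \alpha_{t-1} + \beta \, \nabla f(\theta_{t-1}) \cdot \nabla f(\theta_{t-2}),
\qquad
\theta_t = \theta_{t-1} - \alpha_t \, \nabla f(\theta_{t-1})
$$

In the univariate example used here, $\nabla f(x) = 2x - 2$, so the dot product reduces to the product of the current and the previous derivative.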

In [None]:
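function HD(x_guess, learn_rate, conv_threshold, max_iter, beta)
    converged = false
    iterations = 0
    part_deri = 0   # derivative from the previous iteration (none yet)
    while !converged
        fd = 2*x_guess - 2                      # derivative of the objective at the current x
        gamma = fd*part_deri                    # hypergradient: current derivative times the previous one
        learn_rate = learn_rate - beta*gamma    # learning-rate update

        # Sign convention: the step is added, so x moves downhill only while
        # learn_rate is negative; -learn_rate plays the role of the step size
        # alpha in the paper's formulation.
        delta = learn_rate*fd                   # parameter update
        part_deri = fd                          # stored for the next hypergradient
        x_optimum = x_guess + delta
        x_guess = x_optimum

        println("no.of iterations: $iterations ", "value :$x_guess ")

        # Stop once x has dropped to the threshold or the iteration cap is exceeded.
        if x_guess <= conv_threshold
            converged = true
        end
        iterations += 1
        if iterations > max_iter
            converged = true
        end
    end
    return x_guess, iterations
end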

In [None]:
HD(10,0.01,1.0,10,0.0001)

no.of iterations: 0 value :10.18 
no.of iterations: 1 value :9.75683872 
no.of iterations: 2 value :8.790029226603618 
no.of iterations: 3 value :7.50483762706951 
no.of iterations: 4 value :6.167980652737281 
no.of iterations: 5 value :4.966885889903807 
no.of iterations: 6 value :3.9798791445276587 
no.of iterations: 7 value :3.2102712663348796 
no.of iterations: 8 value :2.627782511529386 
no.of iterations: 9 value :2.1941160244230344 
no.of iterations: 10 value :1.8741281489890644 
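
For comparison, the same idea can be sketched with the sign convention used in the paper, where the learning rate stays positive and the step is subtracted. This is an illustrative sketch, not the notebook's original function: the name `hd_sketch`, its keyword arguments and the gradient-based stopping test (`tol`) are assumptions introduced here.

```julia
# Hypothetical helper (not from the original notebook): hypergradient descent
# with the update alpha += beta * g * g_prev followed by x -= alpha * g.
function hd_sketch(grad, x0; alpha0=0.01, beta=1e-4, tol=1e-8, max_iter=10_000)
    x = float(x0)
    alpha = alpha0
    g_prev = zero(x)                  # no previous gradient before the first step
    for iter in 1:max_iter
        g = grad(x)
        if abs(g) < tol               # gradient close to zero: stationary point reached
            return (x, alpha, iter - 1)
        end
        alpha += beta * g * g_prev    # learning-rate (hypergradient) update
        x -= alpha * g                # ordinary gradient step with the new rate
        g_prev = g
    end
    return (x, alpha, max_iter)
end

# Same derivative as in the notebook, 2x - 2, starting from x = 10.
x_min, alpha_final, n_iter = hd_sketch(x -> 2x - 2, 10.0)
println("x ≈ $x_min after $n_iter iterations (final learning rate $alpha_final)")
```

Stopping on the size of the gradient rather than on a fixed value of x means the function does not need to know the location of the minimum in advance.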


**Use of hypergradient descent**

1. Hypergradient descent relies mainly on the hypergradient learning rate beta and is less sensitive to the choice of the initial learning rate.

2. It only requires storing one extra value: the gradient from the previous iteration, which gives the partial derivative with respect to the learning rate.

3. It is a useful alternative to adaptive methods such as AdaGrad, RMSProp and Adam, which can face problems with complexity, applicability in practice, or performance.
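
The first point can be illustrated on the toy problem from this notebook. The sketch below runs plain gradient descent and hypergradient descent for the same number of steps from several initial learning rates and prints the distance to the minimizer x = 1. The function names `gd_fixed` and `gd_hyper`, the chosen learning rates and the step count are assumptions made for this illustration, and a one-dimensional quadratic says little about behaviour on real models.

```julia
# Derivative of the example objective f(x) = x^2 - 2x, whose minimizer is x = 1.
grad(x) = 2x - 2

# Plain gradient descent with a fixed learning rate.
function gd_fixed(x, alpha, n)
    for _ in 1:n
        x -= alpha * grad(x)
    end
    return x
end

# Hypergradient descent (paper's sign convention): alpha adapts at every step.
function gd_hyper(x, alpha, beta, n)
    g_prev = 0.0
    for _ in 1:n
        g = grad(x)
        alpha += beta * g * g_prev   # learning-rate update
        x -= alpha * g               # parameter update
        g_prev = g
    end
    return x
end

for alpha0 in (1e-4, 1e-3, 1e-2)
    err_fixed = abs(gd_fixed(10.0, alpha0, 100) - 1)
    err_hyper = abs(gd_hyper(10.0, alpha0, 1e-4, 100) - 1)
    println("alpha0 = $alpha0: fixed error = $err_fixed, hypergradient error = $err_hyper")
end
```

With a fixed rate, the error after 100 steps depends strongly on the initial learning rate, while the adaptive version quickly grows the rate to a workable value in each case.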

Add new code cells by clicking the `+ Code` button (or _Insert_ > _Code cell_).

Have fun!

<img src="https://raw.githubusercontent.com/JuliaLang/julia-logo-graphics/master/images/julia-logo-mask.png" height="100" />

Reference: Baydin et al., "Online Learning Rate Adaptation with Hypergradient Descent", arXiv:1703.04782, https://arxiv.org/pdf/1703.04782v1.pdf