
## Sine wave regression using Reptile

The goal is to learn to regress the value of y from x for sine waves of the form $y = a \sin(x + b)$. For each task, the amplitude $a$ is chosen uniformly at random between 0.1 and 5.0 and the phase $b$ uniformly at random between 0 and $\pi$. Each task provides only a small set of sampled (x, y) pairs: the training code below draws 50 points per task and feeds them to the network in mini-batches of 10. Let us walk through the code in detail.

In [2]:
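# Uses the TensorFlow 1.x graph API (placeholders, sessions). On TensorFlow 2.x,
# import tensorflow.compat.v1 as tf and call tf.disable_v2_behavior() instead.
import tensorflow as tf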
import numpy as np

In [3]:
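def sample_points(k):
    """Sample k (x, y) points from one randomly drawn sine-wave task."""
    num_points = 100

    # amplitude a, drawn uniformly from [0.1, 5.0)
    amplitude = np.random.uniform(low=0.1, high=5.0)

    # phase b, drawn uniformly from [0, pi)
    phase = np.random.uniform(low=0, high=np.pi)

    # fixed grid of candidate x values
    x = np.linspace(-5, 5, num_points)

    # y = a * sin(x + b)
    y = amplitude * np.sin(x + phase)

    # sample k indices (np.random.choice draws with replacement by default)
    sample = np.random.choice(np.arange(num_points), size=k)

    return (x[sample], y[sample])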

In [4]:
x, y = sample_points(5)
print(x)
print(y)

[ 0.95959596 -0.25252525 -3.58585859  1.56565657  3.98989899]
[ 0.64559314  3.95151586 -4.03001473 -1.73541673 -1.08340086]
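Because `np.random.choice` draws with replacement by default, `sample_points` can return the same point more than once, and its output changes on every call. The following is a minimal sketch of a seeded variant that draws distinct points, assuming NumPy's `default_rng` generator. The name `sample_points_seeded` and its `rng` parameter are hypothetical and not part of the original notebook.

```python
import numpy as np

def sample_points_seeded(k, rng, num_points=100):
    """Hypothetical variant of sample_points: reproducible, no repeated points."""
    # Draw the task: amplitude in [0.1, 5.0), phase in [0, pi)
    amplitude = rng.uniform(0.1, 5.0)
    phase = rng.uniform(0.0, np.pi)

    # Same fixed grid and sine function as sample_points
    x = np.linspace(-5, 5, num_points)
    y = amplitude * np.sin(x + phase)

    # replace=False guarantees k distinct indices
    idx = rng.choice(num_points, size=k, replace=False)
    return x[idx], y[idx]

rng = np.random.default_rng(0)  # fixed seed, so the draw is repeatable
x_demo, y_demo = sample_points_seeded(5, rng)
print(x_demo.shape, y_demo.shape)
```

Passing the generator explicitly keeps the randomness in one place, which makes it easier to reproduce a particular task when debugging.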


### Two-layer neural network

Like MAML, Reptile is compatible with any model that can be trained with gradient descent. So we use a simple two-layer neural network: one hidden layer with 64 units followed by a single-unit output layer.
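To make the layer shapes concrete, here is a NumPy sketch of the forward pass for one mini-batch, assuming the same sizes as the TensorFlow cells that follow (1 input feature, 64 hidden units, 1 output, `tanh` on both layers). The helper name `forward` and the random weights are illustrative only.

```python
import numpy as np

num_feature, num_hidden, num_classes = 1, 64, 1
rng = np.random.default_rng(0)

# Illustrative weights drawn uniformly from [0, 1)
w1 = rng.uniform(size=(num_feature, num_hidden))
b1 = rng.uniform(size=num_hidden)
w2 = rng.uniform(size=(num_hidden, num_classes))
b2 = rng.uniform(size=num_classes)

def forward(x_batch):
    """Forward pass: (batch, 1) -> (batch, 64) -> (batch, 1)."""
    a1 = np.tanh(x_batch @ w1 + b1)   # hidden activations, shape (batch, 64)
    return np.tanh(a1 @ w2 + b2)      # predictions, shape (batch, 1)

x_batch = np.linspace(-5, 5, 10).reshape(10, 1)  # one mini-batch of 10 inputs
y_hat = forward(x_batch)
print(y_hat.shape)
```

Since the output also passes through `tanh`, every prediction lies in [-1, 1].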

In [5]:
from tensorflow.python.framework import ops
ops.reset_default_graph()

In [6]:
# tf.reset_default_graph()

In [7]:
num_hidden = 64
num_classes = 1
num_feature = 1

In [8]:
X = tf.placeholder(tf.float32, shape=[None, num_feature])
Y = tf.placeholder(tf.float32, shape=[None, num_classes])

In [9]:
w1 = tf.Variable(tf.random_uniform([num_feature, num_hidden]))
b1 = tf.Variable(tf.random_uniform([num_hidden]))

w2 = tf.Variable(tf.random_uniform([num_hidden, num_classes]))
b2 = tf.Variable(tf.random_uniform([num_classes]))

In [10]:
#layer 1
z1 = tf.matmul(X, w1) + b1
a1 = tf.nn.tanh(z1)

#output layer
#note: tanh bounds Yhat to [-1, 1], while target amplitudes range up to 5.0
z2 = tf.matmul(a1, w2) + b2
Yhat = tf.nn.tanh(z2)

In [11]:
loss_function = tf.reduce_mean(tf.square(Yhat - Y))

In [12]:
optimizer = tf.train.AdamOptimizer(1e-2).minimize(loss_function)

In [13]:
init = tf.global_variables_initializer()

### Reptile
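
Reptile adapts the parameters to a sampled task with a few optimization steps, then moves the initial parameters a fraction `epsilon` of the way toward the adapted ones: theta = theta + epsilon * (theta_star - theta). Below is a minimal NumPy sketch of that interpolation step. The function name `reptile_update` and the dictionary layout are assumptions made for illustration. The training loop further down applies the same formula to each variable individually.

```python
import numpy as np

def reptile_update(old_params, new_params, epsilon=0.1):
    """Move each parameter a fraction epsilon from its old value toward its adapted value."""
    return {
        name: old_params[name] + epsilon * (new_params[name] - old_params[name])
        for name in old_params
    }

# Toy parameters: "old" is the initialization, "new" is the result of task training
old = {"w": np.zeros((1, 3)), "b": np.zeros(3)}
new = {"w": np.ones((1, 3)), "b": np.full(3, 2.0)}

updated = reptile_update(old, new, epsilon=0.1)
print(updated["w"], updated["b"])
```

With `epsilon = 1` the update would simply adopt the task-adapted parameters, and with `epsilon = 0` the initialization would never change.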

In [14]:
#number of epochs, i.e. outer (meta) training iterations
num_epochs = 100

#number of samples (shots) drawn per task
num_samples = 50

#number of tasks sampled per epoch
num_tasks = 2

#number of passes over each task's samples during inner optimization
num_iterations = 10

#mini-batch size
mini_batch = 10
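
How much work these settings imply per task and per run follows from simple arithmetic. The sketch below only restates the hyperparameter values from this cell. The derived variable names are illustrative and do not appear in the notebook.

```python
# Hyperparameter values copied from the cell above
num_epochs, num_samples, num_tasks = 100, 50, 2
num_iterations, mini_batch = 10, 10

# Mini-batches in one pass over a task's samples
batches_per_pass = num_samples // mini_batch

# Optimizer steps taken on each task before the meta update
inner_steps_per_task = num_iterations * batches_per_pass

# One meta update per task per epoch
meta_updates = num_epochs * num_tasks

# Optimizer steps over the whole run
total_inner_steps = meta_updates * inner_steps_per_task

print(batches_per_pass, inner_steps_per_task, meta_updates, total_inner_steps)
```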

In [15]:
#start the tensorflow session
with tf.Session() as sess:
    
    sess.run(init)
    
    for e in range(num_epochs):
        
        #for each task in batch of tasks
        for task in range(num_tasks):

            #get the initial parameters of the model
            old_w1, old_b1, old_w2, old_b2 = sess.run([w1, b1, w2, b2,])

            #sample x and y
            x_sample, y_sample = sample_points(num_samples)


            #for some k number of iterations perform optimization on the task
            for k in range(num_iterations):

                #get the minibatch x and y
                for i in range(0, num_samples, mini_batch):

                    #sample mini batch of examples 
                    x_minibatch = x_sample[i:i+mini_batch]
                    y_minibatch = y_sample[i:i+mini_batch]


                    train = sess.run(optimizer, feed_dict={X: x_minibatch.reshape(mini_batch,1), 
                                                           Y: y_minibatch.reshape(mini_batch,1)})

            #get the updated model parameters after several iterations of optimization
            new_w1, new_b1, new_w2, new_b2 = sess.run([w1, b1, w2, b2])

            #Reptile meta update: move the initial parameters a fraction
            #epsilon of the way toward the task-adapted parameters,
            #i.e. theta = theta + epsilon * (theta_star - theta)
            epsilon = 0.1

            updated_w1 = old_w1 + epsilon * (new_w1 - old_w1) 
            updated_b1 = old_b1 + epsilon * (new_b1 - old_b1) 

            updated_w2 = old_w2 + epsilon * (new_w2 - old_w2) 
            updated_b2 = old_b2 + epsilon * (new_b2 - old_b2) 


            #update the model parameter with new parameters
            w1.load(updated_w1, sess)
            b1.load(updated_b1, sess)

            w2.load(updated_w2, sess)
            b2.load(updated_b2, sess)

        if e%10 == 0:
            loss = sess.run(loss_function, feed_dict={X: x_sample.reshape(num_samples,1), Y: y_sample.reshape(num_samples,1)})

            print("Epoch {}: Loss {}\n".format(e,loss))
            print('Updated Model Parameter Theta\n')
            print('Sampling Next Batch of Tasks \n')
            print('---------------------------------\n')

Epoch 0: Loss 1.7375366687774658

Updated Model Parameter Theta

Sampling Next Batch of Tasks 

---------------------------------

Epoch 10: Loss 7.7497639656066895

Updated Model Parameter Theta

Sampling Next Batch of Tasks 

---------------------------------

Epoch 20: Loss 3.5970847606658936

Updated Model Parameter Theta

Sampling Next Batch of Tasks 

---------------------------------

Epoch 30: Loss 2.590873956680298

Updated Model Parameter Theta

Sampling Next Batch of Tasks 

---------------------------------

Epoch 40: Loss 1.6606582403182983

Updated Model Parameter Theta

Sampling Next Batch of Tasks 

---------------------------------

Epoch 50: Loss 1.640143632888794

Updated Model Parameter Theta

Sampling Next Batch of Tasks 

---------------------------------

Epoch 60: Loss 1.1941721439361572

Updated Model Parameter Theta

Sampling Next Batch of Tasks 

---------------------------------

Epoch 70: Loss 4.566242218017578

Updated Model Parameter Theta

Sampling Next 