In [1]:
import torch


We will use made-up values of temperature, rainfall and humidity as inputs. PyTorch takes its inputs in the form of tensors. Each row is one observation, and its columns hold the values of temperature, rainfall and humidity. We set the datatype to float32, meaning floating-point numbers stored in 32 bits each.

In [2]:
# Columns: temperature, rainfall, humidity
inputs = torch.tensor([[73, 67, 43],
                       [91, 88, 64],
                       [87, 134, 58],
                       [102, 43, 37],
                       [69, 96, 70]], dtype=torch.float32)
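
As a quick sanity check, the shape and datatype of the tensor can be inspected directly. The sketch below rebuilds the same tensor so that it runs on its own; the `ints` tensor is a hypothetical extra, added only to show what happens when `dtype` is left out.

```python
import torch

# Same values as above: one row per observation,
# columns are temperature, rainfall and humidity.
inputs = torch.tensor([[73, 67, 43],
                       [91, 88, 64],
                       [87, 134, 58],
                       [102, 43, 37],
                       [69, 96, 70]], dtype=torch.float32)

print(inputs.shape)   # torch.Size([5, 3])
print(inputs.dtype)   # torch.float32

# Hypothetical comparison: without dtype, integer literals give an integer tensor
ints = torch.tensor([[73, 67, 43]])
print(ints.dtype)     # torch.int64
```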

The outputs (targets) are the yields of oranges and mangoes, one column for each.

In [3]:
# Columns: yield of oranges, yield of mangoes
targets = torch.tensor([[56, 70],
                        [81, 101],
                        [119, 133],
                        [22, 37],
                        [103, 119]], dtype=torch.float32)

Now we need to generate random starting values for the model parameters. In machine learning jargon, the coefficients (c₁) are called weights and the constant term c₀ is called the bias. We use the torch.randn function, which creates a tensor of the given shape with elements drawn randomly from a normal distribution with mean 0 and standard deviation 1. The weights have shape (2, 3), one row of three weights per target, and the bias has one entry per target.

In [5]:
# Creation of Weights and biases
w = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)
print(w)
print(b)

tensor([[-1.1128, -1.5336, -0.7399],
        [ 0.4458,  0.2488,  0.2831]], requires_grad=True)
tensor([-0.0365, -1.7325], requires_grad=True)


With PyTorch, we can calculate the gradient (derivative) of the loss with respect to the weights and biases, because they have requires_grad set to True.
In PyTorch, @ represents matrix multiplication, and the .t() method returns the transpose of a tensor. Here we create a function that multiplies the input values by the transposed weight matrix and adds the bias. The matrix obtained by passing the input data through the model is a set of predictions for the target variables.

In [6]:
def model(x):
    return x @ w.t() + b
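
To see what the matrix product computes, one entry can be written out term by term. This is a minimal sketch: the weights and biases below are hand-picked, hypothetical values (not the random ones generated above), chosen so that the arithmetic is easy to follow, and only the first two input rows are used.

```python
import torch

inputs = torch.tensor([[73, 67, 43],
                       [91, 88, 64]], dtype=torch.float32)

# Hypothetical hand-picked parameters, used in place of torch.randn
w = torch.tensor([[0.5, 0.25, -1.0],
                  [1.0, -0.5, 0.25]], requires_grad=True)
b = torch.tensor([1.0, 2.0], requires_grad=True)

def model(x):
    return x @ w.t() + b

preds = model(inputs)
print(preds.shape)  # torch.Size([2, 2]): one row per input row, one column per target

# First row, first target: 0.5*73 + 0.25*67 - 1.0*43 + 1.0 = 11.25
manual = w[0, 0] * 73 + w[0, 1] * 67 + w[0, 2] * 43 + b[0]
print(preds[0, 0].item(), manual.item())  # 11.25 11.25
```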

Now we can generate predictions by passing the inputs to the model.

In [7]:
# Generate predictions
preds = model(inputs)
print(preds)

tensor([[-215.8400,   59.6534],
        [-283.6144,   78.8475],
        [-345.2708,   86.8115],
        [-206.8653,   64.9110],
        [-275.8408,   72.7293]], grad_fn=<AddBackward0>)


Now we need to measure the error by defining an error (loss) function. It takes the predictions and the targets and returns the average of their squared differences. .numel() is a tensor method that returns the total number of elements in a tensor. The output is therefore the sum of the squared differences divided by the number of elements.

In [8]:
def mse(t1, t2):
    diff = t1 - t2
    return torch.sum(diff * diff) / diff.numel()
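
The function above can be checked on a small example against PyTorch's built-in `torch.nn.functional.mse_loss`, which averages the squared differences by default. The tensors `a` and `c` are hypothetical values picked so the result can be verified by hand.

```python
import torch
import torch.nn.functional as F

def mse(t1, t2):
    diff = t1 - t2
    return torch.sum(diff * diff) / diff.numel()

# Hypothetical small tensors for a hand-checkable result
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
c = torch.tensor([[1.0, 0.0], [0.0, 4.0]])

# Differences are 0, 2, 3, 0 -> squares 0, 4, 9, 0 -> mean 13 / 4 = 3.25
print(mse(a, c))         # tensor(3.2500)
print(F.mse_loss(a, c))  # tensor(3.2500)
```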

In [11]:
loss=mse(targets,preds)
loss

tensor(62393.8828, grad_fn=<DivBackward0>)

PyTorch can now compute the gradients of the loss with respect to the weights and biases automatically. Calling .backward() on the loss does this.

In [12]:
# Computing the gradients here
loss.backward()

To check the values of the gradients, print them as follows. They are stored in the .grad attribute of each tensor.

In [13]:
print(w.grad)
print(b.grad)

tensor([[-28580.0137, -31744.3105, -19387.8047],
        [ -1427.0693,  -2414.7451,  -1349.5691]])
tensor([-341.6863,  -19.4095])


The gradient values will differ from run to run, because torch.randn initializes the weights and biases randomly. Compare them with the values of the weights and biases printed earlier.
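
For a linear model with a mean squared error loss, the gradients also have a simple closed form, so the result of `.backward()` can be cross-checked by hand. The sketch below is only an illustration under stated assumptions: it uses the first three rows of the data, a fixed seed chosen arbitrarily for repeatability, and the names `err`, `grad_w` and `grad_b` are introduced here.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only to make the sketch repeatable
inputs = torch.tensor([[73, 67, 43],
                       [91, 88, 64],
                       [87, 134, 58]], dtype=torch.float32)
targets = torch.tensor([[56, 70],
                        [81, 101],
                        [119, 133]], dtype=torch.float32)
w = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)

preds = inputs @ w.t() + b
loss = torch.sum((preds - targets) ** 2) / preds.numel()
loss.backward()

# Closed-form gradients of the mean squared error
with torch.no_grad():
    err = preds - targets                          # shape (3, 2)
    grad_w = 2 * err.t() @ inputs / err.numel()    # shape (2, 3)
    grad_b = 2 * err.sum(dim=0) / err.numel()      # shape (2,)

# Both comparisons should agree up to floating-point rounding
print(torch.allclose(w.grad, grad_w, rtol=1e-4))
print(torch.allclose(b.grad, grad_b, rtol=1e-4))
```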

In [14]:
with torch.no_grad():
  w -= w.grad * 1e-5
  b -= b.grad * 1e-5
  w.grad.zero_()
  b.grad.zero_()

Here we use torch.no_grad() to tell PyTorch not to track the update step itself, since tracking operations for gradient computation consumes memory and is not wanted while modifying the parameters. This is needed only because w and b were created with requires_grad=True earlier. Another important point is resetting the gradients to zero with .zero_(), because PyTorch keeps accumulating gradients across calls to .backward(). Note that the gradients are multiplied by the learning rate (1e-5 here) before being subtracted from the weights and biases. The learning rate is chosen by trial and error: use the value that reduces the error fastest.
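
The accumulation behaviour is easy to see on a tiny example. This is a minimal sketch with a hypothetical one-dimensional `w` and `x`, unrelated to the crop data, where the gradient of `sum(w * x)` with respect to `w` is simply `x`.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])
w = torch.zeros(3, requires_grad=True)

# First backward pass: d(sum(w * x))/dw = x
(w * x).sum().backward()
print(w.grad)   # tensor([1., 2., 3.])

# Second backward pass without resetting: the new gradient is added to the old one
(w * x).sum().backward()
print(w.grad)   # tensor([2., 4., 6.])

# After zero_() the next pass starts from zero again
w.grad.zero_()
(w * x).sum().backward()
print(w.grad)   # tensor([1., 2., 3.])
```
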
Now we repeat this procedure for a number of cycles (epochs) so that the error keeps decreasing. We have chosen 100 epochs here; the number can be varied according to the dataset used.

In [15]:
# Train for 100 epochs
for i in range(100):
    preds = model(inputs)
    loss = mse(preds, targets)
    loss.backward()
    with torch.no_grad():
        w -= w.grad * 1e-5
        b -= b.grad * 1e-5
        w.grad.zero_()
        b.grad.zero_()
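
The trial-and-error choice of learning rate can be sketched as a small experiment: run the same loop from the same starting weights with a few candidate values and compare the final loss. The helper `train`, the fixed seed and the candidate values below are assumptions made for this illustration, and `torch.mean` of the squared differences is used as an equivalent of the `mse` function.

```python
import torch

inputs = torch.tensor([[73, 67, 43],
                       [91, 88, 64],
                       [87, 134, 58],
                       [102, 43, 37],
                       [69, 96, 70]], dtype=torch.float32)
targets = torch.tensor([[56, 70],
                        [81, 101],
                        [119, 133],
                        [22, 37],
                        [103, 119]], dtype=torch.float32)

def train(lr, epochs=100, seed=0):
    """Run plain gradient descent; return (initial loss, final loss)."""
    torch.manual_seed(seed)  # same starting weights for every learning rate
    w = torch.randn(2, 3, requires_grad=True)
    b = torch.randn(2, requires_grad=True)
    first = None
    for _ in range(epochs):
        preds = inputs @ w.t() + b
        loss = torch.mean((preds - targets) ** 2)
        if first is None:
            first = loss.item()
        loss.backward()
        with torch.no_grad():
            w -= w.grad * lr
            b -= b.grad * lr
            w.grad.zero_()
            b.grad.zero_()
    with torch.no_grad():
        final = torch.mean((inputs @ w.t() + b - targets) ** 2).item()
    return first, final

# Candidate learning rates (hypothetical choices for comparison)
for lr in (1e-6, 1e-5):
    first, final = train(lr)
    print(f"lr={lr:g}: loss {first:.1f} -> {final:.1f}")
```

Starting every run from the same seed keeps the comparison fair, since otherwise differences in the final loss could come from the random initialization rather than from the learning rate.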

Finally, we calculate the loss again to check that it has really been reduced. For this dataset, an error in the range of 100 is good enough, since the loss is a squared value: its square root, roughly 10, is closer to the typical size of a prediction error.

In [16]:
# Calculate loss
preds = model(inputs)
loss = mse(preds, targets)
print(loss)

tensor(207.8029, grad_fn=<DivBackward0>)


It is also worth checking how close the predictions are to the targets.

In [17]:
print(preds)
print(targets)

tensor([[ 62.0590,  74.3972],
        [ 84.0368, 100.5434],
        [106.6636, 126.6689],
        [ 49.3961,  62.1706],
        [ 88.5765, 103.8958]], grad_fn=<AddBackward0>)
tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.]])
