
Computing a conditional gradient

In [1]:
import torch
import numpy as np
torch.manual_seed(1234)

<torch._C.Generator at 0x7f1995ec1b70>

In [2]:
def f(x):
  if (x.detach() > 0).all():
    return torch.sin(x)
  else:
    return torch.cos(x)

In [3]:
x = torch.tensor([1.0], requires_grad=True)
y = f(x)
y.backward()
print(x.grad)

tensor([0.5403])


The code below raises an error when backward() is called, because the output must be a scalar.

In [4]:
x = torch.tensor([1.0, 0.5], requires_grad=True)
y = f(x)
print(y)
print(x.grad)

tensor([0.8415, 0.4794], grad_fn=<SinBackward>)
None


In [5]:
y.backward()

RuntimeError: ignored

In [6]:
x = torch.tensor([1.0, 0.5], requires_grad=True)
y = f(x)
y.sum().backward()
print(x.grad)

tensor([0.5403, 0.8776])
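Reducing with sum() is one way to get a scalar. As a minimal sketch of an alternative, backward() also accepts an explicit gradient argument with the same shape as the output. Passing a tensor of ones weights every output element equally, which should give the same result as y.sum().backward(). The sin call below stands in for the branch that f takes on an all-positive input.

```python
import torch

x = torch.tensor([1.0, 0.5], requires_grad=True)
y = torch.sin(x)  # the branch f(x) takes when every element is positive

# y has two elements, so backward() needs an explicit upstream gradient.
# A tensor of ones weights each output equally, like y.sum().backward().
y.backward(gradient=torch.ones_like(y))
print(x.grad)  # elementwise cos(x)
```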


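Note that f does not evaluate its condition elementwise: .all() reduces the comparison to a single boolean, so the whole tensor goes through either sin or cos. For an input with mixed signs, such as [1.0, -1], every element would take the cos branch. To apply the condition per element, it is common to use masking: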

In [7]:
def f2(x):
  mask = torch.gt(x, 0).float()
  return mask * torch.sin(x) + (1 - mask) * torch.cos(x)

x = torch.tensor([1.0, -1], requires_grad=True)
y = f2(x)
y.sum().backward()
print(x.grad)

tensor([0.5403, 0.8415])
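To make the difference concrete, the sketch below runs the whole-tensor version and an elementwise version on the same mixed-sign input. It uses torch.where as an alternative to the explicit mask. The name f_where is introduced here for illustration only and does not appear in the notebook.

```python
import torch

def f(x):
    # Branches on the whole tensor: sin only if every element is positive.
    if (x.detach() > 0).all():
        return torch.sin(x)
    return torch.cos(x)

def f_where(x):
    # Hypothetical helper: elementwise selection, sin where x > 0, cos elsewhere.
    return torch.where(x > 0, torch.sin(x), torch.cos(x))

x1 = torch.tensor([1.0, -1.0], requires_grad=True)
f(x1).sum().backward()
print(x1.grad)  # every element went through cos, so the gradient is -sin(x)

x2 = torch.tensor([1.0, -1.0], requires_grad=True)
f_where(x2).sum().backward()
print(x2.grad)  # cos(1) for the first element, -sin(-1) for the second
```

With f_where, the gradient for the positive element comes from the sin branch and the gradient for the negative element comes from the cos branch, which matches the masked f2.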


In [8]:
def describe_grad(x):
  if x.grad is None:
    print("No gradient information")
  else:
    print("Gradient: \n{}".format(x.grad))
    print("Gradient Function: {}".format(x.grad_fn))

In [9]:
def describe(x):
    print("Type: {}".format(x.type()))
    print("Shape/size: {}".format(x.shape))
    print("Values: \n{}".format(x))

In [10]:
import torch
x = torch.ones(2, 2, requires_grad=True)
describe(x)
describe_grad(x)
print("--------")
print("\n\n")
y = (x + 2) * (x + 5) + 3
describe(y)
print("\n\n")
z = y.mean()
describe(z)
describe_grad(x)
print("--------")
print("\n\n")
z.backward(create_graph=True, retain_graph=True)
describe_grad(x)
print("--------")

Type: torch.FloatTensor
Shape/size: torch.Size([2, 2])
Values: 
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
No gradient information
--------



Type: torch.FloatTensor
Shape/size: torch.Size([2, 2])
Values: 
tensor([[21., 21.],
        [21., 21.]], grad_fn=<AddBackward0>)



Type: torch.FloatTensor
Shape/size: torch.Size([])
Values: 
21.0
No gradient information
--------



Gradient: 
tensor([[2.2500, 2.2500],
        [2.2500, 2.2500]], grad_fn=<CopyBackwards>)
Gradient Function: None
--------
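The value 2.25 can be checked by hand: z is the mean of (x + 2)(x + 5) + 3 over four elements, so dz/dx = (2x + 7) / 4, which is 2.25 at x = 1. The sketch below verifies this with torch.autograd.grad and, as an assumption about why create_graph=True might be useful here, differentiates the gradient a second time. It also shows that x is a leaf tensor, which is why its grad_fn prints as None.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
z = ((x + 2) * (x + 5) + 3).mean()

# x was created by the user, not by an operation, so it has no grad_fn.
print(x.is_leaf, x.grad_fn)

# First derivative: dz/dx = (2x + 7) / 4, which is 2.25 at x = 1.
(g,) = torch.autograd.grad(z, x, create_graph=True)
print(g)

# create_graph=True kept the graph, so g can be differentiated again.
# The derivative of (2x + 7) / 4 is 0.5 for every element.
(h,) = torch.autograd.grad(g.sum(), x)
print(h)
```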


In [11]:
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
y.grad_fn

<AddBackward0 at 0x7f19401ed278>

CUDA Tensors

In [12]:
print(torch.cuda.is_available())

True


In [13]:
x = torch.rand(3,3)
describe(x)

Type: torch.FloatTensor
Shape/size: torch.Size([3, 3])
Values: 
tensor([[0.0290, 0.4019, 0.2598],
        [0.3666, 0.0583, 0.7006],
        [0.0518, 0.4681, 0.6738]])


In [14]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


In [15]:
x = torch.rand(3, 3).to(device)
describe(x)
print(x.device)

Type: torch.cuda.FloatTensor
Shape/size: torch.Size([3, 3])
Values: 
tensor([[0.3315, 0.7837, 0.5631],
        [0.7749, 0.8208, 0.2793],
        [0.6817, 0.2837, 0.6567]], device='cuda:0')
cuda:0


In [16]:
cpu_device = torch.device("cpu")

In [17]:
# This will break, because x is a GPU tensor and y is a CPU tensor
y = torch.rand(3,3)
x + y

RuntimeError: ignored

In [18]:
y = y.to(cpu_device)
x = x.to(cpu_device)
x + y

tensor([[0.5702, 1.5150, 1.1643],
        [1.0792, 1.0756, 0.9086],
        [1.6482, 1.0236, 1.1084]])
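Moving both tensors to the CPU fixes the mismatch, but it also gives up the GPU. As a sketch of a device-agnostic alternative, tensors can be created directly on the chosen device through the device argument of the factory function, so the same code runs with or without CUDA.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Allocate both operands directly on the chosen device.
x = torch.rand(3, 3, device=device)
y = torch.rand(3, 3, device=device)

z = x + y  # both operands share a device, so the addition succeeds
print(z.device)
```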

In [19]:
if torch.cuda.is_available():
  a = torch.rand(3,3).to(device='cuda:0')
  print(a)

  b = torch.rand(3,3).cuda()
  print(b)

  print(a + b)

tensor([[0.4757, 0.7842, 0.1525],
        [0.6662, 0.3343, 0.7893],
        [0.3216, 0.5247, 0.6688]], device='cuda:0')
tensor([[0.8436, 0.4265, 0.9561],
        [0.0770, 0.4108, 0.0014],
        [0.5414, 0.6419, 0.2976]], device='cuda:0')
tensor([[1.3193, 1.2107, 1.1086],
        [0.7432, 0.7451, 0.7907],
        [0.8631, 1.1666, 0.9664]], device='cuda:0')


In [20]:
# This will break: a is now a CPU tensor, but b is still on the GPU
a = a.cpu()
print(a + b)

RuntimeError: ignored
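One way to avoid this error, sketched below, is to move one operand to the other's device just before combining them. The example reproduces the mismatch only when a GPU is present and otherwise runs entirely on the CPU.

```python
import torch

a = torch.rand(3, 3)  # lives on the CPU
b = torch.rand(3, 3)
if torch.cuda.is_available():
    b = b.cuda()  # reproduce the mismatch when a GPU is present

# Align b with a's device before adding.
c = a + b.to(a.device)
print(c.device)
```

Using a.device instead of a hard-coded device name keeps the fix valid whichever device a happens to be on.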