# Transfer Function Layers #

Transfer functions are normally used to introduce a non-linearity after a parameterized layer like [Linear](simple.md#nn.Linear) and [SpatialConvolution](convolution.md#nn.SpatialConvolution). Non-linearities allow the problem space to be divided into more complex regions than a simple logistic regressor would permit.

## HardTanh ##

Applies the HardTanh function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.

HardTanh is defined as:

  * f(x) = 1, if x > 1,
  * f(x) = -1, if x < -1,
  * f(x) = x, otherwise.

```lua
-- Evaluate HardTanh and its gradient on 100 points in [-2, 2]
ii=torch.linspace(-2,2)
m=nn.HardTanh()
oo=m:forward(ii)
go=torch.ones(100)   -- upstream gradient of ones, so gi is df/dx
gi=m:backward(ii,go)
gnuplot.plot({'f(x)',ii,oo,'+-'},{'df/dx',ii,gi,'+-'})
gnuplot.grid(true)
```
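
The piecewise definition above can also be checked without Torch. The following is a minimal scalar sketch in plain Lua; the names `hardtanh` and `hardtanh_grad` are illustrative and not part of the `nn` package, and the gradient value chosen exactly at x = 1 and x = -1 is an assumption of this sketch.

```lua
-- Scalar reference for HardTanh in plain Lua (hypothetical helper names).
local function hardtanh(x)
  if x > 1 then return 1 end
  if x < -1 then return -1 end
  return x
end

-- Derivative: 1 in the linear region, 0 where the output saturates.
-- Assumption: the boundary points x = 1 and x = -1 are treated as linear.
local function hardtanh_grad(x)
  if x > 1 or x < -1 then return 0 end
  return 1
end

for _, x in ipairs({-2, -0.5, 0, 0.5, 2}) do
  print(string.format("x=%.1f  f(x)=%.1f  df/dx=%d", x, hardtanh(x), hardtanh_grad(x)))
end
```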

## HardShrink ##

`module = nn.HardShrink(lambda)`

Applies the hard shrinkage function element-wise to the input Tensor. The output is the same size as the input.

The HardShrinkage operator is defined as:

  * f(x) = x, if x > lambda
  * f(x) = x, if x < -lambda
  * f(x) = 0, otherwise

```lua
-- Evaluate HardShrink (lambda = 0.85) and its gradient on 100 points in [-2, 2]
ii=torch.linspace(-2,2)
m=nn.HardShrink(0.85)
oo=m:forward(ii)
go=torch.ones(100)   -- upstream gradient of ones, so gi is df/dx
gi=m:backward(ii,go)
gnuplot.plot({'f(x)',ii,oo,'+-'},{'df/dx',ii,gi,'+-'})
gnuplot.grid(true)
```

## SoftShrink ##

`module = nn.SoftShrink(lambda)`

Applies the soft shrinkage function element-wise to the input Tensor. The output is the same size as the input.

The SoftShrinkage operator is defined as:

  * f(x) = x-lambda, if x > lambda
  * f(x) = x+lambda, if x < -lambda
  * f(x) = 0, otherwise

```lua
-- Evaluate SoftShrink (lambda = 0.85) and its gradient on 100 points in [-2, 2]
ii=torch.linspace(-2,2)
m=nn.SoftShrink(0.85)
oo=m:forward(ii)
go=torch.ones(100)   -- upstream gradient of ones, so gi is df/dx
gi=m:backward(ii,go)
gnuplot.plot({'f(x)',ii,oo,'+-'},{'df/dx',ii,gi,'+-'})
gnuplot.grid(true)
```
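
To see how the two shrinkage operators differ, here is a small plain-Lua sketch that evaluates both on the same inputs. It assumes the usual conventions: hard shrinkage passes a value through unchanged when its magnitude exceeds lambda, while soft shrinkage additionally moves it toward zero by lambda. The helper names are illustrative only.

```lua
-- Scalar references for the two shrinkage operators (hypothetical helper names).
local function hardshrink(x, lambda)
  if x > lambda or x < -lambda then return x end   -- keep large values as-is
  return 0
end

local function softshrink(x, lambda)
  if x > lambda then return x - lambda end         -- shrink toward zero from above
  if x < -lambda then return x + lambda end        -- shrink toward zero from below
  return 0
end

local lambda = 0.85
for _, x in ipairs({-2, -0.5, 0.5, 2}) do
  print(string.format("x=%.2f  hard=%.2f  soft=%.2f",
    x, hardshrink(x, lambda), softshrink(x, lambda)))
end
```

Both operators zero out the band [-lambda, lambda]; only the soft variant is continuous at its edges.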

## SoftMax ##

Applies the Softmax function to an n-dimensional input Tensor, rescaling its elements so that the elements of the n-dimensional output Tensor lie in the range (0,1) and sum to 1.

Softmax is defined as f_i(x) = exp(x_i-shift) / sum_j exp(x_j-shift), where shift = max_i x_i.

```lua
-- Apply SoftMax to 10 positive random inputs and plot input against output
ii=torch.exp(torch.abs(torch.randn(10)))
m=nn.SoftMax()
oo=m:forward(ii)
gnuplot.plot({'Input',ii,'+-'},{'Output',oo,'+-'})
gnuplot.grid(true)
```

Note that this module does not work directly with ClassNLLCriterion, which expects log-probabilities as input, that is, an nn.Log applied between the SoftMax and the criterion. Use LogSoftMax instead (it's faster).
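
The role of the `shift` term in the definition may be easier to see in code. The sketch below is a plain-Lua version over a Lua table rather than a Tensor; the function name `softmax` is illustrative and not the module's implementation. Subtracting the maximum leaves the result unchanged but keeps every exponent at or below zero.

```lua
-- Softmax over a Lua table, using the max-shift from the definition above.
local function softmax(x)
  local shift = -math.huge
  for _, v in ipairs(x) do shift = math.max(shift, v) end

  local out, sum = {}, 0
  for i, v in ipairs(x) do
    out[i] = math.exp(v - shift)   -- exponent is <= 0, so exp cannot overflow
    sum = sum + out[i]
  end
  for i = 1, #out do out[i] = out[i] / sum end
  return out
end

-- Without the shift, exp(1000) would overflow to inf.
local p = softmax({1000, 1001, 1002})
local total = 0
for _, v in ipairs(p) do total = total + v end
print(string.format("%.4f %.4f %.4f  sum=%.4f", p[1], p[2], p[3], total))
-- prints: 0.0900 0.2447 0.6652  sum=1.0000
```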

## SoftMin ##

Applies the Softmin function to an n-dimensional input Tensor, rescaling its elements so that the elements of the n-dimensional output Tensor lie in the range (0,1) and sum to 1.

Softmin is defined as f_i(x) = exp(-x_i-shift) / sum_j exp(-x_j-shift), where shift = max_i(-x_i).

```lua
-- Apply SoftMin to 10 positive random inputs and plot input against output
ii=torch.exp(torch.abs(torch.randn(10)))
m=nn.SoftMin()
oo=m:forward(ii)
gnuplot.plot({'Input',ii,'+-'},{'Output',oo,'+-'})
gnuplot.grid(true)
```
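
Softmin behaves like Softmax applied to the negated input, so the smallest element receives the largest weight. The plain-Lua sketch below illustrates this; it assumes the shift is taken as the maximum of the negated inputs, a choice that does not change the result (any constant shift cancels in the ratio) but keeps the exponents from overflowing. The name `softmin` is illustrative.

```lua
-- Softmin over a Lua table (hypothetical helper, not the nn module itself).
local function softmin(x)
  -- Assumption: shift = max_i(-x_i), so every exponent below is <= 0.
  local shift = -math.huge
  for _, v in ipairs(x) do shift = math.max(shift, -v) end

  local out, sum = {}, 0
  for i, v in ipairs(x) do
    out[i] = math.exp(-v - shift)
    sum = sum + out[i]
  end
  for i = 1, #out do out[i] = out[i] / sum end
  return out
end

local q = softmin({1, 2, 3})
print(string.format("%.4f %.4f %.4f", q[1], q[2], q[3]))
-- prints: 0.6652 0.2447 0.0900
```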

## SoftPlus ##

Applies the SoftPlus function to an n-dimensional input Tensor. Can be used to constrain the output of a machine to always be positive.

SoftPlus is defined as f_i(x) = 1/beta * log(1 + exp(beta * x_i)).

```lua
-- Evaluate SoftPlus and its gradient on 10 random inputs
ii=torch.randn(10)
m=nn.SoftPlus()
oo=m:forward(ii)
go=torch.ones(10)   -- upstream gradient of ones
gi=m:backward(ii,go)
gnuplot.plot({'Input',ii,'+-'},{'Output',oo,'+-'},{'gradInput',gi,'+-'})
gnuplot.grid(true)
```
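
As a rough illustration of the formula, the plain-Lua sketch below transcribes it directly and shows the effect of `beta`: the output is always positive, and a larger `beta` brings the curve closer to max(0, x). The default of `beta = 1` is an assumption of this sketch, and the direct transcription overflows once `beta * x` gets very large, so it is not a hardened implementation.

```lua
-- Direct transcription of the SoftPlus formula (hypothetical helper name).
local function softplus(x, beta)
  beta = beta or 1   -- assumed default for this sketch
  return math.log(1 + math.exp(beta * x)) / beta
end

for _, x in ipairs({-3, 0, 3}) do
  print(string.format("x=%d  beta=1: %.4f  beta=10: %.4f",
    x, softplus(x), softplus(x, 10)))
end
```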

## SoftSign ##

Applies the SoftSign function to an n-dimensional input Tensor.

SoftSign is defined as f_i(x) = x_i / (1+|x_i|).

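```lua
-- Evaluate SoftSign and its gradient on 100 points in [-5, 5]
ii=torch.linspace(-5,5)
m=nn.SoftSign()
oo=m:forward(ii)
go=torch.ones(100)   -- upstream gradient of ones, so gi is df/dx
gi=m:backward(ii,go)
gnuplot.plot({'f(x)',ii,oo,'+-'},{'df/dx',ii,gi,'+-'})
gnuplot.grid(true)
```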
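
Differentiating the definition gives df/dx = 1 / (1+|x|)^2, which is the curve the `df/dx` plot traces. The plain-Lua sketch below compares that closed form with a central finite difference; the helper names and the step size `h` are choices made for this illustration.

```lua
-- Scalar SoftSign and its analytic derivative (hypothetical helper names).
local function softsign(x)
  return x / (1 + math.abs(x))
end

local function softsign_grad(x)
  local d = 1 + math.abs(x)
  return 1 / (d * d)
end

-- Compare the analytic derivative with a central finite difference.
local h = 1e-6
for _, x in ipairs({-2, 1, 4}) do
  local numeric = (softsign(x + h) - softsign(x - h)) / (2 * h)
  print(string.format("x=%d  analytic=%.6f  numeric=%.6f", x, softsign_grad(x), numeric))
end
```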

## LogSigmoid ##

Applies the LogSigmoid function to an n-dimensional input Tensor.

LogSigmoid is defined as f_i(x) = log(1/(1+ exp(-x_i))).

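```lua
-- Evaluate LogSigmoid and its gradient on 10 random inputs
ii=torch.randn(10)
m=nn.LogSigmoid()
oo=m:forward(ii)
go=torch.ones(10)   -- upstream gradient of ones
gi=m:backward(ii,go)
gnuplot.plot({'Input',ii,'+-'},{'Output',oo,'+-'},{'gradInput',gi,'+-'})
gnuplot.grid(true)
```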
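
Evaluating the formula literally can lose precision for strongly negative inputs, because exp(-x_i) overflows. The plain-Lua sketch below contrasts a literal transcription with an algebraically equivalent rearrangement, min(x, 0) - log(1 + exp(-|x|)). Both helper names are illustrative, and this is not a claim about how the module itself computes the value.

```lua
-- Literal transcription of the definition.
local function logsigmoid_naive(x)
  return math.log(1 / (1 + math.exp(-x)))
end

-- Equivalent rearrangement whose exponent is never positive.
local function logsigmoid(x)
  return math.min(x, 0) - math.log(1 + math.exp(-math.abs(x)))
end

for _, x in ipairs({-1000, -1, 0, 1}) do
  print(x, logsigmoid_naive(x), logsigmoid(x))
end
```

For x = -1000 the literal form returns -inf, while the rearranged form returns -1000.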

## LogSoftMax ##

Applies the LogSoftmax function to an n-dimensional input Tensor.

LogSoftmax is defined as f_i(x) = log(1/a exp(x_i)), where a = sum_j exp(x_j).

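```lua
-- Evaluate LogSoftMax and its gradient on 10 random inputs
ii=torch.randn(10)
m=nn.LogSoftMax()
oo=m:forward(ii)
go=torch.ones(10)   -- upstream gradient of ones
gi=m:backward(ii,go)
gnuplot.plot({'Input',ii,'+-'},{'Output',oo,'+-'},{'gradInput',gi,'+-'})
gnuplot.grid(true)
```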
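
Since log(1/a exp(x_i)) = x_i - log(a), the output can be computed without forming the Softmax probabilities first. The plain-Lua sketch below does this over a Lua table; subtracting the maximum before exponentiating is an added stability step that is not part of the definition above, and the name `logsoftmax` is illustrative.

```lua
-- Log-softmax over a Lua table: x_i - log(sum_j exp(x_j)).
local function logsoftmax(x)
  local m = -math.huge
  for _, v in ipairs(x) do m = math.max(m, v) end

  local sum = 0
  for _, v in ipairs(x) do sum = sum + math.exp(v - m) end
  local logsum = m + math.log(sum)   -- log(a), computed stably

  local out = {}
  for i, v in ipairs(x) do out[i] = v - logsum end
  return out
end

local lp = logsoftmax({1, 2, 3})
local total = 0
for _, v in ipairs(lp) do total = total + math.exp(v) end
print(string.format("%.4f %.4f %.4f  sum(exp)=%.4f", lp[1], lp[2], lp[3], total))
-- prints: -2.4076 -1.4076 -0.4076  sum(exp)=1.0000
```

Exponentiating the outputs recovers probabilities that sum to 1, which is why this output can be fed to a criterion expecting log-probabilities.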

## Sigmoid ##

Applies the Sigmoid function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.

Sigmoid is defined as f(x) = 1/(1+exp(-x)).

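```lua
-- Evaluate Sigmoid and its gradient on 100 points in [-5, 5]
ii=torch.linspace(-5,5)
m=nn.Sigmoid()
oo=m:forward(ii)
go=torch.ones(100)   -- upstream gradient of ones, so gi is df/dx
gi=m:backward(ii,go)
gnuplot.plot({'f(x)',ii,oo,'+-'},{'df/dx',ii,gi,'+-'})
gnuplot.grid(true)
```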
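
The derivative of the Sigmoid can be written in terms of its own output, df/dx = f(x) * (1 - f(x)), which peaks at 0.25 for x = 0. A minimal plain-Lua sketch, with illustrative helper names:

```lua
-- Scalar Sigmoid and its derivative expressed through the output.
local function sigmoid(x)
  return 1 / (1 + math.exp(-x))
end

local function sigmoid_grad(x)
  local f = sigmoid(x)
  return f * (1 - f)
end

for _, x in ipairs({-5, 0, 5}) do
  print(string.format("x=%d  f(x)=%.4f  df/dx=%.4f", x, sigmoid(x), sigmoid_grad(x)))
end
```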

## Tanh ##

Applies the Tanh function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.

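```lua
-- Evaluate Tanh and its gradient on 100 points in [-3, 3]
ii=torch.linspace(-3,3)
m=nn.Tanh()
oo=m:forward(ii)
go=torch.ones(100)   -- upstream gradient of ones, so gi is df/dx
gi=m:backward(ii,go)
gnuplot.plot({'f(x)',ii,oo,'+-'},{'df/dx',ii,gi,'+-'})
gnuplot.grid(true)
```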
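
For reference, the hyperbolic tangent is f(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x)), with derivative 1 - f(x)^2. The plain-Lua sketch below writes it out with `math.exp` and checks the identity tanh(x) = 2 * sigmoid(2x) - 1; the helper names are illustrative.

```lua
-- Scalar tanh written out from exponentials (hypothetical helper names).
local function tanh_ref(x)
  local a, b = math.exp(x), math.exp(-x)
  return (a - b) / (a + b)
end

local function tanh_grad(x)
  local f = tanh_ref(x)
  return 1 - f * f
end

local function sigmoid(x)
  return 1 / (1 + math.exp(-x))
end

for _, x in ipairs({-3, 0, 1, 3}) do
  print(string.format("x=%d  tanh=%.4f  2*sigmoid(2x)-1=%.4f  df/dx=%.4f",
    x, tanh_ref(x), 2 * sigmoid(2 * x) - 1, tanh_grad(x)))
end
```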

## ReLU ##

Applies the rectified linear unit (ReLU) function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.

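```lua
-- Evaluate ReLU and its gradient on 100 points in [-3, 3]
ii=torch.linspace(-3,3)
m=nn.ReLU()
oo=m:forward(ii)
go=torch.ones(100)   -- upstream gradient of ones, so gi is df/dx
gi=m:backward(ii,go)
gnuplot.plot({'f(x)',ii,oo,'+-'},{'df/dx',ii,gi,'+-'})
gnuplot.grid(true)
```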
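
The rectified linear unit is f(x) = max(0, x). A minimal plain-Lua sketch of the function and its gradient follows; the helper names are illustrative, and taking the gradient at exactly x = 0 to be 0 is an assumption of this sketch.

```lua
-- Scalar ReLU and its gradient (hypothetical helper names).
local function relu(x)
  return math.max(0, x)
end

-- Assumption: the gradient at exactly x = 0 is taken to be 0.
local function relu_grad(x)
  if x > 0 then return 1 end
  return 0
end

for _, x in ipairs({-3, -0.5, 0, 0.5, 3}) do
  print(string.format("x=%.1f  f(x)=%.1f  df/dx=%d", x, relu(x), relu_grad(x)))
end
```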

## AddConstant ##

Adds a (non-learnable) scalar constant. This module is sometimes useful for debugging purposes: f(x) = x + k, where k is a scalar.

## MulConstant ##

Multiplies input tensor by a (non-learnable) scalar constant. This module is sometimes useful for debugging purposes: f(x) = k * x, where k is a scalar.
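
Chained together, the two constant operations form a fixed affine map. As a hedged illustration in plain Lua (closures standing in for the modules, with hypothetical names), adding 1 and then multiplying by 0.5 rescales values from [-1, 1] to [0, 1]:

```lua
-- Closures mirroring f(x) = x + k and f(x) = k * x (stand-ins, not the nn modules).
local function add_constant(k)
  return function(x) return x + k end
end

local function mul_constant(k)
  return function(x) return k * x end
end

-- Compose: first shift by 1, then scale by 0.5.
local shift, scale = add_constant(1), mul_constant(0.5)
local function rescale(x)
  return scale(shift(x))
end

for _, x in ipairs({-1, 0, 1}) do
  print(string.format("x=%d  rescaled=%.2f", x, rescale(x)))
end
```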