Elliott Activation Function #14
Comments
Thanks for the code, I will get this added in Encog 3.1.
Hey cool, thanks for looking into this! I'd be curious to know if you manage to make it work properly with backpropagation methods, as I've had no luck with it. I don't see why not, though, as I've seen it used successfully in various research reports. If it does work, it might be a useful alternative to tanh for intensive training tasks. I think it can take more iterations than tanh to meet a training condition, but each iteration should take much less CPU time. Geoffroy
Here's another page which mentions this function, as well as a variant that is analogous to the logistic sigmoid this time: I've done more tests with it and got some interesting results. Due to the shape of the curve of these activation functions, it is important to rescale the output to [0.1, 0.9] instead of [0, 1]. The best results I've had so far were with an evolutionary algorithm on the XOR dataset. Using a combination of the Elliott function and its sigmoid variant, the number of iterations was 5 times lower than with TANH+SIGMOID, and training was less prone to getting trapped in local minima. Moreover, the CPU time per iteration was cut in half. This difference was not visible on all the datasets I've tried, but it does suggest that these functions are worth exploring when working with evolutionary algorithms such as GA or NEAT.
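For illustration, the sigmoid-like variant and the [0.1, 0.9] rescaling mentioned above could be sketched as follows (a minimal sketch; the class and method names are mine, not from Encog or any paper):

```java
public final class ElliottSigmoid {

    // Sigmoid-like Elliott variant: maps any real input into (0, 1).
    // f(x) = 0.5 * x / (1 + |x|) + 0.5
    public static double activate(double x) {
        return 0.5 * x / (1.0 + Math.abs(x)) + 0.5;
    }

    // One way to do the rescaling mentioned above: map a value in [0, 1]
    // into [0.1, 0.9], since the curve approaches its asymptotes slowly.
    public static double rescale(double y) {
        return 0.1 + 0.8 * y;
    }

    public static void main(String[] args) {
        System.out.println(activate(0.0));   // 0.5, like the logistic sigmoid
        System.out.println(activate(100.0)); // ~0.995, approaches 1 slowly
    }
}
```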
That is pretty interesting; I read the articles on the activation function. I added your code to Encog, and I also was not able to get any sort of propagation (derivative-based) training to work. I plugged the activation function from your code into R and came up with a different derivative. So I am THINKING there might be some disconnect there. I will take a look.
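One way to check a suspected derivative disconnect like this is to compare the analytic derivative against a finite difference. A small sketch (names are mine; for the symmetric form f(x) = x / (1 + |x|), the derivative works out to 1 / (1 + |x|)^2):

```java
public final class ElliottDerivativeCheck {

    // Symmetric Elliott function.
    public static double f(double x) {
        return x / (1.0 + Math.abs(x));
    }

    // Analytic derivative: d/dx [ x / (1 + |x|) ] = 1 / (1 + |x|)^2
    public static double df(double x) {
        double d = 1.0 + Math.abs(x);
        return 1.0 / (d * d);
    }

    public static void main(String[] args) {
        // Compare against a central finite difference at a few points.
        double h = 1e-6;
        for (double x : new double[] {-2.0, -0.5, 0.0, 0.5, 2.0}) {
            double numeric = (f(x + h) - f(x - h)) / (2.0 * h);
            if (Math.abs(numeric - df(x)) > 1e-5) {
                throw new AssertionError("derivative mismatch at x=" + x);
            }
        }
        System.out.println("analytic derivative matches finite differences");
    }
}
```

If propagation training diverges while this check passes, the bug is more likely in how the derivative is wired into the trainer than in the formula itself.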
Okay, I added the class. I also got the derivative working; it now performs just as well as tanh in my initial testing. I also added this to the workbench so that you can easily toggle between the Elliott chart and the tanh chart and see the minor difference. I actually split this into two files:
ActivationElliott - similar to Encog's sigmoid function, with range [0, 1]
ActivationElliottSymmetric - similar to Encog's tanh function, with range [-1, 1]
I did not do any performance benchmarks; however, I bet it is much faster.
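The "minor difference" between the symmetric Elliott curve and tanh can be seen by tabulating both at a few points. A quick sketch (this is illustrative only, not the Encog workbench code):

```java
public final class ElliottVsTanh {

    // Symmetric Elliott: same range and sign behavior as tanh.
    public static double elliottSymmetric(double x) {
        return x / (1.0 + Math.abs(x));
    }

    public static void main(String[] args) {
        // Both curves are odd, bounded in (-1, 1), and have slope 1 at the
        // origin, but Elliott approaches its asymptotes much more slowly.
        for (double x = -3.0; x <= 3.0; x += 1.0) {
            System.out.printf("x=%5.1f  elliott=%7.4f  tanh=%7.4f%n",
                    x, elliottSymmetric(x), Math.tanh(x));
        }
    }
}
```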
It will be interesting to see this in a benchmark. |
Hey, thanks for integrating this to Encog! I see that you've also included a parameter to control the slope, good thinking! I'll do more tests once I get some spare time. Geoffroy |
These are quite fast! I will do an official benchmark soon. Thanks for adding them, very useful. I will close the issue once I add them to C# too. |
Implemented in both Java and C#, closing issue. |
Here's the code for the Elliott activation function, in case someone is interested. It is not as popular as tanh and sigmoid, but I've seen it used in a few papers.
The implementation is based on this report:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.46.7204&rep=rep1&type=pdf
Since I discovered that something like 70% of the training time is spent in Math.tanh or Math.exp, I was looking for a cheap alternative. The main advantage of this activation function is that it is very fast to compute. It is bounded between -1 and 1 like tanh, but it reaches those values more slowly, so it might be more suitable for classification tasks.
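The speed argument comes down to Elliott needing only an absolute value, an addition, and a division, while tanh involves exp-class math. A crude micro-benchmark sketch of this comparison (names are mine; actual timings vary widely by JVM and hardware, so treat any numbers as rough):

```java
public final class ActivationBenchmark {

    // Symmetric Elliott: cheap arithmetic only, no transcendental calls.
    public static double elliott(double x) {
        return x / (1.0 + Math.abs(x));
    }

    static double time(String label, java.util.function.DoubleUnaryOperator f) {
        double sink = 0.0;
        long start = System.nanoTime();
        for (int i = 0; i < 5_000_000; i++) {
            sink += f.applyAsDouble(i * 1e-6 - 2.5);
        }
        long elapsed = System.nanoTime() - start;
        // Print sink so the JIT cannot eliminate the loop entirely.
        System.out.println(label + ": " + elapsed / 1e6 + " ms (sink=" + sink + ")");
        return sink;
    }

    public static void main(String[] args) {
        time("elliott", ActivationBenchmark::elliott);
        time("tanh   ", Math::tanh);
    }
}
```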
I've had very mixed results with this implementation so far. Used with RPROP on an XOR problem, it seems to perform quite badly: it takes many iterations, gets stuck in local minima, or fails to go below high MSE values. This is quite unexpected, so I'm wondering if maybe there's a mistake somewhere in the derivative.
On the other hand, I've also observed excellent results with evolutionary algorithms like GA (and my version of PSO), often with very fast convergence compared to tanh and sigmoid. That's why I'm putting this code here, in case it might be useful to someone else.