ImportanceDataSet with BackpropTrainer results in IndexError #168
Changed `_calcDerivs` to print debugging information:

```python
def _calcDerivs(self, seq):
    """Calculate error function and backpropagate output errors to yield
    the gradient."""
    self.module.reset()
    for sample in seq:
        self.module.activate(sample[0])
    error = 0
    ponderation = 0.
    print("seq shape: (%d x %d)" % (len(seq), len(seq[0])))
    print("\n\n")
    for offset, sample in reversed(list(enumerate(seq))):
        # need to make a distinction here between datasets containing
        # importance, and others
        target = sample[1]
        print("target: %s\n outputbuffer: %s\noffset: %s\n\n"
              % (target, self.module.outputbuffer, offset))
        outerr = target - self.module.outputbuffer[offset]
        if len(sample) > 2:
            importance = sample[2]
            error += 0.5 * dot(importance, outerr ** 2)
            ponderation += sum(importance)
            self.module.backActivate(outerr * importance)
        else:
            error += 0.5 * sum(outerr ** 2)
            ponderation += len(target)
            # FIXME: the next line keeps arac from producing NaNs. I don't
            # know why that is, but somehow the __str__ method of the
            # ndarray class fixes something.
            str(outerr)
            self.module.backActivate(outerr)
    return error, ponderation
```

and got the following output:
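The importance branch in `_calcDerivs` scales each output error component before it is accumulated and backpropagated. A minimal standalone NumPy sketch of just the weighted-error and ponderation arithmetic (the sample values here are made up, not from the issue):

```python
import numpy as np

# Hypothetical per-sample values, mirroring one iteration of the loop above.
target = np.array([1.0, 0.0, 0.5])
output = np.array([0.8, 0.1, 0.4])      # stands in for module.outputbuffer[offset]
importance = np.array([2.0, 1.0, 0.5])  # per-output importance weights

outerr = target - output
# Weighted half squared error, as in: error += 0.5 * dot(importance, outerr ** 2)
error = 0.5 * np.dot(importance, outerr ** 2)
# With importance, ponderation accumulates total weight instead of len(target)
ponderation = importance.sum()

print(error, ponderation)
```

Note that with an all-ones importance vector this reduces exactly to the unweighted branch (`0.5 * sum(outerr ** 2)` and `len(target)`).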
Fixed it. I forgot to call

```python
# Imports assumed for this snippet (PyBrain and NumPy):
from pybrain.structure import (FeedForwardNetwork, SoftmaxLayer, SigmoidLayer,
                               LinearLayer, FullConnection)
from pybrain.datasets import ImportanceDataSet
from pybrain.supervised.trainers import BackpropTrainer
from numpy import ones

def create_neural_net(training, labels, weights=None, T=10, silent=False):
    '''
    Create and train a neural net on the training data, given the actual labels
    '''
    input_units = len(training[0])
    output_units = len(labels[0])
    n = len(training)

    net = FeedForwardNetwork()
    layer_in = SoftmaxLayer(input_units)
    layer_hidden = SigmoidLayer(1000)
    layer_hidden2 = SigmoidLayer(50)
    layer_out = LinearLayer(output_units)
    net.addInputModule(layer_in)
    net.addModule(layer_hidden)
    net.addModule(layer_hidden2)
    net.addOutputModule(layer_out)
    net.addConnection(FullConnection(layer_in, layer_hidden))
    net.addConnection(FullConnection(layer_hidden, layer_hidden2))
    net.addConnection(FullConnection(layer_hidden2, layer_out))
    net.sortModules()

    training_data = ImportanceDataSet(input_units, output_units)  # was: SupervisedDataSet(input_units, output_units)
    for i in xrange(n):
        # print len(training[i])  # prints 148
        # print len(labels[i])    # prints 13
        training_data.newSequence()
        training_data.addSample(
            training[i], labels[i],
            importance=ones(output_units) * (weights[i] if weights is not None else 1)
        )  # was: training_data.appendLinked(training[i], labels[i])

    trainer = BackpropTrainer(net, training_data)
    for i in xrange(T):
        if not silent: print "Training %d" % (i + 1)
        error = trainer.train()
        if not silent: print net.activate(training[0]), labels[0]
        if not silent: print "Training iteration %d. Error: %f." % (i + 1, error)
    return net
```
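The `weights` argument above is meant to carry per-point cluster membership probabilities from the Gaussian mixture model. A self-contained sketch of how such weights can be derived as component responsibilities, using a hypothetical 1-D two-component mixture in pure NumPy (the means, variances, and priors are made up for illustration):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    # Density of N(mu, sigma^2) evaluated at each element of x.
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Hypothetical mixture: two components with equal priors (0.5 each).
x = np.array([0.1, 0.9, 2.0, -1.5])
p0 = 0.5 * gaussian_pdf(x, 0.0, 1.0)   # component 0: mean 0
p1 = 0.5 * gaussian_pdf(x, 2.0, 1.0)   # component 1: mean 2

# Responsibility of component 0 for each point; this is the per-sample
# training weight when fitting the net for cluster 0.
weights = p0 / (p0 + p1)
print(weights)
```

Points near a component's mean get a responsibility close to 1 for that component, so every point still contributes to every cluster's net, just with graded influence.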
I have a dataset that I am clustering with a Gaussian mixture model, and I then want to train a neural network for each cluster. Each net should be trained on every point in the dataset, weighted by the probability that the point belongs to the cluster that net is being trained for.
Originally, I was not weighting the training data and it worked fine:
But now when I try to weight the data points:
I get the following error:
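For reference, the IndexError in `_calcDerivs` can arise when `seq` holds more samples than the module's `outputbuffer` has rows, e.g. if samples from several sequences are appended without starting a new sequence. A toy reproduction of just that indexing pattern (plain NumPy, not PyBrain):

```python
import numpy as np

seq = [([0.0], [1.0])] * 5           # five (input, target) samples in one "sequence"
outputbuffer = np.zeros((3, 1))      # buffer sized for only three activations

try:
    for offset, sample in reversed(list(enumerate(seq))):
        outerr = np.asarray(sample[1]) - outputbuffer[offset]
except IndexError as exc:
    print("IndexError:", exc)        # offset 4 exceeds the 3-row buffer
```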