
Why is it necessary to initialize the read vector and memory? #9

Closed
dragen1860 opened this issue May 25, 2018 · 1 comment

dragen1860 commented May 25, 2018

Dear author:
I found that you initialize the read vector and memory as:

    self.register_buffer('mem_bias', torch.Tensor(N, M))

    # Initialize memory bias
    stdev = 1 / (np.sqrt(N + M))
    nn.init.uniform_(self.mem_bias, -stdev, stdev)

and

    init_r_bias = torch.randn(1, M).to('cuda') * 0.01
    # the initial value of read vector is not optimized.
    self.register_buffer("read{}_bias".format(self.num_read_heads), init_r_bias)

I wonder whether the initialization scheme makes a big difference,
or whether I could just initialize everything with torch.zeros()?

loudinthecloud (Owner) commented May 26, 2018

Since the memory is content-addressable, it must be initialized to some bias value. This ensures that all (or at least some) cells can be addressed specifically, if needed.

The NTM paper mentions it in section 4:

For NTM the previous state of the controller, the value of the previous read vectors, and the contents of the memory were all reset to bias values.
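
For illustration, here is a minimal sketch of why an all-zero memory is problematic for content-based addressing. This is not the repository's actual code; the helper `content_weights`, the sharpening factor `beta`, and the sizes `N`, `M` are assumptions made up for this example:

    # Minimal sketch (assumed names/sizes, not the repo's implementation).
    import torch
    import torch.nn.functional as F

    N, M = 8, 4  # number of memory cells, cell width (hypothetical)

    def content_weights(memory, key, beta=5.0):
        # NTM-style content addressing: cosine similarity between the key and
        # every memory cell, sharpened by beta and normalized with softmax.
        sim = F.cosine_similarity(memory, key.expand_as(memory), dim=1)
        return F.softmax(beta * sim, dim=0)

    key = torch.randn(1, M)

    zero_mem = torch.zeros(N, M)
    stdev = 1 / (N + M) ** 0.5
    bias_mem = torch.empty(N, M).uniform_(-stdev, stdev)

    # With an all-zero memory every cell is equally (and degenerately) similar
    # to the key, so the read weights collapse to a uniform distribution.
    print(content_weights(zero_mem, key))  # roughly 0.125 for each of the 8 cells

    # With small, distinct bias values the cells can already be told apart at
    # the first time step, before anything has been written.
    print(content_weights(bias_mem, key))  # non-uniform weights

With zeros, every cell looks identical to any key, so the content weights start out uniform; small bias values break that symmetry, which is consistent with the paper's choice of resetting the memory and read vectors to bias values.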
