
Why is it necessary to initialize the read vector and memory? #9

Closed
dragen1860 opened this issue May 25, 2018 · 1 comment

dragen1860 commented May 25, 2018

Dear author:
I found that you initialize the read vector and memory as:

    self.register_buffer('mem_bias', torch.Tensor(N, M))

    # Initialize memory bias
    stdev = 1 / (np.sqrt(N + M))
    nn.init.uniform_(self.mem_bias, -stdev, stdev)

and

    init_r_bias = torch.randn(1, M).to('cuda') * 0.01
    # the initial value of read vector is not optimized.
    self.register_buffer("read{}_bias".format(self.num_read_heads), init_r_bias)

I wonder whether the initialization scheme makes a big difference,
or whether I could just initialize everything with torch.zeros()?

loudinthecloud (Owner) commented May 26, 2018

Since the memory is content-addressable, it must be initialized to some bias value. This ensures that all (or at least some) cells can be addressed specifically, if needed.

The NTM paper mentions it in section 4:

For NTM the previous state of the controller, the value of the previous read vectors, and the contents of the memory were all reset to bias values.
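
For illustration, here is a minimal sketch of why an all-zero memory is problematic for content-based addressing. This is not the repository's actual code; the helper `content_weights`, the sharpening factor `beta`, and the sizes `N`, `M` are assumptions made up for this example:

    # Minimal sketch (assumed names/sizes, not the repo's implementation).
    import torch
    import torch.nn.functional as F

    N, M = 8, 4  # number of memory cells, cell width (hypothetical)

    def content_weights(memory, key, beta=5.0):
        # NTM-style content addressing: cosine similarity between the key and
        # every memory cell, sharpened by beta and normalized with softmax.
        sim = F.cosine_similarity(memory, key.expand_as(memory), dim=1)
        return F.softmax(beta * sim, dim=0)

    key = torch.randn(1, M)

    zero_mem = torch.zeros(N, M)
    stdev = 1 / (N + M) ** 0.5
    bias_mem = torch.empty(N, M).uniform_(-stdev, stdev)

    # With an all-zero memory every cell is equally (and degenerately) similar
    # to the key, so the read weights collapse to a uniform distribution.
    print(content_weights(zero_mem, key))  # roughly 0.125 for each of the 8 cells

    # With small, distinct bias values the cells can already be told apart at
    # the first time step, before anything has been written.
    print(content_weights(bias_mem, key))  # non-uniform weights

With zeros, every cell looks identical to any key, so the content weights start out uniform; small bias values break that symmetry, which is consistent with the paper's choice of resetting the memory and read vectors to bias values.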
