Negative phase in Boltzmann machine #51

Closed
gorkamunoz opened this issue May 8, 2017 · 6 comments

Comments

@gorkamunoz
Collaborator

I was trying to train a Boltzmann machine the way it is done in the paper by Benedetti et al. (arXiv:1609.02542), and I am wondering how to calculate equations (9) and (10) correctly. My problem comes when calculating the ensemble average with respect to the distribution P(z) (the negative phase).
Does this distribution evolve at the same rate as the coupling parameters J and h change through equations (7) and (8)? In other words, each time I update the couplings, should I apply these changes to the model lattice and calculate a new spin distribution?

@apozas
Collaborator

apozas commented May 11, 2017

I see what you are asking, but I do not understand the "calculate a new spin distribution" part. When training, the data that you know is precisely the spins (the zs), and at each step you update the couplings J and h. To do this, you need to build the distributions corresponding to the data and to the model.

The one coming from the data (the positive phase one) seems like it should be somehow "given", or at least this is what it seems to me. It looks like a distribution over all possible spin configurations that is nonzero only for the configurations in the training set.
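One possible reading of this "given" distribution is the empirical distribution of the training set, where each configuration gets a probability equal to its relative frequency in the data. A minimal sketch under that assumption (the function name `empirical_distribution` and the toy data are hypothetical, not from the paper):

```python
from collections import Counter

def empirical_distribution(training_set):
    """Return Q(z) as {configuration: relative frequency in the data}."""
    counts = Counter(tuple(z) for z in training_set)
    total = sum(counts.values())
    return {z: c / total for z, c in counts.items()}

# Hypothetical toy data: three spins, each +1 or -1.
data = [(1, 1, -1), (1, 1, -1), (-1, -1, 1), (1, -1, 1)]
Q = empirical_distribution(data)

# Configurations that never appear in the data are simply absent,
# i.e. they implicitly have Q(z) = 0.
for z, q in Q.items():
    print(z, q)
```

Since Q(z) depends only on the data, it can be computed once before training starts.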

Then, the negative phase one is built in the following manner: at each step you get some Js and hs. Then you build your function E(z), where now the zs are the free variables. Next, you build the distribution P(z) = 1/Z * exp(-beta E(z)) for some beta, and finally you sample from that distribution.
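For a handful of spins this construction can be written out by brute-force enumeration. The sketch below assumes an Ising-type energy E(z) = -sum_{i<j} J_ij z_i z_j - sum_i h_i z_i with J stored in the upper triangle; that sign convention, the function names, and the toy values of J and h are assumptions for illustration, so check them against the paper before relying on them.

```python
import itertools
import math
import random

def energy(z, J, h):
    """Assumed energy: E(z) = -sum_{i<j} J[i][j] z_i z_j - sum_i h[i] z_i."""
    n = len(z)
    pair = sum(J[i][j] * z[i] * z[j] for i in range(n) for j in range(i + 1, n))
    field = sum(h[i] * z[i] for i in range(n))
    return -pair - field

def boltzmann_distribution(J, h, beta=1.0):
    """Return P(z) = exp(-beta * E(z)) / Z over all 2^n spin configurations."""
    configs = list(itertools.product((-1, 1), repeat=len(h)))
    weights = [math.exp(-beta * energy(z, J, h)) for z in configs]
    Z = sum(weights)  # partition function
    return {z: w / Z for z, w in zip(configs, weights)}

# Hypothetical couplings for two spins.
J = [[0.0, 0.5], [0.0, 0.0]]
h = [0.1, -0.2]
P = boltzmann_distribution(J, h)

# Draw samples from P(z); these are the "model" configurations.
samples = random.choices(list(P), weights=list(P.values()), k=1000)
```

Enumeration costs 2^n terms, so it only works for small systems; beyond that P(z) has to be sampled without ever computing Z.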

So for me it looks more difficult to really understand the positive phase distribution.

Maybe I didn't understand your question well; was this helpful?

@gorkamunoz
Collaborator Author

gorkamunoz commented May 12, 2017

Ok, so my problem comes when sampling from the distribution, as you indicate in the last paragraph. Maybe I am thinking in too 'physical' a way, but to me, once I update the Js and the hs, I should let the system evolve until it reaches equilibrium, and then sample from the distribution of spins.

But this equilibration process seems quite inefficient, so I guess I am doing something wrong?

You can rephrase all my problems as follows: are equations (9) and (10) constants, or does <>_M change? (To my understanding, <>_D is constant, as it comes from the given training set.)

@apozas
Collaborator

apozas commented May 12, 2017

Hmmm... I am not very convinced that any "evolution of the physical system" is actually taking place (despite the fact that we might have discussed it a few weeks ago). The process of learning, as I see it, should be composed of these steps:

1.- Compute Q(z) with the training points (this distribution will not change throughout the process). You can already compute the <>_D averages (the positive phase contribution) if you want, since they will indeed not change throughout the process. You can either sample according to the probability distribution, or just compute <A>_D = \sum_z A(z) Q(z).
2.- Initialize the Js and hs (i.e., give them an initial arbitrary value)
3.- Build the distribution P(z) = 1/Z * exp(-beta * E(z)), with E(z) the energy function corresponding to the current Js and hs.
4.- Compute the negative phase contributions <>_M (again, either by sampling or just computing it "analytically")
5.- Compute the new Js and hs according to Eqs. 9 and 10
6.- Go to step 3 until convergence (i.e., until P(z) ~ Q(z))
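The six steps above can be sketched for a tiny fully visible model, computing both phases "analytically" by enumeration. This is only an illustration under stated assumptions: the energy convention E(z) = -sum_{i<j} J_ij z_i z_j - sum_i h_i z_i, the plain "positive phase minus negative phase" update with a learning rate `lr`, and all function names are assumptions here, not the paper's exact Eqs. 9 and 10.

```python
import itertools
import math
from collections import Counter

def energy(z, J, h):
    """Assumed energy: E(z) = -sum_{i<j} J[i][j] z_i z_j - sum_i h[i] z_i."""
    n = len(z)
    pair = sum(J[i][j] * z[i] * z[j] for i in range(n) for j in range(i + 1, n))
    return -pair - sum(h[i] * z[i] for i in range(n))

def model_distribution(J, h, beta):
    """Step 3: P(z) = exp(-beta * E(z)) / Z by enumerating all configurations."""
    configs = list(itertools.product((-1, 1), repeat=len(h)))
    weights = [math.exp(-beta * energy(z, J, h)) for z in configs]
    Z = sum(weights)
    return {z: w / Z for z, w in zip(configs, weights)}

def moments(dist, n):
    """Return (<z_i>, <z_i z_j>) under a distribution given as {config: prob}."""
    first = [sum(p * z[i] for z, p in dist.items()) for i in range(n)]
    second = [[sum(p * z[i] * z[j] for z, p in dist.items()) for j in range(n)]
              for i in range(n)]
    return first, second

def train(data, beta=1.0, lr=0.1, steps=2000):
    n = len(data[0])
    # Step 1: empirical distribution and positive phase, computed once.
    counts = Counter(tuple(z) for z in data)
    Q = {z: c / len(data) for z, c in counts.items()}
    d1, d2 = moments(Q, n)
    # Step 2: arbitrary initial couplings (zeros here).
    J = [[0.0] * n for _ in range(n)]
    h = [0.0] * n
    for _ in range(steps):
        # Steps 3 and 4: rebuild P(z) from the current J, h and
        # recompute the negative phase; this part changes every iteration.
        m1, m2 = moments(model_distribution(J, h, beta), n)
        # Step 5: assumed update, positive phase minus negative phase.
        for i in range(n):
            h[i] += lr * (d1[i] - m1[i])
            for j in range(i + 1, n):
                J[i][j] += lr * (d2[i][j] - m2[i][j])
    # Step 6 is approximated by a fixed number of iterations.
    return J, h

# Hypothetical training set for two spins.
data = [(1, 1)] * 4 + [(-1, -1)] * 2 + [(1, -1), (-1, 1)]
J, h = train(data)
P = model_distribution(J, h, beta=1.0)
```

In this sketch the positive phase is computed once outside the loop, while the negative phase is recomputed from scratch after every update of J and h.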

@gorkamunoz
Collaborator Author

Yes, I had a similar idea. However, my problem comes in step 3! P(z) is a function of the couplings J and h, but also of a certain configuration of spins, which you are then sampling in step 4. How do we define these spins? It is the evolution of these spins that bothers me, as in a physical system they should change as you change the couplings between them.
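One way to make this "evolution" concrete is a Markov chain: the spin configuration is the state of the chain, and each sweep resamples every spin from its conditional distribution under the current couplings. Below is a minimal Gibbs-sampling sketch, assuming the energy E(z) = -sum_{i<j} J_ij z_i z_j - sum_i h_i z_i with J stored in the upper triangle; the function names and the number of sweeps between samples are hypothetical choices, and whether a few sweeps are enough to re-equilibrate depends on the problem.

```python
import math
import random

def local_field(i, z, J, h):
    """Field acting on spin i; J is read from the upper triangle (J[a][b], a < b)."""
    return h[i] + sum(J[min(i, j)][max(i, j)] * z[j]
                      for j in range(len(z)) if j != i)

def gibbs_sweep(z, J, h, beta, rng):
    """Resample every spin once from P(z_i | all other spins), in place."""
    for i in range(len(z)):
        p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * local_field(i, z, J, h)))
        z[i] = 1 if rng.random() < p_up else -1
    return z

def sample_model(z, J, h, beta=1.0, n_samples=1000, sweeps_between=2, rng=None):
    """Collect configurations from a chain that starts at z and is updated in place."""
    rng = rng or random.Random()
    samples = []
    for _ in range(n_samples):
        for _ in range(sweeps_between):
            gibbs_sweep(z, J, h, beta, rng)
        samples.append(tuple(z))
    return samples

# Hypothetical two-spin example: estimate the negative-phase term <z_0 z_1>_M.
rng = random.Random(0)
J = [[0.0, 0.5], [0.0, 0.0]]
h = [0.0, 0.0]
z = [1, -1]  # arbitrary starting configuration
samples = sample_model(z, J, h, beta=1.0, n_samples=5000, rng=rng)
corr = sum(a * b for a, b in samples) / len(samples)
```

Because `z` is modified in place, the final state of one call can be passed as the starting point after the next update of J and h, so the chain only has to adapt to slightly changed couplings instead of equilibrating from scratch each time.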

@apozas
Collaborator

apozas commented May 12, 2017 via email

@apozas
Collaborator

apozas commented May 24, 2017

I guess after the discussion in the meeting this issue can be closed.

@apozas apozas closed this as completed May 24, 2017