Pretraining with full_det=True
#27
Comments
The reason is deciding where to spend time -- in developing code, testing it, and in computational time. Pretraining is just to create some initial state that is vaguely close to the ground state (i.e., within tens of Hartrees). There's a deliberate choice for pretraining to be both simple and quite crude (so that the optimisation doesn't need to break symmetry, for example). This is also why we use a small basis by default. Better pretraining may or may not improve convergence -- I think there are (quickly) diminishing returns. Note that this is not the only simplification we make during pretraining. There's also some discussion in (e.g.) #14 and maybe also in our papers.
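For context, here is a minimal sketch of the kind of small-basis UHF calculation that typically serves as a pretraining target, using PySCF directly. The molecule, geometry, and basis name below are placeholders, not the repository's defaults or its configuration API.

```python
# Sketch only: a small-basis UHF calculation of the kind used as a
# pretraining target. Molecule, geometry, and basis are placeholders.
from pyscf import gto, scf

mol = gto.M(atom='Li 0 0 0; H 0 0 1.6', basis='sto-3g', spin=0, unit='angstrom')
mf = scf.UHF(mol)
mf.kernel()

# mf.mo_coeff holds two sets of MO coefficients (alpha and beta) that can be
# evaluated at electron positions to build pretraining targets.
print(mf.e_tot)
```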
Thanks for the response! I see. Though, I was asking because it seemed more involved to pad everything with zeros than to train against the full matrix.
Except we are training the FermiNet to match "the product of the two (spin-up and spin-down) matrices obtained by Hartree Fock". The determinant of a block-diagonal matrix is just the product of the determinants of the blocks.
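A quick numerical check of that identity (a standalone sketch, not repository code):

```python
# Check: det of a block-diagonal matrix equals the product of the
# determinants of its blocks.
import numpy as np

rng = np.random.default_rng(0)
n_up, n_down = 3, 2
A = rng.normal(size=(n_up, n_up))      # spin-up block
B = rng.normal(size=(n_down, n_down))  # spin-down block

full = np.zeros((n_up + n_down, n_up + n_down))
full[:n_up, :n_up] = A
full[n_up:, n_up:] = B

assert np.allclose(np.linalg.det(full), np.linalg.det(A) * np.linalg.det(B))
```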
There seems to be a small misunderstanding. Maybe this figure helps clear things up.
Note that we train against the UHF state by default rather than the RHF state. I am not convinced by your spin labels on the right-hand side -- line 90 selects the values of the alpha spin-orbitals evaluated at the positions of the alpha electrons, and similarly for the beta spin-orbitals and electrons. More importantly, training a neural network as a function predictor by matching outputs behaves quite poorly (e.g. arXiv:1706.04859) -- the pretraining algorithm is pretty crude and is really designed to give only a starting point which doesn't encounter numerical problems. FermiNet orbitals can be quite different from Hartree-Fock orbitals, so accurately representing Hartree-Fock from pretraining isn't a priority.
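As a rough illustration of the scheme described above: the HF spin-orbitals are evaluated at the same-spin electron positions to form two blocks, the off-diagonal entries of the target are zero, and the full FermiNet orbital matrix is fitted to that padded target with a squared error. This is a sketch under those assumptions; the helper names and shapes are hypothetical, not the repository's API.

```python
# Illustrative sketch of the zero-padded pretraining target and loss.
# Names and shapes are hypothetical, not the repository's API.
import numpy as np

def padded_hf_target(phi_up, phi_down):
    """Zero-padded block-diagonal target.

    phi_up:   (n_up, n_up)     HF alpha orbitals at alpha electron positions
    phi_down: (n_down, n_down) HF beta orbitals at beta electron positions
    """
    n_up, n_down = phi_up.shape[0], phi_down.shape[0]
    n = n_up + n_down
    target = np.zeros((n, n))
    target[:n_up, :n_up] = phi_up
    target[n_up:, n_up:] = phi_down
    return target

def pretrain_loss(ferminet_orbitals, phi_up, phi_down):
    """Mean squared error between the full FermiNet orbital matrix and the
    zero-padded HF target (matching outputs, not the wavefunction itself)."""
    target = padded_hf_target(phi_up, phi_down)
    return np.mean((ferminet_orbitals - target) ** 2)
```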
Hi,
I noticed that the default parameter for `full_det` is `True`. So, I would expect that during pretraining one also fits the dense Slater determinant obtained by Hartree Fock. However, it looks like the code only retrieves two blocks from the Slater determinant and fits them to the diagonal blocks of FermiNet while fitting the rest of FermiNet's orbitals to 0. Is there a good reason to train like this?
Wouldn't a better approach be to fit FermiNet's orbitals to the product of the two (spin-up and spin-down) matrices obtained by Hartree Fock?