Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve consistency of the lattice API #715

Merged
merged 16 commits into from May 20, 2021
Merged

Improve consistency of the lattice API #715

merged 16 commits into from May 20, 2021

Conversation

femtobit
Copy link
Collaborator

@femtobit femtobit commented May 17, 2021

Since I've started to actively use Lattice in my own code recently, I have noticed a number of things in the interface that I think can be polished a bit.

In order to not burden you (specifically @chrisrothUT and @attila-i-szabo) with a lot of change requests in subsequent PRs, I've implemented several of the API changes I would like to see myself and propose them here as PR so we can discuss this early (also picking up some of the things that were left open when merging PR #702) and have a solid base for finalizing the symmetries (which this PR just moves around).

While it may look like a lot of changes, this PR is mostly just rearrangement and renaming with the aim of making the lattice interface more intuitive and also making the Lattice implementation easier to follow.

ID, position, and basis coords

Most importantly, I am trying to have the API use consistent terms for the different ways of referring to a lattice site or its location as mentions of position, coordinates, cell, label, etc. can easily get confusing otherwise.

Specifically, I propose the following naming convention which is described by the new Lattice docstring:

A lattice built by periodic arrangement of a given unit cell.
The lattice is represented as a Bravais lattice with (:code:`basis_vectors`)
:math:`\{a_d\}_{d=1}^D` (where :math:`D = \mathtt{ndim}` is the dimension of the lattice)
and a unit cell consisting of one or more sites,
The positions of those sites within the unit cell can be specified by the :code:`site_offsets`
parameter. The :code:`extent` is a array where :code:`extent[d]` specifies the number of
times each unit cell is translated along direction :math:`d`.
The full lattice is then generated by placing a site at each of the points
.. math::
R_{rq} = \sum_{d=1}^D r_d a_d + b_q \in \mathbb R^D
where :math:`r_d \in \{1, \ldots, \mathtt{extent}[d]\}` and :math:`b_q = \mathtt{site\_offsets}[q]`.
We also refer to :math:`q` as the `label` of the site within the unit cell.
The lattice class supports three ways of addressing a specific lattice site:
id
An integer index that is used to identify the site in :code:`self.edges()` and
also corresponds to the index of the corresponding site in sequences like
:code:`self.nodes()`, :code:`self.positions` or :code:`self.basis_coords`.
positions
Real-space position vector :math:`R_{rq}` as defined above, which is available from
:func:`~netket.graph.Lattice.positions` and can be resolved into an id via
:func:`~netket.graph.Lattice.id_from_position`.
basis coordinates
where each site is specified by a vector :code:`[r1, ..., rD, q]`
with :math:`r` being the integer vector of length :code:`ndim` specifying the
cell position as multiples of the primitive vectors and the site label :math:`q`
giving the number of the site within the unit cell.
Basis coordinates are available from :func:`~netket.graph.Lattice.basis_coords` and
can be resolved into an id via :func:`~netket.graph.Lattice.id_from_basis_coords`.

I think by using these (or similar) terms consistently in the interface and having a symmetrical naming convention for look-up functions (id_from_position, id_from_basis_coords) the API becomes much easier to use.

Hashing / position-based lookup

As discussed in #702 (comment), I've removed the references to "hashing" (as this is mostly an implementation detail of the dict lookup) and also extracted the id_from_position function (which is generally useful and thus should not be hidden in the internal API).

I hope these changes are not causing too many merge conflicts in anyone's code at the moment, otherwise I apologize, but I'd be very happy to get the Lattice interface polished before we finalize our work on the symmetries.

Let me know what you think.

@femtobit femtobit changed the title Improve consistency of the lattice API RFC: Improve consistency of the lattice API May 17, 2021
@github-actions
Copy link

Hello and thanks for your Contribution!
I will be building previews of the updated documentation at the following link:
https://netket.github.io/netket/preview/lattice-api

Once the PR is closed or merged, the preview will be automatically deleted.

Copy link
Collaborator Author

@femtobit femtobit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some open questions from my side.

netket/graph/lattice.py Outdated Show resolved Hide resolved
netket/graph/lattice.py Show resolved Hide resolved
Copy link
Collaborator

@attila-i-szabo attila-i-szabo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks for the refactoring, I like the language changes and how some logic got factored out, so it will be easier to move them to their final place. I only have one question before I'd approve.

I'd like to push this through quickly so there would be a version of Lattice and the group classes that remains stable for at least a day, so I can get started without worrying about merge conflicts...

netket/graph/lattice.py Outdated Show resolved Hide resolved
netket/graph/lattice.py Show resolved Hide resolved
netket/graph/lattice.py Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented May 17, 2021

Codecov Report

Merging #715 (6e22ce5) into master (3dc07d8) will increase coverage by 0.20%.
The diff coverage is 92.51%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #715      +/-   ##
==========================================
+ Coverage   69.26%   69.47%   +0.20%     
==========================================
  Files         216      216              
  Lines       12353    12480     +127     
  Branches     1790     1809      +19     
==========================================
+ Hits         8556     8670     +114     
- Misses       3327     3335       +8     
- Partials      470      475       +5     
Impacted Files Coverage Δ
netket/graph/lattice.py 91.82% <92.51%> (+0.91%) ⬆️
netket/utils/struct/dataclass.py 88.70% <0.00%> (-2.78%) ⬇️
netket/utils/struct/utils.py 83.33% <0.00%> (+1.51%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3dc07d8...6e22ce5. Read the comment docs.

@femtobit femtobit changed the title RFC: Improve consistency of the lattice API Improve consistency of the lattice API May 17, 2021
@femtobit
Copy link
Collaborator Author

So, I'll wait a bit for more feedback, but generally I think this can be merged soon. Future improvements (e.g., with regard to create_sites etc.) are possible, but that can be done without affecting the API.

netket/graph/lattice.py Outdated Show resolved Hide resolved
@gcarleo
Copy link
Member

gcarleo commented May 17, 2021

is this breaking the APIs of the old Lattice?

@gcarleo
Copy link
Member

gcarleo commented May 17, 2021

I am really not in favor of changing all the APIs here, the old nomenclature was inspired by the ALPS, if I recall that correctly http://alps.comp-phys.org/mediawiki/index.php/Tutorials:LatticesAndUnitCells

@attila-i-szabo
Copy link
Collaborator

I guess this is part of the endless debate whether "basis" refers to the primitive lattice vectors or the contents of a unit cell...
Unless we use ALPS for something, I wouldn't consider that to be an authority any more than any other library people might use for their simulations.
More damningly, the International Tables for Crystallography also use "basis" in the first sense

@attila-i-szabo
Copy link
Collaborator

In either case, the old names were clunky, we could call them basis and sites to fit the original meaning of "basis" rather than basis_vectors and atoms_coord

@gcarleo
Copy link
Member

gcarleo commented May 17, 2021

the point is that we cannot just change names to things at every release, we have done it in cases we couldn't avoid, but here it seems to me it is a mere matter of taste

@femtobit
Copy link
Collaborator Author

femtobit commented May 17, 2021

the point is that we cannot just change names to things at every release

I find it unlikely that if people have to change all their sampler, machine, etc. setup code (which they have to for v3, and for good reason) would be pushed over the edge by having to change basis_vectors to primitives etc.

Note that the current Lattice has never been part of a release and, more importantly, IIRC was not actually creating all edges until a recent bug fix, so I don't assume it was widely used given that no one noticed that earlier.

However, I do not feel strongly about the primitive/basis thing, we could just keep basis_vectors for the primitive vectors. I'm not too happy with atoms_coords (especially since it breaks the consistent convention of position for real-space and coords for the integer-valued cell + basis index [= "basis in the primitive vectors sense" here] which I feel makes the interface much easier to work with and also we are not necessarily dealing with atoms anyways) but we could have it as an optional alias for site_positions or so as @attila-i-szabo is proposing, maybe with a deprecation warning.

@gcarleo
Copy link
Member

gcarleo commented May 17, 2021

@femtobit , Latticewas already part of 2.1, with the very same APIs. In 3.0 it was rewritten in python (introducing that bug, now corrected), without changing the interface

@attila-i-szabo
Copy link
Collaborator

Note that the current Lattice has never been part of a release and, more importantly, IIRC was not actually creating all edges until a recent bug fix, so I don't assume it was widely used given that no one noticed that earlier.

My first contribution to NetKet was fixing a malfunctioning np.unique that was bound to break at every attempt of using Lattice, so I second this argument.

However, I do not feel strongly about the primitive/basis thing, we could just keep basis_vectors for the primitive vectors. I'm not too happy with atoms_coords (especially since it breaks the consistent convention and also we are not necessairly dealing with atoms anyways) but we could have it as an optional alias for site_positions or so as @attila-i-szabo is proposing, maybe with a deprecation warning.

This sounds good. I don't feel strongly about what "basis" means, either, maybe stick to basis_vectors for minimal change and allow atoms_coords with a deprecation warning

@gcarleo
Copy link
Member

gcarleo commented May 17, 2021

For these reasons I would not change names, or if you really want to change names we need to support also the old ones and use a deprecation warning

@gcarleo
Copy link
Member

gcarleo commented May 17, 2021

My first contribution to NetKet was fixing a malfunctioning np.unique that was bound to break at every attempt of using Lattice, so I second this argument.

That's great and thanks for finding that bug, but in 2.0 the C++ version of Latticewas working perfectly and nobody has ever complained about basis_vectors in ~2 years :)

@gcarleo
Copy link
Member

gcarleo commented May 17, 2021

my argument in favor of basis_vectors is that it is quite natural to talk about basis , since they are a basis in the sense of linear algebra. I understand that basis is sometimes used in crystallography with a different meaning, but I find that use of basis inconsistent, to be honest

@attila-i-szabo
Copy link
Collaborator

My first contribution to NetKet was fixing a malfunctioning np.unique that was bound to break at every attempt of using Lattice, so I second this argument.

That's great and thanks for finding that bug, but in 2.0 the C++ version of Latticewas working perfectly and nobody has ever complained about basis_vectors in ~2 years :)

That in itself is not an argument for not optimising an interface if we think we can (and everything else is torn up already – I would be against such minor changes after v3.0, which is why we should contemplate them now)

my argument in favor of basis_vectors is that it is quite natural to talk about basis , since they are a basis in the sense of linear algebra. I understand that basis is sometimes used in crystallography with a different meaning, but I find that use of basis inconsistent, to be honest

but I agree with this (and so does the IToCr, which sounds like a good authority on preferred names). atoms_coord is bad, however (see @femtobit's arguments above)

@femtobit
Copy link
Collaborator Author

femtobit commented May 17, 2021

I understand that basis is sometimes used in crystallography with a different meaning, but I find that use of basis inconsistent, to be honest

Yeah, I even agree with this personally. 🙂 It's just that people are regularly confused about "basis" while "primitive vectors" is at least relatively specific to lattices. But this is the part of this PR that I'll give up on most easily. I'm still hoping that's possible to have the API (and implementation) not use too many common terms (vector, coords, index) with slightly different meanings in different places.

In any case, I do think it's possible to get this closer to the old lattice (and adding some @deprecated methods if necessary) and still make the interface easier to grasp. I'll give it a try later and update this PR.

@VolodyaCO
Copy link
Collaborator

VolodyaCO commented May 17, 2021 via email

@attila-i-szabo
Copy link
Collaborator

attila-i-szabo commented May 18, 2021

I wouldn't do it

  • Lookup in a hash table is O(log n), so you can match one transformation in O(n log n) time. The KDTree can't possibly do better than O(n), O(log n) is very small in practice, possibly offset by constant prefactors in the implementation of dict vs KDTree
  • KDTrees are not meant to deal with periodic BCs (that's why we pad the lattice with extra unit cells), so we'd have to keep suffering with matching across them
  • The code will inevitably get harder to understand
  • In time we'll need to generalise Lattice so you can use it with further neighbours, coloured edges, etc. This means that KDTree would be an optional implementation detail.

@VolodyaCO
Copy link
Collaborator

VolodyaCO commented May 18, 2021 via email

netket/graph/lattice.py Outdated Show resolved Hide resolved
@VolodyaCO
Copy link
Collaborator

VolodyaCO commented May 18, 2021 via email

@PhilipVinc
Copy link
Member

PhilipVinc commented May 19, 2021

@femtobit Can you put a list of the API breaking changes in https://github.com/netket/netket/blob/master/CHANGELOG.md ?
I'm trying to keep a changelog of those in order not to annoy people too much. It's also picked up by the website so you can check in the preview if it renders fine.

Assuming you rebase on master.
If you don't it's ok but there'll be a merge conflict i guess to address

@femtobit
Copy link
Collaborator Author

femtobit commented May 19, 2021

@PhilipVinc Thanks, good point, this is addressed now too.

@gcarleo This PR should now actually improve backwards compatibility because it adds back (albeit with deprecation notice) Lattice.vector_to_site which is currently missing on master and matches the old behavior, where vector_to_site(site_to_vector(i)) == i is not generally true because it only gives one site in the unit cell (not taking into account the atom label).

CHANGELOG.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@attila-i-szabo attila-i-szabo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few last-minute requests, I realised I'd need this functionality soon!

netket/graph/lattice.py Show resolved Hide resolved
netket/graph/lattice.py Outdated Show resolved Hide resolved
netket/graph/lattice.py Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@femtobit femtobit merged commit 0f6f520 into master May 20, 2021
@femtobit femtobit deleted the lattice-api branch May 20, 2021 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants