Framework for saving and loading operators #91
Comments
A few points that come to mind:
Don't use pickle. It's not good for later compatibility, e.g. it makes your data format Python-specific instead of language agnostic. Looking like the existing thing you're already storing is usually a good idea. Using a human-readable string format keeps things simple. The human-readable strings might compress well; you should check how much compression you get with zlib or some other standard compression algorithm. A binary format is good if you're speed-limited or space-limited.
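A minimal sketch of the zlib check suggested here, using only the Python standard library. The sample text is made up for illustration and only loosely resembles an operator printout; `compression_report` is a hypothetical helper name, not something from the project.

```python
import zlib


def compression_report(text, level=6):
    """Return (raw_bytes, compressed_bytes, ratio) for a text payload."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, level)
    return len(raw), len(packed), len(raw) / len(packed)


# Illustrative stand-in for a large human-readable operator dump.
# The line layout is an assumption, not the project's actual format.
sample = "\n".join(
    "{} [X{} Y{} Z{}]".format(0.5 + i, i, i + 1, i + 2) for i in range(1000)
)

raw_size, packed_size, ratio = compression_report(sample)
print("raw: {} bytes, compressed: {} bytes, ratio: {:.1f}x".format(
    raw_size, packed_size, ratio))
```

The ratio on real operators will depend on how repetitive the terms and coefficients are, so it is worth measuring on an actual Trotter error operator before deciding.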
@Strilanc how do you feel about @jarrodmcc's suggestion to use the HDF5 JSON module?
@babbush That falls under "looking/acting like the existing thing is good". I don't know much about HDF5... unless there's some reason that it doesn't fit this use case, it seems perfectly acceptable to keep using it.
As for the argument that you don't want data to be Python-specific, it would also fall under that category. That is, I expect serialization of a Python dict to be somewhat Python-specific, even if it's then stored in an HDF5 file. If that is a strong desire of ours (to have the data be somewhat language agnostic), we're only really left with the option of a giant string dump or associated lists of keys and values. Either has to be processed first and then used to build a dictionary, but the lists are likely to be faster and more compact, if I had to guess.
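To make the "associated lists of keys and values" option concrete, here is a rough sketch. It assumes an operator's terms are a dict mapping tuples of `(index, action)` pairs to complex coefficients; that layout, and the helper names `terms_to_lists` and `lists_to_terms`, are assumptions for illustration. JSON is used only as a dependency-free stand-in for whatever container (e.g. HDF5 datasets) ends up holding the two lists.

```python
import json


def terms_to_lists(terms):
    """Flatten a terms dict into two parallel, language-neutral lists."""
    keys, values = [], []
    for term, coefficient in terms.items():
        # Each key becomes a list of [index, action] pairs.
        keys.append([[index, action] for index, action in term])
        # Complex numbers are stored as [real, imag] pairs.
        c = complex(coefficient)
        values.append([c.real, c.imag])
    return keys, values


def lists_to_terms(keys, values):
    """Rebuild the terms dict from the two parallel lists."""
    return {
        tuple((index, action) for index, action in key): complex(re, im)
        for key, (re, im) in zip(keys, values)
    }


# Hypothetical terms of a small qubit operator.
example_terms = {((0, "X"), (2, "Z")): 0.5, ((1, "Y"),): -1.25j, (): 2.0}

keys, values = terms_to_lists(example_terms)
payload = json.dumps({"keys": keys, "values": values})

decoded = json.loads(payload)
restored = lists_to_terms(decoded["keys"], decoded["values"])
```

Nothing in the stored payload depends on Python: any language with a JSON (or HDF5) reader can zip the two lists back into its own map type.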
Thanks to @Spaceenter!
We have a nice framework for saving and loading instances of the MolecularData class. We should also have something similar for FermionOperator and QubitOperator. In fact, there is a project going on right now which will benefit from this significantly in the coming weeks. A good example of an operator we might want to save and load is an error operator from the Trotter error code. These operators are expensive to compute and complex to analyze so there is good reason that one might want to save the output.
I suggest we continue to use HDF5 and try to loosely parallel the system by which MolecularData is saved. However, automatically generated names for arbitrary FermionOperators and QubitOperators are not a good idea since these classes are quite broad. Naming should then be left up to the user. The directory should perhaps be specified optionally with the default option being the same place where MolecularData is saved by default. I suggest that save() and load() are external functions, kept in utils/. We should anticipate automatic naming functions that will use these primitive save/load functions as subroutines. A good example where this would be helpful would be in saving and loading plane wave Hamiltonians.
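A rough sketch of what the external `save()`/`load()` interface could look like, with a user-chosen name and an optional directory. Everything here is hypothetical: the function names, the `DEFAULT_DATA_DIRECTORY` constant, and the `.data` extension. It also writes a plain text file rather than HDF5, purely to keep the example dependency-free; the point is the naming and directory handling, not the storage backend.

```python
import os

# Hypothetical default; the real default would be wherever MolecularData is saved.
DEFAULT_DATA_DIRECTORY = os.path.join(os.getcwd(), "data")


def operator_path(file_name, data_directory=None):
    """Build the file path from a user-chosen name and an optional directory."""
    if data_directory is None:
        data_directory = DEFAULT_DATA_DIRECTORY
    return os.path.join(data_directory, file_name + ".data")


def save_operator(serialized, file_name, data_directory=None,
                  allow_overwrite=False):
    """Write an already-serialized operator under the given name."""
    path = operator_path(file_name, data_directory)
    if os.path.exists(path) and not allow_overwrite:
        raise FileExistsError("refusing to overwrite " + path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as stream:
        stream.write(serialized)
    return path


def load_operator(file_name, data_directory=None):
    """Read back the serialized operator saved under the given name."""
    path = operator_path(file_name, data_directory)
    with open(path, encoding="utf-8") as stream:
        return stream.read()
```

An automatic naming function (say, for plane wave Hamiltonians) would then only need to compute `file_name` from its parameters and call these two primitives.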
We should think about the most efficient way to save both types of operators. An easy (but not necessarily optimal) solution involves calling the str() method that is already implemented in these operators. To load these operators one will need to write a parser. Since these are classes with a small number of attributes that are unlikely to change, it might make sense to use pickle (yes, I know about the security issue). A bigger concern with that is the discrepancy between pickling in Python 2 and 3. Is there a standard way to store Python dictionaries? That could work since a dictionary essentially defines QubitOperator and FermionOperator. We may also want to think about writing the builtin eval method on these classes. Keep in mind that if somebody is going to the trouble of saving these operators, they are likely rather large and performance should be a priority.
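For the string route, a parser could look roughly like the sketch below. It assumes a one-term-per-line text format of the form `coefficient [X0 Z2]` with single-letter actions, and a terms dict keyed by tuples of `(index, action)`; both the format and the helper names `dump_terms` and `parse_terms` are assumptions for illustration, not the format that str() currently produces.

```python
import re

# One term per line: a coefficient token followed by bracketed factors.
_TERM_PATTERN = re.compile(r"^\s*(?P<coeff>\S+)\s*\[(?P<factors>[^\]]*)\]\s*$")


def dump_terms(terms):
    """Write each term as '<coefficient> [<action><index> ...]' on its own line."""
    lines = []
    for term, coefficient in terms.items():
        factors = " ".join(
            "{}{}".format(action, index) for index, action in term)
        lines.append("{!r} [{}]".format(complex(coefficient), factors))
    return "\n".join(lines)


def parse_terms(text):
    """Parse the format written by dump_terms back into a terms dict."""
    terms = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        match = _TERM_PATTERN.match(line)
        if match is None:
            raise ValueError("cannot parse line: {!r}".format(line))
        # Assumes single-letter actions, e.g. 'X12' -> (12, 'X').
        factors = tuple(
            (int(token[1:]), token[0])
            for token in match.group("factors").split()
        )
        terms[factors] = complex(match.group("coeff"))
    return terms


# Hypothetical terms of a small qubit operator.
original = {((0, "X"), (12, "Z")): 0.25, ((3, "Y"),): 1.5 - 2j, (): -1.0}
text = dump_terms(original)
recovered = parse_terms(text)
```

A line-by-line parser like this avoids eval entirely and can stream through large files, though for very large operators a binary layout would likely be faster.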
I am curious to hear the opinions of @jarrodmcc, @damiansteiger, @thomashaener, and @Strilanc. We should discuss and agree on a solution prior to any pull requests being opened.