ProtectedArray type incompatible with newer numpy #119

Closed
matthewware opened this issue Feb 11, 2020 · 10 comments
Labels
bug A bug or regression

Comments

@matthewware

Trying to run a GST reconstruction with MPI and numpy > 1.15 seems to cause issues with the ProtectedArray class.

https://github.com/pyGSTio/pyGSTi/blob/3ea46e3837d8a6908367c04312ec1dddbef3e2bd/packages/pygsti/baseobjs/protectedarray.py#L48

Based on similarity with:

pandas-dev/pandas#24839

spyder-ide/spyder#8582

https://html.developreference.com/article/10573243/Anaconda+Pandas+breaks+on+reading+hdf+file+on+Python+3.6.x

Seems like downgrading to numpy 1.15.0 should fix this issue. I'm still testing that.

To Reproduce

import pygsti
from pygsti.construction import std2Q_XYZICNOT

#Create a data set
target_model = std2Q_XYZICNOT.target_model()
fiducials = std2Q_XYZICNOT.fiducials
germs = std2Q_XYZICNOT.germs
maxLengths = [1,2,4,8,16]

mdl_datagen = target_model.depolarize(op_noise=0.1, spam_noise=0.001)
listOfExperiments = pygsti.construction.make_lsgst_experiment_list(
    target_model.operations.keys(), fiducials, fiducials, germs, maxLengths)
ds = pygsti.construction.generate_fake_data(mdl_datagen, listOfExperiments, nSamples=1000,
                                             sampleError="multinomial", seed=1234)
pygsti.io.write_dataset("example_files/mpi_example_dataset.txt", ds)

mpiScript = """
import time
import pygsti
from pygsti.construction import std2Q_XYZICNOT

#get MPI comm
from mpi4py import MPI
comm = MPI.COMM_WORLD

print("Rank %d started" % comm.Get_rank())

#define target model, fiducials, and germs as before
target_model = std2Q_XYZICNOT.target_model()
fiducials = std2Q_XYZICNOT.fiducials
germs = std2Q_XYZICNOT.germs
maxLengths = [1,2,4,8,16]

#tell gauge optimization to weight the operation matrix
# elements 100x more heavily than the SPAM vector elements, and
# to specifically weight the Gx gate twice as heavily as the other
# gates.
goParams = {'itemWeights':{'spam': 0.01, 'gates': 1.0, 'Gx': 2.0} }

#Specify a per-core memory limit (useful for larger GST calculations)
memLim = 2.1*(1024)**3  # 2.1 GB

#Perform TP-constrained GST
target_model.set_all_parameterizations("TP")

#load the dataset
ds = pygsti.io.load_dataset("example_files/mpi_example_dataset.txt")

start = time.time()
results = pygsti.do_long_sequence_gst(ds, target_model, fiducials, fiducials,
                                      germs, maxLengths,memLimit=memLim,
                                      gaugeOptParams=goParams, comm=comm,
                                      verbosity=2)
end = time.time()
print("Rank %d finished in %.1fs" % (comm.Get_rank(), end-start))
if comm.Get_rank() == 0:
    import pickle
    pickle.dump(results, open("example_files/mpi_example_results.pkl","wb"))
"""
with open("example_files/mpi_example_script.py","w") as f:
    f.write(mpiScript)

Then, from a shell:

mpiexec -n 3 python3 "example_files/mpi_example_script.py"

Expected behavior
A GST reconstruction runs to completion using MPI.

Environment (please complete the following information):

  • pyGSTi version: v0.9.8.3
  • python version: 3.7
  • numpy version: 1.18.1
@matthewware matthewware added the bug A bug or regression label Feb 11, 2020
@matthewware
Author

A reconstruction with this line removed is chugging along now. I'll update when it finishes.

@matthewware
Author

This is also failing with a memory limit error, though I'm certain there's ample RAM left on the machine. The issue might be rooted more in MPI.

@matthewware
Author

Another slight update, I get the same error with openmpi as I do with mpich:

File ".../pygsti/packages/pygsti/algorithms/core.py", line 1120, in do_mc2gst
    assert mlim > 0, 'Not enough memory, exiting..'

This is a single machine with 64 GB of RAM and 8 physical cores. The same error happens regardless of the number of cores I throw at MPI. I'll keep updating as I make progress.

@enielse
Collaborator

enielse commented Feb 12, 2020

I've confirmed the ProtectedArray issue using Numpy 1.18.1, and this is fixed by commit 68d0bb8, which is currently in the beta-branch version of pyGSTi. We will pull together a few more fixes before creating a new minor release (which will be v0.9.9.1), so if you need this fix right away please use the beta branch.

Regarding the memory limit issues, could you clarify for me what the problem is? My understanding is that your machine has 8GB of RAM per core, and I agree this should be plenty. But the script you cited above tells pyGSTi to (roughly) limit the per-core memory usage to 2.1GB in the line:

memLim = 2.1*(1024)**3  # 2.1 GB

The memory limits in pyGSTi aren't super precise, and so I'd set memLim to something around 7.5GB (or less) in your case, just to play it safe. Have you done this and the MemoryError persists, or does your script still stipulate a 2.1GB limit? If the latter, getting the error seems reasonable, since you're running 2Q-GST out to a fairly large depth (16) and so have a lot of circuits (around 94K); you should try increasing memLim.
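For concreteness, the suggested change would amount to something like this in the script above (a sketch; 7.5 GB is the figure suggested for a machine with roughly 8 GB of RAM per core):

```python
# Per-core memory limit handed to pyGSTi via the memLimit argument.
# 7.5 GB leaves some headroom below the ~8 GB of physical RAM per core.
memLim = int(7.5 * (1024)**3)  # in bytes
```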

@enielse
Collaborator

enielse commented Feb 12, 2020

Oh - and I forgot to mention, the hack-fix of deleting the offending self.base.flags.writeable = True line in protectedarray.py seems like a fine band-aid for the short term -- so sticking with this for now seems ok to me.

@enielse
Collaborator

enielse commented Feb 13, 2020

I've been looking into this issue for longer than I expected. There seems to be an issue with Numpy arrays failing to preserve their flags when they are pickled/unpickled. For example, when I run this code (using Python 3.7 and Numpy 1.18.1):

import numpy, pickle
a = numpy.empty(126, 'd')
a.flags.writeable = False
print("Before:", a.flags)
b = pickle.loads(pickle.dumps(a))
print("After:", b.flags)

I find that the OWNDATA flag unpickles as False whenever the array size is greater than 125 (so the 126 in the above example is just big enough to trigger the bad behavior). Also, unpickling now seems to always reset the WRITEABLE flag to True, even if the pickled array is read-only. I'm curious whether others can reproduce these results.
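For anyone who wants to try reproducing this, the experiment above can be wrapped in a small helper (a sketch; flags_after_roundtrip is just an illustrative name, and the flag values you observe will depend on your NumPy version):

```python
import numpy
import pickle

def flags_after_roundtrip(n):
    """Pickle and unpickle a read-only length-n array; report its flags."""
    a = numpy.empty(n, 'd')
    a.flags.writeable = False
    b = pickle.loads(pickle.dumps(a))
    return {'owndata': b.flags.owndata, 'writeable': b.flags.writeable}
```

Comparing flags_after_roundtrip(125) against flags_after_roundtrip(126) should show whether the size-dependent OWNDATA behavior described above occurs on your setup.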

I've added some additional commits to deal with this weirdness that should make pyGSTi less reliant on Numpy array flags and thus more robust to these sorts of issues. The latest commit to the beta branch (209cee0) I think has everything up and running again using Numpy 1.18.1.

@matthewware
Author

Update: Increasing memLim seems to have done the trick. I increased it to 7.5 GB on 6 cores and things have at least made it to the third of five iterations. This is also with the self.base.flags.writeable = True line commented out. If it runs to completion, I'll interpret the 'not enough memory' error above very differently.

@matthewware
Author

So the reconstruction seems to have worked. @enielse, are there any rules of thumb about how much memory one should allocate?

@enielse
Collaborator

enielse commented Feb 17, 2020

Glad to hear it worked with the increased memLim. As a rule of thumb, I usually set memLim to 10% less than the amount of physical memory that is available per core, since the memory limit is only approximate.
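That rule of thumb can be written down explicitly (a sketch; per_core_memlimit is a hypothetical helper, and the machine numbers echo the 64 GB / 8-core box from this thread):

```python
def per_core_memlimit(total_ram_bytes, n_cores, safety=0.9):
    """Rule of thumb: roughly 10% less than the physical RAM per core."""
    return int(safety * total_ram_bytes / n_cores)

# e.g. a 64 GB machine running 6 MPI ranks:
memLim = per_core_memlimit(64 * (1024)**3, 6)
```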

The reasons that memLim exists as a user-imposed (and possibly artificial) memory limit are that

  1. extracting exactly how much memory per core a machine has can be problematic, meaning that it doesn't always work like it should in python, and

  2. there could be scenarios where you want to run multiple instances of GST on the same machine, and budget the total memory accordingly.

To summarize: the writeable=True issue that began this thread is entirely separate from the memory limit topic also discussed here. The writeable=True bug will be fixed in the next pyGSTi release (version 0.9.9.1) and can be worked around by commenting out the offending line as described above. The memory limit issue wasn't a bug per se, but an instance of poor documentation and the resulting confusion about how memLim is used in pyGSTi.

@matthewware
Author

Hi Erik!

Sounds good. I'll stick with this rule of thumb going forward. Thanks for summarizing things for future readers. I'm going to mark the issue as solved by 209cee0.
