Loading Objects breaks '_default_chunksize' Attribute for 'get_output' Method #1519
Comments
I'll have a closer look tomorrow, but from a quick glance I think the issue is that the data source is not persisted when saving a model (which is by design). When you try to call |
This is indeed the issue. You can do the following:
import pyemma
source = pyemma.coordinates.source(data)
tica = pyemma.coordinates.tica(source)
out = tica.get_output() # this works
tica.save('tica.h5')
tica_restored = pyemma.load('tica.h5')
# tica_restored.get_output() does not work because there is no data source configured
tica_restored.data_producer = source # configure the data source
out = tica_restored.get_output() |
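To make the explanation above concrete, here is a toy sketch in plain Python. It is not PyEMMA code: `ToySource`, `ToyScaler`, and their `save`/`load` methods are made-up stand-ins, and the real library's internals may differ. It only illustrates the general pattern: when the saved state leaves out the data source, the restored object fails until a source is attached again.

```python
import json


class ToySource:
    """Stand-in for a data source: simply holds values in memory."""

    def __init__(self, values):
        self.values = list(values)

    def get_output(self):
        return list(self.values)


class ToyScaler:
    """Stand-in for an estimator that scales whatever its producer yields."""

    def __init__(self, data_producer, factor):
        self.data_producer = data_producer
        self.factor = factor

    def save(self):
        # Only the model parameter is written; the data source is left out.
        return json.dumps({"factor": self.factor})

    @classmethod
    def load(cls, text):
        obj = cls.__new__(cls)  # bypass __init__, so no producer gets set
        obj.factor = json.loads(text)["factor"]
        return obj

    def get_output(self):
        return [v * self.factor for v in self.data_producer.get_output()]


source = ToySource([1, 2, 3])
model = ToyScaler(source, factor=10)
restored = ToyScaler.load(model.save())

try:
    restored.get_output()
    failed_without_source = False
except AttributeError:
    failed_without_source = True  # nothing was restored for data_producer

restored.data_producer = source  # re-attach the source by hand
output = restored.get_output()
```

In this toy version the restored object produces the same output as the original once the source is re-attached, which mirrors the workaround shown above.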
Thank you @clonker for the example code for loading a saved TICA object and then calling get_output on the loaded one. I followed your instructions above for TICA and it worked. But when I load the saved clustering object and call cluster.get_output, I get the same error I got with TICA: 'KmeansClustering' object has no attribute '_default_chunksize'. I tried both cluster_restored.data_producer = source and cluster_restored.data_producer = tica, and neither works. Could you tell me how to fix this? Thank you. Here is the code that gives the '_default_chunksize' error:
Then the error showed up like this:
`---------------------------------------------------------------------------
File ~/opt/anaconda3/lib/python3.9/site-packages/pyemma/coordinates/data/_base/transformer.py:226, in StreamingEstimationTransformer.get_output(self, dimensions, stride, skip, chunk)
File ~/opt/anaconda3/lib/python3.9/site-packages/pyemma/coordinates/data/_base/datasource.py:370, in DataSource.get_output(self, dimensions, stride, skip, chunk)
File ~/opt/anaconda3/lib/python3.9/site-packages/pyemma/coordinates/data/_base/transformer.py:181, in StreamingTransformer.chunksize(self)
File ~/opt/anaconda3/lib/python3.9/site-packages/pyemma/coordinates/data/_base/iterable.py:71, in Iterable.default_chunksize(self)
AttributeError: 'KmeansClustering' object has no attribute '_default_chunksize'`
Then I tried to configure the data source with:
That gave this error:
`File ~/opt/anaconda3/lib/python3.9/site-packages/pyemma/coordinates/data/_base/transformer.py:135, in StreamingTransformer.data_producer(self, dp)
ValueError: can not set data_producer to non-iterable class of type <class 'function'>`
Then I tried configuring the data source a different way:
I still got an error:
`---------------------------------------------------------------------------
File ~/opt/anaconda3/lib/python3.9/site-packages/pyemma/coordinates/data/_base/transformer.py:135, in StreamingTransformer.data_producer(self, dp)
ValueError: can not set data_producer to non-iterable class of type <class 'list'>` |
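The two ValueErrors above report that the values assigned to `data_producer` were a function and a plain list. A small pre-check can make that kind of mistake easier to spot before the assignment. The sketch below is only an illustration: `producer_hint` is a hypothetical helper, not part of PyEMMA, and it screens for just these two cases rather than reproducing the library's own validation.

```python
import inspect


def producer_hint(candidate):
    """Return a hint if `candidate` looks like one of the two failing cases, else None."""
    if inspect.isroutine(candidate) or inspect.isclass(candidate):
        # e.g. a function was passed without being called
        return "got a function or class; pass the object it returns instead"
    if isinstance(candidate, (list, tuple)):
        # e.g. raw data or a list of objects instead of a single producer
        return "got raw data; pass a source object or the upstream estimator"
    return None
```

For example, `producer_hint(some_list)` returns the second hint, which points back to the pattern from earlier in the thread: assign the object returned by `pyemma.coordinates.source(...)`, or the restored upstream estimator, rather than the data itself.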
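Cheers Julie, try this:
import pyemma
from numpy.testing import assert_equal  # assumption: the assert_equal used below is NumPy's
source = pyemma.coordinates.source(data)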
tica = pyemma.coordinates.tica(source)
cluster = pyemma.coordinates.cluster_kmeans(tica)
out = cluster.get_output() # this works
tica.save('tica.h5', overwrite=True)
cluster.save('cluster.h5', overwrite=True)
tica_restored = pyemma.load('tica.h5')
cluster_restored = pyemma.load('cluster.h5')
tica_restored.data_producer = source # configure the data source
cluster_restored.data_producer = tica_restored # configure cluster data source as (restored) tica
out_tica = tica_restored.get_output()
out2 = cluster_restored.get_output()
assert_equal(out, out2) |
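The same re-wiring can be written once for a pipeline of any length. The following is a minimal sketch, assuming each restored stage accepts the previous stage (or the source) as its `data_producer`, as in the snippet above. `restore_chain` is a hypothetical helper name, and the `load` argument is expected to be something like `pyemma.load`; it is passed in so the helper itself has no dependencies.

```python
def restore_chain(source, paths, load):
    """Load each saved stage in order and connect it to the one before it.

    source -- the data source feeding the first stage
    paths  -- file names of the saved stages, in pipeline order
    load   -- callable that restores one stage from a path
    """
    producer = source
    restored = []
    for path in paths:
        stage = load(path)
        stage.data_producer = producer  # the first stage gets the source
        restored.append(stage)
        producer = stage  # the next stage reads from this one
    return restored
```

With the files from the snippet above, usage would look like `tica_restored, cluster_restored = restore_chain(source, ['tica.h5', 'cluster.h5'], pyemma.load)`.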
Hi there, I have run into this same issue and I do understand the explanation, but I wanted to ask whether this is also the intended behavior for models generated from data that was not streamed via pyemma.coordinates.source but rather loaded in as an object.
Code/Usage Details: Please ignore below if that was not the intended behavior.
Code Behavior Question:
Use Case Details:
I understand that this code is no longer under active maintenance, so thank you for your time and any remedies you may suggest,
Mikaela |
Hello,
I am trying to save and load pyemma.coordinates objects (for TICA and KmeansClustering) and getting a chunksize error.
For example, after running pyemma.coordinates.tica(), saving the resulting TICA object as an .h5 file with the 'save' method causes no issues. But when I load the file and call the 'get_output()' method on the loaded object, I get the following error: 'TICA' object has no attribute '_default_chunksize'
If I generate the TICA object using pyemma.coordinates.tica() and call the 'get_output()' method in the same script without saving/loading, I get no error. If I call a different method (describe, n_frames_total, get_params), I get the expected output with no issues.
Please let me know how to fix this issue.
Below is the code and error. I load the TICA object from a directory as tica and print it as a sanity check. I then call the method. Here's the output.
P.S. sorry about the white space in the error log.