v3.0.0dev0 blobs failing when different shapes #256

Closed
steven-murray opened this issue May 31, 2018 · 13 comments

@steven-murray

General information:

  • emcee version: v3.0.0dev0
  • platform: any
  • installation method (pip/conda/source/other?): any

Problem description:

Expected behavior:

I expect that returning blobs of various shapes should work.

Actual behavior:

An error is raised: ValueError: setting an array element with a sequence.

What have you tried so far?:

When passing blobs_dtype with shapes for each returned blob, it works fine.

Minimal example:

import emcee

import numpy as np
def lnl(p):
    return p[0], 1.0, np.array([1,2,3])
sampler = emcee.EnsembleSampler(
    nwalkers = 10,
    ndim = 2,
    log_prob_fn = lnl
)

p0 = np.random.normal(size=(10,2))
for sample in sampler.sample(p0, iterations=10):
    continue

The offending line is L394 of ensemble.py: dt = np.atleast_1d(blob[0]).dtype. This will work as intended (I think), if it is instead: dt = [(np.atleast_1d(b).dtype, np.atleast_1d(b).shape) for b in blob[0]], though there might need to be a check for 1-dimensional arrays.
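The per-blob idea in the suggestion above can be sketched with plain NumPy. As far as I can tell, a list-style dtype specification needs a name for each field, so this sketch adds placeholder names (`f0`, `f1`, ...). The helper `infer_blob_dtype` is hypothetical and not part of emcee; scalars are handled separately, which covers the "check for 1-dimensional arrays" caveat.

```python
import numpy as np


def infer_blob_dtype(first_blob):
    """Build a structured dtype with one field per blob of a single walker.

    Hypothetical helper for illustration only; it is not emcee's code.
    """
    fields = []
    for i, b in enumerate(first_blob):
        arr = np.asarray(b)
        name = f"f{i}"  # placeholder field name (an assumption of this sketch)
        if arr.shape == ():
            # Scalars become plain fields without a sub-array shape.
            fields.append((name, arr.dtype))
        else:
            # Arrays keep their shape as a fixed-size sub-array field.
            fields.append((name, arr.dtype, arr.shape))
    return np.dtype(fields)


# Blobs of one walker from the minimal example (everything after the log-prob).
first_blob = (1.0, np.array([1, 2, 3]))
dt = infer_blob_dtype(first_blob)

# One row per walker, the same list-of-tuples layout used for each step.
nwalkers = 10
blobs = np.array([first_blob] * nwalkers, dtype=dt)
```

This only works if every walker returns blobs with the same shapes at every step, which is one reason a fully general inference scheme is hard to get right.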

dfm added a commit that referenced this issue Jun 7, 2018
@dfm
Owner

dfm commented Jun 7, 2018

I've "fixed" this in the sense that it doesn't fail anymore (instead it infers the dtype as object), but I'd definitely recommend using blobs_dtype for the best results because it's too complicated to come up with a general dtype inference mechanism that will do what you want.

@dfm dfm added the Bug label Jun 7, 2018
@steven-murray
Author

I think this will work for the default backend, but not the HDF backend, as it does not understand the "object" dtype. I agree that there's no obvious way to do it that will work in all use cases. At this point, I am changing my likelihood function to present the blobs correctly.
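To illustrate the difference between the two cases, here is a small NumPy-only sketch. It assumes the HDF backend needs fixed-size fields, since HDF5 has no native equivalent of arbitrary Python objects. The helper `storable_without_objects` is hypothetical, and the field names are placeholders.

```python
import numpy as np

# The fallback dtype described above: every blob is an arbitrary Python object.
fallback = np.dtype(object)

# An explicit structured dtype with fixed-size fields (names are placeholders).
explicit = np.dtype([("scalar", float), ("vector", float, (3,))])


def storable_without_objects(dt):
    """Hypothetical check: True if the dtype contains no Python objects."""
    return not np.dtype(dt).hasobject


fallback_ok = storable_without_objects(fallback)
explicit_ok = storable_without_objects(explicit)
```

Here `fallback_ok` is `False` and `explicit_ok` is `True`, which matches the observation that an explicit dtype (or reshaped blobs) is the safer route when writing to HDF5.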

@dfm
Owner

dfm commented Jun 18, 2018

Yeah - this definitely won't work for the HDF backend but I don't see a way to fix that.

I'll close this for now, but feel free to re-open if you think that there's more to do.

@dfm dfm closed this as completed Jun 18, 2018
@sibirrer

sibirrer commented Oct 1, 2019

As v3.0.0 is now the default pip version, this can cause transition issues for people who previously (knowingly or unknowingly) worked with e.g. v2.2.1.

The default blobs input leads to an exception being raised.

My travis report can be found here:
https://travis-ci.org/sibirrer/lenstronomy/jobs/591749997

Below are the relevant lines that crashed:

../../../virtualenv/python3.6.7/lib/python3.6/site-packages/emcee/ensemble.py:324: in sample
    self.backend.grow(iterations, state.blobs)
self = <emcee.backends.backend.Backend object at 0x7f773bd07cd0>, ngrow = 1
blobs = array([None, None, None, None, None, None, None, None, None, None, None,
       ...ne, None, None, None, None,
       None, None, None, None, None], dtype=object)

    def grow(self, ngrow, blobs):
        """Expand the storage space by some number of samples

        Args:
            ngrow (int): The number of steps to grow the chain.
            blobs: The current list of blobs. This is used to compute the
                dtype for the blobs array.

        """
        self._check_blobs(blobs)
        i = ngrow - (len(self.chain) - self.iteration)
        a = np.empty((i, self.nwalkers, self.ndim))
        self.chain = np.concatenate((self.chain, a), axis=0)
        a = np.empty((i, self.nwalkers))
        self.log_prob = np.concatenate((self.log_prob, a), axis=0)
        if blobs is not None:
>           dt = np.dtype((blobs[0].dtype, blobs[0].shape))
E           AttributeError: 'NoneType' object has no attribute 'dtype'

@dfm
Owner

dfm commented Oct 1, 2019

Can you please provide a code snippet that reproduces this error? I'm not sure what you mean by "default" blobs.

@sibirrer

sibirrer commented Oct 1, 2019

I mean the
blobs=array([None, None, None, None, None, None, None, None, None, None, None ... None], dtype=object)
statement in the input of
ensemble.py:324

I never defined the blobs myself in v2.2.1. I see that you have a nice transition page, and I will first follow all those steps before coming back to you. It just became urgent to transition to v3.0.0, as the default pip version has changed (today?) and v2.2.1 is no longer supported on Python 3 (at least not in the Travis CI environment).

@dfm
Owner

dfm commented Oct 1, 2019

I don't see anything wrong with that line, and there are plenty of tests where the blobs are not defined and they work just fine. So, can you please share a small piece of code that reproduces your issue?

@sibirrer

sibirrer commented Oct 1, 2019

The issue was an incompatible wrapper that returned two values from the likelihood function, and the way these functions were called effectively overwrote the blobs statement. Problem fixed on my side. Sorry for bothering you and thanks for the quick replies.
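The wrapper itself is not shown in this thread, so the following is only a guess at the failure mode. It assumes a likelihood that returns `(log_prob, None)`: anything after the first return value is treated as a blob, so the `None` ends up in the blobs array and the dtype lookup from the traceback fails. Both `legacy_likelihood` and `log_prob_only` are hypothetical names.

```python
import numpy as np


def legacy_likelihood(p):
    # Hypothetical wrapper returning two values; the second one is
    # interpreted as a blob even though it carries no information.
    return -0.5 * float(np.sum(np.asarray(p) ** 2)), None


def log_prob_only(p):
    # One possible fix: pass only the log-probability on to the sampler.
    return legacy_likelihood(p)[0]


# What the backend would effectively receive as blobs for five walkers.
walkers = np.zeros((5, 2))
blobs = np.array([legacy_likelihood(p)[1] for p in walkers], dtype=object)

try:
    # Same expression as the failing line in the traceback above.
    np.dtype((blobs[0].dtype, blobs[0].shape))
    failed = False
except AttributeError:
    failed = True
```

In this sketch `failed` ends up `True`, and handing `log_prob_only` to the sampler instead avoids creating blobs at all.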

@dfm
Owner

dfm commented Oct 1, 2019

No worries. Glad you got it sorted!

@alessiospuriomancini

I keep having this issue when using CosmoHammer within MontePython, see issue mentioned above. I believe @sibirrer was experiencing the same issue. How exactly did you manage to fix it?

@sibirrer

@alessiospuriomancini: I am no longer using CosmoHammer (as it might not be as well supported) and instead use the MPI support in emcee directly. This is how I initialize the MPI pool to work with emcee (slightly modified from schwimmbad): https://github.com/sibirrer/lenstronomy/blob/master/lenstronomy/Sampling/Pool/multiprocessing.py
This also fixed another issue between pickle and dill in storing the different instances of the MPI process.

@alessiospuriomancini

Thanks so much @sibirrer. Just to clarify: you use the script multiprocessing.py within some script for emcee, to directly run using emcee without going through CosmoHammer?

@sibirrer

Yes @alessiospuriomancini, for example here: https://github.com/sibirrer/lenstronomy/blob/master/lenstronomy/Sampling/sampler.py in the definition of mcmc_emcee()

MehnaazAsad added a commit to MehnaazAsad/RESOLVE_Statistics that referenced this issue Jan 16, 2022