ATMCMC2 #1569
Looks good to me. Can you fix that merge conflict? Needs a rebase
    return x[(n_steps - 1)::n_steps]


def two_gaussians(x):
I think you can use the new NormalMixture here.
I tried it just now, but I cannot get it running. Could someone please update the help text on the expected shapes of the arrays? My two means mu1 and mu2 have shape (1 x n). What shape is the total mu matrix expected to have: 2 x n, or 1 x 2*n? And what shapes are expected for the weights and taus?
e.g.:
pm.NormalMixture('mixture',
w=np.array([.8, .2]),
mu=np.array([-3., 5.]),
sd=np.array([1., 1.]))
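To make the shape convention in that snippet concrete, here is a plain-Python sketch of the density it describes: one weight, one mean, and one standard deviation per component, all flat sequences of the same length. The function name is made up for illustration, this is not PyMC3's implementation, and it only covers the scalar case, not the (1 x n) case asked about above.

```python
import math


def normal_mixture_logp(x, w, mu, sd):
    """Log density of a 1-D Gaussian mixture at the scalar x.

    w, mu and sd are flat sequences with one entry per component.
    (Hypothetical helper for illustration, not a PyMC3 function.)
    """
    if not (len(w) == len(mu) == len(sd)):
        raise ValueError("w, mu and sd need one entry per component")
    # Log of each weighted component density.
    terms = [
        math.log(w_k) - math.log(sd_k) - 0.5 * math.log(2.0 * math.pi)
        - 0.5 * ((x - mu_k) / sd_k) ** 2
        for w_k, mu_k, sd_k in zip(w, mu, sd)
    ]
    # Log-sum-exp keeps the sum stable when components are far apart.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Same numbers as in the comment above: two components, so length-2 inputs.
w = [0.8, 0.2]
mu = [-3.0, 5.0]
sd = [1.0, 1.0]
logp_at_first_mode = normal_mixture_logp(-3.0, w, mu, sd)
```

With two components every argument has length 2; a mismatch raises a ValueError rather than silently truncating.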
Do you insist on using that? ;)
Is it not working?
Did the rebase ...
How can I start the test the way Travis does, to check that it runs? I ran it manually: it was working and showed: Ok. Travis is now crashing because cPickle is not available by default. Shall I use only pickle? It will be slower ...
@hvasbath Seems like tests are still failing.
If you would be so kind as to answer my questions, I could fix the test ;)
You could try running this with Docker; that will replicate running the tests with Travis.
Maybe I wasn't clear. I can run the test, no problem. I just changed it to pickle.
Damn, now there is no Queue module ... I need courses in Python 3 ...
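Both failures come from standard-library renames between Python 2 and Python 3: `cPickle` was folded into `pickle`, and the `Queue` module became `queue`. A minimal sketch of imports that work on both versions follows; whether the PR ended up doing exactly this is not shown in the thread.

```python
try:
    import cPickle as pickle  # Python 2: explicit C implementation
except ImportError:
    import pickle  # Python 3: the C accelerator is picked up automatically

try:
    from Queue import Queue  # Python 2 module name
except ImportError:
    from queue import Queue  # Python 3 module name

# Round-trip a small object and pass it through a queue.
payload = pickle.loads(pickle.dumps({"chain": 0}))
q = Queue()
q.put(payload)
```

On Python 3 the plain `pickle` import already uses the C implementation when it is available, so the fallback should not be slower.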
Uff, I have no clue why this is failing there. If I run it locally with nose, it works. I guess I will have to try this Docker thing ... but next week I have to prepare for an interview, so it may take some time until I get to it ...
@hvasbath do you use conda or pure Python?
Wait, this is pretty strange: it looks like your tests are failing because of some things that changed between Python 2 and Python 3. For example, What is extra confusing is that the Python 2 tests are failing as well: in this stack I see
but then the stack trace looks like it is using python3.5
In particular, it may be that the setup script is no longer grabbing the Python version from the environment correctly. I will take care of that, then at least half your tests will pass!
I am using pure Python 2.7, no conda.
OK, 2.7 is passing, so that's clearly a Python 3 related thing. Does anyone have a comment? Where it fails, it creates a list of booleans to hand to the sampling function to determine whether to show the progress bar or not.
Looks like a really good contribution! Still not sure where that IndexError on Travis is coming from; let me know if you're out of time and I can run the tests locally to figure it out.
More generally, I'm most worried about the large functions being hard for anyone except you to refactor, maintain, and test. On the other hand, it will not break any existing stuff. You did a good job keeping it all decoupled enough from everything else that I think it is OK to merge. I suspect we could then work on sharing some of this machinery with the rest of the library so it is a little more manageable and well tested.
    def setUp(self):
        super(TestATMCMC, self).setUp()
        self.test_folder = mkdtemp(prefix='ATMIP_TEST')
should have a matching tearDown method that deletes the test_folder.
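Something along these lines would do it. This is a sketch with a made-up test class name; only the `mkdtemp` call and the `test_folder` attribute are taken from the diff.

```python
import os
import shutil
import unittest
from tempfile import mkdtemp


class TestWithTempFolder(unittest.TestCase):  # hypothetical class name
    def setUp(self):
        # Fresh scratch directory for every test method.
        self.test_folder = mkdtemp(prefix='ATMIP_TEST')

    def tearDown(self):
        # Runs after each test, whether it passed or failed.
        shutil.rmtree(self.test_folder)

    def test_folder_exists(self):
        self.assertTrue(os.path.isdir(self.test_folder))
```

Because `tearDown` runs whenever `setUp` succeeded, the temporary folders do not pile up on the CI machine across runs.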
right!
Containing all the information that is unique for each Markov Chain
i.e. [:class:'ATMCMC', chain_number(int),
sampling index(int), start_point(dictionary)]
need a doc for pshared
k
pm._log.info('Sampling ...')

pshared = dict(
Looks like this could be a namedtuple, so you can make sure all the fields are filled in.
Will have to look into that. I didn't know that exists ;)
You can put in whatever is meant not to be forked, so the contents differ for each function that is supposed to be parallelised. If I understood correctly, this won't work with a namedtuple ...
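For reference, this is what the namedtuple suggestion buys you. The field names below are invented for illustration and are not the actual contents of `pshared`. If the contents differ per parallelised function, one option would be a separate namedtuple type for each function.

```python
from collections import namedtuple

# Hypothetical field names, for illustration only.
SharedParams = namedtuple('SharedParams', ['draws', 'progressbar', 'tune'])

# Every field has to be supplied when the tuple is built.
shared = SharedParams(draws=100, progressbar=False, tune=None)

# Leaving a field out fails immediately instead of surfacing later
# as a KeyError inside a worker process.
try:
    SharedParams(draws=100)
except TypeError as err:
    missing_error = str(err)
```

A plain dict accepts any subset of keys, so a forgotten entry only shows up when a worker tries to read it.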
work = [(step, chain, idx, step.population[step.resampling_indexes[chain]])
        for chain, idx in zip(chains, idxs)]

for chain in tqdm(atext.parimap(
underscore indicates that you aren't using the variable (for _ in tqdm(...))
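A small self-contained illustration of the convention, with a plain iterator standing in for the `tqdm(atext.parimap(...))` call; the function and job names are made up.

```python
def run_all(jobs):
    """Consume an iterator purely for its side effects."""
    completed = 0
    for _ in jobs:  # the yielded value is never read
        completed += 1
    return completed


n_done = run_all(iter(['chain-0', 'chain-1', 'chain-2']))
```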
k
step.n_chains / n_jobs has to be an integer number!
tune : int
    Number of iterations to tune, if applicable (defaults to None)
trace : string
could this just be called homepath?
yes! will do!
        return a_list

    def srmap(self, tarray):
is this used anywhere?
It is used in some of my models. When I ported the code from the beat package to here, I forgot to remove it. I will kill it.
            array[slc] = list_arrays[list_ind].ravel()
        return array

    def f3map(self, list_arrays):
is this used anywhere?
will kill as well ..
More efficient Queueing compared to Pool in multiprocessing.
Adapted from the pyrocko package. http://www.pyrocko.org
"""
assert all(
assertions can be turned off when a library is run, so you might be better off manually raising an AssertionError or TypeError (or custom error!) instead.
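Concretely, running Python with `-O` strips `assert` statements, so the check silently disappears. A sketch of the explicit version follows; the helper name and the condition being checked are assumptions, since the body of the original `assert all(` is not visible in the diff excerpt.

```python
def check_iterables(*iterables):
    """Raise TypeError unless every argument can be iterated over.

    Hypothetical helper; the real condition in parimap may differ.
    """
    for obj in iterables:
        try:
            iter(obj)
        except TypeError:
            raise TypeError(
                'expected an iterable, got %r' % type(obj).__name__)


# Passes silently for valid input, with or without -O.
check_iterables([1, 2], (3, 4), 'ab')
```

A TypeError also tells the caller what went wrong, where a bare AssertionError would not.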
Thanks a lot @ColCarroll for all the comments! And I am happy that you like it. It took a good part of my last year to put that together, so I would be very happy if it becomes more useful.
Need to rebase after #1592.
@twiecki or someone else, can you please post the git command line? I am still not so familiar with many of the functions ...
@hvasbath http://stackoverflow.com/questions/7244321/how-do-i-update-a-github-forked-repository (in the last line you would need to do that from your PR branch instead of master).
@hvasbath Any luck?
@hvasbath Any luck on updating this? I think all you have to do is follow the instructions that Thomas directed you to.
Sorry guys for being so inactive here; I was swamped with my main job. Will look into incorporating things right now ...
Can you also look at #1689?
Did that @twiecki. However, I won't have time to read the paper. I don't know emcee so far. It looks like this would create an additional dependency on emcee, which I am personally not a big fan of. So I would support @fonnesbeck's comment.
I mean more in terms of implementation. Seems like you both implemented a backend for particle samplers.
On Jan 21, 2017 7:11 PM, "Hannes Vasyura-Bathke" wrote:
Did that. However, I won't have time to read the paper. I don't know emcee so far. It looks like this would create an additional dependency on emcee, which I am personally not a big fan of. So I would support @fonnesbeck's comment.
Do you want me to look at something in particular?
Yes, in terms of backend it seems quite elegant! However, it is only a NumPy array backend and keeps everything in memory. I wonder how feasible that is if the chains get long with many variables? But again, this is rather related to the algorithm; maybe it doesn't need to run for a long time? In terms of using it for ATMCMC, I think it is not straightforward, as it also relies on rerunning the Theano logp model function (iter_fn). That is fine if you only want to record the input variables. But in ATMCMC we need to save the model likelihoods to evaluate the coefficient of variation in the transition step, so this would result in calculating the model twice.
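To illustrate why the stored likelihoods matter, here is a rough sketch of that coefficient-of-variation computation. It assumes the usual tempering weights w_i = L_i^(beta_new - beta_old) from the transitional MCMC literature. The function and variable names are made up, and this is not the code from this PR.

```python
import math


def weight_cov(log_likelihoods, beta_old, beta_new):
    """Coefficient of variation of the tempering importance weights.

    Assumes weights w_i = exp((beta_new - beta_old) * loglike_i);
    hypothetical helper, not taken from the PR.
    """
    delta = beta_new - beta_old
    # Shifting by the maximum rescales all weights by the same factor,
    # which leaves the coefficient of variation unchanged.
    top = max(log_likelihoods)
    weights = [math.exp(delta * (ll - top)) for ll in log_likelihoods]
    mean = sum(weights) / len(weights)
    var = sum((w - mean) ** 2 for w in weights) / len(weights)
    return math.sqrt(var) / mean


# Log-likelihoods saved while sampling the previous stage.
stored = [-1.0, -2.5, -0.3, -4.0]
cov_small_step = weight_cov(stored, 0.0, 0.1)
cov_large_step = weight_cov(stored, 0.0, 1.0)
```

Only the saved log-likelihoods go in, so trying several candidate beta values costs no further model evaluations. A backend that stores just the input variables would force a second pass through the model to get them.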
Conflicts:
pymc3/backends/atmcmc_text.py
pymc3/blocking.py
pymc3/examples/ATMCMC_2gaussians.py
pymc3/step_methods/__init__.py
pymc3/step_methods/atmcmc.py
pymc3/tests/test_atmcmc.py
If the Python 3 version still fails, I am going to need someone to execute it as @ColCarroll suggested ;)
As expected, so I am going to need help here. The Python 2 test is running through ...
======================================================================
ERROR: test_sample (pymc3.tests.test_atmcmc.TestATMCMC)
Traceback (most recent call last):
  File "/home/travis/build/pymc-devs/pymc3/pymc3/tests/test_atmcmc.py", line 72, in test_sample
  File "/home/travis/build/pymc-devs/pymc3/pymc3/step_methods/atmcmc.py", line 635, in ATMIP_sample
  File "/home/travis/build/pymc-devs/pymc3/pymc3/step_methods/atmcmc.py", line 820, in _iter_parallel_chains
  File "/home/travis/miniconda3/envs/testenv/lib/python3.6/site-packages/tqdm/_tqdm.py", line 830, in iter
  File "/home/travis/build/pymc-devs/pymc3/pymc3/backends/atmcmc_text.py", line 107, in parimap
  File "/home/travis/build/pymc-devs/pymc3/pymc3/step_methods/atmcmc.py", line 766, in work_chain
IndexError: list index out of range
-------------------- >> begin captured logging << --------------------
pymc3: INFO: Init new trace!
pymc3: INFO: Sample initial stage: ...
pymc3: INFO: Beta: 0.000000 Stage: 0
pymc3: INFO: Initialising chain traces ...
pymc3: INFO: Sampling ...
--------------------- >> end captured logging << ---------------------
Finally, I found a reference that you statisticians @twiecki @ColCarroll might know better. The algorithm here is actually a sequential Monte Carlo algorithm: Del Moral et al. (2006), Sequential Monte Carlo Samplers, J. R. Statist. Soc. B. I guess I will rename it this way ...
I'm all for you renaming this to Sequential Monte Carlo. Is there anything you need help with to get this merged?
Created a new PR here:
Finally, updated ATMCMC.
Follow-up to #1264 and #1163.
Includes test and example.