-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BEAST runs in python 2 and 3 #113
Conversation
@mfouesneau : Travis CI error on ezunits. Can you have a look and tell me if this can be fixed easily? |
We can switch to astropy for sure. That might be easy, if we parse a few API calls. the travis errors are on eztables not ezunits if I read correctly. eztables should not be used, is it? |
The end of the travis check link in the previous comment seems to be about ezunits. Or am I miss reading the errors? Excerpt:
|
It it is ezunits, one solution is to move to astropy units. Similar solution to eztables problems (move to astropy tables). |
The simplest interface is to use |
Removed distance info from observations.py as not used anymore (and potentially confusing).
@@ -289,6 +287,5 @@ def get_obscat(obsfile=obsfile, distanceModulus=distanceModulus, | |||
obs: GenFluxCatalog | |||
observation catalog | |||
""" | |||
obs = GenFluxCatalog(obsfile, distanceModulus=distanceModulus, | |||
filters=filters) | |||
obs = GenFluxCatalog(obsfile, filters=filters) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
distanceModulus has gone because it's not used or what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer used in the observed catalog. This was left over when we were putting the observed data at 10 pc (absolute mags?). Now we put the model at the distance of the galaxy as this is needed for a number of reasons. Figured it was best to take this out as it means less code to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair.
If we decide not to use the distanceModulus in observed catalog, I think you need to delete all methods in the "Observations" class that use 'distanceModulus'. So users avoid any possible confusions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I've done this and will commit the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait -- but instead of placing observed data at 10 pc, the current convention was to place both the observations and models at the distance of the galaxy. We need to keep distanceModulus in order to convert observations from observed fluxes to intrinsic fluxes. See my other comment on observationmodel/observations.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For completeness, we never covert the observed fluxes to intrinsic fluxes when doing the fitting. Are you arguing for keeping this for some other purpose? Asking to make sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for my misunderstanding -- your thorough explanation below was very helpful. Removing distanceModulus here is OK, as I see no need/use for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries. Very useful conversation to have (and have archived!).
@@ -36,8 +37,8 @@ def hdf5diff(fname1, fname2): | |||
hdfb = h5py.File(fname2, 'r') | |||
|
|||
hd = hdf5diff_results() | |||
for sname in hdfa.keys(): | |||
if sname not in hdfb.keys(): | |||
for sname in list(hdfa.keys()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
list() is not needed here. a for-loop does the job, unless h5py sucks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. h5py doesn't seem to suck. :-)
for cname, cvalue in hdfa[sname].items(): | ||
if cname not in hdfb[sname].keys(): | ||
for cname, cvalue in list(hdfa[sname].items()): | ||
if cname not in list(hdfb[sname].keys()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
list is needed indeed here, but you should only make it once before the for-loop so that you do not instanciate it many times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
@@ -262,7 +264,7 @@ def Q_all_memory(prev_result, obs, sedgrid, ast, qnames_in, p=[16., 50., 84.], | |||
|
|||
# get the names of all the children in the ast structure | |||
ast_children = [] | |||
for label, node in ast.root._v_children.items(): | |||
for label, node in list(ast.root._v_children.items()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we mixing h5py and pytables? if so we should only use one.
(list is not needed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I've been trying to move to h5py. But this move is not complete as you might imagine. Part of the move to astropy tables.
@@ -123,7 +124,7 @@ def __init__(self, shape, h = 1.0, domain = None, norm = None): | |||
norm = 1.0 | |||
self._normconst = norm | |||
self.domain = domain | |||
if callable(shape): | |||
if isinstance(shape, collections.Callable): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution, I've never used that object but it refers to an ABC class that has multiple methods.
I do not know if any object with a call() method enters that category.
The callable function was simply a test on the call method
class collections.Callable
ABCs for classes that provide respectively the methods contains(), hash(), len(), and call().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how to answer this question. This change was done by the 2to3 program.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, where do we use the kernels.py file in the fitting process? Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't use it. This was part of previous ways of interpolation - at least that is my memory of this. We could remove this, but should do this with a different issue/pull request. Removing old code we are not using is on the to do list and there is an issue for this.
|
||
import numpy as np | ||
import tables | ||
|
||
import toothpick | ||
from . import toothpick |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have a separe toothpick if in "generic_noisemodel" we make more toothpick functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea for the generic_noisemodel was to provide a noisemodel that could be used by most projects. No need for the same code to be in every project directory.
Keeping the toothpick model details separate allows for someone to make a custom noisemodel if they want.
@@ -331,9 +335,9 @@ def interpolate(self, sedgrid, progress=True): | |||
compl = np.empty((N, M), dtype=float) | |||
|
|||
if progress is True: | |||
it = Pbar(desc='Evaluating model').iterover(range(M)) | |||
it = Pbar(desc='Evaluating model').iterover(list(range(M))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pbar can take the number of iterations (here M) as parameter so you don't have to instanciate the list. (useful when it gets big)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Thanks.
@@ -90,30 +83,11 @@ def __getitem__(self, *args, **kwargs): | |||
|
|||
def keys(self): | |||
""" Returns dataset content names """ | |||
return self.data.keys() | |||
return list(self.data.keys()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should respect the way python works and therefore returns the keys not the list of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. list() removed.
@@ -173,8 +174,10 @@ def gen_spectral_grid_from_stellib(osl, oiso, ages=(1e7,), masses=(3,), | |||
|
|||
# some constants | |||
kdata = 0 | |||
rsun = ezunits.unit['Rsun'].to('m').magnitude # 6.955e8 m | |||
|
|||
rsun = 6.955e8 # in meters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use the astropy constant as imported above. Makes more sense and propagates units.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I did not do this yet as I wanted the regression testing to not key off of small changes in the solar radius. Will make an issue so that we make the change to astropy constants for the whole BEAST including here at some point.
…d). Also removed the hard-coding of the survey name from the IAU name creation code.
@@ -117,7 +174,6 @@ | |||
|
|||
# read in the observed data | |||
obsdata = datamodel.get_obscat(datamodel.obsfile, | |||
datamodel.distanceModulus, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make the same correction in line 129.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Karl for making this many changes. I don't see major issues in updating python 2 to 3. Tests are needed to make sure that all these changes work well and reproduce what we got with the python 2 version BEAST.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One error caught in ezmist's simpletable.py (need to mirror edits made to ezpadova's version), and concerns about dropping distanceModulus.
@@ -125,35 +99,33 @@ def setFilters(self, filters): | |||
|
|||
def getMags(self, num, filters): | |||
raise Exception('Do not use as magnitudes') | |||
return np.array([ self.data[tt][num] - self.distanceModulus for tt in filters]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The removal of distanceModulus seems like it will affect the comparison of obs and model fluxes -- are their any sanity checks that all code is abiding by new convention? i.e. -- when trimming the grid I get an error stating: "no models are brighter than the minimum ASTs run", which could be caused by mismatch in conventions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After further thought, I think the removal of distanceModulus needs to be reverted. Comparison between models and data is done in intrinsic fluxes (at distance of the galaxy), and convention was to convert flux observations from measured to intrinsic to match the unscaled model fluxes. This conversion from measured to intrinsic still needs to occur to match the flux scale of the models!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have not been using the distancemodulus in the observation model for sometime. It was originally used when we took the observed data and put it into absolute units (magnitudes I believe). Then the models were all computed in absolute units (distance of 10 pc). And we were comparing the models to the observations in magnutide units - which was incorrect!
This is not the correct way to do the comparison given to get the correct noisemodel, we have to put the models at the distance of the galaxy we are observing and compute the uncertainties on the model fluxes using the ASTs. Thus, we removed the use of the distancemodulus in the observationmodel. But did not remove passing it to the routines that read in the observed data.
We still use the distance to the galaxy. It is used to calculate the model fluxes at the distance of the galaxy. And then we directly compare the observed fluxes to the model fluxes and uncertainties.
So in summary, removing the distancemodulus from the observationmodel code is just remove old code no longer used. And make things consistent and avoid going back to the old and incorrect way of comparing data to models. Ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully this comment was not too long-winded. Just trying to clearly state the issue. Not sure I did. Having some allergy fuzziness issues today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way of saying this is that the distance should be used in the physicsmodel code, but not in the observationsmodel code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given @lcjohnso's comment and error message, it's important to check that properly. @lcjohnso can you give more to @karllark to chew?
I agree with @karllark, Beast makes the models at the distance of the population, not the way around (in principle). The only place where the distance modulus in Observations
class is useful is to generate mock data, but that could be left to the tester.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the further explanation. If I'm understanding correctly, are you saying that the getMags and getFlux functions defined here are never used? If so, then I'd advocate for removing them completely to avoid confusion rather than change their utility/purpose (from distance-corrected to non-distance-corrected).
Can you confirm that these functions are not used elsewhere? Otherwise the abrupt change of removing the distance correction continues to worry me - shouldn't this removal require subsequent edits elsewhere to account for the different treatment of distance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The getmags is only used in observations.py itself.
The getflux is used in other places, but the key is that the stock getflux function is redefined in datamodel.py. And we have been redefining it this way for a long time. This actually illustrates how having the distance in the stock versions in observations.py is dangerous. The "experts" know not to use it, but newer users would not necessarily know this. Removing it in the stock functions means that the experts and soon-to-be experts will both use the right functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lcjohnso in your first comment, you had an error. Was this with the phat_small example? I've just run this example and am not getting that error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks -- I agree that your edits make good sense. Sorry for the delay -- I didn't realize that getFlux()
was being redefined in datamodel.py, so the non-distanceModulus version of the transformation was already in effect.
The error I was encountering was due to an unrelated issue. I was testing the code on SMIDGE data where I'm only fitting a subset of the bands, and the n_detected
keyword in trim_grid.trim_models()
needed to be set to a non-default value to run correctly.
@@ -71,13 +72,13 @@ | |||
if PY3: | |||
iteritems = operator.methodcaller('items') | |||
itervalues = operator.methodcaller('values') | |||
basestring = (str, bytes) | |||
str = (str, bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
str -> strtype as in simpletable.py in ezpadova throughout this file, or one could simply duplicate ezpadova's simpletable.py here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand this comment. What particular change are you suggesting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here the variable currently named str
, which is being set equal to (str, bytes)
, should be renamed strtype throughout the file. Not sure one can make a variable named str
since it is a built-in type. This substitution would make ezmist/simpletable.py consistent with the edits made to ezpadova/simpletable.py (in fact, the files should be identical in the end).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Cliff here, you should not redefine the str type variable, esp. by a tuple containing itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. In fact, having two copies of the same file is not good either. So I moved the simpletable.py from ezpadova to the parent directory and have changed both padova.py and mist.py to import from this single file. Easier to maintain if we have just one version of this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your thorough explanation regarding my questions. The code looks good to go!
BEAST runs in python 2 and 3 (yeah!)
The BEAST has been updated to run in python 3 in addition to python 2. This required changes to many of the files. In addition, the ezpipe "external" code has been removed as it was causing significant issues. The run_beast.py file now has direct calls to the steps needed to create the physicsmodel grid. Finally, the ezunits package was removed for similar reasons and units support moved to the astropy units package.
The update was done by starting with the 2to3 tool. Then the code was tested with both python 2 and 3 simultaneously correcting errors as they arose to allow the code to run in both python versions.
This was mainly done during the M31 Lorentz meeting in Jul 2017, especially on the hackday. And then finished up a couple of weeks later.
This pull request should close issues #8, #11, and #79.