Feature domain for phenomenological and ontological classes #141
Comments
I agree that ra, dec, ccd and quadrant should not be part of the training. I am surprised they were. Were they explicitly used?
Yes, in config.yaml for
They should definitely not be used. In xgboost I have not used them, and based on the old header you mention, I am certain that Dima wouldn't have used those. We were pulling parameters like quad because at one point we were looking at bogus objects as a function of quadrant to understand the types of bogus detections.
That makes sense, I can see how those features would be useful for identifying bogus sources. I'll comment out the inclusion of ra, dec, ccd and quad in the ontological feature list.
This issue is also specifically relevant to the AGN class, for which the Gaia parallax should not be applicable.
Currently, DNN training on our phenomenological classes uses 40 features generated from the ZTF light curves. The ontological classifiers are trained on all of those features along with 34 more. These additional features consist of AllWISE, Gaia and Pan-STARRS magnitudes along with the ra, dec, ccd and quadrant of the source.
How should we proceed with these feature domains going forward? I don't think the coordinates and silicon position of the source should be part of the training (especially for the ontological branch where they're used now), since those features should not inform the intrinsic nature of the source.
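For illustration, here is a minimal sketch of how the positional columns could be filtered out of the ontological feature list. The feature names and the `select_features` helper are hypothetical stand-ins, not the project's actual code; the real lists are defined in config.yaml and are much longer.

```python
# Positional and detector features named in this issue.
POSITIONAL_FEATURES = {"ra", "dec", "ccd", "quad"}


def select_features(candidates, excluded=POSITIONAL_FEATURES):
    """Return the candidate names in their original order, minus excluded ones."""
    return [name for name in candidates if name not in excluded]


# Toy stand-ins (hypothetical names) for the light-curve and catalog features.
phenomenological = ["period", "amplitude"]
catalog = ["w1_mag", "gaia_g_mag", "ps1_r_mag", "ra", "dec", "ccd", "quad"]

# Ontological training would use the light-curve features plus the catalog
# magnitudes, with sky coordinates and detector position removed.
ontological = select_features(phenomenological + catalog)
print(ontological)
```

Keeping the excluded names in one set would also make it easy to re-enable them for the bogus-source studies mentioned in the comments, where detector position is informative.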
Also, the inclusion of additional features for the ontological training means that the distinction between the phenomenological "eclipsing" and ontological "binary star" classes may be more complex than mentioned in #133. It would be helpful to know how the human classifiers treated these two classes during their labeling.