Feature domain for phenomenological and ontological classes #141

Closed
Tracked by #54
bfhealy opened this issue Oct 28, 2022 · 5 comments
Labels
question Further information is requested

Comments

@bfhealy
Collaborator

bfhealy commented Oct 28, 2022

Currently, DNN training on our phenomenological classes uses 40 features generated from the ZTF light curves. The ontological classifiers are trained on all of those features plus 34 more. These additional features consist of AllWISE, Gaia and Pan-STARRS magnitudes, along with the ra, dec, ccd and quadrant of the source.

How should we proceed with these feature domains going forward? I don't think the coordinates and silicon position of the source should be part of the training (especially for the ontological branch where they're used now), since those features should not inform the intrinsic nature of the source.

Also, the inclusion of additional features for the ontological training means that the distinction between the phenomenological eclipsing and ontological binary star classes may be more complex than mentioned in #133. It would be helpful to know how the human classifiers treated these two classes during their labeling.
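As a rough illustration of the proposal to keep positional features out of training, the sketch below filters them from a feature list. The feature names other than ra, dec, ccd and quad are made up for the example, and the helper `drop_positional` is hypothetical, not part of the project's code.

```python
# Positional/detector features named in this issue. The other feature
# names used below are invented for illustration only.
POSITIONAL_FEATURES = {"ra", "dec", "ccd", "quad"}


def drop_positional(features):
    """Return the feature names in their original order, minus positional ones."""
    return [name for name in features if name not in POSITIONAL_FEATURES]


# Hypothetical ontological feature list mixing light-curve, catalog and
# positional entries.
ontological = ["period", "amplitude", "ra", "dec", "ccd", "quad", "w1_mag"]
print(drop_positional(ontological))
```

Filtering by name keeps the remaining features in their configured order, which matters if the trained model expects a fixed column order.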

@bfhealy bfhealy added the question Further information is requested label Oct 28, 2022
@bfhealy bfhealy mentioned this issue Oct 28, 2022
48 tasks
@AshishMahabal
Collaborator

I agree that ra, dec, ccd and quadrant should not be part of the training. I am surprised they were. Were they explicitly used?

@bfhealy
Collaborator Author

bfhealy commented Oct 28, 2022

Yes. In the features: section of config.yaml, they are explicitly listed under the ontological: header. They were commented out under an older-looking header in the list (ontological_d13:), but under the one currently used by the training, they are uncommented.
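A quick way to catch this kind of slip is to check the active header for unwanted entries. The sketch below assumes the parsed config is a dict with feature lists under each header; that layout, the feature names other than ra, dec, ccd and quad, and the helper `find_forbidden` are all assumptions for illustration, not the actual structure of config.yaml.

```python
# Stand-in for the parsed config.yaml. The layout (a list of names per
# header) and the non-positional feature names are assumed.
config = {
    "features": {
        # Older header: positional entries were commented out, so absent here.
        "ontological_d13": ["period", "w1_mag"],
        # Header currently used by the training: positional entries active.
        "ontological": ["period", "w1_mag", "ra", "dec", "ccd", "quad"],
    }
}

FORBIDDEN = ("ra", "dec", "ccd", "quad")


def find_forbidden(config, header):
    """List forbidden feature names that are active under the given header."""
    active = config["features"][header]
    return [name for name in FORBIDDEN if name in active]


print(find_forbidden(config, "ontological"))
print(find_forbidden(config, "ontological_d13"))
```

A check like this could run before training starts, so that an uncommented positional feature raises a warning instead of silently entering the model.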

@AshishMahabal
Collaborator

They should definitely not be used. In xgboost I have not used them, and based on the old header you mention, I am certain that Dima wouldn't have used them either. We were pulling parameters like quad because at one point we were looking at bogus objects as a function of quadrant to understand the types of bogus sources.

@bfhealy
Collaborator Author

bfhealy commented Oct 28, 2022

That makes sense; I can see how those features would be useful for identifying bogus sources. I'll comment out ra, dec, ccd and quad in the ontological feature list.

@bfhealy bfhealy linked a pull request Oct 31, 2022 that will close this issue
@bfhealy
Collaborator Author

bfhealy commented Nov 17, 2022

This issue is also specifically relevant to the AGN class, for which the Gaia parallax should not be applicable.
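One possible way to handle class-specific cases like this is a per-class exclusion map applied on top of the shared feature list. This is only a sketch: the class key "agn", the feature name "gaia_parallax", the other feature names, and the helper `features_for_class` are hypothetical and do not reflect how the project resolved the question.

```python
# Hypothetical per-class exclusions: features that should not inform
# a given classifier. Names are placeholders, not the project's own.
CLASS_EXCLUSIONS = {"agn": {"gaia_parallax"}}


def features_for_class(class_name, features):
    """Return the shared feature list minus any exclusions for this class."""
    excluded = CLASS_EXCLUSIONS.get(class_name, set())
    return [name for name in features if name not in excluded]


shared = ["period", "w1_mag", "gaia_parallax"]
print(features_for_class("agn", shared))
print(features_for_class("binary", shared))
```

Classes without an entry in the map fall through to the full shared list, so only the exceptions need to be spelled out.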

@bfhealy bfhealy closed this as completed Aug 4, 2023