Feature scaling considerations #459

Open
bfhealy opened this issue Aug 4, 2023 · 0 comments


bfhealy commented Aug 4, 2023

During both training and inference, SCoPe normalizes all features to the range [0, 1] based on the min/max of each feature's distribution in the training set. This consistency between training and inference is very important. However, since the features in the training set do not necessarily span the same min/max as those in an arbitrary ZTF field, inference sources may have their features scaled to values outside [0, 1]. It would be ideal (but likely impractical) to compute features for all ZTF sources first, and then use those global min/max values to normalize features for training/inference.
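To make the out-of-range behavior concrete, here is a minimal sketch of min/max scaling with training-set statistics. It is not SCoPe's actual code: the helper names `fit_min_max` and `scale` and the feature values are made up for illustration.

```python
# Illustrative sketch only; helper names and values are hypothetical,
# not taken from the SCoPe codebase.

def fit_min_max(train_values):
    """Return the (min, max) of one feature over the training set."""
    return min(train_values), max(train_values)


def scale(values, stats):
    """Min/max-scale values using previously fitted (min, max) stats."""
    lo, hi = stats
    span = hi - lo
    if span == 0:
        # Constant feature in the training set: map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


# Hypothetical values of a single feature.
train_feature = [0.5, 1.0, 2.0, 4.5]   # training set
field_feature = [0.25, 2.5, 6.5]       # sources in some other field

stats = fit_min_max(train_feature)

scaled_train = scale(train_feature, stats)  # always within [0, 1]
scaled_field = scale(field_feature, stats)  # may fall outside [0, 1]

print(scaled_train)
print(scaled_field)
```

With these made-up numbers, the field values below the training minimum and above the training maximum scale to a negative value and a value greater than 1, respectively.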

Another scaling option is to normalize a field's features based on the min/max values of that specific field's feature distributions (rather than the training set). However, while the [0, 1] range would then always be enforced, this would create an inconsistency between what a scaled feature value of 0.5 means during training vs. during inference. That inconsistency would seem to be more detrimental to classification than our current approach, which only suffers when features fall outside the range that the classifier was trained on.
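The trade-off between the two options can be illustrated with a toy comparison. This is a sketch under assumed values, with a hypothetical helper `min_max_scale`; it does not reflect how SCoPe implements scaling.

```python
# Illustrative comparison of the two scaling options; all names and
# numbers are hypothetical.

def min_max_scale(values, lo, hi):
    """Scale values so that lo maps to 0 and hi maps to 1."""
    return [(v - lo) / (hi - lo) for v in values]


train = [0.0, 2.0, 4.0]    # one feature in the training set
field = [2.0, 6.0, 10.0]   # same feature in a field with a wider range

# Training: a raw value of 2.0 scales to 0.5.
train_scaled = min_max_scale(train, min(train), max(train))

# Current approach: scale the field with training-set min/max.
# Raw 2.0 still maps to 0.5, but larger values exceed 1.
field_with_train_stats = min_max_scale(field, min(train), max(train))

# Alternative: scale the field with its own min/max.
# Output stays in [0, 1], but 0.5 now corresponds to raw 6.0.
field_with_own_stats = min_max_scale(field, min(field), max(field))

print(train_scaled)
print(field_with_train_stats)
print(field_with_own_stats)
```

In this example the per-field option keeps every value inside [0, 1], but a scaled value of 0.5 corresponds to a raw value of 2.0 in training and 6.0 at inference, which is the inconsistency described above.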
