Hyper parameters tuning for model #48

Open · momegas opened this issue Nov 30, 2022 · 9 comments
Labels
enhancement New feature or request

Comments

@momegas
Member

momegas commented Nov 30, 2022

No description provided.

@momegas momegas added the enhancement New feature or request label Nov 30, 2022
@NickNtamp
Contributor

NickNtamp commented Dec 9, 2022

Hyper-parameter tuning is pretty easy to perform using grid search (a minimal sketch follows below this comment).

There are some questions, though:

  • Do you believe there should be a time threshold (e.g. tuning should not take more than 20 seconds)?
  • Do you believe we should set an evaluation metric threshold (e.g. if a model achieves 90% accuracy, pick that model)?
  • Model training will be performed once per training set, meaning we retrain the model only if the training set changes. Do we need to keep track of models (e.g. by using MLflow)?
  • Regardless of whether we use MLflow or not, do we need to save the hyper-parameters of the optimal model somewhere?

cc: @momegas , @gcharis , @stavrostheocharis
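
A minimal sketch of what the grid search could look like, assuming a scikit-learn classifier and accuracy as the metric. The estimator, parameter grid, and the synthetic placeholder data are illustrative, not Whitebox's actual training code:

```python
# Minimal grid-search sketch (illustrative; not Whitebox's actual training code).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Placeholder training set; in Whitebox this would be the uploaded training data.
X_train, y_train = make_classification(n_samples=500, n_features=10, random_state=42)

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
}

search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_grid=param_grid,
    scoring="accuracy",
    cv=3,
    n_jobs=-1,
)
search.fit(X_train, y_train)

best_model = search.best_estimator_
best_params = search.best_params_  # candidates for persisting (see the questions above)
best_score = search.best_score_    # mean cross-validated accuracy of the best model
```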

@stavrostheocharis
Contributor

Do you believe there should be a time threshold (e.g. tuning should not take more than 20 seconds)?

  • It depends on when this pipeline runs. If it runs in near real-time, I think we should have a threshold; if it runs on a scheduler, it is not a problem.

Do you believe we should set an evaluation metric threshold (e.g. if a model achieves 90% accuracy, pick that model)?

  • Maybe just pick the one with the highest accuracy. But then what happens if the best model is still a poor model?

Model training will be performed once per training set, meaning we retrain the model only if the training set changes. Do we need to keep track of models (e.g. by using MLflow)?

  • If we are going to keep the model, a solution like this could be implemented, but MLflow would take considerable effort to integrate (database, paths, deployment, etc.).

Regardless of whether we use MLflow or not, do we need to save the hyper-parameters of the optimal model somewhere?

  • I think it would be good to save them, and maybe also keep the evaluation metrics to show to the user, so they know exactly how precise our explanation is (a lightweight sketch follows below).
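
For that last point, a lightweight alternative to a full MLflow setup could be to persist the chosen hyper-parameters and evaluation metrics as a small JSON artifact next to the model. A sketch under that assumption; the file path, field names, and example values are made up, not an agreed format:

```python
# Sketch: persist the best hyper-parameters and eval metrics without MLflow.
# The path, field names, and example values are illustrative only.
import json
from datetime import datetime, timezone

best_params = {"n_estimators": 200, "max_depth": 10}  # e.g. GridSearchCV.best_params_
eval_metrics = {"accuracy": 0.91, "f1": 0.89}         # e.g. from cross-validation

artifact = {
    "trained_at": datetime.now(timezone.utc).isoformat(),
    "hyper_parameters": best_params,
    "evaluation_metrics": eval_metrics,
}

with open("model_metadata.json", "w") as f:
    json.dump(artifact, f, indent=2)
```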

@momegas
Member Author

momegas commented Dec 14, 2022

I think it's important to keep the goal of Whitebox in mind. The goal is monitoring, not creating models (at least not for now).
With this in mind, I think we should either have quick tuning or none at all. My understanding of this issue was that it would involve just some adjustments to the training, not building a whole other feature.

Think about this, and if we can fit just that into the timebox we have, good. Otherwise, I would look at something else.

@NickNtamp
Contributor

After some discussions with @stavrostheocharis, we concluded that the requirements of this task are still pretty blurry. I will try to simplify them with some simple questions below, so please @momegas, let us know when you have the time.

  1. Do we want the possibility of a better model, i.e. one that predicts more accurate results? This would also mean more accuracy in the explainability.
  2. If not, we can close the ticket. If yes, how much time are we willing to spend on tuning to search for the best model? A metric threshold could also help here: for instance, if we tell the search to iterate through 20 different combinations of hyper-parameters and acceptable performance is achieved even on the first iteration, stop there and consider that the best model (see the sketch after this list).
  3. Do we want to keep track of the best hyper-parameters in some way?
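
To make point 2 concrete, one possible shape for a bounded search: sample up to a fixed number of hyper-parameter combinations, evaluate each, and stop early once a metric threshold is met. The budget (20 combinations), the 0.90 accuracy threshold, the estimator, and the placeholder data below are illustrative assumptions:

```python
# Sketch: bounded random search that stops early at an accuracy threshold.
# The budget (20 combinations) and threshold (0.90) are illustrative values.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ParameterSampler, cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

param_distributions = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [None, 3, 5, 10],
    "min_samples_leaf": [1, 2, 5],
}

best_score, best_params = -1.0, None
for params in ParameterSampler(param_distributions, n_iter=20, random_state=0):
    score = cross_val_score(
        RandomForestClassifier(**params, random_state=0), X, y, cv=3
    ).mean()
    if score > best_score:
        best_score, best_params = score, params
    if best_score >= 0.90:  # acceptable performance reached; stop early
        break

print(best_score, best_params)
```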

@stavrostheocharis stavrostheocharis added the blocked This issue is blocked label Jan 2, 2023
@momegas
Member Author

momegas commented Jan 3, 2023

I think we should not spend more time on this, as a better model will not add much value to WB at the moment, since we are still missing more core features.
Feel free to close this if needed @NickNtamp

@NickNtamp
Contributor

Sure, I can close the ticket @momegas.
Before doing so, I want to remind both you and @stavrostheocharis that by not exploring hyper-parameter combinations to increase the chance of building a better model on an unknown dataset, we accept a high risk of explaining a trash model. Just imagine that we build a model with an accuracy of 20% and then use it for our explainability feature (a minimal safeguard sketch follows below).
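
One cheap safeguard, regardless of whether we tune: check the trained model's score against a minimum bar and flag (or skip) the explainability output when it is too weak. A rough sketch; the 0.6 threshold and the function name are arbitrary illustrations, not agreed values:

```python
# Sketch: refuse to trust explanations from a weak model.
# The 0.6 threshold is an arbitrary illustration, not an agreed value.
from sklearn.model_selection import cross_val_score

MIN_ACCEPTABLE_ACCURACY = 0.6

def is_model_reliable(model, X, y) -> bool:
    """Return True if the model clears the minimum accuracy bar."""
    accuracy = cross_val_score(model, X, y, cv=3).mean()
    return accuracy >= MIN_ACCEPTABLE_ACCURACY

# Example usage (placeholder model/data):
# if not is_model_reliable(model, X, y):
#     ...  # surface a warning instead of showing explanations
```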

@stavrostheocharis
Contributor

I would keep this issue in the backlog, in order to investigate it further and implement an enhancement in the future.

@momegas
Member Author

momegas commented Jan 12, 2023

It was actually requested! You are right. I will re-open this.

@NickNtamp
Contributor

We should explore alternatives like https://optuna.org/ here.
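
A minimal sketch of what an Optuna-based search might look like, combining a trial budget with a time budget (the objective, search space, placeholder data, and the 20-trial / 20-second limits are illustrative assumptions):

```python
# Sketch: the bounded search expressed with Optuna (illustrative, not agreed code).
import optuna
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 400),
        "max_depth": trial.suggest_int("max_depth", 2, 10),
    }
    return cross_val_score(
        RandomForestClassifier(**params, random_state=0), X, y, cv=3
    ).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20, timeout=20)  # trial budget plus time budget in seconds

print(study.best_value, study.best_params)
```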
