Better error messages when target column dtype is not supported #1095

Open
paxcema opened this issue Jan 25, 2023 · 7 comments
Labels
enhancement New feature or request

Comments

@paxcema
Member

paxcema commented Jan 25, 2023

Motivated by an example where a short_text target column produces the following error:

Please specify a custom accuracy function for output type short_text.
@paxcema paxcema added the enhancement New feature or request label Jan 25, 2023
@Sumanth077

Hi @paxcema, I can help with this. Please let me know if you are not currently working on this issue.

@paxcema
Member Author

paxcema commented Jan 31, 2023

Hey @Sumanth077 — I don't think there's anyone working on this at the moment, so your help would definitely be appreciated here. Let's discuss first, though. Do you have a rough idea on how to tackle this?

I think we may need to store a map of mixers and their respective supported target data types, to then check whether the final JsonAI is capable of tackling the target dtype.
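A minimal sketch of what such a map could look like, written as plain Python with no Lightwood imports. The mixer names, the dtype sets, and the helper `unsupported_mixers` are all hypothetical and only illustrate the shape of the idea; they do not describe what any real mixer supports.

```python
from typing import Dict, FrozenSet, List

# Hypothetical mixer names and dtype sets, for illustration only.
SUPPORTED_TARGET_DTYPES: Dict[str, FrozenSet[str]] = {
    "ExampleNeuralMixer": frozenset({"integer", "float", "categorical", "binary"}),
    "ExampleRegressionMixer": frozenset({"integer", "float"}),
}


def unsupported_mixers(target_dtype: str, mixers: List[str]) -> List[str]:
    """Return the mixers in `mixers` that cannot handle `target_dtype`.

    A mixer missing from the map is treated as supporting nothing,
    so it is always reported.
    """
    return [
        name
        for name in mixers
        if target_dtype not in SUPPORTED_TARGET_DTYPES.get(name, frozenset())
    ]
```

With a map like this, a `short_text` target would be flagged for both example mixers, while a `float` target would pass.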

@Sumanth077

Sumanth077 commented Feb 1, 2023

Yeah, that would be a great idea @paxcema: making a list of the mixers we have and their supported target data types.

Can we start by doing that for the 11 mixers currently available in the Mixers category?

@paxcema
Member Author

paxcema commented Feb 3, 2023

Yes, I think you could start by adding the supported target data types for these, maybe as an attribute in BaseMixer that is overridden in the specific __init__ of each one.

Then, when building either the code or the predictor itself out of a JsonAI object, we can check whether the target data type in the dtype_dict is contained in every mixer's list of supported data types, as well as in that of the ensemble that will use them. If it's not contained, we can raise an informative error. To be precise, this would happen in api.high_level, for the methods code_from_json_ai, code_from_problem and predictor_from_problem.

This way, we will raise the error at "model compilation" time, so to speak, which saves valuable time compared to failing later on.

Does this sound good?
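As a rough sketch of the attribute idea, assuming nothing about the real class beyond its name: the `BaseMixer` below is a stand-in rather than Lightwood's actual class, and `supported_target_dtypes`, `ExampleRegressionMixer`, and `all_mixers_support_target` are hypothetical names chosen for illustration. The ensemble check mentioned above is left out to keep it short.

```python
from typing import Dict, List


class BaseMixer:
    """Stand-in for the real base class; only the proposed attribute is modeled."""

    def __init__(self) -> None:
        # Empty by default: a mixer has to opt in to every target dtype it handles.
        self.supported_target_dtypes: List[str] = []


class ExampleRegressionMixer(BaseMixer):
    """Hypothetical mixer; the dtype list below is illustrative, not real."""

    def __init__(self) -> None:
        super().__init__()
        # Override the base attribute in the subclass __init__.
        self.supported_target_dtypes = ["integer", "float"]


def all_mixers_support_target(
    target: str, dtype_dict: Dict[str, str], mixers: List[BaseMixer]
) -> bool:
    """Check that the target's dtype is listed by every mixer."""
    target_dtype = dtype_dict[target]
    return all(target_dtype in m.supported_target_dtypes for m in mixers)
```

Defaulting to an empty list means a mixer that forgets to declare its dtypes fails the check instead of silently passing it.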

@Sumanth077

Sure @paxcema, that sounds good. I will look into it and let you know if I need any further clarification.

@Sumanth077

Hi @paxcema, I am commenting here since this conversation gives the clearest context for my question.

As you suggested, I have made the changes to the base mixer ✅ Now it would be great to know how to approach raising an informative error.

You mentioned that we should raise an error when building either the code or the predictor itself out of a JsonAI object.

But I think the error above, "Please specify a custom accuracy function for output type short_text.", is raised earlier, when the JsonAI object itself is created from the problem definition in the "generate_json_ai" function.

So it would be great to know:

  1. When should we raise the error?
  2. How should we go about raising it?

Thank you.

@paxcema
Member Author

paxcema commented Feb 27, 2023

I think the error should be raised somewhere within the JsonAI creation process. I recommend creating a new method and calling it from api.json_ai.validate_json_ai. This method would take the JsonAI object, check the target dtype, then sweep across all mixers that have been added. If any of them does not support this dtype, log an error and raise an Exception.
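A possible shape for that method, as a sketch only. It assumes a simplified, flat dict in place of the real JsonAI object (the actual JsonAI layout is not reproduced here), and `validate_target_dtype`, the `supported` map, and the key names are hypothetical. It raises `ValueError`, which is a subclass of `Exception`.

```python
import logging
from typing import Any, Dict, Set

log = logging.getLogger("target_dtype_check_sketch")


def validate_target_dtype(json_ai: Dict[str, Any], supported: Dict[str, Set[str]]) -> None:
    """Raise if any mixer listed in `json_ai` does not support the target dtype.

    `json_ai` is a simplified stand-in with the keys "target", "dtype_dict"
    and "mixers"; `supported` maps mixer names to their supported dtypes.
    """
    target = json_ai["target"]
    target_dtype = json_ai["dtype_dict"][target]

    # Sweep across all mixers and collect every one that lacks support.
    offending = sorted(
        name
        for name in json_ai["mixers"]
        if target_dtype not in supported.get(name, set())
    )

    if offending:
        message = (
            f"Target column '{target}' has dtype '{target_dtype}', which is not "
            f"supported by the following mixers: {', '.join(offending)}."
        )
        log.error(message)
        raise ValueError(message)
```

Collecting all offending mixers before raising lets the message name every mixer the user would need to remove or replace, rather than only the first one found.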
