
Use evaluation and explanation as a standalone package? #14

Closed
ydennisy opened this issue Sep 18, 2020 · 2 comments
Labels
question Further information is requested

Comments

@ydennisy

Hey, cool project!

Just wondering if you think it is feasible to use the explanations with other models?

sergioburdisso added the "question" label Sep 21, 2020
@sergioburdisso (Owner) commented Sep 21, 2020

Hey @ydennisy! Thanks :D

I think it could be possible to use the explanations with other models. As long as the Live Test tool receives a JSON holding the value each word/element has, it will work regardless of the model being used. In principle, updating the server module's code (server.py) so that the classification-result JSON can be generated by other models will do the trick. However, since most supervised machine learning models work as black boxes, representing input documents as feature vectors, we would need some method/technique on top of them (such as LIME) to infer/estimate how valuable each raw input element was according to the model being used.

This was straightforward for the SS3 model because, by design, the model explicitly learns a confidence value for each input element and each class, and classification is then performed directly from those values. This confidence value tells how relevant each input element is, as if we were asking "how much value does this element have for this class?". For example, after training, SS3 would learn something like:

value("apple", technology) = 0.7
value("apple", business) = 0.5
value("apple", food) = 0.9

and

value("the", technology) = value("the", business) = value("the", food) = 0

That is, "apple" has a value of 0.7 for technology, 0.5 for business, and 0.9 for food, whereas "the" has no value for any category. Note that even Multinomial Naive Bayes (a white-box model) wouldn't work "out of the box" either, since it weights each element by its probability (log P(w|c)), and hence (stop)words like "the", "a", "with", etc. would have the highest value (i.e. the highest probability).
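To make the Naive Bayes caveat concrete, here is a minimal, self-contained sketch (toy corpus and JSON field names are made up; this is not PySS3's actual server format). It shows that a stopword like "the" gets the highest log P(w|c) weight under Multinomial Naive Bayes, and how any model's per-word values could in principle be packaged as a JSON payload for a Live-Test-like tool:

```python
import json
import math
from collections import Counter

# Toy per-class training corpus (hypothetical data, for illustration only).
docs = {
    "technology": "the new apple chip runs on the fast processor",
    "food": "the apple pie with the fresh apple and the recipe",
}

counts = {c: Counter(text.split()) for c, text in docs.items()}
vocab = {w for cnt in counts.values() for w in cnt}

def mnb_weight(word, cls, alpha=1.0):
    """Multinomial Naive Bayes weight: log P(w|c) with Laplace smoothing."""
    total = sum(counts[cls].values())
    return math.log((counts[cls][word] + alpha) / (total + alpha * len(vocab)))

# Under MNB, the stopword "the" gets the highest weight simply because it is
# the most frequent word -- exactly the problem described above.
top_word = max(vocab, key=lambda w: mnb_weight(w, "food"))
print(top_word)  # -> the

# Regardless of where the values come from (SS3, LIME, etc.), a Live-Test-like
# tool would only need a JSON mapping each word to its per-class values
# (the field names here are invented for this sketch):
payload = {
    "words": [
        {"token": w, "value": {c: mnb_weight(w, c) for c in docs}}
        for w in "the apple pie".split()
    ]
}
print(json.dumps(payload, indent=2))
```

SS3 avoids this by construction: its learned confidence values assign stopwords a value of zero, so no post-hoc reweighting is needed.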

Regarding Evaluation, I think it is also feasible, at least for models with three hyperparameters or fewer; with more than three, we would need to think about how to adapt the Evaluation 3D Plot UI to let users select only the (up to) three hyperparameters they are interested in.
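One way to support more than three hyperparameters, sketched below under assumed names (the hyperparameter labels and results dict are hypothetical, not PySS3's actual evaluation cache format), is to let the user pick three axes and fix the rest, projecting the full grid onto the chosen three for the 3D plot:

```python
# Hypothetical evaluation grid: (s, l, p, a) hyperparameter tuples -> accuracy.
results = {
    (0.3, 1.0, 0.5, 0): 0.81,
    (0.3, 1.0, 0.5, 1): 0.83,
    (0.5, 1.6, 0.5, 0): 0.90,
    (0.5, 1.6, 1.0, 1): 0.88,
}
names = ("s", "l", "p", "a")

def project(results, names, axes, fixed):
    """Keep only the three axes the user selected, fixing the remaining
    hyperparameters to the given values."""
    keep = [names.index(a) for a in axes]
    rest = {names.index(k): v for k, v in fixed.items()}
    return {
        tuple(key[i] for i in keep): score
        for key, score in results.items()
        if all(key[i] == v for i, v in rest.items())
    }

# Plot s, l, p while holding a = 0 fixed:
points = project(results, names, axes=("s", "l", "p"), fixed={"a": 0})
print(points)  # -> {(0.3, 1.0, 0.5): 0.81, (0.5, 1.6, 0.5): 0.90}
```

The resulting 3-tuples map directly onto the three plot axes, so the existing 3D UI would only need a small selector on top.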

@sergioburdisso (Owner)

(I'm closing this issue, but feel free to reopen it whenever you want. I'll also reopen it if, sometime in the future, I'm able to actually implement this. Anyway, thanks for raising the question 💪🤓👍 Take care!)
