lime function with date columns #39
That is intentional. I have no idea how dates should be sampled in any meaningful way for the permutations.
Understood. I guess I hit the issue because I have a time series model, which I understand is not what LIME was designed for, but I agree with your reasoning. Thank you for supporting the package, by the way!
Hmm, it might make sense to just hold time constant across permutations, so that the explanation gives insight into why, at this time, the model behaves as it does. I'll reopen and give it some more thought.
I'd personally recommend converting the date column to numeric. A date class is a feature that represents time, and time is better described by a numeric scale, which preserves the ordering and distance from one point to the next, than by categories, which do not hold this information. Techniques like RandomForest, XGBoost, and even linear regression convert dates to numeric; if the user wanted to convey a date as a category, it should already be classed as a character/factor.
However, on the topic of time series, are there any consequences when using LIME with a time series model? I understand LIME was built with stationary models in mind (for instance, decision trees), but could LIME's sampling technique produce misleading results?
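To make the "convert the date column to numeric" suggestion concrete, here is a minimal sketch in Python rather than R. It is not part of lime; the helper name `date_to_numeric` and the choice of 1970-01-01 as the origin are assumptions made for illustration (R's `as.numeric()` on a `Date` uses the same day-count convention).

```python
from datetime import date

def date_to_numeric(d: date, origin: date = date(1970, 1, 1)) -> int:
    """Return the number of whole days between `origin` and `d`.

    Hypothetical helper: maps a date onto a numeric scale so that
    ordering and distance between dates are preserved.
    """
    return (d - origin).days

# Three consecutive days map to three consecutive integers.
dates = [date(2017, 11, 13), date(2017, 11, 14), date(2017, 11, 15)]
numeric = [date_to_numeric(d) for d in dates]
```

The point of the numeric encoding is that the gap between two values is meaningful (here, one unit per day), which a factor encoding would not capture.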
What I'm suggesting is to hold the Date column constant, not to convert it to something else. My rationale is that you're often not interested in learning that your model is time-dependent; that is implicit in time series. Instead, you are more interested in knowing how the different variables, as they are at this specific time, have contributed.
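A rough illustration of the "hold the date constant" idea, written in Python and not taken from lime's code: numeric features are perturbed with Gaussian noise while any date value is copied unchanged into every sample. The function name `perturb_holding_dates`, the Gaussian noise, and the example feature names are all assumptions for this sketch; lime's actual sampling scheme may differ.

```python
import random
from datetime import date

def perturb_holding_dates(observation, n_samples, scale=1.0, seed=None):
    """Generate perturbed copies of one observation.

    Hypothetical sketch: numeric features get Gaussian noise added,
    date features are held constant, everything else is copied as-is.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        sample = {}
        for name, value in observation.items():
            if isinstance(value, date):
                sample[name] = value  # time is held fixed across permutations
            elif isinstance(value, (int, float)):
                sample[name] = value + rng.gauss(0.0, scale)
            else:
                sample[name] = value  # non-numeric features left untouched here
        samples.append(sample)
    return samples

# Made-up observation purely for demonstration.
observation = {"date": date(2017, 11, 15), "temperature": 8.5, "demand": 120.0}
perturbed = perturb_holding_dates(observation, n_samples=5, seed=42)
```

Every perturbed row shares the original date, so a local explanation fitted on these samples would only reflect how the other variables contribute at that point in time.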
Ah yes, I understand what you mean now. This makes perfect sense to me, as LIME is sampling around the variables for a given epoch, so it makes the most sense to keep time static when doing so. Nice idea! Any idea when this could be implemented?
I won't make any promises, but it could probably be included in the next update, due within the next couple of months.
Do you have a dummy model and data I can play with? I don't really have any real-life time series data to validate with...
FYI the feature is being implemented in the date-support branch
Hi Tom,
Apologies for the delay. I don't really have a model or file I can share; all the data I have is sensitive.
However, a dummy set should suffice for testing.
Looking forward to seeing the new version!
Thank you for your hard work on this :)
Regards, Arun
On 15 Nov 2017, at 13:22, Thomas Lin Pedersen <notifications@github.com> wrote:
FYI the feature is being implemented in the date-support branch
The explain function errors whenever I have a date column in my dataset. This is a minor issue, but I thought I should flag it anyway.