Example of how to use CatBoost with time series data #53
Comments
We don't have any specific support for time series in CatBoost, so you need to find your own way to prepare the data for use in gradient boosting.
Shouldn't the video be changed, as it's misleading?
No, it's very common to use gradient boosting for time series. How you use it depends on your task.
The whole idea is that you must prepare the data for boosting. Tree-based methods work by making cuts (splits) on feature values, chosen to maximize information gain, that is, the reduction in entropy. Boosting does not fit any linear equation to the data. So if a training feature takes values in [9, 11], boosting may place some cuts inside that range. But as soon as you evaluate on a validation set where this feature ranges over [11, 13], the previous cuts do not work at all, and we might get 0.5 prediction accuracy. Okay, so you need to normalize the data, but in my opinion the ordinary sklearn MinMax or Standard scalers just rescale the data, so a shifting kind of transformation may help instead. That way we get deltas, which are much better suited to classification problems... Moreover, we can try normalizing these deltas by dividing them by the value of the original feature.
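A minimal sketch of the delta idea, in plain Python. The helper names `deltas` and `relative_deltas` and the sample values are made up for illustration, and dividing by the previous value (rather than the current one) is an assumption about what "the original feature" means here. It shows that the raw train and validation ranges barely overlap while the deltas fall in the same range.

```python
def deltas(values):
    """Consecutive differences x[t] - x[t-1]."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


def relative_deltas(values):
    """Consecutive differences divided by the previous value (an assumption)."""
    return [(curr - prev) / prev for prev, curr in zip(values, values[1:])]


# Hypothetical feature values: train lives in [9, 11], validation in [11, 13].
train = [9.0, 9.5, 10.0, 11.0]
valid = [11.0, 11.5, 12.0, 13.0]

# Raw values: a cut learned inside [9, 11] says nothing about [11, 13].
raw_ranges = (min(train), max(train)), (min(valid), max(valid))

# Deltas: both sets now cover the same range, so cuts can carry over.
train_deltas = deltas(train)
valid_deltas = deltas(valid)
delta_ranges = (
    (min(train_deltas), max(train_deltas)),
    (min(valid_deltas), max(valid_deltas)),
)

print("raw ranges:  ", raw_ranges)
print("delta ranges:", delta_ranges)
print("relative deltas (train):", relative_deltas(train))
```

With pandas, `Series.diff()` and `Series.pct_change()` compute the same two quantities on a column.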
I'm wondering how to use "has_time=True". How do I specify which column is my date column to CatBoost? I saw that there is a way to build a "Data format description" file, but how do you pass it to the algorithm, and is it used to improve the results? Could you give an example of how to implement this when one of your columns is a pandas date type and has_time=True? (I also already have columns that explode the dates into their components.)
Hi,
In the introduction/promo video (https://www.youtube.com/watch?v=00BMdlwKKXI) you mentioned that CatBoost can analyse historical time series data for weather forecasts.
But I was not able to find anything like this in tutorials: https://github.com/catboost/catboost/tree/master/catboost/tutorials