NSections Issue with Train Dataset #87
Comments
Could you please share the code you used to get this error? As far as I can see, you have 17k+ features; are they real, or did you one-hot-encode some variables? Alex
The code is in the line just above. The 17k features are 20 original features plus the one-hot-encoded features. The dataset is the toy King County housing dataset.
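As context for the feature-count question above: one-hot encoding even a few high-cardinality columns can easily inflate 20 original features into tens of thousands of columns. A minimal pure-Python sketch of the arithmetic (the column names and cardinalities below are hypothetical illustrations, not taken from this thread):

```python
# Sketch of how one-hot encoding inflates the column count.
# Column names and cardinalities are hypothetical examples.

def ohe_column_count(cardinalities):
    """Each categorical column expands into one column per distinct value."""
    return sum(cardinalities.values())

original = {
    "zipcode": 70,    # distinct zip codes
    "yr_built": 116,  # distinct construction years
    "lat_bin": 500,   # binned latitude values
}

print(len(original), "columns ->", ohe_column_count(original), "after OHE")
# prints: 3 columns -> 686 after OHE
```

With a real 20-column frame, a few such columns are enough to reach a 17k+ feature count, which is why the question about OHE matters here.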
Now everything is clear. I'll note a few points I can see in your code below:
Hope this helps. Alex
@alexmryzhkov - I uncommented cpu_limit to properly utilize 12 threads. I backtracked and used the "wholeDf" with no OHE, but this resulted in the same issue. I have set
but I still end up with the same error. For use_algos, that line is commented out. I will say that it's not quite clear from the documentation how to configure the various algorithms; for instance, I used "LinearLBFGS" based on the documentation rather than on the Kaggle example. In terms of processing/categories: I fed categorical feats from the no-dummy df (wholeDf), and I fed [dummy cats + orig cats] to roles for "newTrainDum", but no matter what, I receive the same error. Perhaps I am tuning too finely for a beginner? Should I just try to run it in a naive style?
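On the use_algos confusion mentioned above: in the LightAutoML tabular preset, the algorithm list is typically supplied through `general_params["use_algos"]` using short string keys rather than internal class names like "LinearLBFGS". A hedged sketch (the key names and import paths reflect my reading of the LightAutoML docs, and the target column name "price" is an assumption, not confirmed in this thread):

```python
# Hedged sketch: how the algorithm list is commonly passed to TabularAutoML.
# Key names like "linear_l2" and "lgb" come from the LightAutoML docs;
# treat this as an illustration, not the exact fix for this issue.

general_params = {
    # Each inner list is one level of the stacking pipeline.
    "use_algos": [["linear_l2", "lgb", "cb"], ["lgb"]],
}

# The actual call would look roughly like this (requires lightautoml installed):
# from lightautoml.automl.presets.tabular_presets import TabularAutoML
# from lightautoml.tasks import Task
#
# automl = TabularAutoML(task=Task("reg"), cpu_limit=12,
#                        general_params=general_params)
# oof_pred = automl.fit_predict(train_df, roles={"target": "price"})
```

The point is that "linear_l2" selects the linear model (backed by an LBFGS/L2 solver internally), so no class name needs to be spelled out by the user.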
Please check my notebook on the King County dataset; if it works for you, cool. If you have any questions about it, please feel free to ask. Alex
@alexmryzhkov I rewrote my notebook to better follow the flow of initializing the CV. It worked, which is great! I think the root of the problem may have been trying to set torch.device to 'cuda'. Either way, thank you for your notebook and for the confirmation on the dataset!
@AlexanderLavelle if you set torch.device to cuda, do you mean to train models on GPU? If so, there is no need to do that: if your environment has a properly installed GPU and torch, our LightAutoML will automatically train CatBoost models on GPU (for other models there will be almost no improvement, especially in Kaggle Kernels). Alex
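Following the advice above, a safe pattern is to detect GPU availability rather than hard-coding 'cuda'. A minimal sketch (pure Python apart from an optional torch import, so it degrades gracefully to CPU when torch or a GPU is absent):

```python
# Sketch: check whether CUDA is usable before forcing any GPU setting.
# Falls back to CPU if torch is not installed or no GPU is visible.

def cuda_available() -> bool:
    try:
        import torch
        return bool(torch.cuda.is_available())
    except ImportError:
        return False

device = "cuda" if cuda_available() else "cpu"
print("training device:", device)
```

This mirrors the maintainer's point: letting the library decide (here, by probing availability) avoids errors from forcing a device the environment cannot serve.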
@alexmryzhkov I would like to train on GPU, top to bottom. I have sklearn-intelex and the GPU version of lightgbm on my local machine, so in theory, for any dataset that fits in my 4GB NVIDIA card (planning to upgrade soon), I would like the pipeline to do every calculation on GPU for speed. As for GPU-enhanced sklearn (intelex), I have received notices that LightAutoML will use the intelex-accelerated 'auc'.
Currently we do not have a full GPU pipeline, but we are working on it. Alex
When I pass my training df into .fit_predict(), I see the initialization output:
but then the process fails with the following error:
Any help adjusting my df would be greatly appreciated. I did try converting it to an array, but then the target param can't be found (which makes sense).
Thanks in advance!
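The array-conversion dead end mentioned above has a simple cause: `roles={'target': ...}` looks the target up by column name, and a raw 2-D array has no column names. A dependency-free sketch of the difference (the "price"/"sqft" schema is a hypothetical stand-in for the real frame):

```python
# Sketch: why converting a named-column frame to a raw array breaks
# target lookup by name. Column names here are hypothetical.

columns = ["sqft", "bedrooms", "price"]
rows = [[1180, 3, 221900.0], [2570, 3, 538000.0]]

# DataFrame-like records: names survive, so "price" is findable by name.
as_records = [dict(zip(columns, r)) for r in rows]
assert "price" in as_records[0]

# Array-like rows: only positions survive; the name "price" is gone.
as_array = [list(r) for r in rows]
assert "price" not in as_array[0]
```

So keeping the data as a DataFrame (rather than an array) is what lets fit_predict resolve the target, consistent with the resolution reported earlier in the thread.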