I use the sample code to prepare the dataset:
    device = 'cpu'

    dataset = sklearn.datasets.fetch_california_housing()
    task_type = 'regression'

    X_all = dataset['data'].astype('float32')
    y_all = dataset['target'].astype('float32')
    n_classes = None

    X = {}
    y = {}
    X['train'], X['test'], y['train'], y['test'] = sklearn.model_selection.train_test_split(
        X_all, y_all, train_size=0.8
    )
    X['train'], X['val'], y['train'], y['val'] = sklearn.model_selection.train_test_split(
        X['train'], y['train'], train_size=0.8
    )

    # not the best way to preprocess features, but enough for the demonstration
    preprocess = sklearn.preprocessing.StandardScaler().fit(X['train'])
    X = {
        k: torch.tensor(preprocess.fit_transform(v), device=device)
        for k, v in X.items()
    }
    y = {k: torch.tensor(v, device=device) for k, v in y.items()}

    # !!! CRUCIAL for neural networks when solving regression problems !!!
    y_mean = y['train'].mean().item()
    y_std = y['train'].std().item()
    y = {k: (v - y_mean) / y_std for k, v in y.items()}

    y = {k: v.float() for k, v in y.items()}
Then I train an LGBMRegressor with the default hyperparameters:
    model = lgb.LGBMRegressor()
    model.fit(X['train'], y['train'])
But when I evaluate on the test split, the RMSE is 0.68:
    >>> test_pred = model.predict(X['test'])
    >>> test_pred = torch.from_numpy(test_pred)
    >>> rmse = torch.nn.functional.mse_loss(
    ...     test_pred.view(-1), y['test'].view(-1)) ** 0.5 * y_std
    >>> print(f'Test RMSE: {rmse:.2f}.')
    Test RMSE: 0.68.
Even using the model from rtdl gives me 0.56 RMSE:
    (epoch) 57 (batch) 0 (loss) 0.1885
    (epoch) 57 (batch) 10 (loss) 0.1315
    (epoch) 57 (batch) 20 (loss) 0.1735
    (epoch) 57 (batch) 30 (loss) 0.1197
    (epoch) 57 (batch) 40 (loss) 0.1952
    (epoch) 57 (batch) 50 (loss) 0.1167
    Epoch 057 | Validation score: 0.7334 | Test score: 0.5612 <<< BEST VALIDATION EPOCH
Am I missing anything? How can I reproduce the performance reported in your paper? Thanks!
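Problem solved. There is a bug in the example code: fit_transform refits the scaler on every split, so the validation and test features are standardized with their own statistics instead of the training statistics. Change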
k: torch.tensor(preprocess.fit_transform(v), device=device)
into
k: torch.tensor(preprocess.transform(v), device=device)
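To illustrate why this one-word change matters, here is a minimal sketch on synthetic data (not the California housing set). The shifted "test" split, the random seed, and the variable names are assumptions made up for the demonstration. With fit_transform the shift between the splits is erased, while transform preserves it relative to the training statistics.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the real features (an assumption for this demo):
# the "test" split is deliberately shifted away from the "train" split.
X = {
    'train': rng.normal(loc=0.0, scale=1.0, size=(1000, 3)).astype('float32'),
    'test': rng.normal(loc=2.0, scale=1.0, size=(200, 3)).astype('float32'),
}

# Buggy pattern from the example code: fit_transform refits the scaler
# on every split, so each split is standardized with its own mean/std.
buggy_scaler = StandardScaler().fit(X['train'])
X_buggy = {k: buggy_scaler.fit_transform(v) for k, v in X.items()}

# Fixed pattern: fit once on the training split, then only transform.
scaler = StandardScaler().fit(X['train'])
X_fixed = {k: scaler.transform(v) for k, v in X.items()}

buggy_test_mean = float(X_buggy['test'].mean())
fixed_test_mean = float(X_fixed['test'].mean())

# The buggy test mean is roughly zero (the shift was standardized away);
# the fixed test mean stays near the true offset from the training data.
print(f'buggy test mean: {buggy_test_mean:.3f}')
print(f'fixed test mean: {fixed_test_mean:.3f}')
```

On the real dataset the splits come from the same distribution, so the effect is subtler than in this sketch, but the model still sees validation and test features on a slightly different scale than the one it was trained on.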
@fingertap thank you for reporting this!
448d590