Commit

fetch_openml should return numpy arrays (#886)
The default return type of fetch_openml was changed from numpy arrays to
pandas dataframes, which broke a notebook. Pass as_frame=False to get
numpy arrays again.

Additionally, set random seed for shuffling in mnist benchmark script.
BenjaminBossan committed Aug 22, 2022
1 parent 0b7b604 commit 787b859
Showing 2 changed files with 5 additions and 5 deletions.
8 changes: 4 additions & 4 deletions examples/benchmarks/mnist.py
@@ -47,11 +47,11 @@
 
 
 def get_data(num_samples):
-    mnist = fetch_openml('mnist_784')
+    mnist = fetch_openml('mnist_784', as_frame=False)
     torch.manual_seed(0)
-    X = mnist.data.values.astype('float32').reshape(-1, 1, 28, 28)
-    y = mnist.target.values.astype('int64')
-    X, y = shuffle(X, y)
+    X = mnist.data.astype('float32').reshape(-1, 1, 28, 28)
+    y = mnist.target.astype('int64')
+    X, y = shuffle(X, y, random_state=0)
     X, y = X[:num_samples], y[:num_samples]
     X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
     X_train /= 255
2 changes: 1 addition & 1 deletion notebooks/MNIST.ipynb
@@ -61,7 +61,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
- "mnist = fetch_openml('mnist_784', cache=False)"
+ "mnist = fetch_openml('mnist_784', as_frame=False, cache=False)"
 ]
 },
 {
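
For readers who cannot pin the return type at the call site, the conversion step in get_data could also be written so that it accepts either container. The following is only a sketch and not part of this commit: to_mnist_arrays is a hypothetical helper name, and the tiny stand-in dataset replaces the real fetch_openml download. It relies on np.asarray accepting both numpy arrays and pandas objects.

```python
import numpy as np


def to_mnist_arrays(data, target):
    """Convert fetch_openml-style data/target into (X, y) numpy arrays.

    Hypothetical helper, not part of the commit: np.asarray accepts both
    numpy arrays and pandas objects, so the result is the same whether the
    dataset was loaded with as_frame=False or as a DataFrame.
    """
    X = np.asarray(data).astype('float32').reshape(-1, 1, 28, 28)
    # OpenML stores the MNIST labels as strings, so cast them to integers.
    y = np.asarray(target).astype('int64')
    return X, y


# Stand-in for the real dataset: two blank 28x28 images with string labels.
fake_data = np.zeros((2, 784))
fake_target = np.array(['5', '0'], dtype=object)
X, y = to_mnist_arrays(fake_data, fake_target)
print(X.shape, X.dtype, y.tolist())
```

Passing as_frame=False, as the commit does, remains the more direct fix; the helper only shows why the .values calls could be dropped once the inputs are plain arrays.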