Replace load_boston due to Ethical Issues #629

Merged: 13 commits, May 11, 2021
4 changes: 2 additions & 2 deletions README.md
@@ -150,11 +150,11 @@ But our implementations work on multiple GPUs, TPUs and scale dramatically...
```python
from pl_bolts.models.regression import LinearRegression
from pl_bolts.datamodules import SklearnDataModule
- from sklearn.datasets import load_boston
+ from sklearn.datasets import load_diabetes
import pytorch_lightning as pl

# sklearn dataset
- X, y = load_boston(return_X_y=True)
+ X, y = load_diabetes(return_X_y=True)
loaders = SklearnDataModule(X, y)

model = LinearRegression(input_dim=13)
4 changes: 2 additions & 2 deletions docs/source/classic_ml.rst
@@ -22,9 +22,9 @@ Add either L1 or L2 regularization, or both, by specifying the regularization st
from pl_bolts.models.regression import LinearRegression
import pytorch_lightning as pl
from pl_bolts.datamodules import SklearnDataModule
- from sklearn.datasets import load_boston
+ from sklearn.datasets import load_diabetes

- X, y = load_boston(return_X_y=True)
+ X, y = load_diabetes(return_X_y=True)
loaders = SklearnDataModule(X, y)

model = LinearRegression(input_dim=13)
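
One thing worth double-checking in the unchanged context lines: `load_diabetes` returns 442 samples with 10 features, whereas the Boston data had 506 samples and 13 features, so `input_dim=13` (and the `506` doctest output in `SklearnDataset`) appears to be left over from the old dataset. A minimal sketch, assuming only scikit-learn is installed, that reads the dimensions from the data instead of hard-coding them:

```python
from sklearn.datasets import load_diabetes

# The diabetes dataset ships with scikit-learn, so no download is needed.
X, y = load_diabetes(return_X_y=True)

n_samples, n_features = X.shape
print(n_samples, n_features)  # 442 10

# Deriving the input width from the data avoids a stale constant
# when the example dataset is swapped out.
input_dim = X.shape[1]
```

With this, the model line could presumably be written as `LinearRegression(input_dim=input_dim)`; whether the maintainers prefer a literal `10` in the docs is a style choice.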
8 changes: 4 additions & 4 deletions docs/source/datamodules_sklearn.rst
@@ -7,10 +7,10 @@ Utilities to map sklearn or numpy datasets to PyTorch Dataloaders with automatic

.. code-block:: python

- from sklearn.datasets import load_boston
+ from sklearn.datasets import load_diabetes
from pl_bolts.datamodules import SklearnDataModule

- X, y = load_boston(return_X_y=True)
+ X, y = load_diabetes(return_X_y=True)
loaders = SklearnDataModule(X, y)

train_loader = loaders.train_dataloader(batch_size=32)
@@ -21,10 +21,10 @@ Or build your own torch datasets

.. code-block:: python

- from sklearn.datasets import load_boston
+ from sklearn.datasets import load_diabetes
from pl_bolts.datamodules import SklearnDataset

- X, y = load_boston(return_X_y=True)
+ X, y = load_diabetes(return_X_y=True)
dataset = SklearnDataset(X, y)
loader = DataLoader(dataset)

8 changes: 4 additions & 4 deletions docs/source/introduction_guide.rst
@@ -338,10 +338,10 @@ We even have prebuilt modules to bridge the gap between Numpy, Sklearn and PyTor

.. code-block:: python

- from sklearn.datasets import load_boston
+ from sklearn.datasets import load_diabetes
from pl_bolts.datamodules import SklearnDataModule

- X, y = load_boston(return_X_y=True)
+ X, y = load_diabetes(return_X_y=True)
datamodule = SklearnDataModule(X, y)

model = LitModel(datamodule)
@@ -382,10 +382,10 @@ Here's an example for Linear regression

import pytorch_lightning as pl
from pl_bolts.datamodules import SklearnDataModule
- from sklearn.datasets import load_boston
+ from sklearn.datasets import load_diabetes

# link the numpy dataset to PyTorch
- X, y = load_boston(return_X_y=True)
+ X, y = load_diabetes(return_X_y=True)
loaders = SklearnDataModule(X, y)

# training runs training batches while validating against a validation set
8 changes: 4 additions & 4 deletions pl_bolts/datamodules/sklearn_datamodule.py
@@ -20,10 +20,10 @@ class SklearnDataset(Dataset):
Mapping between numpy (or sklearn) datasets to PyTorch datasets.

Example:
- >>> from sklearn.datasets import load_boston
+ >>> from sklearn.datasets import load_diabetes
>>> from pl_bolts.datamodules import SklearnDataset
...
- >>> X, y = load_boston(return_X_y=True)
+ >>> X, y = load_diabetes(return_X_y=True)
>>> dataset = SklearnDataset(X, y)
>>> len(dataset)
506
@@ -114,10 +114,10 @@ class SklearnDataModule(LightningDataModule):

Example:

- >>> from sklearn.datasets import load_boston
+ >>> from sklearn.datasets import load_diabetes
>>> from pl_bolts.datamodules import SklearnDataModule
...
- >>> X, y = load_boston(return_X_y=True)
+ >>> X, y = load_diabetes(return_X_y=True)
>>> loaders = SklearnDataModule(X, y, batch_size=32)
...
>>> # train set
4 changes: 2 additions & 2 deletions pl_bolts/models/regression/linear_regression.py
@@ -116,7 +116,7 @@ def cli_main():

# create dataset
if _SKLEARN_AVAILABLE:
- from sklearn.datasets import load_boston
+ from sklearn.datasets import load_diabetes
else: # pragma: no cover
raise ModuleNotFoundError(
'You want to use `sklearn` which is not installed yet, install it with `pip install sklearn`.'
@@ -133,7 +133,7 @@ def cli_main():
# model = LinearRegression(**vars(args))

# data
- X, y = load_boston(return_X_y=True) # these are numpy arrays
+ X, y = load_diabetes(return_X_y=True) # these are numpy arrays
loaders = SklearnDataModule(X, y, batch_size=args.batch_size)

# train