In [6]:
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the dataset
data = load_breast_cancer()
X = data.data  # features
y = data.target  # target labels (0 for malignant, 1 for benign)

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a logistic regression model (the task is binary, so no multiclass scheme is needed)
# Note: 'sag' converges fastest when the features are on a similar scale
model = LogisticRegression(solver='sag', max_iter=1000)

# Fit the model on the training data
model.fit(X_train, y_train)

# Predict on the testing data
y_pred = model.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')

Accuracy: 0.97




Scikit-learn's `LogisticRegression` class provides several solvers for optimization. Here's a brief summary of the most commonly used ones:

1. **`'liblinear'`**: Named after LIBLINEAR, "A Library for Large Linear Classification". It uses a coordinate descent (CD) algorithm and can handle both L1 and L2 regularization. It's a good choice for small datasets. It cannot optimize a true multinomial loss, so for multiclass problems it is limited to the "one versus rest" scheme, in which one binary classifier is trained per class.

2. **`'newton-cg'`**: Newton's method with conjugate gradient. It's a second-order optimization algorithm that can handle multiclass problems with the multinomial loss. It supports L2 regularization but not L1. It can work well on larger datasets, although each iteration is typically more expensive than a step of a first-order method.

3. **`'lbfgs'`**: Limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm. It's a quasi-Newton method that approximates the full BFGS algorithm using a limited amount of memory. It is the default solver since scikit-learn 0.22 and a robust general-purpose choice. It supports multiclass problems and L2 regularization, but not L1.

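4. **`'sag'`**: Stochastic Average Gradient descent. It's a variant of stochastic gradient descent that reuses previously computed gradients, and it is often faster than the other solvers on large datasets. Its fast convergence is only guaranteed when the features are on approximately the same scale, so standardize the data first. It supports only L2 regularization.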

5. **`'saga'`**: An extension of `'sag'` that also supports the non-smooth `penalty='l1'` option (i.e., L1 regularization) as well as `penalty='elasticnet'`. This makes it the solver of choice for sparse multinomial logistic regression. Like `'sag'`, it converges quickly only when the features are on approximately the same scale, so it is not robust to unscaled datasets.
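
To make the L1 point concrete, here is a minimal sketch that fits `'saga'` with an L1 and an L2 penalty on the same breast cancer data and counts how many coefficients end up exactly zero. The helper name `count_zero_coefs`, the value `C=0.1`, `max_iter=5000`, and the use of `StandardScaler` are illustrative assumptions, not settings taken from the cell above.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)


def count_zero_coefs(penalty):
    """Fit 'saga' with the given penalty and count exactly-zero coefficients.

    Hypothetical helper for illustration; C=0.1 is an arbitrary choice that
    makes the regularization fairly strong.
    """
    clf = make_pipeline(
        StandardScaler(),  # 'saga' converges much faster on scaled features
        LogisticRegression(penalty=penalty, C=0.1, solver='saga',
                           max_iter=5000, random_state=0),
    )
    clf.fit(X, y)
    coefs = clf.named_steps['logisticregression'].coef_
    return int(np.sum(coefs == 0))


n_zero_l1 = count_zero_coefs('l1')
n_zero_l2 = count_zero_coefs('l2')
print(f'Zero coefficients with L1: {n_zero_l1} of {X.shape[1]}')
print(f'Zero coefficients with L2: {n_zero_l2} of {X.shape[1]}')
```

The L1 penalty should drive a number of coefficients to exactly zero, while the L2 penalty only shrinks them. The exact counts depend on `C` and on the scikit-learn version.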

Each solver has its strengths and weaknesses, and the best one to use depends on the nature of your data and the specific requirements of your problem.
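
One way to check which solver suits a given dataset is simply to try them side by side. The sketch below cross-validates each solver on the breast cancer data with standardized features. The 5-fold split, `max_iter=5000`, and the scaling step are assumptions made for this illustration, and accuracy is only one criterion (training time and memory matter too).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

solvers = ['liblinear', 'newton-cg', 'lbfgs', 'sag', 'saga']
scores = {}

for solver in solvers:
    # Scaling inside the pipeline avoids leaking test-fold statistics
    # and helps 'sag' and 'saga' converge.
    clf = make_pipeline(
        StandardScaler(),
        LogisticRegression(solver=solver, max_iter=5000, random_state=0),
    )
    scores[solver] = cross_val_score(clf, X, y, cv=5).mean()

for solver, score in scores.items():
    print(f'{solver:>10}: mean CV accuracy = {score:.3f}')
```

On a small, well-behaved dataset like this one the solvers should reach very similar accuracy. Differences in speed and supported penalties tend to matter more as the data grows.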