# Scikit-learn

Scikit-learn is a popular Python machine learning library that provides simple and efficient tools for data mining and data analysis. It is built on NumPy, SciPy, and Matplotlib, so it works well alongside other Python libraries for data manipulation and analysis.

Key features of the scikit-learn library include:

* **Supervised Learning:**
Scikit-learn provides algorithms for supervised learning tasks, including classification and regression. Here's an example of using scikit-learn to train a classification model:

In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

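# Load the iris dataset and hold out 20% of the samples for testing
# (random_state fixes the shuffle so the split is reproducible)
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)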

# Initialize and train a K-Nearest Neighbors classifier
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Evaluate the model: score() returns mean accuracy on the held-out test set
accuracy = knn.score(X_test, y_test)
print("Accuracy:", accuracy)
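The text above also mentions regression, which the classification example does not cover. The following is a minimal sketch of a regression workflow, assuming the built-in diabetes dataset and an ordinary least squares model. Both choices, and the variable names, are illustrative rather than taken from the original example.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Load a small built-in regression dataset (442 samples, 10 features)
X_reg, y_reg = load_diabetes(return_X_y=True)
X_reg_train, X_reg_test, y_reg_train, y_reg_test = train_test_split(
    X_reg, y_reg, test_size=0.2, random_state=0
)

# Fit an ordinary least squares model (an illustrative choice of regressor)
lin_reg = LinearRegression()
lin_reg.fit(X_reg_train, y_reg_train)

# For regressors, score() returns the coefficient of determination (R^2)
r2 = lin_reg.score(X_reg_test, y_reg_test)
print("R^2 on the test set:", r2)
```

The steps mirror the classifier: split the data, call `fit`, then call `score`. Only the estimator and the meaning of the score change.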

* **Unsupervised Learning:**
Scikit-learn also provides algorithms for unsupervised learning tasks, such as clustering and dimensionality reduction. Here's an example of using scikit-learn to perform clustering using K-Means:

In [None]:
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# Generate synthetic data
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

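# Apply K-Means clustering; n_init and random_state are set explicitly so the
# result is reproducible and does not depend on version-specific defaults
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
kmeans.fit(X)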

# Visualize the clusters
plt.scatter(X[:, 0], X[:, 1], c=kmeans.labels_, cmap='viridis')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], marker='x', color='red', s=100)
plt.show()
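Dimensionality reduction, the other unsupervised task mentioned above, follows the same pattern. This is a minimal sketch that assumes PCA as the technique and the iris dataset as input. Neither choice comes from the original example.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Load the four-dimensional iris measurements
iris = load_iris()

# Project the data onto its first two principal components
pca = PCA(n_components=2)
X_2d = pca.fit_transform(iris.data)

print("Reduced shape:", X_2d.shape)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```

The `explained_variance_ratio_` attribute reports how much of the total variance each retained component captures, which helps when deciding how many components to keep.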

* **Model Evaluation and Validation:**
Scikit-learn provides tools for model evaluation and validation, such as cross-validation and performance metrics. Here's an example of using scikit-learn to perform cross-validation on a classification model:

In [None]:
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Load the iris dataset
iris = load_iris()

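# Initialize a logistic regression classifier; max_iter is raised above the
# default of 100 to give the solver room to converge on unscaled features
log_reg = LogisticRegression(max_iter=1000)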

# Perform 5-fold cross-validation
scores = cross_val_score(log_reg, iris.data, iris.target, cv=5)
print("Cross-Validation Scores:", scores)
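Performance metrics, mentioned above alongside cross-validation, live in `sklearn.metrics`. The sketch below assumes a K-Nearest Neighbors classifier on a stratified iris split. The variable names (`X_tr`, `X_te`, and so on) are illustrative and chosen so they do not overwrite variables from the earlier cells.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# Stratify so each class is equally represented in the test split
X_tr, X_te, y_tr, y_te = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0, stratify=iris.target
)

clf = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)
y_pred = clf.predict(X_te)

# Overall accuracy, per-class error counts, and per-class precision/recall/F1
acc = accuracy_score(y_te, y_pred)
cm = confusion_matrix(y_te, y_pred)
print("Accuracy:", acc)
print("Confusion matrix:\n", cm)
print(classification_report(y_te, y_pred, target_names=iris.target_names))
```

Rows of the confusion matrix correspond to true classes and columns to predicted classes, so off-diagonal entries show which classes are being confused with each other.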

* **Hyperparameter Tuning:**
Scikit-learn provides tools for hyperparameter tuning to optimize model performance. Here's an example of using scikit-learn's GridSearchCV to perform hyperparameter tuning on a support vector machine (SVM) classifier:

In [None]:
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

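# Define the parameter grid; gamma only affects the RBF kernel, so it is
# searched for 'rbf' only instead of being repeated for 'linear'
param_grid = [
    {'kernel': ['rbf'], 'C': [0.1, 1, 10, 100], 'gamma': [0.1, 0.01, 0.001]},
    {'kernel': ['linear'], 'C': [0.1, 1, 10, 100]},
]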

# Initialize SVM classifier
svm = SVC()

# Perform grid search with 5-fold cross-validation
# (X_train and y_train come from the train/test split in the first example)
grid_search = GridSearchCV(svm, param_grid, cv=5)
grid_search.fit(X_train, y_train)

# Get the best parameters and their mean cross-validated accuracy
best_params = grid_search.best_params_
best_score = grid_search.best_score_

print("Best Parameters:", best_params)
print("Best Score:", best_score)
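The best score reported by a grid search is a cross-validation average computed on the training data, so it is common to confirm it on data the search never saw. The following self-contained sketch assumes a smaller, illustrative grid and fresh variable names (`X_fit`, `X_holdout`, `small_grid`, `search`). It relies on `GridSearchCV` refitting the best configuration on the full training split, which is its default behavior.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

iris = load_iris()
X_fit, X_holdout, y_fit, y_holdout = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0
)

# A deliberately small grid so the sketch runs quickly
small_grid = {"C": [0.1, 1, 10], "kernel": ["rbf", "linear"]}
search = GridSearchCV(SVC(), small_grid, cv=5)
search.fit(X_fit, y_fit)

# best_estimator_ is the model refit on all of X_fit with the best parameters
best_model = search.best_estimator_
holdout_acc = best_model.score(X_holdout, y_holdout)

print("Best parameters:", search.best_params_)
print("Mean CV accuracy:", search.best_score_)
print("Held-out accuracy:", holdout_acc)
```

A large gap between the cross-validated score and the held-out score can indicate that the search has overfit to the training folds.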

These examples cover only part of what scikit-learn offers: supervised and unsupervised learning, model evaluation and validation, and hyperparameter tuning. Its estimators share the same small set of methods, such as `fit`, `predict`, and `score`, and that simplicity and consistency make it a powerful tool for machine learning in Python.
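One way to see this consistency in practice is to swap estimators without changing the surrounding code. The sketch below is illustrative: the three classifiers and the `models` and `results` names are assumptions chosen for the demonstration, not a recommended comparison procedure.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

iris = load_iris()
X_a, X_b, y_a, y_b = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0
)

# Three different algorithms, all driven through the same fit/score calls
models = {
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=3),
    "logistic regression": LogisticRegression(max_iter=1000),
    "support vector machine": SVC(),
}

results = {}
for name, model in models.items():
    model.fit(X_a, y_a)
    results[name] = model.score(X_b, y_b)
    print(f"{name}: {results[name]:.3f}")
```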