# Machine Learning Tutorial

This notebook demonstrates how to build a simple neural network using Python and popular machine learning libraries.

## Prerequisites

Before starting, make sure you have the following libraries installed:
- numpy
- pandas
- scikit-learn
- matplotlib
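
As a quick sanity check before running the notebook, the sketch below reports which of these libraries can be found in the current environment. It uses only the standard library; the `required` list and `status` dictionary are illustrative names, and note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util

# Import names for the libraries listed above (scikit-learn imports as "sklearn")
required = ["numpy", "pandas", "sklearn", "matplotlib"]

# find_spec returns None when a top-level package cannot be located
status = {name: importlib.util.find_spec(name) is not None for name in required}

for name, installed in status.items():
    print(f"{name}: {'installed' if installed else 'MISSING'}")
```

Anything reported as missing can usually be installed with `pip install <package>` (using `scikit-learn` as the package name for `sklearn`).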

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
import matplotlib.pyplot as plt

# Set random seed for reproducibility
np.random.seed(42)
```

## Data Preparation

Let's create some sample data for our machine learning model. We'll generate a simple dataset with two features.

```python
def generate_data(n_samples=1000):
    """Generate sample data for training"""
    # Create random features
    X = np.random.randn(n_samples, 2)
    # Create labels based on a simple rule
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    return X, y
```

The function above creates a binary classification problem: a sample is labeled 1 when the sum of its two features is positive and 0 otherwise.
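
To make the labeling rule concrete, here is a small worked example that applies the same expression to four hand-picked points. The values in `X_demo` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Four hand-picked points (hypothetical values, not drawn from the dataset)
X_demo = np.array([
    [ 1.0,  2.0],   # sum =  3.0 -> label 1
    [-1.0,  0.5],   # sum = -0.5 -> label 0
    [ 0.3, -0.3],   # sum =  0.0 -> label 0 (the rule uses a strict ">")
    [-2.0,  2.5],   # sum =  0.5 -> label 1
])

# Same rule as in generate_data
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)
print(y_demo)  # [1 0 0 1]
```

Geometrically, the two classes are separated by the straight line where Feature 1 plus Feature 2 equals zero.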

```python
# Generate training data
X, y = generate_data(1000)
print(f"Data shape: {X.shape}")
print(f"Labels shape: {y.shape}")
```

## Model Training

Now we'll train a simple neural network using scikit-learn's `MLPClassifier`, a multi-layer perceptron that can learn non-linear patterns in the data. Our dataset is linearly separable, so it makes an easy first test case.

### Key Parameters:
- `hidden_layer_sizes`: A tuple giving the number of neurons in each hidden layer; its length sets the number of hidden layers
- `max_iter`: Maximum number of training iterations (epochs, for the default `adam` solver)
- `random_state`: Random seed for reproducible results
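
One way to see what `hidden_layer_sizes` does is to inspect the shapes of the fitted weight matrices. The sketch below is self-contained and uses its own small dataset (`X_demo`, `y_demo`, and `model_demo` are illustrative names, and it seeds NumPy through `default_rng` rather than the global seed used elsewhere in this notebook).

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Small illustrative dataset built with the same labeling rule
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((200, 2))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)

model_demo = MLPClassifier(hidden_layer_sizes=(10, 5), max_iter=1000, random_state=42)
model_demo.fit(X_demo, y_demo)

# One weight matrix per pair of adjacent layers:
# 2 inputs -> 10 neurons -> 5 neurons -> 1 output unit (binary classification)
print([w.shape for w in model_demo.coefs_])  # [(2, 10), (10, 5), (5, 1)]

# Total layer count includes the input and output layers
print(model_demo.n_layers_)  # 4
```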

```python
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the neural network
model = MLPClassifier(
    hidden_layer_sizes=(10, 5),  # Two hidden layers with 10 and 5 neurons
    max_iter=1000,               # Maximum iterations
    random_state=42              # For reproducible results
)

# Train the model
model.fit(X_train, y_train)
print("Model training completed!")
```
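
After fitting, it can be useful to check whether training stopped on its own or ran into the `max_iter` limit. The following sketch, under the assumption of a freshly generated dataset (`X_demo`, `y_demo`, and `model_demo` are illustrative names), reads the fitted attributes `n_iter_`, `loss_`, and `loss_curve_`.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Illustrative dataset with the same labeling rule as the notebook
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((500, 2))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)

model_demo = MLPClassifier(hidden_layer_sizes=(10, 5), max_iter=1000, random_state=42)
model_demo.fit(X_demo, y_demo)

# n_iter_ is the number of iterations actually run
print(f"Iterations run: {model_demo.n_iter_} (limit: {model_demo.max_iter})")
print(f"Final training loss: {model_demo.loss_:.4f}")

# Stopping before the limit means the loss stopped improving by more than the tolerance
stopped_early = model_demo.n_iter_ < model_demo.max_iter
print(f"Stopped before the iteration limit: {stopped_early}")

# loss_curve_ holds one loss value per iteration and can be plotted to inspect training
print(f"Loss values recorded: {len(model_demo.loss_curve_)}")
```

If training reaches the limit, scikit-learn emits a `ConvergenceWarning`; raising `max_iter` is one possible response.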

## Model Evaluation

Let's evaluate our trained model and see how well it performs on the test data.

```python
# Make predictions on the test set
y_pred = model.predict(X_test)

# Calculate accuracy
accuracy = model.score(X_test, y_test)
print(f"Model accuracy: {accuracy:.3f}")

# Show some example predictions
print("\nSample predictions:")
for i in range(5):
    print(f"Features: {X_test[i]}, Predicted: {y_pred[i]}, Actual: {y_test[i]}")
```
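
Accuracy is simply the fraction of test predictions that match the true labels, and a confusion matrix breaks that number down by class. The sketch below illustrates both on a freshly generated dataset; names such as `clf`, `X_te`, and `manual_accuracy` are illustrative and not part of the notebook above.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Illustrative dataset with the same labeling rule as the notebook
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((1000, 2))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)
clf = MLPClassifier(hidden_layer_sizes=(10, 5), max_iter=1000, random_state=42).fit(X_tr, y_tr)
pred = clf.predict(X_te)

# Accuracy computed by hand: the share of predictions equal to the true labels
manual_accuracy = np.mean(pred == y_te)
print(f"Manual accuracy: {manual_accuracy:.3f}")
print(f"score():         {clf.score(X_te, y_te):.3f}")

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_te, pred)
print(cm)
```

The diagonal of the matrix counts correct predictions, so its sum divided by the total equals the accuracy.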

## Visualization

Finally, let's create a visualization to better understand our model's decision boundary.

```python
# Create a mesh for plotting decision boundary
h = 0.02  # Step size in the mesh
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# Make predictions on the mesh
Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

# Create the plot
plt.figure(figsize=(10, 8))
plt.contourf(xx, yy, Z, alpha=0.8, cmap=plt.cm.RdYlBu)
scatter = plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.RdYlBu, edgecolors='black')
plt.colorbar(scatter)
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Neural Network Decision Boundary')
plt.show()
```
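
Because the labels were generated from a known rule, the plot can be complemented with a number: the fraction of grid points where the model's prediction agrees with that rule. This is a rough sketch under assumptions of its own (a coarse grid over [-3, 3] in both features and the illustrative names `clf`, `grid`, and `agreement`), not a standard metric.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Illustrative dataset with the same labeling rule as the notebook
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((1000, 2))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)

clf = MLPClassifier(hidden_layer_sizes=(10, 5), max_iter=1000, random_state=42).fit(X_demo, y_demo)

# Coarse grid covering the region where most of the data lies
xx, yy = np.meshgrid(np.linspace(-3, 3, 121), np.linspace(-3, 3, 121))
grid = np.c_[xx.ravel(), yy.ravel()]

# Labels the generating rule would assign to each grid point
true_labels = (grid[:, 0] + grid[:, 1] > 0).astype(int)

# Share of grid points where the learned boundary matches the rule
agreement = np.mean(clf.predict(grid) == true_labels)
print(f"Agreement with the true rule on the grid: {agreement:.3f}")
```

Disagreements, if any, tend to sit close to the dividing line or in corners of the grid where few training points fall.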

## Conclusion

In this tutorial, we've successfully:

1. **Generated synthetic data** for a binary classification problem
2. **Trained a neural network** using scikit-learn's MLPClassifier
3. **Evaluated the model** on held-out test data
4. **Visualized the decision boundary** to understand how the model makes predictions

This is a basic example, but the same principles apply to more complex machine learning problems. You can experiment with:
- Different network architectures (more layers, different sizes)
- Different activation functions
- Real-world datasets
- More sophisticated evaluation metrics
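
As a starting point for such experiments, the sketch below compares the activation functions that `MLPClassifier` accepts through its `activation` parameter. The dataset and the names `results`, `X_tr`, and `X_te` are illustrative; scores will vary with the data and seeds used.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Illustrative dataset with the same labeling rule as the notebook
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((1000, 2))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

# Train one model per activation and record its test accuracy
results = {}
for activation in ["identity", "logistic", "tanh", "relu"]:
    clf = MLPClassifier(
        hidden_layer_sizes=(10, 5),
        activation=activation,
        max_iter=1000,
        random_state=42,
    )
    clf.fit(X_tr, y_tr)
    results[activation] = clf.score(X_te, y_te)

for activation, score in results.items():
    print(f"{activation:>8}: {score:.3f}")
```

The same loop structure works for comparing architectures: iterate over candidate `hidden_layer_sizes` tuples instead of activation names.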

Happy learning!