## 🎓 Federated Learning from Scratch with PyTorch


### Step 1: Setup and Data Preparation

First, let's import PyTorch and prepare our data. Federated learning is all about distributed data, so we'll simulate this by splitting a single dataset (MNIST) into several smaller datasets, one for each "client."


In [2]:
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, Dataset
from torchvision import datasets, transforms
import copy # We will use this to deep-copy our model to clients
import random # For client selection

In [3]:
# Set a device for training (GPU if available, otherwise CPU)
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [4]:
# Define the data transformations for our dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

#### Download and load the MNIST training and test datasets


In [5]:
# In a real-world scenario, this data would already be on the clients' devices.
train_dataset = datasets.MNIST('./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST('./data', train=False, download=True, transform=transform)

In [6]:
# Let's define the number of clients we'll simulate
NUM_CLIENTS = 10
CLIENT_BATCH_SIZE = 32

#### Partition the data for each client.


In [9]:
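# We'll split the training data equally among the clients (60,000 / 10 = 6,000 samples each).
# Note: random_split requires the lengths to sum to len(train_dataset), so this
# works as written only when the dataset size is divisible by NUM_CLIENTS.
client_data = torch.utils.data.random_split(
    train_dataset,
    [len(train_dataset) // NUM_CLIENTS] * NUM_CLIENTS
)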


In [14]:
# Create a DataLoader for each client's data
client_trainloaders = [
    DataLoader(data, batch_size=CLIENT_BATCH_SIZE, shuffle=True) for data in client_data
]

In [13]:
# Create a single test DataLoader for the server's evaluation
test_dataloader = DataLoader(test_dataset, batch_size=128, shuffle=False)


In [15]:
print(f"Data has been partitioned among {NUM_CLIENTS} clients.")
print(f"Each client has {len(client_data[0])} samples.")
print(f"Test dataset has {len(test_dataset)} samples.")
print(f"Test DataLoader created with batch size {test_dataloader.batch_size}.")

Data has been partitioned among 10 clients.
Each client has 6000 samples.
Test dataset has 10000 samples.
Test DataLoader created with batch size 128.


### Code Explanation:

torch, nn, optim: Standard PyTorch imports for building and training neural networks.

copy: We'll use copy.deepcopy to create independent copies of the global model for each client.

random: To randomly select a subset of clients for each training round.

datasets.MNIST: We use the MNIST dataset because it's simple and a great starting point.

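torch.utils.data.random_split: This function is how we simulate decentralized data. It randomly splits train_dataset into NUM_CLIENTS (here, 10) non-overlapping subsets, each representing a single client's private data. Because the split is uniformly random, every client ends up with roughly the same mix of digits.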
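
To make the partitioning step concrete, here is a minimal sketch on a toy dataset rather than MNIST. It checks that the subsets returned by random_split really are non-overlapping, and it shows one possible way to choose sizes when the dataset does not divide evenly among clients. The helper `partition_sizes` and the 103-sample `toy_dataset` are hypothetical names introduced only for this illustration; they are not part of the tutorial's code.

```python
import torch
from torch.utils.data import TensorDataset, random_split


def partition_sizes(num_samples, num_clients):
    """Return num_clients sizes that sum to num_samples and differ by at most one."""
    base, remainder = divmod(num_samples, num_clients)
    # The first `remainder` clients each receive one extra sample.
    return [base + 1] * remainder + [base] * (num_clients - remainder)


# Toy stand-in for MNIST: 103 samples, deliberately not divisible by 10.
toy_dataset = TensorDataset(
    torch.arange(103).float().unsqueeze(1),
    torch.zeros(103, dtype=torch.long),
)

sizes = partition_sizes(len(toy_dataset), 10)

# A seeded generator makes the split reproducible across runs.
subsets = random_split(
    toy_dataset, sizes, generator=torch.Generator().manual_seed(0)
)

# Each Subset stores the parent-dataset indices it draws from.
all_indices = [i for subset in subsets for i in subset.indices]

print(sizes)
print(len(all_indices), len(set(all_indices)))  # equal counts mean no overlap
```

If the two printed counts match, no sample was assigned to more than one client. With MNIST's 60,000 training samples and 10 clients the remainder is zero, so the equal split used above needs no such helper.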

### Step 2: Defining the Neural Network Model

We'll use a simple Multi-Layer Perceptron (MLP) for this task. The model is defined once and will be used by both the server (as the global model) and the clients (as their local models).

In [17]:
# Define the MLP model architecture
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        # 28x28 images, so input size is 784
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28*28, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10) # Output layer for 10 classes (digits 0-9)

    def forward(self, x):
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)
        return x


### Code Explanation:

This is a standard PyTorch nn.Module class. It defines the structure of our model.

The forward method specifies how data flows through the network.

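We're using a simple architecture: flatten the 28x28 image into a 784-element vector, pass it through two fully connected hidden layers (128 and 64 units), each followed by a ReLU activation, and finish with a linear layer that produces one raw score (logit) for each of the 10 classes.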
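
As a quick sanity check on the architecture described above, the sketch below rebuilds the same layer sizes, pushes a random MNIST-shaped batch through the network, and counts the trainable parameters. The names `dummy_batch`, `logits`, and `num_params` are introduced here for illustration only, and the forward pass is written more compactly than in the cell above.

```python
import torch
from torch import nn


class MLP(nn.Module):
    # Same layer sizes as the model defined in this step.
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = self.flatten(x)
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        return self.fc3(x)


model = MLP()

# Four random images with MNIST's shape: (batch, channels, height, width).
dummy_batch = torch.randn(4, 1, 28, 28)
logits = model(dummy_batch)

# Weights plus biases: (784*128 + 128) + (128*64 + 64) + (64*10 + 10)
num_params = sum(p.numel() for p in model.parameters())

print(logits.shape)  # torch.Size([4, 10])
print(num_params)    # 109386
```

The parameter count matters in federated learning because every round sends the full set of weights between server and clients, so a small model like this keeps that exchange cheap.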

### Step 3: The Client-side Training Loop

Each client needs a function to perform local training. This function takes the current global model and the client's data loader, trains a local copy of the model for the given number of epochs, and returns the updated model parameters.


In [None]:
def client_training(model, trainloader, epochs=1):
    """
    Performs a single round of local training on a client's data.

    Args:
        model (nn.Module): The current global model received from the server.
        trainloader (DataLoader): The client's local data loader.
        epochs (int): Number of local epochs to train for.

    Returns:
        OrderedDict: The updated state_dict (model parameters) after local training.
    """
    # Create a local copy of the model
    local_model = copy.deepcopy(model).to(DEVICE)
    local_model.train()  # Set the model to training mode

    # Define loss function and optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(local_model.parameters(), lr=0.01)

    for epoch in range(epochs):
        for images, labels in trainloader:
            images, labels = images.to(DEVICE), labels.to(DEVICE)

            # Zero the gradients
            optimizer.zero_grad()

            # Forward pass
            outputs = local_model(images)
            loss = criterion(outputs, labels)
            # Backward pass and optimization
            loss.backward()
            optimizer.step()
            
    return local_model.state_dict()
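
The key property of this function is that it trains a deep copy, so the global model is left untouched and only the returned state_dict carries the client's update. The following is a small CPU-only sketch of that behavior, not the tutorial's exact function: `local_update`, the tiny `nn.Linear` model, and the random `toy_loader` are hypothetical stand-ins chosen so the example runs without MNIST.

```python
import copy

import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)


def local_update(model, loader, epochs=1, lr=0.01):
    """Sketch of the client step: copy the model, train the copy, return its weights."""
    local_model = copy.deepcopy(model)  # the caller's model is never modified
    local_model.train()
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(local_model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            criterion(local_model(x), y).backward()
            optimizer.step()
    return local_model.state_dict()


# Hypothetical stand-ins: a tiny linear classifier and random data.
global_model = nn.Linear(8, 3)
before = copy.deepcopy(global_model.state_dict())
toy_loader = DataLoader(
    TensorDataset(torch.randn(64, 8), torch.randint(0, 3, (64,))),
    batch_size=16,
    shuffle=True,
)

new_state = local_update(global_model, toy_loader)

# The global weights should be identical to the snapshot taken before training.
global_unchanged = all(
    torch.equal(before[k], global_model.state_dict()[k]) for k in before
)
# The returned weights should differ from that snapshot.
local_changed = any(not torch.equal(before[k], new_state[k]) for k in before)

print(global_unchanged, local_changed)
```

Returning a state_dict rather than the model object keeps the interface simple: the server only needs the parameter tensors, keyed by layer name, to combine updates from several clients.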

