# Federated Learning (FL)

This Notebook gives you a brief introduction to FL. 

## Background

Like machine unlearning, FL addresses the security and privacy problems that arise when ML systems use large amounts of data. See the `intro-unlearning.ipynb` Notebook for a detailed discussion of data security and privacy in ML systems. FL takes a different approach to privacy and security than unlearning does, and the issues that arise where the two intersect are the main topic of this project. For now, let's focus on FL.

In traditional settings, data is stored and processed at a central location. This centralization can lead to security and privacy problems. Despite efforts to mitigate such problems in centralized settings, decentralized methods have been introduced as an alternative path towards better security and privacy protection. However, the data collection and processing capabilities of an individual decentralized machine (client) may not be adequate to train an ML model with good enough performance, so a mechanism for orchestrating multiple clients is needed. FL is one such method.

In a typical FL setting (there are, of course, many variations), an ML model is initialized at a central server and then sent to participating clients. Each client trains the model on its own data, producing an update to the model (sub-update), and sends the sub-update to the server. The server then uses the received sub-updates to update its own model, usually by aggregating them. The updated central model is then sent back to the clients for further training.

![fl](https://1.bp.blogspot.com/-K65Ed68KGXk/WOa9jaRWC6I/AAAAAAAABsM/gglycD_anuQSp-i67fxER1FOlVTulvV2gCLcB/s1600/FederatedLearning_FinalFiles_Flow%2BChart1.png)

By using FL, clients can jointly train a model without sending out collected data, protecting against security and privacy problems during data transmission and at the central server. 
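The training round described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes a linear regression model, synthetic client data, and sample-weighted averaging as the aggregation rule (in the spirit of the federated averaging paper listed in the references). The names `local_update` and `federated_round`, the learning rate, and the number of rounds are all hypothetical choices made for this example.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one client's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def federated_round(global_w, client_data):
    """One round: send the model out, train locally, average the results."""
    local_ws = [local_update(global_w, X, y) for X, y in client_data]
    sizes = np.array([len(y) for _, y in client_data], dtype=float)
    # Clients with more samples get proportionally more weight (an assumption).
    return np.average(local_ws, axis=0, weights=sizes)

# Synthetic data: three clients whose raw data never leaves this list entry.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
client_data = []
for n in (20, 50, 30):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    client_data.append((X, y))

global_w = np.zeros(2)  # model initialized at the server
for _ in range(30):
    global_w = federated_round(global_w, client_data)
```

Only model weights cross the client boundary in this sketch; the `(X, y)` pairs are read solely inside `local_update`.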

## Categorizations

There are many ways to categorize FL. The most common one is by data partitioning. According to how the sample space and feature space of the data are distributed across clients, FL can be divided into three categories: horizontal FL, vertical FL, and federated transfer learning.

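- Horizontal FL is useful when the clients' datasets share most of their features but have few users (samples) in common.

- Vertical FL covers the reverse case: the datasets share most of their users but have few features in common.

- Federated transfer learning can be used when both the users and the features overlap little.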

![ul](https://drive.google.com/uc?export=view&id=1SmJmk6H40KVuWmrojN4-NmN2TfcOeEB5)
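The difference between horizontal and vertical partitioning can be made concrete with a toy table in which rows are users and columns are features. The table and the two-client split below are assumptions made purely for illustration.

```python
import numpy as np

# Toy table: 6 users (rows) x 4 features (columns).
data = np.arange(24).reshape(6, 4)

# Horizontal FL: both clients hold the same feature columns for different users.
horizontal_a, horizontal_b = data[:3, :], data[3:, :]

# Vertical FL: both clients hold the same users but different feature columns.
vertical_a, vertical_b = data[:, :2], data[:, 2:]
```

In the horizontal case the clients could train the same model architecture directly, whereas in the vertical case each client sees only part of every user's feature vector.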

Data partitioning is not the only criterion; FL can be categorized in other ways as well.

![ul](https://drive.google.com/uc?export=view&id=1UYMeZLY4mkjfa_krRmOpAk0qqOzjTnDC)

## Challenges

### Privacy protection

Privacy protection must be ensured in FL, because FL was proposed with an emphasis on privacy and clients retain complete autonomy over their data. Common privacy protection methods include model aggregation, homomorphic encryption, and differential privacy.
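As a rough illustration of the differential privacy idea, a client can bound the size of its sub-update and add random noise before sending it. The sketch below assumes L2 norm clipping followed by Gaussian noise; the function name `privatize_update` and the parameters `clip_norm` and `noise_multiplier` are hypothetical, and the privacy accounting needed to turn this into a formal guarantee is not shown.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip an update to a maximum L2 norm, then add Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(update)
    # Scale the update down only if its norm exceeds clip_norm.
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    # Noise scale is tied to the clipping bound (an illustrative choice).
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

noisy = privatize_update(np.array([3.0, 4.0]), rng=np.random.default_rng(0))
```

Clipping limits how much any single client can influence the aggregate, and the noise masks what remains of an individual contribution.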

### Insufficient training data

The data collected at a single client can be quite inadequate compared with the sample and feature space of the data used in the central model. It is therefore a challenge to generate a useful sub-update from insufficient data at the client and to aggregate that sub-update into the central model.

### Statistical heterogeneity

There can be many clients in FL, and the data they hold can be non-[IID](https://en.wikipedia.org/wiki/Independent_and_identically_distributed_random_variables) (not independent and identically distributed). When the data distributions differ greatly across clients, FL becomes challenging.
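To experiment with this kind of heterogeneity, a dataset can be split so that each client sees a skewed mix of labels. The sketch below assumes one commonly used simulation approach, drawing per-class client proportions from a Dirichlet distribution; the function name `dirichlet_partition`, the synthetic labels, and the value of `alpha` are hypothetical choices for this example. Smaller `alpha` values give more skewed splits.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, rng):
    """Split sample indices across clients with label skew controlled by alpha."""
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        # Shuffle the indices of this class, then cut them by random proportions.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cut_points = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cut_points)):
            client_indices[client].extend(part.tolist())
    return client_indices

rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)  # synthetic labels for 10 classes
parts = dirichlet_partition(labels, num_clients=5, alpha=0.5, rng=rng)
```

Every sample index ends up at exactly one client, but the label histogram at each client can differ sharply from the global one.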

# References

- C. Zhang, Y. Xie, H. Bai, B. Yu, W. Li, and Y. Gao, “A survey on federated learning,” Knowledge-Based Systems, vol. 216, p. 106775, 2021. [[Paper](https://www.sciencedirect.com/science/article/abs/pii/S0950705121000381)]

- B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-Efficient Learning of Deep Networks from Decentralized Data,” in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, Apr. 2017, vol. 54, pp. 1273–1282. [[Paper](https://proceedings.mlr.press/v54/mcmahan17a.html)]