# Medical Insurance - a Federated Learning Use Case

Revision by Destatis (Julius Weißmann and Oliver Hauke)

## Summary
---

- We stabilized the centralized deep neural network (DNN)
  - model with more units and layers, without dropout, with Xavier initialization
  - more robust, faster, and more precise results
- We fixed the Federated Learning (FL) algorithm
  - same model as in the centralized setting
  - large improvement in loss; MAE similar to the centralized setting
- Outlook
  - FL is highly suitable for the available data
  - suggestions:
    - a fixed train/val/test split shared by the centralized and federated settings
    - cross-validation
    - tests with 5 or 9 features
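
The summary mentions Xavier (Glorot) initialization. As a minimal NumPy sketch of the uniform variant, assuming the standard limit `sqrt(6 / (fan_in + fan_out))`: the helper name `glorot_uniform` and the hidden width of 40 are hypothetical and not taken from the notebook's model.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Draw a (fan_in, fan_out) weight matrix from U(-limit, limit)."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

rng = np.random.default_rng(42)

# 9 input features as in the notebook's feature list;
# 40 hidden units is an assumed, illustrative layer width.
W = glorot_uniform(9, 40, rng)

# All entries stay within the Glorot limit for this layer shape.
print(W.shape, bool(np.abs(W).max() <= np.sqrt(6.0 / (9 + 40))))
```

The variance of such a draw is `2 / (fan_in + fan_out)`, which is meant to keep activation and gradient scales roughly comparable across layers.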

## Initial Results
---

### Centralized

*Training Performance after tuning:*
![R² training performance after hyperparameter tuning](https://github.com/Olhaau/fl-official-statistics-addon/blob/main/original_work/med-insurance/rsquared_hyperparams.jpg?raw=1)




### Federated

See
https://github.com/joshua-stock/fl-official-statistics/blob/main/med-insurance/med-insurance-federated.ipynb

- "*Results look considerably worse than in the centralized setting.*"
- "*MAE does not drop below ~8700 (vs. ~2900 in the centralized model).*"
- "*R² is negative!*"
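
A negative R² means the model fits worse than a constant prediction of the target mean. The following sketch shows how both metrics can be computed by hand; the toy charges are illustrative values, not rows from the dataset, and the helper names are hypothetical.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Toy charges (illustrative only).
y_true = np.array([2000.0, 5000.0, 12000.0, 30000.0])
mean_baseline = np.full_like(y_true, y_true.mean())
bad_model = np.zeros_like(y_true)  # e.g. a network that predicts ~0 everywhere

print(r_squared(y_true, mean_baseline))     # 0.0 by construction
print(r_squared(y_true, bad_model) < 0)     # True: worse than the mean baseline
print(mae(y_true, bad_model))               # 12250.0
```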

## Setup
---

In [1]:
# Are a repo clone and package installs needed (e.g. in Colab)?
need_clone_install = True
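
Instead of setting the flag by hand, it could be derived from the environment. The sketch below assumes that a Colab kernel has the `google.colab` module loaded at startup, which is a common idiom but not something this notebook verifies; `running_in_colab` is a hypothetical helper name.

```python
import sys

def running_in_colab():
    """Return True if the google.colab module is already loaded in this kernel."""
    return "google.colab" in sys.modules

# Clone and install only when the notebook appears to run in Colab (assumption).
need_clone_install = running_in_colab()
print(need_clone_install)
```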

### Pull Repo

In [2]:
if need_clone_install:
    import os

    # remove a stale copy of the repo (e.g. from gdrive)
    if os.path.exists("fl-official-statistics-addon"):
        %rm -r fl-official-statistics-addon

    # clone the repo and change into it
    !git clone https://github.com/Olhaau/fl-official-statistics-addon
    %cd fl-official-statistics-addon

    # pull the current version of the repo
    !git pull

Cloning into 'fl-official-statistics-addon'...
remote: Enumerating objects: 857, done.
remote: Counting objects: 100% (39/39), done.
remote: Compressing objects: 100% (29/29), done.
remote: Total 857 (delta 24), reused 15 (delta 10), pack-reused 818
Receiving objects: 100% (857/857), 33.10 MiB | 18.73 MiB/s, done.
Resolving deltas: 100% (384/384), done.
/content/fl-official-statistics-addon
Already up to date.


### Installs

#### Python Version

In [14]:
!sudo apt-get update -y
!sudo apt-get install -y python3.8
# --install needs <link> <name> <path> <priority>; the priority 1 is an arbitrary choice
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1

!sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.8 1
!update-alternatives --list python
!sudo update-alternatives --config python
!sudo update-alternatives --set python /usr/bin/python3.8
!python3 --version

Get:1 https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/ InRelease [3,622 B]
Hit:2 http://archive.ubuntu.com/ubuntu focal InRelease
Get:3 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Hit:4 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  InRelease
Hit:5 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu focal InRelease
Get:6 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
Get:7 http://archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB]
Hit:8 http://ppa.launchpad.net/cran/libgit2/ubuntu focal InRelease
Get:9 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [1,323 kB]
Hit:10 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu focal InRelease
Get:11 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [3,069 kB]
Get:12 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [1,027 kB]
Get:13 http://ppa.launchpad.net/graphics-driv

In [15]:
!python --version

Python 3.9.16


In [20]:
!update-alternatives --install /usr/bin/python python /usr/bin/python3.8 1
!update-alternatives --list python
!sudo update-alternatives --config python
!sudo update-alternatives --set python /usr/bin/python3.8
!python3 --version

/usr/bin/python3.8
/usr/bin/python3.9
There are 2 choices for the alternative python (providing /usr/bin/python).

  Selection    Path                Priority   Status
------------------------------------------------------------
  0            /usr/bin/python3.8   1         auto mode
* 1            /usr/bin/python3.8   1         manual mode
  2            /usr/bin/python3.9   1         manual mode

Press <enter> to keep the current choice[*], or type selection number: 1
Python 3.9.16


In [19]:
!python --version

Python 3.9.16


In [13]:
# See https://stackoverflow.com/questions/63168301/how-to-change-the-python-version-from-default-3-5-to-3-8-of-google-colab
!update-alternatives --install /usr/bin/python python /usr/bin/python3.7 1

update-alternatives: error: alternative path /usr/bin/python3.7 doesn't exist


In [21]:
!update-alternatives --list python

/usr/bin/python3.8
/usr/bin/python3.9


In [22]:
!sudo update-alternatives --config python

There are 2 choices for the alternative python (providing /usr/bin/python).

  Selection    Path                Priority   Status
------------------------------------------------------------
  0            /usr/bin/python3.8   1         auto mode
* 1            /usr/bin/python3.8   1         manual mode
  2            /usr/bin/python3.9   1         manual mode

Press <enter> to keep the current choice[*], or type selection number: 


In [23]:
!sudo update-alternatives --set python /usr/bin/python3.8

In [28]:
# Fails: --install also needs a <priority> argument (see the error below)
!update-alternatives --install /usr/bin/python python /usr/bin/python3.8

update-alternatives: --install needs <link> <name> <path> <priority>

Use 'update-alternatives --help' for program usage information.


In [29]:
!sudo update-alternatives --set python /usr/bin/python3.8

In [30]:
!python --version

Python 3.9.16


In [12]:
!sudo apt-get update -y
!sudo apt-get install -y python3.9
# --install needs <link> <name> <path> <priority>; the priority 1 is an arbitrary choice
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 1

!sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.9 1
!update-alternatives --list python
!sudo update-alternatives --config python
!sudo update-alternatives --set python /usr/bin/python3.9
!python3 --version

Hit:1 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu focal InRelease
Hit:2 http://ppa.launchpad.net/cran/libgit2/ubuntu focal InRelease
Get:3 https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/ InRelease [3,622 B]
Hit:4 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu focal InRelease

In [11]:
!python --version

Python 3.8.10


In [23]:
# Fails: --install also needs a <priority> argument (see the error below)
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9

update-alternatives: --install needs <link> <name> <path> <priority>

Use 'update-alternatives --help' for program usage information.


In [9]:
!sudo update-alternatives --config python3

There are 2 choices for the alternative python3 (providing /usr/bin/python3).

  Selection    Path                Priority   Status
------------------------------------------------------------
* 0            /usr/bin/python3.9   2         auto mode
  1            /usr/bin/python3.8   1         manual mode
  2            /usr/bin/python3.9   2         manual mode

Press <enter> to keep the current choice[*], or type selection number: 1
update-alternatives: using /usr/bin/python3.8 to provide /usr/bin/python3 (python3) in manual mode


In [10]:
!python --version

Python 3.8.10
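
`update-alternatives` only changes which binary the `python` and `python3` links point to; it cannot swap the interpreter of a kernel that is already running. One possible explanation for `!python --version` still reporting a different version is another `python` earlier on `PATH`, but that is an assumption. A minimal standard-library check that compares the kernel interpreter with the one a shell command would resolve to:

```python
import shutil
import subprocess
import sys

# Interpreter running this kernel (or script).
print("kernel:", sys.executable, sys.version.split()[0])

# Interpreter that a shell command such as `!python` would resolve to
# (None if no `python` is on PATH).
shell_python = shutil.which("python")
print("shell :", shell_python)

if shell_python is not None:
    result = subprocess.run(
        [shell_python, "--version"], capture_output=True, text=True
    )
    # Very old interpreters print the version to stderr, so check both streams.
    print((result.stdout or result.stderr).strip())
```

If the two differ, packages installed with `!pip` may end up in an environment the kernel does not import from.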


#### Other

In [31]:
!pip install tensorflow-federated==0.48.*

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting tensorflow-federated==0.48.*
  Downloading tensorflow_federated-0.48.0-py2.py3-none-any.whl (42.8 MB)
Collecting cachetools~=3.1
  Downloading cachetools-3.1.1-py2.py3-none-any.whl (11 kB)
Collecting typing-extensions~=4.4.0
  Downloading typing_extensions-4.4.0-py3-none-any.whl (26 kB)
Collecting jax==0.3.14
  Downloading jax-0.3.14.tar.gz (990 kB)
  Preparing metadata (setup.py) ... done
Collecting tensorflow-privacy==0.8.6
  Downloading tensorflow_privacy-0.8.6-py3-none-any.whl (301 kB)

In [32]:
!pip list

Package                       Version
----------------------------- ------------
absl-py                       1.0.0
alabaster                     0.7.13
albumentations                1.2.1
altair                        4.2.2
anyio                         3.6.2
appdirs                       1.4.4
argon2-cffi                   21.3.0
argon2-cffi-bindings          21.2.0
arviz                         0.15.1
astropy                       5.2.2
astunparse                    1.6.3
attrs                         21.4.0
audioread                     3.0.0
autograd                      1.5
Babel                         2.12.1
backcall                      0.2.0
beautifulsoup4                4.11.2
bleach                        6.0.0
blis                          0.7.9
bokeh                         2.4.3
branca                        0.6.0
CacheControl                  0.12.11
cached-property               1.5.2
cachetools                    3.1.1
catalogue                     2.0.8
certifi     

In [34]:
!pip install --quiet tensorflow-addons==0.19.*

In [19]:
if need_clone_install:
  !pip install --quiet nest-asyncio==1.5.6
  #!pip install --quiet tensorflow==2.11.*
  !pip install --quiet tensorflow-federated==0.48.*
  !pip install --quiet tensorflow-addons==0.19.*

ERROR: Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/urllib3/response.py", line 438, in _error_catcher
    yield
  File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/urllib3/response.py", line 519, in read
    data = self._fp.read(amt) if not fp_closed else b""
  File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/cachecontrol/filewrapper.py", line 90, in read
    data = self.__fp.read(amt)
  File "/usr/lib/python3.9/http/client.py", line 463, in read
    n = self.readinto(b)
  File "/usr/lib/python3.9/http/client.py", line 507, in readinto
    n = self.fp.readinto(b)
  File "/usr/lib/python3.9/socket.py", line 704, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.9/ssl.py", line 1242, in recv_into
    return self.read(nbytes, buffer)
  File "/usr

In [12]:
!python --version

Python 3.9.16


In [13]:
!pip list

Package                       Version
----------------------------- ------------
absl-py                       1.0.0
alabaster                     0.7.13
albumentations                1.2.1
altair                        4.2.2
appdirs                       1.4.4
argon2-cffi                   21.3.0
argon2-cffi-bindings          21.2.0
arviz                         0.15.1
astropy                       5.2.2
astunparse                    1.6.3
attrs                         21.4.0
audioread                     3.0.0
autograd                      1.5
Babel                         2.12.1
backcall                      0.2.0
beautifulsoup4                4.11.2
bleach                        6.0.0
blis                          0.7.9
bokeh                         2.4.3
branca                        0.6.0
CacheControl                  0.12.11
cached-property               1.5.2
cachetools                    3.1.1
catalogue                     2.0.8
certifi                       2022.12.7
cffi    
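
Rather than scanning the full `pip list` output, the versions of the few packages this notebook depends on can be queried directly. This is a sketch using the standard library's `importlib.metadata` (Python 3.8+); `installed_version` is a hypothetical helper, and the package names are the ones installed in the cells above.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

for name in ["tensorflow", "tensorflow-federated", "tensorflow-addons", "nest-asyncio"]:
    print(f"{name:24s} {installed_version(name)}")
```

A `None` entry here points to a package that is not installed in the environment the kernel runs in.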

### Imports

In [33]:
import pandas as pd
from sklearn.model_selection import (
    train_test_split, cross_val_score, ShuffleSplit,
    RandomizedSearchCV, GridSearchCV,
)

# DNN
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow_addons.metrics import RSquare  # requires tensorflow-addons (see Installs)

ModuleNotFoundError: ignored

### Ingest Data

In [9]:
df = pd.read_csv("output/data/insurance-clean.csv", index_col = 0)
df.head()

| | age | sex | bmi | children | smoker | region | charges | region0 | region1 | region2 | region3 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.021739 | 0.0 | 0.321227 | 0.0 | 1.0 | southwest | 16884.924 | 0.0 | 0.0 | 0.0 | 1.0 |
| 1 | 0.0 | 1.0 | 0.47915 | 0.2 | 0.0 | southeast | 1725.5523 | 0.0 | 0.0 | 1.0 | 0.0 |
| 2 | 0.217391 | 1.0 | 0.458434 | 0.6 | 0.0 | southeast | 4449.462 | 0.0 | 0.0 | 1.0 | 0.0 |
| 3 | 0.326087 | 1.0 | 0.181464 | 0.0 | 0.0 | northwest | 21984.47061 | 0.0 | 1.0 | 0.0 | 0.0 |
| 4 | 0.304348 | 1.0 | 0.347592 | 0.0 | 0.0 | northwest | 3866.8552 | 0.0 | 1.0 | 0.0 | 0.0 |


#### Train Test Split

In [16]:
# Split the data into training and test sets
features = ['age', 'sex', 'bmi', 'children', 'smoker', 'region0', 'region1', 'region2', 'region3']
target = 'charges'

df_ml = df[features + [target]]

X_train, X_test, y_train, y_test = train_test_split(
    df_ml[features], df_ml[[target]], 
    test_size=0.2, random_state=42, shuffle = True)
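
The summary suggests a fixed train/val/test split that is shared by the centralized and the federated experiments. One way to sketch this is to derive the row indices once from a seed and reuse them everywhere. The helper `fixed_split_indices`, the 70/10/20 fractions, and the row count of 1000 are assumptions for illustration.

```python
import numpy as np

def fixed_split_indices(n_rows, val_size=0.1, test_size=0.2, seed=42):
    """Return disjoint (train, val, test) index arrays that depend only on n_rows and seed."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows)
    n_test = int(round(test_size * n_rows))
    n_val = int(round(val_size * n_rows))
    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_val]
    train_idx = order[n_test + n_val:]
    return train_idx, val_idx, test_idx

# Hypothetical row count; in the notebook this would be len(df_ml).
train_idx, val_idx, test_idx = fixed_split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 700 100 200
```

The index arrays could then be applied with `df_ml.iloc[train_idx]` and so on, so that both settings are evaluated on exactly the same test rows.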

## Centralized Neural Networks
---

## Federated Learning
---

### Setup

### FedAvg
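
As a minimal sketch of the aggregation step of federated averaging in general, and not of the TensorFlow Federated implementation used in this notebook, the server averages the clients' weights layer by layer, weighted by each client's number of local samples. The function name, the toy weights, and the client sizes are hypothetical.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Average per-layer weights across clients, weighted by local sample counts."""
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    return [
        sum(w[layer] * (n / total) for w, n in zip(client_weights, client_sizes))
        for layer in range(n_layers)
    ]

# Two toy clients, each with one weight matrix and one bias vector (illustrative values).
client_a = [np.array([[1.0, 2.0]]), np.array([0.0])]
client_b = [np.array([[3.0, 6.0]]), np.array([4.0])]

# With sizes 100 and 300 the result is 0.25 * client_a + 0.75 * client_b.
global_weights = federated_average([client_a, client_b], client_sizes=[100, 300])
print(global_weights[0], global_weights[1])
```

In a full training loop this step would alternate with local training on each client, starting every round from the current global weights.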