# ATIC Project
Based on the "Federated-Learning-Source" repository, we propose a model and dataset that can be integrated into this framework for financial forecasting and decision-making.



## Setup
Since this project did not run without fixing some parts, we present updated setup instructions.

### Docker
To set up the database and the Python environment in Docker, we need two separate Docker containers, one for each application, connected through a Docker network.

### Database
MongoDB is used as the data management system. The database is accessed only by the global server and does not store any training or test data; instead, it handles saving and aggregation of model parameters, experiment hyperparameters, and node task parameters.  
To set up the database in Docker, follow these steps:  

1. Set up a Docker container for MongoDB. In the terminal run:  
    1. Create a Docker volume to persist the database data
    `docker volume create mongo_data`  
    2. Create a Docker network for communication between the database and the code
    `docker network create fl_network`  
    3. Download the MongoDB image for Docker
    `docker pull mongo`
    4. Navigate to ATIC_project in the terminal
    5. Create a key file for authorization and save it (step 6 mounts `$(pwd)/mongo-keyfile`, so the path used here must point to that file):
    `openssl rand -base64 741 > /your/local/path/mongo-keyfile`
    `chmod 600 /your/local/path/mongo-keyfile` 
    6. Start the Docker container for MongoDB
    `docker run -d \
    --name mongodb_instance \
    --network fl_network \
    -v mongo_data:/data/db \
    -v $(pwd)/mongo-keyfile:/etc/mongo-keyfile \
    -e MONGO_INITDB_ROOT_USERNAME=admin \
    -e MONGO_INITDB_ROOT_PASSWORD=password \
    mongo:latest \
    mongod --replSet rs0 --bind_ip_all --keyFile /etc/mongo-keyfile --auth`

2. Set up the database:  
    1. Connect to mongodb docker container:  
    `docker exec -it mongodb_instance mongosh -u admin -p password`
    2. In the MongoDB shell, run:  
    `rs.initiate()`
    3. Create the database and its collections:  
    `use federated_learning`  
    `db.model.insertOne({})`  
    `db.experiment.insertOne({})`  
    `db.task.insertOne({})`
    4. Exit the shell and add a file `db_conf.key` in the globalserver folder with content:  
    `{"port": "27017","host":"mongodb_instance","user": "admin","password": "password"}`
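
As a quick check that `db_conf.key` is well-formed, the sketch below parses the four fields shown above and assembles a standard MongoDB connection string from them. The helper name `build_mongo_uri` is hypothetical, and how the global server itself consumes the file is not covered here, so treat this only as an illustration under those assumptions.

```python
import json
from urllib.parse import quote_plus


def build_mongo_uri(conf):
    """Assemble a mongodb:// URI from the fields of db_conf.key (hypothetical helper)."""
    # Credentials are percent-encoded so special characters do not break the URI.
    user = quote_plus(conf["user"])
    password = quote_plus(conf["password"])
    return f"mongodb://{user}:{password}@{conf['host']}:{conf['port']}/"


# Same content as the db_conf.key example above.
conf = json.loads(
    '{"port": "27017","host":"mongodb_instance","user": "admin","password": "password"}'
)
uri = build_mongo_uri(conf)
print(uri)
```

The host name `mongodb_instance` only resolves from containers attached to `fl_network`, which is why the app container is started on the same network.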

### App Docker
Navigate to the root directory in the terminal and run:
1. Build docker image from Dockerfile:  
`docker build -t my-python-jupyter .`
2. Run docker container:  
`docker run -d \
--name my-jupyter-app \
--network fl_network \
-p 8888:8888 \
-v "$(pwd)":/usr/src/app \
my-python-jupyter`
3. Check for the Jupyter notebook address: 
`docker logs my-jupyter-app`
4. Copy the address (it should look something like http://127.0.0.1:8888...) into a browser to open the notebook

### Local Server setup
Add a file `envs.key` in the root directory with content:  
`{  
    "SERVER_ADDRESS": "0.0.0.0",  
    "CLIENT_SECRET": "your_secret_here",  
    "SERVER_PORT": "50000",  
    "CLIENT_INTERFACE_PORT": "50001",  
    "DATA_WRAPPER_URL":"None"  
}`  
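
A malformed `envs.key` is easy to miss until the servers fail to start. The following sketch checks that the five keys listed above are present and that both port values parse as valid port numbers. The helper `check_envs` is hypothetical and not part of the framework; the set of required keys is assumed from the example above.

```python
import json

# Keys taken from the envs.key example above (assumed to be the full required set).
REQUIRED_KEYS = {
    "SERVER_ADDRESS",
    "CLIENT_SECRET",
    "SERVER_PORT",
    "CLIENT_INTERFACE_PORT",
    "DATA_WRAPPER_URL",
}


def check_envs(envs):
    """Return a list of problems found in an envs.key dictionary (hypothetical helper)."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(envs))]
    for key in ("SERVER_PORT", "CLIENT_INTERFACE_PORT"):
        value = str(envs.get(key, ""))
        # Ports are stored as strings in the file, so check that they parse as port numbers.
        if not (value.isdigit() and 0 < int(value) < 65536):
            problems.append(f"invalid port in {key}: {value!r}")
    return problems


envs = json.loads("""{
    "SERVER_ADDRESS": "0.0.0.0",
    "CLIENT_SECRET": "your_secret_here",
    "SERVER_PORT": "50000",
    "CLIENT_INTERFACE_PORT": "50001",
    "DATA_WRAPPER_URL": "None"
}""")
problems = check_envs(envs)
print(problems or "envs.key looks consistent")
```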

## Code
In the following part, each code section provides a commented blueprint showing what is needed to run the federated learning system.

#### Imports

In [12]:
from IPython import get_ipython
get_ipython().run_line_magic('matplotlib', 'inline')
import os
os.chdir('/usr/src/app')
import sys
os.environ['STATIC_VARIABLES_FILE_PATH'] = "globalserver/static_variables.json"
os.environ['PATH_TO_GLOBALSERVER'] = "globalserver/api/"
sys.path.append(os.getcwd())
import json

# Importing the required Keras modules containing model and layers
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# TODO replace these imports with our own dataset
from examples.dummy_example.utils import get_data,save_data_as_json,plot_data

#### Model
The model must be defined in Keras and precompiled together with its optimizer and metrics.

In [13]:
def kkbox_nn(parameters):
    model = Sequential()
    layers = 5
    nodes = 16
    lr = 0.01

    for i in range(layers):
        if i == 0:
            model.add(Dense(nodes, activation=tf.nn.relu, input_shape=(2,)))
        else:
            model.add(Dense(nodes, activation=tf.nn.relu))

    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=lr, momentum=0.9),
                  loss='binary_crossentropy',
                  metrics=[tf.metrics.AUC()])
    model.summary()
    return model


clients = ["r1","r0"]


setup_dict = {
    "model_function": {
        "function": kkbox_nn,
        "parameters": {}
    },
    "git_version": 'e9339081b76ad3a89b1862bd38d8af26f0541f1c',
    "protocol": 'NN',
    "model_name": "test_model",
    "model_description": "this model is just to test the db",
    "testing": True,
    "training_config": {
        'epochs':  10,
        'verbose': 2,
        'batch_size': 100,
        "validation_steps": 40,
        # "dataset":'1',
        "test_steps": 40,
        "steps_per_epoch": 20,#int(14679/1000),
        "skmetrics": [],
        "tfmetrics": ["AUC", "Accuracy"],
        "differential_privacy": {"method": 'before',},
    },
    "rounds": 30,
    "round": ["fetch_model", "train_model", "send_model", "send_training_loss", "send_test_loss", "aggregate"],
    "final_round": ["fetch_model","send_test_loss", "send_training_loss"],
    "clients": clients,
    "experiment_name": "kkbox",
    "experiment_description": f"desc if nice experiment",
    "stop_function": None,
    "upkeep_function": None,
    "preprocessing": {
        "noise": {
            "epsilon": 10000,
            "delta": 1
        }
    },
}
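
As a sanity check for `kkbox_nn`, the number of trainable parameters can be worked out by hand: each `Dense` layer has one weight per input-output pair plus one bias per output. The sketch below does this arithmetic for the layer sizes used above; `model.summary()` should report the same total as long as those sizes are unchanged. The helper name `dense_param_count` is illustrative only.

```python
def dense_param_count(input_dim, hidden_layers, nodes, output_dim=1):
    """Count weights and biases of a fully connected network."""
    total = 0
    fan_in = input_dim
    for _ in range(hidden_layers):
        total += fan_in * nodes + nodes  # weight matrix plus one bias per node
        fan_in = nodes
    total += fan_in * output_dim + output_dim  # output layer
    return total


# Values taken from kkbox_nn: 2 input features, 5 hidden layers of 16 nodes, 1 sigmoid output.
n_params = dense_param_count(input_dim=2, hidden_layers=5, nodes=16)
print(n_params)  # 1153
```

This is also the number of values each node sends to the global server per round, assuming the full set of weights is transmitted.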

#### Server
Next, we start the workers and the global server. The existing `Testing` class is very helpful here, as it handles the startup process for a local setup.

In [None]:
from globalserver.operator_.operator_class_db import Operator
from testing.test_class import Testing

TestSetup = Testing(clients, start_servers=True, clear_logs=True, clear_db=False, interface=False)
operator = Operator()

#### Dataset 
The dataset needs to be converted from NumPy to JSON format. To do so, create a folder `datasets` in the root directory and adapt `save_data_as_json` to the new dataset.

In [None]:
# Copied from the dummy example.
# This is just an easy example dataset for binary classification.
# y = decision boundary
# client_data_final = random subset of the training data that can be accessed by an individual client
training_data, client1_data_final, client2_data_final, y = get_data(exp=3)
test_data, _, _, _ = get_data(seed=10, exp=3)
save_data_as_json(client1_data_final, client2_data_final, test_data)
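
The exact JSON layout the framework expects is defined by `save_data_as_json` in the dummy example and is not reproduced here. The sketch below only illustrates the general conversion step for a new dataset: turning array rows into nested lists and writing them to a file inside a `datasets` folder. The helper `rows_to_json`, the file name `client_r0.json`, and the row layout are all assumptions.

```python
import json
import tempfile
from pathlib import Path


def rows_to_json(rows, path):
    """Write a 2-D array-like to a JSON file (hypothetical helper, layout is an assumption)."""
    # NumPy arrays are not JSON serializable; tolist() turns them into nested Python lists.
    data = rows.tolist() if hasattr(rows, "tolist") else [list(row) for row in rows]
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)  # creates the datasets folder if needed
    with path.open("w") as handle:
        json.dump(data, handle)
    return path


# Two rows of [feature_1, feature_2, label]; plain lists stand in for a NumPy array here.
sample = [[0.1, 0.2, 1], [0.3, 0.4, 0]]
with tempfile.TemporaryDirectory() as tmp:
    target = rows_to_json(sample, Path(tmp) / "datasets" / "client_r0.json")
    loaded = json.loads(target.read_text())
print(loaded)
```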

#### Training
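Here the model is trained three times: once with both node servers contributing to the aggregated model, and once with each node server training on its own, so that the federated result can be compared against individual training.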

In [None]:
models = {}
for clients in [["r1", "r0"], ["r1"], ["r0"]]:
    setup_dict["clients"] = clients
    experiment_id, _ = operator.define_and_start_experiment(setup_dict)
    model = operator.get_compiled_model(protocol='NN', experiment_id=experiment_id)
    models[f'{clients}'] = model
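
The `aggregate` step in each round combines the models sent by the participating nodes. How the global server implements this is not shown in this document; a common choice is federated averaging (FedAvg), sketched below on flat lists of weights purely as an illustration. The function name and the toy numbers are assumptions.

```python
def federated_average(client_weights, client_sizes):
    """Average flat weight vectors, weighting each client by its number of samples."""
    total = sum(client_sizes)
    n_weights = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(n_weights)
    ]


# Toy example: two clients with two parameters each, holding 1 and 3 samples.
averaged = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(averaged)  # [2.5, 3.5]
```

With a single client in the list, the average is just that client's weights, which corresponds to the two individually trained runs in the loop above.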

#### Plot
This is just an example of how to plot the results.

In [None]:
for clients, model in models.items():
    model.evaluate(x=test_data[:, 0:2], y=test_data[:, 2], verbose=2)
    y_pred = model.predict(x=test_data[:, :2])
    # Work on a copy so the true labels in test_data stay intact for the next model
    predicted = test_data.copy()
    predicted[:, 2] = [1 if p[0] > 0.5 else 0 for p in y_pred]
    # predicted[predicted[:, 2] > 0]: samples predicted as label 1
    # predicted[predicted[:, 2] < 1]: samples predicted as label 0
    # y: decision boundary
    # plot_data: plots all three datasets above in one plot with different colors
    plot_data([predicted[predicted[:, 2] > 0], predicted[predicted[:, 2] < 1], y])
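
The plotting cell turns the sigmoid outputs into hard labels with a 0.5 threshold. The sketch below isolates that step and adds a plain accuracy computation, using toy values in place of `model.predict` output. Both helper names are hypothetical.

```python
def threshold_predictions(y_pred, threshold=0.5):
    """Map sigmoid outputs of shape (n, 1) to hard 0/1 labels."""
    return [1 if p[0] > threshold else 0 for p in y_pred]


def accuracy(y_true, y_hat):
    """Fraction of labels that match."""
    return sum(int(t == p) for t, p in zip(y_true, y_hat)) / len(y_true)


# Toy values standing in for model.predict output and the true labels.
y_pred = [[0.9], [0.2], [0.5], [0.7]]
labels = threshold_predictions(y_pred)
acc = accuracy([1, 0, 1, 1], labels)
print(labels, acc)
```

Note that the comparison is strict, so a prediction of exactly 0.5 is assigned label 0.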

#### Stop Servers
Note: If the servers are not killed before starting a new session, this can lead to issues. You can check which processes are still running with `sudo lsof -i :27017`. This should list mongo as well as the global and node servers. With `kill -9 PID` you can end processes manually.

In [None]:
TestSetup.kill_servers()
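
To confirm from within the notebook that the servers are really gone, one option is to test whether anything still accepts connections on the ports from `envs.key`. This is a minimal sketch using only the standard library; the helper `port_in_use` is hypothetical, and the port numbers are assumed to match the `envs.key` example above.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


# SERVER_PORT and CLIENT_INTERFACE_PORT from the envs.key example.
status = {port: port_in_use(port) for port in (50000, 50001)}
print(status)
```

After `kill_servers()` both entries are expected to be `False`. To check MongoDB from inside the app container, the host would be `mongodb_instance` with port 27017 rather than `127.0.0.1`.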