On the instance, create a temporary workspace directory and copy the fraud-detection example and the gen-cert utility into it.

In [1]:
!mkdir workspace
!cp -r examples/fraud-detection workspace/
!cp -r examples/utils/gen-cert workspace/fraud-detection/

This example uses a separate, private data-and-scratch directory for each user (ML node). Create a directory for each user and copy the data-and-scratch directory into it. Running the example creates a scratch directory for each user and saves the trained Swarm model in that directory at the end of training.

In [2]:
!mkdir workspace/fraud-detection/user1 workspace/fraud-detection/user2
!mkdir workspace/fraud-detection/user3 workspace/fraud-detection/user4
!cp -r workspace/fraud-detection/data-and-scratch workspace/fraud-detection/user1/
!cp -r workspace/fraud-detection/data-and-scratch workspace/fraud-detection/user2/
!cp -r workspace/fraud-detection/data-and-scratch workspace/fraud-detection/user3/
!mv workspace/fraud-detection/data-and-scratch workspace/fraud-detection/user4/
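
The same per-user layout can also be produced from Python. The following is a minimal sketch, not part of the Swarm Learning tooling: the helper name `prepare_user_dirs` is hypothetical, and the demo runs against a throwaway temporary directory with a made-up `sample.csv` rather than the real example data.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_user_dirs(example_dir, users):
    """Give each user a private data-and-scratch directory.

    All users except the last get a copy (like `cp -r`); the last user
    receives the original directory (like `mv`), mirroring the cell above.
    """
    example_dir = Path(example_dir)
    source = example_dir / "data-and-scratch"
    targets = []
    for index, user in enumerate(users):
        user_dir = example_dir / user
        user_dir.mkdir(parents=True, exist_ok=True)
        target = user_dir / "data-and-scratch"
        if index < len(users) - 1:
            shutil.copytree(source, target)
        else:
            shutil.move(str(source), str(target))
        targets.append(target)
    return targets


# Demo in a temporary directory; sample.csv is a placeholder, not real data.
with tempfile.TemporaryDirectory() as tmp:
    example = Path(tmp) / "fraud-detection"
    (example / "data-and-scratch" / "app-data").mkdir(parents=True)
    (example / "data-and-scratch" / "app-data" / "sample.csv").write_text("a,b\n1,2\n")
    created = prepare_user_dirs(example, ["user1", "user2", "user3", "user4"])
    copies_ok = all((t / "app-data" / "sample.csv").is_file() for t in created)
    original_moved = not (example / "data-and-scratch").exists()

print(copies_ok, original_moved)
```

To apply it to the real workspace, the call would be `prepare_user_dirs("workspace/fraud-detection", ["user1", "user2", "user3", "user4"])`, assuming the directories do not exist yet.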

Run the gen-cert utility to generate certificates for each Swarm component. The command syntax is gen-cert -e <EXAMPLE-NAME> -i <HOST-INDEX>.

In [17]:
!./workspace/fraud-detection/gen-cert -e fraud-detection -i 1

Generating a RSA private key
........+++++
..........................+++++
writing new private key to 'workspace/fraud-detection/cert/ca/ca-1-key.pem'
-----
./workspace/fraud-detection/gen-cert: line 55: local: can only be used in a function
Signature ok
subject=OU = Swarm-Learning, CN = CA-1
Getting Private key
./workspace/fraud-detection/gen-cert: line 69: local: can only be used in a function
Generating a RSA private key
....................................................................................................................................................+++++
.................................+++++
writing new private key to 'workspace/fraud-detection/cert/sn-1-key.pem'
-----
Signature ok
subject=OU = Swarm-Learning, CN = SN-1
Getting CA Private Key
Generating a RSA private key
.......................................................................+++++
.......+++++
writing new private key to 'workspace/fraud-detection/cert/sl-1-key.pem'
-----
Signature ok
subject=OU = S

Replace all occurrences of the <CURRENT-PATH> tag in swarm_fd_task.yaml and in the SWOP profile files (swop*_profile.yaml) with the current working directory, $(pwd).

In [18]:
!sed -i "s+<CURRENT-PATH>+$(pwd)+g" workspace/fraud-detection/swop/swop*_profile.yaml workspace/fraud-detection/swci/taskdefs/swarm_fd_task.yaml
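
For readers who prefer to see how many tags were replaced in each file, the substitution can be sketched in Python. This is an illustrative equivalent of the sed command, not a replacement for it: `replace_placeholder` is a hypothetical helper, and the YAML keys in the demo file are invented for the example.

```python
import tempfile
from pathlib import Path

PLACEHOLDER = "<CURRENT-PATH>"


def replace_placeholder(files, value, placeholder=PLACEHOLDER):
    """Replace every occurrence of placeholder in each file.

    Returns a dict mapping file name to the number of replacements made.
    """
    counts = {}
    for path in files:
        path = Path(path)
        text = path.read_text()
        counts[path.name] = text.count(placeholder)
        path.write_text(text.replace(placeholder, value))
    return counts


# Demo on a throwaway file; the keys below are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    profile = Path(tmp) / "swop1_profile.yaml"
    profile.write_text(
        "example-key-a: <CURRENT-PATH>/workspace\n"
        "example-key-b: <CURRENT-PATH>/cert\n"
    )
    counts = replace_placeholder([profile], "/home/user/swarm")
    result = profile.read_text()

print(counts)
print(result)
```

A count of zero for a file is a quick hint that the tag was already replaced or that the wrong file was targeted.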

Create a Docker volume and copy the Swarm Learning wheel file into it.

In [19]:
!docker volume rm sl-cli-lib
!docker volume create sl-cli-lib
!docker container create --name helper -v sl-cli-lib:/data hello-world
!docker cp -L lib/swarmlearning-client-py3-none-manylinux_2_24_x86_64.whl helper:/data
!docker rm helper

sl-cli-lib
sl-cli-lib
5f3cdca40b90c896259d89d118e3ac6bd46bd6da239c622ab64fba2c7aa5ccfa
helper


Create a Docker network for the SN, SWOP, SWCI, SL, and user containers running on the same host.

In [20]:
!docker network create host-1-net

b8a11de84103dd5b7798aa80c67b3115e57c60968d1c2cf902e6639feb2b4f65


Run the Swarm Network node (SN1), which acts as the sentinel node.

In [21]:
!./scripts/bin/run-sn -d --rm --name=sn1 --network=host-1-net --host-ip=sn1 --sentinel --key=workspace/fraud-detection/cert/sn-1-key.pem --cert=workspace/fraud-detection/cert/sn-1-cert.pem --capath=workspace/fraud-detection/cert/ca/capath --apls-ip=10.128.0.8

5610c5332a5b42341ea5f585c471673e77796dbb0f4c986b238f4989b592882a


In [22]:
!docker ps

CONTAINER ID   IMAGE                                                              COMMAND                  CREATED         STATUS         PORTS     NAMES
5610c5332a5b   hub.myenterpriselicense.hpe.com/hpe_eval/swarm-learning/sn:1.2.0   "/usr/bin/python3 -c…"   4 seconds ago   Up 2 seconds             sn1


In [23]:
!docker logs -f sn1

######################################################################
##                    HPE SWARM LEARNING SN NODE                    ##
######################################################################
## © Copyright 2019-2022 Hewlett Packard Enterprise Development LP  ##
######################################################################
2023-01-18 10:49:55,688 : swarm.blCnt : INFO : Setting up blockchain layer for the swarm node: START
2023-01-18 10:49:57,572 : swarm.blCnt : INFO : Creating Autopass License Provider
2023-01-18 10:49:59,086 : swarm.blCnt : INFO : Creating license server
2023-01-18 10:49:59,086 : swarm.blCnt : INFO : Setting license servers
2023-01-18 10:49:59,138 : swarm.blCnt : INFO : Acquiring floating license 1100000380:1
2023-01-18 10:50:32,883 : swarm.SN : INFO : SMLETHNode: Starting GETH ... 
2023-01-18 10:54:47,746 : swarm.SN : INFO : SMLETHNode: Started I-am-Alive thread
2023-01-18 10:54:47,747 : swarm.blCnt : INFO : Setting up blockchain layer f

Use the docker logs command to monitor the sentinel SN node and wait for it to finish initializing.
The sentinel node is ready when the following message appears in the log output:

swarm.blCnt : INFO : Starting SWARM-API-SERVER on port: 30304
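
Instead of watching the log by hand, the wait can be scripted. The sketch below polls a log source until the readiness message appears. The function names, the default timeout, and the polling interval are assumptions chosen for illustration; only `docker logs` itself comes from the steps above. The demo uses a fake log source so it runs without Docker.

```python
import subprocess
import time

# Substring of the readiness message shown above.
READY_MARKER = "Starting SWARM-API-SERVER on port"


def docker_logs(container):
    """Return the container's combined stdout and stderr log text."""
    proc = subprocess.run(
        ["docker", "logs", container],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
    )
    return proc.stdout


def wait_for_marker(fetch_logs, marker=READY_MARKER, timeout=600.0, interval=5.0):
    """Poll fetch_logs() until marker appears.

    Returns True once the marker is seen, or False if the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        if marker in fetch_logs():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Demo with a fake log source: the second poll contains the marker.
fake_logs = iter([
    "swarm.SN : INFO : SMLETHNode: Starting GETH ...",
    "swarm.blCnt : INFO : Starting SWARM-API-SERVER on port: 30304",
])
ready = wait_for_marker(lambda: next(fake_logs), timeout=5, interval=0)
print(ready)
```

Against the real container, the call would be `wait_for_marker(lambda: docker_logs("sn1"))`. In the log excerpt above, initialization took roughly five minutes, so a generous timeout is advisable.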


Depending on your environment, modify the IP address and proxy settings in the profile files under the workspace/fraud-detection/swop folder.

Go to the "Home Page" of the Jupyter notebook.
Click New.
Click Terminal. A new terminal window opens in a new browser tab.
Execute the following command:
nano workspace/fraud-detection/swop/swop1_profile.yaml
Change the APLS host IP from 172.1.1.1 to 10.128.0.8.
Press Ctrl+X, then Y, then Enter to save the file and exit.

Run the Swarm Operator node (SWOP1).

In [24]:
!./scripts/bin/run-swop -d --rm --name=swop1 --network=host-1-net --usr-dir=workspace/fraud-detection/swop --profile-file-name=swop1_profile.yaml --key=workspace/fraud-detection/cert/swop-1-key.pem --cert=workspace/fraud-detection/cert/swop-1-cert.pem --capath=workspace/fraud-detection/cert/ca/capath -e SWOP_KEEP_CONTAINERS=True -e http_proxy= -e https_proxy= --apls-ip=10.128.0.8

0b8a655a5c9c6a64c6317261ecc989a44e102a8e037defc0689edc045e7ad67c


Run the SWCI node (SWCI1). It creates, finalizes, and assigns the following tasks to the task framework for sequential execution:
user_env_tf_build_task: Builds a TensorFlow-based Docker image for the ML nodes to run model training.
swarm_fd_task: Creates containers from the ML image and mounts the model and data paths to run Swarm training.

In [25]:
!./scripts/bin/run-swci -ti --rm --name=swci1 --network=host-1-net --usr-dir=workspace/fraud-detection/swci --init-script-name=swci-init --key=workspace/fraud-detection/cert/swci-1-key.pem --cert=workspace/fraud-detection/cert/swci-1-cert.pem --capath=workspace/fraud-detection/cert/ca/capath -e http_proxy= -e https_proxy= --apls-ip=10.128.0.8

22acb7d69c322c42fb274517fc7b07c1c90acfd47708b3ed56c34c88fc0160f8
######################################################################
##                   HPE SWARM LEARNING SWCI NODE                   ##
######################################################################
## © Copyright 2021, 2022 Hewlett Packard Enterprise Development LP ##
######################################################################
SWCI:0 > ######################################################################
SWCI:0 > # (C)Copyright 2021,2022 Hewlett Packard Enterprise Development LP
SWCI:0 > ######################################################################
SWCI:0 > 
SWCI:0 > # Assumption : SWOP is already running
SWCI:0 > 
SWCI:0 > # SWCI context setup
SWCI:0 > EXIT ON FAILURE
SWCI:0 > EXIT ON FAILURE IS TURNED ON
SWCI:1 > wait for ip sn1
    API Server is UP!
SWCI:2 > create context test-fd with ip sn1
    API Server is UP!
    CONTEXT CREATED : test-fd
SWCI:3 > switch context test-fd
    DEFA

Four Swarm training nodes start automatically when the run task (swarm_fd_task) is assigned and executed. Open a new terminal on host-1 and monitor the Docker logs of the ML nodes for Swarm training. Swarm training ends with the following log message:
SwarmCallback : INFO : All peers and Swarm training rounds finished. Final Swarm model was loaded.
The final Swarm model is saved inside each user's private scratch directory, which is workspace/fraud-detection/user<id>/data-and-scratch/scratch on the host. All the dynamically spawned SL and ML nodes exit after Swarm training. The SN and SWOP nodes continue to run.
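
After training finishes, a quick way to confirm that every user received a model is to list what landed in each scratch directory. The sketch below is an illustration only: `list_scratch_files` is a hypothetical helper, and because the name and format of the saved model file are not stated here, the demo uses an obviously fake file name.

```python
import tempfile
from pathlib import Path


def list_scratch_files(example_dir):
    """Map each user directory name to the files under its scratch directory."""
    found = {}
    pattern = "user*/data-and-scratch/scratch"
    for scratch in sorted(Path(example_dir).glob(pattern)):
        user = scratch.parent.parent.name
        found[user] = sorted(
            p.relative_to(scratch).as_posix()
            for p in scratch.rglob("*")
            if p.is_file()
        )
    return found


# Demo layout in a temporary directory; the model file name is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "fraud-detection"
    for user in ("user1", "user2"):
        (root / user / "data-and-scratch" / "scratch").mkdir(parents=True)
    placeholder = root / "user1" / "data-and-scratch" / "scratch" / "final_model.placeholder"
    placeholder.write_text("")
    models = list_scratch_files(root)

print(models)
```

An empty list for a user, as for `user2` in the demo, would suggest that the corresponding ML node did not finish training.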

Go to the "Home Page" of the Jupyter notebook.
Click New.
Click Terminal. A new terminal window opens in a new browser tab.
Execute the following command:

docker ps

The output lists sn1, swop1, swci1, four SL containers, and four ML containers.

docker logs -f demo-swarm_fd_task-u-0-034698c2bbf002b7

Your container name will probably differ, so replace demo-swarm_fd_task-u-0-034698c2bbf002b7 with the name shown in your docker ps output.
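
Because the container names carry a generated suffix, it can help to look them up programmatically. The sketch below separates ML (user) containers from SL containers by name. The `-u-` and `-s-` naming pattern is inferred from the container names that appear in this notebook's output and is an assumption, not a documented contract; the helper names are hypothetical.

```python
import subprocess


def list_container_names():
    """Return the names of all containers, running or exited."""
    proc = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}"],
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    return proc.stdout.split()


def split_task_containers(names, task="swarm_fd_task"):
    """Split names into (ML, SL) lists using the '-u-' and '-s-' markers."""
    ml_nodes = sorted(n for n in names if f"{task}-u-" in n)
    sl_nodes = sorted(n for n in names if f"{task}-s-" in n)
    return ml_nodes, sl_nodes


# Demo with names taken from this notebook's output, so no Docker is needed.
sample = [
    "demo-swarm_fd_task-u-3-28d5eacd49052828",
    "demo-swarm_fd_task-s-3-28d5eacd49052828",
    "demo-swarm_fd_task-u-2-a7bc64575a5168e4",
    "demo-swarm_fd_task-s-2-a7bc64575a5168e4",
    "sn1",
    "swop1",
]
ml_nodes, sl_nodes = split_task_containers(sample)
print(ml_nodes)
print(sl_nodes)
```

On a live host, `split_task_containers(list_container_names())` returns the names to pass to `docker logs`.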

In [26]:
!docker ps -a

CONTAINER ID   IMAGE                                                                COMMAND                  CREATED          STATUS                      PORTS     NAMES
f4e797fffa99   user-env-tf2.7.0-swop                                                "python3 model/fraud…"   7 minutes ago    Exited (0) 4 minutes ago              demo-swarm_fd_task-u-3-28d5eacd49052828
95828a46aca8   hub.myenterpriselicense.hpe.com/hpe_eval/swarm-learning/sl:1.2.0     "/usr/bin/python3 -c…"   7 minutes ago    Exited (0) 4 minutes ago              demo-swarm_fd_task-s-3-28d5eacd49052828
95945258880b   user-env-tf2.7.0-swop                                                "python3 model/fraud…"   7 minutes ago    Exited (0) 4 minutes ago              demo-swarm_fd_task-u-2-a7bc64575a5168e4
16c22e3e0633   hub.myenterpriselicense.hpe.com/hpe_eval/swarm-learning/sl:1.2.0     "/usr/bin/python3 -c…"   7 minutes ago    Exited (0) 4 minutes ago              demo-swarm_fd_task-s-2-a7bc64575a5168e4
53da29dea

In [27]:
!docker logs demo-swarm_fd_task-u-3-28d5eacd49052828

2023-01-18 10:57:43.646071: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
***** Starting model = fraud-detection
----------------------------------------------------------------
loading train dataset data-and-scratch/app-data/SB19_CCFDUBL_BAL_TRAIN_2C.csv ..
size of training Data set : 684
----------------------------------------------------------------
loading test dataset data-and-scratch/app-data/SB19_CCFDUBL_BAL_TEST_2C.csv ..
----------------------------------------------------------------
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 1)                 31        
 

In [28]:
!docker logs demo-swarm_fd_task-s-3-28d5eacd49052828

######################################################################
##                    HPE SWARM LEARNING SL NODE                    ##
######################################################################
## © Copyright 2019-2022 Hewlett Packard Enterprise Development LP  ##
######################################################################
2023-01-18 10:57:46,176 : swarm.mlApp : INFO : Creating Autopass License Provider
2023-01-18 10:57:51,610 : swarm.mlApp : INFO : Creating license server
2023-01-18 10:57:51,611 : swarm.mlApp : INFO : Setting license servers
2023-01-18 10:57:51,742 : swarm.mlApp : INFO : Acquiring floating license 1100000378:1
2023-01-18 10:57:53,840 : swarm.mlApp : INFO : Opening pipes to communicate with user container ...
2023-01-18T10:57:53.841651 /tmp/hpe-swarm/demo.3.request.pipe: File exists
2023-01-18T10:57:53.842012 /tmp/hpe-swarm/demo.3.response.pipe: File exists
2023-01-18 10:57:53,843 : swarm.mlCnt : INFO : Setting up SL Container 

Clean up: stop and remove all Swarm Learning containers, then remove the Docker network.

In [29]:
!./scripts/bin/stop-swarm

Stopping all running Swarm Learning containers
No running Swarm Learning containers found
Removing all Swarm Learning containers
95828a46aca8
16c22e3e0633
d14b93e7e6f1
834ce35d62bb
Stopping all running Swarm Network containers
5610c5332a5b
Removing all Swarm Network containers
Error response from daemon: removal of container 5610c5332a5b is already in progress
Stopping all running Swarm Command Interface containers
No running Swarm Command Interface containers found
Removing all Swarm Command Interface containers
No Swarm Command Interface containers found
Stopping all running Swarm Operator containers
0b8a655a5c9c
Removing all Swarm Operator containers
No Swarm Operator containers found


In [30]:
!docker network rm host-1-net

host-1-net


In [31]:
!docker ps

CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
