For scheduler and worker
- Install conda virtual environment.
- Run the following under the worker directory:
    conda env create --prefix cs230 -f configuration.yaml
- Activate the environment.
- Install the common library by running the following under the common directory:
    pip install -e .
- Install torch and torchvision with pip.
For FTP server and RabbitMQ broker
Deploy both with Docker, using these images from Docker Hub:
- rabbitmq:3.12-management
- garethflowers/ftp-server
For users
- Install the common library by running the following under the common directory:
    pip install -e .
The RabbitMQ broker, the FTP server, and the GPU capacity of each worker are configured in the worker/config.json file:
"broker" : {
  "broker_host": "18.119.97.104",
  "broker_port": "5673",
  "topics": {
    "broker_scheduling_topic": "node_1_scheduling"
  }
},
...
"ftp" : {
  "ftp_host": "169.234.56.23",
  "ftp_port": "21"
},
"workers": {
  "1": {
    "GPU": 8589934592
  },
  "2": {
    "GPU": 2147483648
  },
  "3": {
    "GPU": 0
  }
},
...
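As a rough illustration of how these fields might be read, the sketch below parses a fragment shaped like the one above with Python's standard json module. The helper name summarize_config is hypothetical, the elided sections (...) are simply left out, and treating the GPU value as a byte count (so 8589934592 would be 8 GiB) is an assumption based on the magnitudes shown, not something the project states.

```python
import json

# Sample fragment mirroring the fields shown above; elided sections are omitted.
SAMPLE_CONFIG = """
{
  "broker": {
    "broker_host": "18.119.97.104",
    "broker_port": "5673",
    "topics": {"broker_scheduling_topic": "node_1_scheduling"}
  },
  "ftp": {"ftp_host": "169.234.56.23", "ftp_port": "21"},
  "workers": {"1": {"GPU": 8589934592}, "2": {"GPU": 2147483648}, "3": {"GPU": 0}}
}
"""

GIB = 1024 ** 3  # assumption: GPU capacity is expressed in bytes


def summarize_config(text):
    """Hypothetical helper: extract endpoints and per-worker GPU capacity."""
    cfg = json.loads(text)
    broker = cfg["broker"]
    ftp = cfg["ftp"]
    return {
        # Ports are stored as strings in the file, so convert them to int.
        "broker": (broker["broker_host"], int(broker["broker_port"])),
        "ftp": (ftp["ftp_host"], int(ftp["ftp_port"])),
        "topic": broker["topics"]["broker_scheduling_topic"],
        # Worker IDs are JSON object keys, so they stay strings here.
        "gpu_gib": {wid: w["GPU"] / GIB for wid, w in cfg["workers"].items()},
    }


summary = summarize_config(SAMPLE_CONFIG)
```

In a real deployment the text would come from worker/config.json on disk rather than an inline string. Note that worker IDs are strings in the parsed result, so an ID passed on the command line can be used as a key without conversion.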
Scheduler
Three scheduling algorithms are available. Pass one of them as the argument:
python scheduler.py [next-available, round-robin, priority-based]
e.g.
python scheduler.py next-available
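How scheduler.py actually parses its argument is not shown here. As a minimal sketch, a command line of this shape could be validated with the standard argparse module so that only the three listed names are accepted. The function parse_algorithm and the constant ALGORITHMS are hypothetical names used for illustration only.

```python
import argparse

# The three algorithm names listed above.
ALGORITHMS = ("next-available", "round-robin", "priority-based")


def parse_algorithm(argv):
    """Hypothetical helper: return the chosen algorithm, rejecting unknown names."""
    parser = argparse.ArgumentParser(prog="scheduler.py")
    parser.add_argument(
        "algorithm",
        choices=ALGORITHMS,
        help="scheduling algorithm to use",
    )
    # argparse exits with an error message if the value is not in ALGORITHMS.
    return parser.parse_args(argv).algorithm


chosen = parse_algorithm(["next-available"])
```

Restricting the positional argument with choices means a misspelled algorithm name fails immediately with a usage message instead of surfacing later during scheduling.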
Worker
python daemon.py [worker_id]
e.g.
python daemon.py 1
See tester.py to learn how a user should send a new task request to the scheduler and retrieve the results.