IDunno - Fault-tolerent Distributed ML Cluster

Architecture Overview

Python Environment Setup

In root folder mp4, run

python3 -m venv .env

source .env/bin/activate

.env/Scripts/activate

pip install -r requirement.txt

Test installation

python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('we love you'))"

Expected output:

[{'label': 'POSITIVE', 'score': 0.9998704791069031}]

Go Environment Setup

To compile the project, you need to have Golang installed on your machine. You can download Golang from here. Once you have Golang installed, you need to download dependency gRPC and Protobuf. You can do this by running the following command in your terminal:

go get -u google.golang.org/grpc
go get -u github.com/golang/protobuf/protoc-gen-go

After you have all the dependencies installed, you can compile the project by running the following command in your terminal:

make all

This will generate the executable files in the bin folder.

Start Running Server

You can run the executable file by running the following command in your terminal:

Start DNS Server

cd bin
./dns

Start IDunno Inference Engine

cd bin
./idunno

IDunno server automatically starts python gRPC server for inference.

How to Use

You need to run only one dns executable file. You can run multiple idunno executable files. Each idunno executable file will be a host server. Here is a list of commands you can use in the idunno executable file:

join            # Join the ring
leave           # Leave the ring
list_mem        # List all members in the ring
clear           # clear all content from log file
stat            # print statistics of Bps read/write, #pings, #failures, system elapsed time
low_droprate    # Set the drop rate to low (0.03)
mid_droprate    # Set the drop rate to medium (0.3)
high_droprate   # Set the drop rate to high (0.9)

Here are more commands for the SDFS:

get <sdfsfilename> <local_filename>                         # Get a file from the SDFS
put <local_filename> <sdfsfilename>                         # Put a local file to the SDFS
putdir <local_directory> <sdfs_directory>                   # Put a local directory with all files inside to the SDFS
delete <sdfsfilename>                                       # Delete a file from the SDFS
deldir <sdfs_directory>                                     # Delete a directory with all files inside from the SDFS
ls <sdfsfilename>                                           # List the servers storing the file
store                                                       # List all files stored in the current server
get-versions <sdfsfilename> <num versions> <localfilename>  # Retrieve the last num versions of the file

Here are more commands for the IDunno Learning Cluster:

train <model> <dataset>     # train a model on specified dataset
serve <model> <batch_size>  # start inference on model with a batch size
w                           # display all workers
j                           # display all running jobs & their current states
ij <job_id>                 # display a particular job's current inference output and accuracy
cj                          # display all completed jobs
ijs <job_id>                # displat a particular job's query rate and query processing time statistics
qps [global|local]          # change scheduling mode to be local or global

Configure Frontend UI Dashboard

To provide a better user experience in viewing real-time updates of IDunno system, we built a frontend dashboard using React and TypeScript. Here are steps to start frontend dashboard:

Start HTTP backend server

Backend server automatically starts after executing ./dns.

Start Frontend UI

Go to ./frontend. Run

pnpm install
pnpm run dev

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
api		api
backend		backend
bin		bin
data		data
dns		dns
frontend		frontend
idunno		idunno
inference		inference
logger		logger
ralloc		ralloc
ring		ring
sdfs		sdfs
utils		utils
.gitattributes		.gitattributes
CS425 MP4 Report.pdf		CS425 MP4 Report.pdf
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
go.mod		go.mod
go.sum		go.sum
logo.jpg		logo.jpg
requirement.txt		requirement.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

IDunno - Fault-tolerent Distributed ML Cluster

Architecture Overview

Python Environment Setup

Go Environment Setup

Start Running Server

Start DNS Server

Start IDunno Inference Engine

How to Use

Configure Frontend UI Dashboard

Start HTTP backend server

Start Frontend UI

Screenshots of Dashboard UI

About

Releases

Packages

Languages

License

Toubat/IDunno

Folders and files

Latest commit

History

Repository files navigation

IDunno - Fault-tolerent Distributed ML Cluster

Architecture Overview

Python Environment Setup

Go Environment Setup

Start Running Server

Start DNS Server

Start IDunno Inference Engine

How to Use

Configure Frontend UI Dashboard

Start HTTP backend server

Start Frontend UI

Screenshots of Dashboard UI

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages