netroid314/ASWCS_back

Repository Info

This is the repository of the DAIG (Distributed A.I Grid) project server program. It is based on Django; the project uses TensorFlow for model training.


What is DAIG?

DAIG (Distributed A.I Grid) is a distributed machine learning system based on deep learning. Deep learning methods usually require more training time than other machine learning methods. One way to shorten training is to use multiple GPUs, but that is expensive. So, instead of multiple GPUs, we tried to use other people's idle PC resources.


How does DAIG work?

The DAIG system consists of a Learning requestor, Resource providers, and a Management server. The Learning requestor creates a project and uploads training data to the Management server. The Management server then distributes training data shards and model information to the registered Resource providers. When all training data shards have been used for learning, the Management server saves the final model and weights to object storage. The Learning requestor can download the trained model at any time.
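The sharding step described above can be illustrated with a small sketch. It assumes the training data is held as NumPy arrays and that shards are simply contiguous, roughly equal slices; the function name `make_shards` and the shard count are hypothetical and are not taken from the DAIG code.

```python
import numpy as np


def make_shards(features, labels, num_shards):
    """Split a dataset into roughly equal shards, keeping features and labels aligned."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    # np.array_split tolerates sizes that do not divide evenly.
    feature_parts = np.array_split(features, num_shards)
    label_parts = np.array_split(labels, num_shards)
    return list(zip(feature_parts, label_parts))


# Toy data: 10 samples with 2 features each (illustrative only).
x = np.arange(20).reshape(10, 2)
y = np.arange(10)
shards = make_shards(x, y, num_shards=3)
for index, (shard_x, shard_y) in enumerate(shards):
    print(index, shard_x.shape, shard_y.shape)
```

Each shard could then be handed to a different Resource provider. How DAIG actually sizes and assigns shards may differ from this sketch.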

DAIG structure

[Figures: two DAIG structure diagrams (image, image2)]


DAIG server dependency

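| Name     | Version      | Usage                  |
|----------|--------------|------------------------|
| Django   | 3.1.7        | for server development |
| boto3    | 1.17.67      | for object storage     |
| numpy    | 1.19.5       | for data manipulation  |
| requests | 2.25.1       | for http communication |
| h5py     | 3.1.0        | for model saving       |
| iamport  | (not listed) | for pay procedure      |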
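One way to check an environment against the versions listed above is sketched here. It uses only the standard library `importlib.metadata` module (Python 3.8 or later). The helper names `installed_version` and `compare_versions` are hypothetical, and iamport is left out because the table gives no version for it.

```python
from importlib import metadata

# Versions copied from the dependency table above.
EXPECTED = {
    "Django": "3.1.7",
    "boto3": "1.17.67",
    "numpy": "1.19.5",
    "requests": "2.25.1",
    "h5py": "3.1.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected):
    """Map each package name to (expected version, installed version, match flag)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        report[name] = (wanted, found, found == wanted)
    return report


for name, (wanted, found, ok) in compare_versions(EXPECTED).items():
    status = "ok" if ok else "mismatch"
    print(f"{name}: expected {wanted}, found {found} ({status})")
```

A mismatch does not necessarily mean the server will fail; the sketch only reports differences from the listed versions.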

How does DAIG's distribution work?

We built DAIG's distribution and result-gathering system on K-batch sync SGD. It gathers the trained gradients using an all-reduce method. The K-batch size can be set by the Learning requestor, so the final result is also controlled by the Learning requestor.
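As a rough illustration of the idea, the sketch below follows K-batch sync SGD as it is commonly described: the server waits until K mini-batch gradients have arrived, averages them, and applies one update. This is an assumption about the general technique, not the DAIG implementation. The function name, the plain averaging (standing in for the reduction step, not a real all-reduce), and the learning rate are all hypothetical.

```python
import numpy as np


def k_batch_sync_step(weights, arrived_gradients, k, learning_rate):
    """Apply one SGD update using the first k gradients in arrival order."""
    if len(arrived_gradients) < k:
        raise ValueError("need at least k gradients before updating")
    # Only the first k arrivals count; later (slower) gradients are ignored.
    first_k = np.stack(arrived_gradients[:k])
    mean_gradient = first_k.mean(axis=0)
    return weights - learning_rate * mean_gradient


# Toy example: three gradients arrive, but only the first two are used (k=2).
weights = np.array([1.0, 2.0])
arrived = [
    np.array([1.0, 1.0]),
    np.array([3.0, 3.0]),
    np.array([100.0, 100.0]),  # a straggler whose result is not used
]
new_weights = k_batch_sync_step(weights, arrived, k=2, learning_rate=0.1)
print(new_weights)
```

A larger K averages more gradients per update at the cost of waiting longer, which is one reason the choice of K can affect the final result.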


How to use DAIG?

This is the server program, so you should also check "https://github.com/netroid314/ASWCS_front".


How to launch the server?

First, install the Python libraries listed above, either one by one or from the requirements file. Then launch the Django server with manage.py, for example: python manage.py runserver 0.0.0.0:8000. Refer to the Django documentation for more detail.


Some points of DAIG server

One way to handle NumPy files over HTTPS

How K-batch sync SGD is established

However, DAIG also focuses on balance among Resource providers, so depending on the situation it may not be pure K-batch sync SGD.
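For the first point above, a common way to move a NumPy array over HTTP(S) is to serialize it to bytes in the `.npy` format and rebuild it on the other side. The sketch below shows only that round trip and is an assumption about the approach, not the DAIG code; the helper names `array_to_bytes` and `bytes_to_array` are hypothetical.

```python
import io

import numpy as np


def array_to_bytes(array):
    """Serialize an array to .npy bytes suitable for an HTTP request body."""
    buffer = io.BytesIO()
    # allow_pickle=False refuses object arrays, so no pickled data is written.
    np.save(buffer, array, allow_pickle=False)
    return buffer.getvalue()


def bytes_to_array(payload):
    """Rebuild an array from .npy bytes received over the network."""
    return np.load(io.BytesIO(payload), allow_pickle=False)


# Toy payload standing in for a gradient or weight array.
gradient = np.linspace(0.0, 1.0, 6, dtype=np.float32).reshape(2, 3)
payload = array_to_bytes(gradient)
restored = bytes_to_array(payload)
print(restored.shape, restored.dtype, len(payload))
```

On the sending side, the bytes could be passed as the `data` argument of `requests.post`, and a Django view could read them from `request.body`. The endpoint and any headers would be project-specific and are not specified here.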


Caution!

This project was developed by Korean developers, so some comments are in Korean. Also, this is the server program, so please check https://github.com/netroid314/ASWCS_front as well.

About

This is the repository for DAIG backend development.
