
Support for GPU in NuNET #1

Open
pgwadapool opened this issue Jan 9, 2022 · 5 comments

@pgwadapool

I am looking to run a training algorithm (not inferencing) with NuNET, just as we would in a cloud environment.
The first set of requirements is as follows:
Desktop Requirements

  1. HW:
    a) x86 CPU
    b) At least 1 NVIDIA GPU (NVIDIA GPUs have better support for ML)
    c) At least 32 GB RAM, 1 TB HDD
  2. SW (inside the NuNET Docker image):
    a) OS: Ubuntu 20.04 LTS or 21.04
    b) PyTorch is a must; TensorFlow and CUDA as well
    c) cuDNN support (a quick sanity check is sketched after this list)
  3. API support so that a webpage can interact with NuNET. We want to build something like a Google Colab environment; this is a stretch goal.
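For reference, a minimal sanity check along these lines (assuming PyTorch and the NVIDIA container runtime are present in the image, as listed above) can confirm that the GPU, CUDA and cuDNN stack is visible from inside the container:

```python
# Quick check, runnable inside the NuNET Docker image, that the GPU stack
# listed above (NVIDIA GPU, CUDA, cuDNN) is actually visible to PyTorch.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("CUDA version:", torch.version.cuda)
    print("cuDNN version:", torch.backends.cudnn.version())
```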

Data Requirements:

  1. We will use the MNIST dataset [1] for initial testing.
  2. The data can be local to the Docker container, or accessed from a location specified when choosing NuNET compute (see the sketch after this list).
  3. If we support loading files from a location outside the Docker container, we need to account for some end-to-end encryption. This is not an immediate requirement, though, as MNIST is an open dataset.
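As a sketch of how the data location could be parameterised: the MNIST_ROOT environment variable below is a hypothetical name used only for illustration; it could point to a path baked into the Docker image or to a location supplied when NuNET compute is chosen, and torchvision downloads the dataset if it is not already there.

```python
# Illustrative only: MNIST_ROOT is a hypothetical environment variable, not an
# existing NuNET setting. It selects between data local to the container and a
# path provided when the compute was chosen.
import os
from torchvision import datasets, transforms

data_root = os.environ.get("MNIST_ROOT", "./data")
mnist_train = datasets.MNIST(
    root=data_root,
    train=True,
    download=not os.path.isdir(os.path.join(data_root, "MNIST")),
    transform=transforms.ToTensor(),
)
print(f"Loaded {len(mnist_train)} MNIST training images from {data_root}")
```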

The first test will be built using PyTorch or TensorFlow, depending on what NuNET supports. It is a simple model that learns to classify handwritten digits.
The dataset will be MNIST. We will select 5000 training samples and 500 verification samples, and use 1000 for testing.
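For concreteness, a minimal PyTorch sketch of this first test might look as follows, using the sample counts above; the model and hyperparameters are placeholder choices rather than the final design:

```python
# Sketch of the first MNIST test: 5000 training, 500 verification and 1000
# test samples, with a deliberately simple digit classifier. Batch size,
# epochs and learning rate are placeholder choices.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tfm = transforms.ToTensor()
full_train = datasets.MNIST("./data", train=True, download=True, transform=tfm)
full_test = datasets.MNIST("./data", train=False, download=True, transform=tfm)

train_set = Subset(full_train, range(5000))        # 5000 training samples
val_set = Subset(full_train, range(5000, 5500))    # 500 verification samples (held out)
test_set = Subset(full_test, range(1000))          # 1000 test samples

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128),
                      nn.ReLU(), nn.Linear(128, 10)).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):
    for images, labels in DataLoader(train_set, batch_size=64, shuffle=True):
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss_fn(model(images), labels).backward()
        optimizer.step()

# Accuracy on the 1000 held-out test samples.
with torch.no_grad():
    correct = sum((model(x.to(device)).argmax(1) == y.to(device)).sum().item()
                  for x, y in DataLoader(test_set, batch_size=256))
print("test accuracy:", correct / len(test_set))
```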

[1] Deng, L. (2012). The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6), 141–142. http://yann.lecun.com/exdb/mnist/

@kabirkbr
Contributor

@pgwadapool

Thanks for the initial specs. They seem rather clear now, and we're good to go further. To give an idea of how we map use-case integrations:

  • Basically, NuNet connects 5 stakeholder types in a multi-sided market, and the first thing we do is map each use-case specification to these stakeholders. In this case, I suppose the mapping looks as follows:

(Diagram: multi-sided_market_general_3 — NuNet multi-sided market stakeholder mapping)

Which means that we have five stakeholders in this case:

  1. Compute providers (HW in your description). So we know what kind of hardware is needed; this will have to be declared in the container that results from the item below. We also have to make sure that such hardware (or more powerful) is onboarded on the platform for the use-case to actually work;
  2. AI developer (SW in your description), who will make sure that the required ML code is correctly written and packaged into a Docker container according to SNET and NuNet specifications; we will provide the initial specifications and then work on them further;
  3. Application logic -- this is what you refer to as a Google Colab-style webpage, from which you will run, monitor and access the results of ML workflows;
  4. Data providers -- as you explained, the MNIST dataset;
  5. Data storage providers -- first of all, we do not want to store data in the Docker container (we can, but that would not be optimal from the platform-logic perspective, and not optimal in general). There are a few options here: we can get the data directly from the source, if it has an accessible weblink, or we can host it somewhere else (on IPFS, or at worst on a separate Docker volume on NuNet).

Do you agree with this mapping, @pgwadapool? If yes, we can move to the next step and start clarifying details (see the next comment).

cc:// @JacKingt0n

@kabirkbr
Contributor

Additionally @pgwadapool , could you please provide answers to the following questions:

  1. Which of the 5 sides will you be able to cover? Based on our conversation I presume that would be 2 (SW), possibly 1 (HW), and I think also 3 (application logic). Regarding the last, NuNet definitely cannot cover application logic, as we are infrastructure providers for the backend. I think we can agree provisionally that we will cover sides 4 and 5, related to data provision (although we may see down the road that some more involved integration is needed);
    (btw, I think we already have end-to-end encryption for data on the move, which does not make a lot of sense for public datasets that are anyway not encrypted at rest...)

  2. We will need to work out the API access requirements for the application. All NuNet containers are accessible and we will provide this access, but the current API may need to be adjusted to the requirements of the webapp you have in mind; we'd expect these requirements to come from you (or at least to start being formulated from your side);

  3. Related to the previous question, could you describe the webapp that you would like to build? This will help us better understand how you want to interact with the ML workflows running on the backend. For reference, you can look at the [Fake News Warning showcase application](https://medium.com/nunet/nunet-private-alpha-part-1-fake-news-warning-showcase-application-43aadce4eb4d), which is a browser plugin interacting with containers running on NuNet. We can certainly adapt this to other types of webapps.

Based on these answers, we will pick the platform features from our roadmap that are needed for this use-case to work and prioritize them accordingly. Note that these platform features are application-agnostic, so we will develop them for all subsequent applications that may run on NuNet (i.e. they will not be specific to this use-case only).

@pgwadapool
Author


Yes, this mapping is fine with me.

@pgwadapool
Author


I should be able to cover 1, 2 and 3. For the webapp API, I will start formulating the requirements. The high-level view of the webapp is: when I wish to start training, I will launch it from the webapp; once training is complete, the webapp should be able to launch the test, which will use some handwritten-digit images. I think this will be similar to the "Dog detection" demo we had. The main difference is that in the private alpha we mainly focused on the inferencing task, whereas in my use case we also need to train the ML model before inferencing. The webapp will provide the necessary inputs for training the model. The model can be assumed to be available in SingularityNET.
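To make the flow concrete, here is a purely illustrative sketch of the train-then-test interaction from the webapp's side. The endpoint names and base URL are hypothetical placeholders for whatever API we agree on, not an existing NuNet or SingularityNET interface:

```python
# Hypothetical webapp-side flow: submit a training job, wait for it to finish,
# then send a handwritten-digit image for testing. Endpoints and the base URL
# are placeholders, not a real NuNet API.
import time
import requests

BASE = "http://nunet-backend.example:8080"  # placeholder service address

# 1. Launch training with the inputs the webapp collects from the user.
job = requests.post(f"{BASE}/train", json={"dataset": "mnist", "epochs": 3}).json()

# 2. Poll until training is complete.
while requests.get(f"{BASE}/status/{job['id']}").json()["state"] != "done":
    time.sleep(10)

# 3. Launch a test against the trained model with a handwritten-digit image.
with open("digit.png", "rb") as f:
    result = requests.post(f"{BASE}/predict", files={"image": f}).json()
print(result)
```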

@pgwadapool
Author

https://gitlab.com/nunet/jira-import/-/issues/126
The effort is tracked in the GitLab issue above.
(Screenshot attached: Screen Shot 2022-07-14 at 7 51 31 AM)

As of today we are able to onboard a GPU, run TensorFlow and PyTorch, git-clone a repo containing ML code, and have run Fashion-MNIST training.
