Open Grant Proposal: Machine Learning labeled Dataset NFT storage and sharing via IPFS_Filecoin #263

Merged (6 commits) on Apr 18, 2022
@@ -0,0 +1,89 @@
# Open Grant Proposal: Machine Learning Labeled Dataset NFT storage and sharing via IPFS_Filecoin

**Name of Project:**
Machine Learning labeled Dataset NFT storage and sharing via IPFS_Filecoin

**Proposal Category:** `app-dev`

**Proposer:** `TuninsightBlockchain`

**Do you agree to open source all work you do on behalf of this RFP and dual-license under MIT and APACHE2 licenses?:** Yes

# Project Description

Data labeling, or annotation, is the process of adding metadata to a dataset. This metadata usually takes the form of tags, which can be added to any type of data, including text, images, and video. Adding comprehensive and consistent tags is a key part of developing a training dataset for machine learning. We expect deep learning data annotation to become a 100-billion-dollar market; it is already fairly mature in China and India. Machine learning providers such as AWS employ large numbers of people to annotate clients' datasets, but they cannot guarantee that a dataset delivered to one client is unique and cannot be passed on to other clients without permission.

With our solution, a dataset can be minted as an NFT and stored on the Filecoin network, which makes the dataset unique, transferable only with permission, and able to return a share of the profit to its original author.

- IPFS enables low-cost transmission of datasets across continents.
- The NFT ensures that the original dataset owner has the right to choose how many copies to sell and which Decentralized ID owners to sell them to, so that the owner can share in the profit from usage. The NFT also serves as a certificate of the owner's quality of service and brand.
- Filecoin stores the important annotated dataset safely on the network and gives the owner the choice to discontinue storage after the contract expires.
- Through integration with our business partner FilSwan, we can in the future move the dataset close to the required computing resources, saving both bandwidth and cost as a turnkey solution for machine learning developers.
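
The ownership rules described in the list above (a capped number of copies, sales only to approved Decentralized ID owners, and a profit share for the original author) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `DatasetNFT` class, its field names, and the percentage-based share are hypothetical and are not part of the proposal or of any IPFS, Filecoin, or NFT library.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetNFT:
    """Hypothetical record for an annotated dataset minted as an NFT."""
    owner_did: str            # Decentralized ID of the original dataset owner
    content_id: str           # placeholder for the dataset's IPFS content identifier
    max_copies: int           # number of copies the owner chooses to sell
    author_share_pct: int     # owner's share of each sale, in whole percent
    approved_buyers: set = field(default_factory=set)  # DIDs allowed to buy
    holders: list = field(default_factory=list)        # DIDs that hold a copy

    def approve(self, buyer_did: str) -> None:
        """The owner decides which Decentralized ID owners may buy."""
        self.approved_buyers.add(buyer_did)

    def sell(self, buyer_did: str, price: int) -> int:
        """Record a sale and return the amount owed to the original owner."""
        if buyer_did not in self.approved_buyers:
            raise PermissionError(f"{buyer_did} is not approved by the owner")
        if len(self.holders) >= self.max_copies:
            raise ValueError("all permitted copies have been sold")
        self.holders.append(buyer_did)
        # Integer arithmetic keeps the share exact for whole-unit prices.
        return price * self.author_share_pct // 100


# Example with made-up identifiers and amounts.
nft = DatasetNFT(
    owner_did="did:example:annotator",
    content_id="placeholder-content-id",
    max_copies=1,
    author_share_pct=10,
)
nft.approve("did:example:lab-a")
owed = nft.sell("did:example:lab-a", price=200)
```

In a real deployment these checks would live in a smart contract rather than in application code; the sketch only shows which conditions the owner controls.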


## Value

As the need for and importance of machine learning grow across industries, data ownership and secure transmission become key requirements. Machine learning data has distinctive characteristics, such as the annotation demand of supervised learning. Datasets also need encryption, and they are normally enormous, which makes cross-continent transmission extremely slow and means that computation and storage have to be geographically close to each other. We cannot attract enough industry users unless we solve these essential problems. Solving them would provide a strong enterprise-level example that attracts AI and big data companies to the Filecoin network and benefits the Filecoin ecosystem.

Risks: A customer may transfer the NFT together with the encryption key to other customers, which creates a risk of abuse of the system.


## Deliverables
- Users can get data from the website.
- Users can annotate the dataset.
- Users can mint the dataset as an NFT with a permitted number of keys.
- Users can share the NFT via IPFS and back it up to the Filecoin network.
- Developers can receive the NFT and the keys for decryption.
- (Optional) Developers can use computing resources on the Filecoin network for machine learning.
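
The deliverables above describe a flow from annotation to key handover, which can be sketched roughly as follows. This is a standard-library-only Python illustration under stated assumptions: the function names (`annotate`, `package`, `mint`, `issue_key`) are hypothetical, the SHA-256 digest merely stands in for an IPFS content identifier, the keys are random tokens, and no actual encryption or network upload is performed.

```python
import hashlib
import json
import secrets


def annotate(samples, labels):
    """Attach one tag to each raw sample, producing labeled records."""
    return [{"data": s, "label": l} for s, l in zip(samples, labels)]


def package(records):
    """Serialize records deterministically and fingerprint the result.

    The hex digest is only a stand-in for an IPFS content identifier.
    """
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return payload, hashlib.sha256(payload).hexdigest()


def mint(fingerprint, permitted_keys):
    """Create a hypothetical NFT record with a fixed pool of access keys."""
    return {
        "fingerprint": fingerprint,
        "unissued_keys": [secrets.token_hex(16) for _ in range(permitted_keys)],
        "issued": {},  # developer id -> key handed to that developer
    }


def issue_key(nft, developer_id):
    """Hand one of the permitted keys to a developer, if any remain."""
    if not nft["unissued_keys"]:
        raise RuntimeError("no permitted keys left for this NFT")
    key = nft["unissued_keys"].pop()
    nft["issued"][developer_id] = key
    return key


# Example run with made-up file names and labels.
records = annotate(["img_001.png", "img_002.png"], ["cat", "dog"])
payload, fingerprint = package(records)
nft = mint(fingerprint, permitted_keys=2)
key = issue_key(nft, "dev-alice")
```

Fixing the key pool at mint time is one simple way to model "a permitted number of keys"; an actual implementation would also need to encrypt the payload with a vetted cryptography library before sharing it.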


## Development Roadmap

| No. | Milestone Description | Funding (USD) | Estimated Timeframe |
| --- | --- | --- | --- |
| 1 | Project functionality design and scope definition | NA | 2 weeks |
| 2 | Project user interface design | NA | 1 week |
| 3 | Infrastructure architecture design and implementation plan readiness | NA | 1 week |
| 4 | Command line tool coding | 10,000 | 4 weeks |
| 5 | User interface coding | 10,000 | 3 weeks |
| 7 | Documentation and tutorial | TBD | 1 week |
| 8 | Product release and go-to-market | 10,000 | 1 week |


## Total Budget Requested
$30,000 in total. Please see the development roadmap for details.

## Maintenance and Upgrade Plans

We will add support for more data transmission options, and we will continue to take feedback from the community and make improvements.

# Team

## Team Members

- Zhenglin Xiong
- Shaun Li
- Pingao Wang
- Charles Cao (FilSwan support)


## Team Member LinkedIn Profiles

- https://www.linkedin.com/in/zhenglinxiong/
- https://www.linkedin.com/in/shaunlipy/
- https://www.linkedin.com/in/pingao-wang/
- https://www.linkedin.com/in/charles-cao-09a79526/

## Team Website
https://www.tuninsight.com/

## Relevant Experience
Tuninsight AI is a Montreal-based artificial intelligence solution provider that develops intelligent medical diagnostic technologies and AI-powered industrial visual inspection and action detection products based on massive data. Our AI developers have many years of experience in deep learning research and development.

## Team code repositories

- https://github.com/pingaowang/detectron2.git
- https://github.com/pingaowang/yolo2-pytorch.git

# Additional Information
Tuninsight AI won first place in the 2018 Create @ Alibaba Cloud Startup Contest North America Final in Silicon Valley, receiving a $50,000 award from Alibaba Cloud. Tuninsight was also invited to represent the Canadian delegation at the United Nations AI for Good Global Summit.