
EPAM accelerator to speed up Computer Vision DataSet preparation for Machine Learning model training.


druzhynin-oleksii/model_garden

 
 


Model Prototyping Sequence Diagram

Computer Vision Model Garden

Goals and Prerequisites

Project Goal: Provide a convenient tool for managing Computer Vision datasets in projects that run numerous experiments on image data.

Solved Problem: Computer Vision projects that run numerous experiments on image data usually need to share that data collaboratively and to support a wide range of dataset formats.

Among popular image dataset annotation tools, github.com/opencv/cvat has the largest number of supported formats so far (Pascal VOC, YOLO, COCO, etc.; see the table below).

Computer Vision Annotation Tool (CVAT), open-sourced by Intel, just lacks support for clouds such as AWS. The CVAT team has shared a post saying that AWS support is an "issue in backlog at the moment till we have resources to cover it".

The Model Garden tool is an addition to CVAT providing the following functionality:

  • store image datasets in S3, and reuse and modify them with the CVAT tool
  • collaborative use of datasets through a web interface
  • protect datasets from the consequences of a labeling tool crash

NOTE: The currently supported version of CVAT backend API is 0.6.1.

Top Existing Solutions

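| Usage Order | Tool | Publisher | Web | Cloud | Pascal VOC | YOLO | COCO | MASK | TFRecord | MOT |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | github.com/tzutalin/labelImg | private | N | N | Y | Y | N | N | N | N |
| 2 | github.com/opencv/cvat | Intel | Y | N | Y | Y | Y | Y | Y | Y |
| 3 | github.com/microsoft/VoTT | Microsoft | N | Y | Y | N | N | N | N | N |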
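
For quick lookups, the comparison above can also be written down as data. The following is only a sketch: the dictionary layout and the helper names `supported_formats` and `tools_supporting` are illustrative assumptions, not part of Model Garden. The Y/N flags are transcribed from the table.

```python
from typing import List

# Column order of the format flags, as in the comparison table.
FORMATS = ["Pascal VOC", "YOLO", "COCO", "MASK", "TFRecord", "MOT"]

# One entry per table row; "formats" holds the Y/N flags in FORMATS order.
TOOLS = {
    "github.com/tzutalin/labelImg": {"web": False, "cloud": False, "formats": "YYNNNN"},
    "github.com/opencv/cvat": {"web": True, "cloud": False, "formats": "YYYYYY"},
    "github.com/microsoft/VoTT": {"web": False, "cloud": True, "formats": "YNNNNN"},
}


def supported_formats(tool: str) -> List[str]:
    """Return the names of the formats marked Y for the given tool."""
    flags = TOOLS[tool]["formats"]
    return [name for name, flag in zip(FORMATS, flags) if flag == "Y"]


def tools_supporting(fmt: str) -> List[str]:
    """Return the tools whose row has a Y in the column for `fmt`."""
    return [tool for tool in TOOLS if fmt in supported_formats(tool)]
```

For example, `tools_supporting("COCO")` returns only the CVAT entry, which matches the observation that CVAT covers the widest range of formats.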

Project Support

Model Garden was started as an EPAM Systems internal initiative to support EPAM Computer Vision teams (e.g. Vudoku Accelerator).

The project is open sourced with the support of epam.github.io.

Technical Features

‍🖌️ Material Design: Intuitive UI based on the world's most widespread design language.

🏃 Single Page Application: Fast, responsive UX to get what you need done without waiting for full-page refreshes.

🐍 Python Django and Postgres

🏷 AWS S3 DataSet Gallery

[Figure: Model Garden DataSet Gallery]

[Figure: Model Garden DataSet Saved in S3]

Use Cases

Model Prototyping

Model Garden supports the case when only one data scientist works on the project (doing both labeling and ML training). This can be useful for experimental projects.

Model Prototyping Sequence Diagram

Collaborative DataSet Labeling

Model Garden supports massively parallel labeling, when a manager has a lot of images and a list of labels. In this case, the manager can upload a DataSet, create a list of labels, assign certain parts of the DataSet to different labelers, and then monitor their work.
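
The step of assigning DataSet parts to labelers can be illustrated with a small sketch. It assumes the simplest possible policy, contiguous chunks of near-equal size. The function name `assign_chunks`, the file names, and the labeler names are hypothetical. How Model Garden actually partitions a DataSet is not described here.

```python
from typing import Dict, List


def assign_chunks(images: List[str], labelers: List[str]) -> Dict[str, List[str]]:
    """Split `images` into contiguous, near-equal chunks, one per labeler."""
    if not labelers:
        raise ValueError("at least one labeler is required")
    # The first `extra` labelers receive one image more than the rest.
    base, extra = divmod(len(images), len(labelers))
    assignment: Dict[str, List[str]] = {}
    start = 0
    for index, labeler in enumerate(labelers):
        size = base + (1 if index < extra else 0)
        assignment[labeler] = images[start:start + size]
        start += size
    return assignment


# Hypothetical DataSet of seven images shared by three labelers.
images = [f"img_{i:03d}.jpg" for i in range(7)]
tasks = assign_chunks(images, ["alice", "bob", "carol"])
```

Contiguous chunks keep neighbouring frames with the same labeler, which tends to help when images come from video. A round-robin split would be an equally valid choice.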

Model Prototyping Sequence Diagram

DataFlow

Model Garden is a mediator between CVAT (one of the most popular open-source annotation tools for computer vision) and Amazon S3 (object storage service).

Status Worker is part of Model Garden. This worker checks for event updates from CVAT asynchronously.
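
One way to picture such a worker is a polling loop that compares the latest task statuses with the ones already known. The sketch below is an assumption about the general technique, not the project's actual worker. `poll_statuses` and `fake_fetch` are hypothetical names, the status strings are illustrative, and the stub stands in for a real request to the CVAT API.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# Hypothetical fetcher type: returns {task_id: status} as reported by CVAT.
FetchStatuses = Callable[[], Awaitable[Dict[int, str]]]


async def poll_statuses(fetch: FetchStatuses, known: Dict[int, str],
                        interval: float, rounds: int) -> Dict[int, str]:
    """Poll `fetch` `rounds` times and collect tasks whose status changed."""
    changed: Dict[int, str] = {}
    for _ in range(rounds):
        latest = await fetch()
        for task_id, status in latest.items():
            if known.get(task_id) != status:
                # Record the new status so it is reported only once.
                known[task_id] = status
                changed[task_id] = status
        await asyncio.sleep(interval)
    return changed


async def fake_fetch() -> Dict[int, str]:
    # Stand-in for a real request to the CVAT API.
    return {1: "annotation", 2: "completed"}


known = {1: "annotation", 2: "annotation"}
changed = asyncio.run(poll_statuses(fake_fetch, known, interval=0.0, rounds=2))
```

Because the loop awaits between polls, it can share an event loop with other tasks instead of blocking them.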

Model Prototyping Sequence Diagram

CI/CD

An example of Continuous Deployment to the cloud container registry, including the update of the cloud deployment, is set up with the GitLab CI/CD .gitlab-ci.yml file.

The .gitlab-ci.yml file needs the following GitLab CI/CD environment variables to be set:

DEV_AWS_ACCESS_KEY_ID=<ABCDEFGHIJKLMNOPQRST>*
DEV_AWS_SECRET_ACCESS_KEY=<abcdefghijklmnopqrstuvwxyz0123456789-+/>*
DEV_BACKEND_ECR_URI=123456789000.dkr.ecr.eu-central-1.amazonaws.com/model_garden_backend
DEV_FRONTEND_ECR_URI=123456789000.dkr.ecr.eu-central-1.amazonaws.com/model_garden_frontend

PROD_AWS_ACCESS_KEY_ID='<ABCDEFGHIJKLMNOPQRST>*'
PROD_AWS_SECRET_ACCESS_KEY='<abcdefghijklmnopqrstuvwxyz0123456789-+/>*'
PROD_BACKEND_ECR_URI=123456789000.dkr.ecr.eu-central-1.amazonaws.com/model_garden_backend
PROD_FRONTEND_ECR_URI=123456789000.dkr.ecr.eu-central-1.amazonaws.com/model_garden_frontend

RELEASE_AWS_ACCESS_KEY_ID='<ABCDEFGHIJKLMNOPQRST>*'
RELEASE_AWS_SECRET_ACCESS_KEY='<abcdefghijklmnopqrstuvwxyz0123456789-+/>*'
RELEASE_BACKEND_ECR_URI=123456789000.dkr.ecr.eu-central-1.amazonaws.com/model_garden_backend
RELEASE_FRONTEND_ECR_URI=123456789000.dkr.ecr.eu-central-1.amazonaws.com/model_garden_frontend
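
Before triggering a pipeline, it can be handy to check that none of these variables is missing. The following sketch is not part of the repository, and `missing_variables` is a hypothetical helper. It assumes the twelve names follow the `<STAGE>_<SUFFIX>` pattern listed above, with the last one named `RELEASE_FRONTEND_ECR_URI` like its DEV and PROD counterparts.

```python
from typing import List, Mapping

# Stage prefixes and variable suffixes, taken from the list above.
STAGES = ("DEV", "PROD", "RELEASE")
SUFFIXES = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "BACKEND_ECR_URI",
    "FRONTEND_ECR_URI",
)


def missing_variables(env: Mapping[str, str]) -> List[str]:
    """Return the expected variable names that are unset or empty in `env`."""
    expected = [f"{stage}_{suffix}" for stage in STAGES for suffix in SUFFIXES]
    return [name for name in expected if not env.get(name)]
```

Calling `missing_variables(os.environ)` in a local shell (after `import os`) returns an empty list once every variable is defined.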

Installation

Installation Specifications

Deployment

See all the details in the <model_garden_root>/deploy/README.md.

If CI/CD is set up via .gitlab-ci.yml, the build pipeline starts automatically after commits to the master and develop branches.

Contacts
