Data preparation code for building Kaldi ASR system
Updated Mar 18, 2017 - Python
A tool to download and format the PASCAL VOC 2007 dataset for multilabel classification
A tool to download and format the NUS-WIDE dataset for multilabel classification
A tool to download and format the MS COCO dataset for multilabel classification
Middleware for the pipeline between a dataset and a TensorFlow classifier
A single library to (down)load all existing sign language video datasets.
Manage datasets for data science projects
Tool for reading and writing datasets of tensors in a Lightning Memory-Mapped Database (LMDB). Designed to manage machine learning datasets with fast reading speeds.
A Python tool to perform operations on specific datasets (i.e. the APP dataset and the IMDB dataset)
A single library to (down)load all existing sign language handshape datasets.
Tool for managing datasets of images with compositional semantics, part of VisSE project.
[WIP] VoiceSmith makes training text to speech models easy.
Extraction tool to parse the MS Celeb dataset
Multi-Language Dataset Cleaner/Creator for Mozilla's DeepSpeech Framework
Python library for handling audio datasets.
A tool for downloading images from public image boards (those that allow scraping), previewing your images and tags, and editing them. Additional tabs download other code repositories as well as S.O.T.A. diffusion and auto-tag/caption models. Custom datasets can be added!
Utility for constructing highly efficient in-memory / on-disk datasets.
Machine learning library for classification tasks
Scripts to automate and standardize dataset handling