The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.
Social Media Mining Toolkit (SMMT) main repository
An AI-driven solution for enhancing safety at construction sites. Utilises YOLOv8 for object detection to identify overhead hazards like heavy loads and steel pipes. Alerts are triggered if personnel are detected beneath these hazards. Dataset sourced from Taiwan's construction industry.
A PointRCNN version of SAnE, which is a web-based semi-automatic annotation tool for point cloud data.
A system for prompted weak supervision.
Custom models for named-entity recognition
Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with LLMs, employs Iterative Active Learning for continuous improvement, and integrates CleanLab (Confident Learning) to ensure high-quality datasets and better model performance
Flippers is a weak supervision library for creating high-quality labels using your domain knowledge and weak supervision sources.
Simple Telegram bot to annotate and verify automatic speech recognition datasets
Inference models for Pixano
Supplemental code: Large Language Models for Integrating Social Determinant of Health Data: A Case Study on Heart Failure 30-Day Readmission Prediction
SuperAnnotate HTTP service for Generated Text Detection
AnnoTheia is a data annotation toolkit that identifies when a person speaks in a scene and transcribes their speech, also offering flexibility to replace modules for different languages.
Jaehyung Kim et al.'s ACL 2023 paper on "infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information"
`Shyft` is a time-tracking and data-logging utility designed to assist data annotators with managing and monitoring their service records. It represents the first programmatic offering from ENCLAIM, a burgeoning workers' union dedicated to promoting and protecting data annotators' labor interests as the industry continues to evolve.
Synthetic image generation as training data for instance segmentation and object detection tasks
An example of how to create your own NER dataset for any purpose from the ground up: from raw text collection to data annotation.
Implementation of "Rethinking Interactive Image Segmentation: Feature Space Annotation", Pattern Recognition 2022
The Streamlit tool for the Filament Synthetic QA Pairs project, used to annotate generated data.