Turn txt files into an instruction dataset, using Oobabooga's text generation webui to add metadata.
Updated Apr 30, 2024 - Python
Tool to convert datasets from "Benchmark Data Sets for Graph Kernels" (K. Kersting et al., 2016) into a format suitable for deep learning research.
Marktplaats.nl (Dutch Classifieds) Listing Scraper
Kvasir-SEG: A Segmented Polyp Dataset
iOS application for creating datasets for Machine Learning projects
Persian irony detection: includes a Persian dataset, automatic dataset creation, and fine-tuning of transformer-based language models for the task
This repository contains Jupyter notebooks detailing the experiments conducted in our research paper on Ukrainian news classification. We introduce a framework for simple classification dataset creation with minimal labeling effort, and further compare several pretrained models for the Ukrainian language.
A collection of scripts to create an audio dataset with energy-based segmentation.
A dataset creation tool to aggregate, sort and label large volumes of architectural imagery.
PixelPruner is a user-friendly image cropping app for AI-generated art. It supports PNG, JPG, JPEG, and WEBP formats. Easily crop, preview, and manage images with interactive previews, thumbnail views, rotation tools, and customizable output folders. Streamline your workflow and achieve perfect crops every time with PixelPruner.
Deliverables relating to the Advanced Computer Vision for AI University Unit
A set of tools to generate and label datasets from academic papers
Python 3 script and API that crawl Google Images to create a giant image dataset.
Scrapes Wikipedia for Disney movies to build a Disney movies dataset, then cleans the data so the resulting JSON can be used for further data analysis
Development of a Face Tracking Pipeline for lower face tracking RGB HMCs
The script for parsing sankakucomplex
Simple terminal application to record speech datasets
Tartare: Make a homebrew image dataset for machine learning.
Through this project, ONC, in partnership with the National Institutes of Health (NIH) National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), advanced the application of AI/ML in patient-centered outcomes research (PCOR) by generating high-quality training datasets for a chronic kidney disease (CKD) use case – predicting mortality …
A simple project that creates a dataset of news headlines with Primary Category, Secondary Category, Date, Day, Month, Year, Sentiment, SentimentPolarity, Emotion, and Url. All news headlines are scraped from the Punch newspaper and sorted into a CSV file.