A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷 (Providing higher-quality, richer, and more easily "digestible" data for large models!)
Updated Jul 25, 2024 - Python
ms-swift: Use PEFT or full-parameter training to fine-tune 300+ LLMs or 50+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
Code for KDD'20 "Generative Pre-Training of Graph Neural Networks"
The official repo for the paper "HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model"
Paper List of Pre-trained Foundation Recommender Models
Awesome resources for in-context learning and prompt engineering: mastery of LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date and cutting-edge updates.
Official implementation for "UniST: A Prompt-Empowered Universal Model for Urban Spatio-Temporal Prediction" (KDD 2024)
[MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models
Trending projects & awesome papers about data-centric LLM studies.
Fine-tuning a Mistral model to accurately timestamp relevant YouTube video segments, later labelled using a trained IDM.
This repository contains the Python package for Helical.
[NeurIPS 2020] "Graph Contrastive Learning with Augmentations" by Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang Shen
[Survey] Masked Modeling for Self-supervised Representation Learning on Vision and Beyond (https://arxiv.org/abs/2401.00897)
[RSS 2024] Learning Manipulation by Predicting Interaction
Official Repository for The Paper, Adversarial-MidiBERT: Symbolic Music Understanding Model Based on Unbias Pre-training and Mask Fine-tuning
The official repo for [NeurIPS'23] "SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model"
[ICML2024] Unified Training of Universal Time Series Forecasting Transformers
Efficient Network Traffic Classification via Pre-training Unidirectional Mamba
Awesome multi-modal large language model papers/projects, plus collections of popular training strategies, e.g., PEFT, LoRA.
[ECCV 2024] DSMix: Distortion-Induced Sensitivity Map Based Pre-training for No-Reference Image Quality Assessment