Stars
ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and open source.
Graphormer is a general-purpose deep learning backbone for molecular modeling.
This is the code for the paper "Enforcing Deterministic Constraints on Generative Adversarial Networks for Emulating Physical Systems"
A curated list of awesome System Design (A.K.A. Distributed Systems) resources.
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSy…
Large Scale Chinese Corpus for NLP
Chinese version of GPT2 training code, using BERT tokenizer.
A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models
PyTorch implementation of a Variational Autoencoder with Gumbel-Softmax Distribution
GPT2 for Multiple Languages, including pretrained models. Multilingual GPT2 support, with a 1.5-billion-parameter Chinese pretrained model
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, TensorFlow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.
Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!
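PyTorch code for "Multilingual Offensive Language Identification"
🔥 A comprehensive collection of classic programming books, covering: computer systems and networking, system architecture, algorithms and data structures, front-end development, back-end development, mobile development, databases, testing, projects and teams, professional development for programmers, job hunting and interviews, and more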
Code for PaperRobot: Incremental Draft Generation of Scientific Ideas
Deal or No Deal? End-to-End Learning for Negotiation Dialogues
An educational resource to help anyone learn deep reinforcement learning.
This is the official PyTorch code and datasets for the paper "Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News", EMNLP 2020.
Video Grounding and Captioning
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
A set of scripts to grab public datasets from resources related to arXiv
This is a list of open-source projects at Microsoft Research NLP Group
EMNLP 2020: "Dialogue Response Ranking Training with Large-Scale Human Feedback Data"
LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts (AAAI 2019)
A Knowledgeable Stylized Integrated Text Generation Platform
BLEURT is a metric for Natural Language Generation based on transfer learning.