HuiLi

Stars

Dataset

19 repositories

A collection of research resources on "Natural Language to Logical Form".

214 31 Updated May 2, 2022

A large annotated semantic parsing corpus for developing natural language interfaces.

HTML 1,717 326 Updated Jul 23, 2023
128 3 Updated May 27, 2023

ChatGPT Chinese corpus: dialogue, fiction, and customer-service corpora for training large models.

891 140 Updated May 15, 2024

📝 An awesome collection of Chinese legal datasets and related resources, dedicated to gathering comprehensive Chinese legal data sources.

855 81 Updated Jun 20, 2023

WanJuan 1.0 multimodal corpus

557 28 Updated Oct 20, 2023

MD5 links for a Chinese book corpus

Python 217 23 Updated Jan 31, 2024
Python 133 19 Updated Jun 20, 2024

PMC-Patients: A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems.

Python 65 2 Updated Dec 20, 2023

A Chinese National Medical Licensing Examination dataset and large language model benchmarks

Python 60 8 Updated Dec 2, 2023

OLAPH: Improving Factuality in Biomedical Long-form Question Answering

Python 39 4 Updated Sep 10, 2024

ICML'2024 | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

Python 106 3 Updated Jul 18, 2024

A benchmark for few-shot evaluation of foundation models for electronic health records (EHRs)

Jupyter Notebook 162 17 Updated Feb 17, 2025

A Large-Scale In-the-wild Dataset for Plant Disease Segmentation

Python 29 3 Updated Mar 25, 2025

A reading list on LLM based Synthetic Data Generation 🔥

1,238 71 Updated Feb 20, 2025

VizNet is a repository providing real-world datasets that enable, among other things, (re)running empirical studies with higher ecological validity

Jupyter Notebook 81 22 Updated Jan 5, 2023

This repo includes the introduction, code, and dataset for our paper Deep Sequence Learning with Auxiliary Information for Traffic Prediction (KDD 2018).

Python 231 80 Updated Jul 15, 2020