feat: add 5 China authoritative data sources (AM batch 2026-05-06) #211
Merged
mingcha-dev merged 1 commit into MLT-OSS:main on May 6, 2026
Add 5 new Chinese authoritative data sources spanning research, health regulation, nutrition, scientific terminology, and science popularization:

- china-cngbdb: 国家基因库生命大数据平台 (China National GeneBank DataBase). Unified life science big data platform run by China National GeneBank (CNGB, Shenzhen); hosts the CNSA sequence archive, nucleotide/protein databases, literature, samples, and bioinformatics tools.
- china-cmde: 国家药品监督管理局医疗器械技术审评中心 (Center for Medical Device Evaluation, NMPA). Technical review authority for Class II/III and imported medical devices; publishes registration, review-status, technical-review guidance, and IVD/innovation-device records.
- china-cns: 中国营养学会 (Chinese Nutrition Society). National learned society (est. 1945); publisher of the authoritative Chinese Dietary Reference Intakes (DRIs), the Dietary Guidelines for Chinese Residents, and population nutrition survey results.
- china-termonline: 术语在线 (TermOnline, China National Terminology Service Platform). Operated by CNCTST (全国科学技术名词审定委员会); 500,000+ standardized Chinese scientific/technical terms with English equivalents across 100+ disciplines.
- china-cstm: 中国科学技术馆 (China Science and Technology Museum). China's national comprehensive science museum under CAST; publishes science literacy indicators, mobile/digital science museum data, and science education resources.

All sources verified: websites return 200/202/302, no blacklist or existing-website duplicates, make check passes (700 unique IDs).
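The verification summarized above (accepted HTTP statuses and unique IDs) could be sketched roughly as follows. This is only an illustration and not the repository's actual make check: the function names and the idea of checking bare status codes and ID strings are assumptions, and only the status values and the five IDs come from the description.

```python
from collections import Counter

# Assumption: a website counts as "verified" when it answers with one of
# the statuses named in the PR description (200/202/302).
ACCEPTED_STATUSES = {200, 202, 302}


def is_reachable(status_code: int) -> bool:
    """Return True if the status code is one the description accepts."""
    return status_code in ACCEPTED_STATUSES


def duplicate_ids(source_ids):
    """Return the sorted list of IDs that occur more than once."""
    counts = Counter(source_ids)
    return sorted(sid for sid, n in counts.items() if n > 1)


# The five IDs added by this PR.
new_ids = [
    "china-cngbdb",
    "china-cmde",
    "china-cns",
    "china-termonline",
    "china-cstm",
]

# No ID is repeated within the batch itself.
assert duplicate_ids(new_ids) == []
```

A real check would run duplicate_ids over every ID in the repository rather than over the new batch alone.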
mingcha-dev (Collaborator) approved these changes on May 6, 2026 and left a comment:
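明察 QA Review: PR #211 APPROVED ✅
Checklist
- ✅ CI: all three checks green (secrecy / schema / validate)
- ✅ Secrecy (PR body + contents of the 5 files)
- ✅ ID deduplication (all 5 new IDs are unique across the repository)
- ✅ Abbreviation conflict check:
  - china-cns (Chinese Nutrition Society, cnsoc.org) vs. existing china-cnsa (China National Space Administration, cnsa.gov.cn): the IDs match as substrings, but they are different organizations, so there is no conflict
  - cngbdb / cmde / cstm / termonline: no other conflicts
- ✅ Domain deduplication
- ✅ URL + title:
  - cstm: 中国科学技术馆 ✓
  - cmde: [202] SPA with no title; whois Registrant = 国家药品监督管理局医疗器械技术审评中心 ✓ (official domain, authoritative)
  - cns: 中国营养学会官网 (Chinese Nutrition Society official website) ✓
  - cngbdb: CNGBdb ✓
  - termonline: 术语在线 ✓
- ✅ Domains in kebab-case (2-3 per file)
- ✅ Tags: 15 per file, no spaces or garbled characters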
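Several of the checklist items above are mechanical and could be approximated in a few lines. The sketch below is a hedged illustration, not the project's CI: the helper names, the exact kebab-case pattern, and the assumption that every file must carry exactly 15 tags are all hypothetical, as are the sample domain and tag values in the usage lines.

```python
import re

# Assumption: "kebab-case" means lowercase alphanumeric words joined by
# single hyphens.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_kebab_case(value: str) -> bool:
    """Return True if the whole string is kebab-case."""
    return KEBAB_CASE.fullmatch(value) is not None


def substring_collisions(new_id: str, existing_ids):
    """Return existing IDs that contain, or are contained in, new_id.

    A hit is only a prompt for manual review: china-cns and china-cnsa
    collide here even though they are different organizations.
    """
    return sorted(
        other
        for other in existing_ids
        if other != new_id and (new_id in other or other in new_id)
    )


def tags_ok(tags, expected_count: int = 15) -> bool:
    """Check the tag count and that no tag is empty or contains whitespace."""
    return len(tags) == expected_count and all(
        tag and not any(ch.isspace() for ch in tag) for tag in tags
    )


# Usage with hypothetical values:
print(substring_collisions("china-cns", ["china-cnsa", "china-nmpa"]))
print(is_kebab_case("life-sciences"), is_kebab_case("Life_Sciences"))
```

The substring check deliberately over-reports, which matches how the review treats china-cns vs. china-cnsa: flagged first, then cleared by a human.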
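Coverage value
- cngbdb: China National GeneBank (BGI), life omics big data
- cmde: Center for Medical Device Evaluation (under NMPA; forms a coherent set with nifdc/nmpa)
- cns: Chinese Nutrition Society (authority on dietary guidelines)
- cstm: China Science and Technology Museum (science popularization data)
- termonline: terminology platform of CNCTST (全国科学技术名词审定委员会); the first terminology-database source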
Merge 🚀
Summary
Adds 5 new Chinese authoritative data sources across research, health regulation, nutrition, scientific terminology, and science popularization domains.
New Sources
- china-cngbdb
- china-cmde
- china-cns
- china-termonline
- china-cstm

Why these sources

Checks
- Blacklist check (scripts/check-blacklist.sh)
- make check passes: 700 unique IDs, domain consistency OK

Closes part of the '中国优先 上午批次' (China-first morning batch) daily contribution schedule.