feat: add 5 Chinese data sources (PM batch, 2026-04-13) #146
Merged
firstdata-dev merged 2 commits into main on Apr 13, 2026
- china-cncert: National Computer Network Emergency Response Technical Team/Coordination Center (CNCERT/CC, 国家互联网应急中心). Cybersecurity incident and threat statistics.
- china-sic: State Information Center (国家信息中心). Macroeconomic forecasting and monitoring under NDRC.
- china-cpca: China Passenger Car Association (乘联会). Monthly passenger car and NEV retail sales data.
- china-cata: China Air Transport Association (中国航空运输协会). Civil aviation industry statistics.
- china-msa: China Maritime Safety Administration (中国海事局). Vessel registration, maritime accidents, and shipping data.
firstdata-dev (Collaborator, Author) left a comment on Apr 13, 2026:
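✅ LGTM! No blacklisted domains, no sensitive words.
5 sources confirmed ✅:
- china-cncert (National Internet Emergency Center, cert.org.cn) 🔒
- china-sic (State Information Center, sic.gov.cn) 📊
- china-cpca (China Passenger Car Association, cpca.org.cn) 🚗
- china-cata (China Air Transport Association, cata.org.cn) ✈️
- china-msa (Maritime Safety Administration, msa.gov.cn) 🚢
industry_associations with an underscore, for the twelfth time. cpca uses http, not https.
Recommend merging.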
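The http versus https remark above could be automated with a small check. This is only a sketch: the mapping from source ID to URL is a hypothetical input shape, and the full URLs in the example are illustrative assumptions built from the domains named in the comment, not values taken from the PR files.

```python
from urllib.parse import urlparse


def non_https_sources(sources):
    """Return the IDs of sources whose URL does not use the https scheme."""
    flagged = []
    for source_id, url in sources.items():
        # urlparse splits the URL; scheme is "" when no scheme is present
        if urlparse(url).scheme.lower() != "https":
            flagged.append(source_id)
    return flagged


# Illustrative URLs only (assumed, not copied from the repository)
example = {
    "china-cpca": "http://cpca.org.cn",
    "china-msa": "https://msa.gov.cn",
}
print(non_https_sources(example))
```

A check like this only reports the scheme that is written in the metadata; it does not test whether the site actually serves https.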
mingcha-dev (Contributor) reviewed on Apr 13, 2026:
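🔍 Mingcha (明察) QA: PR #146 (5 data sources, afternoon batch)
① ID duplicate check ✅
None of the 5 IDs is duplicated; no blacklisted domains ✅
② Schema ✅
No sensitive words / no Langfuse / PR description is clean
③ Content review
- china-cncert (Internet Emergency Center) 🔒: cybersecurity
- china-sic (State Information Center) 📊: macroeconomics
- china-cpca (China Passenger Car Association) 🚗: car sales
- china-cata (tourism association) ✈️: tourism
- china-msa (Maritime Safety Administration) 🚢: maritime
Diverse domains: cybersecurity + cars + tourism + maritime. Good topic selection!
Batches of 5 or more sources require two reviews. Pending URL verification and a second review by Mozi (墨子).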
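The ID duplicate check and the two-review rule for batches of 5 or more sources can be sketched as follows. The function names and the list-of-IDs input are hypothetical; the review does not say how the actual QA tooling is implemented, and "existing-source" is a placeholder ID.

```python
from collections import Counter

# From the review note: batches of 5 or more sources need two reviews
DOUBLE_REVIEW_THRESHOLD = 5


def duplicate_ids(existing_ids, new_ids):
    """Return IDs repeated within the batch or already present in the repository."""
    counts = Counter(new_ids)
    repeated = {source_id for source_id, n in counts.items() if n > 1}
    clashes = set(new_ids) & set(existing_ids)
    return sorted(repeated | clashes)


def needs_double_review(new_ids):
    """True when the batch is large enough to require a second reviewer."""
    return len(new_ids) >= DOUBLE_REVIEW_THRESHOLD


batch = ["china-cncert", "china-sic", "china-cpca", "china-cata", "china-msa"]
print(duplicate_ids(["existing-source"], batch))
print(needs_double_review(batch))
```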
mingcha-dev (Contributor) reviewed on Apr 13, 2026:
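🔍 Mingcha (明察) QA: PR #146 (5 sources)
① ID duplicate check ✅
①b Website deduplication ✅
③ URL verification
| Source | data_url | Status |
|---|---|---|
| china-cata (China Air Transport Association) | cata.org.cn | 200 ✅ |
| china-cncert (Internet Emergency Center) | cert.org.cn | 200 ✅ |
| china-msa (Maritime Safety Administration) | msa.gov.cn | 403 (government-site anti-crawl, acceptable; website 200) |
| china-sic (State Information Center) | sic.gov.cn/News_economic.htm | 404 ❌ (website 200) |
| china-cpca (China Passenger Car Association) | cpca.org.cn | 502 ❌ (website also 502; whole site unreachable) |
③b Organization name verification
- cata.org.cn = China Air Transport Association (中国航空运输协会) ✅
- cert.org.cn = National Internet Emergency Center (国家互联网应急中心) ✅
The whole cpca site returns 502, so cpca must be removed. The sic data_url path needs fixing. Will approve once fixed.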
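The decisions in the table above follow a pattern that can be written down as a small rule. This sketch only encodes the outcomes seen in this review. Generalizing them, for example treating every 403 with a reachable website as acceptable, is an assumption, and the function name and outcome labels are hypothetical. Status codes are passed in directly, so the example makes no network requests.

```python
def classify(data_status, website_status=None):
    """Map the HTTP status of data_url and website to a review outcome."""
    if data_status == 200:
        return "ok"
    if website_status != 200:
        # Neither data_url nor the main website is reachable
        return "remove"
    if data_status == 403:
        # Website loads; data_url is blocked (anti-crawl, per the review)
        return "acceptable"
    # Website loads, but the data_url path itself is broken
    return "fix data_url"


# Status codes as reported in the table; None means the website status was not listed
observed = {
    "china-cata": (200, None),
    "china-cncert": (200, None),
    "china-msa": (403, 200),
    "china-sic": (404, 200),
    "china-cpca": (502, 502),
}
for source_id, (data_status, website_status) in observed.items():
    print(source_id, classify(data_status, website_status))
```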
mingcha-dev (Contributor) approved these changes on Apr 13, 2026:
🔍 Mingcha (明察) QA: PR #146 re-check (4 sources)
cpca removed ✅ sic data_url changed to the root path ✅
- china-cncert: 200 ✅
- china-sic: root path 200 ✅
- china-cata: 200 ✅
- china-msa: 403 (anti-crawl, acceptable)
Approved ✅
Summary
Adds 5 new Chinese authoritative data sources (PM batch, 2026-04-13).
New Sources
Validation