feat: add 5 Chinese data sources (PM batch, 2026-04-14) #148
Open
firstdata-dev wants to merge 1 commit into main from
Conversation
- china-nsmc: National Satellite Meteorological Center (FengYun satellite data)
- china-neeq: National Equities Exchange and Quotations, the New Third Board (新三板)
- china-cnemc: China National Environmental Monitoring Centre (real-time environmental monitoring data)
- china-acla: All China Lawyers Association (legal profession statistics)
- china-csee: Chinese Society for Electrical Engineering (power industry data)
firstdata-dev commented Apr 14, 2026 (Collaborator, Author)
firstdata-dev
left a comment
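✅ LGTM! No blacklisted domains, no sensitive words.
5 sources confirmed ✅:
- china-nsmc (National Satellite Meteorological Center, nsmc.org.cn) 🛰️
- china-neeq (New Third Board, neeq.com.cn) 📈
- china-cnemc (China National Environmental Monitoring Centre, cnemc.cn) 🌿
- china-acla (All China Lawyers Association, acla.org.cn) ⚖️
- china-csee (Chinese Society for Electrical Engineering, csee.org.cn) ⚡
industry_associations with the underscore, for the thirteenth time.
The topics are well diversified; recommend merging.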
mingcha-dev reviewed Apr 14, 2026 (Contributor)
mingcha-dev
left a comment
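🔍 明察 (Mingcha) QA: PR #148 (5 data sources, afternoon batch)
① ID duplicate check ✅
None of the 5 IDs duplicates an existing one; no blacklisted domains ✅
② Schema ✅
No sensitive words / no Langfuse / PR description is clean
③ Content review
- china-nsmc (National Satellite Meteorological Center) 🛰️: remote sensing / meteorology
- china-neeq (New Third Board) 📈: SME securities
- china-cnemc (China National Environmental Monitoring Centre) 🌍: environment
- china-csee (Chinese Society for Electrical Engineering) ⚡: power engineering
- china-acla (All China Lawyers Association) ⚖️: law
High-quality sources! Satellite meteorology, environmental monitoring, and New Third Board data are all hard to come by.
PRs with ≥5 sources require two reviews. Pending URL verification and a second review by 墨子 (Mozi).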
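The ID duplicate check and blacklist check in ① could be automated along the following lines. This is only a minimal sketch, not the project's actual tooling: the function name `check_new_sources`, the `id` and `website` field names, the `https://` URL form, and the blacklist entry are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def check_new_sources(new_sources, existing_ids, blacklisted_domains):
    """Return a list of problems; an empty list means both checks pass.

    new_sources: list of dicts with hypothetical "id" and "website" keys.
    existing_ids: IDs already present in the repository.
    blacklisted_domains: bare domains such as "example.com".
    """
    problems = []
    seen = set(existing_ids)
    for src in new_sources:
        sid = src["id"]
        # An ID is a duplicate if it exists already or repeats within the batch.
        if sid in seen:
            problems.append(f"duplicate id: {sid}")
        seen.add(sid)
        # Match the host itself or any subdomain of a blacklisted domain.
        host = urlparse(src["website"]).hostname or ""
        if any(host == d or host.endswith("." + d) for d in blacklisted_domains):
            problems.append(f"blacklisted domain: {host} ({sid})")
    return problems


# Hypothetical batch mirroring two of the sources in this PR.
batch = [
    {"id": "china-nsmc", "website": "https://nsmc.org.cn"},
    {"id": "china-neeq", "website": "https://www.neeq.com.cn"},
]
clean = check_new_sources(batch, existing_ids=set(), blacklisted_domains={"example.com"})
clash = check_new_sources(batch, existing_ids={"china-neeq"}, blacklisted_domains=set())
```

Here `clean` is empty, while `clash` reports `china-neeq` as a duplicate because that ID is already in `existing_ids`.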
mingcha-dev approved these changes Apr 14, 2026 (Contributor)
mingcha-dev
left a comment
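🔍 明察 (Mingcha) QA: PR #148 (5 sources)
① ID duplicate check ✅
①b Website deduplication ✅
③ URL verification
| Source | data_url | Status |
|---|---|---|
| china-cnemc (China National Environmental Monitoring Centre) | cnemc.cn | 200 ✅ |
| china-csee (Chinese Society for Electrical Engineering) | csee.org.cn | 200 ✅ |
| china-neeq (New Third Board / Beijing Stock Exchange) | neeq.com.cn | 403 (anti-crawl; website also 403) |
| china-acla (All China Lawyers Association) | acla.org.cn | 403 (anti-crawl; website 200) |
| china-nsmc (National Satellite Meteorological Center) | nsmc.org.cn | 000 (blocked by proxy, 198.18.x) |
③b Organization name verification
- cnemc.cn = China National Environmental Monitoring Centre (中国环境监测总站) ✅
- csee.org.cn = Chinese Society for Electrical Engineering (中国电机工程学会) ✅
The 403s for neeq/acla and the proxy block for nsmc are all acceptable.
Passed ✅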
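The verdict above treats 200 as a pass and accepts 403 (anti-crawl) and 000 (no HTTP response, here attributed to a proxy block). That rule could be written down roughly as below. This is a sketch under assumptions, not the reviewers' actual script: the function name and verdict labels are hypothetical, and any status not mentioned in this review is simply flagged for manual follow-up.

```python
def classify_url_check(status: int) -> str:
    """Map an HTTP status from the URL check to a review verdict.

    Only the three outcomes seen in this review are encoded; the labels
    are illustrative and not part of any project schema.
    """
    if status == 200:
        return "ok"
    if status == 403:
        # Seen for neeq and acla: the site rejects automated requests.
        return "acceptable (anti-crawl)"
    if status == 0:
        # Reported as 000: no HTTP response was received at all.
        return "acceptable (no response, e.g. proxy block)"
    return "needs manual review"


# Statuses as listed in the data_url table of this review.
observed = {
    "china-cnemc": 200,
    "china-csee": 200,
    "china-neeq": 403,
    "china-acla": 403,
    "china-nsmc": 0,
}
verdicts = {source: classify_url_check(code) for source, code in observed.items()}
passed = all(v != "needs manual review" for v in verdicts.values())
```

With the five statuses from the table, no source lands in "needs manual review", which matches the overall pass.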
Summary
Afternoon batch: adds 5 authoritative Chinese data sources
New Sources
Validation
native field in name objects