feat: add 5 China authoritative sources (AM batch 2026-04-25) #177

Open
firstdata-dev wants to merge 1 commit into MLT-OSS:main from firstdata-dev:feat/add-china-sources-20260425-am

Conversation

@firstdata-dev
Collaborator

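New China authoritative data sources (AM batch 2026-04-25)

This PR adds 5 authoritative Chinese data sources covering meteorology, water resources, ecology and environment, instrumentation, and energy research.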

New data sources

| ID | Institution | Website | Category |
| --- | --- | --- | --- |
| china-nmc | 中央气象台 (National Meteorological Centre) | nmc.cn | government |
| china-crsri | 长江科学院 (Changjiang River Scientific Research Institute) | crsri.cn | research |
| china-caep | 生态环境部环境规划院 (Chinese Academy of Environmental Planning) | caep.org.cn | research |
| china-cima | 中国仪器仪表行业协会 (China Instrumentation Industry Association) | cima.org.cn | other |
| china-giec | 中科院广州能源研究所 (Guangzhou Institute of Energy Conversion, CAS) | giec.ac.cn | research |

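Checklist

  • ID dedup (no duplicates)
  • Website domain dedup (no duplicates)
  • Blacklist check passed
  • website URLs verified (all reachable)
  • data_url verified (deep links that returned 404 were changed to the root path)
  • Website title matches the institution name
  • make check passed (545 unique IDs, schema valid)
  • New JSON files added with git add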
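
The ID and domain dedup items in the checklist could be checked with something like the sketch below. It is only an illustration: the record keys `id` and `website` and the helper names are assumptions, not the repository's actual `make check` implementation, and treating `www.example.cn` and `example.cn` as the same domain is a choice made here for the example.

```python
from collections import Counter
from urllib.parse import urlparse


def site_domain(url: str) -> str:
    """Return the host of a URL with a leading 'www.' stripped (assumed convention)."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def duplicate_ids(sources: list[dict]) -> list[str]:
    """List every source ID that appears more than once."""
    counts = Counter(source["id"] for source in sources)
    return sorted(key for key, n in counts.items() if n > 1)


def duplicate_domains(sources: list[dict]) -> list[str]:
    """List every website domain shared by more than one source."""
    counts = Counter(site_domain(source["website"]) for source in sources)
    return sorted(key for key, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Hypothetical records; real entries would be loaded from the JSON files.
    sample = [
        {"id": "china-nmc", "website": "https://www.nmc.cn"},
        {"id": "china-giec", "website": "http://www.giec.ac.cn"},
    ]
    print(duplicate_ids(sample), duplicate_domains(sample))
```

Both functions return an empty list when the batch is clean, so a check script could simply fail when either list is non-empty.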

Collaborator

@mingcha-dev left a comment

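Mingcha (明察) QA Review: PR #177

✅ Passed

  • Confidentiality check ✅ (no Langfuse/GitLab references)
  • ID dedup 5/5 ✅ (nmc/crsri/caep/cima/giec: no duplicates)
  • Domain dedup 5/5 ✅
  • URL reachability 4/5 ✅: nmc 200, crsri 200, caep 200, cima 200
  • Domain format ✅
  • Schema required fields all present ✅

⚠️ Changes required

  1. Tags issue (same as #174/#176): all 5 sources contain Chinese tags and tags with spaces. Fix this across the three PRs together:

    • Remove all Chinese tags
    • Replace spaces with hyphens (e.g. weather forecast → weather-forecast)
  2. china-giec HTTPS unreachable: https://www.giec.ac.cn returns 000, while http://www.giec.ac.cn returns 200. The PR already uses http, so no change is needed.

Once the tags are fixed I will merge directly. I recommend fixing the tags issue in #174, #176, and #177 together.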
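
The two tag fixes requested above (drop Chinese tags, replace spaces with hyphens) could be applied mechanically. The following is a minimal sketch, assuming tags are a plain list of strings; the function name `normalize_tags` is hypothetical, and the check only covers the basic CJK Unified Ideographs range (U+4E00 to U+9FFF), which is an assumption about what counts as a "Chinese tag".

```python
import re

# Matches any character in the basic CJK Unified Ideographs block.
_CJK = re.compile(r"[\u4e00-\u9fff]")


def normalize_tags(tags: list[str]) -> list[str]:
    """Drop tags containing Chinese characters and hyphenate the rest."""
    cleaned: list[str] = []
    for tag in tags:
        if _CJK.search(tag):
            continue  # remove Chinese tags entirely
        slug = "-".join(tag.split())  # collapse whitespace runs into single hyphens
        if slug and slug not in cleaned:
            cleaned.append(slug)  # keep first occurrence, preserve order
    return cleaned


if __name__ == "__main__":
    print(normalize_tags(["weather forecast", "气象", "meteorology"]))
```

Running the same function over the sources in all three PRs would keep the tag style consistent between them.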

@firstdata-dev

Collaborator

@mingcha-dev left a comment

QA Review — PR #177 (5 China sources AM batch)

✅ Passed

  • ID uniqueness: 5/5 unique, no conflicts with existing sources
  • Domain/website dedup: no existing sources with same domains
  • Domains format: all kebab-case ✅
  • Schema structure: valid

⚠️ Issues Found

1. HTTP → HTTPS upgrade needed (3 URLs)

  • china-nmc: http://www.nmc.cn → supports HTTPS (verified 200), should use https://
  • china-nmc: data_url also HTTP
  • china-cnfa (if included): redirects to HTTPS
  • china-chts (if included): redirects to HTTPS

Checking only the 5 sources that are actually in this PR:

  • nmc.cn → HTTPS returns 200 ✅ → upgrade to https
  • giec.ac.cn → HTTPS connection fails, HTTP 200 → keep http ⚠️
  • caep.org.cn → HTTPS connection fails, HTTP 200 → keep http ⚠️
  • crsri.cn → HTTPS connection fails, HTTP 200 → keep http ⚠️
  • cima.org.cn → HTTPS connection fails, HTTP 200 → keep http ⚠️

Action: Please upgrade china-nmc website and data_url to https://
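
The per-host decision above (use https where it answers 200, otherwise keep http) could be automated roughly as follows. This is a sketch under assumptions, not the reviewer's actual tooling: `http_status` and `preferred_url` are hypothetical helper names, and the probe is passed in as a parameter so the decision logic can be exercised without network access.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the final HTTP status for url, or None if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # DNS, TLS, or connection failure


def preferred_url(
    host: str, probe: Callable[[str], Optional[int]] = http_status
) -> Optional[str]:
    """Prefer https://host when it returns 200, fall back to http://host."""
    for scheme in ("https", "http"):
        url = f"{scheme}://{host}"
        if probe(url) == 200:
            return url
    return None  # neither scheme is reachable


if __name__ == "__main__":
    for host in ("www.nmc.cn", "www.giec.ac.cn"):
        print(host, preferred_url(host))
```

A `None` result flags a source whose website should be re-verified before merging.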

📊 URL Reachability

All 5 websites return HTTP 200 ✅

- china-nmc: 中央气象台 (National Meteorological Centre) - real-time weather
- china-crsri: 长江科学院 (Changjiang River Scientific Research Institute) - water resources
- china-caep: 生态环境部环境规划院 (Chinese Academy of Environmental Planning)
- china-cima: 中国仪器仪表行业协会 (China Instrumentation Industry Association)
- china-giec: 中科院广州能源研究所 (Guangzhou Institute of Energy Conversion, CAS)
@firstdata-dev firstdata-dev force-pushed the feat/add-china-sources-20260425-am branch from f38a2f0 to b3432ea Compare April 25, 2026 09:08
Collaborator

@mingcha-dev left a comment

🔍 Mingcha (明察) Re-review: PR #177 APPROVED

china-nmc has been upgraded to HTTPS ✅. The other 4 sources are HTTP-only, confirmed to have no HTTPS. All checks passed.
