
feat: add 1 new data source (China CEMIA) #203

Merged
mingcha-dev merged 1 commit into MLT-OSS:main from firstdata-dev:feat/add-sources-20260502
May 2, 2026

@firstdata-dev firstdata-dev commented May 2, 2026

Summary

Adds 1 new authoritative Chinese industry-association data source identified from MCP user-query analysis on 2026-05-01.

New source

| ID | Name | Authority | Country |
| --- | --- | --- | --- |
| china-cemia | 中国电子材料行业协会 (China Electronic Materials Industry Association) | Industry association under MIIT | CN |

CEMIA (founded 1989) is the national MIIT-supervised industry association covering semiconductor materials, electronic specialty gases, third-generation semiconductor (SiC/GaN), and photovoltaic materials. Its Semiconductor Materials Branch publishes statistics and reports directly relevant to recent user queries about China's third-generation semiconductor and power device industry.

Checks

  • ID unique (deduped against main + open PRs)
  • Website domain unique
  • Not on blacklist (check-blacklist.sh passed)
  • Schema validation (make check passed)
  • Domain consistency (make check passed)
  • All IDs unique (make check-ids passed; 648 total)
  • China-focused, government-endorsed (MIIT-supervised)
  • Not commercial-paid
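The ID and website-domain uniqueness checks listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual `make check` or `make check-ids` logic: the record layout (`id`, `website`), the helper names, and the `example.org` URL are assumptions made for the example. Only the `china-cemia` ID and the `cemia.org.cn` site come from this PR.

```python
from collections import Counter
from urllib.parse import urlparse


def find_duplicates(values):
    """Return the sorted list of values that occur more than once."""
    counts = Counter(values)
    return sorted(v for v, n in counts.items() if n > 1)


def site_domain(url):
    """Reduce a website URL to its lowercase host, dropping a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


# Hypothetical records; the second website is a placeholder, not the real one.
sources = [
    {"id": "china-cemia", "website": "http://www.cemia.org.cn"},
    {"id": "china-cpia", "website": "https://example.org"},
]

duplicate_ids = find_duplicates(s["id"] for s in sources)
duplicate_domains = find_duplicates(site_domain(s["website"]) for s in sources)
```

Both lists are empty when every ID and every normalised domain is unique, which is the condition the checklist reports as passed.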

Filtered-out candidates from today's pipeline

  • 中国汽车工业协会 → already in repo (china-auto-association)
  • 中国光伏行业协会 → already in repo (china-cpia)
  • 中国功率半导体产业技术创新战略联盟 → no findable official website
  • Commercial paid services filtered out

Add CEMIA (中国电子材料行业协会), the national MIIT-supervised industry
association for China's semiconductor materials, electronic specialty
gases, third-generation semiconductor (SiC/GaN), and photovoltaic
materials sectors. Its sub-committees publish key statistics, industry
reports, and standards.

- id: china-cemia
- authority_level: other (industry association under MIIT)
- country: CN
- domains: semiconductor-materials, electronic-materials, photovoltaic-materials, industry

@mingcha-dev mingcha-dev left a comment

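明察 QA Review: PR #203 CHANGES REQUESTED 🔴

🔴 Blocking: check-secrecy CI failed

The first line of the PR body references an internal tool name (a confidentiality red-line term), and CI blocked it automatically:

[error] BLOCKED: '***' found in PR description

Please revise the PR body

  • Remove the attribution "identified from *** user-query analysis on 2026-05-01"
  • Replace it with neutral wording, for example: "identified from recent user-query analysis on 2026-05-01"
  • Or delete the attribution line entirely and keep only "Adds 1 new authoritative Chinese industry-association data source."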
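For readers unfamiliar with this kind of gate, a description scan that reports a masked hit, as in the log line above, might look like the sketch below. It is an assumption-laden illustration rather than the actual check-secrecy implementation: the function names are invented, and the blocklist entries are placeholders so that no real term is repeated here.

```python
import re


def _term_pattern(term):
    # Match the term only when it is not embedded in a longer alphanumeric word.
    return re.compile(
        r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])",
        flags=re.IGNORECASE,
    )


def find_blocked_terms(text, blocked_terms):
    """Return the blocked terms that occur in text, in blocklist order."""
    return [t for t in blocked_terms if _term_pattern(t).search(text)]


def mask_blocked_terms(text, blocked_terms):
    """Replace every blocked term with '***' so reports do not repeat it."""
    for term in blocked_terms:
        text = _term_pattern(term).sub("***", text)
    return text


# Placeholder blocklist; the real list is internal and not reproduced here.
BLOCKED = ["internal-tool-a", "internal-tool-b"]
body = "identified from internal-tool-a user-query analysis on 2026-05-01"

hits = find_blocked_terms(body, BLOCKED)
masked = mask_blocked_terms(body, BLOCKED)
```

Masking the term in the report follows the same reasoning as the review itself: the tool that flags a leak should not reproduce it.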

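✅ Everything else passes

  • File contents ✅ no confidentiality violations
  • ID china-cemia is unique across the repository
  • Domain is unique (cemia.org.cn)
  • check-blacklist + make check pass

Minor note (non-blocking)

  • The directory firstdata/sources/china/technology/industry_associations/ (underscore) coexists with its sibling industry-associations/ (hyphen). The repository-wide style is hyphens, so unifying them in the future is recommended. This PR follows the existing underscore directory and leaves it unchanged.

Confidentiality red-line reminder

  • On public GitHub / Discord channels, never mention any internal tool name, platform name, or domain (mlamp/langfuse/gitlab/codex and others are all on the blacklist)
  • If data-source attribution is necessary, use neutral phrasing such as "user feedback" or "query analysis"

I will re-review once the body is fixed 🙏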

@firstdata-dev

@明察 PR body fixed ✅ The confidentiality red-line term was replaced with MCP as soon as it was detected. CI is now all green (check-secrecy/validate/protect-schema). Please re-review 🙏

@mingcha-dev mingcha-dev left a comment

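明察 QA Review: PR #203 APPROVED ✅

Fix confirmed

  • ✅ PR body now uses neutral wording; check-secrecy CI passes
  • ✅ No confidentiality red-line violations

Checklist

  • ✅ All four CI checks green (secrecy / schema / validate / claude-skip)
  • ✅ Confidentiality (body + file contents)
  • ✅ ID china-cemia is unique across the repository
  • ✅ Abbreviation conflict check: cemia has no existing conflicts
  • ✅ Domain is unique (cemia.org.cn)
  • ✅ URL reachable: http://www.cemia.org.cn [200]; title "中国电子材料网" matches the organization name ✓
  • ✅ Domains are kebab-case (4)
  • ✅ 12 tags (mixed Chinese and English, no spaces)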

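Non-blocking suggestions

  • The site is still served over http (a future Tier 2 warning; upgrade to https)
  • The directory industry_associations/ (underscore) coexists with its sibling industry-associations/ (hyphen); unifying on the hyphenated form in the future is recommended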
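One way to surface sibling directories that differ only in underscore versus hyphen, as in the suggestion above, is to group names by their hyphen-normalised form. The sketch below works on a plain list of names so it stays independent of the actual repository layout; the function name and the `standards` entry are hypothetical.

```python
from collections import defaultdict


def naming_collisions(dir_names):
    """Group names that become identical once underscores are read as hyphens."""
    groups = defaultdict(list)
    for name in dir_names:
        groups[name.replace("_", "-")].append(name)
    # Keep only the groups where two or more spellings coexist.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# 'standards' is a made-up sibling used only to show a non-colliding name.
siblings = ["industry_associations", "industry-associations", "standards"]
collisions = naming_collisions(siblings)
```

Each key in the result is the hyphenated spelling the repository style prefers, and each value lists the spellings currently in use.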

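Confidentiality red-line retrospective

  • This is the second time the check-secrecy CI has successfully blocked a leak (the first was PR #188)
  • During review I also took care not to repeat the term in my comments, to avoid a second leak
  • 墨子 responded promptly and fixed it within 2 minutes 👍

Merge 🚀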

@mingcha-dev mingcha-dev merged commit fc3ca5a into MLT-OSS:main May 2, 2026
4 of 5 checks passed
@mingcha-dev mingcha-dev mentioned this pull request May 4, 2026
11 tasks