feat: support device init once and reinit #10

Merged
pfwang80s merged 4 commits into main from wangpf/tidy-init-reinit on May 20, 2026

@pfwang80s commented May 20, 2026

Context

data-platform now treats device init and reinit as separate operations:

  • InitDevice: first-time device provisioning only.
  • ReinitDevice: refresh credentials/config for an already-initialized device.

Related platform PR: archebase/data-platform#222

What Changed

  • Added DeviceInitService.ReinitDevice to the SDK proto and regenerated Swift protobuf/gRPC bindings.
  • Updated the Swift control-plane transport and ArchebaseDeviceInitializer so initDevice and reinitDevice call distinct RPCs.
  • Kept the Qiongche public facade semantics unchanged: saveConfigAndInit(configString:) still performs init-or-reinit, falling back to ReinitDevice when the server reports DATA_GATEWAY_DEVICE_ALREADY_INITIALIZED.
  • Updated Qiongche readiness so a successful remote init/reinit invalidates the previous local ready state until endpoints/config/state all commit again.
  • Shared the init/reinit remote config fetch path between the generic initializer and the Qiongche provisioner.
  • Updated README guidance and focused tests for explicit init/reinit behavior and Qiongche fallback/recovery behavior.
  • Fixed iOS simulator smoke execution by preserving dependency Swift settings, passing simulator deployment target explicitly, propagating device-init integration env, normalizing simulator single-slash URLs, and failing early when the local init endpoint is missing.
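The init-or-reinit fallback described for the Qiongche facade can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: `DeviceInitError`, `DeviceInitTransport`, `FakeTransport`, and `initOrReinit` are hypothetical names, the calls are synchronous for brevity, and only the `DATA_GATEWAY_DEVICE_ALREADY_INITIALIZED` code and the two RPC names come from this PR.

```swift
import Foundation

// Hypothetical error type. Only the server code name in the comment is from the PR.
enum DeviceInitError: Error {
    case alreadyInitialized        // stands in for DATA_GATEWAY_DEVICE_ALREADY_INITIALIZED
    case server(code: String)
}

// Hypothetical transport surface exposing the two distinct RPCs.
protocol DeviceInitTransport {
    func initDevice(config: String) throws -> String
    func reinitDevice(config: String) throws -> String
}

// Facade-style behavior: try create-only init first, and fall back to reinit
// only when the server reports that the device is already initialized.
// Any other error propagates to the caller unchanged.
func initOrReinit(config: String, using transport: DeviceInitTransport) throws -> String {
    do {
        return try transport.initDevice(config: config)
    } catch DeviceInitError.alreadyInitialized {
        return try transport.reinitDevice(config: config)
    }
}

// In-memory fake used only to exercise the sketch.
final class FakeTransport: DeviceInitTransport {
    private var initialized: Bool
    private(set) var calls: [String] = []

    init(initialized: Bool) { self.initialized = initialized }

    func initDevice(config: String) throws -> String {
        calls.append("init")
        if initialized { throw DeviceInitError.alreadyInitialized }
        initialized = true
        return "credential-from-init"
    }

    func reinitDevice(config: String) throws -> String {
        calls.append("reinit")
        return "credential-from-reinit"
    }
}
```

A caller that wants the strict semantics of the generic initializer would call `initDevice` or `reinitDevice` directly and skip the fallback wrapper.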

Reviewer Follow-Up

  • Generic ArchebaseDeviceInitializer.reinitDevice now works even when the local config is missing, so callers have an explicit recovery path after remote init succeeds but local persistence does not complete.
  • Qiongche now clears its ready-state marker after remote init/reinit success and before local writes; isReadyToUpload() requires a valid state whose endpoint hash matches the endpoint file, preventing stale credentials from being reported ready after partial local persistence failure.
  • Added a regression test proving Qiongche can recover after remote success plus local endpoint persistence failure, and that it reports not-ready during the failed intermediate state.
  • Added early DGW_*_INIT_ENDPOINT validation to Scripts/simulator_smoke.sh local-mode execution and covered it with a script-shape test.
  • Reverted the personal Xcode xcuserdata scheme-order change and added xcuserdata/ to .gitignore.
  • Kept InitDeviceRequest / ReinitDeviceRequest as-is in this PR because changing that proto surface must be synchronized with data-platform; this is not safe to do SDK-only.
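The readiness rule from the second bullet above (clear the ready-state marker before local writes, and report ready only when the marker's endpoint hash matches the endpoint file) can be illustrated with an in-memory model. Everything here is an assumption made for illustration: `LocalStore`, `endpointHash`, `persistRemoteResult`, and the FNV-1a hash are hypothetical, and the real SDK persists to files rather than to a struct.

```swift
import Foundation

// Hypothetical in-memory stand-in for the endpoint file, config file, and ready-state marker.
struct LocalStore {
    var endpoints: String?
    var config: String?
    var readyStateEndpointHash: UInt64?
}

// Deterministic FNV-1a hash, chosen for the sketch. The SDK's actual hash is not shown in the PR.
func endpointHash(_ text: String) -> UInt64 {
    var hash: UInt64 = 0xcbf29ce484222325
    for byte in text.utf8 {
        hash ^= UInt64(byte)
        hash = hash &* 0x100000001b3
    }
    return hash
}

// Ready only when a state marker exists and its hash matches the current endpoint contents.
func isReadyToUpload(_ store: LocalStore) -> Bool {
    guard let endpoints = store.endpoints,
          let marker = store.readyStateEndpointHash else { return false }
    return marker == endpointHash(endpoints)
}

enum PersistStep { case endpoints, config, state }
struct PersistFailure: Error { let step: PersistStep }

// Called after remote init/reinit succeeded. The marker is cleared first, so a
// failure at any later step leaves the store not-ready instead of stale-ready.
// `failAt` is a test hook that simulates a local write failure.
func persistRemoteResult(endpoints: String, config: String,
                         into store: inout LocalStore,
                         failAt: PersistStep? = nil) throws {
    store.readyStateEndpointHash = nil
    if failAt == .endpoints { throw PersistFailure(step: .endpoints) }
    store.endpoints = endpoints
    if failAt == .config { throw PersistFailure(step: .config) }
    store.config = config
    if failAt == .state { throw PersistFailure(step: .state) }
    store.readyStateEndpointHash = endpointHash(endpoints)
}
```

Because the marker is written last, a store that reports ready has necessarily committed endpoints and config in the same pass. Resubmitting the config after a failed pass runs the same sequence again and restores readiness.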

Validation

Latest reviewer-fix validation on commit b47add7:

  • bash -n Scripts/simulator_smoke.sh
  • swift test --filter DeviceInitializerTests: 6 tests passed
  • swift test --filter QiongcheDataGatewaySDKTests: 32 tests passed
  • swift test --filter LocalStackHarnessTests/simulatorSmokeScriptSkipsPackageUpdatesForCachedDependencies: 1 test passed
  • git diff --check
  • swift test: 222 tests passed; real/local gated integration tests skipped without their env

Earlier ACK validation for this branch before the reviewer follow-up:

  • ACK SwiftPM direct real OSS e2e: 221 tests passed, log: .local/e2e/sdk-ack-20260520-4d6c8bc-203555/swift-direct-realoss.log
  • ACK public-path SwiftPM real OSS e2e: 3 tests passed, log: .local/e2e/sdk-ack-20260520-4d6c8bc-203555/swift-public-path-realoss.log
  • ACK iOS simulator smoke real OSS e2e: 3 selected tests passed, log: .local/e2e/sdk-ack-20260520-4d6c8bc-203555/simulator-smoke-realoss.log

Risks and Rollout

  • SDK ReinitDevice calls require the companion data-platform server rollout; deploy after archebase/data-platform#222.
  • Callers using the generic initializer should keep strict semantics: initDevice remains create-only, while reinitDevice is the explicit recovery/rotation path.
  • Rollback by reverting this PR and continuing to use the previous SDK behavior.

Checklist

  • PR title uses Conventional Commits.
  • Tests and docs are updated or marked not applicable.
  • Public Rust APIs have English documentation comments. N/A, Swift SDK repo.
  • PR description matches the final implementation.

@zz-jason left a comment

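Submitting inline comments per the review scope we agreed on earlier. I focused on the init/reinit state machine, the proto design, the script validation path, and whether there is over-wrapping or duplicated implementation that could be trimmed. The main risk is that the generic initialization path has no recovery entry point when the server has already initialized the device but the local config is lost or was never persisted.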

Comment thread Sources/DataGatewayClient/FilePreparation.swift Outdated
Comment thread Sources/DataGatewayClient/QiongcheDeviceProvisioner.swift Outdated
Comment thread protos/data_gateway.proto
Comment thread Scripts/simulator_smoke.sh
Comment thread Sources/DataGatewayClient/QiongcheDeviceProvisioner.swift Outdated
@zz-jason

Review summary

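Following the review scope we agreed on earlier, I looked at correctness, contract/proto, test evidence, the script validation path, and whether there is over-design or code that could be simplified.

Main conclusions: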

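  1. The recovery path needs to be fixed first: the generic ArchebaseDeviceInitializer gets stuck once the server-side InitDevice has succeeded but the response was lost, the local config was never persisted, or the local config was deleted. Calling init again returns already initialized, while reinit requires the local config to exist. I suggest allowing an explicit reinit when the local config is missing, or falling back to ReinitDevice when the local config is missing and the remote reports already initialized, plus a regression test.
  2. Qiongche init-or-reinit needs proof of failure recovery: the reinit inside saveConfigAndInit rotates the remote credential, but the local endpoint/config/state are written afterwards in separate steps. I suggest adding a recovery test in which any one local write fails after reinit succeeds, proving that resubmitting the config recovers and that the old credential is not misreported as ready during the failure.
  3. The proto can be simplified: ReinitDeviceRequest and InitDeviceRequest have identical fields. Unless a concrete future difference is expected, I suggest reusing the request message or switching to a shared DeviceInitRequest, to avoid generated-code bloat and later contract drift. This must be changed in sync with the data-platform proto.
  4. The script validation entry point should fail earlier: simulator smoke runs the device-init flow by default, but the local-mode entry point does not strictly validate DGW_LOCAL_INIT_ENDPOINT. I suggest validating at script entry so the failure does not surface only at the xcodebuild test stage.
  5. Code simplification: FilePreparation.swift and QiongcheDeviceProvisioner.swift each maintain their own remoteConfig + DeviceInitRemoteMode. I suggest extracting a package-private helper for reuse, to reduce later drift in error mapping, sdkVersion/platform, and response validation.
  6. Repo hygiene: the PR includes a personal Xcode xcuserdata scheme-order change. I suggest removing it and considering ignoring xcuserdata/.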
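The shared helper suggested in item 5 might look roughly like the sketch below: one function builds the request, dispatches on the mode, and validates the response, so both call sites share that logic. Only the `DeviceInitRemoteMode` name and the sdkVersion/platform fields are mentioned in the review. The case names, `DeviceInitPayload`, `RemoteDeviceConfig`, `ControlPlaneClient`, `StubClient`, the placeholder version string, and the validation rule are all assumptions.

```swift
import Foundation

// Mode name taken from the review comment; the case names are hypothetical.
enum DeviceInitRemoteMode { case initialize, reinitialize }

// Hypothetical request and response shapes.
struct DeviceInitPayload {
    let deviceID: String
    let sdkVersion: String
    let platform: String
}

struct RemoteDeviceConfig {
    let endpoint: String
    let credential: String
}

enum RemoteConfigError: Error { case invalidResponse }

// Hypothetical control-plane surface with the two distinct RPCs.
protocol ControlPlaneClient {
    func initDevice(_ payload: DeviceInitPayload) throws -> RemoteDeviceConfig
    func reinitDevice(_ payload: DeviceInitPayload) throws -> RemoteDeviceConfig
}

// Single shared path: request construction and response validation live in
// one place, and only the RPC choice depends on the mode.
func fetchRemoteConfig(mode: DeviceInitRemoteMode,
                       deviceID: String,
                       client: ControlPlaneClient) throws -> RemoteDeviceConfig {
    let payload = DeviceInitPayload(deviceID: deviceID,
                                    sdkVersion: "0.0.0-sketch", // placeholder value
                                    platform: "ios")
    let response: RemoteDeviceConfig
    switch mode {
    case .initialize:
        response = try client.initDevice(payload)
    case .reinitialize:
        response = try client.reinitDevice(payload)
    }
    // Assumed validation rule: both fields must be non-empty.
    guard !response.endpoint.isEmpty, !response.credential.isEmpty else {
        throw RemoteConfigError.invalidResponse
    }
    return response
}

// Stub used only to exercise the sketch.
struct StubClient: ControlPlaneClient {
    func initDevice(_ payload: DeviceInitPayload) throws -> RemoteDeviceConfig {
        RemoteDeviceConfig(endpoint: "https://example.invalid/\(payload.deviceID)",
                           credential: "init-cred")
    }
    func reinitDevice(_ payload: DeviceInitPayload) throws -> RemoteDeviceConfig {
        RemoteDeviceConfig(endpoint: "https://example.invalid/\(payload.deviceID)",
                           credential: "rotated-cred")
    }
}
```

With a helper of this shape, the generic initializer and the Qiongche provisioner would differ only in what they do with the returned config, not in how they obtain it.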

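Additional observation: the SDK's device init/reinit path is not wrapped in automatic retries by the generic RetryExecutor. The main risk here is not the retry executor but the missing recovery entry point when a one-time init succeeds and the SDK has no local config.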

@pfwang80s merged commit f3dd9a1 into main on May 20, 2026
2 checks passed
@pfwang80s deleted the wangpf/tidy-init-reinit branch on May 20, 2026 23:38