
Run CSV import and GTFS import in parallel #1525

Merged
TinyKitten merged 1 commit into dev from feature/parallel-csv-gtfs-import on May 15, 2026

@TinyKitten TinyKitten commented May 15, 2026

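Overview

Restructure startup so that the CSV import and the GTFS import run in parallel, reducing their combined run time.

Type of change

  • Bug fix
  • New feature
  • Data fix or addition
  • Refactoring
  • Documentation
  • CI/CD
  • Other

Changes

  • Split the CREATE EXTENSION and create_table.sql execution out of import_csv into create_schema
  • In main.rs, run create_schema first, then launch the GTFS import in the background with tokio::spawn so that it runs in parallel with the CSV import
  • Start the server and pass the health check as soon as the CSV import completes (existing behavior preserved)
  • After the CSV import completes, await the GTFS JoinHandle, then run integrate_gtfs_to_stations and ANALYZE
  • If the GTFS import fails, log a warning and continue, as before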
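
The ordering described in the list above can be sketched as follows. This is a minimal illustration, not the code from the PR: it swaps in std::thread::spawn for tokio::spawn so that it runs without external crates, and every function body, the `gtfs_ok` flag, and the log strings are hypothetical stand-ins.

```rust
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for the GTFS import; `ok` lets us simulate a failure.
fn import_gtfs(ok: bool) -> Result<(), String> {
    thread::sleep(Duration::from_millis(20));
    if ok { Ok(()) } else { Err("feed unavailable".to_string()) }
}

// Hypothetical stand-in for the CSV import.
fn import_csv() -> Result<(), String> {
    thread::sleep(Duration::from_millis(10));
    Ok(())
}

// Returns the sequence of startup steps as log lines.
fn run_startup(gtfs_ok: bool) -> Vec<String> {
    let mut log = Vec::new();

    // 1. Schema setup happens once, before either import starts.
    log.push("schema created".to_string());

    // 2. GTFS runs in the background (a thread here, a tokio task in the PR).
    let gtfs = thread::spawn(move || import_gtfs(gtfs_ok));

    // 3. CSV runs in the foreground; the server may start once it finishes.
    import_csv().expect("CSV import is required for startup");
    log.push("csv imported, server can start".to_string());

    // 4. Only now wait for GTFS; a failure is logged as a warning, not fatal.
    match gtfs.join() {
        Ok(Ok(())) => log.push("gtfs integrated, ANALYZE run".to_string()),
        Ok(Err(e)) => log.push(format!("warn: gtfs import failed: {e}")),
        Err(_) => log.push("warn: gtfs task panicked".to_string()),
    }
    log
}

fn main() {
    for line in run_startup(false) {
        println!("{line}");
    }
}
```

The point of the sketch is the ordering: the background handle is joined only after the foreground import has finished, so a slow or failing GTFS import never delays the moment the server can start.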

Tests

  • cargo fmt --all -- --check passes
  • cargo clippy -- -D warnings passes
  • cargo test (SQLX_OFFLINE=true) passes

Related issues

Screenshots (optional)

Summary by CodeRabbit

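  • Chores
    • Optimized database schema preparation at startup and added support for running the CSV and GTFS data imports in parallel
    • Changed the health check to be based on database table row counts, so that it reflects the service's actual readiness more accurately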


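Split the CREATE EXTENSION and create_table.sql execution out of
import_csv into create_schema, and after the schema is created, start
the CSV and GTFS imports in parallel using tokio::spawn.

GTFS stays in the background; the server starts and passes the health
check as soon as CSV completes. After CSV completes, wait for GTFS to
finish, then run integrate_gtfs_to_stations and ANALYZE. If GTFS fails,
the behavior is unchanged: a warn log only.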

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@TinyKitten TinyKitten self-assigned this May 15, 2026
@github-actions github-actions Bot added feature (addressing requests and resolving issues) deploy-dev and removed feature (addressing requests and resolving issues) labels May 15, 2026

coderabbitai Bot commented May 15, 2026

📝 Walkthrough

Startup flow refactored to separate schema initialization from import logic, enabling parallel CSV/GTFS imports. Health check now monitors database station count rather than assuming serving state. Environment configuration centralized with validation and fallback defaults.

Changes

Startup initialization and parallel import refactoring

| Layer | File(s) | Summary |
| --- | --- | --- |
| Import API contract | stationapi/src/import.rs | New create_schema() function handles pre-import setup (extensions, schema). import_csv() documentation updated to require prior schema initialization; extension creation responsibility removed. |
| Environment configuration parsing | stationapi/src/main.rs | Helper functions parse PORT, HOST, METRICS_HOST, METRICS_PORT, DATABASE_MAX_CONNECTIONS (1–1000 range), and DISABLE_GRPC_WEB with fallback defaults and validation. |
| Database connection and health monitoring | stationapi/src/main.rs | Connection pool created with configurable max connections; station_api_service_status() task monitors database station count to dynamically toggle health serving state instead of fixed initial serving. |
| Startup sequence | stationapi/src/main.rs | Tracing initialized, Prometheus metrics listener started, create_schema() executed before imports. |
| Parallel import orchestration | stationapi/src/main.rs | GTFS spawned as background task; CSV executed synchronously. After CSV, awaits GTFS result and conditionally integrates data and runs ANALYZE. GTFS failures do not block server startup; errors converted to strings to avoid Send trait issues. |
| Server assembly | stationapi/src/main.rs | gRPC server built with conditional tonic-web support; QueryInteractor, StationApiServer, and reflection registered; service bound and listening. |
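
As an illustration of the "fallback defaults and validation" pattern mentioned for DATABASE_MAX_CONNECTIONS, here is a minimal sketch. The helper name `parse_max_connections`, the default of 10, and the choice to fall back (rather than fail) on out-of-range input are all assumptions; the walkthrough confirms only the 1–1000 range.

```rust
// Assumed default; the actual value used in main.rs is not stated in this PR page.
const DEFAULT_MAX_CONNECTIONS: u32 = 10;

// Hypothetical helper: accept only integers in 1..=1000, otherwise use the default.
fn parse_max_connections(raw: Option<&str>) -> u32 {
    raw.and_then(|s| s.trim().parse::<u32>().ok())
        .filter(|n| (1..=1000).contains(n))
        .unwrap_or(DEFAULT_MAX_CONNECTIONS)
}

fn main() {
    // Read the variable if set; an unset or invalid value yields the default.
    let raw = std::env::var("DATABASE_MAX_CONNECTIONS").ok();
    let max = parse_max_connections(raw.as_deref());
    println!("max connections: {max}");
}
```

Taking `Option<&str>` instead of reading the environment inside the helper keeps the parsing logic testable without mutating process-wide state.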

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

feature

Poem

🐰 Schema first, data after
Imports racing along in parallel
The health check depends on the DB
Startup just got smarter

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly summarizes the main change (running the CSV and GTFS imports in parallel) and fully matches the central purpose of the changeset. |
| Description check | ✅ Passed | The PR description fills in the main template sections (Overview, Type of change, Changes, Tests) appropriately and describes the technical details in sufficient depth. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
stationapi/src/import.rs (1)

41-67: 💤 Low value

Converting to String via .parse() is unnecessary

String::from_utf8_lossy() returns a Cow<str>, so there is no need to go through .parse::<String>(). You can convert it directly to a String with .to_string() or .into_owned().

♻️ Suggested fix
-    let create_sql: String = String::from_utf8_lossy(&create_sql_content).parse()?;
+    let create_sql: String = String::from_utf8_lossy(&create_sql_content).into_owned();
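
To make the suggestion concrete, here is a small standalone sketch of the conversion. The function name `bytes_to_sql` and the sample SQL are hypothetical; only String::from_utf8_lossy and Cow::into_owned come from the review comment.

```rust
use std::borrow::Cow;

// Hypothetical helper showing the direct Cow<str> -> String conversion.
fn bytes_to_sql(bytes: &[u8]) -> String {
    // Borrows when the input is valid UTF-8; allocates only if bytes must be replaced.
    let text: Cow<'_, str> = String::from_utf8_lossy(bytes);
    // into_owned() cannot fail, so no `?` or Result handling is needed here.
    text.into_owned()
}

fn main() {
    let sql = bytes_to_sql(b"CREATE TABLE example (id INT);");
    println!("{sql}");
}
```

Because into_owned() is infallible, the caller no longer has to propagate an error from the parse step.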
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@stationapi/src/import.rs` around lines 41 - 67, The create_schema function is
converting the file bytes to a String via String::from_utf8_lossy(...).parse()
which is unnecessary and forces a parse Result; replace the parse step by
converting the Cow<str> directly to an owned String (use .into_owned() or
.to_string()) for the variable create_sql (derived from
create_sql_content/create_sql_path) so sqlx::raw_sql receives a String without
using .parse().
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cf57f538-10f7-4b96-ba09-988a6fe2d030

📥 Commits

Reviewing files that changed from the base of the PR and between b16e16a and 2f84aa8.

📒 Files selected for processing (2)
  • stationapi/src/import.rs
  • stationapi/src/main.rs

@TinyKitten TinyKitten merged commit 7bb5aa0 into dev May 15, 2026
11 checks passed
@TinyKitten TinyKitten deleted the feature/parallel-csv-gtfs-import branch May 15, 2026 16:13
@TinyKitten TinyKitten mentioned this pull request May 17, 2026
10 tasks