Skip to content

Conversation

@qustavo
Copy link
Contributor

@qustavo qustavo commented Nov 11, 2025

Although some advice against allowing programs to auto-create databases (due to security policies and separation of concerns) it is also convenient to have the ability to do so, this can come handy for testing environments.
Based on the above this PR introduces the ARKD_PG_DB_AUTOCREATE flag that toggles database creation when it does not exist.
Closes #754

Summary by CodeRabbit

  • New Features

    • Added an environment option to enable automatic PostgreSQL database creation when the target DB is missing (disabled by default).
    • DB connections can optionally auto-create the database and retry on initial missing-database errors.
  • Documentation

    • Configuration docs updated to describe the new auto-create option and default behavior.
    • Development environment example updated to include the new setting.
  • Tests

    • Test configurations updated to exercise the new auto-create flag.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 11, 2025

Walkthrough

Adds a Postgres auto-create option: new config field and env var, wires the flag into store initialization, updates OpenDb signature and Postgres helpers to detect a missing database and create it before reconnecting, and updates tests, dev env, and README.

Changes

Cohort / File(s) Summary
Documentation
README.md
Added ARKD_PG_DB_AUTOCREATE env var to configuration table (default: false).
Configuration
internal/config/config.go
Added PostgresAutoCreateDB bool to Config; added PostgresAutoCreateDB = "PG_DB_AUTOCREATE" constant; LoadConfig() reads the bool; event/data store init arrays now include the autoCreate flag.
PostgreSQL Utilities
internal/infrastructure/db/postgres/utils.go
Changed OpenDb(dsn string)OpenDb(dsn string, autoCreate bool); added connectDB (Ping + detect 3D000 error and optionally call createDB) and createDB (parse DSN, connect without target DB, CREATE DATABASE); added imports and logging.
Service Layer
internal/infrastructure/db/service.go
Event/data store initialization now expect [dsn, autoCreate] (len==2); validate second element is bool; call OpenDb(dsn, autoCreate) in pre-/post-migration branches and return explicit errors on invalid config.
Tests
internal/infrastructure/db/service_test.go
Test configs updated to supply the additional boolean flag (false) in EventStoreConfig and DataStoreConfig; minor formatting tweak.
Environment
envs/arkd.dev.env
Added ARKD_PG_DB_AUTOCREATE=true.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant Cfg as Config
    participant Svc as Service
    participant Utils as PostgresUtils
    participant DB as Postgres

    App->>Cfg: LoadConfig()
    Cfg-->>App: Config{PostgresAutoCreateDB}

    App->>Svc: Init stores with [dsn, autoCreate]

    Svc->>Utils: OpenDb(dsn, autoCreate)
    Utils->>DB: PingContext()
    alt DB exists
        DB-->>Utils: OK
        Utils-->>Svc: *sql.DB
    else DB missing and autoCreate=true
        DB-->>Utils: error (3D000)
        Utils->>Utils: createDB(dsn)
        Utils->>DB: connect to server (no db) -> CREATE DATABASE <name>
        DB-->>Utils: Created
        Utils->>DB: PingContext() retry
        DB-->>Utils: OK
        Utils-->>Svc: *sql.DB
    else DB missing and autoCreate=false
        DB-->>Utils: error (3D000)
        Utils-->>Svc: error
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Review internal/infrastructure/db/postgres/utils.go for correct DSN parsing, proper quoting/escaping of database identifiers in CREATE DATABASE, SQL injection risks, and accurate handling of Postgres error code 3D000.
  • Verify resource management (DB connections) and retry logic in connectDB/createDB to avoid leaks or infinite retries.
  • Check internal/infrastructure/db/service.go for robust config length/type validation and clear, consistent error messages.
  • Confirm internal/config/config.go correctly reads and propagates the new flag, and that tests reflect the new config shape.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Postgres DB autocreation' clearly and concisely describes the main change: adding automatic database creation functionality for PostgreSQL.
Linked Issues check ✅ Passed The PR implements the core requirement from issue #754 by adding PostgreSQL database auto-creation functionality via the ARKD_PG_DB_AUTOCREATE environment variable with proper configuration and error handling.
Out of Scope Changes check ✅ Passed All changes are directly related to implementing PostgreSQL database auto-creation: configuration wiring, environment variables, database helper functions, and related test updates. No extraneous modifications detected.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. 📝 Description — Summarize the main change in 50–60 words, explaining what was done.
  2. 📓 References — List relevant issues, discussions, documentation, or related PRs.
  3. 📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. 📊 Contributor Summary — Include a Markdown table showing contributions:
    | Contributor | Lines Added | Lines Removed | Files Changed |
  5. ✔️ Additional Notes — Add any extra reviewer context.
    Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/infrastructure/db/service.go (1)

135-200: Use the event-store config when opening the event DB

In the postgres branch you’re validating and unpacking config.DataStoreConfig. That breaks setups where the event store uses its own DSN/flag (and even rejects mixed backends like event-store postgres + data-store sqlite). Switch the length/type checks to config.EventStoreConfig, and adjust the error text accordingly.

-		if len(config.DataStoreConfig) != 2 {
-			return nil, fmt.Errorf("invalid data store config for postgres")
+		if len(config.EventStoreConfig) != 2 {
+			return nil, fmt.Errorf("invalid event store config for postgres")
 		}
 
-		dsn, ok := config.DataStoreConfig[0].(string)
+		dsn, ok := config.EventStoreConfig[0].(string)
 		if !ok {
 			return nil, fmt.Errorf("invalid DSN for postgres")
 		}
 
-		autoCreate, ok := config.DataStoreConfig[1].(bool)
+		autoCreate, ok := config.EventStoreConfig[1].(bool)
 		if !ok {
 			return nil, fmt.Errorf("invalid autocreate flag for postgres")
 		}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b50d6cd and 2aba831.

📒 Files selected for processing (4)
  • README.md (1 hunks)
  • internal/config/config.go (5 hunks)
  • internal/infrastructure/db/postgres/utils.go (1 hunks)
  • internal/infrastructure/db/service.go (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/infrastructure/db/service.go (1)
internal/infrastructure/db/postgres/utils.go (1)
  • OpenDb (26-39)

@qustavo qustavo force-pushed the pg_db_autocreate branch 2 times, most recently from f75f3ec to 1c440e0 Compare November 11, 2025 20:03
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2aba831 and 1c440e0.

📒 Files selected for processing (1)
  • internal/infrastructure/db/postgres/utils.go (1 hunks)

@altafan
Copy link
Collaborator

altafan commented Nov 21, 2025

@qustavo sorry to get back to you so late. I tested your changes and i got an error at startup:


Screenshot 2025-11-21 at 02 12 43

Please sync your branch with master, then you can test by running make run-wallet and make run in 2 different tabs, like i did.

Consider that, at the moment, the e2e tests on CI run against arkd configured with an sqlite db and a kv event db, therefore they are not reliable for your changes even if everything is green.

We planned for the next week to change the setup for the testing env and move to an arkd configured with pg db and event db.

@qustavo
Copy link
Contributor Author

qustavo commented Nov 21, 2025

@altafan thanks for taking the time to review this. Could you provide your ARKD_ env values to be able to reproduce the same environment as yours?

@altafan
Copy link
Collaborator

altafan commented Nov 25, 2025

i ran arkd with make run that uses these env vars

qustavo and others added 6 commits November 25, 2025 20:29
This adds the ARKD_PG_DB_AUTOCREATE flag to toggle auto db creation on
postgres. Database autocreation will only be enabled when this flag is
set to true or 1.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Gustavo Chaín <me@qustavo.cc>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
internal/infrastructure/db/postgres/utils.go (2)

23-65: OpenDb/connectDB flow is sound; fix small comment typo

The lazy sql.Open + PingContext + conditional auto-create flow looks correct, and the recursion with autoCreate=false avoids infinite loops. The only nit is the inline comment on Line 56 still referring to pingDB instead of connectDB, which could confuse future readers.

You can update it as:

-			// Recursively call pingDB now that the DB exists but set autoCreate false to avoid
+			// Recursively call connectDB now that the DB exists but set autoCreate false to avoid

67-107: Handle concurrent auto-creation when the database already exists

createDB correctly derives the DB name from the URL and uses pq.QuoteIdentifier to build a safe CREATE DATABASE statement. One edge case: if multiple arkd instances start at the same time with autoCreate=true, they can race to create the same DB. The loser will see a “database already exists” error and fail startup even though the target DB is now available.

To make this more robust, treat the Postgres “duplicate database” error (code 42P04) as a non-fatal success:

-	if _, err := rootDB.ExecContext(ctx, query); err != nil {
-		return err
-	}
+	if _, err := rootDB.ExecContext(ctx, query); err != nil {
+		var dbErr *pq.Error
+		if errors.As(err, &dbErr) && dbErr.Code == "42P04" {
+			log.Infof("Postgres database %s already exists, continuing startup", dbName)
+			return nil
+		}
+		return err
+	}

This keeps behavior strict for other errors but avoids unnecessary failures when another instance has just created the DB.

internal/config/config.go (1)

71-78: PostgresAutoCreateDB wiring from env → config → repoManager looks correct

The PostgresAutoCreateDB flag is consistently defined, read from PG_DB_AUTOCREATE, and propagated into both the event and main Postgres store configs. This keeps auto-create behavior explicitly opt‑in while reusing a single flag for both stores, which is a reasonable tradeoff; no functional issues spotted here.

Also applies to: 155-169, 337-401, 644-667

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c440e0 and 4326b37.

📒 Files selected for processing (5)
  • README.md (1 hunks)
  • envs/arkd.dev.env (1 hunks)
  • internal/config/config.go (5 hunks)
  • internal/infrastructure/db/postgres/utils.go (1 hunks)
  • internal/infrastructure/db/service.go (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal/infrastructure/db/service.go
  • README.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-08T11:50:25.008Z
Learnt from: altafan
Repo: arkade-os/arkd PR: 733
File: cmd/arkd/commands.go:601-606
Timestamp: 2025-10-08T11:50:25.008Z
Learning: In the market hour configuration (cmd/arkd/commands.go and related files), when roundMinParticipantsCount or roundMaxParticipantsCount is set to 0, the system uses default values from environment variables ARKD_ROUN_MIN/MAX_PARTICIPANTS_COUNT. Therefore, it's acceptable that these values cannot be explicitly cleared back to 0 after being set to a non-zero value.

Applied to files:

  • envs/arkd.dev.env
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build and Scan
  • GitHub Check: unit tests
  • GitHub Check: integration tests
🔇 Additional comments (1)
envs/arkd.dev.env (1)

13-13: New ARKD_PG_DB_AUTOCREATE flag in dev env looks consistent

The variable name matches the PG_DB_AUTOCREATE key and enables auto-create only for this dev profile; no issues from a config perspective.

@qustavo
Copy link
Contributor Author

qustavo commented Nov 25, 2025

Thanks for reviewing @altafan. Turns that in postgres if you do not specify a database name in the DSN, you will by default connect to a database named after the user name that you are connecting as, hence the database "root" does not exit.
I also updated the envs/ to include ARKD_PG_DB_AUTOCREATE

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
internal/infrastructure/db/postgres/utils.go (3)

23-39: OpenDb: auto-create wiring looks good; consider error wrapping with %w

The new autoCreate flag and delegation to connectDB are clear and preserve existing behavior when autoCreate is false. To keep the underlying error chain available to callers, consider wrapping with %w instead of %v in both fmt.Errorf calls:

- return nil, fmt.Errorf("failed to open postgres db: %v", err)
+ return nil, fmt.Errorf("failed to open postgres db: %w", err)

- return nil, fmt.Errorf("unable to establish connection with db: %v", err)
+ return nil, fmt.Errorf("unable to establish connection with db: %w", err)

This lets higher layers still inspect driver/SQL errors via errors.Is / errors.As.


41-65: connectDB recursion and comments: behavior is fine, small docstring nit

The connectDB logic (ping → detect missing DB → auto-create → recurse with autoCreate=false) is straightforward and bounded (at most one recursive call), which is good.

Minor polish: the inline comment at Line 56 still mentions pingDB, but the function is named connectDB. Updating that comment will reduce confusion for future readers:

- // Recursively call pingDB now that the DB exists but set autoCreate false to avoid
- // unlikely but possible infinite recursion.
+ // Recursively call connectDB now that the DB exists but set autoCreate=false to avoid
+ // unlikely but possible infinite recursion.

67-109: Based on my verification:

  1. PostgreSQL SQLSTATE 42P04 is correct: This is the confirmed error code for duplicate_database in all current PostgreSQL versions.

  2. Race condition scenario is real: arkd with autoCreate=true on multiple concurrent instances can indeed race to create the same database, causing the second instance to hit a duplicate_database error.

  3. Error handling pattern is idiomatic: The suggested use of errors.As(err, &dbErr) followed by dbErr.Code == "42P04" is the standard Go 1.13+ pattern for extracting and checking PostgreSQL driver-specific errors from lib/pq.

  4. Mitigation is sound: Treating the duplicate_database error as non-fatal and allowing the process to continue is a pragmatic and correct approach to make database creation idempotent.

The review comment's technical suggestion is accurate and would genuinely improve resilience. All error codes, patterns, and reasoning check out.


createDB: good quoting/guards; consider handling concurrent creators gracefully

The createDB helper is on the right track:

  • It correctly restricts auto-create to URL-style DSNs.
  • It validates that a non-empty DB name is present.
  • It now uses pq.QuoteIdentifier(dbName) in the CREATE DATABASE statement, which avoids injection and handles special characters.

One robustness gap: if multiple arkd instances start with autoCreate=true, they can race. A second instance may hit the 3D000 "DB does not exist" error, call createDB, and then see CREATE DATABASE fail because another instance created the DB first. In that case you can safely treat the error as non-fatal and proceed to connectDB again.

A targeted tweak inside createDB would make this resilient:

-    if _, err := rootDB.ExecContext(ctx, query); err != nil {
-        return err
-    }
+    if _, err := rootDB.ExecContext(ctx, query); err != nil {
+        // If another instance created the DB in the meantime, treat it as success.
+        var dbErr *pq.Error
+        if errors.As(err, &dbErr) && dbErr.Code == "42P04" { // duplicate_database
+            log.Infof("Database %s already exists, continuing startup", dbName)
+            return nil
+        }
+        return err
+    }

This way auto-creation becomes idempotent across multiple concurrent processes, avoiding a hard startup failure when a race occurs.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4326b37 and 85468e3.

📒 Files selected for processing (1)
  • internal/infrastructure/db/postgres/utils.go (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build and Scan
  • GitHub Check: integration tests
  • GitHub Check: unit tests

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
internal/infrastructure/db/service_test.go (1)

173-174: Configuration updated correctly to match new API signature.

The addition of the boolean flag aligns with the updated OpenDb(dsn, autoCreate) signature. However, the test only exercises the auto-create disabled (false) path.

Consider adding a test case that verifies the auto-create functionality when set to true:

 		{
 			name: "repo_manager_with_postgres_stores",
 			config: db.ServiceConfig{
 				EventStoreType:   "postgres",
 				DataStoreType:    "postgres",
 				EventStoreConfig: []interface{}{pgEventDns, false},
 				DataStoreConfig:  []interface{}{pgDns, false},
 			},
 		},
+		{
+			name: "repo_manager_with_postgres_stores_autocreate",
+			config: db.ServiceConfig{
+				EventStoreType:   "postgres",
+				DataStoreType:    "postgres",
+				EventStoreConfig: []interface{}{pgEventDns, true},
+				DataStoreConfig:  []interface{}{pgDns, true},
+			},
+		},

Note: The new test case would need databases that don't exist initially to properly test the auto-create feature.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 85468e3 and a2a89df.

📒 Files selected for processing (1)
  • internal/infrastructure/db/service_test.go (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build and Scan
  • GitHub Check: unit tests
  • GitHub Check: integration tests

@altafan
Copy link
Collaborator

altafan commented Dec 5, 2025

I apologize again for the slow review process, but thank you for your contribution @qustavo!

@altafan altafan merged commit e8d8120 into arkade-os:master Dec 5, 2025
5 checks passed
@qustavo qustavo deleted the pg_db_autocreate branch December 5, 2025 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DB: Create DB if not exists

2 participants