Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"pq: transaction error" when pointing migrate to redshift #90

Closed
sumits opened this issue Aug 6, 2018 · 11 comments
Closed

"pq: transaction error" when pointing migrate to redshift #90

sumits opened this issue Aug 6, 2018 · 11 comments

Comments

@sumits
Copy link

sumits commented Aug 6, 2018

Thanks for all the work for this migration tool, we are really hoping to add to our tools chest. I am trying to us migrate with redshift database and getting a strange error "pq: unexpected transaction status idle" when trying to run the simplest migration. I have tried running the SQL in my migrations file directly on redshift and that works fine. I have also tried pointing go-migrate to postgres database and that works well too. any pointers to make progress on this issue are greatly appreciated.

  • migrate version - 3.4.0
  • contents of migrations folder
    • 00001_CREATE_FIRST_TABLE.down.sql (DROP TABLE cities)
    • 00001_CREATE_FIRST_TABLE.up.sql (CREATE TABLE cities( cityid integer not null, city varchar(100) not null, state char(2) not null))
  • command - ./migrate -database redshift://user:pass@redshift.vpc.clypd.com:5439/go_migrate_test -path ./migrations -verbose up 1
  • output error
2018/08/06 17:02:38 Start buffering 1/u CREATE_FIRST_TABLE
2018/08/06 17:02:39 error: transaction commit failed in line 0:  (details: pq: unexpected transaction status idle)

please let me know if any more info is needed

@sumits sumits changed the title "pq: unexpected transaction status idle error" with redshift "pq: transaction error" when pointing migrate to redshift Aug 6, 2018
@dhui
Copy link
Member

dhui commented Aug 7, 2018

What happens when you run your migration SQL statement manually using lib/pq?

@sumits
Copy link
Author

sumits commented Aug 7, 2018

thanks @dhui, I am able to execute that query via lib/pq directly within a transaction with same redshift connection info

	db, err := sql.Open("postgres", ds.DataSourceName())
	if err != nil {
		panic(fmt.Sprintf("open %v", err))
	}
	defer db.Close()

	fmt.Println("# creating table in tx")

	tx, err := db.BeginTx(context.Background(), &sql.TxOptions{})
	if err != nil {
		panic(fmt.Sprintf("transaction %v", err))
	}

	if _, err = tx.Exec("update schema_migrations set dirty = true where version = 1"); err != nil {
		panic(fmt.Sprintf("insert %v", err))
	}

	if _, err = tx.Exec("create table cities( cityid integer not null, city varchar(100) not null, state char(2) not null)"); err != nil {
		panic(fmt.Sprintf("create %v", err))
	}

	if _, err = tx.Exec("update schema_migrations set dirty = false where version = 1"); err != nil {
		panic(fmt.Sprintf("insert %v", err))
	}

	if err = tx.Commit(); err != nil {
		panic(fmt.Sprintf("commit %v", err))
	}

@sumits
Copy link
Author

sumits commented Aug 7, 2018

@dhui I tried running migrate cli again on an empty schema_migrations table and no other tables. It fails with the 'unexpected transaction status idle' error but there is a value in schema_migrations with version and dirty = true. Only that value is committed.

The error might be coming from here (based on simple grep on error message in lib/pq). it seems like redshift might be thinking transaction is in bad state as trying the same migration on postgres 9.5 works ok. Only semi-related issue on pq is this one

If you can point me to where in go-migrate the migration is applied and where/how transactions are managed, I can try to connect dots

@dhui
Copy link
Member

dhui commented Aug 7, 2018

See the Run() receiver function in the postgres db driver. The redshift db driver uses the postgres DB driver.

To see how the DB driver is used, see the runMigrations() receiver function.

When the schema_migrations table has dirty=true, that means that the last migration did not complete successfully.

I don't know the extent to which lib/pq supports redshift and how much of the Postgres network protocol Redshift supports.

Only the 8.x version of the PostgreSQL query tool psql is supported.

@sumits
Copy link
Author

sumits commented Aug 8, 2018

thanks @dhui, those pointers help me reproduce the error in my script. the SetVersion method on postgres driver first truncates schema_migrations table. I added a truncate in my list of commands and was able to get the same transaction status idle error message

sequence of sqls executed are now

	tx, err := db.BeginTx(context.Background(), &sql.TxOptions{})
	if err != nil {
		panic(fmt.Sprintf("transaction %v", err))
	}

	if _, err = tx.Exec("truncate schema_migrations"); err != nil {
		panic(fmt.Sprintf("truncate %v", err))
	}

	if _, err = tx.Exec("update schema_migrations set dirty = true where version = 1"); err != nil {
		panic(fmt.Sprintf("insert %v", err))
	}

	if _, err = tx.Exec("create table cities( cityid integer not null, city varchar(100) not null, state char(2) not null)"); err != nil {
		panic(fmt.Sprintf("create %v", err))
	}

	if _, err = tx.Exec("update schema_migrations set dirty = false where version = 1"); err != nil {
		panic(fmt.Sprintf("insert %v", err))
	}

	if err = tx.Commit(); err != nil {
		panic(fmt.Sprintf("commit %v", err))
	}

Problem - Redshift's Truncate commits the transaction in which it is run, so the next commit call errors out from pq. This is different from Postgres where Truncate is transaction safe. Few notes/question,

  • why does SetVersion truncate schema_migration table each time, I got an impression from the docs that go-migrate will keep list of all migrations applied in db
  • let me know the best path forward. if its helpful I can add a SetVersion on Redshift driver which doesn't use Truncate (replaced with delete), try it out and submit a pr

@dhui
Copy link
Member

dhui commented Aug 8, 2018

Good find @sumits !

SetVersion() atomically sets the successfully applied migration version in the schema_migration table which is in the same DB as the rest of your tables.
The use of TRUNCATE predates me, but my guess is that Postgres didn't support upsert when migrate was being written.

Feel free to open a PR fixing the redshift driver to use DELETE instead of TRUNCATE

@sumits
Copy link
Author

sumits commented Aug 9, 2018

sounds good @dhui, will make the change and try out in couple of days

@sumits
Copy link
Author

sumits commented Aug 12, 2018

@dhui - finally got to this one and started with implementing Redshift's own SetVersion() that would do a DELETE. This approach demands more code, since the Redshift driver now needs access to the wired Postgres's transaction. As a start, Redshift driver will have to embed postgres.Postgres type, slight implementation change of Redshift's Open() method (since it might be better to call embedded Postres's Open method, Postgres's config and conn fields need to be exported, and might be more.

Before going too far with those changes, wanted to discuss the other option of using UPSERT or DELETE in Postgres driver's SetVersion(). It sounds like the schema_migrations table keeps the state of last migration, which is 1 row and as far as I know postgres TRUNCATE is better performant on large tables - delete shouldn't be a problem for 1 row. let me know, if trying UPSERT (or DELETE) is ok

@dhui
Copy link
Member

dhui commented Aug 12, 2018

Thanks for taking the time to fix this issue.

The Redshift db driver is already embedding the Postgres db driver via the Driver interface. I think it's fine to embed the Postgres db driver struct instead of the Driver interface, but I'm hesitant to export the Postgres db driver fields. Once the fields are exported, there's no going back...

I'm all for using INSERT ... ON CONFLICT (e.g. upsert) in Postgres' SetVersion(). I'm not worried about the performance of INSERT ... ON CONFLICT / DELETE vs truncate since the schema_migrations table will always be small. However, it looks like redshift doesn't support upsert, so you'll need to use DELETE and INSERT in a transaction for redshift.

It looks like redshift forked from postgres 8 and had diverged due to different goals, so it makes sense to maintain a separate redshift db driver for migrate. e.g. re-write the redshift db driver to not depend on the postgres db driver. If the redesigned DB driver supports composability, then we can refactor the redshift driver to use some shared logic/functionality of the postgres driver. A similar albeit different example is CockroachDB, which uses the postgres wire protocol but has a different query language, hence a different and independent db driver.

@sumits
Copy link
Author

sumits commented Aug 14, 2018

@dhui, I agree that keeping Redshift separate is the cleanest as redshift is not keeping up with all postgres developments and doesn't make sense to call them same. I will take a look at the cockroach driver and see what writing a new driver takes. I also saw guidelines somewhere in this repo while browsing, so that should help too

@dhui
Copy link
Member

dhui commented Aug 14, 2018

See the comments in database/driver.go for info about implementing a db driver.

For starters, I'd:

  1. Copy the postgres db driver and use that as a starting point. I don't think the CockroachDB driver has much in common w/ the postgres one.
  2. Copy the postgres db driver tests
    • Use the postgres:8 image (actually 8.4.22) since that's the closest official image to 8.0.2 (redshift's fork)
  3. Make any necessary modifications
    • Open()
    • SetVersion()
    • Lock()/Unlock() - if you want to run migrations in parallel from multiple hosts w/o any issues

andrei-m added a commit to andrei-m/migrate that referenced this issue Nov 3, 2018
TRUNCATE commits the current transaction, which breaks the expection of
the 'Commit()' that follows.

See:
golang-migrate#90
https://docs.aws.amazon.com/redshift/latest/dg/r_TRUNCATE.html
andrei-m added a commit to andrei-m/migrate that referenced this issue Nov 3, 2018
This addresses golang-migrate#90 . The
exported Redshift object no longer exports an embedde 'Driver' however,
so some more work is needed to make this backwards compatible.
@dhui dhui closed this as completed Jan 12, 2019
FPiety0521 pushed a commit to FPiety0521/Golang-SQL that referenced this issue May 24, 2023
TRUNCATE commits the current transaction, which breaks the expection of
the 'Commit()' that follows.

See:
golang-migrate/migrate#90
https://docs.aws.amazon.com/redshift/latest/dg/r_TRUNCATE.html
FPiety0521 pushed a commit to FPiety0521/Golang-SQL that referenced this issue May 24, 2023
This addresses golang-migrate/migrate#90 . The
exported Redshift object no longer exports an embedde 'Driver' however,
so some more work is needed to make this backwards compatible.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants