Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ERROR: could not serialize access due to XXX (SQLSTATE 40001) #29

Closed
oxyno-zeta opened this issue Jun 1, 2022 · 7 comments
Closed

ERROR: could not serialize access due to XXX (SQLSTATE 40001) #29

oxyno-zeta opened this issue Jun 1, 2022 · 7 comments

Comments

@oxyno-zeta
Copy link

oxyno-zeta commented Jun 1, 2022

Hello @ucirello ,

I recently had those kind of errors:

  • ERROR: could not serialize access due to concurrent update (SQLSTATE 40001)
  • ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)

This is failing on the begin transaction part but don't know why...
But I succeed by creating a repro-case.

Here is the code for the repro-case:

package main

import (
	"fmt"
	"strings"

	"cirello.io/pglock"
	"gorm.io/driver/postgres"
	"gorm.io/gorm"
)

func main() {
	// Connect to database
	gormDb, err := gorm.Open(postgres.Open("host=localhost port=5432 user=postgres dbname=postgres password=postgres sslmode=disable"))
	// Check if error exists
	if err != nil {
		panic(err)
	}

	// Initialize pg lock
	// Get sql db
	sqlDb, err := gormDb.DB()
	// Check if error exists
	if err != nil {
		panic(err)
	}
	// Create pglock client
	c, err := pglock.UnsafeNew(sqlDb)
	// Check if error exists
	if err != nil {
		panic(err)
	}

	// Now create a lot of go routine with a lock acquire inside
	for i := 0; i < 99; i++ {
		go func() {
			for {
				lock, err := c.Acquire("lock-name")
				if err != nil {
					if strings.Contains(err.Error(), "could not serialize access due to") {
						// Debug here
						fmt.Println("here")
					}

					panic(err)
				}

				fmt.Println("got the lock !")

				err = lock.Close()
				if err != nil {
					panic(err)
				}
			}
		}()
	}
	<-make(chan bool, 1)
	return
}

With this go mod main modules:

require (
	cirello.io/pglock v1.8.1-0.20211117154543-39de3558537f
	gorm.io/driver/postgres v1.3.1
	gorm.io/gorm v1.23.3
)

PG server is running on version 13.6.

Sometimes, it is long to appear, sometimes not.

This is relating to #26 I think.

Regards,

Oxyno-zeta

EDIT: Added the PG server version

@oxyno-zeta
Copy link
Author

oxyno-zeta commented Jun 1, 2022

As a workaround for the moment, I've implemented a retry around the Acquire method:

func internalAcquireWithContext(ctx context.Context) (*pglock.Lock, error) {
	var lastSerializeError error
	// Initialize counter
	counter := 0

	// Loop until the max retry is reached
	for counter < 1000 {
		// Acquire lock
		ll, err := l.s.cl.AcquireContext(ctx, l.name)
		// Check error
		if err != nil {
			// Check if it is a transaction serialize error
			if strings.Contains(err.Error(), "could not serialize access due to") {
				// Yes, so increment and retry
				counter++

				lastSerializeError = err

				continue
			}

			// No, abort here
			return nil, err
		}

		// Return lock
		return ll, nil
	}

	// By default in this case, returning the transaction serialize error
	return nil, lastSerializeError
}

Tell me if I can help more than this.

Oxyno-zeta

@ucirello
Copy link
Collaborator

ucirello commented Jun 2, 2022

hi @oxyno-zeta -- thanks for opening this issue. I couldn't repro this in my local environment. Do you have more information about the version of the PostgreSQL that you are using?

@oxyno-zeta
Copy link
Author

Hello,

yes I totally forgot this. I've edited the main post with this information. It is a 13.6. I can reproduce it on a 12.X.

That's weird that you cannot reproduce it with this code. I reproduce it at every run... Maybe 2 parallel runs will trigger it ?

Regards,

Oxyno-zeta

ucirello added a commit that referenced this issue Jun 4, 2022
@ucirello
Copy link
Collaborator

ucirello commented Jun 4, 2022

Two interesting aspects:

  1. The current code is checking for SQLSTATE 4001:

    pglock/client.go

    Lines 516 to 528 in 6bab3f7

    func typedError(err error, msg string) error {
    const serializationErrorCode = "40001"
    if err == nil {
    return nil
    } else if err == sql.ErrNoRows {
    return &NotExistError{fmt.Errorf(msg+": %w", err)}
    } else if _, ok := err.(*net.OpError); ok {
    return &UnavailableError{fmt.Errorf(msg+": %w", err)}
    } else if e, ok := err.(*pq.Error); ok && e.Code == serializationErrorCode {
    return &FailedPreconditionError{fmt.Errorf(msg+": %w", err)}
    }
    return &OtherError{err}
    }

  2. This driver was never tested with GORM's version of the PostgreSQL driver.

@oxyno-zeta -- could you please try reproducing the error with github.com/lib/pq?

I have the impression that gorm.io/driver/postgres has a bug in their error parsing logic - and therefore the retry in this library is failing. In any case I see that you are using in your example the UnsafeNew. Well, that's unsafe :) You are not supposed to be doing it unless you really know what you are doing. And I guess your repro case shows the badness of having exposed the UnsafeNew.

In any case, I commited 9190ade. Could you please try running TestIssue29 with the github.com/lib/pq driver and see if the error happens to you?

@oxyno-zeta
Copy link
Author

Hello @ucirello ,

Thanks for checking this. I was afraid that this was coming from Gorm driver (pgx behind the scene) from your previous messages and your test confirms it. I will keep my workaround for the moment and try to see what isn't working with pgx in this specific case.

I strongly think that your UnsafeNew method is a good choice. Hope you won't remove it in the future because of that. The aim in my code was to have only one pool and connection and avoid managing twice the same thing aka "connection to PG". That's why I said that's a good choice. Thanks for this method !

I will close this bug. Thanks for your work and your tests with me. If you need another test or debug from me, just tell me.

Oxyno-zeta

@ucirello
Copy link
Collaborator

ucirello commented Jun 5, 2022

@oxyno-zeta -- I believe this version will fix your problems: https://github.com/cirello-io/pglock/releases/tag/v1.9.0

@oxyno-zeta
Copy link
Author

@ucirello : That's perfectly working ! Thanks for this commit and new release !

oxyno-zeta added a commit to oxyno-zeta/golang-graphql-example that referenced this issue Jun 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants