Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] unexpected transaction conflict errors during concurrent inserts on table with SERIAL column #1783

Closed
kmuthukk opened this issue Jul 12, 2019 · 4 comments
Assignees
Labels
area/docdb YugabyteDB core features kind/bug This issue is a bug
Projects

Comments

@kmuthukk
Copy link
Collaborator

[Tested on: yugabyte-1.2.12.0 release.]

The following test inserts unique rows (and in fact, unique values for columns that have a UNIQUE constraint) from two concurrent threads.

But running this program pretty much always errors out with one of these types of errors:

psycopg2.errors.InternalError_: Operation failed. Try again.: Conflicts with higher priority transaction: 21c90dc3-b7f2-492f-9e92-843768091a18

or

psycopg2.errors.SerializationFailure: Error during commit: Operation expired: Transaction expired

or

psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "accounts_pkey"

Test program:

# On CentOS you can install psycopg2 thus:
# 
#  sudo yum install postgresql-libs
#  sudo yum install python-psycopg2
import psycopg2;
from datetime import datetime

from multiprocessing.dummy import Pool as ThreadPool

num_write_threads=2
num_users=1000

def create_table():
  conn = psycopg2.connect("host=localhost dbname=postgres user=postgres port=5433")
  conn.set_session(autocommit=True)
  cur = conn.cursor()
  cur.execute("""DROP TABLE IF EXISTS accounts""");
  cur.execute("""
        CREATE TABLE accounts(
            user_id serial,
            username VARCHAR (50) UNIQUE NOT NULL,
            password VARCHAR (50) NOT NULL,
            email VARCHAR (355) UNIQUE NOT NULL,
            created_on TIMESTAMP NOT NULL,
            last_login TIMESTAMP,
         primary key ( user_id ))
              """)
  print("Created accounts table")
  print("====================")

def load_data_slave(thread_num):
  thread_id = str(thread_num)
  conn = psycopg2.connect("host=localhost dbname=postgres user=postgres port=5433")
  conn.set_session(autocommit=True)
  cur = conn.cursor()

  print("Thread-" + thread_id + ": Inserting %d rows..." % (num_users))
  # "name" column is indexed, and may have duplicates across threads.
  for idx in range(num_users):
    cur.execute("""INSERT INTO accounts(username, password, email, created_on, last_login) VALUES (%s, %s, %s, %s, %s)""",
                ("user-"+thread_id+"-"+str(idx),
                 "user-p-"+thread_id+"-"+str(idx),
                 "user-e-"+thread_id+"-"+str(idx)+"@foobar.com",
                 datetime.utcnow(),
                 datetime.utcnow()))
  print("Thread-" + thread_id + ": Inserted %d rows" %  (num_users))

def load_data():
  pool = ThreadPool(num_write_threads)
  results = pool.map(load_data_slave, range(num_write_threads))

# Main
create_table()
load_data()
@kmuthukk kmuthukk added the area/docdb YugabyteDB core features label Jul 12, 2019
@kmuthukk kmuthukk added this to To do in YSQL via automation Jul 12, 2019
@spolitov
Copy link
Contributor

If I change primary key to (user_id, username) it does not fail. But there are duplicates in user_id:

 user_id | username |  password  |         email         |         created_on         |         last_login
---------+----------+------------+-----------------------+----------------------------+----------------------------
       1 | user-1-0 | user-p-1-0 | user-e-1-0@foobar.com | 2019-07-12 12:33:13.941227 | 2019-07-12 12:33:13.941496
       2 | user-0-0 | user-p-0-0 | user-e-0-0@foobar.com | 2019-07-12 12:33:13.954477 | 2019-07-12 12:33:13.954483
       3 | user-1-1 | user-p-1-1 | user-e-1-1@foobar.com | 2019-07-12 12:33:14.038095 | 2019-07-12 12:33:14.038105
       4 | user-1-2 | user-p-1-2 | user-e-1-2@foobar.com | 2019-07-12 12:33:14.052496 | 2019-07-12 12:33:14.052502
       4 | user-0-1 | user-p-0-1 | user-e-0-1@foobar.com | 2019-07-12 12:33:14.052412 | 2019-07-12 12:33:14.052421
       5 | user-0-2 | user-p-0-2 | user-e-0-2@foobar.com | 2019-07-12 12:33:14.065324 | 2019-07-12 12:33:14.06533
       5 | user-1-3 | user-p-1-3 | user-e-1-3@foobar.com | 2019-07-12 12:33:14.067651 | 2019-07-12 12:33:14.067658
       6 | user-1-4 | user-p-1-4 | user-e-1-4@foobar.com | 2019-07-12 12:33:14.078392 | 2019-07-12 12:33:14.078397
       6 | user-0-3 | user-p-0-3 | user-e-0-3@foobar.com | 2019-07-12 12:33:14.077138 | 2019-07-12 12:33:14.077144
       7 | user-0-4 | user-p-0-4 | user-e-0-4@foobar.com | 2019-07-12 12:33:14.085708 | 2019-07-12 12:33:14.085715```

Do we already support `SERIAL` columns? If yes, then it is a bug in their implementation.

@kmuthukk
Copy link
Collaborator Author

Thx @spolitov .

@kmuthukk
Copy link
Collaborator Author

kmuthukk commented Jul 12, 2019

@hectorgcr , @m-iancu:

  1. Per Postgres docs: << The data types smallserial, serial and bigserial are not true types, but merely a notational convenience for creating unique identifier columns (similar to the AUTO_INCREMENT property supported by some other databases). In the current implementation, specifying: >>

The docs further say that this doesn't automatically add a UNIQUE or NOT NULL constraint.

  1. Even without an explicit UNIQUE constraint on the SERIAL column, I would expect that the values implicitly inserted are unique (even if there end up being some holes).

  2. Do we implicitly create an index for SERIAL columns?

  3. Surprising that if test is modified to change the PRIMARY KEY to (user_id, username), then we end up insert duplicates for "user_id". Is this expected?

regards
Kannan

@kmuthukk
Copy link
Collaborator Author

Per @hectorgcr : << seems to be a bug with nextval implementation. Can’t see how that’s happening since every time we update our sequences table we do a compare and swap to detect concurrent writes. Will look into that >>.

@kmuthukk kmuthukk added the kind/bug This issue is a bug label Jul 12, 2019
@kmuthukk kmuthukk changed the title [YSQL] seeing unexpected transaction conflict errors during concurrent inserts [YSQL] unexpected transaction conflict errors during concurrent inserts on table with SERIAL column Jul 12, 2019
@ndeodhar ndeodhar moved this from To do to In progress in YSQL Jul 17, 2019
yugabyte-ci pushed a commit that referenced this issue Jul 18, 2019
Summary: `rescnt` needed to be set to `0` after the `retry` label in order for the code to increment `last` correctly. Without resetting `rescnt`, we were inserting the same last value into the sequences table. There was also an error in the where expression because we were not setting the `AND` operator.

Test Plan: New java unit tests that concurrently inserts rows into a table with a serial type. This test verifies that there are not repeated sequence values.

Reviewers: mihnea, mikhail, kannan, raju, neha

Reviewed By: raju

Subscribers: george, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D6911
YSQL automation moved this from In progress to Done Jul 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/docdb YugabyteDB core features kind/bug This issue is a bug
Projects
YSQL
  
Done
Development

No branches or pull requests

5 participants