row size limit? getting "Error: disk I/O error" from dqlite-demo #432

Closed
nickchen opened this issue Nov 16, 2022 · 15 comments · Fixed by #445

@nickchen

Is there a row byte size limit?

libuv=1.44.2, libraft=0.15.0, sqlite3=3.39.4, dqlite=1.11.1

Running the dqlite-demo with the following:

server

CGO_LDFLAGS_ALLOW="-Wl,-z,now" CGO_ENABLED=1 go run  dqlite-demo.go --api 127.0.0.1:8001 --db 127.0.0.1:9001 -v

client

# base64 /dev/urandom | head -c 20000000 > file.txt
#  curl -X PUT -T "file.txt" http://127.0.0.1:8001/my-key-1
Error: disk I/O error
@MathieuBordere
Contributor

MathieuBordere commented Nov 16, 2022

Thanks for your submission, we'll have a look.

I can reproduce it reliably too, so this definitely looks like a bug. The first PUT request to a key seems to succeed; successive writes all fail.

@cole-miller cole-miller self-assigned this Nov 16, 2022
@cole-miller
Contributor

I can also reproduce this; now trying to figure out which sqlite3 API call is failing.

@cole-miller
Contributor

It's this one, I think:

req->status = sqlite3_step(req->stmt);

Trying to step into that call but GDB is being difficult...

@MathieuBordere
Contributor

It's this one, I think:

req->status = sqlite3_step(req->stmt);

Trying to step into that call but GDB is being difficult...

The error is SQLITE_IOERR_SHORT_READ; try looking for that in dqlite's codebase, it might give a clue.

@cole-miller
Contributor

Indeed, seems like we are hitting the wal->n_frames == 0 condition in vfsWalRead.

@nickchen
Author

Just checking in to see if there is any progress on this?

@cole-miller
Contributor

@nickchen, I haven't identified the root cause of the issue yet; I'll try to work on this some more today.

@cole-miller
Contributor

Just an update @nickchen: I'm focusing on #435 and the possibly-related #419 for now, so I likely won't be able to address this issue until next week at the earliest.

@cole-miller
Contributor

cole-miller commented Nov 30, 2022

I have time to work on this after all. Here is a backtrace:

0x7fde77e51c32 vfsWalRead
	src/vfs.c:844
0x7fde77e523a2 vfsFileRead
	src/vfs.c:932
0x7fde77fc4ee7 readDbPage
	./src/pager.c:2978
0x7fde77fc7fce getPageNormal
	./src/pager.c:5551
0x7fde77f49df4 btreeGetPage
	./src/btree.c:2143
0x7fde77f4ba22 btreeGetUnusedPage
	./src/btree.c:2286
0x7fde77f52015 allocateBtreePage
	./src/btree.c:6282
0x7fde77f55baf fillInCell
	./src/btree.c:6750
0x7fde77f55baf sqlite3BtreeInsert
	./src/btree.c:8984
0x7fde78000491 sqlite3VdbeExec
	./src/vdbe.c:5306
0x7fde7800c80e sqlite3Step
	./src/vdbeapi.c:716
0x7fde7800c80e sqlite3_step
	./src/vdbeapi.c:773
0x7fde7800c80e sqlite3_step
	./src/vdbeapi.c:761
0x7fde77e45193 leaderExecV2
	src/leader.c:355
0x7fde77e454ef execBarrierCb
	src/leader.c:393
0x7fde77e4590a raftBarrierCb
	src/leader.c:441
0x7fde77ecc970 applyBarrier
	src/replication.c:1368
0x7fde77ecc970 replicationApply
	src/replication.c:1605
0x7fde77ecded4 appendLeaderCb
	src/replication.c:523
0x7fde77ed463e uvAppendFinishRequestsInQueue
	src/uv_append.c:124
0x7fde77ed4efb uvAppendFinishWritingRequests
	src/uv_append.c:132
0x7fde77ed4efb uvAliveSegmentWriteCb
	src/uv_append.c:256
0x7fde77ee32ce uvWriterPollCb
	src/uv_writer.c:215
0x7fde77ae211d uv__io_poll
	./src/unix/epoll.c:374
0x7fde77acbc87 uv__io_poll
	./src/unix/udp.c:122
0x7fde77acbc87 uv_run
	./src/unix/core.c:389
0x7fde77e4bb2c taskRun
	src/server.c:632
0x7fde77e4bc63 taskStart
	src/server.c:658

What seems to happen is that a large enough value triggers the "overflow" condition in SQLite's fillInCell -- an extra page is needed to hold the row data. On the second attempt to insert, SQLite goes to allocate this overflow page, and wants to recycle a page that it's already allocated (for the previous insert, I assume). That leads to trying to read the existing page's content (not sure why), and SQLite thinks that the latest version of the page in question lives in the WAL -- which is incorrect, because the previous insert caused a checkpoint, so the WAL is empty. So it seems like a case of the WAL index file getting out of sync with the WAL following a checkpoint.
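
To make the overflow condition concrete, here is a small standalone sketch. The U - 35 threshold for table b-tree leaf cells comes from the SQLite file-format documentation; the page size and the 20000000-byte payload are just the values from the repro above, not anything read out of dqlite.

#include <stdio.h>

/* Standalone sketch: why a value this large spills onto overflow pages.
 * Per the SQLite file-format docs, a table b-tree leaf cell can keep at
 * most X = U - 35 bytes of payload in-page, where U is the usable page
 * size; anything beyond that goes to overflow pages, which is what
 * fillInCell/allocateBtreePage are handling in the backtrace above. */
int main(void)
{
        unsigned usable_size = 4096;               /* default page size, no reserved bytes */
        unsigned max_in_page = usable_size - 35;   /* X = U - 35 for table leaf cells */
        unsigned long long payload = 20000000ULL;  /* size of file.txt in the repro */

        printf("max in-page payload: %u bytes\n", max_in_page);
        printf("needs overflow pages: %s\n", payload > max_in_page ? "yes" : "no");
        return 0;
}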

@cole-miller
Contributor

Apparently I was wrong -- the WAL index does get updated following the checkpoint, and if you inspect it just before the sqlite3_step that causes the error, you can see that mxFrame = 0, reflecting that there are no frames in the WAL. I'm not sure why SQLite is trying to read from the WAL in that case -- need to read over the code again.

@cole-miller
Contributor

cole-miller commented Dec 6, 2022

Actually, it seems a bunch of frames are written to the WAL between the end of the checkpoint and the point where the error occurs, as part of the second EXEC, so it shouldn't be empty! But somehow when that vfsWalRead call happens, it doesn't find any frames.

Edit: Ahhh, the frames that are written to the WAL after the checkpoint are all part of a transaction, so they go into the tx field of struct vfsWal. But the "short read" check in vfsWalRead only looks at n_frames, not n_tx, even though the logic later on in that function does allow reading tx frames (which makes sense, because a transaction should be able to read the frames that it's just written). So I think that check should be changed to look at n_tx as well -- @freeekanayaka does that make sense to you?

dqlite/src/vfs.c

Lines 840 to 842 in ee7a3cd

if (w->n_frames == 0) {
        return SQLITE_IOERR_SHORT_READ;
}
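
For concreteness, a minimal sketch of the check adjusted along those lines. The n_tx field name is taken from the description above; the actual struct vfsWal layout and the final fix in #445 may differ.

/* Sketch only: report a short read only when there are neither
 * checkpointed frames nor frames written by the in-progress transaction,
 * so a transaction can read back frames it has just written even after a
 * checkpoint emptied the WAL. Field names follow the discussion above and
 * may not match dqlite's actual code. */
if (w->n_frames == 0 && w->n_tx == 0) {
        return SQLITE_IOERR_SHORT_READ;
}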

@nickchen
Author

nickchen commented Dec 7, 2022

Nice! When's the planned release for this?

@cole-miller
Contributor

@MathieuBordere are we ready to tag v1.13.0, or would you like to give disk mode some more time to cook?

@MathieuBordere
Contributor

I would have liked it to simmer a bit longer; we could do a release early next week?
