Database corruption when running low on inode #4050

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments

Date: 2016-08-03 14:10:44 +0200
From: anthonin.bonnefoy
To: SQL devs <>
Version: 11.23.7 (Jun2016-SP1)
CC: frederic.jolliton+monetdb, @njnes, richard.monetdb

Last updated: 2016-12-21 13:08:05 +0100

Comment 22279

Date: 2016-08-03 14:10:44 +0200
From: anthonin.bonnefoy

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
Build Identifier:

When the filesystem is running low on inodes, write-ahead logging (WAL) fails:

!ERROR: bm_subcommit: commit failed
!ERROR: logger_exit: logger_commit failed
!FATAL: write-ahead logging failure, disk full?

However, on the next start, mserver fails while processing the logs:

Finished processing logs sql/sql_logs
!ERROR: bm_subcommit: commit failed
!ERROR: logger_exit: logger_commit failed
!ERROR: logger_cleanup: cannot open file sql_logs/sql/log.bak-4
!mvc_init: unable to create system tables
sending process 5448 (database 'nova') the TERM signal
!SQLException:SQLinit:Catalogue initialization failed

From then on, the database is unusable.

If inodes are freed before the second start, the corruption is not triggered.
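
Because the corruption only shows up when the second start also happens under inode pressure, one possible precaution is to check the free inode count of the farm's filesystem before restarting. The following is a minimal sketch, assuming GNU coreutils `stat`; the function names and the threshold are hypothetical and not part of MonetDB:

```shell
# Hypothetical helpers (not part of MonetDB); assumes GNU coreutils stat.

# Print the number of free inodes on the filesystem containing path $1.
free_inodes() {
    stat -f -c %d "$1"
}

# Succeed only if at least $2 inodes are free on the filesystem of $1.
has_free_inodes() {
    [ "$(free_inodes "$1")" -ge "$2" ]
}
```

For example, `has_free_inodes /tmp/farm 1000 && monetdb start nova` would only start the database when at least 1000 inodes are free. The value 1000 is arbitrary, and filesystems that allocate inodes dynamically may report 0 here, so the check is only meaningful on filesystems with a fixed inode budget such as the tmpfs mount used in the script below.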

Reproducible: Always

Steps to Reproduce:

  1. Run this script:

#!/bin/bash
set -eu

FARM="/tmp/farm"
NUM_INODE=100000

init_farm() {
    pkill -e -9 mserver || true
    pkill -e -9 monetdbd || true
    while pgrep -f monetdbd; do
        sleep 1
    done

    sudo umount "$FARM" || true
    rm -rf "$FARM"
    mkdir -p "$FARM"

    sudo mount -t tmpfs -o nr_inodes=$NUM_INODE,size=10G tmpfs "$FARM"

    monetdbd create "$FARM"
    monetdbd start "$FARM"

    while ! pgrep -f monetdbd; do
        sleep 1
    done

    monetdb create nova
    monetdb release nova

    mclient nova -s "create schema sact;" 2> /dev/null || true
}

init_farm

mserver_pid=$(pidof mserver5)

rm -rf "$FARM/filler"
mkdir "$FARM/filler"

for (( i = 0; i < NUM_INODE - 10000; i++ )); do
    echo "" > "$FARM/filler/$i"
done

line="9GJ3152\t1467287703373954\t1467287703759937\t3\t3\t10\t\t62438190489824\t116350668306\t3\t0\t3232238295\t\t3223098188\t\t\t\t55460\t443\t0\t0\t0\t0\t52\t52\t\t46\tEthernet/IPv4/TCP\t21ae5e54-637e-405f-99f5-41b93d9d769a\t258\t132\t4\t2\t0\t0\t1\t1\t0\t0\t0\t0\t0\t1\t1\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t274\t75076\t1\t150529\t22658979841\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t605119\t366169004161\t\n"
rm -f /tmp/data
for (( i = 0; i < 10; i++ )); do
    echo -e -n "$line" >> /tmp/data
done

create_query=""
for (( i = 0; i < 10000; i++ )); do
    table="sact.test_$i"

    create_query="$create_query CREATE TABLE $table (toto1 TEXT, toto2 BIGINT, toto3 BIGINT, toto4 SMALLINT, toto5 SMALLINT, toto6 INT, toto7 INT, toto8 BIGINT, toto9 BIGINT, toto10 INT, toto11 INT, toto12 BIGINT, toto13 HUGEINT, toto14 BIGINT, toto15 HUGEINT, toto16 BIGINT, toto17 HUGEINT, toto18 INT, toto19 INT, toto20 SMALLINT, toto21 SMALLINT, toto22 SMALLINT, toto23 SMALLINT, toto24 INT, toto25 INT, toto26 TEXT, toto27 INT, toto28 TEXT, toto29 UUID, toto30 BIGINT, toto31 BIGINT, toto32 BIGINT, toto33 BIGINT, toto34 BIGINT, toto35 BIGINT, toto36 BIGINT, toto37 BIGINT, toto38 BIGINT, toto39 BIGINT, toto40 BIGINT, toto41 BIGINT, toto42 BIGINT, toto43 BIGINT, toto44 BIGINT, toto45 BIGINT, toto46 BIGINT, toto47 BIGINT, toto48 BIGINT, toto49 BIGINT, toto50 BIGINT, toto51 INT, toto52 INT, toto53 BIGINT, toto54 BIGINT, toto55 HUGEINT, toto56 BIGINT, toto57 BIGINT, toto58 HUGEINT, toto59 BIGINT, toto60 BIGINT, toto61 HUGEINT, toto62 BIGINT, toto63 BIGINT, toto64 HUGEINT, toto65 BIGINT, toto66 BIGINT, toto67 HUGEINT, toto68 BIGINT, toto69 BIGINT, toto70 HUGEINT, toto71 BIGINT, toto72 BIGINT, toto73 HUGEINT, toto74 BIGINT, toto75 BIGINT, toto76 HUGEINT, toto77 UUID);"
    create_query="$create_query COPY INTO $table FROM '/tmp/data' DELIMITERS '\t','\n','\"' NULL AS '<NULL>';"

    if [[ $mserver_pid != $(pidof mserver5) ]]; then
        echo "Got a crash"
        exit 1
    fi

    if [[ $((i % 4)) == 0 ]]; then
        echo "Launching create at $i"
        mclient nova -s "$create_query" > /dev/null
        create_query=""
    fi
done

  2. Once the script exits on a commit failure, launching mclient nova will corrupt the database.
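
To avoid triggering step 2 by accident, the server log could be checked for the write-ahead logging failure before anything reconnects to the database. This is a minimal sketch that only matches the two messages quoted in this report; the function name is hypothetical, and the log file location depends on the monetdbd configuration, so the path in the usage note is an assumption:

```shell
# Hypothetical helper (not part of MonetDB): succeed if the log file $1
# contains one of the WAL failure messages quoted in this report.
wal_failure_logged() {
    grep -q -F -e 'write-ahead logging failure' \
               -e 'logger_commit failed' -- "$1"
}
```

Assuming the log is written to `$FARM/merovingian.log`, a guard such as `wal_failure_logged "$FARM/merovingian.log" && echo "free inodes before restarting nova"` would flag the situation before the next start.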

Comment 22314

Date: 2016-08-19 17:14:40 +0200
From: @njnes

This seems to be fixed in the Jun2016 version. Could you verify?

Comment 22328

Date: 2016-08-29 14:30:22 +0200
From: Frédéric Jolliton <<frederic.jolliton+monetdb>>

I've reproduced the steps given by Anthonin, and the Jun2016 version shows the same behavior, i.e. the database no longer starts and the log contains:

2016-08-29 12:27:23 MSG merovingian[18388]: database 'nova' (21357) has exited with exit status 1
2016-08-29 12:27:23 ERR control[18388]: !monetdbd: an internal error has occurred 'database 'nova' appears to shut itself down after starting, check monetdbd's logfile for possible hints'

The compiled version:

MonetDB 5 server v11.23.8 (64-bit, 64-bit oids, 128-bit integers)
This is an unreleased version
Copyright (c) 1993-July 2008 CWI
Copyright (c) August 2008-2016 MonetDB B.V., all rights reserved
Visit http://www.monetdb.org/ for further information
Found 15.6GiB available memory, 4 available cpu cores
Libraries:
libpcre: 8.39 2016-06-14 (compiled with 8.38)
openssl: OpenSSL 1.0.2h 3 May 2016 (compiled with OpenSSL 1.0.2h 3 May 2016)
libxml2: 2.9.4 (compiled with 2.9.4)
Compiled by: fjolliton@localhost (x86_64-pc-linux-gnu)
Compilation: gcc -O3 -fomit-frame-pointer -pipe -Werror -Wall -Wextra -W -Werror-implicit-function-declaration -Wpointer-arith -Wdeclaration-after-statement -Wundef -Wformat=2 -Wno-format-nonliteral -Winit-self -Winvalid-pch -Wmissing-declarations -Wmissing-format-attribute -Wmissing-prototypes -Wold-style-definition -Wpacked -Wunknown-pragmas -Wvariadic-macros -fstack-protector-all -Wstack-protector -Wpacked-bitfield-compat -Wsync-nand -Wjump-misses-init -Wmissing-include-dirs -Wlogical-op -Wunreachable-code -D_FORTIFY_SOURCE=2
Linking : /usr/bin/ld -m elf_x86_64

Comment 22334

Date: 2016-08-29 15:16:03 +0200
From: Frédéric Jolliton <<frederic.jolliton+monetdb>>

Changing version.

Comment 24623

Date: 2016-10-25 15:13:45 +0200
From: MonetDB Mercurial Repository <>

Changeset 2c63e1fc405d, made by Sjoerd Mullender <sjoerd@acm.org> in the MonetDB repo, refers to this bug.

For complete details, see http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=2c63e1fc405d

Changeset description:

We need an extra logical reference to the catalog bats.
This should fix bug #4050.

Comment 24625

Date: 2016-10-25 15:16:35 +0200
From: Richard Hughes <<richard.monetdb>>

*** Bug #3988 has been marked as a duplicate of this bug. ***

Comment 24626

Date: 2016-10-25 15:26:59 +0200
From: @sjoerdmullender

I (hopefully) fixed this bug in the Jun2016 branch. Can you test?

Comment 24763

Date: 2016-12-08 10:18:40 +0100
From: @sjoerdmullender

I'm assuming this was fixed.
