Database dump failure #248

Closed
traviscb opened this Issue · 11 comments

5 participants

Travis Brooks Tibor Simko Samuele Kaplun Jan Åge Lavik Ludmila Marian
Travis Brooks
Collaborator

Originally on 2010-08-16

As I mentioned on the last EVO, I'm taking some dumps of the db on dev and beta. Tibor suggested that max_allowed_packet was the culprit for getting an unclean dump of the db.

This is true for dev, I believe: I finally took a snapshot of the db there with no errors after raising max_allowed_packet in /etc/my.cnf to 500M. I am still loading that on my machine here to make sure I am right.

However, on beta, I set max_allowed_packet to 500M and then 1G (on pcudssw1502) and still have a problem:

2010-08-16 20:15:01 --> Task #6337 started.
2010-08-16 20:15:01 --> Reading parameters started
2010-08-16 20:15:01 --> Reading parameters ended
2010-08-16 20:15:01 --> Database dump started
2010-08-16 20:15:01 --> ... writing /opt/tbrooks/data-dumps//inspirehep-dbdump-2010-08-16_20:15:01.sql
2010-08-16 20:24:53 --> ERROR: mysqldump exit code is 768.
2010-08-16 20:24:54 --> Task #6337 finished. [ERROR]

I can't find anything obvious about 768 error codes from MySQL.
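
One possible reading, assuming the dbdump task reports the raw wait status from os.system()/waitpid() on POSIX rather than mysqldump's own exit code, is that the real exit code is in the high byte:

$ echo $(( 768 >> 8 ))
3

That would make it mysqldump exit code 3, which could then be checked against mysqldump's documented exit codes; the same decoding would apply to any other suspiciously large multiple of 256.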

This is not the same problem that dev was showing when its dumps failed, so this may be a different issue, but note that the dump file gets to about 15G and then ends with:


--
-- Dumping data for table `rnkCITATIONDATA`
--

LOCK TABLES `rnkCITATIONDATA` WRITE;
/*!40000 ALTER TABLE `rnkCITATIONDATA` DISABLE KEYS */;
Tibor Simko
Owner

Originally on 2010-08-16

You seem to have changed only one value; the one in the [mysqldump]
section is still low.

PCUDSSW1502> grep max_a /etc/my.cnf
max_allowed_packet = 1G
max_allowed_packet = 160M
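
For the record, the two settings live in different sections of /etc/my.cnf and are read independently, so both the server-side and the client-side values need to be raised; a minimal illustrative snippet (values are examples, not a recommendation):

[mysqld]
max_allowed_packet = 1G

[mysqldump]
max_allowed_packet = 1G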
Travis Brooks
Collaborator

Originally on 2010-08-16

Thanks, good point, I missed that. But after fixing it:


[tbrooks@pcudssw1502 ~]$ grep packet /etc/my.cnf
max_allowed_packet = 1G
max_allowed_packet = 1G

I still get:

2010-08-16 20:15:01 --> Task #6337 started.
2010-08-16 20:15:01 --> Reading parameters started
2010-08-16 20:15:01 --> Reading parameters ended
2010-08-16 20:15:01 --> Database dump started
2010-08-16 20:15:01 --> ... writing /opt/tbrooks/data-dumps//inspirehep-dbdump-2010-08-16_20:15:01.sql
2010-08-16 20:24:53 --> ERROR: mysqldump exit code is 768.
2010-08-16 20:24:54 --> Task #6337 finished. [ERROR]
Tibor Simko
Owner

Originally on 2010-08-16

Apparently fixed by setting max_allowed_packet on the client side
as well; a fresh DB dump is available at
/opt/cds-invenio/var/log/inspirehep-dbdump-2010-08-16_23:28:32.sql.
Closing the ticket, let's wait with further analysis in case problem
reappears.

Tibor Simko tiborsimko closed this
Travis Brooks
Collaborator

Originally on 2010-08-17

For future reference, note that this is set on the client side, not in the arguments to mysqldump, but in my.cnf on the application server (1506, as opposed to 1502, the db server):

[tbrooks@pcudssw1506:tbrooks/data-dumps]$ more /etc/my.cnf  

<snip>

[mysqldump]
max_allowed_packet = 1G
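
If editing my.cnf on the application server is ever inconvenient, mysqldump also accepts the same limit as a command-line option, which should be equivalent (invocation sketched here only as an illustration; the database name is made up):

$ mysqldump --max_allowed_packet=1G -h pcudssw1502 inspirehep > inspirehep-dbdump.sql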
Tibor Simko
Owner

Originally on 2010-08-17

Yes, that's exactly what I did.

Samuele Kaplun
Collaborator

Originally on 2011-02-03

Mmh. I have the same issue in OpenAIRE. I tried to fix it by setting 1G for every instance of max_allowed_packet, without success. Moreover, this is the first time this error has happened (I think), and OpenAIRE has only 3 records :-) So I think it's an error unrelated to max_allowed_packet.

Could it be due to AFS?

P.s. I found this ticket by Googling around for MySQL exit code 768... We seem to be almost the only ones on the web hitting this issue :-)

Tibor Simko
Owner

Originally on 2011-04-01

After further adjustments of MySQL parameters, the dumper has been
performing well for the past few weeks, so I'm closing this ticket.

Jan Åge Lavik
Collaborator

Originally on 2011-08-12

This issue seems to be back on Inspire, as last night's dbdump failed in exactly this way. Does anyone have any further information about this issue?

Ludmila Marian
Collaborator

Originally on 2011-08-12

dbdump failed on CDS as well last night, but with a different exit code, 1280, which I don't recognize and could not find any useful hints about by googling.
On the other hand, 2 nights ago, dbdump failed on CDS with exit code 768 (as the original ticket reports). I modified my.cnf (as suggested above), but I am not convinced this was causing the problem... it might be AFS playing tricks on us :(
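
If the raw-wait-status reading from earlier in this ticket holds (still an assumption about how the task reports the status), 1280 would decode the same way:

$ echo $(( 1280 >> 8 ))
5

i.e. mysqldump exit code 5 rather than 3 this time.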

Tibor Simko
Owner

Originally on 2011-08-15

Replying to [comment:13 lmarian]:

I modified my.cnf (as suggested above), but I am not convinced this was causing the problem... it might be AFS playing tricks on us :(

I had increased max_allowed_packet on INSPIRE boxes because of the big citation dictionaries, where ~1GB of blob data is stored in one row. This definitely helped to remove the dumping blocker at the time, although the dump error re-appeared from time to time afterwards.

On CDS, the dumps were working fine in the past even with a smaller max_allowed_packet value, because there are no big blobs such as the citation dictionaries. So it should not be necessary to tweak this particular parameter. Still, some of the tables are huge, so we may need to tweak some other buffer parameters.
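
To see which tables actually dominate the dump size, a quick look at information_schema can help; a sketch, assuming it is run while connected to the schema in question:

SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024) AS size_mb
  FROM information_schema.tables
 WHERE table_schema = DATABASE()
 ORDER BY (data_length + index_length) DESC
 LIMIT 10;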

I'll try to look at this problem more closely now that it reoccurs more frequently again. FWIW, dumping to the local file system also caused trouble on INSPIRE when I was testing this a year ago, although that may have been before the max_allowed_packet and buffer tweaks. I'll have a look.

Tibor Simko
Owner

Originally on 2011-08-16

So I've had a look, and both the CDS and INSPIRE dumping problems were due to a disk space issue. The dump files are bigger than 33G now, so there was no longer enough room on the dump partition to (temporarily) hold three dumps. I cleaned up the disk space last night and the dump succeeded on both CDS and INSPIRE, see for example RT ticket 143960.

To be safer in the future, I have increased the AFS quota for the dbdump volume, see RT ticket 144096. Moreover, dbdump will soon check for available space before it runs, so that it won't stop the queue unnecessarily, see #578.
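
A minimal sketch of such a pre-flight check (the path and threshold are illustrative only, not necessarily what #578 will implement):

$ avail_kb=$(df -Pk /opt/cds-invenio/var/log | awk 'NR==2 {print $4}')
$ [ "$avail_kb" -gt 40000000 ] || echo "not enough free space for dbdump, skipping this run"

i.e. require roughly 40G free before starting, since a single dump is already above 33G.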

I'm therefore closing this particular ticket again.
