Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

packets.go:417: busy buffer, packets.go:436: busy buffer #977

Closed
oranze opened this issue Jul 10, 2019 · 35 comments
Closed

packets.go:417: busy buffer, packets.go:436: busy buffer #977

oranze opened this issue Jul 10, 2019 · 35 comments

Comments

@oranze
Copy link

oranze commented Jul 10, 2019

Issue description

Tell us what should happen and what happens instead

in console output:
Scenarion 1:
`[mysql] 2019/07/10 14:54:46 packets.go:417: busy buffer

[mysql] 2019/07/10 14:54:46 packets.go:417: busy buffer

[mysql] 2019/07/10 14:54:47 packets.go:436: busy buffer

[mysql] 2019/07/10 14:54:47 packets.go:417: busy buffer

commands out of sync. You can't run this command now

[mysql] 2019/07/10 14:54:47 packets.go:436: busy buffer

[mysql] 2019/07/10 14:54:47 packets.go:417: busy buffer

commands out of sync. You can't run this command now

[mysql] 2019/07/10 14:54:47 packets.go:436: busy buffer

[mysql] 2019/07/10 14:54:47 packets.go:417: busy buffer

commands out of sync. You can't run this command now

[mysql] 2019/07/10 14:54:47 packets.go:436: busy buffer

[mysql] 2019/07/10 14:54:47 packets.go:417: busy buffer

[mysql] 2019/07/10 14:54:48 packets.go:436: busy buffer

[mysql] 2019/07/10 14:54:48 packets.go:417: busy buffer
sql: expected 4 arguments, got 2
sql: expected 4 arguments, got 2
sql: expected 4 arguments, got 2
**Scenario 2:**[mysql] 2019/07/10 14:58:53 packets.go:417: busy buffer

[mysql] 2019/07/10 14:59:11 packets.go:417: busy buffer

[mysql] 2019/07/10 14:59:43 packets.go:417: busy buffer`

Example code

Scenario 1:
I got a http message to API, create another goroutine (and close a first one to let free http connection) and do something with message, try to get data from MySQL.. all fine if I mak a tests with <100 messages and all bad if I got more then 100 messages per second. Output in the top...

Scenario 2
ok just like a test I create new logic with separate DB connection for each incoming message, works better but expensive by connections resources... and one buffer error. Output in top

Results:

Some part of messages was processed, some (with errors in output, no)... if we get first error in console any other Querys not work, if we try to use ping method before send - hi return nil... All errors handled by mysql golang driver and just printed with log. but not returned to upper level (????!!!)
So like a result impossible to detect error and establish new connection.
And yes all fine if you have <100 requests and <100 DB querys.

P.S.
I have found issue (dated by 2015) about same problems in multi routines mode, but this absolutely not explained why we cant handle error on app level and why we still get :417 error in each 1 goroutine=1 DB connector mode. Any Ideas how to solve this?

If possible, please enter some example code here to reproduce the issue.

Error log

If you have an error log, please paste it here.

Configuration

Driver version (or git SHA):
last

Go version: run go version in your console
1.12.6

Server version: E.g. MySQL 5.6, MariaDB 10.0.20

Server OS: E.g. Debian 8.1 (Jessie), Windows 10
Windows 10

@methane
Copy link
Member

methane commented Jul 10, 2019

Please show a reproducible example. It's very likely your application's problem.

For example, you may defer rows.Close() but you re-use the trx or conn before the rows.Close() is actually called. But this is not only the case.

@oranze
Copy link
Author

oranze commented Jul 10, 2019

Please show a reproducible example. It's very likely your application's problem.

For example, you may defer rows.Close() but you re-use the trx or conn before the rows.Close() is actually called. But this is not only the case.

All querys without transactions, Simple Select's.

/*
Create new connector to DB
*/
func CreateConnector() ( *sql.DB, error ) {
	var err error

	db, err = sql.Open( DB_DRIVER, confO.Get("db.user").String()+ ":"+ confO.Get("db.password").String() +"@tcp("+ confO.Get("db.host").String() +")/" )
	if err != nil{
		e.FailOnError( errors.New( "ERROR" ), fmt.Sprintf( "No Database connection. Error:\n%v", err) )
	} else{
		db.SetConnMaxLifetime( DB_ConnMaxLifetime )
		db.SetMaxIdleConns( DB_MaxIdleConns )
		db.SetMaxOpenConns( DB_MaxOpenConns )
		e.NoticeMessage( "Connected to: "+ confO.Get("db.host").String() )
	}

	return db, err
}

func PingDB (){
	err := db.Ping()

	if err != nil {
		_, err = CreateConnector()
		if err != nil {
			sleep( 5 )
			PingDB()
		}
	}
}

func DBGetTable_RACE_ById( push_id int64 ) ( DBTable_xtc_wars_RACE, error ){
	var TableResult = DBTable_xtc_wars_RACE{}

	// todo cache
	PingDB()
	rows, err := db.Query("SELECT R.id, R.db_race_race_id, T.country_code FROM x.RACE AS R JOIN x.MEETING AS M ON M.id = R.meeting_id JOIN x.TRACK AS T ON M.track_id = T.id WHERE R.id=? LIMIT 1;", push_id )
	if err != nil {
		fmt.Println( err )
		return TableResult, err
	}

	// Handle errors. First pattern
	defer func(){
		err := rows.Close()
		if err != nil {
			e.FailOnError( err, "can't make `rows.Close()." )
		}
	}()

	for rows.Next(){
		err := rows.Scan( &TableResult.id, &TableResult.db_race_race_id, &TableResult.country_code )
		if err != nil {
			e.FailOnError( err, "can't make `rows.Next()." )
			return TableResult, err
		}
	}

	// Handle errors. second pattern
	if err := rows.Err(); err != nil {
		e.FailOnError( err, "Rows error." )
	}

	if err := rows.Close(); err != nil {
		e.FailOnError( err, "can't make `rows.Close()." )
	}

	return TableResult, err
}

@methane
Copy link
Member

methane commented Jul 11, 2019

reproducible means I can easily run the example and see the issue.
I can not even compile your program. I don't have schema and data too.
It is very far from reproducible.

@oranze
Copy link
Author

oranze commented Jul 11, 2019

Good day,
Sorry, later I will prepear runneble example

@pfwgbe
Copy link

pfwgbe commented Aug 8, 2019

I've built a reproducer for our problems.

mysql_busy_buffer_reproducer.zip (had to ZIP the file because Github doesn't accept .go, sorry about that)

This is the output I'm seeing with USE_MYMYSQL set to false:

2019/08/08 12:41:11 here we go
2019/08/08 12:41:11 Driver: mysql Connection string: DB_USER:DB_PASS@tcp(DB_HOST)/
2019/08/08 12:41:12 can't insert dummy data: Error 1062: Duplicate entry '1' for key 'PRIMARY'
2019/08/08 12:41:12 195 seq: 123/2172
2019/08/08 12:41:12 844 seq: 123/2172
2019/08/08 12:41:12 734 seq: 123/2172
2019/08/08 12:41:12 309 seq: 123/2174
2019/08/08 12:41:12 522 seq: 123/2174
2019/08/08 12:41:12 159 seq: 123/2174
2019/08/08 12:41:12 721 seq: 123/2177
2019/08/08 12:41:12 563 seq: 123/2177
2019/08/08 12:41:12 264 seq: 123/2175
2019/08/08 12:41:12 213 seq: 123/2177
2019/08/08 12:41:12 640 seq: 123/2175
[mysql] 2019/08/08 12:41:12 packets.go:446: busy buffer
[mysql] 2019/08/08 12:41:12 packets.go:427: busy buffer
2019/08/08 12:41:12 25 seq: 123/2176
2019/08/08 12:41:12 923 seq: 123/2177
2019/08/08 12:41:12 685 seq: 123/2177
[mysql] 2019/08/08 12:41:12 packets.go:467: busy buffer
2019/08/08 12:41:12 took 1 tries to update counter
[mysql] 2019/08/08 12:41:12 packets.go:427: busy buffer
2019/08/08 12:41:12 210 seq: 123/2177
[mysql] 2019/08/08 12:41:12 packets.go:427: busy buffer
2019/08/08 12:41:12 can't find sequence: driver: bad connection

This is the complete output before the example code terminates.

While with USE_MYMYSQL set to true, this is the output, which matches what we'd expect:

[...]
Initialization and long output of the same as below omitted
[...]
2019/08/08 12:42:46 190 seq: 123/2577
2019/08/08 12:42:46 took 1 tries to update counter
2019/08/08 12:42:46 took 2 tries to update counter
2019/08/08 12:42:46 647 seq: 123/3137
2019/08/08 12:42:46 678 seq: 123/2977
2019/08/08 12:42:46 took 2 tries to update counter
2019/08/08 12:42:46 693 seq: 123/3110
2019/08/08 12:42:46 took 1 tries to update counter
2019/08/08 12:42:46 527 seq: 123/3151
2019/08/08 12:42:46 took 1 tries to update counter
2019/08/08 12:42:46 took 2 tries to update counter
2019/08/08 12:42:46 724 seq: 123/3166
2019/08/08 12:42:46 535 seq: 123/3133
2019/08/08 12:42:46 took 4 tries to update counter
2019/08/08 12:42:46 163 seq: 123/2719
2019/08/08 12:42:46 exiting successfully

I assume this is somewhat related to the re-use of connections, since the problem shows up less if we either decrease the number of concurrent calls to TouchSequence, increase the number of idle connections and total connections or set the maximum connection lifetime to some very short value so that the time frame in which a connection has a chance of getting re-used is very small.

We're using MySQL 10.0.38 and Galera 25.3.25 FWIW.

Can I provide anything else to aid in debugging this?

@methane
Copy link
Member

methane commented Aug 8, 2019

	// Initialize DB
	if _, err = db.Exec(`USE ` + TARGET_DB); err != nil {
		log.Fatalln(`can't select database:`, err)
	}

You must not do it. It execute "USE" statement on only one connection. All other connections runs without on DB.

@oranze
Copy link
Author

oranze commented Aug 8, 2019

Good day, One of us team now try to create reproducible simple code example with this bug.

And we not use any "USE ..." query. All query's have database.table

// Initialize DB
if _, err = db.Exec(USE + TARGET_DB); err != nil {
log.Fatalln(can't select database:, err)
}
You must not do it. It execute "USE" statement on only one connection. All other connections runs without on DB.

@pfwgbe
Copy link

pfwgbe commented Aug 8, 2019

	// Initialize DB
	if _, err = db.Exec(`USE ` + TARGET_DB); err != nil {
		log.Fatalln(`can't select database:`, err)
	}

You must not do it. It execute "USE" statement on only one connection. All other connections runs without on DB.

The USE and the two statements below that are just for initializing a demo DB. Once the DB has been initialized, they can be omitted. Our issue is independent from that. As @oranze mentioned, the actual queries against the DB all refer to the table as DB.TABLE.

@methane
Copy link
Member

methane commented Aug 8, 2019

I fixed your sample program:
https://gist.github.com/methane/1d84b452c1d9e5d2fb3ec23825b41bc5

And I don't see "busy buffer".

@oranze
Copy link
Author

oranze commented Aug 8, 2019

`I fixed your sample program:
https://gist.github.com/methane/1d84b452c1d9e5d2fb3ec23825b41bc5

And I don't see "busy buffer".`

Please provide how you get start example script? with or without race's detector?
Panic get starting on "big amount of traffic-requests and goroutines"

@oranze
Copy link
Author

oranze commented Aug 8, 2019

@methane and what exactly you have changed in example? Can't find diff

@methane
Copy link
Member

methane commented Aug 8, 2019

  • I used TABLE_NAME instead of TABLE_SEQUENCE
  • Removed db.Exec(USE + TARGET_DB)
  • Use connstr = fmt.Sprintf(%s:%s@tcp(%s)/%s, DB_USER, DB_PASS, DB_HOST, TARGET_DB)

That's all.

"busy buffer" is not root cause of the problem. Other bugs may cause "busy buffer".

@methane
Copy link
Member

methane commented Aug 8, 2019

You wrote

Driver version (or git SHA):
last

But there are no logs in packets.go:417 or packets.go:436 at master HEAD nor v1.4.1.
So you failed to use "last" version.

@pfwgbe
Copy link

pfwgbe commented Aug 8, 2019

I'm seeing the problem with v1.3.0-101-g877a977.

@methane
Copy link
Member

methane commented Aug 8, 2019

Then you should try v1.4.1 and master HEAD.

@pfwgbe
Copy link

pfwgbe commented Aug 8, 2019

Unless I'm mistaken, 877a977 is the HEAD of master. That's what I have as the last commit on my machine (the stuff after the -101- in the output of git describe I've shown above).

git show v1.4.1 gives me 72cd26f, which is from Nov last year. I'll give it a spin.

EDIT: v1.4.1 shows the same error. To be on the safe side, I rebuilt the test code with go clean; go build.

@methane
Copy link
Member

methane commented Aug 8, 2019

See these lines. Do you think these lines show logs?

* Command Packets *

}

@pfwgbe
Copy link

pfwgbe commented Aug 8, 2019

No, of course not. The error locations in the subject may not be current anymore. As is visible from the output of my test run, the busy buffer log messages come from packets.go:427, packets.go:446 and packets.go:467.

@methane
Copy link
Member

methane commented Aug 8, 2019

Please provide how you get start example script? with or without race's detector?

$ go build mysql_busy_buffer_reproducer.go
$ ./mysql_busy_buffer_reproducer  # dozen times.

@pfwgbe
Copy link

pfwgbe commented Aug 8, 2019

I've just built a Docker environment to try to make this more reproducible. Curiously enough, the issue doesn't show up when both reproducer and MySQL DB run in Docker containers on my local machine. The DB server we're using is in a DC that's reached through a VPN, with higher latencies.

I'll try to get a completely local reproducer working. Thank you for your patience.

@oranze
Copy link
Author

oranze commented Oct 9, 2019

The problem still exist. in most case it can be easy reproduced with some not last mariaDB. Ping method just not work and call panic method

@methane
Copy link
Member

methane commented Oct 9, 2019

@oranze Would you provide reproducible example?

@oranze
Copy link
Author

oranze commented Oct 9, 2019

if you try to call Ping() method before call Insert you will get panic.
os: CentOS Linux release 7.6.1810 (Core)
DB: 5.5.60-MariaDB
link to playgroundhttps://play.golang.org/p/dWNIk7UBG1e

@methane
Copy link
Member

methane commented Oct 9, 2019

It is not a reproducible example. Please add schema, and remove everything relating to HTTP.

@methane
Copy link
Member

methane commented Oct 9, 2019

And please paste all panic message.

@oranze
Copy link
Author

oranze commented Oct 9, 2019

And please paste all panic message.

It's like it is. one small app to save http incoming data. Sorry but I don't have so many time to create especially predesigned examples. App is short and if http in is a problem you can just cut it and left only one function and move it to main()

@oranze
Copy link
Author

oranze commented Oct 9, 2019

CREATE TABLE IF NOT EXISTS `actions` (
  `id` bigint(64) NOT NULL,
  `UNIT` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `CURR_TIME` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `UNIT_ID` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `SENSOR_NAME` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `SENSOR_VALUE` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT ''
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

@methane
Copy link
Member

methane commented Oct 9, 2019

It's like it is.

What do you mean? If there is a panic, there should be a detailed stacktrace at least.

@methane
Copy link
Member

methane commented Oct 9, 2019

I can not confirm this is really an issue of this project. It would be an issue in your application.

@methane methane closed this as completed Oct 9, 2019
@oranze
Copy link
Author

oranze commented Oct 9, 2019

It's like it is.

What do you mean? If there is a panic, there should be a detailed stacktrace at least.

I mean I haven't a free time to cut the code. Panic report I will send today later

@oranze
Copy link
Author

oranze commented Oct 9, 2019

methane closed this 1 minute ago
perfect, for what to fix bugs... easer just close the thread )))
"this is not a bug, this is a feature" )))))))

@methane
Copy link
Member

methane commented Oct 9, 2019

perfect, for what to fix bugs

There is no useful information that indicates it's a driver's bug, but not your application's bug.
Nothing we can do without information. Please use the user community to get support.

@inkeliz
Copy link

inkeliz commented Aug 25, 2021

I hit that issue with the following code:

	channel = make(chan Data, 1024)
        go func() {
         // ...
        // Creates listeners of "channel"
        }
	
	query, _ := db.Prepare(`SELECT DISTINCT columns FROM table`) // Really large table
	
	resp, err := query.Query(nil)
	if err != nil {
		panic(err)
	}
	
	for resp.Next() {
		d := Data{}
		
		if err := resp.Scan(&d.ID); err != nil {
			panic(err)
		}
		channel <- d  /// << Here is the issue!
	}
	
	if resp.Err() != nil {
		panic(err)
	}

That causes the busy buffer. The issue is the channel <- d. Once it fills the channel (and the consumption of the channel is quite slow), it causes the busy buffer for some reason. So, seems the MySQL give up of provide data and closes it.

I fix that by increase the size of the channel (say channel := make(chan Data, 20000000)), so it fits all the data, and it's possible to read the channel at same time. ;)


I'm not sure if there's any other fix, or if it's a bug or not.

@AndreyMashukov
Copy link

Hi everyone, I faced with same issue this week

packets.go:325: busy buffer

but problem was in memory, my server did not have enough RAM memory, please use htop and you will see how it depends on RAM

Just resize (add more RAM) your server and you will fix it

@methane
Copy link
Member

methane commented Mar 19, 2024

In most case, "busy buffer" is just a noise.
Real issue is happend before "busy buffer" is shown.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants