sync.Pool leaks #146

WinstonPrivacy · 2019-10-26T13:35:37Z

We've noticed memory leaks with KCP-Go. We have traced this down to sync.Pool leaks by instrumenting with counters for each of the calls to Get() and Put(), ie:

					// recycle the recovers
					xmitBuf.Put(r)
					// Temporary counter
					atomic.AddUint64(&Framesput, 1)

Over time, about 5-8% of byte buffers are not being returned to the sync.Pool and at 1500 bytes each, this adds up.

Are you seeing the same thing? Any thoughts on how to fix this?

The text was updated successfully, but these errors were encountered:

xtaci · 2019-10-26T13:39:26Z

runtime will recycle the rest

WinstonPrivacy · 2019-10-26T14:36:45Z

That doesn't appear to be the case. We are seeing blocks allocated by sync.pool and held by kcp-go which are never released.

Here's an example comparison of two heap dumps, taken about 60 minutes apart:

The sync pool has grown by 13Mb here. 9Mb of this was allocated by output(). Not shown are the other 4Mb which was allocated by Input().

WinstonPrivacy · 2019-10-26T14:41:48Z

As an additional experiment, I shut down all smux/kcp-go sessions. Afterwards, the memory remains allocated by sync pool (28Mb total):

xtaci · 2019-10-26T14:54:34Z

sync.Pool will not return memory to system immediately

WinstonPrivacy · 2019-10-26T15:08:27Z

We have clocked this for up to 24 hours and it never returns the memory. That actually makes sense because the byte buffers are not put back.

…

On Sat, Oct 26, 2019, 9:54 AM xtaci ***@***.***> wrote: sync.Pool will not return memory to system immediately — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#146?email_source=notifications&email_token=AIXT3AJ7Z6VOSHUTFVRCSRLQQRK3DA5CNFSM4JFNJXOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECKJWNQ#issuecomment-546609974>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIXT3ANIESIUAOKPWNBXESTQQRK3DANCNFSM4JFNJXOA> .

xtaci · 2019-10-27T12:08:20Z

https://github.com/xtaci/kcp-go/blob/master/fec.go#L30
FEC will hold some data

WinstonPrivacy · 2019-10-27T19:25:10Z

I can confirm that ~90% of the sync.pool leaks are coming from that file. The remaining portion is coming from segment leaks (kcp.go). Any thoughts on a best approach?

…

On Sun, Oct 27, 2019 at 7:08 AM xtaci ***@***.***> wrote: https://github.com/xtaci/kcp-go/blob/master/fec.go#L30 FEC will hold some data — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#146?email_source=notifications&email_token=AIXT3AP3EV3V2FPZHLFVVKTQQWADZA5CNFSM4JFNJXOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECK42AY#issuecomment-546688259>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIXT3AJNPG5M2P6MRKEEW3LQQWADZANCNFSM4JFNJXOA> .

xtaci · 2019-10-28T02:54:43Z

it's a fixed buffer, and the logic of retransmission, no thoughts on that

WinstonPrivacy · 2019-10-28T13:13:08Z

Shouldn't all packets be returned to the sync.Pool after they are acknowledged?

…

On Sun, Oct 27, 2019 at 9:54 PM xtaci ***@***.***> wrote: it's a fixed buffer, and the logic of retransmission, no thoughts on that — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#146?email_source=notifications&email_token=AIXT3AMYFH4BUL2XHANBGB3QQZH7TA5CNFSM4JFNJXOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECLRGEA#issuecomment-546771728>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIXT3AIOE2NKMFVYD4RGBX3QQZH7TANCNFSM4JFNJXOA> .

WinstonPrivacy · 2019-10-28T18:44:50Z

I was able to capture a better heap dump. The severity of the leak is pretty bad... the process can only run for about 4-6 hours before it crashes due to running out of memory.

It looks like the FECEncoder and Decoder are being held on to primarily along with the sync.Pool buffers. I am trying to trace it down further and it appears that updater is calling kcp:flush(). That in turn is opening a new UDP session in some cases. Maybe it is trying to send frames to a closed connection?

xtaci · 2019-10-29T03:58:55Z

FEC contains a fixed-size sliding window(shardsize * 3), it will not return nor grow, i'v added a timeout policy to purge this window, don't know it will help or not.

if your process crashed after 4 hrs, it's probably because your code has some problem,
kcptun uses this kcp-go for long running routers, it means they will run over months.

WinstonPrivacy · 2019-10-29T18:10:36Z

We're running with 1Gb RAM, which means this process can use at most half of that, so that's probably why we're crashing. It is possible that our code is interacting with the package in an unexpected way, but we hadn't had a problem until we rebased off of the current master. The previous code (running from late last year) did not have any leaks. We were able to improve things quite a bit by 1) flushing, then closing the Snappy Writer (these stay resident because they aren't being closed... see their docs) when CompStreams are closed and 2) Returning untransmitted byte slices back to the sync pool after UDPSession.updater() is closed. The process can now run for about 9-10 hours under load without crashing.

…

On Mon, Oct 28, 2019 at 10:59 PM xtaci ***@***.***> wrote: FEC contains a fixed-size sliding window(shardsize * 3), it will not return nor grow, i'v added a timeout policy to purge this window, don't know it will help or not. if your process crashed after 4 hrs, it's probably because your code has some problem, kcptun uses this kcp-go for long running routers, it means they will run over a month. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#146?email_source=notifications&email_token=AIXT3AP4INDWRYXOEXKAFKLQQ6YILA5CNFSM4JFNJXOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECPE6BI#issuecomment-547245829>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIXT3AKHMUYSNG6GDHRUL4DQQ6YILANCNFSM4JFNJXOA> .

xtaci · 2019-10-30T03:06:34Z

how many concurrent connections on this 1GB ram server?

xtaci · 2019-10-30T03:09:48Z

and have you invoked sess.Close()

WinstonPrivacy · 2019-10-30T03:14:04Z

Between 20-40.

…

On Tue, Oct 29, 2019, 10:10 PM xtaci ***@***.***> wrote: and have you invoked sess.Close() — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#146?email_source=notifications&email_token=AIXT3AMU63SMB52EDRF2QO3QRD3IRA5CNFSM4JFNJXOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECSYYNY#issuecomment-547720247>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIXT3AP6L3N4FZBSRKLP2XLQRD3IRANCNFSM4JFNJXOA> .

xtaci · 2019-10-30T03:16:35Z

a recent change is to allocate a goroutine for each session.updater, so a possible leak is you didn't invoke sess.Close to close the goroutine

WinstonPrivacy · 2019-10-30T04:12:44Z

I've casually checked this but it's a good suggestion. I'll add some counters to ensure that we are closing all sessions.

…

On Tue, Oct 29, 2019 at 10:16 PM xtaci ***@***.***> wrote: a recent change is to allocate a goroutine for each session.updater, so a possible leak is you didn't invoke sess.Close to close the goroutine — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#146?email_source=notifications&email_token=AIXT3AJIGKAOR7K5R5YLSIDQRD4B5A5CNFSM4JFNJXOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECSZCMQ#issuecomment-547721522>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIXT3AIJSF4IS22NUBNZ7CLQRD4B5ANCNFSM4JFNJXOA> .

WinstonPrivacy · 2019-10-30T20:07:28Z

We're closing sessions properly... the counts are always exactly what they should be (I found your snmp counters, which is very helpful).

Is it possible that there is some kind of underlying Golang thing happening here? We are using sync.pool elsewhere in our application.

Edit: This seems doubtful. We're creating separate sync pools and I've seen other implementations which do the same thing, so I don't think that can be the problem.

WinstonPrivacy · 2019-10-30T21:07:10Z

After a preliminary analysis, your latest commits look very good. After 5minutes, we're seeing a 1.5Mb reduction in RAM as a result of cleaning up timed out FEC packets. 20% of FEC packets was typical before, now it's down to 4%.

I will continue to watch and report back!

WinstonPrivacy · 2019-10-31T16:24:06Z

This latest commit definitely helped a lot. We're still experiencing a significant memory leak but I've been able to run for 13h under heavy load now, about 3x longer than before.

xtaci · 2019-11-01T03:12:56Z

glad to hear, how about setting like GOGC=20 to recycle aggresively?

WinstonPrivacy · 2019-11-01T11:48:25Z

Well, GC isn't the issue. I manually trigger GC=100 several times and we still see leaks, even hours after all objects have been closed and deleted. In fact, I have been stripping out large sections of our code to narrow the issue down. All UDPsessions are correctly closed. The leaks come primarily from output() and send(), which indicates that the xmitBuffer is not being emptied. Is it possible that broken connections could be resulting in buffers left in the transmission queue?

…

On Thu, Oct 31, 2019, 10:13 PM xtaci ***@***.***> wrote: glad to hear, how about setting like GOGC=20 to recycle aggresively? — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#146?email_source=notifications&email_token=AIXT3AILDVIXIF5PVU6Y3J3QROND7A5CNFSM4JFNJXOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECZ3ZGQ#issuecomment-548650138>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIXT3AMUMXUXYQWMF6PNHDLQROND7ANCNFSM4JFNJXOA> .

xtaci · 2019-11-01T12:46:22Z

theoretically, xmitBuf will be recycled automatically, yes it's true a closed session may left unacknowledge data not being returned to xmitBuf, but it will be recycled by runtime eventually.
setting a lower GOGC may mitgate this, but I don't know why this is not working. Did you find goroutine leaking in tests?

xtaci · 2019-11-01T12:51:55Z

though I can have the segments to return to xmitBuf when closing, but I want to figure out why
I mean if ,by accident, UDPSession(s) were held by some data structure in you program, leaking is inevitable.

WinstonPrivacy · 2019-11-01T13:36:32Z

No, there are no leaking goroutines. If the frames are not put back to the pool because xmitbuf was shut down, they will be pinned forever. That is exactly what we're seeing. Would it be possible to implement and idle timeout, such that if there is no activity for 60 seconds, the buffer is flushed and the UDP session is closed? I attempted to do this in the read and write functions, but this was unsuccessful.

…

On Fri, Nov 1, 2019, 7:52 AM xtaci ***@***.***> wrote: though I can have the segments to return to xmitBuf when closing, but I want to figure out why — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#146?email_source=notifications&email_token=AIXT3AO7VWABR77YQ5DF5ULQRQQ7FA5CNFSM4JFNJXOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEC22WMQ#issuecomment-548776754>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIXT3ALB6LOWIRUZQH72XITQRQQ7FANCNFSM4JFNJXOA> .

xtaci · 2019-11-01T13:54:44Z

no, normally, UDPSession shouldn't close itself in any condition, like what socket() does, the only way to close is to check the errors returned from Read() and Write(), including timeout errors, so , a program MUST implement keepalive mechanism to guarantee the session will be closed when there's no error returned from Read() and Write()， it's exactly what smux's keepalive does.

btw, purging buffers when calling Close() is ok, since Close() has resource releasing semantic like in libc close(), but I don't know if this change could work for your scenario.

WinstonPrivacy · 2019-11-01T14:04:29Z

That makes sense. I was able to reduce the memory leaks by flushing the snappy writer in CompDtream.Close(). We do call Close() in all situations, so putting any remaining frames back into the pool at that time sounds like it would work.

…

On Fri, Nov 1, 2019, 8:54 AM xtaci ***@***.***> wrote: no, normally, UDPSession shouldn't close itself in any condition, like what socket() does, the only way to close is to check the errors returned from Read() and Write(), including timeout errors, so , a program MUST implement keepalive mechanism to guarantee the session will be closed when there's no error returned from Read() and Write()， it's exactly what smux's keepalive does. btw, Puring buffers when calling Close() is ok, since Close() has resource releasing semantic like in libc close(), but I don't know if this change could work for your scenario. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#146?email_source=notifications&email_token=AIXT3AMQOXALHLUF6A2N4P3QRQYKXA5CNFSM4JFNJXOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEC27NHQ#issuecomment-548796062>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIXT3ALAWCOG6KXFK2Q2GILQRQYKXANCNFSM4JFNJXOA> .

xtaci · 2019-11-01T14:29:49Z

try v5.4.14

WinstonPrivacy · 2019-11-01T15:06:54Z

Will do!

WinstonPrivacy · 2019-11-01T18:41:29Z

It seems a bit better at this point but I have yet to really stress test it. I have added some debug messages to your patch and can see that more segments are being returned to the sync pool, however after shutting everything down and GC'ing a few times, there still appears to be pinned objects. Here's a heap dump:

I will hit this harder later to see how long the process can run without crashing.

WinstonPrivacy · 2019-11-04T04:00:27Z

Yes, and as a result, uncork() doesn't actually remove the []byte buffers from UDPSession. Furthermore, you don't need to pass txqueue in as you already have access to it via the original *UDPSession object in the signature. Removing txqueue from the function parameter and using the original reference results in uncork() actually removing the buffer from UDPSession. Whether that improves things or not, I am still currently testing.

…

On Sun, Nov 3, 2019 at 9:37 PM xtaci ***@***.***> wrote: slice are passed by value by default — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#146?email_source=notifications&email_token=AIXT3ANG3GKXCFEAIMYRPH3QR6KJDA5CNFSM4JFNJXOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEC6GBKA#issuecomment-549216424>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIXT3ALVWW4OIZOFSNUHX6DQR6KJDANCNFSM4JFNJXOA> .

WinstonPrivacy · 2019-11-04T04:05:50Z

Looks like this resolved most of the problem. I ran the process for about 3 hours, which would typically result in about 40Mb of RAM leaked to the sync pool. Now it is down to just 4Mb. Half of this is in the sync.pool, which as a global variable should be expected to remain behind for awhile. The other is in newFECEncoder.

Relevant code snippet illustrating removing the function parameter:

func (s *UDPSession) tx() {
	if s.xconn == nil || s.xconnWriteError != nil {
		s.defaultTx()
		return
	}

	// x/net version
	nbytes := 0
	npkts := 0
	for len(s.txqueue) > 0 {
		if n, err := s.xconn.WriteBatch(s.txqueue, 0); err == nil {

xtaci · 2019-11-04T04:25:08Z

changing tx() parameters is another story, but s.txqueue should be just set to nil to recycle immediately

xtaci · 2019-11-04T04:36:54Z

https://github.com/xtaci/kcp-go/releases/tag/v5.4.16

WinstonPrivacy · 2019-11-04T13:18:00Z

If you set the txqueue to nil, it won't update the original because it's a copy. Also, you still have to put the segments back to the pool or they will be pinned forever. When you use sync.pool, you are taking responsibility for memory management... Setting to nil will not free them.

…

On Sun, Nov 3, 2019, 10:25 PM xtaci ***@***.***> wrote: change tx() parameters is another story, but s.txqueue should be just set to nil to recycle immediately — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#146?email_source=notifications&email_token=AIXT3AMBKOVFCPIPOGNIC4LQR6P25A5CNFSM4JFNJXOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEC6HJSI#issuecomment-549221577>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIXT3AJ5WKO3XH4BI2QAH73QR6P25ANCNFSM4JFNJXOA> .

WinstonPrivacy · 2019-11-04T18:40:30Z

Just saw the latest commit. Wouldn't it be easier and probably better design to just eliminate the parameter? There is no code path in which a tx buffer is sent in that is not the same one as *UDPSession.txqueue. Here's what I'm using and it clears out almost everything when the UDPSessions are closed:

func (s *UDPSession) tx() {
	if s.xconn == nil || s.xconnWriteError != nil {
		s.defaultTx()
		return
	}

	// x/net version
	nbytes := 0
	npkts := 0
	for len(s.txqueue) > 0 {
		if n, err := s.xconn.WriteBatch(s.txqueue, 0); err == nil {
			for k := range s.txqueue[:n] {

				nbytes += len(s.txqueue[k].Buffers[0])
				xmitBuf.Put(s.txqueue[k].Buffers[0])
				// TODO: Record Put
				atomic.AddUint64(&Framesput, 1)
			}
			npkts += n
			s.txqueue = s.txqueue[n:]
		} else {			
			// compatibility issue:
			// for linux kernel<=2.6.32, support for sendmmsg is not available
			// an error of type os.SyscallError will be returned
			if operr, ok := err.(*net.OpError); ok {
				if se, ok := operr.Err.(*os.SyscallError); ok {
					if se.Syscall == "sendmmsg" {
						s.xconnWriteError = se
						s.defaultTx()
						return
					}
				}
			}
			s.notifyWriteError(errors.WithStack(err))
			break
		}
	}

	atomic.AddUint64(&DefaultSnmp.OutPkts, uint64(npkts))
	atomic.AddUint64(&DefaultSnmp.OutBytes, uint64(nbytes))
}

WinstonPrivacy · 2019-11-05T00:09:34Z

Looks like we're still leaking FEC Decoder packets. Here's a chart showing how RAM usage directly correlates to packets which weren't Put() back in the sync.Pool:

The leak is still pretty severe - between 10-30% of total Get()'s are never returned, based on traffic conditions. Some are still present when the UDPSession is closed but cleaning those up amounts for only a small percentage of the total.

Any chance there is a race condition which could be overwriting the decoder.rx queue?

WinstonPrivacy · 2019-11-05T04:00:34Z

Cleaning up fecDecoder.rx packets after update() closes reduces the total leak by half or more, so I think that calling Close() was contributing to the leak.

Unfortunately, this still doesn't explain how so many packets are still being lost in the decode() logic. I am beginning to suspect a race condition which results in a conflict when updating the decoder.rx entries.

xtaci · 2019-11-05T04:46:39Z

I don't want to discuss parameters changing in tx() now, it's a long story.
And let's focus on xmitBuf now.

xtaci · 2019-11-05T04:48:56Z

FEC will occupy 3x(datashard+parityshard) * mtuLimit at ANY time, this is how FEC works-- recovery data from previous data and parity.

xtaci · 2019-11-05T06:28:27Z

and, Yes, we can recycle FEC shards after close()

WinstonPrivacy · 2019-11-05T14:45:36Z

There seems to be a race between update() and close(). I have had better results by cleaning up the shards when update loop returns. Is there any possibility that decode() can be hit concurrently?

…

On Tue, Nov 5, 2019, 12:28 AM xtaci ***@***.***> wrote: and, Yes, we can recycle FEC shards after close() — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#146?email_source=notifications&email_token=AIXT3AOK6XZGFYOJJ2ATDZLQSEHBNA5CNFSM4JFNJXOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDBXQZQ#issuecomment-549681254>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIXT3AK6XPIGTXVDXIRDHD3QSEHBNANCNFSM4JFNJXOA> .

WinstonPrivacy · 2019-11-05T23:25:14Z

Great news, I found the root cause(s).

In decode(), there are two cases in which []byte slices are not being returned to the pool:

After the call to dec.codec.ReconstructData(), the recovered shards are added to an array and returned to the caller, where they are later Put(). However, the remaining parity shards (30 in our case), are not returned. This is the fix:

			if err := dec.codec.ReconstructData(shards); err == nil {
				for k := range shards[:dec.dataShards] {
					if !shardsflag[k] {
						// recovered data should be recycled
						recovered = append(recovered, shards[k])
					}
				}
				// Recover the extra shards
				for k := range shards[dec.dataShards:] {
					if shards[k] != nil {
						xmitBuf.Put(shards[k])
					}
				}

The above fix resolves most of them, but there is still the case where shard recovery fails. In this case, all of the shards which were taken from the sync.Pool should be returned. This is pretty substantial, as there is always at least one missing data shard and 30 additional parity shards. That amounts to around 45k RAM leak per unrecoverable transmission error.

Here's the code for that, which immediately follows the code above:

			} else {
				// Couldn't reconstruct. Recover all shards.
				for k := range shards {
					if shards[k] != nil && !shardsflag[k] {
						xmitBuf.Put(shards[k])
					}
				}
			}

I have added some code to clean up after updater() and monitor() close. I prefer to have it here as it catches situations where a UDPSession enters an error state and closes, even if the client doesn't explicitly call Close().

func (s *UDPSession) updater() {
	defer func() {
		if s.txqueue != nil && len(s.txqueue) > 0 {
			for i, _ := range s.txqueue {
				if len(s.txqueue[i].Buffers) > 0 {
					xmitBuf.Put([]byte(s.txqueue[i].Buffers[0]))
				}
			}
		}

		// Clean FECDecoder queue
		if s.fecDecoder != nil && len(s.fecDecoder.rx) > 0 {
			s.fecDecoder.rx = s.fecDecoder.freeRange(0, len(s.fecDecoder.rx), s.fecDecoder.rx)
		}

		s.fecDecoder = nil
		s.fecEncoder = nil
	}()

func (l *Listener) monitor() {
	defer func() {
		for _, s := range l.sessions {
			// Free FECDecoder frames
			if s.fecDecoder != nil && len(s.fecDecoder.rx) > 0 {
				s.fecDecoder.freeRange(0, len(s.fecDecoder.rx), s.fecDecoder.rx)
			}

			s.fecDecoder = nil
			s.fecEncoder = nil
		}

	}()

Finally, the above cleanup code introduces a possible race condition, where a UDPSession is aborted while decode() is still in progress. This can cause a panic in freeRange. This is resolved by simply checking for a zero-length slice at the beginning:

func (dec *fecDecoder) freeRange(first, n int, q []fecElement) []fecElement {
	if len(q) == 0 {
		// Prevents panic on race condition when a session is unexpectedly closed and
		// updater() or monitor() cleans up the rx queue while decode() is in progress.
		return q
	}

I'm still testing the above changes, but thought I would paste them here for your review.

xtaci · 2019-11-06T03:18:20Z

EDIT: looks like
dec.rx = dec.freeRange(first, numshard, dec.rx)
[first, numshards] contains datashard + parityshard

xtaci · 2019-11-06T04:26:00Z

and even if parityshards were not put back while reconstructing, the rxlimit rule and expire rule can still recycle them.

WinstonPrivacy · 2019-11-06T14:12:18Z

Sorry, it looks like a message was deleted and I'm not following. Can you clarify?

WinstonPrivacy · 2019-11-06T14:13:18Z

Also, we ran almost 12 hours without a major memory leak with the changes above. I don't think they are perfect though because I'm seeing more dropped smux connections.

xtaci · 2019-11-06T14:31:00Z

		if numDataShard == dec.dataShards {
			// case 1: no loss on data shards
			dec.rx = dec.freeRange(first, numshard, dec.rx)
		} else if numshard >= dec.dataShards {
			// case 2: loss on data shards, but it's recoverable from parity shards
			for k := range shards {
				if shards[k] != nil {
					dlen := len(shards[k])
					shards[k] = shards[k][:maxlen]
					copy(shards[k][dlen:], dec.zeros)
				} else {
					shards[k] = xmitBuf.Get().([]byte)[:0]
				}
			}
			if err := dec.codec.ReconstructData(shards); err == nil {
				for k := range shards[:dec.dataShards] {
					if !shardsflag[k] {
						// recovered data should be recycled
						recovered = append(recovered, shards[k])
					}
				}
			}
			dec.rx = dec.freeRange(first, numshard, dec.rx)  
		}

check:
dec.rx = dec.freeRange(first, numshard, dec.rx)
[first, numshards] contains datashard + parityshard

and even if parityshards were not put back while reconstructing, the rxlimit rule and expire rule can still recycle them.

WinstonPrivacy · 2019-11-06T22:41:02Z

Yes, I put counters on all of those. freeRange won't return the 30 shards at the end, because they were taken from the sync.pool up above.

xtaci · 2019-11-07T05:29:51Z

I know your point now, there are some problem with reedsolomon codes library, the parity shards recovery is unstable, sometimes it will allocate a new buffer with incorrect capacity, sometimes not. so the only way is not to preallocate buffer to parityshards.

xtaci · 2019-11-07T06:44:48Z

https://github.com/xtaci/kcp-go/releases/tag/v5.4.19

WinstonPrivacy · 2019-11-09T19:51:08Z

Interesting. I will try this out in preparation for our next release and let you know how it goes!

WinstonPrivacy · 2019-11-10T19:51:20Z

Just tried this version and it doesn't resolve the memory leak... if anything, it might be worse than before.

xtaci · 2019-11-11T03:06:24Z

another way is to use Reconstruct instead of ReconstructData to make parity recovery fully controllable.
but I've tested , the length might be smaller than 1500. we should contact klauspost to fix that.

WinstonPrivacy · 2019-11-11T20:34:32Z

I'm trying to narrow down the problem now. The issue is that RAM is under control but kcp sessions seem to be breaking at a much higher rate... possibly ReconstructData is re-using the allocated []byte buffers instead of copying them? If that's the case, then the new code would potentially result in sync.Pool allocating them for some other session to use, resulting in a conflict.

WinstonPrivacy · 2019-11-11T20:45:08Z

Looks like that is the problem. When I comment out the code which Puts() the parity shards back in the sync.Pool, KCP sessions are stable again. So perhaps three possible solutions:

Don't Get() the parity shards from the sync.Pool.
Implement Reconstruct to be sync.Pool friendly
Modify reedsolomon.Encoder to copy bytes in a manner friendlier to sync.Pool

WinstonPrivacy · 2019-11-12T16:12:45Z

Shame on me, I think your fix may have actually worked after all. After reading the reedsolomon docs, I saw that it allocates []byte slices when the missing shards are nil. I went back to my code and saw that I was still putting those back in the sync.Pool (a change made from an earlier version). Removing that has restored stability to the connections and the leak rate has gone down quite a bit.

Still not zero, but getting closer.

WinstonPrivacy · 2019-11-13T16:04:34Z

So far so good! Sync pool leaks are down to < 1%, which allows us to run for a few days before restarting.

WinstonPrivacy · 2019-11-20T22:38:18Z

Can't rule out all memory leaks but things are a lot better now. Am closing out this issue.

WinstonPrivacy closed this as completed Nov 20, 2019

sync.Pool leaks #146

sync.Pool leaks #146

Comments

WinstonPrivacy commented Oct 26, 2019

xtaci commented Oct 26, 2019

WinstonPrivacy commented Oct 26, 2019

WinstonPrivacy commented Oct 26, 2019

xtaci commented Oct 26, 2019

WinstonPrivacy commented Oct 26, 2019 via email

xtaci commented Oct 27, 2019

WinstonPrivacy commented Oct 27, 2019 via email

xtaci commented Oct 28, 2019

WinstonPrivacy commented Oct 28, 2019 via email

WinstonPrivacy commented Oct 28, 2019

xtaci commented Oct 29, 2019 • edited Loading

WinstonPrivacy commented Oct 29, 2019 via email

xtaci commented Oct 30, 2019

xtaci commented Oct 30, 2019

WinstonPrivacy commented Oct 30, 2019 via email

xtaci commented Oct 30, 2019

WinstonPrivacy commented Oct 30, 2019 via email

WinstonPrivacy commented Oct 30, 2019 • edited Loading

WinstonPrivacy commented Oct 30, 2019

WinstonPrivacy commented Oct 31, 2019

xtaci commented Nov 1, 2019

WinstonPrivacy commented Nov 1, 2019 via email

xtaci commented Nov 1, 2019

xtaci commented Nov 1, 2019 • edited Loading

WinstonPrivacy commented Nov 1, 2019 via email

xtaci commented Nov 1, 2019 • edited Loading

WinstonPrivacy commented Nov 1, 2019 via email

xtaci commented Nov 1, 2019

WinstonPrivacy commented Nov 1, 2019

WinstonPrivacy commented Nov 1, 2019

WinstonPrivacy commented Nov 4, 2019 via email

WinstonPrivacy commented Nov 4, 2019

xtaci commented Nov 4, 2019 • edited Loading

xtaci commented Nov 4, 2019 • edited Loading

WinstonPrivacy commented Nov 4, 2019 via email

WinstonPrivacy commented Nov 4, 2019

WinstonPrivacy commented Nov 5, 2019

WinstonPrivacy commented Nov 5, 2019

xtaci commented Nov 5, 2019

xtaci commented Nov 5, 2019 • edited Loading

xtaci commented Nov 5, 2019

WinstonPrivacy commented Nov 5, 2019 via email

WinstonPrivacy commented Nov 5, 2019

xtaci commented Nov 6, 2019 • edited Loading

xtaci commented Nov 6, 2019 • edited Loading

WinstonPrivacy commented Nov 6, 2019

WinstonPrivacy commented Nov 6, 2019

xtaci commented Nov 6, 2019

WinstonPrivacy commented Nov 6, 2019

xtaci commented Nov 7, 2019

xtaci commented Nov 7, 2019

WinstonPrivacy commented Nov 9, 2019

WinstonPrivacy commented Nov 10, 2019

xtaci commented Nov 11, 2019 • edited Loading

WinstonPrivacy commented Nov 11, 2019

WinstonPrivacy commented Nov 11, 2019 • edited Loading

WinstonPrivacy commented Nov 12, 2019

WinstonPrivacy commented Nov 13, 2019

WinstonPrivacy commented Nov 20, 2019

xtaci commented Oct 29, 2019 •

edited

Loading

WinstonPrivacy commented Oct 30, 2019 •

edited

Loading

xtaci commented Nov 1, 2019 •

edited

Loading

xtaci commented Nov 1, 2019 •

edited

Loading

xtaci commented Nov 4, 2019 •

edited

Loading

xtaci commented Nov 4, 2019 •

edited

Loading

xtaci commented Nov 5, 2019 •

edited

Loading

xtaci commented Nov 6, 2019 •

edited

Loading

xtaci commented Nov 6, 2019 •

edited

Loading

xtaci commented Nov 11, 2019 •

edited

Loading

WinstonPrivacy commented Nov 11, 2019 •

edited

Loading