Opening crafting interface causes server crash if storage network too large #1077

Open
ab9rf opened this Issue Mar 23, 2015 · 16 comments

Comments

Projects
None yet
5 participants
@ab9rf

ab9rf commented Mar 23, 2015

We have a fairly mature world with a large storage network (~250,000 items, ~5000 distinct item types). Attempting to open a crafting interface in this world causes a BufferOverflowException at appeng.core.sync.packets.PacketMEInventoryUpdate.appendItem(PacketMEI
nventoryUpdate.java:176). This is an exception forced when the inventory packet exceeds 2 megabytes, which is likely simply the case for our environment. I've replicated it four times now: it happens reliably and consistently as soon as anyone tries to open a ME crafting interface.

This is with rv2 beta 14.

@ab9rf

This comment has been minimized.

Show comment
Hide comment
@ab9rf

ab9rf Mar 23, 2015

Additional information: hacking the class file for PacketMEInventoryUpdate to increase this limit to 16 megabytes avoids the crash.

ab9rf commented Mar 23, 2015

Additional information: hacking the class file for PacketMEInventoryUpdate to increase this limit to 16 megabytes avoids the crash.

@thatsIch

This comment has been minimized.

Show comment
Hide comment
@thatsIch

thatsIch Mar 23, 2015

Member

Can we have the crashlog please,
2mb limit should be more than enough.

Member

thatsIch commented Mar 23, 2015

Can we have the crashlog please,
2mb limit should be more than enough.

@ab9rf

This comment has been minimized.

Show comment
Hide comment
@ab9rf

ab9rf Mar 23, 2015

Crash log.

You say that the 2mb limit "should be more than enough" but it's obviously not since patching the class file to increase it to 16 megabytes resolved the issue. I'd rather not have to do this with each new release, even if it fairly trivial to do.

ab9rf commented Mar 23, 2015

Crash log.

You say that the 2mb limit "should be more than enough" but it's obviously not since patching the class file to increase it to 16 megabytes resolved the issue. I'd rather not have to do this with each new release, even if it fairly trivial to do.

@thatsIch

This comment has been minimized.

Show comment
Hide comment
@thatsIch

thatsIch Mar 23, 2015

Member

This indicates a bug, imagine your server shipping 16mb data every second. This would be pretty nuts.

Member

thatsIch commented Mar 23, 2015

This indicates a bug, imagine your server shipping 16mb data every second. This would be pretty nuts.

@ab9rf

This comment has been minimized.

Show comment
Hide comment
@ab9rf

ab9rf Mar 23, 2015

I don't really need to imagine that, as I have traffic monitoring set up and know what the network bandwidth is. Typical is around 1 meg per second per connected player, most of which is chunk data. It spikes briefly to around 10 megs per second when a player joins. The server still has plenty of reserve capacity even at that level.

I agree that sending 2 meg packets every second all the time would be less than ideal but it would not kill my server. However, this particular packet is only sent when a crafting interface is open. Even with the modification I've made neither the server nor the clients experience any identifiable ill effects as a result of increasing this limit, and the server no longer crashes.

I would agree that this is a design flaw (it should probably use multiple smaller packets rather than one ginormous one) but the bug is that the server crashes if you make the mistake of allowing your ME storage network to grow larger than the AE developers have arbitrarily decided is "reasonable".

ab9rf commented Mar 23, 2015

I don't really need to imagine that, as I have traffic monitoring set up and know what the network bandwidth is. Typical is around 1 meg per second per connected player, most of which is chunk data. It spikes briefly to around 10 megs per second when a player joins. The server still has plenty of reserve capacity even at that level.

I agree that sending 2 meg packets every second all the time would be less than ideal but it would not kill my server. However, this particular packet is only sent when a crafting interface is open. Even with the modification I've made neither the server nor the clients experience any identifiable ill effects as a result of increasing this limit, and the server no longer crashes.

I would agree that this is a design flaw (it should probably use multiple smaller packets rather than one ginormous one) but the bug is that the server crashes if you make the mistake of allowing your ME storage network to grow larger than the AE developers have arbitrarily decided is "reasonable".

@thatsIch

This comment has been minimized.

Show comment
Hide comment
@thatsIch

thatsIch Mar 23, 2015

Member

I think we will release a hotfix and patch it up properly in the future

Member

thatsIch commented Mar 23, 2015

I think we will release a hotfix and patch it up properly in the future

@yueh

This comment has been minimized.

Show comment
Hide comment
@yueh

yueh Mar 23, 2015

Member

To put some numbers into "reasonable".
2Mb is enough to hold about 150k-300k distinct items (without NBT) and this is before compressing it.
The amount mostly depends on the stacksize, 300k with stacks only < 255 items and 150k for only > 4294967295.

The problem starts with NBT data being used more and more (which can probably be considered a design flaw in other mods).
There is no real limit, so anyone can just attach a 2Mb string to an item. so just 8 distinct items is enough to trigger an error with a packet of 16Mb.
Even with splitting it into different packets can at some points trigger them again (except if we start splitting also NBT data in multiple packets)

Member

yueh commented Mar 23, 2015

To put some numbers into "reasonable".
2Mb is enough to hold about 150k-300k distinct items (without NBT) and this is before compressing it.
The amount mostly depends on the stacksize, 300k with stacks only < 255 items and 150k for only > 4294967295.

The problem starts with NBT data being used more and more (which can probably be considered a design flaw in other mods).
There is no real limit, so anyone can just attach a 2Mb string to an item. so just 8 distinct items is enough to trigger an error with a packet of 16Mb.
Even with splitting it into different packets can at some points trigger them again (except if we start splitting also NBT data in multiple packets)

@KamranMackey

This comment has been minimized.

Show comment
Hide comment
@KamranMackey

KamranMackey Mar 23, 2015

Contributor

@ab9rf Which class file did you do this in?

Contributor

KamranMackey commented Mar 23, 2015

@ab9rf Which class file did you do this in?

@mSt3in

This comment has been minimized.

Show comment
Hide comment
@mSt3in

mSt3in Mar 23, 2015

I didn't know that the entire inventory was sent to the client when opening a terminal.

Wouldn't it be possible to just send a list of ItemIDs and quanities...then on mouse-over, send the NTB for just that item? Wait, How are quantity changes to AE storage being sent to the client while the terminal is open now?

mSt3in commented Mar 23, 2015

I didn't know that the entire inventory was sent to the client when opening a terminal.

Wouldn't it be possible to just send a list of ItemIDs and quanities...then on mouse-over, send the NTB for just that item? Wait, How are quantity changes to AE storage being sent to the client while the terminal is open now?

@ab9rf

This comment has been minimized.

Show comment
Hide comment
@ab9rf

ab9rf Mar 23, 2015

@KamranMackey I used a Java class file editor to change constant 199 in file appeng/core/sync/packets/PacketMEInventoryUpdate.class (from rv2-beta-14, md5sum b3d9566f2c7a2c173d06e3b6267e546d) from 2097152 to 16777216. Fortunately, constant 199 is not used anywhere else in this class, so it's not necessary to create a new constant and patch the bytecode.

Obviously, do not rely on the constant number being the same from compilation to compilation; if you're not hacking the original rv2-beta-14 you should decompile the class first with javap.

ab9rf commented Mar 23, 2015

@KamranMackey I used a Java class file editor to change constant 199 in file appeng/core/sync/packets/PacketMEInventoryUpdate.class (from rv2-beta-14, md5sum b3d9566f2c7a2c173d06e3b6267e546d) from 2097152 to 16777216. Fortunately, constant 199 is not used anywhere else in this class, so it's not necessary to create a new constant and patch the bytecode.

Obviously, do not rely on the constant number being the same from compilation to compilation; if you're not hacking the original rv2-beta-14 you should decompile the class first with javap.

@ab9rf

This comment has been minimized.

Show comment
Hide comment
@ab9rf

ab9rf Mar 23, 2015

@mSt3in I also had the same idea of sending merely a flag saying "this object has NBT data which I'm not sending you yet", either all the time or only if the object has "a large amount" of NBT data, and then only sending the NBT data when and only when the client actually requests it. The client end of the deserializer would have to create a stub object (in Haskell, I'd just use a thunk, but this is Java so you don't get lazy evaluation for free) for any NBT data that was omitted from the original transaction.

I suspect that doing this for all NBT data would introduce a ton of latency. It might be possible to adaptively learn which attributes are needed up front and which are not, since it'll vary heavily by the mod, although that'd be some fairly clever code.

In the case of my server, it's probably Forestry; Forestry bees, tree saplings, and so forth are just swimming with NBT data. Since the only part that the client needs to render the icon in the interface is the species (to determine the bee color or sapling icon), it's probably the case that most of the Forestry-related NBT data doesn't need to be sent until the item is actually interacted with. This would require, however, either a custom set of serializers that are mod-detail aware, or else an adaptive learning algorithm that notices which attributes are most commonly requested by an inventory interface and learns to send those (and only those) over time.

ab9rf commented Mar 23, 2015

@mSt3in I also had the same idea of sending merely a flag saying "this object has NBT data which I'm not sending you yet", either all the time or only if the object has "a large amount" of NBT data, and then only sending the NBT data when and only when the client actually requests it. The client end of the deserializer would have to create a stub object (in Haskell, I'd just use a thunk, but this is Java so you don't get lazy evaluation for free) for any NBT data that was omitted from the original transaction.

I suspect that doing this for all NBT data would introduce a ton of latency. It might be possible to adaptively learn which attributes are needed up front and which are not, since it'll vary heavily by the mod, although that'd be some fairly clever code.

In the case of my server, it's probably Forestry; Forestry bees, tree saplings, and so forth are just swimming with NBT data. Since the only part that the client needs to render the icon in the interface is the species (to determine the bee color or sapling icon), it's probably the case that most of the Forestry-related NBT data doesn't need to be sent until the item is actually interacted with. This would require, however, either a custom set of serializers that are mod-detail aware, or else an adaptive learning algorithm that notices which attributes are most commonly requested by an inventory interface and learns to send those (and only those) over time.

@yueh

This comment has been minimized.

Show comment
Hide comment
@yueh

yueh Mar 23, 2015

Member

Except that this is not possible as the NBT defines what item it actually is. So we need to request it for the icon, the name (local filtering), tooltip and what not. So just the moment you want to filter, we already need to request every item in the system.

Also it would need to be mod aware and probably asm patch every method to add support for delayed NBT data. Otherwise it will fail left and right (on top of the guaranteed crashes with asm patching)

Member

yueh commented Mar 23, 2015

Except that this is not possible as the NBT defines what item it actually is. So we need to request it for the icon, the name (local filtering), tooltip and what not. So just the moment you want to filter, we already need to request every item in the system.

Also it would need to be mod aware and probably asm patch every method to add support for delayed NBT data. Otherwise it will fail left and right (on top of the guaranteed crashes with asm patching)

@mSt3in

This comment has been minimized.

Show comment
Hide comment
@mSt3in

mSt3in Mar 24, 2015

I follow. Without all the NTB data text-search becomes unusable. Moving the packet size up and up doesn't sounds like a good move either.

In all this time I never played with viewcells. Do they have any impact on the inventory packet sent?

Oh, and just how are quantity changes propagated to an open terminal? Retransmit inventory and re-apply current filter?

...

Feature idea time! dye-able wireless terminals. If separate storage networks in the same base/area are the answer, then it'd be nice to be able to access both and have an easy way to mark which terminal is which.

It would also be nice to have some sort of indication of when a network is reaching the breaking point. Maybe this is some info one could get from the network-tool.

mSt3in commented Mar 24, 2015

I follow. Without all the NTB data text-search becomes unusable. Moving the packet size up and up doesn't sounds like a good move either.

In all this time I never played with viewcells. Do they have any impact on the inventory packet sent?

Oh, and just how are quantity changes propagated to an open terminal? Retransmit inventory and re-apply current filter?

...

Feature idea time! dye-able wireless terminals. If separate storage networks in the same base/area are the answer, then it'd be nice to be able to access both and have an easy way to mark which terminal is which.

It would also be nice to have some sort of indication of when a network is reaching the breaking point. Maybe this is some info one could get from the network-tool.

@yueh

This comment has been minimized.

Show comment
Hide comment
@yueh

yueh Mar 24, 2015

Member

Not sure how the view cells exactly work, but I would be surprised when they are not client based.
So probably no change.

Changes to the inventory will cause a complete resync, but this is actually completely irrelevant for the issue itself. Most system will probably only use a few kb, if not less as we gzip it.
If we change it to a delta packet, it would mostly just reduce the used bandwidth, but nothing against the actual issue as the initial delta will always be the full inventory, so it will always crash.

Maybe a stream based approach might help, but it can also delay the list being fully transfered as well as cause more overhead overall. And it will only work until some mod dev thinks attaching a mp3 with several Mbs as NBT tag is a good idea to share it as item.

Member

yueh commented Mar 24, 2015

Not sure how the view cells exactly work, but I would be surprised when they are not client based.
So probably no change.

Changes to the inventory will cause a complete resync, but this is actually completely irrelevant for the issue itself. Most system will probably only use a few kb, if not less as we gzip it.
If we change it to a delta packet, it would mostly just reduce the used bandwidth, but nothing against the actual issue as the initial delta will always be the full inventory, so it will always crash.

Maybe a stream based approach might help, but it can also delay the list being fully transfered as well as cause more overhead overall. And it will only work until some mod dev thinks attaching a mp3 with several Mbs as NBT tag is a good idea to share it as item.

@thatsIch thatsIch closed this in a1f43ff Mar 25, 2015

thatsIch added a commit that referenced this issue Mar 25, 2015

Merge pull request #1078 from thatsIch/b-1077-hotfix-packet-size
Fixes #1077 Hotfix: Opening a terminal will not crash as fast anymore in systems with many items.

@thatsIch thatsIch reopened this Mar 25, 2015

@thatsIch

This comment has been minimized.

Show comment
Hide comment
@thatsIch

thatsIch Mar 25, 2015

Member

Reopen issue to keep it as a reminder

Member

thatsIch commented Mar 25, 2015

Reopen issue to keep it as a reminder

@yueh

This comment has been minimized.

Show comment
Hide comment
@yueh

yueh Aug 6, 2015

Member

The optimizations suggested in #1621 should also improve this issue.

Member

yueh commented Aug 6, 2015

The optimizations suggested in #1621 should also improve this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment