Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drastic Performance Issues with Flat File Block List #3

Closed
aikar opened this issue Jul 13, 2016 · 19 comments
Closed

Drastic Performance Issues with Flat File Block List #3

aikar opened this issue Jul 13, 2016 · 19 comments
Labels

Comments

@aikar
Copy link

aikar commented Jul 13, 2016

The night is dark and full of terrors, and https://github.com/slipcor/TreeAssist/blob/master/src/me/itsatacoshop247/TreeAssist/blocklists/FlatFileBlockList.java is the night.

This implementation suffers from massive performance problems.

  1. IO activity is being done on the main thread. Instead of calling save(file), instead call String yaml = data.saveToString(); and then do the File write Async. One user of this plugin has a 4MB data file, which is small compared to what i fear this would grow to on a larger server.
  2. Saving is done every change. This is extremely ineffecient. Change this to only save to disk once per minute and in onDisale at least
  3. Using string splitting for testing. You should define your own object type that has its own YAML Serialization, with proper equals and hashCode methods, that the YAML file deserializes to on startup so no string manipulation is done at runtime outside of load.
  4. Using the config lists instead of maps for data storage. This requires iterating over every entry in the list to do the comparisons, which as the file gets larger, the performance gets slower. While you can use a list for YAML data storage, you should parse it out of the config at startup, and store it into a Map<Location, YourRecordObject>. Then isPlayerPlaced becomes a simple return blocks.hasKey(block.getLocation());

Doing this will fix the scalability problems of the FlatFile system, and will be super easy to auto upgrade existing data files.

@FatherWh0
Copy link

FatherWh0 commented Jul 13, 2016

This issue can be seen in my server's timings here:
https://timings.aikar.co/v2/?id=a56cc538fa234636ad1321c3a6c9e8db
Also, I made a vid demonstrating the effect of this issue in-game. If I set Handler Plugin Name: none the issue disappears.
https://youtu.be/67-LCU4u86g

@slipcor
Copy link
Owner

slipcor commented Jul 13, 2016

done: (untested!!)

  • async saving
  • saving every minute and on shutdown

to do:

  • change data layout to list
  • serialization

I might need some help with that. I never serialized stuff :) Any fine tutorials or similar solutions? I am at work right now so finishing and testing this will have to wait 6+ hours

@slipcor
Copy link
Owner

slipcor commented Jul 13, 2016

My first attempt at doing this :P

https://github.com/slipcor/TreeAssist/tree/TreeBlock

I will test this when I am at home, i.e. in 1-2 hrs

@slipcor
Copy link
Owner

slipcor commented Jul 13, 2016

Wrong issue linked in latest commit. Please report back with this version:

v5.10.21 - finish up github issue #4 - tested in a tiny environment. Backups are being made but use with caution!

https://ci2.craftyn.com/job/TreeAssist/

@slipcor
Copy link
Owner

slipcor commented Jul 21, 2016

@aikar sorry man, can you give me a bit more input about the race condition again? I lost my copypasta and the IRC rollback is empty -.-

@SidShakal
Copy link

SidShakal commented Aug 10, 2016

@slipcor I think I found the bit you're looking for in my logs.

(9:23:15 AM) Aikar: slipcor: you got a race condition risk in your save code
...
(9:23:42 AM) Aikar: you need to do the config.set sync, and then call config.saveToString(), then offload the file IO part async
(9:23:56 AM) Aikar: so the async tasks references the result string

@slipcor
Copy link
Owner

slipcor commented Aug 14, 2016

@SidShakal thank you so much!

@SidShakal
Copy link

SidShakal commented Aug 27, 2016

[Whoops. I guess this is what happens when I start writing one day and finish a day or two later: very long post.]

The performance issues are noticeable again, now that the laggy operation (dumping the entire blocklist to a YAML string) has been moved back to the main thread.

My server's blocklist has over 85,000 entries.
Testing with v5.10.23, here are the times I've seen:

  • importing old data.yml upon startup: over five minutes
  • saving the blocklist: somewhere between 4 and 7 seconds (seems to stabilize around 4 seconds after the server has been up for a while)
  • saving the blocklist on shutdown: 3-4 seconds

Actually, wait. I just shut my testing server down using Multicraft's "Stop" button, and it seemed to shut down gracefully. I started it back up, and TreeAssist throws the following exception:

[14:40:57] [Server thread/INFO]: [TreeAssist] Enabling TreeAssist v5.10.
[14:40:57] [Server thread/INFO]: [TreeAssist] debugging: off
[14:41:09] [Server thread/WARN]: org.bukkit.configuration.InvalidConfigurationEx
ception: org.yaml.snakeyaml.error.YAMLException: Could not deserialize object
[14:41:09] [Server thread/WARN]:        at org.bukkit.configuration.file.YamlCon
figuration.loadFromString(YamlConfiguration.java:56)
[14:41:09] [Server thread/WARN]:        at org.bukkit.configuration.file.FileCon
figuration.load(FileConfiguration.java:184)
[14:41:09] [Server thread/WARN]:        at org.bukkit.configuration.file.FileCon
figuration.load(FileConfiguration.java:130)
[14:41:09] [Server thread/WARN]:        at me.itsatacoshop247.TreeAssist.blockli
sts.FlatFileBlockList.initiate(FlatFileBlockList.java:41)
[14:41:09] [Server thread/WARN]:        at me.itsatacoshop247.TreeAssist.TreeAss
ist.onEnable(TreeAssist.java:209)
[14:41:09] [Server thread/WARN]:        at org.bukkit.plugin.java.JavaPlugin.set
Enabled(JavaPlugin.java:292)
[14:41:09] [Server thread/WARN]:        at org.bukkit.plugin.java.JavaPluginLoad
er.enablePlugin(JavaPluginLoader.java:340)
[14:41:09] [Server thread/WARN]:        at org.bukkit.plugin.SimplePluginManager
.enablePlugin(SimplePluginManager.java:405)
[14:41:09] [Server thread/WARN]:        at org.bukkit.craftbukkit.v1_10_R1.CraftServer.loadPlugin(CraftServer.java:362)
[14:41:09] [Server thread/WARN]:        at org.bukkit.craftbukkit.v1_10_R1.CraftServer.enablePlugins(CraftServer.java:322)
[14:41:09] [Server thread/WARN]:        at net.minecraft.server.v1_10_R1.MinecraftServer.t(MinecraftServer.java:412)
[14:41:09] [Server thread/WARN]:        at net.minecraft.server.v1_10_R1.MinecraftServer.l(MinecraftServer.java:377)
[14:41:09] [Server thread/WARN]:        at net.minecraft.server.v1_10_R1.MinecraftServer.a(MinecraftServer.java:332)
[14:41:09] [Server thread/WARN]:        at net.minecraft.server.v1_10_R1.DedicatedServer.init(DedicatedServer.java:271)
[14:41:09] [Server thread/WARN]:        at net.minecraft.server.v1_10_R1.MinecraftServer.run(MinecraftServer.java:535)
[14:41:09] [Server thread/WARN]:        at java.lang.Thread.run(Thread.java:745)
[14:41:09] [Server thread/WARN]: Caused by: org.yaml.snakeyaml.error.YAMLException: Could not deserialize object
[14:41:09] [Server thread/WARN]:        at org.bukkit.configuration.file.YamlConstructor$ConstructCustomObject.construct(YamlConstructor.java:37)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.constructor.BaseConstructor.constructObject(BaseConstructor.java:182)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.constructor.BaseConstructor.constructSequenceStep2(BaseConstructor.java:275)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.constructor.BaseConstructor.constructSequence(BaseConstructor.java:246)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.constructor.SafeConstructor$ConstructYamlSeq.construct(SafeConstructor.java:470)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.constructor.BaseConstructor.constructObject(BaseConstructor.java:182)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.constructor.BaseConstructor.constructMapping2ndStep(BaseConstructor.java:373)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.constructor.SafeConstructor.constructMapping2ndStep(SafeConstructor.java:147)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.constructor.BaseConstructor.constructMapping(BaseConstructor.java:354)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.constructor.SafeConstructor$ConstructYamlMap.construct(SafeConstructor.java:489)
[14:41:09] [Server thread/WARN]:        at org.bukkit.configuration.file.YamlConstructor$ConstructCustomObject.construct(YamlConstructor.java:26)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.constructor.BaseConstructor.constructObject(BaseConstructor.java:182)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.constructor.BaseConstructor.constructDocument(BaseConstructor.java:141)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:127)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:450)
[14:41:09] [Server thread/WARN]:        at org.yaml.snakeyaml.Yaml.load(Yaml.java:369)
[14:41:09] [Server thread/WARN]:        at org.bukkit.configuration.file.YamlConfiguration.loadFromString(YamlConfiguration.java:54)
[14:41:09] [Server thread/WARN]:        ... 15 more
[14:41:09] [Server thread/WARN]: Caused by: java.lang.IllegalArgumentException: Specified class does not exist ('me.itsatacosh')
[14:41:09] [Server thread/WARN]:        at org.bukkit.configuration.serialization.ConfigurationSerialization.deserializeObject(ConfigurationSerialization.java:185)
[14:41:09] [Server thread/WARN]:        at org.bukkit.configuration.file.YamlConstructor$ConstructCustomObject.construct(YamlConstructor.java:35)
[14:41:09] [Server thread/WARN]:        ... 31 more

This might deserve its own bug report.
Looking at the data.yml file immediately after that exception showed in the log, it appeared that it wrote a few entries and stopped mid-write.
Unfortunately, within about a minute of the exception being logged, the autosave feature kicked in and overwrote the data.yml with a completely blank file.

I have a backup of my old-format data.yml, so it's not entirely lost.

I've reproduced the issue. I don't know if it's that the YAML output is too big to keep in one string, the write gets cut off, or what, but in any case, it ends up in complete data loss upon attempting to load data.yml upon startup because the YAML parser chokes the instant it sees malformed YAML.

A few months back, I devised a replacement for FlatFileBlockList that basically spread the data out across lots of little files, minimizing the amount of data written at any given time. It worked well enough for my very low traffic server. However, I don't feel it's a good enough fix to propose it be merged into the official plugin.

I've also tried making the blocklist use the loaded data.yml configuration as a map (ConfigurationSection is basically a map), cutting out the cost of translating to and from the list. That works better, but still halts the server for several seconds when saving to string in the main thread as Aikar had suggested.

The problem seems to be that we keep trying to do this in YAML. There's no reason a server admin needs to be able to edit the blocklist by hand, so why should we convert the data to and from text all the time?

In my opinion, using YAML as a blocklist is fine for proof-of-concept, but not really appropriate for long-term use on a live server (or short-term use on a very active server). Having the plugin use a human-readable text file for a list of blocks that gets into the thousands or millions of entries is like using a hammer to twist a screw into place: you may be able do it, but it's not going to be pretty, and it'll take a lot more time and effort than it should.

Today I looked into using Bukkit's built-in Ebean database feature to implement a block, but it seems that it's not flexible enough to allow us to provide the server admin with a choice on whether to use it. There's apparently a fair amount of overhead involved with enabling the Ebeans server, and that is done in the plugin.yml itself, acted on during initialization, before we ever get a chance to find out whether the user has chosen to use the built-in blocklist or not. There are possible workarounds for this, but probably not worth it.

I think the solution is probably to use something like SQLite. Data can be indexed for quick retrieval, doesn't have to be loaded into memory all at once, and can be saved quickly. It solves all the problems that we seem to be having with a YAML blocklist.
CraftBukkit and Spigot come with an SQLite JDBC driver included, so there is no additional dependency needed. The only issue is someone has to learn how to code the thing.

I'll give making an SQLite-based blocklist a try. Shouldn't be too hard. I won't be doing anything asynchronously though, because I'm familiar enough with asynchronous stuff to know that I don't know how to do it well. If it turns out that we need to do things in separate threads, we can figure that out when we get to it.

@slipcor
Copy link
Owner

slipcor commented Aug 29, 2016

@SidShakal can you provide me with the contents of the messed up file by any chance? I wonder where it stops, how many MB / how many characters

@aikar any ideas as to why this could happen?

@SidShakal
Copy link

@slipcor Sure.
broken-data.yml.zip

For quick reference:

# wc --lines --chars --bytes data.yml
 467227 8617984 8617984 data.yml

As for my progress on an SQLite-backed blocklist (not that anyone asked), I've got one that seems operational, but because of various factors (mostly being noobish with regard to SQLite) it does not perform very quickly at the moment. In order to do it right, I'm thinking it'd need to go asynchronous, and that opens a whole can of worms I'm not entirely sure I want to open yet.

I want my contribution to be easy enough to understand that you'll have no problem accepting it. SQLite is shaping up to be pretty complicated.

It's also mega-overkill.

I'm thinking I'll set the SQLite blocklist aside and try another route. If the choke point is the blocklist's serialization to and deserialization from YAML, maybe the solution involves serializing to and deserializing from a binary format instead.

I'm still "meh" on the idea that the whole thing should be in memory at once, but I guess so far that hasn't been a problem for my server Maybe a solution to having the whole thing be in memory at once would be to divide the thing up by region, kind of like how Minecraft does with its world data, and I think how mcMMO does with its blocklist-like functionality.

I don't know exactly how mcMMO does it, nor do I aim to replicate whatever their method is, but I do know their directory tree mimics the Minecraft world data tree. I'd imagine a similar tree would be appropriate for a distributed blocklist:

TreeAssist/
  data/
    world/
      r.0.0.dat
      r.0.1.dat
      ...
    world_nether/
      r.0.0.dat
      ...
    ...

The r.{x}.{z}.dat files are mini-blocklists containing blocks that lie within the region identified by the region coords (x, z).
The idea is for only those mini-blocklists that correspond to regions that have chunks in memory to be in memory at any given time. When a chunk is loaded and it belongs to a region whose mini-blocklist we haven't yet loaded, we load the mini-blocklist. When the last loaded chunk belonging to a region we have loaded a mini-blocklist for is unloaded, we unload our mini-blocklist as well (saving it if it has been modified).
For each loaded mini-blocklist, we keep track of whether it has been modified since the last save. That way, we can save smaller portions.

On small maps this method may still result in most or all of the blocklist being held in memory at all times, I suppose. On dense servers with small maps the division of the blocklist by region might not be a huge advantage, but still, the use of a binary format instead of YAML will probably help. Alternatively, perhaps there could be a special variation for tiny-mapped servers that uses a mini-blocklist granularity other than by-region (possibly by-chunk, or some granularity between chunk and region). But that's probably not important to think about right now.

Okay I'm excited again. Time to read up on binary serialization of Java maps. Could serialize as a list, but better if maps can be serialized directly.

@SidShakal
Copy link

@slipcor Oh! And here's the data.yml before the import:

data.yml-beforeImport.zip

@aikar
Copy link
Author

aikar commented Sep 1, 2016

I don't think switching to binary serialization is necessary. You're on the right track with region-based storage, I was going to suggest that.

The issue is ultimately due to serializing the entire servers data into 1 file, meaning it has to 'deal with' data that may be related a chunk that hasn't been touched in 3 months.

Segmenting into regions (or even chunk based ,but using regions as folder names so like data/r0.0/1.0.yml), would be recommended.

That will also save memory too.

Or an even simpler format would be custom serialization, 1 block per line, stored in world:x,y,z: format

That could be built with a string builder so easily.

@slipcor
Copy link
Owner

slipcor commented Sep 1, 2016

Well yeah the custom one line serialization is what I used before. So going back to that general format could help with the absolute size, but will only help with the symptoms of the file sizes,
Alright so region and chunk based save files will be the solution then.

I checked the data yml, but I didn't find anything obvious about the part where it stopped, neither the line nor the character count seem obvious to me. Sad.

@aikar
Copy link
Author

aikar commented Sep 1, 2016

@slipcor the recommendation to go to simple format is to avoid the overhead of all the yaml generation. there is a lot of overhead in that framework. whereas iteration and string building (with a StringBuilder) would be very little overhead

The other alternative would be to clone the map , then pass the cloned map to the async saving of yaml so there is no risk of the data changing mid save.

Thats likely the simplest fix, as a clone operation should be the fastest option out of all anyways.

As for the storage part, its not about file size, but about reducing "Operating Data" so that even if nothing changed about the current save style (sync toString()), it would have significantly less data to be serializing per region/chunk.

@slipcor
Copy link
Owner

slipcor commented Sep 6, 2016

So either

A) plain text line by line

B) yaml map cloning (of the whole file then)

Both would be saving async and file by file [chunk by chunk]

I think I will go with the stringbuilder and plain java file saving method. I don't like yaml anyways. I will pull something later and nag you again :D

@slipcor slipcor mentioned this issue Sep 8, 2016
slipcor added a commit that referenced this issue Sep 8, 2016
v5.10.24 - implement a new way of saving data files. Actually the system before seems to not have worked properly. Imports old and last versions! - addresses issue #3
@slipcor
Copy link
Owner

slipcor commented Sep 8, 2016

@SidShakal please try https://ci2.craftyn.com/job/TreeAssist/lastSuccessfulBuild/artifact/target/TreeAssist.jar

I tested it locally, it imports old and last version and it seems to work properly for me. Keep your backups though, actually, I think you should use your backups right now, because of the issues we had in the transfer to the last version.

@SidShakal
Copy link

@slipcor When using the data.yml in the attached data.yml.zip, I get the following error on startup:

[17:56:07] [Server thread/ERROR]: Error occurred while enabling TreeAssist v5.10
.24 (Is it up to date?)
java.lang.NullPointerException
        at me.itsatacoshop247.TreeAssist.blocklists.FlatFileBlockList.initiate(F
latFileBlockList.java:195) ~[?:?]
        at me.itsatacoshop247.TreeAssist.TreeAssist.onEnable(TreeAssist.java:209
) ~[?:?]
        at org.bukkit.plugin.java.JavaPlugin.setEnabled(JavaPlugin.java:292) ~[s
pigot-1.10.2.jar:git-Spigot-90f61bc-83a9dbd]
        at org.bukkit.plugin.java.JavaPluginLoader.enablePlugin(JavaPluginLoader
.java:340) [spigot-1.10.2.jar:git-Spigot-90f61bc-83a9dbd]
        at org.bukkit.plugin.SimplePluginManager.enablePlugin(SimplePluginManage
r.java:405) [spigot-1.10.2.jar:git-Spigot-90f61bc-83a9dbd]
        at org.bukkit.craftbukkit.v1_10_R1.CraftServer.loadPlugin(CraftServer.java:362) [spigot-1.10.2.jar:git-Spigot-90f61bc-83a9dbd]
        at org.bukkit.craftbukkit.v1_10_R1.CraftServer.enablePlugins(CraftServer.java:322) [spigot-1.10.2.jar:git-Spigot-90f61bc-83a9dbd]
        at net.minecraft.server.v1_10_R1.MinecraftServer.t(MinecraftServer.java:412) [spigot-1.10.2.jar:git-Spigot-90f61bc-83a9dbd]
        at net.minecraft.server.v1_10_R1.MinecraftServer.l(MinecraftServer.java:377) [spigot-1.10.2.jar:git-Spigot-90f61bc-83a9dbd]
        at net.minecraft.server.v1_10_R1.MinecraftServer.a(MinecraftServer.java:332) [spigot-1.10.2.jar:git-Spigot-90f61bc-83a9dbd]
        at net.minecraft.server.v1_10_R1.DedicatedServer.init(DedicatedServer.java:271) [spigot-1.10.2.jar:git-Spigot-90f61bc-83a9dbd]
        at net.minecraft.server.v1_10_R1.MinecraftServer.run(MinecraftServer.java:535) [spigot-1.10.2.jar:git-Spigot-90f61bc-83a9dbd]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]

However, running it with the data.yml from the previously-attached (data.yml-beforeImport.zip)[https://github.com/slipcor/TreeAssist/files/449747/data.yml-beforeImport.zip] seems to load fine.

@SidShakal
Copy link

@slipcor It still ties up the main thread for several seconds when saving.

@slipcor
Copy link
Owner

slipcor commented Oct 30, 2016

Then I have no idea what to do about it. I did two separate ways of doing it on another thread and in a different way of saving it.

I looked at your data yml and I am out of ideas. I see nothing wrong :(

@slipcor slipcor closed this as completed Feb 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants