A huge amount of memory is consumed when saving a large database #3364

Open
m417z opened this issue May 15, 2024 · 5 comments
Labels
bug The issue describes a bug. It does not mean the bug has been reproduced by a developer.

Comments

@m417z
Contributor

m417z commented May 15, 2024

Operating System

Windows 11

x64dbg Version

Mar 27 2024

Describe the issue

I was playing with my XFG Marker Plugin. I had experienced significant db saving/loading delays after using it in the past, but this time it was pretty extreme and caused my system to nearly freeze, probably because this time I used it with a large binary of about 15 MB. Still, x64dbg consumed several GBs of RAM while trying to save the database, so there must be some opportunities for optimization.

Steps to reproduce

  1. Install the XFG Marker Plugin.
  2. Download the latest version of Windows.UI.Xaml.dll or use yours at C:\Windows\System32\Windows.UI.Xaml.dll, open it in x64dbg.
  3. Download symbols (not sure if required for reproduction).
  4. Press Ctrl+Shift+X to activate the XFG Marker Plugin. It can freeze the debugger for around 10 seconds.
  5. Close the debugger. Beware that it can consume all available RAM and freeze the computer, so save all documents first.

Attachments

No response

Edit: Found this issue: #2829. Perhaps JSON isn't the best format for this. Have you considered a format that's more optimized for performance, such as SQLite?

@m417z m417z added the bug The issue describes a bug. It does not mean the bug has been reproduced by a developer. label May 15, 2024
@mrexodia
Member

The whole database is stored in std::unordered_map containers, so that will use quite a bit of RAM if you add millions of labels/xrefs/whatever. The JSON (de)serialization can most definitely be optimized with something like RapidJSON and a SAX-like parser, but this just requires a lot of work so I never went through with it...
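
To get a feel for the per-entry cost of std::unordered_map mentioned above, here is a minimal sketch that counts the bytes a map requests through its allocator. Everything in it is an assumption for illustration: `CountingAllocator`, `g_liveBytes` and `measureBytesPerEntry` are hypothetical names, and the address-to-integer map is a stand-in, not x64dbg's actual label or xref structures (which hold richer values and would cost more).

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <new>
#include <unordered_map>
#include <utility>

// Bytes currently allocated through CountingAllocator (hypothetical helper).
static std::size_t g_liveBytes = 0;

// Minimal allocator that forwards to operator new/delete and tallies bytes.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() noexcept = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        g_liveBytes += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) noexcept {
        g_liveBytes -= n * sizeof(T);
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

// Fill a map with `count` entries and return the average bytes held per entry,
// including node overhead and the bucket array.
std::size_t measureBytesPerEntry(std::size_t count) {
    using Entry = std::pair<const std::uint64_t, std::uint64_t>;
    std::unordered_map<std::uint64_t, std::uint64_t,
                       std::hash<std::uint64_t>, std::equal_to<std::uint64_t>,
                       CountingAllocator<Entry>> map;
    if (count == 0) return 0;
    g_liveBytes = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Made-up addresses spaced 16 bytes apart.
        map.emplace(0x140000000ULL + i * 16, i);
    }
    return g_liveBytes / count;  // measured while the map is still alive
}
```

Calling `measureBytesPerEntry(1000000)` gives a number that depends on the standard library in use, but it should come out larger than the 16-byte key/value payload, since every node carries at least a link pointer and the bucket array adds to that. Multiplied by millions of entries, that overhead alone becomes noticeable.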

@m417z
Contributor Author

m417z commented May 15, 2024

After analysis, it's not too bad: It starts with 492 MB (probably due to symbols), then goes to ~5 GB during analysis, then drops to 2.9 GB when it's done, after xrefs and comments are added. Probably can be optimized, but can be worked with.

The more serious problem is when the DB is saved on close - memory usage goes to 10 GB and above, at which point I just killed x64dbg.

@mrexodia
Member

Yeah, the main reason is the jansson library. It essentially creates the whole JSON file as a DOM in memory and this causes the explosion. The solution would be to directly stream the JSON to the file, but it requires quite a lot of work and changes to the abstraction I did...

@m417z
Contributor Author

m417z commented May 15, 2024

Why does it have to be JSON?

I went with .json.gz for Winbindex and it's fairly inefficient. I'm not changing it for two reasons: it's already implemented and at this point barely worth the effort (will mainly make scheduled updates faster), and it allows the data to be easily accessed with standard tools, e.g.:

curl -s https://winbindex.m417z.com/data/by_filename_compressed/windows.ui.xaml.dll.json.gz | gzip -d | jq

But for x64dbg, it seems to me that having JSON, or any other specific format, has little benefit as long as x64dbg works well with it.

@mrexodia
Member

It has been like this for 9 years, so I honestly see no reason to change it (if only for backwards compatibility reasons)... The compression with lz4 makes it quite reasonable on-disk, pretty similar to how much memory it uses for the in-memory structure.

2 participants