Some analysis about memory allocator in json-c #552

Open
dota17 opened this issue Mar 11, 2020 · 3 comments

@dota17
Member

dota17 commented Mar 11, 2020

When json-c parses a JSON string, it calls malloc()/calloc() to allocate memory for many structs and strings. The larger the JSON file, the more allocations it makes.

However, malloc()/calloc() have several disadvantages: they are slow, the memory must be freed, and they cause memory fragmentation. I think we don't need to worry about freeing or fragmentation, because parsing is short-lived and will not cause serious fragmentation. Allocation speed is the key factor for performance optimization. I parsed a 631k .json file from miloyip/nativejson-benchmark and got the following results:
parse time: 18.657ms

object                      request frequency
json_object                 11968
lh_table                    1961
lh_entry                    1961
Arraylist                   1049
Arraylist->size             1049
json->o.c_string.str.ptr    1466

As we can see, json_object, lh_table, lh_entry, Arraylist, Arraylist->size and json->o.c_string.str.ptr are all allocated many times. Taking json_object as an example, it is allocated 11968 times, which costs 4.379ms. So memory allocation takes up a significant share of the whole parse. If we want to improve json-c's performance, we could try to reduce the time spent on memory requests.

  • A slab allocator is a memory management mechanism designed for efficient allocation of objects. It is well suited to json_object, lh_table, lh_entry and Arraylist. However, when I tried to use it, I couldn't find or include linux/slab.h. I think it isn't exposed to user space. Maybe I'm using it the wrong way? In addition, we would need to create/destroy a kmem_cache for every struct, whereas with a memory pool we just need to create one memory block at the beginning. What's your opinion on this? We can discuss the implementation details later.

  • A memory pool is a good choice. I used imiskolee/mempool to allocate memory for json_object, and the parse time went down. Maybe we could find a suitable open-source memory pool or just write one. This takes some technical work.

  • Finally, as request: json_init_library #540 mentioned, many libraries let the user supply custom functions in place of malloc/calloc, leaving memory optimization up to the user. cJSON and Jansson are examples. This is easy to implement.
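
To make the last option concrete, here is a minimal sketch of what user-replaceable allocation functions could look like. All names here (alloc_hooks, set_alloc_hooks, lib_malloc, lib_calloc, lib_free, lib_alloc_count) are hypothetical and are not part of the json-c API; the counter is only there to show how the request frequencies above could be collected.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hook table; json-c does not expose these names. */
struct alloc_hooks {
    void *(*malloc_fn)(size_t size);
    void (*free_fn)(void *ptr);
};

/* Default to the standard allocator until the user installs something else. */
static struct alloc_hooks g_hooks = { malloc, free };
static size_t g_alloc_count = 0;

/* Install user-supplied functions; passing NULL restores the default. */
void set_alloc_hooks(void *(*malloc_fn)(size_t), void (*free_fn)(void *))
{
    g_hooks.malloc_fn = malloc_fn ? malloc_fn : malloc;
    g_hooks.free_fn = free_fn ? free_fn : free;
}

/* Every allocation inside the library would go through this wrapper. */
void *lib_malloc(size_t size)
{
    g_alloc_count++;
    return g_hooks.malloc_fn(size);
}

/* calloc built on top of the malloc hook, with an overflow check. */
void *lib_calloc(size_t n, size_t size)
{
    void *p;
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;
    p = lib_malloc(n * size);
    if (p != NULL)
        memset(p, 0, n * size);
    return p;
}

void lib_free(void *ptr)
{
    g_hooks.free_fn(ptr);
}

/* Number of allocation requests made so far. */
size_t lib_alloc_count(void)
{
    return g_alloc_count;
}
```

A user who wants a pool would call set_alloc_hooks() once before parsing and pass functions backed by that pool; everyone else keeps plain malloc/free.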

Comments from everyone are welcome. ha-ha

@hawicz
Member

hawicz commented Mar 11, 2020

linux/slab.h, or anything from linux will not be usable due to incompatible licenses.
imiskolee/mempool is also unusable for the same reason. Also, if it really came from nginx, then it would seem to be stolen code that is being illegally relicensed. Finally, given the lack of activity and sensible commit history, that looks like a toy project, not an actual usable allocator, and wouldn't be appropriate to use even if the licensing were ok.

I suspect something got missed in your analysis. Are all arrays in your sample data less than 32 entries? If not then there should be more allocations for "Arraylist->size" (which I assume is actually for "Arraylist->array") than for "Arraylist". Similarly, if any object has more than 16 fields, the lh_entry count should be more than lh_table.
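
As a rough illustration of the counting argument, the sketch below estimates how many allocations the backing array of one list needs for a given number of entries. The initial capacity of 32 comes from the threshold mentioned above; the doubling growth policy, the exact boundary (whether the 32nd or 33rd entry triggers growth), and the function name are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define INITIAL_CAPACITY 32 /* threshold mentioned in the comment above */

/*
 * Number of allocations (initial allocation plus each realloc) needed for
 * a backing array to hold n entries, ASSUMING the capacity doubles every
 * time the array is full.
 */
size_t backing_allocations(size_t n)
{
    size_t capacity = INITIAL_CAPACITY;
    size_t allocations = 1; /* the initial allocation always happens */

    while (n > capacity) {
        if (capacity > SIZE_MAX / 2)
            break; /* cannot double any further without overflow */
        capacity *= 2;
        allocations++;
    }
    return allocations;
}
```

Under these assumptions, an array that stays within the initial capacity costs exactly one backing allocation, so the two counts only match if no array in the sample ever grows.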

Also, any analysis of performance seems rather incomplete without details about what else is taking up time. If memory allocation is only 23% of the time (4.379/18.657), then wouldn't it make more sense to focus on the other 77% of the time spent instead?

@dota17
Member Author

dota17 commented Mar 12, 2020

Yes, imiskolee/mempool is inappropriate; I only used it as an example. If needed, I suggest we find a suitable open-source memory pool or just write one. The general idea is to request one large block of memory and hand out a piece of it for every object.
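
A minimal sketch of that general idea, written as a simple bump allocator: one malloc() up front, then each object gets the next aligned slice of the block. The names (struct arena, arena_init, arena_alloc, arena_destroy) and the 16-byte alignment are assumptions for illustration; this is not code from json-c or from any existing pool library, and it cannot free individual objects.

```c
#include <stddef.h>
#include <stdlib.h>

#define ARENA_ALIGN 16 /* assumed to be enough for the structs involved */

struct arena {
    unsigned char *base; /* start of the one large block */
    size_t capacity;     /* total size of the block in bytes */
    size_t used;         /* bytes handed out so far */
};

/* Request the large block once. Returns 0 on success, -1 on failure. */
int arena_init(struct arena *a, size_t capacity)
{
    a->base = (unsigned char *)malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Hand out the next aligned slice, or NULL if the block is exhausted. */
void *arena_alloc(struct arena *a, size_t size)
{
    size_t start = (a->used + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);

    if (start > a->capacity || size > a->capacity - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Release everything at once; individual objects are never freed. */
void arena_destroy(struct arena *a)
{
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

The trade-off is that memory is only returned when the whole block is destroyed, which suits a parse-then-discard pattern but not long-lived trees that are modified piece by piece.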

I did indeed miss the realloc calls before. The updated data is as follows:
parse time: 18.657ms

object                      request frequency
json_object                 11968
lh_table                    1961
lh_entry                    1961
Arraylist                   1049
Arraylist->array            1051
json->o.c_string.str.ptr    1466

Memory allocation time for json_object: 4.379ms
Memory allocation time for all objects: about 7ms
Memory allocation is 37.5% of the time (7/18.657). So far I have only analyzed the parts related to memory allocation. The time spent in memory allocation is not very large.

hawicz added a commit that referenced this issue May 25, 2020
…ed at https://github.com/json-c/json-c/wiki/Proposal:-struct-json_object-split

The current changes split out _only_ json_type_object, and thus have a number of hacks
 to allow the code to continue to build and work.

Originally  mentioned in issue #535.
When complete, this will probably invalidate #552.
This is likely to cause notable conflicts in any other significant un-merged
changes, such as PR#620.
@hawicz
Member

hawicz commented Jun 20, 2020

Fyi, I just pushed commit e26a119, which significantly improves the memory used for use cases that have a lot of small arrays. That, plus the changes from the json_object-split branch which I just merged in, might help to make it less useful to use a slab allocator.
I don't currently have any plans to investigate use of other allocators besides malloc, but it still seems worth checking out, so I'm leaving this ticket open.


2 participants