Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ActualSize much less than real file size. #86

Closed
alhah opened this issue Oct 16, 2018 · 26 comments
Closed

ActualSize much less than real file size. #86

alhah opened this issue Oct 16, 2018 · 26 comments

Comments

@alhah
Copy link

alhah commented Oct 16, 2018

The language of MMKV

Java

The version of MMKV

v1.0.10

The platform of MMKV

Android

What's the issue?

OOM caused by mmkv file size too large.
loading [***] with file size 134217728, oldActualSize 6788724, newActualSize 6788839

Only 1000+ key-values occupies 130M.
What's the possible reason for actualSize much less than file size, multi-process read & write in single process mode?

@lingol
Copy link
Collaborator

lingol commented Oct 16, 2018

It's possible. It's not about how many items you store, but how big your items are.

@alhah
Copy link
Author

alhah commented Oct 16, 2018

Should actualSize equals file size?

@lingol
Copy link
Collaborator

lingol commented Oct 16, 2018

No.

@lingol
Copy link
Collaborator

lingol commented Oct 17, 2018

And I don't recommend storing big data inside MMKV, at lease not in current release.
Checkout the FAQ for more information.

@alhah
Copy link
Author

alhah commented Oct 17, 2018

Thank you. We didn't stored big date.
I check the big mmkv file with hex friend, only first 4M is valid data, remaining 130M is "0".

@lingol
Copy link
Collaborator

lingol commented Oct 17, 2018

That's weird. Can you send a reproducible demo project to me? guoling@tencent.com

@alhah
Copy link
Author

alhah commented Oct 17, 2018

I don't know the reproduce step, I got the file from online gray version.

The first line is like this, the actual size is same with valid data size.
69B93F00 A6DE021C 31303634 32373733 37396C61 73745F75

The last line is like this, which means “ "mRealTime":false}] ”, seems like the end of a string set.
226D5265 616C5469 6D65223A 66616C73 657D5D00 0000

I am checking the value size key by key.

@lingol
Copy link
Collaborator

lingol commented Oct 17, 2018

By the way, why are you using MULTI_PROCESS_MODE in a single process App?

@alhah
Copy link
Author

alhah commented Oct 17, 2018

It's a multi process App, I mean maybe we used SINGLE_PROCESS_MODE in some MULTI_PROCESS_MODE cases,

By the way, the last value mentioned before is actually a Json String of List, is there any possible the calculate of Json size occurred error?

@lingol
Copy link
Collaborator

lingol commented Oct 17, 2018

Json is just a String, nothing special. Is there any chance that there're some big items stored in MMKV and deleted afterward?

@alhah
Copy link
Author

alhah commented Oct 17, 2018

I am not very sure of that, need further check, what's the exact definition of "big"?

@lingol
Copy link
Collaborator

lingol commented Oct 17, 2018

Items with size of 10K+ is consider big.

@alhah
Copy link
Author

alhah commented Oct 17, 2018

Do you mean remove big items would not trigger file resize?
I add some log in MiniPBCoder::decodeOneMap:
{
int i = 0;
int j = 0;
while (!m_inputData->isAtEnd()) {
const auto &key = m_inputData->readString();
if (key.length() > 0) {
i++;
auto value = m_inputData->readData();
if (value.length() > 0) {
j += value.length();
dic[key] = move(value);
} else {
dic.erase(key);
}
}
}
MMKVInfo("total %d keys, value size %d", i, j);
}
loading [] with 4176233 size in total, file size is 134217728
loading [
] with crc 4170586551 sequence 4
total 3412 keys, value size 4081561
loaded [***] with 1111 values

@alhah
Copy link
Author

alhah commented Oct 17, 2018

I think I have found the reason, the futureUsage is over estimated:
size_t futureUsage = newSize * std::max<size_t>(8, (m_dic.size() + 1) / 2);

My last value size is 130K, m_dic.size() is 1000+, then the file will resize to 130M, but most of it will never been used.

This means, if I store thousands of keys in a file, it will very easy to cause OOM.

@lingol
Copy link
Collaborator

lingol commented Oct 17, 2018

Oh I see...Let me think about that.

@lingol
Copy link
Collaborator

lingol commented Oct 22, 2018

Fixed with this commit.

@alhah
Copy link
Author

alhah commented Oct 22, 2018

I think it's good enough for me, thank you. How far will this release?

By the way, I found there is no reduce file size operation except "clearAll", if I store a lot of key-values in one file and then delete most of them in some time, it seems will waste some disk space.

Should I avoid this kind of usage as far as possible?

@lingol
Copy link
Collaborator

lingol commented Oct 22, 2018

That's a good suggestion. A trim operation should be supported.

And by the way, the next minor release is not on schedule yet. I'll let you know by commenting on this issue when that happens.

@alhah
Copy link
Author

alhah commented Oct 22, 2018

Waiting for good news, thank you very much!

@lingol
Copy link
Collaborator

lingol commented Nov 14, 2018

Release with v1.0.13, check it out.

@lingol lingol closed this as completed Nov 14, 2018
@alhah
Copy link
Author

alhah commented Nov 14, 2018

Wonderful!

@alhah
Copy link
Author

alhah commented Nov 20, 2018

It seems use average item size to calc future usage does not merge into android code.

@lingol
Copy link
Collaborator

lingol commented Nov 20, 2018

It seems use average item size to calc future usage does not merge into android code.

I can't believe these, looks like there's a mistake on my local git repo.
Will fix soon.

@lingol lingol reopened this Nov 20, 2018
@lingol
Copy link
Collaborator

lingol commented Nov 21, 2018

Fixed with this commit. A new version will likely come out next week.

@alhah
Copy link
Author

alhah commented Nov 21, 2018

Look forward to it, thank you.

@lingol
Copy link
Collaborator

lingol commented Nov 30, 2018

Released with v1.0.14.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants