Realm file size #709

Closed
matthewcheok opened this issue Jul 30, 2014 · 24 comments

Comments

@matthewcheok

It seems the realm file has a chance of doubling in file size on each app launch. What gives?
Edit: demo project at https://github.com/matthewcheok/Realm-JSON

@tgoyne
Member

tgoyne commented Jul 30, 2014

+ [RLMObject createInRealm:withJSONArray:] is adding the updated or created object to the realm, but that's already done by [RLMObject mc_createOrUpdateInRealm:withJSONDictionary:], so each time the data is refreshed all of the objects being updated are duplicated.

@jpsim
Contributor

jpsim commented Jul 30, 2014

This one is interesting... I was able to reproduce the issue. The strange part is that it doesn't double in size on every launch, which leads me to think there must be a race condition somewhere.

Also note that no content is duplicated to cause this increase in file size, as there are no duplicate entries visible in the Realm Browser.

Hopefully someone from the core team can look into why these two realm files (with the same entries) are different sizes:

http://static.realm.io/debug/small_realm.zip
http://static.realm.io/debug/large_realm.zip

/cc @kspangsege @rrrlasse @finnschiermer

@astigsen
Contributor

My guess is that somewhere in the binding we are holding on to a read
transaction, preventing reuse of the old space.

@kspangsege
Contributor

Each Realm file does contain a "map" of what it thinks is free space inside
the file (Group::m_free_positions, Group::m_free_lengths). There is a
chance that the problem is due to a kind of corruption in this map.
Unfortunately we don't have code to verify the consistency of this
free-space map relative to the used space (that which is reachable from the
root node). I'll write this code and see what it tells me.
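
As a rough illustration of the kind of consistency check described above, here is a minimal Python sketch. It is not Realm core code: the function name, the (position, length) representation, and passing in the used blocks as a precomputed list are all assumptions made for illustration, and the file header is ignored. It checks that free blocks and used blocks (those reachable from the root) never overlap and together account for every byte of the file.

```python
def check_free_space_map(file_size, free_positions, free_lengths, used_blocks):
    """Return a list of problems found; an empty list means the map is consistent.

    free_positions / free_lengths: parallel lists, loosely modelled on
    Group::m_free_positions and Group::m_free_lengths (hypothetical layout).
    used_blocks: (position, length) pairs assumed to be reachable from the root.
    """
    if len(free_positions) != len(free_lengths):
        return ["free-space arrays differ in length"]

    # Merge both kinds of block into one list ordered by position.
    blocks = [(pos, length, "free") for pos, length in zip(free_positions, free_lengths)]
    blocks += [(pos, length, "used") for pos, length in used_blocks]
    blocks.sort()

    problems = []
    cursor = 0  # first byte not yet accounted for
    for pos, length, kind in blocks:
        if length <= 0:
            problems.append(f"{kind} block at {pos} has non-positive length")
            continue
        if pos < cursor:
            problems.append(f"{kind} block at {pos} overlaps the previous block")
        elif pos > cursor:
            problems.append(f"bytes {cursor}..{pos - 1} are neither free nor used")
        cursor = max(cursor, pos + length)

    if cursor < file_size:
        problems.append(f"bytes {cursor}..{file_size - 1} are neither free nor used")
    elif cursor > file_size:
        problems.append("a block extends past the end of the file")
    return problems
```

A gap that is neither free nor used would be leaked space that can never be reused, which is one way a file could keep growing without any visible duplicate entries.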

@matthewcheok
Author

Any possible fixes or workarounds? I'm really keen to use Realm in a production app.

@finnschiermer

This might not be an error at all.

As soon as multiple realms are in use in the app, timing differences may lead to different file layouts. There are two causes of wasted space: fragmentation and retained versions.

Fragmentation may occur and cause a file size that is significantly larger than the contained data. For larger files, fragmentation is usually a small problem. But note that, in the context of our leaf sizes, both of the files presented here must be considered small, meaning that significant fragmentation may occur.

Retained versions hold data for realms that are older than the most recent one. As these versions are released, their space becomes free, but the free areas can only be taken into use by commits occurring after the release.

Even though the application exits and the lock file is removed, different file sizes may still result, because the database is not "compacted" when the app exits. When the app exits, all retained versions are freed, BUT all the space they consumed will have been allocated in the file, scattered around in it, and as there is no compaction, the file does not shrink.

So file size depends very much on the exact operation of the app, especially for small sizes and for applications which (at some point in their life) have many retained versions.

Core limits the number of retained versions to 100, if I remember correctly. If each of these versions holds changes to most of the data, each version may take up as much space as the final version.

/Finn
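
The effect of retained versions described above can be sketched with a toy model in Python. This is only an illustration under strong assumptions, not how Realm core works internally: every commit is assumed to rewrite the whole data set, fragmentation is ignored, and the class and method names (`ToyVersionedFile`, `commit`, `release_retained`, `run`) are hypothetical.

```python
class ToyVersionedFile:
    """Toy copy-on-write file in which every commit writes a full new version."""

    def __init__(self, version_size):
        self.version_size = version_size
        self.file_size = 0          # bytes allocated in the file; never shrinks
        self.free_bytes = 0         # space inside the file available for reuse
        self.retained_versions = 0  # old versions still held by some reader
        self.has_current = False

    def _allocate(self, nbytes):
        # Reuse free space first; grow the file only for the remainder.
        reused = min(self.free_bytes, nbytes)
        self.free_bytes -= reused
        self.file_size += nbytes - reused

    def commit(self, reader_holds_previous=False):
        # The new version is written before the previous one can be freed,
        # so space released now only helps commits that come later.
        self._allocate(self.version_size)
        if self.has_current:
            if reader_holds_previous:
                self.retained_versions += 1
            else:
                self.free_bytes += self.version_size
        self.has_current = True

    def release_retained(self):
        # Freed space stays inside the file: there is no compaction.
        self.free_bytes += self.retained_versions * self.version_size
        self.retained_versions = 0


def run(commits, reader_holds_previous):
    f = ToyVersionedFile(version_size=1_000)
    for _ in range(commits):
        f.commit(reader_holds_previous)
    f.release_retained()  # e.g. the app exits and all readers go away
    return f.file_size
```

In this model, `run(10, False)` and `run(10, True)` end with the same single live version, but the second file is five times larger because every old version was still retained when the next commit needed space.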

@finnschiermer

I didn't remember correctly. There is no limit to the number of retained
versions.

@finnschiermer

Is there a possibility that the binding might be holding on to old versions,
for example by always creating new realm objects instead of reusing the
latest one?

@matthewcheok
Author

[Image: screen shot 2014-07-31 at 5 03 53 pm]

It's not always creating new objects because in the realm browser there are the same number of objects before and after. In my testing (in the above screenshot) I can't consider the file size to be small.

@finnschiermer

I'm not that deep into the Objective-C binding, so I might be asking a really stupid question here: where in the source code do I find the begin and end write transactions?

And does the block executed after the GET operation run on the main thread or on another thread?

@matthewcheok
Author

The begin and end transactions are in the category method. And yes, the GET happens on a background thread. Does [RLMRealm defaultRealm] cache the default realm across threads?

@matthewcheok
Author

I tried updating on the main thread and still see the same issue.

@finnschiermer

I must agree that 700 MB+ cannot be considered small.
Realm files start out with a size of 4 KB. They double in size whenever we run out of space,
until they reach 128 MB. Then they grow in increments of 128 MB.
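
The growth rule just described can be written out as a small Python sketch. The function names are hypothetical, and binary units (1 MB = 1024 × 1024 bytes) are an assumption; this only restates the rule and is not taken from the core source.

```python
KB = 1024
MB = 1024 * KB


def next_file_size(current_size, initial=4 * KB, threshold=128 * MB):
    """Size after one growth step: start at 4 KB, double up to 128 MB,
    then grow in fixed 128 MB increments."""
    if current_size < initial:
        return initial
    if current_size < threshold:
        return current_size * 2
    return current_size + threshold


def size_needed_for(data_bytes):
    """Smallest file size under this rule that can hold data_bytes."""
    size = next_file_size(0)
    while size < data_bytes:
        size = next_file_size(size)
    return size
```

Under this rule, a file that needs to hold 700 MB ends up at 768 MB, since the steps above 128 MB are 256, 384, 512, 640, and 768 MB.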

You're right, it does look like a bug.

With regard to caching Realm objects, it is my understanding that the Objective-C binding uses thread-specific caches.

@finnschiermer

Is the growth in size dependent upon the presence of the .lock file?
If you manually remove the .lock file between each launch of the app, do you still see the problem?

@jpsim
Contributor

jpsim commented Jul 31, 2014

@finnschiermer the same issue occurs even when deleting .lock files between launches.

@finnschiermer

Thanks, JP
:-)

@justjimmy

Seeing this issue and kinda glad I'm not the only one. It grows so big that eventually it crashes on launch, no memory.

@philippeauriach

Seeing it too! The file grows a lot and eventually causes my app to crash on start (at Realm creation, no memory).

@jjoelson

jjoelson commented Aug 4, 2014

Same issue as @justjimmy and @philippeauriach. Eventually my app just crashes on realm creation and I have to delete the file.

@ajimix

ajimix commented Aug 4, 2014

We are having the same issue: the file just keeps growing and eventually the app crashes because there is no memory.

@timanglade
Contributor

As noted on the mailing-list, a fix for this was pushed to master earlier this morning! An official release will be live on our site later today, but you can build from source in the meantime to confirm.

@alazier
Contributor

alazier commented Aug 5, 2014

This should now be resolved with the latest release.

@alazier alazier closed this as completed Aug 5, 2014
@justjimmy

Great stuff! Thanks heaps. Just installed it and after some initial quick tests, it seems to be fixed.

@jjoelson

jjoelson commented Aug 5, 2014

Fixed for me as well.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 18, 2024