Document size limit to be increased to 16Mb? #25

Closed
sherry-ummen opened this issue Apr 21, 2015 · 6 comments

@sherry-ummen

Hello,

Is it possible to increase the document size limit from 1 MB to 16 MB, like MongoDB has? If not, what is the complication?

Thanks

@mbdavid (Owner) commented Apr 21, 2015

Hello @sherry-ummen! It's easy to change this limit from 1 MB to 16 MB; you just need to change it here:

https://github.com/mbdavid/LiteDB/blob/master/LiteDB/Document/BsonDocument.cs#L14
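
For reference, the whole change is a one-line edit to that constant. A minimal sketch, assuming the constant is named MAX_DOCUMENT_SIZE and is expressed in data pages (verify the real name and expression on the linked line):

    // LiteDB/Document/BsonDocument.cs (sketch only; check the linked line)
    // Roughly 256 data pages ~= 1 MB of serialized BSON per document.
    public const int MAX_DOCUMENT_SIZE = 16 * 256 * BasePage.PAGE_AVAILABLE_BYTES; // ~16 MB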

But...

The question is: why are your documents so big? I think 1 MB is already a really big document, and I always try to keep mine under 100 KB.
Big documents consume a lot of memory (the whole document must be loaded into memory) and are slow in read and write operations. Remember: LiteDB is an embedded database, so this memory is consumed by the "client".

  • Can you split your document into several smaller documents? You can use DbRef<T> to "join" them (see the sketch after this list).
  • Does your document contain a file or a byte array? You can use FileStorage.
  • Does your document contain a big text? You can use FileStorage too.
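
A minimal sketch of the first suggestion: split one huge document into a small head document plus part documents, joined by a plain id (class names are illustrative, and the DbRef<T> mapping syntax differs between LiteDB versions, so a manual key is shown here):

    using System.Linq;
    using LiteDB;

    // Head document stays small; the heavy payload lives in its own collection.
    public class Drawing
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public class DrawingPart
    {
        public int Id { get; set; }
        public int DrawingId { get; set; }  // manual "join" key (DbRef<T> can formalize this)
        public string Data { get; set; }    // keep each part well under 1 MB
    }

    public static class SplitDemo
    {
        public static string LoadDrawingText(string dbFile, int drawingId)
        {
            using (var db = new LiteDatabase(dbFile))
            {
                var parts = db.GetCollection<DrawingPart>("drawing_parts");

                // Re-assemble only when the full text is really needed, instead
                // of reading one document larger than 1 MB on every access.
                return string.Concat(parts
                    .Find(Query.EQ("DrawingId", drawingId))
                    .OrderBy(p => p.Id)
                    .Select(p => p.Data));
            }
        }
    }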

Take a look at the MongoDB documentation about data modeling. All MongoDB data-modeling concepts are valid for LiteDB:
http://docs.mongodb.org/manual/core/data-modeling-introduction/

@sherry-ummen (Author)

Thanks Mauricio.

Yes, the document is text and it's big: basically data related to some graphical objects. It's legacy code that generates the big objects, so it's very difficult to change this behavior.

Why is it slow? Is it the serializer that is slow? We are currently using MongoDB, and if the document size is more than 16 MB we store it as a blob.

But we want an embedded database, and LiteDB suits us best.

@mbdavid (Owner) commented Apr 21, 2015

There is no problem serializing/deserializing big documents; LiteDB uses TextReader/TextWriter to avoid performance problems. But a document is treated as a single unit, so to read a big document you must read all of its data pages, keep them all in memory (CacheService), and deserialize all the bytes. Saving has the same problem: even a minimal change must serialize the whole document and rewrite all of its pages.

FileStorage (like MongoDB GridFS) works by splitting content across separate documents. To store big files, LiteDB splits the content into 1 MB chunks and stores them one at a time. After each chunk, LiteDB clears the cache to avoid using too much memory:
https://github.com/mbdavid/LiteDB/blob/master/LiteDB/Database/FileStorage/LiteFileStorage.cs#L49
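
A rough usage sketch of that chunked path (the file id and local path are made up, and the exact Upload/OpenRead overloads vary a little between LiteDB versions):

    using System.IO;
    using LiteDB;

    using (var db = new LiteDatabase("app.db"))
    {
        // Upload: the content is written in ~1 MB chunks, so it never has to
        // fit inside a single BSON document.
        using (var source = File.OpenRead(@"C:\temp\scene-42.json"))
        {
            db.FileStorage.Upload("$/scenes/scene-42.json", source);
        }

        // Read it back as a stream, again chunk by chunk.
        using (var reader = new StreamReader(db.FileStorage.OpenRead("$/scenes/scene-42.json")))
        {
            var text = reader.ReadToEnd();
        }
    }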

@sherry-ummen (Author)

OK, so reading all the data pages is the problem? Then that should not be an issue in the case of an SSD? And would memory-mapped I/O help?

@mbdavid (Owner) commented Apr 22, 2015

You will not avoid reading all the pages if your document is big and you need all of it. For better performance, an SSD disk helps a lot, and so does more RAM.

Reading 16 MB documents is not a big issue if you read only one or two at a time. If you read many, I recommend "closing" and "re-opening" the database (using (var db = new LiteDatabase(...)) { ... }), as sketched below.
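
A rough sketch of that close/re-open pattern when processing many big documents (BigDoc and the batch size are made-up placeholders; Find(query, skip, limit) is the classic LiteDB collection API):

    using System;
    using LiteDB;

    // Placeholder document type; substitute your own.
    public class BigDoc
    {
        public int Id { get; set; }
        public string Payload { get; set; }
    }

    public static class BatchReadDemo
    {
        private const int BatchSize = 100;

        public static void ProcessAll(string dbFile, int totalBatches)
        {
            for (var batch = 0; batch < totalBatches; batch++)
            {
                // Disposing the database between batches releases the cached
                // pages that big documents leave behind in memory.
                using (var db = new LiteDatabase(dbFile))
                {
                    var col = db.GetCollection<BigDoc>("bigdocs");

                    foreach (var doc in col.Find(Query.All(), batch * BatchSize, BatchSize))
                    {
                        Console.WriteLine(doc.Id); // placeholder for real processing
                    }
                }
            }
        }
    }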

I have plans (it's on my todo list) to implement a better cache service that automatically clears unused cache pages, which would avoid having to close/re-open the database.

mbdavid closed this as completed Apr 25, 2015
