File size of notes seem to be extremely large #847

Open
Iey4iej3 opened this issue Jul 24, 2023 · 19 comments
Labels
bug Something isn't working

Comments

@Iey4iej3

Describe the bug

After taking some notes, I find that notes taken in Saber occupy much more space (~300 MB over the last 3 weeks) than notes taken in Quill (~50 MB over the last two years). There seems to be a file-size efficiency issue.

To reproduce

Take some notes, then look at the file size of the notes.

Expected behavior

The files should not be this large (e.g. exceeding Quill's by an order of magnitude).

Saber version

v0.14.9

Device

  • Device:
  • OS: Android 13

Anything else?

No response

@Iey4iej3 Iey4iej3 added the bug Something isn't working label Jul 24, 2023
@ceskyDJ
Contributor

ceskyDJ commented Jul 24, 2023

I think #667 and #179 could be relevant.

@Iey4iej3

This comment was marked as off-topic.

@Iey4iej3

This comment was marked as off-topic.

@ceskyDJ
Contributor

ceskyDJ commented Jul 25, 2023

You can already use a pen without pressure detection in the pen settings:
[Screenshot: tempFileForShare_20230725-143711]

Maybe I just didn't get it. Do you use a pressure-sensitive pen and the files are still this big, or do you use a non-sensitive pen?

@ZebraVogel94349
Contributor

I think you could also decrease the file size a lot by using a binary format to store the coordinates and the pressure of each point instead of JSON (or by including the binary data for each point in the JSON, instead of storing the coordinates in plain text). Currently, each point takes 41 bytes. By using IEEE 754 single-precision floats for each value, you could reduce that to 12 bytes and so reduce the file size by about 70%.
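As a rough illustration of the size difference described above, the sketch below packs one point as three IEEE 754 single-precision floats with Python's `struct` module and compares the result with a JSON text encoding of the same point. The field names `x`, `y`, `p` and the sample values are assumptions made for illustration; they are not Saber's actual note format, so the JSON size printed here will not necessarily match the 41 bytes quoted above.

```python
import json
import struct

# Hypothetical point: x, y coordinates and pen pressure (not Saber's real schema).
point = {"x": 123.4567, "y": 89.0123, "p": 0.512345}

# Text encoding: every digit, quote, and separator costs one byte.
as_json = json.dumps(point, separators=(",", ":")).encode("utf-8")

# Binary encoding: three little-endian 32-bit floats, always 12 bytes.
as_binary = struct.pack("<3f", point["x"], point["y"], point["p"])

print(len(as_json), "bytes as JSON")
print(len(as_binary), "bytes as packed floats")

# Decoding returns float32-rounded values, so compare with a tolerance.
x, y, p = struct.unpack("<3f", as_binary)
```

The binary size is fixed regardless of how many digits a value has, while the text size grows with the number of digits written out.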

@Iey4iej3

This comment was marked as off-topic.

@ceskyDJ
Contributor

ceskyDJ commented Jul 25, 2023

Thanks for the explanation. It's always good to have really clear information :).

@ceskyDJ
Contributor

ceskyDJ commented Jul 25, 2023

I'm for a binary format, too. Saber hasn't been released publicly yet, so this is the "last" chance to change the format. You could make the format open (publish its specification, and your open-source code using it will always be available), so I don't think there is a problem with it.

@Iey4iej3
Author

By using IEEE 754 single-precision floats for each value,

Is there any reason to use floats in place of fixed-point arithmetic? The latter should be much more efficient than the former.

@ZebraVogel94349
Contributor

By using IEEE 754 single-precision floats for each value,

Is there any reason to use floats in place of fixed-point arithmetic? The latter should be much more efficient than the former.

Using integers would probably be much more efficient, but I think you would need to change a lot to make them work, since the editor currently uses doubles everywhere. I don't see any other reason not to use them.

@ceskyDJ
Contributor

ceskyDJ commented Jul 25, 2023

It could use integers for even better performance, of course. When the number of decimal places is strictly fixed (to 4 and 6, respectively), it can be done just like in financial software. But I think this is not the main problem (though of course it should bring some performance boost). I don't know Dart syntax at all, so I haven't looked at @adil192's code yet. But I know C, so I know that the problems you noted here are relevant and should be resolved. Right now Saber has huge performance issues (I tested it myself, see #667), so it's important to solve them. Otherwise Saber is a very good app, so I think it should get a chance, even if the cost is completely rewriting the back end of the ink saving/loading procedures.
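One way to read the fixed-point idea in this exchange: multiply each value by a fixed power of ten, round, and store the resulting integer. The sketch below assumes 4 decimal places for coordinates and 6 for pressure, which is only one possible reading of "4 and 6 respectively"; the helper names `encode_point` and `decode_point` are hypothetical and not part of Saber.

```python
import struct

COORD_SCALE = 10**4     # assumed: 4 decimal places for x and y
PRESSURE_SCALE = 10**6  # assumed: 6 decimal places for pressure

def encode_point(x, y, pressure):
    """Quantize a point to scaled integers and pack them as three int32 values."""
    return struct.pack(
        "<3i",
        round(x * COORD_SCALE),
        round(y * COORD_SCALE),
        round(pressure * PRESSURE_SCALE),
    )

def decode_point(data):
    """Unpack three int32 values and scale them back to floats."""
    xi, yi, pi = struct.unpack("<3i", data)
    return xi / COORD_SCALE, yi / COORD_SCALE, pi / PRESSURE_SCALE

encoded = encode_point(123.4567, 89.0123, 0.512345)
decoded = decode_point(encoded)
print(len(encoded), "bytes ->", decoded)
```

With these scales a 32-bit integer covers coordinates up to roughly ±214,748 units, so `struct.pack` would raise an error for larger values; a real format would need to pick scales and integer widths that match the canvas size.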

@ceskyDJ
Contributor

ceskyDJ commented Jul 25, 2023

I don't know how Dart works, or whether there is a simple way to use C/C++ libraries from Dart code as there is in Python. If Dart has a similar ability, maybe the best option is to write a small C/C++ library for working with the storage files, for the best performance.

@Iey4iej3
Author

By using IEEE 754 single-precision floats for each value,

Is there any reason to use floats in place of fixed-point arithmetic? The latter should be much more efficient than the former.

Using integers would probably be much more efficient, but I think you would need to change a lot to make them work, since the editor currently uses doubles everywhere. I don't see any other reason not to use them.

I don't know this language, but could one simply replace the keyword float with, say, foobar, then define the type foobar and encapsulate addition and multiplication in it, so that the other parts of the code are touched as little as possible?
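The wrapper-type idea in this question can be sketched as a small class that stores a scaled integer and overloads `+` and `*`. Python is used here only for illustration; the class name `Fixed` and the scale of 4 decimal places are assumptions, and the sketch does not establish how cleanly the same approach would fit Saber's Dart code.

```python
from dataclasses import dataclass

SCALE = 10**4  # assumed: 4 decimal places

@dataclass(frozen=True)
class Fixed:
    """Hypothetical fixed-point number stored as an integer count of 1/SCALE units."""
    raw: int

    @classmethod
    def from_float(cls, value: float) -> "Fixed":
        return cls(round(value * SCALE))

    def to_float(self) -> float:
        return self.raw / SCALE

    def __add__(self, other: "Fixed") -> "Fixed":
        # Both operands share the same scale, so the raw integers add directly.
        return Fixed(self.raw + other.raw)

    def __mul__(self, other: "Fixed") -> "Fixed":
        # The raw product carries SCALE twice; floor-divide once to rescale.
        return Fixed(self.raw * other.raw // SCALE)

a = Fixed.from_float(1.5)
b = Fixed.from_float(2.25)
total = a + b
product = a * b
print(total.to_float(), product.to_float())
```

Callers that only add and multiply would not need to know about the integer representation, but any code that passes values to APIs expecting floating-point numbers would still need explicit conversions.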

@ceskyDJ
Contributor

ceskyDJ commented Jul 26, 2023

I think @adil192 wants to do something about that. They have a milestone for it: see https://github.com/adil192/saber/milestone/2. But I don't know whether this milestone is meant for before or (more likely, in my opinion) after the official release. If they plan to make the changes after release, it could be somewhat problematic, since the app shouldn't change much after a stable release.

@ZebraVogel94349
Contributor

Today I tried saving each stroke as 3 floats converted into base64, which reduced the file size by about 50%. But when I compressed a note that uses base64 and the same note without base64, the file sizes were almost identical (206 kB and 213 kB). I don't think a binary file is much more space-efficient than just compressing the notes (with gzip I was able to reduce the file size from about 900 kB to 200 kB). However, I don't know how compression would affect performance. A binary file format plus an option to compress all notes would probably be the best solution.
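The kind of comparison described in this comment can be approximated with a short script. The sketch below uses synthetic points along a sine curve and a made-up JSON layout with a single `points` key, so it will not reproduce the 206 kB and 213 kB figures; it only shows how one might measure raw and gzipped sizes for plain JSON numbers versus base64-encoded float32 values.

```python
import base64
import gzip
import json
import math
import struct

# Synthetic stroke: 1000 (x, y, pressure) points (made-up data, not a real note).
points = [
    (i * 0.37, 50 + 20 * math.sin(i / 15), 0.5 + 0.4 * math.sin(i / 40))
    for i in range(1000)
]

# Variant A: plain JSON numbers.
plain = json.dumps({"points": points}).encode("utf-8")

# Variant B: each point as three float32 values, base64-encoded inside JSON.
packed = b"".join(struct.pack("<3f", *p) for p in points)
b64 = json.dumps({"points": base64.b64encode(packed).decode("ascii")}).encode("utf-8")

for name, data in (("plain JSON", plain), ("base64 floats", b64)):
    print(f"{name}: {len(data)} bytes raw, {len(gzip.compress(data))} bytes gzipped")
```

Running the same measurement on real note files, and timing the encode and decode steps, would be needed before drawing conclusions about Saber itself.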

@ceskyDJ
Contributor

ceskyDJ commented Jul 28, 2023

You should keep an eye on performance; it's another problem of the current implementation. For this reason, I highly recommend using a binary format, so that compression won't be necessary and its performance overhead is avoided. Btw, good work! It seems a lot better now (judging from your numbers).

There is one other important thing: annotating imported images and documents. Are you working with them, too, as you improve the saving/loading mechanism?

@ZebraVogel94349
Contributor

My base64 approach decreases the file size by a lot, but the encoding takes very long. When loading more than about 20 pages, there is a noticeable lag (I think it was about 10 ms per page). So it's not really a suitable method for reducing the file size. A binary format is definitely the best way to reduce the file size.
Maybe an existing binary format like BSON or CBOR would be a good solution (there are Dart packages for both of them). I experimented a bit with them but couldn't get anything working, since you have to change quite a lot and there is not much documentation available for those libraries apart from the API reference. BSON especially looked promising. Maybe I'll find a way to use it to save notes next week.

@adil192
Member

adil192 commented Jul 31, 2023

@Iey4iej3 Did you mark your own comments as off-topic? I don't remember doing that, and I'll unhide them if you didn't.

adil192 added a commit that referenced this issue Jul 31, 2023
adil192 added a commit that referenced this issue Jul 31, 2023
adil192 added a commit that referenced this issue Jul 31, 2023
@Iey4iej3
Author

I hid them because they are not directly related to the reduction of file sizes.

4 participants