Reduce binary size #21

Open
eagleflo opened this issue Feb 11, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@eagleflo
Owner

eagleflo commented Feb 11, 2023

As of now, jisho is quite a large binary, as no effort whatsoever has been spent on optimizing for binary size.

However, it looks like Rust tooling has (recently?) grown more aware of binary sizes, and trying to update the embedded JMdict data to a more recent version hit a built-in size limit on crates.io. The JSON files derived from JMdict are certainly much more verbose than necessary, so this should be relatively easy to fix.

@eagleflo eagleflo added the enhancement New feature or request label Feb 11, 2023
@eagleflo
Owner Author

Looking back at this, I've come up with some ideas:

  • Switching from JSON files to a binary format like CBOR or MessagePack
  • Compressing the JSON files
  • Switching to SQLite

Out of these, I'm currently most intrigued by the third option, as it would cut the amount of data to a third right away and provide an extensible base for future needs. I'll give it a try.
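To illustrate where a binary format saves space compared to JSON, here is a minimal sketch of a hand-rolled length-prefixed encoding. It is not CBOR or MessagePack, only a stand-in that shows the effect of dropping repeated key names and punctuation. The `Entry` struct and its fields are hypothetical and not taken from the jisho source.

```rust
/// Hypothetical dictionary entry; the real jisho data layout may differ.
#[derive(Debug, PartialEq)]
pub struct Entry {
    pub kanji: String,
    pub reading: String,
    pub meanings: Vec<String>,
}

/// Append a string as a little-endian u16 length followed by its UTF-8 bytes.
fn put_str(out: &mut Vec<u8>, s: &str) {
    let len = u16::try_from(s.len()).expect("string longer than 65535 bytes");
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(s.as_bytes());
}

/// Read a length-prefixed string starting at `*pos` and advance `*pos` past it.
fn get_str(buf: &[u8], pos: &mut usize) -> Option<String> {
    let len_bytes = buf.get(*pos..*pos + 2)?;
    let len = u16::from_le_bytes([len_bytes[0], len_bytes[1]]) as usize;
    *pos += 2;
    let bytes = buf.get(*pos..*pos + len)?;
    *pos += len;
    String::from_utf8(bytes.to_vec()).ok()
}

/// Encode an entry without any field names: kanji, reading, meaning count, meanings.
pub fn encode(e: &Entry) -> Vec<u8> {
    let mut out = Vec::new();
    put_str(&mut out, &e.kanji);
    put_str(&mut out, &e.reading);
    let count = u8::try_from(e.meanings.len()).expect("more than 255 meanings");
    out.push(count);
    for m in &e.meanings {
        put_str(&mut out, m);
    }
    out
}

/// Decode an entry produced by `encode`; returns `None` on truncated or invalid input.
pub fn decode(buf: &[u8]) -> Option<Entry> {
    let mut pos = 0;
    let kanji = get_str(buf, &mut pos)?;
    let reading = get_str(buf, &mut pos)?;
    let count = *buf.get(pos)? as usize;
    pos += 1;
    let mut meanings = Vec::with_capacity(count);
    for _ in 0..count {
        meanings.push(get_str(buf, &mut pos)?);
    }
    Some(Entry { kanji, reading, meanings })
}
```

For an entry like 緑 / みどり / "green", the encoded form comes out at less than half the size of the equivalent JSON object `{"kanji":"緑","reading":"みどり","meanings":["green"]}`, because the key names are not repeated per entry. A real format such as CBOR or MessagePack would give similar savings with better tooling.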

@eagleflo
Owner Author

It's quite cumbersome to try to read an SQLite database that's embedded in the binary. I might come back to this approach later, but for now I'll just compress the JSON files with flate2. This is already a marked improvement.
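One common workaround for the embedded-database problem, sketched here under assumptions and not something this issue confirms was tried, is to write the embedded bytes out to a cache file on first run and open that file instead. The function name `materialize`, the cache directory, and the file name are all hypothetical; in a real build the bytes would presumably come from `include_bytes!`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Write `embedded` to `dir/name` unless a file of the same length already
/// exists there, and return the resulting path.
///
/// The length comparison is only a crude freshness check (an assumption made
/// for brevity); a version stamp or checksum would be more robust.
pub fn materialize(embedded: &[u8], dir: &Path, name: &str) -> io::Result<PathBuf> {
    fs::create_dir_all(dir)?;
    let path = dir.join(name);

    let up_to_date = fs::metadata(&path)
        .map(|m| m.len() == embedded.len() as u64)
        .unwrap_or(false);

    if !up_to_date {
        // Write under a temporary name first, then rename, so an interrupted
        // run does not leave a half-written file at the final path.
        let tmp = dir.join(format!("{name}.tmp"));
        fs::write(&tmp, embedded)?;
        fs::rename(&tmp, &path)?;
    }
    Ok(path)
}
```

The returned path could then be handed to whichever SQLite binding is used. The cost is one extra file write on the first run and a copy of the data on disk, in exchange for ordinary file-based access afterwards.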

eagleflo added a commit that referenced this issue Jun 25, 2023
eagleflo added a commit that referenced this issue Dec 4, 2023
@eagleflo
Owner Author

eagleflo commented Dec 4, 2023

Compressing the JSON files shrinks the binary from 121 MB to 32 MB. However, it also causes a hefty performance degradation:

~/jisho (compress-dictionaries) % ./bench
    Finished release [optimized] target(s) in 0.02s
Benchmark 1: cargo run --release 緑
  Time (mean ± σ):     285.1 ms ±   4.3 ms    [User: 205.4 ms, System: 79.0 ms]
  Range (min … max):   279.6 ms … 294.7 ms    10 runs

Benchmark 1: cargo run --release みどり
  Time (mean ± σ):     326.9 ms ±   6.6 ms    [User: 234.9 ms, System: 91.2 ms]
  Range (min … max):   321.6 ms … 344.8 ms    10 runs

Benchmark 1: cargo run --release green
  Time (mean ± σ):     641.5 ms ±   4.5 ms    [User: 496.0 ms, System: 144.1 ms]
  Range (min … max):   635.1 ms … 648.4 ms    10 runs

compared to

~/jisho (main) % ./bench
   Compiling jisho v0.1.7 (/home/aku/jisho)
    Finished release [optimized] target(s) in 22.98s
Benchmark 1: cargo run --release 緑
  Time (mean ± σ):     204.7 ms ±   1.9 ms    [User: 137.3 ms, System: 66.8 ms]
  Range (min … max):   201.6 ms … 207.8 ms    14 runs

Benchmark 1: cargo run --release みどり
  Time (mean ± σ):     232.6 ms ±   2.7 ms    [User: 147.5 ms, System: 84.2 ms]
  Range (min … max):   229.0 ms … 237.6 ms    12 runs

Benchmark 1: cargo run --release green
  Time (mean ± σ):     448.2 ms ±   4.7 ms    [User: 295.3 ms, System: 151.7 ms]
  Range (min … max):   441.0 ms … 454.0 ms    10 runs

Slowing down the quick CLI lookup use case by roughly 40% is a dealbreaker. I'll figure out something else.
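The size of the regression per query can be worked out from the mean times in the two benchmark runs above. A small sketch of that arithmetic follows; the function and constant names are made up for illustration, and the numbers are copied from the hyperfine output.

```rust
/// Relative slowdown in percent, given the baseline and the new mean time in ms.
pub fn slowdown_percent(baseline_ms: f64, new_ms: f64) -> f64 {
    (new_ms / baseline_ms - 1.0) * 100.0
}

/// (query, mean on main, mean on compress-dictionaries), in milliseconds,
/// copied from the benchmark output in this comment.
pub const RUNS: [(&str, f64, f64); 3] = [
    ("緑", 204.7, 285.1),
    ("みどり", 232.6, 326.9),
    ("green", 448.2, 641.5),
];

/// One formatted line per query, e.g. for printing to the terminal.
pub fn report() -> Vec<String> {
    RUNS.iter()
        .map(|(query, base, new)| {
            format!("{query}: +{:.1}%", slowdown_percent(*base, *new))
        })
        .collect()
}
```

With these means the slowdown falls between roughly 39% and 43% depending on the query, with the English lookup hit hardest.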

@eagleflo
Owner Author

I keep thinking that moving from JSON files to SQLite would most likely be a big improvement here, in addition to being more flexible in other ways.
