Large file #35
Yes, there are a few things that we could do, some easier than others:
To get a best-case estimate, could you try gzip-ing the two files and seeing how large each of them is?
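For anyone who wants to reproduce this kind of best-case estimate without leaving Python, here is a minimal sketch. The helper name `gzipped_size` is hypothetical (it is not part of Alectryon or Proviola), and it reads the whole file into memory, which is fine for files of a few hundred megabytes but is an assumption worth noting.

```python
import gzip

def gzipped_size(path, level=9):
    """Return (raw_bytes, gzipped_bytes) for the file at `path`.

    Hypothetical helper for illustration; it loads the whole file in memory.
    """
    with open(path, "rb") as f:
        data = f.read()
    return len(data), len(gzip.compress(data, compresslevel=level))

def report(paths):
    """Print raw and gzipped sizes for each path, with the compression ratio."""
    for p in paths:
        raw, gz = gzipped_size(p)
        ratio = gz / raw if raw else 0.0
        print(f"{p}: {raw} bytes raw, {gz} bytes gzipped ({ratio:.1%})")
```

Calling `report` with the two generated HTML files would give roughly the same numbers as running `gzip` on the command line and comparing sizes.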
You mean Proviola (18M/320K) versus Alectryon (228M/7M)?
Yes, absolutely. Thanks! One more experiment: if you enable caching, how large is the resulting cache file, and how much does it compress? (you can enable caching with
Not sure how to do this with Firefox.
Oh, sorry: I meant enabling caching in Alectryon. Alectryon can speed up repetitive compilations by saving goals and responses as JSON files.
I ran the following command
but the HTML file is about the same size and there is nothing in the cache directory.
You're right, there Here's what I see on a mid-sized file:
So on an example like Same conclusions, though with an even more pronounced reduction in size, for a file known to give Alectryon trouble:
Oh, and if you're curious, here's a comparison of brotli vs gzip:
So it doesn't look like caches as currently implemented help much versus plain HTML in terms of size. Also, it looks like if your server supports compression you're not in too bad a place (especially if it supports brotli, which beats gzip by 7x in the If you need smaller files, e.g. to keep them in a git repo, then caches don't immediately help, because they're not designed for size (they're pretty-printed, among other things). Let's see if we can make them smaller. Here's what happens when you just remove the blanks:
That's a bit more than 50% off; minifying further, using a less verbose format, we can save about 35% more:
and finally with a simple sed script to try to gauge further savings I get an extra 25%:
So, it would be reasonable, if uncompressed size is a concern, to invest in making smaller movie files. I think we could save about 80% versus uncompressed HTML. We would generate webpages that have only placeholders for the
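As an illustration of the "remove the blanks" step discussed above, here is a minimal sketch of minifying a pretty-printed JSON file with Python's standard `json` module. The sample structure (`sentences`, `goals`, `conclusion`, `hypotheses`) is a hypothetical stand-in, not the actual cache schema, and the savings on a real cache will differ.

```python
import json

def minify_json(text):
    """Re-serialize JSON text without indentation or spaces after separators."""
    return json.dumps(json.loads(text), separators=(",", ":"), ensure_ascii=False)

# Hypothetical cache-like structure; the real cache layout may differ.
sample = {
    "sentences": [
        {
            "contents": "intros.",
            "goals": [{"conclusion": "n + 0 = n", "hypotheses": ["n : nat"]}],
        }
    ]
}

pretty = json.dumps(sample, indent=4)
compact = minify_json(pretty)
print(len(pretty), "->", len(compact), "characters")
```

Minifying this way only drops whitespace; moving to a less verbose format (shorter keys, arrays instead of objects) is a separate step.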
I've done many more experiments with this. Adding a simple form of deduplication (replacing repeated goals and hypotheses with backreferences) shrinks caches quite a bit:
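To make the idea of backreferences concrete, here is a minimal sketch under stated assumptions. It is not the implementation referred to in this thread: the `{"&": index}` marker, the function names, and the `hyps`/`ccl` keys are all hypothetical, and the sketch assumes that no real item has exactly the shape `{"&": n}`.

```python
import json

def deduplicate(items):
    """Replace repeated items with {"&": index} backreferences.

    `items` is a list of JSON-serializable values (e.g. goals). The first
    occurrence of each value is kept; later occurrences become a small
    marker pointing at the index of the first one.
    """
    seen = {}  # canonical JSON text -> index of first occurrence
    out = []
    for item in items:
        key = json.dumps(item, sort_keys=True)
        if key in seen:
            out.append({"&": seen[key]})
        else:
            seen[key] = len(out)
            out.append(item)
    return out

def expand(items):
    """Invert `deduplicate` by resolving each backreference."""
    out = []
    for item in items:
        if isinstance(item, dict) and set(item) == {"&"}:
            out.append(out[item["&"]])
        else:
            out.append(item)
    return out
```

Proof scripts tend to repeat the same hypotheses and goals from one step to the next, which is why a scheme like this can shrink the output considerably before any general-purpose compression is applied.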
I've implemented something similar for webpages directly, and it works pretty well, too:
(It also makes pages load a lot faster, about 60% in the Ranalysis case). Finally, I've added an option for cache compression. One thing that's really striking is that XZ and Brotli beat GZip by far on these examples:
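A comparison like this one can be approximated with Python's standard library alone. The sketch below measures gzip and XZ only; Brotli is left out because it needs a third-party package. The function name `compressed_sizes` is hypothetical, and the result depends entirely on the input data.

```python
import gzip
import lzma

def compressed_sizes(data):
    """Return the size in bytes of `data` raw, gzip-compressed, and xz-compressed."""
    return {
        "raw": len(data),
        "gzip": len(gzip.compress(data, compresslevel=9)),
        "xz": len(lzma.compress(data, preset=9)),
    }

def print_sizes(path):
    """Print a small size table for the file at `path`."""
    with open(path, "rb") as f:
        sizes = compressed_sizes(f.read())
    for name, size in sizes.items():
        print(f"{size:>12} {path} ({name})")
```

Running `print_sizes` on a generated HTML file or on a cache file gives a quick way to check whether a stronger compressor is worth it for a given development.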
Can you try the implementation that's in the
Part of GH-35. This saves quite a bit of space.
Some measurements:

4.4M List.html
115K List.html.gz
 83K List.html.br
2.0M List.v.deduplicated-hyp.html
 88K List.v.deduplicated-hyp.html.gz
 78K List.v.deduplicated-hyp.html.br
1.9M List.v.deduplicated-hyp-hyps.html
 88K List.v.deduplicated-hyp-hyps.html.gz
 77K List.v.deduplicated-hyp-hyps.html.br
1.8M List.v.deduplicated-hyp-hyps-ccl.html
 88K List.v.deduplicated-hyp-hyps-ccl.html.gz
 78K List.v.deduplicated-hyp-hyps-ccl.html.br

 25M Ranalysis3.html
359K Ranalysis3.html.gz
 53K Ranalysis3.html.br
3.0M Ranalysis3.v.deduplicated-hyp.html
 68K Ranalysis3.v.deduplicated-hyp.html.gz
 39K Ranalysis3.v.deduplicated-hyp.html.br
2.3M Ranalysis3.v.deduplicated-hyp-hyps.html
 63K Ranalysis3.v.deduplicated-hyp-hyps.html.gz
 38K Ranalysis3.v.deduplicated-hyp-hyps.html.br
1.5M Ranalysis3.v.deduplicated-hyp-hyps-ccl.html
 58K Ranalysis3.v.deduplicated-hyp-hyps-ccl.html.gz
 35K Ranalysis3.v.deduplicated-hyp-hyps-ccl.html.br
Sorry, I've been busy lately. I will try to test ASAP (hopefully by the end of the week).
OK, I've merged the changes; you can use
Very good! With
🎉
Woohoo! :) Thanks a lot for testing.
Alectryon seems to generate big files. I realized that when github.io told me I could not transfer a file larger than 100MB. This file is 228M, while its equivalent in Proviola was only 18M.
Is it possible to do something about it?