Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question about minifying and unminifying. #52

Closed
hyunsooda opened this issue Feb 12, 2023 · 17 comments
Closed

Question about minifying and unminifying. #52

hyunsooda opened this issue Feb 12, 2023 · 17 comments

Comments

@hyunsooda
Copy link

Some of the submitted games have been compressed to adhere to the guidelines. Can anyone share their techniques for effectively compressing (minifying) the code, as well as the steps for decompressing (unminifying) them?

@OsePedro
Copy link

I wrote my first submission in minified form by hand. This was only practical because I had a very clear idea of what I wanted to create from the outset, but even then it took me a whole day to write it! But even if you write it in readable form and use a minification tool, you'll still have to do a lot of manual work to make it as small as possible.

The techniques that I used include:

  • Using point-free style ruthlessly, to avoid naming arguments and unique sub-expressions.
  • Using single-letter names for things that must be named. I documented type signatures and function behaviours in the comments as I went, to help myself remember what I'd called them, and I tried to pick the initial of a word that describes the function well ("r" for random, "c" for coordinates, etc.).
  • Avoiding type signatures as much as possible. You might need some early on, but by the time you've completed the game, you may have enough implicit type constraints to allow you to go back and remove them.
  • Not assuming that I know what spaces GHC requires. E.g. I didn't know that double quotes and backticks are always treated as the start/end of a token, meaning e.g. that u _""_=0 is a valid definition of ternary function, and that (`rem`m) is a valid partial application. GHC 8 will even allow you to omit the space between the lazy pattern match operator and its preceding argument (i.e. u c g~(x:y)=0 is valid in that version), but unfortunately GHC 9 no longer permits this.
  • Splitting expressions across lines and not assuming that I understand the indentation rule. E.g. for some reason, this is a valid definition of k :: String (it would be invalid without the space at the start of the second line):
{- Some definitions ... -} ;k=take i
 ['a'..]; {- Some more definitions ... -}
  • Using applicative notation to shorten list comprehensions. E.g. it allowed me to reduce [l a b|a<-r x,b<-r y] to l<$>r x<*>r y.
  • Looking up operator fixities with :i in GHCi, and using them to eliminate parentheses. The minimal precedence of ($) is very useful, and it may also help to know that (.) has maximal precedence (e.g. it takes precedence over (<$>), (<*>) and (++)).
  • Not assuming that you need to use curly braces as often as the wikibook suggests. E.g. the one-liner before the exercises doesn't require braces.
  • Moving definitions around, to fill up as much space as possible on each line. This takes a lot of trial and error, and you have to remember that not all semicolons mark the end of a movable definition. It's a good idea to make sure GHCi can still read your code after moving things around in a dense, minified block!

@kindaro
Copy link
Contributor

kindaro commented Feb 12, 2023

I contributed a minifier. Try it out!

@simonmichael
Copy link
Contributor

simonmichael commented Feb 12, 2023

Here's my current approach:

  • Develop in foo.unminified.hs that is terse, using the same 1-2 char names as the final version, but not fully minified - column/line limits are ignored and each logical element has its own line, so it's still fairly readable and maintainable. This remains the master copy, which periodically gets snapshotted and minified to foo.hs.
  • No type signatures needed ideally.
  • Keep a legend in the comments below describing the short names (I order them as they appear in the code, others order them alphabetically) - you might not need it now but it could be useful later, also it can help other readers a lot.
  • Curly braces around the whole thing and semi colon after every logical unit seem to work well, they add characters but allow almost full usage of the 80x10 space. You'll have to learn where braces/semicolons can/must be used - eg in case expressions, and it seems after where the following line must start with a semicolon.
  • Some folks seem to have had good results even without curly braces/semicolons.
  • For minification I just do it by hand each time (haven't had success with minify.hs yet) - joining lines at the 80th column, watching ghcid, tweaking spaces/line breaks where needed, hoping the end result will be within target!
  • ghcid, or HLS if it works well enough on your script, is a must to get live feedback while experimenting.

@hellwolf
Copy link
Contributor

hellwolf commented Feb 12, 2023

Using applicative notation to shorten list comprehensions. E.g. it allowed me to reduce [l a b|a<-r x,b<-r y] to l<$>r x<*>r y.

Oh, that's one nice trick... :) That means I might have a chance to jam a minimax strategy in?

@hellwolf
Copy link
Contributor

  1. I also use https://pointfree.io/ a lot to convert some functions to pointfree ones if it shortens.
  2. If common function such as map,foldl,take,drop appearing multiple times, you may also assign a single letter to it.

@OsePedro
Copy link

I just learnt another little space-saving trick while trying to maxify matchmaking. It turns out that arguments beginning with a digit end at the last digit. So e.g. f 9a=g a a is a valid way of writing f 9 a = g a a!

  1. If common function such as map,foldl,take,drop appearing multiple times, you may also assign a single letter to it.

More specifically, if its name is c characters long and it's used t times, then it's taking up c*t characters, whereas a single letter name would take up 2+c+t characters, as you need 2+c characters to define the name. As

          2+c+t <  c*t 
iff       2+c   <  (c-1)*t 
iff (2+c)/(c-1) <  t 

the number of function usages that you need before you start saving space is:

Name Length Minimum Required Usages
2 5
3 3
4 3
5+ 2

@hyunsooda
Copy link
Author

Thank you for sharing. I was curious to find out how to automate the process of minifying scripts. I assumed that there would be readily available solutions for this, similar to those offered for JavaScript, but it appears that this may not be the case for Haskell. This could be due to a lower demand for such solutions in the Haskell community compared to front-end development. Regardless, I appreciate your contributions. If there are no further comments, I will consider this discussion closed.

@OsePedro
Copy link

I assumed that there would be readily available solutions for this, similar to those offered for JavaScript

Bear in mind that it's impossible to write an optimal minifying script though (if it was, then Kolmogorov complexity would be computable). So even when using one that seems very effective, there's a good chance that there will be things that you could do to make the code even smaller.

@simonmichael
Copy link
Contributor

@hyunsooda please give minify.hs a try and let us know your experience.

@TristanCacqueray
Copy link
Contributor

I use a similar approach as @simonmichael :

  • Create a rough draft to check if it's playable and fun. Try to estimate if it fits the limit.
  • Compress the code by hand and adjust the scope if needed. That's the fun part :).
  • Decompress the final product and document it for posterity.

In tsp I used this technique to minify r = if p then x else y with o=True;r|p=x|o=y.

I run ghcid -T test game.hs for testing, and ghcid --command "ghci -pgmL markdown-unlit" ./README.lhs for the final version.

@fgaz
Copy link
Contributor

fgaz commented Feb 14, 2023

The top-level minify.hs didn't work for me, so I wrote my own in #63

@hyunsooda
Copy link
Author

@simonmichael, @fgaz,

I tried both versions and neither worked.

The tested code is:

import System.Info
main = do
    print os
    print arch
    print compilerName
    print compilerVersion

The output of both scripts is the same as follows:

import System.Info main=do print os print arch print compilerName print
 compilerVersion

So, I should manually fix it like below (i.e., adding curly braces and semicolons appropriately):

{import System.Info; main=do{print os; print arch; print compilerName;
print compilerVersion}}

@fgaz
Copy link
Contributor

fgaz commented Feb 14, 2023

Yes both scripts need semicolons and braces in their input. It's written in a comment in my minifier and in the global README. You can keep the indentation though; the following code should be minified correctly:

import System.Info;
main = do {
    print os;
    print arch;
    print compilerName;
    print compilerVersion
  }

@simonmichael
Copy link
Contributor

I confirm that /minify.hs and /hackage/brickbreaker.hs both handle fgaz's example above, and produce the same 90-byte output.

@simonmichael
Copy link
Contributor

https://github.com/haskell-game/tiny-games-hs/tree/main/hackage/7up7down has a cool trick: paste minified code into ormolu's online demo to unminify.

@hyunsooda
Copy link
Author

Really cool. I was using Ormole for formatting, but I never hacked it to unminify. Work nicely.

@TristanCacqueray
Copy link
Contributor

For what it's worth, I tinkered with ghc-lib-parser to do some pre-processing of the unminified source. Here is what I got so far: simple-haskell-minifier, this tool renames the binders to single letters and it performs some basic transformations, such as inlining single-use declaration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants