Performance problem with error message feature #48
Comments
After a couple of shots in the dark, it seems to me that the most straightforward way to go about it is to print violations in the main loop, right away, without building up a list of violations. Flag sounds good! (I honestly thought that you'd require it from the original PR.) I wouldn't call it as generically as |
I think it is good to keep a pure interface, and do IO only in the layer using that interface.
fix-whitespace/src/Data/Text/FixWhitespace.hs, lines 70 to 73 in 37220b7
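A minimal sketch of how the two suggestions could coexist: the checker stays pure and returns a lazy list, and only the outer layer does IO, printing each violation as the list is consumed. The names `Violation`, `violations`, `render` and `reportAll` are hypothetical and not taken from fix-whitespace, and the check is deliberately simplified to trailing whitespace only.

```haskell
import           Control.Monad (foldM)
import           Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as T
import           System.IO (stderr)

-- Hypothetical stand-in for the real violation type.
data Violation = Violation Int Text
  deriving (Eq, Show)

-- Pure layer: lazily list the lines that end in whitespace (1-based numbering).
violations :: Text -> [Violation]
violations s =
  [ Violation n l
  | (n, l) <- zip [1 ..] (T.lines s)
  , T.stripEnd l /= l
  ]

-- Rendering one violation is still pure.
render :: FilePath -> Violation -> Text
render f (Violation n l) =
  T.concat [T.pack f, T.pack ":", T.pack (show n), T.pack ": ", l]

-- IO layer: print each violation as it is produced and count them strictly,
-- so the list of violations is never retained as a whole.
-- Returns True when no violation was found.
reportAll :: FilePath -> Text -> IO Bool
reportAll f s = do
  count <- foldM step (0 :: Int) (violations s)
  pure (count == 0)
  where
    step n v = do
      T.hPutStrLn stderr (render f v)
      pure $! n + 1
```

Whether this is faster in practice would still need measuring; the point is only that a pure interface does not force the caller to accumulate all messages before printing.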
Ok, we can bikeshed it. For a |
Alleviates agda#48 but a proper performance fix would be better.
#49 provides a way to work around the performance issue by resurrecting the old implementation and putting the new one under a flag. This is largely independent of the issue itself.

I must say I find few things in life as sad as tuning the performance of Haskell applications. Below are a couple of observations from my failed attempts at this in our case.

First, slow applications often generate an excessive amount of garbage, and productivity goes down. Surprisingly, we don't experience this: productivity stays above 99%.

Another observation: a lot of the slowdown comes from just printing stuff out. I performed a little experiment where I don't do any analysis, and just print out every input line with the decoration:

    fix :: Mode -> Verbose -> TabSize -> FilePath -> IO Bool
    fix mode verbose _tabSize f = do
        s <- Text.readFile f
        -- No analysis: every input line is reported as a violation.
        pure (CheckViolation s (buildVs s))
      >>= \case
        CheckViolation s vs -> do
          Text.hPutStrLn stderr (msg vs)
          when (mode == Fix) $
            withFile f WriteMode $ \h -> do
              hSetEncoding h utf8
              Text.hPutStr h s
          return True
      where
        buildVs = zipWith LineError [1..] . Text.lines
        msg vs
          | mode == Fix =
            "[ Violation fixed ] " <> ft
          | otherwise =
            "[ Violation detected ] " <> ft <>
            (if not verbose then "" else
             ":\n" <> Text.unlines (map (displayLineError ft) vs))
        ft = Text.pack f

Without -v:

    fix-whitespace on performance-experiments-2 [?] via λ 9.2.6
    ❯ time $FW --check 20000-violations.txt +RTS -s -p -RTS 2>/dev/null
    ________________________________________________________
    Executed in    6.22 millis    fish           external
       usr time    1.31 millis  341.00 micros    0.97 millis
       sys time    5.00 millis  142.00 micros    4.86 millis

With -v:

    fix-whitespace on performance-experiments-2 [?] via λ 9.2.6
    ❯ time $FW -v --check 20000-violations.txt +RTS -s -p -RTS 2>/dev/null
    ________________________________________________________
    Executed in  433.11 millis    fish           external
       usr time  253.06 millis    0.18 millis  252.88 millis
       sys time  180.00 millis    1.08 millis  178.91 millis

(I also use profiling here, but without profiling the numbers are not much different.)
My idea for improving performance of the output would be to try the |
Unfortunately, the basic lazy builder didn't seem to help at all (the code is here: ulysses4ever@faabc6a). I'm thinking of giving this a try: https://github.com/Bodigrim/linear-builder.
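For readers unfamiliar with the approach, here is a minimal sketch of what assembling the output through the lazy `Builder` from the `text` package can look like. It is not the code from the commit mentioned above; `numberLines` and `printNumbered` are hypothetical names, and the line-numbering task only stands in for formatting violations.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Builder as B
import qualified Data.Text.Lazy.Builder.Int as B
import qualified Data.Text.Lazy.IO as TL

-- Prefix every line with its 1-based number, going through a Builder
-- and materialising the lazy Text only once at the end.
numberLines :: T.Text -> TL.Text
numberLines =
  B.toLazyText
    . mconcat
    . zipWith line [1 :: Int ..]
    . T.lines
  where
    line i l =
      B.decimal i <> B.singleton ' ' <> B.fromText l <> B.singleton '\n'

-- Write the whole result with a single lazy-Text output call.
printNumbered :: T.Text -> IO ()
printNumbered = TL.putStr . numberLines
```

As reported above, switching to a builder did not help in this case, so the sketch illustrates the technique rather than a fix.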
I tried an even simpler experiment, and it does look like just printing stuff out is embarrassingly slow. Here's a little program that reads stdin, breaks it into lines, adds the index to every line and prints out the result:

    -- test.hs
    module Main where

    import TextShow (showt)
    import qualified Data.Text as T
    import qualified Data.Text.IO as T

    main :: IO ()
    main = T.getContents >>= T.putStrLn . procText

    -- Prefix every line with its 1-based index.
    procText :: T.Text -> T.Text
    procText =
      T.unlines .
      zipWith (\i l -> showt i <> l) [1::Int ..] .
      T.lines

It's pretty slow already:
|
Oops, using |
After PR #44, performance has degraded for large files with lots of violations.
Reproducer:
Takes 30 seconds on my machine. The released version (0.0.11) finishes in a fraction of a second.
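A minimal sketch for generating a comparable input, assuming that a file in which every line ends in trailing whitespace is a representative workload. The helpers `violatingInput` and `writeInput` are hypothetical; the file name `20000-violations.txt` is the one used in the timing runs in this thread, but its actual contents are not known.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as T

-- Build n lines, each ending in two spaces, so that every line
-- counts as a trailing-whitespace violation (an assumption about the workload).
violatingInput :: Int -> T.Text
violatingInput n =
  T.unlines [ T.pack ("line " ++ show i ++ "  ") | i <- [1 .. n] ]

-- Write the generated text to the given path.
writeInput :: FilePath -> Int -> IO ()
writeInput path n = T.writeFile path (violatingInput n)
```

For example, `writeInput "20000-violations.txt" 20000` from GHCi produces a file that can then be passed to the checker.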
Proposed procedure:
--verbose flag.