Misaligned register output for Unicode journal using mintty #961

Closed
kohanyirobert opened this Issue Jan 27, 2019 · 13 comments

kohanyirobert commented Jan 27, 2019

I'm using msys2 on Windows 10 with mintty 2.9.0 (x86_64-pc-msys), and if I have a journal like this

1/1 a
        ábc                                      123
        abc

and I do hledger -f a.journal r this is the output

2019/01/01 a                    ábc                           123           123
                                abc                           -123             0

The same thing in cmd.exe, after setting the default codepage to Unicode with chcp 65001, works; the output looks like this

2019/01/01 a                    ábc                            123           123
                                abc                           -123             0

(which is the expected output).

I've built hledger using these steps.

C:\Users\rkohanyi>hledger --version
hledger 1.12

If I just cat the journal file on mintty it looks okay, as the second output. Same with type (cat equivalent) in cmd.exe.

In mintty I use Lucida Console as my font (tried changing it, but it doesn't matter, shouldn't matter); the locale is en_US and the character set is UTF-8 (Unicode).

The file itself is encoded with UTF-8 (verified in vim).

It seems to me that this behaviour is displayed when hledger executes through mintty.
Any idea on changing/fixing this?
I mainly use mintty and this 1 character misalignment throws me off completely sometimes :D

Owner

simonmichael commented Jan 30, 2019

Thanks for the bug report. I don't have a machine to test this on myself. I wonder if it's any relation to #536 or #708 ?

Author

kohanyirobert commented Jan 30, 2019

#708 is not related. The problem described by OP happens when I use cmd.exe and I don't set the terminal's codepage to Unicode (by default it's latin1 or worse, but one can switch it with chcp 65001 and then everything is displayed perfectly).

#536 is related, I think. With my above example in mintty, this

$ hledger -f a.journal b á
--------------------
                   0

doesn't work (although I ran this a few minutes before starting to write my post ... and it worked ... or at least it seemed to work, but now it doesn't :sad:).

But it does work in cmd.exe with the Unicode codepage set.

I've skimmed through the issue and basically it boils down to either it works under Cygwin or Windows, right? Not both? :D Not a big deal. Weird. One solution is to use different binaries in different terminals, maybe? I won't go down that route tho', it's not a blocking problem. Easier to rename accounts to not use accented letters.

Owner

simonmichael commented Jan 30, 2019

Thanks for the testing of this and the other windows issues. I'll update those pages also, but here's a summary of my current understanding of windows non-ascii issues:

  • six windows shell/build environments mentioned so far:
    cmd, powershell, cygwin-mintty, git-mingw64, msys2-mintty, appveyor

  • I think git-mingw64 can be considered one of the minttys, appveyor is cmd, and cmd and powershell can be considered similar.

  • three issues under discussion:

    • #536 non-cygwin hledger build in cygwin-mintty: non-ascii queries fail (due to misparsed journal)
    • #708 appveyor hledger-web build started from mingw64 or cmd: misrenders characters
    • #961 msys2-mintty hledger build in msys2-mintty: register misaligns with non-ascii characters
  • cmd/powershell builds in cmd/powershell: chcp 65001 generally resolves such issues

  • cmd/powershell hledger builds in mintty (cygwin or msys2): generally suffer from these issues. Using a mintty build usually resolves them (but not #961).

  • mintty hledger builds in cmd/powershell: ? presumably suffer from these issues and using a cmd/powershell build resolves

Back to #961: the á seems to shorten the line by a character. hledger has built-in logic to compensate for double width characters, perhaps it's misjudging á to be one of these ? If you have the time, you could try debugging it (eg make ghci, insert traceShowId or dbg0 "LABEL"'s in hledger/Hledger/Cli/Commands/Register.hs postingsReportItemAsText, :reload, :main r)...

--------------------------------------- 80 -------------------------------------
normal:
2019/01/01 a                    ábc                            123           123
                                abc                           -123             0
--------------------------------------- 80 -------------------------------------
yours:
2019/01/01 a                    ábc                           123           123
                                abc                           -123             0

Owner

simonmichael commented Jan 30, 2019

Or as a quick test: what do you get from make ghci, charWidth 'á' ? I get 1. If you get 2, perhaps a fix is needed in hledger-lib/Hledger/Utils/String.hs charWidth. (Forcing it to 2 did replicate the kind of output you're seeing.)
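
To illustrate why a wrong width shortens the line, here is a minimal sketch. It is not hledger's actual charWidth; approxCharWidth, padRightWith and forcedWide are made-up names, and the wide ranges listed are only a rough approximation. Padding is computed from the measured width, so a character judged one column too wide gets one space too few:

```haskell
-- Rough stand-in for a display-width function (NOT hledger's charWidth):
-- a few well-known East Asian wide ranges count as 2 columns, the rest as 1.
approxCharWidth :: Char -> Int
approxCharWidth c
  | c >= '\x4E00' && c <= '\x9FFF' = 2  -- CJK unified ideographs
  | c >= '\xAC00' && c <= '\xD7A3' = 2  -- Hangul syllables
  | c >= '\xFF01' && c <= '\xFF60' = 2  -- fullwidth forms
  | otherwise                      = 1

-- Pad a string on the right to n display columns, as measured by w.
padRightWith :: (Char -> Int) -> Int -> String -> String
padRightWith w n s = s ++ replicate (n - sum (map w s)) ' '

-- Same width function, but with U+00E1 (a-acute) forced to width 2.
forcedWide :: Char -> Int
forcedWide '\225' = 2
forcedWide c      = approxCharWidth c

main :: IO ()
main = do
  let good = padRightWith approxCharWidth 6 "\225bc"
      bad  = padRightWith forcedWide 6 "\225bc"
  print (length good, length bad)  -- prints (6,5)
```

The second field comes out one character short, which is the same kind of one-column shift as in the register output above.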

Author

kohanyirobert commented Jan 30, 2019

I did the following in both cmd.exe and mintty: started ghci, then import Hledger.Utils.String, then charWidth 'á'. Both reported 1.

I did not fully comprehend this part:

(eg make ghci, insert traceShowId or dbg0 "LABEL"'s in hledger/Hledger/Cli/Commands/Register.hs postingsReportItemAsText, :reload, :main r)..

I'm okay to do this, but break it down for me a bit :D Do I need to modify the source and recompile to do this? Or can I use ghci to somehow hook into the execution and modify it? I know a bit of Haskell (I get and used traceShowId), but I'm not using it daily so I'm not sure what you mean.

Owner

simonmichael commented Jan 30, 2019

Certainly! That is shorthand for a typical debugging workflow. It involves modifying the source a little bit and running it (in interpreted mode) from the ghci prompt. make ghci in the main hledger directory gets a useful ghci prompt. :main [ARGS] runs hledger from the ghci prompt. :reload reloads any source file changes into ghci. The kind of source file change I would typically add is to print out intermediate values. Inserting traceShowId or dbg0 "somelabel" before an expression is a good way to do that. Feel free to join #hledger for more help.
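
As a rough, self-contained illustration of the "print intermediate values" step (using only Debug.Trace from base; columnWidth is a made-up function, not hledger code, and dbg0 is not shown because it is hledger's own helper):

```haskell
import Debug.Trace (traceShowId)

-- Hypothetical stand-in for a value computed inside a report function.
-- Wrapping the expression in traceShowId prints it to stderr when it is
-- evaluated and returns it unchanged, so the surrounding code is unaffected.
columnWidth :: [String] -> Int
columnWidth names = traceShowId (maximum (map length names))

main :: IO ()
main = print (columnWidth ["abc", "\225bc"])
```

In the workflow described above, an edit like this would go into the real source file, followed by :reload and :main r at the ghci prompt.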

Author

kohanyirobert commented Jan 31, 2019

I cloned the repo (at this commit: c8b0c9a) and did make ghci in the repo's root dir in mintty and cmd.exe both.
I think it downloaded/compiled everything, but there's an error just before the ghci prompt starts

C:\Users\rkohanyi\Work\git\hledger>make ghci
which: no gsed in ... [list of places gsed was looked for]
stack exec -- ghci  -rtsopts -Wall -fno-warn-unused-do-bind -fno-warn-name-shadowing -fno-warn-missing-signatures -fno-warn-orphans -fno-warn-type-defaults  -ihledger-lib -ihledger-lib/other/ledger-parse -ihledger -ihledger-ui -ihledger-web -ihledger-web/app -ihledger-api     -DPATCHLEVEL=166 -DDEVELOPMENT -DVERSION="\"1.12.99\"" hledger/Hledger/Cli/Main.hs
GHCi, version 8.6.3: http://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from C:\Users\rkohanyi\Work\git\hledger\.ghci

hledger\Hledger\Cli\Commands\Roi.hs:207:0: error:
     error: missing binary operator before token "("
     #if MIN_VERSION_math_functions(0,3,0)

    |
207 | #if MIN_VERSION_math_functions(0,3,0)
    | ^
`gcc.exe' failed in phase `C pre-processor'. (Exit code: 1)

Not sure about this. I don't have gnu-sed and I couldn't install it in pacman inside mintty... is it strictly required?

I've checked that I have a recent-ish gcc (7.3.0), not sure if that's a problem.

I didn't have time to dive into this more than this (maybe next week), but I'm still here and listening if you have any input, I'll help get to the bottom of this.

simonmichael added a commit that referenced this issue Jan 31, 2019

doc: don't use sed; fix accidental dedenting of some lists (#961)
The sed code was showing an error message, not too precise.
Pandoc's lua filters to the rescue!

[ci skip]

Owner

simonmichael commented Jan 31, 2019

Thanks @kohanyirobert, this testing from the windows side is very useful. I've replaced sed with a more portable solution in latest master.

The Roi.hs error is odd. I guess stack exec -- cpp --version is also 7.3 ish. I have the same on a linux machine which has built successfully.

This #if test is like many others in the codebase, except this is the only one where the package (math-functions) has a hyphenated name. The docs for this cabal macro don't mention it, but I believe underscore is the right spelling here (compilation fails if I use a hyphen).

make ghci works here (mac, linux) but perhaps it's this, ie GHCI is not finding the cabal macros for some reason ? Maybe find cabal_macros.h (probably under .stack-work/ or hledger-*/.stack-work/) and try adding it explicitly to the ghci command (-optP-include -optPSOME/WHERE/cabal_macros.h). In the Makefile, or test on command line (see make ghci -n).
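
For reference, if the MIN_VERSION_math_functions macro is not defined when the file is preprocessed, cpp sees an unknown identifier followed by "(", which is what gives the "missing binary operator before token" message. A minimal sketch of a guarded test that avoids that error (this is not the actual Roi.hs code; mathFunctionsInfo is a made-up name, and which branch is taken depends on how the file is built):

```haskell
{-# LANGUAGE CPP #-}
module Main where

-- Only evaluate the version test if the macro exists at all; inside a
-- skipped #ifdef group the preprocessor does not evaluate the inner #if.
#ifdef MIN_VERSION_math_functions
#if MIN_VERSION_math_functions(0,3,0)
mathFunctionsInfo :: String
mathFunctionsInfo = "math-functions >= 0.3.0"
#else
mathFunctionsInfo :: String
mathFunctionsInfo = "math-functions < 0.3.0"
#endif
#else
mathFunctionsInfo :: String
mathFunctionsInfo = "version macro not defined"
#endif

main :: IO ()
main = putStrLn mathFunctionsInfo
```

This only shows where the error comes from; it does not explain why the macro would be missing in a given setup.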

Owner

simonmichael commented Jan 31, 2019

Oh, I replaced a different use of sed. Well I think that sed error is harmless, but hopefully we'll find a portable way of silencing it in due course..

Author

kohanyirobert commented Jan 31, 2019

Pulled latest master. Same old, same old. I ran make ghci. The warning about which: no gsed in ... is still there.
cabal_macros.h was nowhere to be found. On Windows stack installs itself into C:\sr, checked there, and my %USERPROFILE%\.local also. Also the hledger repo in .stack-work. Not there.

Someone in the linked issue (or maybe somewhere else) mentioned that the macro header gets generated on the first run of stack build so I'm running that .. it'll take some time :D I'll update my comment when it's done!

Edit: stack build took a while and never actually finished; it hangs (or is doing something painfully slow)

generics-sop-0.4.0.1: configure
generics-sop-0.4.0.1: build
shakespeare-2.0.20: configure
shakespeare-2.0.20: build
Progress 0/12

But! The macro header file got created in the meantime :) So make ghci runs now.

Changed line 87 in Register.hs to

postingsReportAsText opts (_,items) = unlines $ map (postingsReportItemAsText (dbg0 "opts=" opts) (dbg0 "amtwidth=" amtwidth) (dbg0 "balwidth=" balwidth)) (dbg0 "items=" items)

the output of running :main -f <the sample journal from before> r is here. Let me know if you need me to debug some other variables, lines, etc. Hope this helps.

Owner

simonmichael commented Mar 9, 2019

@kohanyirobert sorry I lost track of this.

It seems GHC 8.6.3 had a known bug on windows, hanging with packages like shakespeare. I'm currently updating hledger's stack.yaml to nightly-2019-03-09 & GHC 8.6.4, which should fix that.

Your debug output above shows paccount = "\9500\237bc". "\9500\237" renders here as two half-width glyphs (├í), causing the shortened line. Instead it should be "\225" (á). This suggests that things go wrong while parsing the file, reminiscent of #536. I'd be happy to receive a fix for this, but I suspect the short answer is "if you want non-ascii handled correctly, don't run hledger in a Windows environment (cmd, msys...) different from the one that built it".
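
One plausible way to arrive at exactly "\9500\237", offered as an assumption rather than a confirmed diagnosis: the UTF-8 bytes of á (0xC3 0xA1) get decoded one byte at a time with an OEM code page such as 437, where 0xC3 is ├ (U+251C, decimal 9500) and 0xA1 is í (U+00ED, decimal 237). A small sketch of that, with made-up helper names (utf8Small, cp437, misdecode) and only the two relevant table entries:

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Encode a code point below U+0800 as UTF-8 bytes (enough for U+00E1).
utf8Small :: Char -> [Word8]
utf8Small c
  | n < 0x80  = [fromIntegral n]
  | n < 0x800 = [ fromIntegral (0xC0 .|. (n `shiftR` 6))
                , fromIntegral (0x80 .|. (n .&. 0x3F)) ]
  | otherwise = error "utf8Small: only handles code points below U+0800"
  where n = ord c

-- Two entries of the code page 437 table; every other byte is passed
-- through unchanged, which is only correct for ASCII.
cp437 :: Word8 -> Char
cp437 0xC3 = '\x251C'  -- box drawing character, decimal 9500
cp437 0xA1 = '\xED'    -- i-acute, decimal 237
cp437 b    = chr (fromIntegral b)

-- Encode as UTF-8, then decode each byte separately as code page 437.
misdecode :: String -> String
misdecode = map cp437 . concatMap utf8Small

main :: IO ()
main = do
  let account = "\225bc"
  print (misdecode account)                           -- prints "\9500\237bc"
  print (length account, length (misdecode account))  -- prints (3,4)
```

Under that assumption the account name is measured as four characters instead of three, so one space less of padding is emitted after it.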

Author

kohanyirobert commented Mar 10, 2019

short answer is "if you want non-ascii handled correctly, don't run hledger in an environment (cmd, msys, ...) different from the one that built it"

Yes, it seems. :( Thanks for the effort tho'!

Owner

simonmichael commented Mar 11, 2019

Thanks. I've updated the download page and http://hledger.org/hledger.html#unicode-characters .
