unicode characters wrong representation #39

Closed
matsievskiysv opened this issue Jun 7, 2019 · 22 comments

@matsievskiysv

Here is a test script for gnuplot:

reset
set encoding utf8
set terminal pngcairo enhanced
set grid
set xlabel 'α'
set ylabel 'Δβ'
set output "./test.png"
plot [x=0:5] exp(-x)*cos(5*x) with lines title "I_{b} = 0,2 Ампер"

When I run this from command line with gnuplot test.gpi, I get the following output:

[image: test]

But when I open the same file in emacs and issue the command gnuplot-send-buffer-to-gnuplot, I get:

[image: test]

@Harry79

Harry79 commented Jun 7, 2019

I am invoking gnuplot from org-mode. This works fine:

#+begin_src gnuplot :file gnuplot-mode-issue-39.png
reset
set encoding utf8
set terminal pngcairo enhanced
set grid
set xlabel 'α'
set ylabel 'Δβ'
plot [x=0:5] exp(-x)*cos(5*x) with lines title "I_{b} = 0,2 Ампер"
#+end_src

#+RESULTS:
[[file:gnuplot-mode-issue-39.png]]
[image: gnuplot-mode-issue-39]

@matsievskiysv
Author

I use GNU Emacs 26.1, gnuplot 5.2 patchlevel 6. And this is my init script for gnuplot-mode. Maybe I messed up some configurations.

@Harry79

Harry79 commented Jun 7, 2019

Same Emacs, but Gnuplot Version 4.6 patchlevel 7.

@Harry79

Harry79 commented Jun 7, 2019

... on Windows by the way.

@conao3
Collaborator

conao3 commented Mar 21, 2020

Windows is not supported for now. Ref: #33.

@conao3 conao3 closed this as completed Mar 21, 2020
@matsievskiysv
Author

I don't use Windows

@mtreca mtreca reopened this Mar 21, 2020
@matsievskiysv
Author

Problem is still there.
Linux 5.4.0-4-amd64 #1 SMP Debian 5.4.19-1 (2020-02-13) x86_64 GNU/Linux
GNU Emacs 26.3 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.11) of 2019-09-22, modified by Debian

@mtreca
Collaborator

mtreca commented Mar 21, 2020

Hmm. I also get a correct output with emacs -Q and the following configuration:

(require 'use-package)
(use-package gnuplot
  :load-path "~/.emacs.d/lib/gnuplot-20200317.131")

[image: test]

Can you try with emacs -Q as well?

@matsievskiysv
Author

Nope, I get the same output with emacs -Q

@rolandog

rolandog commented Mar 1, 2022

I'm having the same problem (finally found this issue after a lot of searching!).

I'm using:

  • Emacs: GNU Emacs 27.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.23, cairo version 1.16.0)
  • Gnuplot: Version 5.4 patchlevel 1
  • gnuplot-mode: 20220102.1637

I have tried so far:

  • Using a magic comment of # -*- coding: utf-8; mode: org; -*-
  • Setting in my init.el files:
    ;; add utf-8 at the front of the priority list for automatic detection
    (prefer-coding-system 'utf-8)
    
    ;; coding system to be used for encoding the buffer contents on saving
    (setq buffer-file-coding-system 'utf-8)
    
    ;; decide a coding system to use for a file I/O operation
    (add-to-list 'file-coding-system-alist '("\\.org\\'" . utf-8) t)
    
    ;; decide a coding system to use for a process I/O operation
    (add-to-list 'process-coding-system-alist '("gnuplot\\'" . utf-8) t)
    

Although I'm unsure if I entered process-coding-system-alist properly.

My steps to reproduce are:
  1. Start emacs -Q.
  2. Evaluate the following in *scratch*, or save it to init-gnuplot-mode-issue-39.el:
;; -*- coding: utf-8; mode: elisp; -*-

;; current-language-environment is "English"
(set-language-environment "UTF-8")

;; sets the value of various coding systems
;; - coding system of a newly created buffer
;; - default coding system for subprocess I/O
;; - file-name-coding-system (if ASCII-compatible)
;; - set-terminal-coding-system
;; - set-keyboard-coding-system (if ASCII-compatible)
(set-default-coding-systems 'utf-8)

;; set clipboard and selection coding systems
(set-clipboard-coding-system 'utf-8)
(set-selection-coding-system 'utf-8)

;; add utf-8 at the front of the priority list for automatic detection
(prefer-coding-system 'utf-8)

;; coding system to be used for encoding the buffer contents on saving
(setq buffer-file-coding-system 'utf-8)

;; decide a coding system to use for a file I/O operation
(add-to-list 'file-coding-system-alist '("\\.org\\'" . utf-8) t)

;; decide a coding system to use for a process I/O operation
(add-to-list 'process-coding-system-alist '("gnuplot\\'" . utf-8) t)

;; add all descendant directories of a directory to your load-path
;; https://www.emacswiki.org/emacs/LoadPath
(let ((default-directory  "~/.config/emacs/lisp/"))
  (normal-top-level-add-subdirs-to-load-path))

;; add path to elpa packages for purposes of testing with emacs -Q
(add-to-list 'load-path (directory-file-name "~/.config/emacs/elpa/gnuplot-20220102.1637"))

;; disable completions for gnuplot; otherwise we get:
;; run-hooks: Symbol's function definition is void: gnuplot-context-sensitive-mode
(setq gnuplot-use-context-sensitive-completion nil)

(autoload 'gnuplot-mode "gnuplot" "Gnuplot major mode" t)
(autoload 'gnuplot-make-buffer "gnuplot" "open a buffer in gnuplot-mode" t)
(add-to-list 'auto-mode-alist '("\\.gp\\'" . gnuplot-mode) t)

;; enable org-mode
(require 'org)

;; enable gnuplot-mode
(require 'gnuplot)

;; languages to make available in org-mode
(org-babel-do-load-languages
 'org-babel-load-languages
 '((gnuplot . t)))
  3. Launch either emacs -Q gnuplot-mode-issue-39.org (and evaluate the code above in *scratch*), or emacs -Q --load gnuplot-mode-issue-39.el gnuplot-mode-issue-39.org
  4. Press C-c C-c on the following gnuplot src block in gnuplot-mode-issue-39.org:
# -*- coding: utf-8; mode: org; -*-

#+begin_src gnuplot :file gnuplot-mode-issue-39.png
reset
set encoding utf8 # or set encoding locale
set terminal pngcairo enhanced
set grid
set xlabel "α"
set ylabel "Δβ"
plot [x=0:5] exp(-x)*cos(5*x) with lines title "I_{b} = 0,2 Ампер"
#+end_src

Note: I had to modify the equation a bit so that I didn't get some warnings, and I also had to disable gnuplot-use-context-sensitive-completion because it was also throwing some errors.

Edit: forgot to add gnuplot-mode version.

Edit 2: I forgot to mention that I tried both set encoding utf8 and set encoding locale with LC_ALL=en_US.UTF-8 emacs -Q --load gnuplot-mode-issue-39.el gnuplot-mode-issue-39.org to see if that would work (see Emacs docs, and the Gnuplot docs). But, I'm still getting the same image as the original commenter.

@rolandog

rolandog commented Mar 1, 2022

By the way, if I switch to the *gnuplot* buffer and hit C-x RET f, I get:

Coding system for saving file (default iso-2022-jp):

I also forgot to add an image in the previous post:

[image: gnuplot-mode-issue-39]

Slightly different, but same artifacts.

Edit: decided to add the extra code that I mentioned here into the custom 'init.el' file above.

@rolandog

rolandog commented Mar 1, 2022

In my case, what I could find is that the label text (the only text in my own file that had non-ASCII characters) was apparently being decoded as ISO-8859-1 (Latin-1), or possibly ISO-8859-15:

  • The word Mínimo showed up as Mínimo
  • The word Máximo showed up as Máximo

So, this is text encoded as UTF-8 that is somehow being decoded as ISO-8859-1 inside a UTF-8 encoded document (the svg).

If I switch to set encoding iso_8859_1 I somehow get an even 'worse' encoding:

  • The word Mínimo showed up as MÃ�­nimo
  • The word Máximo showed up as MÃ�¡ximo

I'm not sure if I'm wrapping my head around this properly, but I think that that would be the previous 'Mínimo' taken as UTF-8 and re-decoded as ISO-8859-1 inside an already ISO-8859-1 encoded document (the svg).
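
For anyone who wants to check this reasoning, here is a minimal Python sketch (Python is not part of the original setup; the function name misread_as_latin1 is made up for illustration). It assumes the mangling is nothing more than UTF-8 bytes being read as ISO-8859-1, once or twice:

```python
# Reproduce the mangling by misreading UTF-8 bytes as ISO-8859-1 (Latin-1).
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode those same bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


once = misread_as_latin1("Máximo")   # one wrong decode
twice = misread_as_latin1(once)      # the wrong decode applied a second time

print(once)         # Máximo
print(repr(twice))  # repr() exposes the invisible control character U+0083

# As long as no byte was dropped, one wrong decode can be undone.
restored = once.encode("iso-8859-1").decode("utf-8")
print(restored)     # Máximo
```

If this assumption holds, the single misread matches the first pair of results above, and the double misread is consistent with the 'worse' output seen with set encoding iso_8859_1.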

@mtreca
Collaborator

mtreca commented Mar 7, 2022

Thank you for the detailed report @rolandog. I will try to reproduce your issue and investigate it, but I doubt I will be able to recreate it. I would wager that the encoding issue comes from the elisp side of things after calling gnuplot-send-buffer-to-gnuplot. I will look into it today, but I am not too familiar with the internals of the package.

@mtreca
Collaborator

mtreca commented Mar 8, 2022

Hi again.

Thank you for the detailed bug report. I am afraid that I don't have more knowledge of the issue than you do at this point, especially regarding encoding in Emacs.

Could you open your test gnuplot file, remove all encoding-related configuration, and execute the following in the gnuplot buffer using M-S-: (buffer-substring-no-properties (point-min) (point-max))? I would like to see whether the resulting text is displayed correctly or whether encoding errors are already present.

Second, could you inspect the variable gnuplot-gui-all-types and give me the values present at key "ENCODING"?

It is quite complicated for me to debug your issue directly, as I have no encoding issues on my own machine. Interestingly, the gnuplot buffer also shows me the same encoding as you, iso-2022-jp. This, however, does not affect my ability to properly generate figures.

@rolandog

rolandog commented Mar 19, 2022

Hi @mtreca. Thank you for your guidance; sorry that it took me so long to get back.

These are now the contents of the test files:

gnuplot-mode-issue-39-test.org

#+begin_src gnuplot :file gnuplot-mode-issue-39-test.png
reset
#set encoding utf8
set terminal pngcairo enhanced
set grid
set xlabel "α"
set ylabel "Δβ"
plot [x=0:5] exp(-x)*cos(5*x) with lines title "I_{b} = 0,2 Ампер"
#+end_src

#+RESULTS:
[[file:gnuplot-mode-issue-39-test.png]]

gnuplot-mode-issue-39-test.el
;; add path to elpa packages for purposes of testing with emacs -Q
(add-to-list 'load-path (directory-file-name "~/.config/emacs/elpa/gnuplot-20220102.1637"))

;; disable completions for gnuplot; otherwise we get:
;; run-hooks: Symbol's function definition is void: gnuplot-context-sensitive-mode
(setq gnuplot-use-context-sensitive-completion nil)

(autoload 'gnuplot-mode "gnuplot" "Gnuplot major mode" t)
(autoload 'gnuplot-make-buffer "gnuplot" "open a buffer in gnuplot-mode" t)
(add-to-list 'auto-mode-alist '("\\.gp\\'" . gnuplot-mode) t)

;; enable org-mode
(require 'org)

;; enable gnuplot-mode
(require 'gnuplot)

;; languages to make available in org-mode
(org-babel-do-load-languages
 'org-babel-load-languages
 '((gnuplot . t)))

After running:

user@computer:~$ LC_ALL=en_US.UTF-8 emacs -Q --load ~/org/gnuplot-mode-issue-39-test.el ~/org/gnuplot-mode-issue-39-test.org

or

user@computer:~$ emacs -Q --load ~/org/gnuplot-mode-issue-39-test.el ~/org/gnuplot-mode-issue-39-test.org

Here are the results.

Output of M-S-: (buffer-substring-no-properties (point-min) (point-max))

This was the same for either of the two commands, and for commenting out the 'set encoding utf8' line or leaving it uncommented.

For information about GNU Emacs and the GNU system, type C-h C-a.
executing Gnuplot code block...
Starting gnuplot plotting program...Done
Code block evaluation complete.
Type C-c C-c or C-c C-x to view the image as text or hex.
Auto-saving...done
"
	G N U P L O T
	Version 5.4 patchlevel 1    last modified 2020-12-01 

	Copyright (C) 1986-1993, 1998, 2004, 2007-2020
	Thomas Williams, Colin Kelley and many others

	gnuplot home:     http://www.gnuplot.info
	faq, bugs, etc:   type \"help FAQ\"
	immediate help:   type \"help\"  (plot window: hit 'h')

Terminal type is now 'qt'
gnuplot> cd '/home/rolandog/org/'
gnuplot> 
gnuplot> set term png

Terminal type is now 'png'
set output \"gnuplot-mode-issue-39-test.png\"
reset
set encoding utf8
set terminal pngcairo enhanced
set grid
set xlabel 'α'
Options are 'nocrop enhanced size 640,480 font \"arial,12.0\" '
gnuplot> set ylabel 'Δβ'
gnuplot> gnuplot> gnuplot> 
Terminal type is now 'pngcairo'
Options are ' background \"#ffffff\" enhanced fontscale 1.0 size 640, 480 '
gnuplot> gnuplot> gnuplot> plot [x=0:5] exp(-x)*cos(5*x) with lines title 'I_{b} = 0,2 Ампер'
gnuplot> set output

gnuplot> gnuplot> gnuplot> "


And, finally... this was a bit interesting.

When I launch Emacs normally (no -Q, with an init.el that keeps the encoding declarations I listed above), I get:

value of ENCODING key from gnuplot-gui-all-types variable in Emacs with no -Q
 ("encoding"
  ("ENCODING" 'list " " "default" "iso_8859_1" "cp850" "cp437"))

But when launching with any of the previous commands (with -Q --load gnuplot-mode-issue-39-test.el), there is no gnuplot-gui-all-types value available to be inspected.

P.S. I also tried changing the string quotes from double to single (because I thought the Gnuplot Quote Character option might have something to do with it). I also tried saving the same script to a file (while adding a 'set output' line) to see if I could get different results, but the results were the same (I sent the buffer to gnuplot with C-c C-b).

@mtreca
Collaborator

mtreca commented Mar 22, 2022

Thank you for the detailed reply. I think I am reaching the limits of my technical abilities here, especially given that I am fairly inexperienced with the package, or encoding in general.

I might take a deeper look into the issue, but it won't be right away since I have very little free time at the moment. I am leaving the issue open in the meantime, in case anyone wants to chime in or bring a solution to the problem.

@rolandog

Don't worry @mtreca. I'm very grateful for your help and support. My intention was to document the issue as thoroughly as possible so that people much smarter and more experienced than I am, like yourself, could have a better shot at solving this.

temporary solution

I was not able to solve this particular problem, but I was able to side-step the issue by not engaging directly through a gnuplot buffer:

#+begin_src gnuplot :eval never :dir ~/org :tangle gnuplot-mode-issue-39-test.gp :tangle-mode (identity #o600)
  set encoding utf8
  set terminal pngcairo enhanced
  set output 'gnuplot-mode-issue-39-test.png'
  set grid
  set xlabel 'α'
  set ylabel 'Δβ'
  plot [x=0:5] exp(-x)*cos(5*x) with lines title 'I_{b} = 0,2 Ампер'
#+end_src

#+begin_src gnuplot :dir ~/org :results silent
  load 'gnuplot-mode-issue-39-test.gp'
#+end_src

This ended up producing:
[image: gnuplot-mode-issue-39-test]

In my actual test-case I think I'll have to look into writing a tsv or csv file to be able to load some data in the gnuplot scriptfile.

additional clues

It may end up being a problem from my environment variables (perhaps nl_NL.UTF-8 should be nl_NL.utf8?). I've read some reporting about possibly related bugs (see links at the end).

As I had been in the process of arranging dotfiles according to the XDG Base Directory Specification, I noticed the ~/.gnuplot_history file, and decided to inspect it. The contents surprised me, because they were showing up as some of the mangled characters in the reports:

set\040terminal\040png
set\040output\040'/tmp/gnuplot6QxskC'
set\040encoding\040utf8
set\040terminal\040pngcairo\040enhanced
set\040output\040'gnuplot-mode-issue-39-test.png'
set\040grid
set\040xlabel\040'α'
set\040ylabel\040'Î\M^Tβ'
plot\040[x=0:5]\040exp(-x)*cos(5*x)\040with\040lines\040title\040'I_{b}\040=\0400,2\040Ð\M^PмпеÑ\M^@'
exit

In the Emacs *gnuplot* buffer the characters appeared fine, so it was odd to see such a difference. So I started gnuplot in interactive mode in a terminal and tried pasting the script line by line. To my surprise, the characters became mangled as I pasted them (e.g. 'Δβ' would become 'Îβ').

After checking the docs once again, I tried pasting the original gnuplot script file, but substituting the UTF-8 characters by their octal representations:

set encoding utf8
set terminal pngcairo enhanced
set output 'gnuplot-mode-issue-39-test.png'
set grid
set xlabel "\316\261"
set ylabel "\316\224\316\262"
plot [x=0:5] exp(-x)*cos(5*x) with lines title 'I_{b} = 0,2 \320\220\320\274\320\277\320\265\321\200'

And this ended up working in the terminal, and in the org-mode code block:

#+begin_src gnuplot :file gnuplot-mode-issue-39-test.png
  reset
  set encoding utf8
  set terminal pngcairo enhanced
  set grid
  set xlabel "\316\261"
  set ylabel "\316\224\316\262"
  plot [x=0:5] exp(-x)*cos(5*x) with lines title 'I_{b} = 0,2 \320\220\320\274\320\277\320\265\321\200'
#+end_src

I'm not sure why using double quotes was necessary for the x and y labels:

Note that strings in double-quotes are parsed differently than those enclosed in single-quotes. The major
difference is that backslashes may need to be doubled when in double-quoted strings.
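
The octal escapes used above are simply the UTF-8 bytes of each character written in base 8. A minimal Python sketch for generating them follows; the helper name gnuplot_octal is made up for illustration, and it does not escape backslashes or quotes already present in the input.

```python
def gnuplot_octal(text: str) -> str:
    """Write every non-ASCII character as octal escapes of its UTF-8 bytes."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # plain ASCII is left untouched
        else:
            out.extend(f"\\{byte:03o}" for byte in ch.encode("utf-8"))
    return "".join(out)


print(gnuplot_octal("α"))      # \316\261
print(gnuplot_octal("Δβ"))     # \316\224\316\262
print(gnuplot_octal("Ампер"))  # \320\220\320\274\320\277\320\265\321\200
```

Because the resulting script is pure ASCII, it sidesteps whatever is mangling the multi-byte characters on input.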

I'm out of time for today, but I'll try getting more familiar with the Emacs / gnuplot and gnuplot-mode internals to try to pinpoint where exactly things are going wrong. But I guess this has helped to narrow-down the search-space for the actual bug.

Related bugs or S.O. questions:

@mtreca
Collaborator

mtreca commented Mar 23, 2022

Thanks for the extremely detailed information. You are more than welcome to give a shot at fixing the issue and contributing to the emacs-gnuplot package more generally!

@rolandog

You're welcome!

I have finally stumbled upon the root cause for the bug.

It seems there is a licensing conflict between gnuplot (which I didn't know wasn't part of the GNU project) and a dependency used to read the lines from the terminal.

To avoid that conflict, some distributions --- like Debian --- build gnuplot with editline instead of readline.

After recompiling from source with Stephen Kitt's instructions, sending the buffer to gnuplot worked like a charm. Pasting a script into an interactive gnuplot session in the terminal also worked, and the history file was written correctly.
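
To check which line-editing library a given gnuplot binary uses before rebuilding anything, something like the following Python sketch may help. It assumes a Linux system where ldd is available and gnuplot is dynamically linked. Both function names are made up for illustration, and the check only looks for the substrings libedit and libreadline in the ldd output.

```python
import shutil
import subprocess


def line_editor_from_ldd(ldd_output: str) -> str:
    """Classify ldd output by the line-editing library it mentions."""
    if "libedit" in ldd_output:
        return "editline"
    if "libreadline" in ldd_output:
        return "readline"
    return "unknown"


def gnuplot_line_editor() -> str:
    """Run ldd on the gnuplot found on PATH; return 'unknown' if that is not possible."""
    gnuplot = shutil.which("gnuplot")
    ldd = shutil.which("ldd")
    if gnuplot is None or ldd is None:
        return "unknown"
    result = subprocess.run(
        [ldd, gnuplot], capture_output=True, text=True, check=False
    )
    return line_editor_from_ldd(result.stdout)
```

Calling print(gnuplot_line_editor()) should report editline on a build like the one described above and readline after the rebuild, provided the assumptions hold.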

A note for @matsievskiysv:

  • you will need to recompile
  • you may need to set some environment variables in the terminal (like export DEBEMAIL="mail@example.com")
  • you may need to commit the changes before being able to build: (dpkg-source --commit)
  • in my case, I only had to install the 'gnuplot-data_5.4.1+dfsg1-1.1_all.deb' and 'gnuplot-qt_5.4.1+dfsg1-1.1_amd64.deb' files

Sorry for all the trouble @mtreca. I think this issue can finally be closed, as it's not directly related to gnuplot-mode.

@rolandog

Btw, I'll try reporting the bug upstream (to Ubuntu and Debian). From what I've read online libedit should've supported unicode since 2014 or 2016 by default.

If this doesn't get fixed upstream, maybe I'll contribute a small section for the bottom of the README, so that others may find out WHY this may happen and a pointer on how to fix it.

@mtreca
Collaborator

mtreca commented Mar 24, 2022

Very nice detective work @rolandog!

I modified the README accordingly. We can finally close this.

@mtreca mtreca closed this as completed Mar 24, 2022
@rolandog

rolandog commented Apr 2, 2022

Thanks @mtreca ! Happy to have helped track down how to reproduce and fix this bug. 😄
