non-ascii file name regression #32
Comments
I already reported this; \detokenize should be used to avoid it (from #24 (comment)) |
@aminophen Not quite that simple ... but I have to say I'm surprised that the binaries treat the file name argument at the TeX level (it's not |
Something like |
Any argument to *tex is treated as TeX code ;-) When the first token is a character, *tex behaves as if \input had been prefixed; when the first token is a control sequence, \input is not prefixed. |
@aminophen I have to say I've always imagined the logic differently :) 'If the first char is the escape char, treat as TeX code, otherwise read as a filename' |
one thing I had experimented with is starting with
\long\def\UTFviii@two@octets#1#2{%
\string#1\string#2}
and switching to the main definition "later" but the timing gets tricky and
making the implicit input on the commandline work like an explicit \input
also isn't as easy as one would hope.
|
Delaying utf8.def etc. to \everyjob might be the only solution to this (not tested well for all engines, and if so I will have to adjust platex as well) (edit: it will change a log filename opened by
--- latex.ltx.orig 2018-04-07 06:33:45.000000000 +0900
+++ latex.ltx 2018-04-08 19:15:09.000000000 +0900
@@ -8641,12 +8641,6 @@
\catcode10=12 % ctrl J
\catcode12=13 % ctrl L
\catcode13=5 % newline
-\@tempcnta=128
-\loop
- \catcode\@tempcnta=13
- \advance\@tempcnta\@ne
-\ifnum\@tempcnta<256
-\repeat
\def\UseRawInputEncoding{%
\let\DeclareFontEncoding@\DeclareFontEncoding@saved % revert
\let\DeclareUnicodeCharacter\@undefined % revert
@@ -8669,10 +8663,6 @@
\repeat
}
\let\DeclareFontEncoding@saved\DeclareFontEncoding@
-\edef\inputencodingname{utf8}%
-\input{utf8.def}
-\let\@inpenc@test\@undefined
-\let\saved@space@catcode\@undefined
\else
\@tempcnta=0
\loop
@@ -8793,6 +8783,18 @@
\endgroup}
\let\@filelist\@gobble
\def\@addtofilelist#1{\xdef\@filelist{\@filelist,#1}}%
+\everyjob\expandafter{\the\everyjob
+\@tempcnta=128
+\loop
+ \catcode\@tempcnta=13
+ \advance\@tempcnta\@ne
+\ifnum\@tempcnta<256
+\repeat
+\edef\inputencodingname{utf8}%
+\input{utf8.def}
+\let\@inpenc@test\@undefined
+\let\saved@space@catcode\@undefined
+}
\makeatother
\errorstopmode
\dump |
@aminophen yes I'm actually currently running some tests with ltfinal changed as
would need the longer cases as well, not just the two-byte one of course. Delaying the catcode activation until \everyjob would work on the command line, but if we can make it work without that, it may give a path to accepting UTF-8 filenames more generally in the document (which did not work in previous releases after inputenc was loaded) |
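For illustration only, the "longer cases" could mirror the two-octet experiment quoted earlier in the thread. This is a sketch, not the change that was actually tested or committed: it assumes the three- and four-octet handlers keep the names and argument counts used in utf8.def (\UTFviii@three@octets, \UTFviii@four@octets), and that these stand-in definitions are in force only while the file name is being read, with the main definitions restored afterwards.

```latex
\makeatletter
% Assumption: temporary stand-ins for the utf8.def handlers.
% Each one turns the active bytes of a UTF-8 sequence back into
% plain character tokens via \string, so that a file name passes
% through unchanged instead of being expanded as TeX code.
\long\def\UTFviii@two@octets#1#2{%
  \string#1\string#2}
\long\def\UTFviii@three@octets#1#2#3{%
  \string#1\string#2\string#3}
\long\def\UTFviii@four@octets#1#2#3#4{%
  \string#1\string#2\string#3\string#4}
% The main definitions must be restored before any text is typeset.
\makeatother
```

Inside latex.ltx the \makeatletter/\makeatother pair would not be needed, since @ is already a letter there; it is included only so the fragment can be tried on its own.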
724013b works as expected for pdfLaTeX; I committed support for that change in pLaTeX texjporg/platex@8b6c518 and it’s ok on both pLaTeX and upLaTeX. I’ll upload the new version of pLaTeX when LaTeX is ready. |
Brief outline of the bug
One of my test cases breaks with the current 2018-04-01 release.
What I have done:
ltxbase
)This gives:
Maybe this is a MiKTeX-specific Windows bug. I will do further tests on macOS and Linux.
Minimal example showing the bug
Log file (required) and possibly PDF file
texput.log