\c_document_cctab does not make upper-half of the 8-bit range active #814

Closed
stone-zeng opened this issue Oct 5, 2020 · 4 comments

Comments

@stone-zeng
Contributor

In l3interface:

  • \c_document_cctab:

    Category code table for a standard LaTeX document, as set by the LaTeX kernel. In particular, the upper-half of the 8-bit range will be set to "active" with pdfTeX only. No babel shorthands will be activated.

However, for the following example:

\csname tl_analysis_show:n\endcsname{1:^^70^^80}

\ExplSyntaxOn
\tl_analysis_show:n { 2:^^70^^80 }

\cctab_begin:N \c_document_cctab
\csname tl_analysis_show:n\endcsname{3:^^70^^80}
\cctab_end:

\ExplSyntaxOff

When run with latex or pdflatex (or their *-dev variants), it gives:

LaTeX2e <2020-10-01>
L3 programming layer <2020-09-24> xparse <2020-03-03>
The token list contains the tokens:
>  1 (the character 1)
>  : (the character :)
>  p (the letter p)
>  � (active character=macro:->\UTFviii@invalid@err �).
<recently read> }
                 
l.1 ...me tl_analysis_show:n\endcsname{1:^^70^^80}
                                                  
? 
The token list contains the tokens:
>  2 (the character 2)
>  : (the letter :)
>  p (the letter p)
>  � (active character=macro:->\UTFviii@invalid@err �).
<recently read> }
                 
l.4 \tl_analysis_show:n { 2:^^70^^80 }
                                      
? 
The token list contains the tokens:
>  3 (the character 3)
>  : (the character :)
>  p (the letter p)
>  � (the character �).
<recently read> }
                 
l.7 ...me tl_analysis_show:n\endcsname{3:^^70^^80}
                                                  
? 

So while \c_document_cctab is in force, ^^80 is an ordinary character (category code "other") rather than an active one, which contradicts the documentation quoted above.
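
The same thing can be checked more directly by printing the category code itself. This is only a minimal sketch: \demo_show_catcode:n is a hypothetical helper name, not part of expl3, and the test assumes a LaTeX2e release recent enough to have expl3 in the format. The helper is defined while expl3 syntax is active, so its body is already tokenized before \c_document_cctab takes effect.

```latex
\documentclass{article}
\ExplSyntaxOn
% Hypothetical helper (not part of expl3): writes the category code
% that character #1 has while \c_document_cctab is in force.
\cs_new_protected:Npn \demo_show_catcode:n #1
  {
    \cctab_begin:N \c_document_cctab
      \iow_term:x
        { catcode~of~char~#1:~ \char_value_catcode:n {#1} }
    \cctab_end:
  }
% "80 is the first character of the upper half of the 8-bit range.
\demo_show_catcode:n { "80 }
\ExplSyntaxOff
\begin{document}
\end{document}
```

With pdfTeX, the documentation implies the reported value should be 13 (active); a value of 12 (other) corresponds to the behaviour shown in the third log above.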

BTW, the variables described here should be \c_code_cctab and \c_document_cctab, not \c_document_cctab and \c_other_cctab:

latex3/l3kernel/l3cctab.dtx

Lines 769 to 775 in 19ea461

% \begin{variable}{\c_document_cctab, \c_other_cctab}
% To pick up document-level category codes, we need to delay set up to the
% end of the format, where that's possible. Also, as there are a \emph{lot}
% of category codes to set, we avoid using the official interface and store the
% document codes using internal code. Depending on whether we are in the hook
% or not, the catcodes may be code or document, so we explicitly set up both
% correctly.

@josephwright
Member

Hopefully now sorted: I was forgetting that we needed to do this

@josephwright reopened this Oct 8, 2020
@josephwright
Member

@PhelypeOleinik (but others too): is there any point in the delay I set up for creating the table? I think we are now ignoring LaTeX2e settings entirely, so we could skip that: correct?

@zauguin
Member

zauguin commented Oct 8, 2020

@josephwright If I'm not missing anything, we are only ignoring the LaTeX2e settings in 8-bit engines and therefore still need the delay for Unicode engines.

@josephwright
Member

@zauguin Ah right yes: the catcodes from UnicodeData have to be loaded. Probably should note that in the code!
