l3text: improve the handling of \exp_not:n in \text_expand:n #875

Closed
Skillmon opened this issue Apr 24, 2021 · 8 comments
Comments

@Skillmon
Contributor

Skillmon commented Apr 24, 2021

Currently, \text_expand:n makes a few assumptions about how \exp_not:n is used, namely:

  • \exp_not:n is followed only by \exp_after:wN tokens (and only by the correct number of them)
  • these are followed by a brace group, which is the group \exp_not:n acts on (this breaks with non-standard category codes, or with code that forms the group for \exp_not:n only upon further expansion)

As a result, a usage such as \exp_not:n \exp_after:wN \exp_after:wN { <stuff> } (which might be wrong from a programmer's point of view, but doesn't throw an error) will throw an error inside of \text_expand:n. Also, macros like \tl_head:n and \tl_tail:n don't work inside of \text_expand:n.
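
A minimal sketch of the plain-TeX side of this claim, outside of \text_expand:n: in an ordinary x-type expansion, the stray \exp_after:wN tokens are expanded away while \exp_not:n looks for its opening brace, so the assignment should complete without an error. Only standard expl3 functions are used; \l_tmpa_tl and \tl_show:N are chosen purely for illustration.

```latex
\documentclass{minimal}

\ExplSyntaxOn
% Plain x-expansion, no \text_expand:n involved: \exp_not:n (\unexpanded)
% expands the two \exp_after:wN tokens while searching for its opening
% brace, so TeX accepts this input.
\tl_set:Nx \l_tmpa_tl { \exp_not:n \exp_after:wN \exp_after:wN { abc } }
% Display the result in the terminal/log for inspection.
\tl_show:N \l_tmpa_tl
\ExplSyntaxOff

\stop
```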

While getting everything right with \exp_not:n would require a lot of code that essentially replicates part of TeX's parsing, we could at least fix the errors thrown on a wrong number of \exp_after:wNs and fix the usage of \tl_head:n inside \text_expand:n with a simple change:

\documentclass{minimal}

\ExplSyntaxOn
% throwing errors
\tl_set:Nx \l_tmpa_tl
  { \text_expand:n { \exp_not:n \exp_after:wN \exp_after:wN { abc } } }
\tl_set:Nx \l_tmpa_tl { \text_expand:n { \tl_head:n { abc } } }

% proposed fix
\cs_set:Npn \__text_expand_noexpand:nn #1#2
  {
    \exp_after:wN \__text_expand_store:n \exp_after:wN { \exp_not:n #1 {#2} }
    \__text_expand_loop:w
  }

% now working correctly
\tl_set:Nx \l_tmpa_tl
  { \text_expand:n { \exp_not:n \exp_after:wN \exp_after:wN { abc } } }
\tl_set:Nx \l_tmpa_tl { \text_expand:n { \tl_head:n { abc } } }
\ExplSyntaxOff

\stop

Unfortunately, this doesn't solve everything; e.g., \tl_tail:n will still throw low-level TeX errors.

@blefloch
Member

Can you take a look at 30d827a please? I opted for f-expanding what follows \unexpanded, which is essentially enough to ensure that the following macro parameter is the argument of \unexpanded. Then I kept the grabbing of tokens until { and treated them as you suggest. This helps in case someone decided to use \unexpanded \relax { ... }, but it hurts in case of unconventional catcodes. Any opinion welcome on this minor choice.
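
To illustrate the idea of f-expanding what follows, here is a small sketch that is not taken from the commit: the \mydemo_...: functions are hypothetical names made up for this example. It shows that after \exp:w \exp_end_continue_f:w has expanded the following tokens up to the first non-expandable one (here the opening brace), a macro placed in front can grab the resulting brace group as a normal argument.

```latex
\documentclass{minimal}

\ExplSyntaxOn
% Hypothetical helpers: the brace group only appears after two expansion steps.
\cs_new:Npn \mydemo_outer: { \mydemo_inner: }
\cs_new:Npn \mydemo_inner: { { payload } }
% Hypothetical grabber: takes one argument and returns it, bracketed, unexpanded.
\cs_new:Npn \mydemo_grab:n #1 { \exp_not:n { [#1] } }

\tl_set:Nx \l_tmpa_tl
  {
    % f-expand what follows first; expansion stops at the opening brace,
    % so \mydemo_grab:n then sees { payload } as its argument.
    \exp_after:wN \mydemo_grab:n
    \exp:w \exp_end_continue_f:w \mydemo_outer:
  }
% Display the result in the terminal/log for inspection.
\tl_show:N \l_tmpa_tl
\ExplSyntaxOff

\stop
```

Without the f-expansion step, \mydemo_grab:n would grab the token \mydemo_outer: itself rather than the group it eventually produces.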

@Skillmon
Contributor Author

@blefloch Looks good to me. But given f-expansion (which I had completely forgotten is a really good solution here), we could also implement a small loop that reproduces the behaviour of \unexpanded completely.
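
As a rough sketch of the behaviour such a loop would have to reproduce, the following shows \exp_not:n in a plain x-type expansion skipping \scan_stop: and a space, and expanding a macro, before it finds its opening brace. \mydemo_group: is a hypothetical name used only for this illustration.

```latex
\documentclass{minimal}

\ExplSyntaxOn
% Hypothetical helper: the brace group appears only after expansion.
\cs_new:Npn \mydemo_group: { { \empty } }
% \exp_not:n (\unexpanded) ignores \scan_stop: (\relax) and the space
% produced by \c_space_tl, then expands \mydemo_group: to find its
% opening brace; the content \empty is left unexpanded.
\tl_set:Nx \l_tmpa_tl
  { \exp_not:n \scan_stop: \c_space_tl \mydemo_group: }
% Display the result in the terminal/log for inspection.
\tl_show:N \l_tmpa_tl
\ExplSyntaxOff

\stop
```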

@Skillmon
Contributor Author

Skillmon commented Apr 25, 2021

The following implements \exp_not:n for \text_expand:n, ignoring \relax and spaces and performing the typical error recovery \unexpanded would do when it hits a missing opening brace. It works with arbitrary category codes.

This is just an MWE, so the code would need some refactoring to be merged into expl3:

\documentclass{minimal}

\ExplSyntaxOn
% new fix
\cs_set:Npn \__text_expand_cs_expand:N #1
  {
    \__text_if_expandable:NTF #1
      {
        \token_if_eq_meaning:NNTF #1 \exp_not:n
          {
            \exp_after:wN \__text_expand_noexpand:w
            \exp:w \exp_end_continue_f:w
          }
          { \exp_after:wN \__text_expand_loop:w #1 }
      }
      {
        \__text_expand_store:n {#1}
        \__text_expand_loop:w
      }
  }
\cs_set:Npn \__text_expand_noexpand:w #1 \q__text_recursion_stop
  {
    \tl_if_head_is_N_type:nTF {#1}
      { \__text_expand_noexpand:N }
      {
        \tl_if_head_is_group:nTF {#1}
          { \__text_expand_noexpand:n }
          { \__text_expand_noexpand_space:w }
      }
    #1 \q__text_recursion_stop
  }
\cs_set:Npn \__text_expand_noexpand:n #1
  {
    \__text_expand_store:n {#1}
    \__text_expand_loop:w
  }
\cs_set:Npn \__text_expand_noexpand:N #1
  {
    \token_if_eq_meaning:NNTF #1 \scan_stop:
      {
        \exp_after:wN \__text_expand_noexpand:w
        \exp:w \exp_end_continue_f:w
      }
      {
        \msg_expandable_error:nn { text } { missing-brace }
        \exp_after:wN \__text_expand_noexpand_recover:n
          \exp_after:wN { \if_false: } \fi: #1
      }
  }
\cs_set:Npn \__text_expand_noexpand_recover:n #1
  {
    \__text_expand_store:n {#1}
    \__text_expand_loop:w
  }
\exp_last_unbraced:NNo \cs_set:Npn \__text_expand_noexpand_space:w \c_space_tl
  {
    \exp_after:wN \__text_expand_noexpand:w
    \exp:w \exp_end_continue_f:w
  }
\exp_args:Nnnx \msg_new:nnn { text } { missing-brace }
  { Missing~ \c_left_brace_str \c_space_tl inserted. }


% tests
\group_begin:
\char_set_catcode_group_begin:N (
\char_set_catcode_group_end:N )
\tl_set:Nx \l_tmpa_tl { \text_expand:n { \exp_not:n ( \empty ) } }
\group_end:

\tl_set:Nx \l_tmpa_tl { \text_expand:n { \tl_tail:n { abc } } }

% error recovery like \exp_not:n would do
\tl_set:Nx \l_tmpa_tl
  { \text_expand:n { \exp_not:n \hbox_set:Nn \l_tmpa_box { abc } } }

\stop

@Skillmon
Contributor Author

Actually, that still misses some aspects: it would break if the f-expansion were stopped by a space that is followed by more expandable material. The following has some changes in \__text_expand_noexpand:N to account for that, testing whether the token is

  • expandable (if so, continue f-expansion),
  • defined (if so, throw the missing-brace error and "recover" from it), or
  • undefined (if so, throw the undefined control sequence error and continue f-expansion).

\documentclass{minimal}

\ExplSyntaxOn
% new fix
\cs_set:Npn \__text_expand_cs_expand:N #1
  {
    \__text_if_expandable:NTF #1
      {
        \token_if_eq_meaning:NNTF #1 \exp_not:n
          {
            \exp_after:wN \__text_expand_noexpand:w
            \exp:w \exp_end_continue_f:w
          }
          { \exp_after:wN \__text_expand_loop:w #1 }
      }
      {
        \__text_expand_store:n {#1}
        \__text_expand_loop:w
      }
  }
\cs_set:Npn \__text_expand_noexpand:w #1 \q__text_recursion_stop
  {
    \tl_if_head_is_N_type:nTF {#1}
      { \__text_expand_noexpand:N }
      {
        \tl_if_head_is_group:nTF {#1}
          { \__text_expand_noexpand:n }
          { \__text_expand_noexpand_space:w }
      }
    #1 \q__text_recursion_stop
  }
\cs_set:Npn \__text_expand_noexpand:n #1
  {
    \__text_expand_store:n {#1}
    \__text_expand_loop:w
  }
\cs_set:Npn \__text_expand_noexpand:N #1
  {
    \token_if_eq_meaning:NNTF #1 \scan_stop:
      {
        \exp_after:wN \__text_expand_noexpand:w
        \exp:w \exp_end_continue_f:w
      }
      {
        \token_if_expandable:NTF #1
          {
            \exp_after:wN \__text_expand_noexpand:w
            \exp:w \exp_end_continue_f:w
          }
          {
            \cs_if_exist:NTF #1
              {
                \msg_expandable_error:nn { text } { missing-brace }
                \exp_after:wN \__text_expand_noexpand_recover:n
                  \exp_after:wN { \if_false: } \fi:
              }
              {
                \exp_after:wN \__text_expand_noexpand:w
                \exp:w \exp_end_continue_f:w
              }
          }
        #1
      }
  }
\cs_set:Npn \__text_expand_noexpand_recover:n #1
  {
    \__text_expand_store:n {#1}
    \__text_expand_loop:w
  }
\exp_last_unbraced:NNo \cs_set:Npn \__text_expand_noexpand_space:w \c_space_tl
  {
    \exp_after:wN \__text_expand_noexpand:w
    \exp:w \exp_end_continue_f:w
  }
\exp_args:Nnnx \msg_new:nnn { text } { missing-brace }
  { Missing~ \c_left_brace_str \c_space_tl inserted. }


%% tests
% non-standard category codes
\group_begin:
\char_set_catcode_group_begin:N (
\char_set_catcode_group_end:N )
\tl_set:Nx \l_tmpa_tl { \text_expand:n { \exp_not:n ( \empty ) } }
\group_end:

% ignoring \relax
\tl_set:Nx \l_tmpa_tl { \text_expand:n { \exp_not:n \scan_stop: \empty { \empty } } }

% \tl_tail:n that didn't work with my first fix (but with Bruno's)
\tl_set:Nx \l_tmpa_tl { \text_expand:n { \tl_tail:n { abc } } }

% spaces followed by more expandable stuff
\tl_set:Nx \l_tmpa_tl { \text_expand:n { \use:n { \use:n { \exp_not:n } ~ } ~ \empty { \empty } } }

% throw an error upon undefined macros
\tl_set:Nx \l_tmpa_tl { \text_expand:n { \exp_not:n \my_undefined_cs: { \empty } } }

% error recovery like \exp_not:n would do for missing braces
\tl_set:Nx \l_tmpa_tl
  { \if_false: { \fi: \text_expand:n { \exp_not:n \hbox_set:Nn \l_tmpa_box { abc } } } }

\stop

@Skillmon
Contributor Author

This still fails for the very unlikely case that someone uses an implicit opening group like in \unexpanded\bgroup \empty} though.
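
For reference, a small sketch of that corner case using TeX primitives directly (\demo is a hypothetical macro name). It is written with \edef rather than \tl_set:Nx because the input necessarily contains one more closing brace than opening braces, so it cannot be passed as a macro argument.

```latex
\documentclass{minimal}

% \unexpanded accepts the implicit \bgroup as its opening delimiter but
% needs an explicit } to close: the first } ends the \unexpanded text,
% the second } ends the \edef.
\edef\demo{\unexpanded\bgroup \empty}}
% Display the replacement text in the terminal/log for inspection.
\show\demo

\stop
```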

@josephwright
Member

Remember that the input is meant to be 'text' in general. I think we probably do want to worry about \tl_head:n and the like, but arbitrary low-level code is, I think, something we can live without.

@Skillmon
Contributor Author

@josephwright I don't think we should support the \bgroup variant (and I'm afraid that wouldn't really be possible anyway, as it would require unbalanced braces in the input, which will trip up the reading at some point). So either go with the single f-expansion as in @blefloch's code, or with my second variant, which supports a large portion of \exp_not:n's syntax.

@blefloch
Member

I've gone for a simplification of your second variant.
