Non-terminating character macro for #\- breaks #: reader. #489

Closed
karlosz opened this issue May 12, 2024 · 2 comments

karlosz commented May 12, 2024

The following minimal example shows a bug in CCL: defining a non-terminating macro character for #\- breaks symbol tokenization after #: (but not without #:).

;; No-op, other than making #\- a macro character.
(let ((standard-readtable (copy-readtable nil)))
  (set-macro-character #\- (lambda (stream char)
                             (unread-char char stream)
                             (let ((*readtable* standard-readtable))
                               (read stream t nil t)))
                       t)
  (read-from-string "-abcd") ; OK
  (read-from-string "#:-abcd") ; errors
  )

Note that the printed representation of '-abcd is also different after making #\- a macro character, though I think that's OK. Furthermore, this works correctly on SBCL, CMUCL, CLISP, and ECL.
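
For anyone who wants to try this without changing the global readtable, here is a rough variant of the example above that installs the macro only in a private copy. The name TRY-READ is made up for illustration. On an implementation where the original example works, both symbol names should come back as "-ABCD" and the second symbol should have no home package.

```lisp
;; TRY-READ is a hypothetical helper, not part of CCL or the original report.
(defun try-read (string)
  "Read STRING with #\\- installed as a non-terminating macro character
in a private copy of the standard readtable."
  (let* ((standard-readtable (copy-readtable nil))
         ;; Rebinding *READTABLE* keeps the change local to this call.
         (*readtable* (copy-readtable nil)))
    (set-macro-character #\-
                         (lambda (stream char)
                           ;; Put the character back and let the standard
                           ;; readtable read the whole token.
                           (unread-char char stream)
                           (let ((*readtable* standard-readtable))
                             (read stream t nil t)))
                         t) ; T = non-terminating
    (values (read-from-string string))))

(let ((plain (try-read "-abcd"))
      (uninterned (try-read "#:-abcd")))
  (list (symbol-name plain)
        (symbol-name uninterned)
        (symbol-package uninterned)))
```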

I think the function READ-SYMBOL-TOKEN is at fault: it should do a check for non-terminating-ness for character macros.
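
To make the suggested check concrete: the standard GET-MACRO-CHARACTER returns the non-terminating flag as its second value, so a token reader can ask whether a character should end the token in progress. The sketch below is only an illustration of that check. MACRO-CHAR-TERMINATES-TOKEN-P is a made-up name, and this is not CCL's actual READ-SYMBOL-TOKEN code or the eventual fix.

```lisp
;; Hypothetical helper; it only covers macro characters, not whitespace.
(defun macro-char-terminates-token-p (char &optional (readtable *readtable*))
  "True only if CHAR is a *terminating* macro character in READTABLE.
Non-terminating macro characters and ordinary constituents return NIL,
so they can be accumulated into a token that has already started."
  (multiple-value-bind (function non-terminating-p)
      (get-macro-character char readtable)
    (and function (not non-terminating-p) t)))

;; Usage against a copy of the standard readtable:
(let ((rt (copy-readtable nil)))
  (set-macro-character #\-
                       (lambda (stream char)
                         (declare (ignore stream char))
                         nil)
                       t    ; non-terminating
                       rt)
  (list (macro-char-terminates-token-p #\( rt)   ; terminating macro character
        (macro-char-terminates-token-p #\# rt)   ; non-terminating macro character
        (macro-char-terminates-token-p #\- rt))) ; the macro installed above
```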

karlosz added a commit to karlosz/sbcl that referenced this issue May 12, 2024
* Instead of using a special $ character, just portably modify the XC
readtable to do the right thing for floats by installing reader macros
for every possible initial character for a float. Special care needs
to be taken for #\., since consing dot needs to work. As of the time
of this commit, CCL, ECL, and CLISP break without installing a left
parenthesis macro that can also communicate with our new dot reader
macro, while SBCL and CMU CL do just fine without the additional new
left parenthesis character macro. It's unclear to me which behavior is
correct, and it may be the case that the standard is underspecified in
the situation where a reader macro is installed for dot and then
someone tries to read in a dotted list. In any case, installing a left
parenthesis reader macro for everybody is a portable solution that is
guaranteed to work everywhere.

* Also just install a reader macro for #c which constructs target
complex numbers instead of writing out the constant construction
manually.

* This avoids the big hack where we intercepted reading normal number
syntax in the reader only for specific versions of SBCL to catch float
and complex literals written the normal way and flame. Now we can make
sure host float and complex number literals never show up during the
build when bootstrapping from *any* host Lisp, by never allowing host
float and complex literals to be read in in the first place.

* This change uncovered a bug in CCL, whereby it was no longer able to
read #:-cache correctly. I've filed a bug for this here
Clozure/ccl#489 and worked around the issue
by just replacing it with its string name, since it was only used as
an argument for SYMBOLICATE.

* ECL also needs some help with the radix readers. See comment for bug
report.
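
The workaround in the fourth bullet relies on the fact that a function which only uses the names of its arguments accepts a string just as well as an uninterned symbol. A minimal sketch, assuming a simplified stand-in for SYMBOLICATE (SYMBOLICATE-SKETCH is a made-up name, not SBCL's definition):

```lisp
;; Hypothetical stand-in for SYMBOLICATE: concatenate the names of the
;; arguments and intern the result in the current package.
(defun symbolicate-sketch (&rest things)
  (values (intern (apply #'concatenate 'string (mapcar #'string things)))))

;; With the standard readtable these two calls name the same symbol, so the
;; string spelling avoids reading #:-cache at all.
(list (symbolicate-sketch 'foo '#:-cache)
      (symbolicate-sketch 'foo "-CACHE"))
```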
stassats pushed a commit to sbcl/sbcl that referenced this issue May 12, 2024

xrme commented May 23, 2024

#476 complains about the printer doing unnecessary slashification, by the way.

xrme commented May 24, 2024

Fixed in f1f5963.

xrme closed this as completed May 24, 2024