/ and other ISSUE! chars, make # space (not empty) #1076

Closed

hostilefork
Member

Rebol2 and R3-Alpha break issues to become refinements at slash:

rebol2>> load "#ab/cd"
== [#ab /cd
]

This changes it to follow the same load rules as FILE!, and include
the slashes:

>> x: load "#ab/cd"
== #ab/cd

>> type of x
== #[datatype! issue!]

It also makes more characters in ISSUE! legal. This is designed
to allow it to evolve into the "cleaner" form of CHAR!, ultimately
replacing the CHAR! datatype entirely.

>> ensure issue! #:
== #:

>> ensure issue! #<
== #<

While it looks sensible that #; would be an ISSUE!, the ; currently
starts a comment anywhere it appears. To help catch cases where
someone forgets this and uses #; the way they would use #:, causing
hard-to-find bugs where code gets commented out, this makes that case
an error.

It also makes the choice that # means an issue containing a space,
not the empty issue. This offers a convenient way to express a
space that is shorter than #" " or the word space.
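
As a rough illustration of the scanning rules described above, here is a minimal Python model. It is not the Ren-C scanner; the function names `split_issue` and `issue_text` are hypothetical, and only the single-slash case from the example is modeled.

```python
def split_issue(source, slash_terminates):
    """Split one whitespace-free run starting with '#' into tokens.

    slash_terminates=True models the Rebol2/R3-Alpha behavior quoted above.
    slash_terminates=False models the FILE!-like rule proposed here.
    """
    if not source.startswith("#"):
        raise ValueError("not an issue: %r" % source)
    if source.startswith("#;"):
        # Proposed rule: raise an error rather than let ';' silently
        # comment out the rest of the line.
        raise ValueError("#; is rejected because ; begins a comment")
    if slash_terminates and "/" in source:
        # Old rule: the issue stops at the slash, the rest is a refinement.
        head, tail = source.split("/", 1)
        return [head, "/" + tail]
    # New rule: the slash is part of the issue.
    return [source]


def issue_text(token):
    """Text carried by an issue token, with a bare '#' meaning one space."""
    return " " if token == "#" else token[1:]
```

With these definitions, `split_issue("#ab/cd", True)` gives the two-token Rebol2 result and `split_issue("#ab/cd", False)` gives a single token, while `issue_text("#")` is a single space.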

@hostilefork
Member Author

hostilefork commented Sep 26, 2020

So @rgchris, I don't know where you stand on working with R3C (it would help if I knew!). But I tried to factor this out to make it easy for you to apply, if you chose to. I think by and large things are falling into place in a way consistent with what you have historically advocated for... e.g. FILE! and ISSUE! having internal slashes. (See the test...I'd really like to see more tests written to take advantage of Redbolisms this way!)

This is all related closely to the ISSUE! and CHAR! unification into one atomic UTF-8 optimized immutable type.

I want to be really certain about the # decaying to just space part. It gives an alternative to BLANK! as space:

>> unspaced ["controversial" _ "behavior"]
controversial behavior  ; !!! we've wondered if this was right

>> unspaced ["replacement" # "behavior"]
replacement behavior  ; this may be a more solid answer

It doesn't vanish visually quite as much as BLANK! does for spacing. But there's some benefit in that, as it's kind of a blocky "negative space" space. And since # would really be the space character wherever it turns up in character-related contexts, it would be less of a surprise:

>> second "a b"
== #

I'd whimsically toyed with the idea of contexts like that returning BLANK! before, e.g. TO BLOCK! of TEXT!

 >> to block! "ab cd"
 == [#"a" #"b" _ #"c" #"d"]

It made some ideas better, but that's no match for:

 >> to block! "ab cd"
 == [#a #b # #c #d]
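
To make the comparison concrete, here is a small Python sketch that renders the characters of a string in the three notations discussed: quoted CHAR!, BLANK! for space, and issue-style with # for space. The function names are hypothetical and this only models the printed form, not any Ren-C internals.

```python
def mold_quoted(ch):
    """Traditional CHAR! notation, e.g. #"a"."""
    return '#"%s"' % ch


def mold_blank_for_space(ch):
    """Quoted notation, but with _ standing in for a space."""
    return "_" if ch == " " else mold_quoted(ch)


def mold_issue(ch):
    """Issue-style notation, where a bare # stands for a space."""
    return "#" if ch == " " else "#" + ch


def mold_block(text, mold):
    """Render each character of text with the given mold function."""
    return "[" + " ".join(mold(ch) for ch in text) + "]"


print(mold_block("ab cd", mold_blank_for_space))
print(mold_block("ab cd", mold_issue))
```

The issue-style line comes out noticeably shorter because each character loses its pair of quotes.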

But making all dialects equate BLANK! with how NULL is handled doesn't feel right, as I really want this to work:

   >> did parse [1 "a" 2] [integer! _ integer!]
   == #[true]

This makes PARSE suitable for "deconstructing" things, and underscore is the traditionally-used "I don't care" of deconstruction.
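
The "I don't care" role of underscore can be modeled in a few lines of Python. This is only a sketch of the matching idea, with Python types standing in for datatypes and a hypothetical `ANY` sentinel standing in for `_`; it is not how PARSE is implemented.

```python
ANY = object()  # stands in for the `_` slot in the rule block


def matches(values, rule):
    """True if each value satisfies the rule item in the same position."""
    if len(values) != len(rule):
        return False
    return all(r is ANY or isinstance(v, r) for v, r in zip(values, rule))


# Analogue of: did parse [1 "a" 2] [integer! _ integer!]
print(matches([1, "a", 2], [int, ANY, int]))
```

The middle slot accepts any single value, while the integer slots on either side are still checked.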

Anyway, food for thought and hopefully adding up changes you will approve of.

@hostilefork
Member Author

hostilefork commented Sep 27, 2020

It made some ideas better, but that's no match for:

 >> to block! "ab cd"
 == [#a #b # #c #d]

There's definitely a lot of improvement there in cutting down on the quotes. But as I push this through a bit further, I'm finding myself less happy with the #-as-space concept. There's something off balance in the 1-charness of it.

Something I'm wondering is if caret standing alone could start one of these ISSUECHAR!s when escaping was involved:

  >> to block! "ab cd"
  == [#a #b ^_ #c #d]

That feels more "balanced", to have things like ^/ or ^(1C) standing on their own, lexically. And you can really see where the spaces and newlines are (which will be the most common non-printables needing escaping by far).

But I think BLANK! should keep meaning space in DELIMIT, but only at a literal level:

https://forum.rebol.info/t/treat-blank-s-from-variables-or-evaluation-like-null/1348

@iArnold

iArnold commented Sep 27, 2020

It also makes the choice that # means an issue containing a space,
not the empty issue. This offers a convenient alternative to
expressing space, that is shorter than #" " or the word space.

And #{} would be the empty issue like <{}> the empty TAG?

@hostilefork
Member Author

hostilefork commented Sep 27, 2020

And #{} would be the empty issue like <{}> the empty TAG?

I'm leaning toward <{}> as the empty tag being the way to go. With tags being mutable, there has to be a way to render their non-scannable forms. We're talking about a world where <a tag > will scan legally but < a tag> will not, so if the latter is created by runtime means it needs to mold as <{ a tag}>. The former would mold as is.
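
A minimal Python sketch of that molding rule follows. The function `mold_tag` is hypothetical, and it assumes the only non-scannable cases are the two mentioned here: an empty tag and content that begins with whitespace. The real scanner may reject other content too.

```python
def mold_tag(content):
    """Render tag content, using the <{...}> form when it would not rescan."""
    # Assumption: empty content or leading whitespace is what makes
    # the plain <...> form unscannable.
    scannable = content != "" and not content[0].isspace()
    if scannable:
        return "<" + content + ">"
    return "<{" + content + "}>"


print(mold_tag("a tag "))   # trailing space still scans, so plain form
print(mold_tag(" a tag"))   # leading space needs the braced form
print(mold_tag(""))         # the empty tag
```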

But I've mentioned that trying out # as space in practice wasn't as pleasing as I at first thought.

So # being an empty "issuechar!" does have applications, like what I'm currently calling "blackhole" usage:

https://forum.rebol.info/t/sending-values-into-space/1347

But it could also be a synonym for #"^(00)", or #^(00) or ^(00) ... e.g. an issuechar! that converts to codepoint 0. One benefit of that is that it becomes "toxic" in the process, so that you can't append it to strings. Which makes it a better outlier type for such purposes. And it can help with understanding why there's no such "issuechar!" as #"a^(00)b", because the only zero-bearing issuechar is an absolutely empty one.

@hostilefork
Member Author

Treating # as the empty issue. Perhaps it could ultimately double as the "codepoint 0" CHAR!, e.g. append &{AABB} # would produce &{AABB00} under the new rules. This would avoid creating a stringlike class that actually materially contained 0 bytes, which I'm trying to avoid.

Committed here:

4f3c86f

So to sum up the good news: more characters open up for ISSUE! (like / and .) as they are no longer wanted for putting ISSUE! in PATH! or TUPLE!. The "bad" news is that it will become an immutable class with optimized storage and no indexed position, so it will need to be converted to text for manipulations of that kind.
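
The `append &{AABB} #` idea mentioned above, together with the earlier "toxic" point about strings, can be sketched in Python. This is a model under assumptions, not the committed behavior: the empty string stands in for the empty issue, the function names are hypothetical, and appending a non-empty issue to a binary as its UTF-8 bytes is an assumption made for the illustration.

```python
def append_to_binary(binary, issue):
    """Append an issue to a binary; the empty issue contributes a 0 byte."""
    if issue == "":
        return binary + b"\x00"
    # Assumption: non-empty issues contribute their UTF-8 encoding.
    return binary + issue.encode("utf-8")


def append_to_text(text, issue):
    """Append an issue to text; the empty issue is rejected ("toxic")."""
    if issue == "":
        raise ValueError("codepoint 0 cannot be appended to a string")
    return text + issue


print(append_to_binary(bytes.fromhex("AABB"), "").hex().upper())
```

Keeping the zero byte out of every string-like value, and letting only the empty issue stand for it, is what lets the string types stay free of embedded 0 bytes.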

@hostilefork hostilefork closed this Oct 7, 2020
@hostilefork
Member Author

Not sure about # being the "space issue"... it seems anti-intuitive to me...

As I mentioned, that was abandoned. The usage of BLANK! as a space in things like DELIMIT is now standardized, with it only applying to "source level blanks":

https://forum.rebol.info/t/treat-blank-s-from-variables-or-evaluation-like-null/1348

@hostilefork hostilefork deleted the fancier-issue branch October 12, 2020 21:13