Skip to content

Fix trailing ) from interfering with extraction in Clojure keywords #18345

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 30, 2025

Conversation

eneroth
Copy link
Contributor

@eneroth eneroth commented Jun 19, 2025

Summary

In a form like,

(if condition :bg-white :bg-black)

:bg-black will fail to extract, while :bg-white is extracted as expected. This PR fixes this case, implements more comprehensive candidate filtering, and supersedes a previous PR.

Having recently submitted a PR for handling another special case with Clojure keywords (the presence of : inside of keywords), I thought it best to invert the previous strategy: Instead of handling special cases one by one, consume keywords according to the Clojure reader spec. Consume nothing else, other than strings.

Because of this, this PR is a tad more invasive rather than additive, for which I apologize. The strategy is this:

  • Strings begin with a " and ends with an unescaped ". Consume everything between these delimiters (existing case).
  • Keywords begin with :, and end with whitespace, or one out of a small set of specific reserved characters. Everything else is a valid character in a keyword. Consume everything between these delimiters, and apply the class splitting previously contained in the outer loop. My previous special case handling of : inside of keywords in Extract candidates with variants in Clojure/ClojureScript keywords #18338 is now redundant (and is removed), as this is a more general solution.
  • Discard everything else.

I'm hoping that a strategy that is based on Clojure's definition of strings and keywords will pre-empt any further issues with edge cases.

Closes #18344.

Test plan

  • Added failing tests.
  • cargo test -> failure
  • Added fix
  • cargo test -> success

@eneroth eneroth requested a review from a team as a code owner June 19, 2025 07:37
@thecrypticace thecrypticace self-assigned this Jun 27, 2025
@thecrypticace
Copy link
Contributor

@eneroth Can you see if the checkbox to allow maintainers to edit the PR is checked? I've got a few tweaks I want to make.

@thecrypticace
Copy link
Contributor

Also could use a changelog entry

In order to pre-empt any further problems from Clojure keywords, this commit inverts the previous logic: Instead of consuming everything and handling special cases, consume characters as defined by the Clojure keyword specification, and nothing else.

In addition, leave string consumption as is, but—again—drop the `"`s, in order to align with the strategy of throwing away everything that is not the content of a string or a keyword, including delimiters (`"`, `:` respectively).
@eneroth eneroth requested a review from thecrypticace June 28, 2025 06:28
@eneroth
Copy link
Contributor Author

eneroth commented Jun 28, 2025

Sorry @thecrypticace, from what I can see on Github docs, there's supposed to be a checkbox for me to check here:

Screenshot 2025-06-28 at 08 32 05

But I can't see any. In lieu of that, rebased on main and amended with your changes, and added a change log entry.

@thecrypticace
Copy link
Contributor

Huh that's super weird. I wonder why it's missing. Thanks for making the changes!

Co-authored-by: Jordan Pittman <jordan@cryptica.me>
@thecrypticace
Copy link
Contributor

thecrypticace commented Jun 30, 2025

Ah weird | b'#' being gone makes the tests fail. I guess we can keep it in for the time being then. 🤔

edit: I know why. It's a bit annoying. We need to read whole keywords then split them into utilities. I wonder if this means we need an expanded character set no matter what.

Can do that as a followup bug fix if we turn out to need that.

@thecrypticace
Copy link
Contributor

I'm gonna go head and merge this so I can get it in. Will fix the failing # test in a followup commit.

@thecrypticace thecrypticace merged commit 05b65d5 into tailwindlabs:main Jun 30, 2025
6 of 7 checks passed
@eneroth
Copy link
Contributor Author

eneroth commented Jun 30, 2025

@thecrypticace Ah yeah, because of the syntax <elem>#<id>.<class-1>.<class-2>…<class-n>. This is unfortunately a requirement for Clojure for the Hiccup DSL to work.

@thecrypticace
Copy link
Contributor

Yeah, it makes sense. We just can't be quite as selective about certain characters during the pre-processing stage which is fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Trailing ) interferes with candidate extraction from Clojure/Script keywords
2 participants