Added whitespaces to the "collapse consecutive whitespaces" command #4898

Merged: 3 commits, Jun 15, 2022

@carlos-montano-hub carlos-montano-hub commented May 27, 2022

fixes #4883

Changes proposed in this pull request:

  • Replaced value.replace(/\\s+/,' ') with value.replace(/[\\p{Zs}\\s]+/,' ')
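
To illustrate the effect of this change, here is a minimal Java sketch. It assumes the regex in the expression is evaluated with java.util.regex semantics and that the replacement applies to every match (as Matcher.replaceAll does). The class name CollapseWhitespaceDemo and the collapse helper are hypothetical and are not taken from the OpenRefine code base. The doubled backslashes in the change above are presumably string escaping, so the two patterns compared are \s+ and [\p{Zs}\s]+.

```java
import java.util.regex.Pattern;

public class CollapseWhitespaceDemo {
    // Pattern before the change: Java's default \s covers only
    // space, tab, line feed, vertical tab, form feed and carriage return.
    static final Pattern OLD = Pattern.compile("\\s+");

    // Pattern after the change: additionally matches every character in the
    // Unicode "space separator" category (Zs), such as the non-breaking space.
    static final Pattern NEW = Pattern.compile("[\\p{Zs}\\s]+");

    // Hypothetical helper: replace each run of matching characters with one space.
    static String collapse(Pattern pattern, String value) {
        return pattern.matcher(value).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Two non-breaking spaces (U+00A0), then a space, a tab and a space.
        String value = "a\u00A0\u00A0b \t c";

        // The old pattern leaves the non-breaking spaces untouched.
        System.out.println(collapse(OLD, value).equals("a\u00A0\u00A0b c"));

        // The new pattern collapses both runs to a single ASCII space.
        System.out.println(collapse(NEW, value).equals("a b c"));
    }
}
```

Both println calls print true: only the new pattern treats the run of non-breaking spaces as collapsible whitespace.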

@carlos-montano-hub carlos-montano-hub changed the title from "fixes#4883" to 'Added whitespaces to the "collapse consecutive whitespaces" command' May 28, 2022
@github-actions github-actions bot added the Type: Bug Issues related to software defects or unexpected behavior, which require resolution. label May 28, 2022

@wetneb wetneb left a comment

Thanks a lot! That looks very good to me. Here are a few ideas for improvement:

  • Do we actually need to keep the \s in the regex? I would suspect those characters are already included in \p{Zs}, no?
  • We have some functional tests (written in Cypress), and one of them covers this functionality. Perhaps it would not be too hard to adapt this test so that the example data includes some interesting Unicode whitespace? You can find the test at main/tests/cypress/cypress/integration/project/grid/column/edit-cells/common-transforms/collapse-consecutive-whitespace.spec.js

@ostephens

> Do we actually need to keep the \s in the regex? I would suspect those characters are already included in \p{Zs}, no?

I think we need to keep both, as \p{Zs} does not include the tab character. (I'm not sure whether there are other characters included in \s but not in \p{Zs}, but the tab character is definitely an issue.)
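
The point about the tab character can be checked with a short Java sketch. It assumes Java's default regex semantics (no UNICODE_CHARACTER_CLASS flag). The class ZsCoverageCheck and its helpers inS and inZs are hypothetical names introduced only for this illustration.

```java
import java.util.regex.Pattern;

public class ZsCoverageCheck {
    static final Pattern S = Pattern.compile("\\s");
    static final Pattern ZS = Pattern.compile("\\p{Zs}");

    // True if the single code point matches Java's default \s class.
    static boolean inS(int codePoint) {
        return S.matcher(new String(Character.toChars(codePoint))).matches();
    }

    // True if the single code point is in the Unicode Zs (space separator) category.
    static boolean inZs(int codePoint) {
        return ZS.matcher(new String(Character.toChars(codePoint))).matches();
    }

    public static void main(String[] args) {
        // Space, tab, line feed, vertical tab, form feed, carriage return,
        // non-breaking space, em space.
        int[] samples = {0x20, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0xA0, 0x2003};
        for (int cp : samples) {
            System.out.printf("U+%04X  \\s=%b  \\p{Zs}=%b%n", cp, inS(cp), inZs(cp));
        }
    }
}
```

Under these assumptions, the tab, line feed, vertical tab, form feed and carriage return match \s only, the non-breaking space and em space match \p{Zs} only, and the plain space matches both. This is consistent with keeping both classes in the character set.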

@wetneb wetneb merged commit c9d5a4f into OpenRefine:master Jun 15, 2022
wetneb added a commit that referenced this pull request Jun 15, 2022
…ommand (#4898)

* fixes#4883

* Add non-breaking spaces to Cypress test data

* Update spec for operation notification after change of test data

Co-authored-by: Carlos Montano <carlos.montano@192.168.1.4>
Co-authored-by: Antonin Delpeuch <antonin@delpeuch.eu>
@carlos-montano-hub

Hello!
Sorry for the delay, I was mostly away over the past month. About the testing: should I open another issue for that, or has it already been taken care of?

@carlos-montano-hub carlos-montano-hub deleted the issue-4883-collapse-consecutive-whitespace branch June 29, 2022 22:38
@wetneb wetneb commented Jun 30, 2022

Thanks for checking! I added the test to this pull request already (see https://github.com/OpenRefine/OpenRefine/pull/4898/files)

Successfully merging this pull request may close these issues.

"Collapse consecutive whitespace" operation does not collapse all possible unicode whitespace