Added whitespaces to the "collapse consecutive whitespaces" command #4898

Merged
wetneb merged 3 commits into OpenRefine:master from carlos-montano-hub:issue-4883-collapse-consecutive-whitespace on Jun 15, 2022
Conversation

@carlos-montano-hub carlos-montano-hub commented May 27, 2022

Contributor

fixes #4883

Changes proposed in this pull request:

  • replaced value.replace(/\\s+/,' ') with value.replace(/[\\p{Zs}\\s]+/,' ')
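The effect of this change can be sketched in Java. This is only an illustration, not OpenRefine's code: it assumes the GREL regular expression behaves like a default `java.util.regex.Pattern` (no `UNICODE_CHARACTER_CLASS` flag) and that the replacement applies to every match. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: compares the old and new patterns from this pull request.
class CollapseWhitespaceSketch {
    // Old pattern: in Java's default mode \s matches only [ \t\n\x0B\f\r].
    static final Pattern OLD_PATTERN = Pattern.compile("\\s+");
    // New pattern: additionally matches Unicode space separators (category Zs),
    // such as U+00A0 NO-BREAK SPACE and U+2003 EM SPACE.
    static final Pattern NEW_PATTERN = Pattern.compile("[\\p{Zs}\\s]+");

    static String collapseOld(String value) {
        return OLD_PATTERN.matcher(value).replaceAll(" ");
    }

    static String collapseNew(String value) {
        return NEW_PATTERN.matcher(value).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Two non-breaking spaces between "a" and "b", two tabs between "b" and "c".
        String sample = "a\u00A0\u00A0b\t\tc";
        System.out.println("old: [" + collapseOld(sample) + "]");
        System.out.println("new: [" + collapseNew(sample) + "]");
    }
}
```

Under these assumptions the old pattern collapses the tabs but leaves the two non-breaking spaces in place, while the new pattern reduces both runs to a single ordinary space.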

@carlos-montano-hub carlos-montano-hub changed the title from fixes#4883 to Added whitespaces to the "collapse consecutive whitespaces" command on May 28, 2022
@github-actions github-actions Bot added the Type: Bug Issues related to software defects or unexpected behavior, which require resolution. label May 28, 2022

@wetneb wetneb left a comment

Member

Thanks a lot! That looks very good to me. Here are a few ideas for improvement:

  • Do we actually need to keep the \s in the regex? I would suspect those characters are already included in \p{Zs}, no?
  • We have some functional tests (written in Cypress), and one of them covers this functionality. Perhaps it would not be too hard to adapt this functional test so that the example data includes some interesting Unicode whitespace? You can find the test at main/tests/cypress/cypress/integration/project/grid/column/edit-cells/common-transforms/collapse-consecutive-whitespace.spec.js

@ostephens

Member

> Do we actually need to keep the \s in the regex? I would suspect those characters are already included in \p{Zs}, no?

I think we need to keep both, as \p{Zs} does not include the tab character. (I'm not sure whether there are other characters included in \s but not in \p{Zs}, but the tab character is definitely an issue.)
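The open question above (which characters \s matches that \p{Zs} does not) can be checked by brute force. The following is a hedged sketch, again assuming default `java.util.regex` semantics rather than anything specific to OpenRefine. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: lists code points matched by \s but not by \p{Zs}.
class WhitespaceClassDiff {
    static List<Integer> inSNotZs() {
        Pattern s = Pattern.compile("\\s");
        Pattern zs = Pattern.compile("\\p{Zs}");
        List<Integer> result = new ArrayList<>();
        // Walk every Unicode code point and test it against both classes.
        for (int cp = 0; cp <= Character.MAX_CODE_POINT; cp++) {
            String ch = new String(Character.toChars(cp));
            if (s.matcher(ch).matches() && !zs.matcher(ch).matches()) {
                result.add(cp);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int cp : inSNotZs()) {
            System.out.printf("U+%04X%n", cp);
        }
    }
}
```

With Java's default \s, defined as [ \t\n\x0B\f\r], the difference is the tab, line feed, vertical tab, form feed and carriage return. Dropping \s from the character class would therefore stop tabs and line breaks from being collapsed.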

@wetneb wetneb merged commit c9d5a4f into OpenRefine:master Jun 15, 2022
wetneb added a commit that referenced this pull request Jun 15, 2022
…ommand (#4898)

* fixes#4883

* Add non-breaking spaces to Cypress test data

* Update spec for operation notification after change of test data

Co-authored-by: Carlos Montano <carlos.montano@192.168.1.4>
Co-authored-by: Antonin Delpeuch <antonin@delpeuch.eu>
@carlos-montano-hub

Contributor Author

Hello!
Sorry for the delay, I was a little absent the past month. About the testing, should I make another issue for that, or was it already fixed?

@carlos-montano-hub carlos-montano-hub deleted the issue-4883-collapse-consecutive-whitespace branch June 29, 2022 22:38
wetneb commented Jun 30, 2022

Member

Thanks for checking! I added the test to this pull request already (see https://github.com/OpenRefine/OpenRefine/pull/4898/files)

Labels

Type: Bug Issues related to software defects or unexpected behavior, which require resolution.

Development

Successfully merging this pull request may close these issues.

"Collapse consecutive whitespace" operation does not collapse all possible unicode whitespace

3 participants