[Feature] Add remove_eol_characters hook #12

Merged
zhouzaida merged 9 commits into open-mmlab:master from C1rN09:add_remove_eol
Sep 13, 2022
Conversation

@C1rN09
Contributor

@C1rN09 C1rN09 commented Sep 13, 2022

This PR adds a new script/pre-commit-hook, named remove-improper-eol-in-cn-docs.

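The script aims to remove the extra whitespace that shows up in rendered Chinese docs. Browsers render a single line break in the source as a space, which is unwanted between Chinese characters; this is a long-standing HTML issue, as discussed here.

To solve the issue, the script finds and removes end-of-line characters that split a natural Chinese paragraph across several lines. For example,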

这是一个,
像诗一样的
测试

will be changed to

这是一个,像诗一样的测试

However, the following cases stay unchanged:

  • Docs written in English

This is,
a poem-like
test

  • Natural paragraphs (split by 2+ eol characters in Markdown)

这是一个

测试
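
The rules above can be illustrated with a minimal sketch. This is not the code in pre_commit_hooks/remove_eol_characters.py; the function names `is_cjk` and `remove_improper_eol` are hypothetical, and the sketch assumes that "Chinese character" can be approximated by the CJK Unified Ideographs block (U+4E00 to U+9FFF). It also ignores Markdown constructs such as code fences, lists, and tables, which a real hook would need to leave alone.

```python
def is_cjk(ch: str) -> bool:
    """Rough check (assumption): CJK Unified Ideographs block only."""
    return "\u4e00" <= ch <= "\u9fff"


def _is_ascii_word_char(ch: str) -> bool:
    """ASCII letters and digits, i.e. characters that end or start an English word."""
    return ch.isascii() and ch.isalnum()


def remove_improper_eol(text: str) -> str:
    """Drop single newlines that split a Chinese paragraph (hypothetical sketch)."""
    out = []
    for i, ch in enumerate(text):
        if ch == "\n" and 0 < i < len(text) - 1:
            prev, nxt = text[i - 1], text[i + 1]
            # Two or more consecutive newlines mark a real paragraph break.
            is_paragraph_break = prev == "\n" or nxt == "\n"
            # At least one neighbour must be a Chinese character.
            touches_cjk = is_cjk(prev) or is_cjk(nxt)
            # Keep the newline next to English words or existing whitespace.
            touches_ascii = _is_ascii_word_char(prev) or _is_ascii_word_char(nxt)
            touches_space = prev.isspace() or nxt.isspace()
            if (
                not is_paragraph_break
                and touches_cjk
                and not touches_ascii
                and not touches_space
            ):
                continue  # improper end-of-line: skip it
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(remove_improper_eol("这是一个,\n像诗一样的\n测试"))
```

Under these assumptions the first example is joined into one line, while the English text and the two-paragraph example are returned unchanged.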

@zhouzaida
Collaborator

zhouzaida commented Sep 13, 2022

@C1rN09
Contributor Author

C1rN09 commented Sep 13, 2022

Tested with the following config in MMEngine:

- repo: https://github.com/C1rN09/pre-commit-hooks.git
  rev: 02421bc
  hooks:
    - id: check-copyright
      args: ["mmengine", "tests"]
    - id: remove-improper-eol-in-cn-docs

Seems good.

Comment thread pre_commit_hooks/remove_eol_characters.py
Comment thread pre_commit_hooks/remove_eol_characters.py Outdated
Comment thread README.md Outdated
Comment thread setup.py Outdated
Comment thread pre_commit_hooks/remove_eol_characters.py Outdated
Comment thread tests/test_remove_eol_characters.py Outdated
Comment thread tests/test_remove_eol_characters.py Outdated
Comment thread tests/test_remove_eol_characters.py Outdated
Comment thread tests/test_remove_eol_characters.py Outdated
@zhouzaida zhouzaida merged commit 1bed9a0 into open-mmlab:master Sep 13, 2022