UnicodeDecodeError when running cz bump #1110

Closed
jg40305 opened this issue May 15, 2024 · 5 comments

jg40305 commented May 15, 2024

Description

Hi, I'm attempting to use commitizen in my project. When I ran the cz bump command, I encountered a UnicodeDecodeError and received the following error message:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\kzsu\workspace\ssrreer\venv\Scripts\cz.exe\__main__.py", line 7, in <module>
  File "C:\Users\kzsu\workspace\ssrreer\venv\Lib\site-packages\commitizen\cli.py", line 607, in main
    args.func(conf, arguments)()
  File "C:\Users\kzsu\workspace\ssrreer\venv\Lib\site-packages\commitizen\commands\bump.py", line 306, in __call__
    changelog_cmd()
  File "C:\Users\kzsu\workspace\ssrreer\venv\Lib\site-packages\commitizen\commands\changelog.py", line 176, in __call__
    changelog_meta = self.changelog_format.get_metadata(self.file_name)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\kzsu\workspace\ssrreer\venv\Lib\site-packages\commitizen\changelog_formats\base.py", line 37, in get_metadata
    return self.get_metadata_from_file(changelog_file)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\kzsu\workspace\ssrreer\venv\Lib\site-packages\commitizen\changelog_formats\base.py", line 43, in get_metadata_from_file
    for index, line in enumerate(file):
UnicodeDecodeError: 'cp950' codec can't decode byte 0xe7 in position 50: illegal multibyte sequence

Upon investigating the error, I discovered that the CHANGELOG.md file is being opened using the system code page (cp950 in my case).

commitizen\changelog_formats\base.py#L36
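To check which encoding a bare `open(path)` call falls back to on a given machine, something like the following can be used. This is a minimal sketch: `default_text_encoding` is a hypothetical helper name, not part of commitizen, and the result depends on the system locale (and on whether Python's UTF-8 mode is enabled).

```python
import codecs
import locale


def default_text_encoding() -> str:
    """Return the encoding Python uses for text files when none is passed to open()."""
    # False means: do not call setlocale(), just report the current preference.
    return locale.getpreferredencoding(False)


encoding_name = default_text_encoding()
# Normalise the name through the codec registry (e.g. "cp950", "utf-8").
print(codecs.lookup(encoding_name).name)
```

On a Traditional Chinese Windows installation this would be expected to report cp950, which matches the codec named in the traceback.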

I suspect the cause is that CHANGELOG.md contains non-ASCII (Chinese) text, which is written as UTF-8 but read back using cp950:

## 1.0.0 (2024-05-15)

### BREAKING CHANGE

- 當作第一版(MAJOR)

### Feat

- test commitizen
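The mismatch can be reproduced without commitizen. The sketch below assumes the changelog was written as UTF-8 (as the `encoding = "utf-8"` setting suggests) and then decodes the same bytes with cp950, the code page named in the traceback.

```python
# The changelog entry from this report, encoded the way it was written to disk.
line = "- 當作第一版(MAJOR)\n"
utf8_bytes = line.encode("utf-8")

# The first non-ASCII byte is the one the traceback complains about.
first_non_ascii = next(b for b in utf8_bytes if b >= 0x80)
print(hex(first_non_ascii))  # prints 0xe7

# Decoding UTF-8 bytes with the cp950 codec fails on that byte.
try:
    utf8_bytes.decode("cp950")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc
    print(exc)

# Decoding with the encoding that was used for writing round-trips cleanly.
round_trip = utf8_bytes.decode("utf-8")
```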

Steps to reproduce

  1. Prepare __version__.py, .cz.toml:
  • __version__.py
__version__ = "0.0.1"
  • .cz.toml
[tool.commitizen]
name = "cz_conventional_commits"
tag_format = "$version"
version_scheme = "semver2"
version = "0.0.1"
encoding = "utf-8"
update_changelog_on_bump = true
version_files = [
  ".cz.toml:version",
  "__version__.py:__version__",
]
  2. git add * and cz commit:
? Select the type of change you are committing feat: A new feature. Correlates with MINOR in SemVer
? What is the scope of this change? (class or file name): (press [enter] to skip)
 
? Write a short and imperative summary of the code changes: (lower case and no period)
 test commitizen
? Provide additional contextual information about the code changes: (press [enter] to skip)
 testing
? Is this a BREAKING CHANGE? Correlates with MAJOR in SemVer Yes
? Footer. Information about Breaking Changes and reference issues that this commit closes: (press [enter] to skip)
 當作第一版(MAJOR)

feat: test commitizen

testing

BREAKING CHANGE: 當作第一版(MAJOR)


[master (root-commit) bcd9958] feat: test commitizen
 2 files changed, 12 insertions(+)
 create mode 100644 .cz.toml
 create mode 100644 __version__.py

Commit successful!
  3. cz bump
Tag 0.0.1 could not be found.
Possible causes:
- version in configuration is not the current version
- tag_format is missing, check them using 'git tag --list'

? Is this the first tag created? Yes
bump: version 0.0.1 → 1.0.0
tag to create: 1.0.0
increment detected: MAJOR

[master 61ba03b] bump: version 0.0.1 → 1.0.0
 3 files changed, 11 insertions(+), 2 deletions(-)
 create mode 100644 CHANGELOG.md

warning: CRLF will be replaced by LF in .cz.toml.
The file will have its original line endings in your working directory
warning: CRLF will be replaced by LF in CHANGELOG.md.
The file will have its original line endings in your working directory

Done!
  4. git add .gitignore and cz commit:
  • .gitignore
venv/
  • cz commit
? Select the type of change you are committing feat: A new feature. Correlates with MINOR in SemVer
? What is the scope of this change? (class or file name): (press [enter] to skip)
 
? Write a short and imperative summary of the code changes: (lower case and no period)
 add .gitignore file
? Provide additional contextual information about the code changes: (press [enter] to skip)
 
? Is this a BREAKING CHANGE? Correlates with MAJOR in SemVer No
? Footer. Information about Breaking Changes and reference issues that this commit closes: (press [enter] to skip)
 

feat: add .gitignore file


[master aa4b9c5] feat: add .gitignore file
 1 file changed, 1 insertion(+)
 create mode 100644 .gitignore

Commit successful!
  5. cz bump
bump: version 1.0.0 → 1.1.0
tag to create: 1.1.0
increment detected: MINOR

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\kzsu\workspace\ssrreer\venv\Scripts\cz.exe\__main__.py", line 7, in <module>
  File "C:\Users\kzsu\workspace\ssrreer\venv\Lib\site-packages\commitizen\cli.py", line 607, in main
    args.func(conf, arguments)()
  File "C:\Users\kzsu\workspace\ssrreer\venv\Lib\site-packages\commitizen\commands\bump.py", line 306, in __call__
    changelog_cmd()
  File "C:\Users\kzsu\workspace\ssrreer\venv\Lib\site-packages\commitizen\commands\changelog.py", line 176, in __call__
    changelog_meta = self.changelog_format.get_metadata(self.file_name)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\kzsu\workspace\ssrreer\venv\Lib\site-packages\commitizen\changelog_formats\base.py", line 37, in get_metadata
    return self.get_metadata_from_file(changelog_file)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\kzsu\workspace\ssrreer\venv\Lib\site-packages\commitizen\changelog_formats\base.py", line 42, in get_metadata_from_file
    for index, line in enumerate(file):
UnicodeDecodeError: 'cp950' codec can't decode byte 0xe7 in position 50: illegal multibyte sequence
  6. Current CHANGELOG.md:
## 1.0.0 (2024-05-15)

### BREAKING CHANGE

- 當作第一版(MAJOR)

### Feat

- test commitizen

Current behavior

I adjusted the code locally:
commitizen\changelog_formats\base.py#L36

with open(filepath) as changelog_file:

changed to

with open(filepath, encoding=self.config.settings["encoding"]) as changelog_file:

and reran cz bump:

bump: version 1.0.0 → 1.1.0
tag to create: 1.1.0
increment detected: MINOR

[master 3652d2d] bump: version 1.0.0 → 1.1.0
 3 files changed, 8 insertions(+), 2 deletions(-)

Done!

This resolved the error.

  • CHANGELOG.md
## 1.1.0 (2024-05-15)

### Feat

- add .gitignore file

## 1.0.0 (2024-05-15)

### BREAKING CHANGE

- 當作第一版(MAJOR)

### Feat

- test commitizen

  • git log
commit 3652d2da3b1a52a0d3fc4fa276c1282a573b608f (HEAD -> master, tag: 1.1.0)
Author: John Su <jg40305@gmail.com>
Date:   Wed May 15 19:34:18 2024 +0800

    bump: version 1.0.0 → 1.1.0

commit aa4b9c58202275ceb8feadfa3dbadadc7568fda8
Author: John Su <jg40305@gmail.com>
Date:   Wed May 15 19:30:02 2024 +0800

    feat: add .gitignore file

commit 61ba03b180664cbb912b1d9628221c2f028854bd (tag: 1.0.0)
Author: John Su <jg40305@gmail.com>
Date:   Wed May 15 19:27:31 2024 +0800

    bump: version 0.0.1 → 1.0.0

commit bcd9958778d8fbf742e48d722e51eeee0ca44ea8
Author: John Su <jg40305@gmail.com>
Date:   Wed May 15 19:26:46 2024 +0800

    feat: test commitizen

    testing

    BREAKING CHANGE: 當作第一版(MAJOR)

Desired behavior

I think that altering commitizen\changelog_formats\base.py#L36 to pass an explicit encoding is the most straightforward approach, since the encoding can already be configured within [tool.commitizen].
However, as BaseFormat seems to be a core class, there could be unforeseen scenarios in which this approach does not work.
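In isolation, the proposed change amounts to passing the configured encoding to `open()` instead of relying on the locale default. The following is a minimal sketch, not commitizen's actual code: `read_changelog_lines` and the plain `settings` dict are hypothetical stand-ins for the real class and configuration object.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for the settings loaded from [tool.commitizen].
settings = {"encoding": "utf-8"}


def read_changelog_lines(filepath, settings):
    """Read a changelog with the configured encoding rather than the locale default."""
    encoding = settings.get("encoding", "utf-8")
    with open(filepath, encoding=encoding) as changelog_file:
        return [line.rstrip("\n") for line in changelog_file]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "CHANGELOG.md"
    # Write the file as UTF-8, as commitizen does with encoding = "utf-8".
    path.write_text("## 1.0.0 (2024-05-15)\n\n- 當作第一版(MAJOR)\n", encoding="utf-8")
    lines = read_changelog_lines(path, settings)

print(lines[-1])
```

Because the same encoding is used for writing and reading, the result no longer depends on the system code page.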

Screenshots

No response

Environment

  • commitizen version: 3.25.0
  • python version: Python 3.12.0
  • operating system: Windows
cz version --report
Commitizen Version: 3.25.0
Python Version: 3.12.0 (tags/v3.12.0:0fb18b0, Oct  2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)]
Operating System: Windows
Member

Lee-W commented May 20, 2024

I thought we had already resolved this in the previous fix, but it looks like we still missed a few places. We could probably change it to smart_open as well 🤔 @jg40305 Are you interested in creating a PR for this fix?

Member

Lee-W commented May 20, 2024

Related issue: #826

Lee-W changed the title from "UnicodeDecodeError when run the command cz bump" to "UnicodeDecodeError when running cz bump" on May 21, 2024
tyuchx pushed a commit to tyuchx/commitizen that referenced this issue on May 21, 2024
Member

Lee-W commented May 22, 2024

@jg40305 We have a fix in #1133. Could you please try 3.26.2 and see whether the issue has been fixed? I'll close this one for now, but feel free to reopen it if the issue persists.

Lee-W closed this as completed on May 22, 2024
Author

jg40305 commented May 23, 2024

Hi @Lee-W, I am grateful to you and your team for solving this problem. After testing, cz bump can now run without that error. Thank you again for your assistance.

Here is the stdout:

bump: version 1.0.0 → 1.1.0
tag to create: 1.1.0
increment detected: MINOR

[master 4058788] bump: version 1.0.0 → 1.1.0
 3 files changed, 8 insertions(+), 2 deletions(-)

And my cz version --report:

Commitizen Version: 3.27.0
Python Version: 3.12.0 (tags/v3.12.0:0fb18b0, Oct  2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)]
Operating System: Windows

Member

Lee-W commented May 27, 2024

@jg40305 Thanks for helping us confirm 🙂
