[mypyc] Add primitive for bytes.decode #10951

Merged
merged 6 commits into from
Aug 11, 2021

97littleleaf11
Collaborator

Description

Implements part of mypyc/mypyc#880

@97littleleaf11 97littleleaf11 marked this pull request as ready for review August 9, 2021 13:45
}

#define PyUnicode_UTF8(op) \
Collaborator

It's better not to use a macro here, as it buys us little and makes the code harder to understand and maintain. If you need to share code, use an inline function.

#define PyUnicode_UTF8(op) \
(assert(PyUnicode_Check(op)), \
assert(PyUnicode_IS_READY(op)), \
Collaborator

This seems to crash the process if the string object is not ready, which doesn't seem like the right thing to do. Instead, you should probably call PyUnicode_READY and return NULL if it fails.

decode_types: List[RType] = [bytes_rprimitive, str_rprimitive, str_rprimitive]
decode_constants: List[Tuple[int, RType]] = [(0, pointer_rprimitive),
(0, pointer_rprimitive)]
for i in range(len(decode_types)):
Collaborator

The for loop doesn't save a lot of code and makes this harder to understand. I'd prefer just having three normal primitive definitions.


str_ssize_t_size_op = custom_op(
arg_types=[str_rprimitive],
return_type=c_pyssize_t_rprimitive,
c_function_name='CPyStr_Size_size_t',
error_kind=ERR_NEG_INT)

Collaborator Author

@97littleleaf11 97littleleaf11 Aug 10, 2021

We prefer three separate helper functions.

}

PyObject* CPy_DecodeWithErrors(PyObject *obj, PyObject *encoding, PyObject *errors) {
Collaborator

As discussed offline, I'd prefer to have three primitive definitions that all call a single C function, i.e. we'd only duplicate the method_op declarations, not the C implementations. Some primitives can supply fixed extra arguments, but each should be declared explicitly rather than in a for loop, for clarity. Sorry for the extra back and forth!
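
A rough sketch of the arrangement suggested above: three explicit declarations, one per call form, all pointing at a single C function, with the shorter forms padding the missing arguments with fixed constants. `PrimitiveSketch`, `register`, `REGISTRY` and the shared name `CPy_Decode` are hypothetical stand-ins for illustration only; they are not mypyc's actual `method_op` API and not necessarily the names used in this PR.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


# Hypothetical stand-in for a primitive declaration (NOT mypyc's real type).
@dataclass
class PrimitiveSketch:
    name: str
    arg_types: List[str]
    c_function_name: str
    # Fixed trailing arguments the primitive passes to the C function,
    # as (value, type) pairs. A 0 pointer stands for "argument omitted".
    extra_constants: List[Tuple[int, str]] = field(default_factory=list)


REGISTRY: List[PrimitiveSketch] = []


def register(primitive: PrimitiveSketch) -> PrimitiveSketch:
    """Record a declaration in the (hypothetical) registry."""
    REGISTRY.append(primitive)
    return primitive


# bytes.decode()
register(PrimitiveSketch(
    name='decode',
    arg_types=['bytes'],
    c_function_name='CPy_Decode',  # assumed shared C function name
    extra_constants=[(0, 'pointer'), (0, 'pointer')]))

# bytes.decode(encoding)
register(PrimitiveSketch(
    name='decode',
    arg_types=['bytes', 'str'],
    c_function_name='CPy_Decode',
    extra_constants=[(0, 'pointer')]))

# bytes.decode(encoding, errors)
register(PrimitiveSketch(
    name='decode',
    arg_types=['bytes', 'str', 'str'],
    c_function_name='CPy_Decode'))
```

Each declaration can be read on its own, and the C function always receives three arguments regardless of which call form matched.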

Collaborator

@JukkaL JukkaL left a comment

Thanks for the updates!

@@ -598,6 +600,7 @@ def test_decode() -> None:
assert b.decode('gbk') == '浣犲ソ'
assert b.decode('latin1') == 'ä½\xa0å¥½'

[case testEncode]
Collaborator

Here it would be reasonable to have them in a single test case, since these are related (all string operations) and having fewer run tests will speed up the test suite. But it's not a big deal. The original name (testChrOrdEncodeDecode) wasn't the clearest though. Also, we already have testStringOps, which would cover these as well. Not sure what the best way to organize our tests is.
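
For reference, the behaviour these decode tests exercise can be checked in plain CPython. This is a minimal sketch, not the PR's test code: the definition of `b` is not shown in the excerpt above, so it is assumed here to be the UTF-8 encoding of '你好', and `decode_forms` is a hypothetical helper that calls `decode` with zero, one and two arguments.

```python
# Assumption: `b` is the UTF-8 encoding of '你好' (its definition is not shown above).
b = '你好'.encode('utf-8')


def decode_forms(data: bytes) -> tuple:
    """Call bytes.decode with zero, one and two arguments."""
    return (
        data.decode(),                    # default encoding (UTF-8), default errors
        data.decode('utf-8'),             # explicit encoding
        data.decode('utf-8', 'strict'),   # explicit encoding and error handler
    )


forms = decode_forms(b)

# Decoding the same bytes with a different codec yields different text.
as_gbk = b.decode('gbk')
as_latin1 = b.decode('latin1')   # one character per byte

# A non-default error handler changes how invalid input is treated.
lenient = b'abc\xff'.decode('utf-8', 'ignore')
```

All three call forms in `decode_forms` return the same string here, which is why a single shared implementation with optional encoding and errors arguments is enough.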

@JukkaL JukkaL merged commit 5adb0a0 into python:master Aug 11, 2021
@97littleleaf11 97littleleaf11 deleted the decode branch February 22, 2022 08:04