dbscan() in lib389 can return bytes #5872

Closed
vashirov opened this issue Aug 1, 2023 · 3 comments · Fixed by #5874 or #5887
Labels
lib389 (Involves lib389 library), needs triage (The issue will be triaged during scrum)

@vashirov
Member

vashirov commented Aug 1, 2023

Issue Description
dbscan() in lib389 extracts information from a database file. Most of the time, the output of the dbscan executable is plain text. But when attribute encryption or changelog encryption is enabled, the database can contain values that can't be decoded as a string in Python.

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
dirsrvtests/tests/suites/replication/encryption_cl5_test.py:65: in _check_unhashed_userpw_encrypted
    dbscanOut = inst.dbscan(DEFAULT_BENAME, 'replication_changelog')
/usr/local/lib/python3.9/site-packages/lib389/__init__.py:3072: in dbscan
    result = subprocess.run(cmd, text=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
/usr/lib64/python3.9/subprocess.py:507: in run
    stdout, stderr = process.communicate(input, timeout=timeout)
/usr/lib64/python3.9/subprocess.py:1121: in communicate
    stdout = self.stdout.read()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <encodings.utf_8.IncrementalDecoder object at 0x7f510ffabb80>
input = b"\ndbid: 0000006f000000000000\n\tentry count: 11\n\ndbid: 000000de000000000000\n\tpurge ruv:\n\t\t{replicageneration}...94\xf7\x9f\xa5\xf4\xfb\xd5\xb49\x87W\n\t\tunhashed#user#password: \xa1\x98\xcf\xea\xb8F\xa8\xc9FHe\x8f\x0b\\\xfa\xd7\n"
final = True

    def decode(self, input, final=False):
        # decode input (taking the buffer into account)
        data = self.buffer + input
>       (result, consumed) = self._buffer_decode(data, self.errors, final)
E       UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 6019: invalid start byte

/usr/lib64/python3.9/codecs.py:322: UnicodeDecodeError
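
The failure can be reproduced without dbscan at all. A minimal sketch, using only the tail of the `input` bytes shown in the traceback above (the variable names are mine, not lib389's):

```python
# Tail of the dbscan output from the traceback: an encrypted attribute value.
raw = b"\t\tunhashed#user#password: \xa1\x98\xcf\xea\xb8F\xa8\xc9FHe\x8f\x0b\\\xfa\xd7\n"

try:
    # This is effectively what text=True asks subprocess to do with the output.
    raw.decode("utf-8")
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    # exc.start is the offset of the first byte the codec could not decode.
    bad_byte = raw[exc.start]

# The attribute name itself is plain ASCII, so it can still be found
# by searching the raw bytes directly.
found = b"unhashed#user#password" in raw
```

The encrypted value is arbitrary binary data, so there is no reason to expect it to be valid UTF-8.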

By default, subprocess output is bytes:
https://docs.python.org/3/library/subprocess.html#subprocess.CompletedProcess.stdout

stdout
Captured stdout from the child process. A bytes sequence, or a string if run() was called with an encoding, errors, or text=True. None if stdout was not captured.

But we explicitly use text=True to indicate that it is supposed to be a string:

result = subprocess.run(cmd, text=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

I think we should change dbscan() to always return bytes.
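
A rough sketch of what I mean, assuming the only change is dropping `text=True`. `run_tool` and the stand-in command are hypothetical, not the actual lib389 code:

```python
import subprocess
import sys


def run_tool(cmd):
    """Run cmd and return its combined stdout/stderr as raw bytes."""
    # No text=True, encoding= or errors=: stdout stays a bytes object,
    # so nothing is decoded and nothing can raise UnicodeDecodeError.
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return result.stdout


# Stand-in for the dbscan executable: a child Python process that writes
# bytes which are not valid UTF-8.
cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'unhashed#user#password: \\xa1\\x98\\n')",
]
output = run_tool(cmd)

# Callers then search with bytes literals instead of str.
has_attr = b"unhashed#user#password" in output
```

The downside is that every caller that currently treats the result as a string has to be updated to use bytes.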

vashirov added the needs triage and lib389 labels Aug 1, 2023
vashirov added a commit that referenced this issue Aug 4, 2023
Bug Description:
When attribute encryption or changelog encryption is used, `dbscan()`
can return bytes instead of a string.

Fix Description:
* Update subprocess call to expect bytes instead of string.
* Revert changes to the tests done in
  8bf7829.
* Update entryrdn_test to expect output from dbscan as bytes.

Fixes: #5872
Relates: #5859

Reviewed-by: @progier389, @droideck (Thanks!)
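
For tests that parse the output, expecting bytes mostly means switching literals and regex patterns to their bytes form. A small sketch with made-up sample output (this is not the actual entryrdn_test code):

```python
import re

# Made-up sample in the shape of the dbscan output from the traceback.
output = (
    b"\ndbid: 0000006f000000000000\n"
    b"\tentry count: 11\n"
    b"\t\tunhashed#user#password: \xa1\x98\xcf\n"
)

# A bytes pattern (rb"...") is required when searching a bytes object.
match = re.search(rb"entry count: (\d+)", output)
entry_count = int(match.group(1))

# Membership checks need a bytes literal as well.
has_password_attr = b"unhashed#user#password" in output
```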
@vashirov
Member Author

vashirov commented Aug 4, 2023

b1bdf50..4ba6190 389-ds-base-2.1 -> 389-ds-base-2.1
c4a0abf..d2af71c 389-ds-base-2.2 -> 389-ds-base-2.2
7c7afb7..f01a613 389-ds-base-2.3 -> 389-ds-base-2.3

@progier389
Contributor

progier389 commented Aug 8, 2023

Seeing a regression in the nightly CI tests: a str() is missing in is_dbi in the import test. I will create a new PR to fix it.
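
For context, this is the kind of str/bytes mismatch that shows up once the output is bytes. The helper below is hypothetical and only illustrates the failure mode; it is not the actual `is_dbi` code or its fix:

```python
# Made-up sample output; real dbscan output will differ.
output = b"dbid: 0000006f000000000000\n\tentry count: 11\n"


def output_contains(output: bytes, needle) -> bool:
    """Return True if needle (str or bytes) occurs in the bytes output."""
    if isinstance(needle, str):
        # Normalize str needles to bytes before comparing.
        needle = needle.encode("utf-8")
    return needle in output


try:
    # Mixing a str needle with bytes output raises TypeError in Python 3.
    "entry count" in output
    mixed_types_ok = True
except TypeError:
    mixed_types_ok = False
```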

@progier389
Contributor

b62bd43..2dab922 main -> main
434d63e..ed0093d 389-ds-base-2.3 -> 389-ds-base-2.3
94144bb..5cc2502 389-ds-base-2.2 -> 389-ds-base-2.2
fea4a6f..c714ed8 389-ds-base-2.1 -> 389-ds-base-2.1

@progier389 progier389 added this to the 2.1.0 milestone Aug 8, 2023