Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 5852 = Remove turbo mode from the code base #5893

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jchapma
Copy link
Contributor

@jchapma jchapma commented Aug 10, 2023

Description: Turbo mode is a feature that improves the search performance for a low number of highly active connections. Since its introduction the deployment model has changed and this feature is no longer required. Turbo mode adds some complexity to the code so removal will simplify connection code.

Fix description: Following investigation into the pros/cons of turbo mode it has been decided that this feature should be removed from the codebase.

Relates: #5852

Reviewed by: ?

Copy link
Contributor

@progier389 progier389 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@Firstyear Firstyear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love PR's that delete lots of code like this :)

@@ -1311,7 +1262,7 @@ connection_read_operation(Connection *conn, Operation *op, ber_tag_t *tag, int *
if (SLAPD_PR_WOULD_BLOCK_ERROR(err) ||
SLAPD_SYSTEM_WOULD_BLOCK_ERROR(syserr)) {
struct PRPollDesc pr_pd;
PRIntervalTime timeout = PR_MillisecondsToInterval(CONN_TURBO_TIMEOUT_INTERVAL);
PRIntervalTime timeout = PR_MillisecondsToInterval(100);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should be a constant for future tweaking.

@vashirov
Copy link
Member

I had turbo mode turned off in dse.ldif. After I updated packages, my instance refused to start:

Aug 16 09:31:59 rhel93 ns-slapd[5820]: [16/Aug/2023:09:31:59.487554556 -0400] - ERR - dse_read_one_file - The entry cn=config in file /etc/dirsrv/slapd-localhost/dse.ldif (lineno: 10) is invalid, error code 1 (Operations error) - Setting of nsslapd-enable-turbo-mode attribute is currently disabled, it will be removed in the future

I think we should not make this an error, but rather WARN or INFO, and not prevent DS from starting.

Copy link
Contributor

@tbordaz tbordaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice code. Few comments/questions

@@ -1217,7 +1168,7 @@ connection_read_operation(Connection *conn, Operation *op, ber_tag_t *tag, int *

/* If we still haven't seen a complete PDU, read from the network */
while (*tag == LBER_DEFAULT) {
int32_t ioblocktimeout_waits = conn->c_ioblocktimeout / CONN_TURBO_TIMEOUT_INTERVAL;
int32_t ioblocktimeout_waits = conn->c_ioblocktimeout;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that ioblocktimeout_waits is a count of the number of wait, not a duration. So we should set it to the number of time we 'waits_done'.

This is why it divides ioblocktimeout by CONN_TURBO_TIMEOUT_INTERVAL and sleep for CONN_TURBO_TIMEOUT_INTERVAL.

@@ -1731,52 +1531,11 @@ connection_threadmain(void *arg)
maxthreads = conn->c_max_threads_per_conn;
more_data = 0;
ret = connection_read_operation(conn, op, &tag, &more_data);
if ((ret == CONN_DONE) || (ret == CONN_TIMEDOUT)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block is just logging debug info regarding the connection. It logs info about turbo mode, that should be removed, but the rest of the message looks good to me.

* c_idlesince is set in handle_pr_read_ready - since we
* are bypassing both of those, we set idlesince here
*/
conn->c_idlesince = curtime;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removing update of idlesince ? There are more data on the socket so it looks to me we can update idlesince.

Description: Turbo mode is a feature that improves the search performance
for a low number of highly active connections. Since its introduction the
deployment model has changed and this feature is no longer required. Turbo
mode adds some complexity to the code so removal will simplify connection code.

Fix description: Following investigation into the pros/cons of turbo mode it has
been decided that this feature should be removed from the codebase.

Relates: 389ds#5852

Reviewed by: ?
@jchapma
Copy link
Contributor Author

jchapma commented Aug 31, 2023

The nsslapd-enable-turbo-mode attribute will remain in the code for a short while, the user will be able to get the attr value and will receive an info message when trying to replace its value.

In the initial commit, I return error code 1 (operations error) when the user tries to replace the attr value, which will issue an info message to the user. But, in the corner case where the user has upgraded DS, instance restart explodes while loading dse.ldif (#5893 (comment))

So instead, I return success when the user tries to replace the attr value and I have added the attr type to reject_attr_type() in configdse.c. This way I can issue a message to the user when a replace is attempted and maintain dse compatibility.

What do you think ?

@tbordaz
Copy link
Contributor

tbordaz commented Oct 25, 2023

A performance issue was found with the replication total init testcase (see #5761). Additional tests during that investigation, focus on turbo mode impact. It is looking turbo mode was beneficial.
A possibility is that a randomly selected worker, was picking up the first entry of the total update and because of the constant flow of new entries, its turns into a listener/worker (thanks to turbo mode) and finally applies many/all the operations of the total init.

It would be interesting to verify turbo-mode removal on total init

@jchapma
Copy link
Contributor Author

jchapma commented Oct 25, 2023

A performance issue was found with the replication total init testcase (see #5761). Additional tests during that investigation, focus on turbo mode impact. It is looking turbo mode was beneficial. A possibility is that a randomly selected worker, was picking up the first entry of the total update and because of the constant flow of new entries, its turns into a listener/worker (thanks to turbo mode) and finally applies many/all the operations of the total init.

It would be interesting to verify turbo-mode removal on total init

I agree, following our discussion yesterday it seems like turbo mode could play a part, well for a total init anyway. I will open a ticket to investigate this.

@progier389
Copy link
Contributor

I agree, following our discussion yesterday it seems like turbo mode could play a part, well for a total init anyway. I will open a ticket to investigate this.
FYI: In yesterday tests, I have seen that after disabling the turbo mode, during total update the average operation optime and wtime increased (but not the time between start and end of the total update session.
That said the database was quite small (around 550 entries) but it will probably worth to redo the test with 100K or 1M entries )

@tbordaz
Copy link
Contributor

tbordaz commented Dec 31, 2023

Another thing to clarify is that bind/accept slowness when turbo mode is 'on'. (see solution 4292701). ATM it is not clear to me why it is slower, possibly the consequence a long computation to switch on/off turbo mode.
So AFAIK, this PR requires verification of 'total init' (#5893 (comment)) and long accept/bind

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants