Introduce "shared" sessions #139

Merged
merged 11 commits into from
Dec 24, 2020

Conversation

Contributor

@LeonidVas LeonidVas commented Nov 8, 2020

Before this patchset, a connection to the server was synonymous with a queue session. Now a session has a unique UUID (returned by the "queue.identify()" method), and one session can have many connections. To connect to an existing session, call "queue.identify(uuid)" with the previously obtained UUID.
Also, a "ttr" (time to release) option has been added to the queue.
"ttr" is the time in seconds after which a session with no active connections is released along with all its tasks.

Closes #85

@ChangeLog:
"Shared" sessions were added to the queue. Previously, a connection to the server was synonymous with a queue session. Now a session has a unique UUID (returned by the "queue.identify()" method), and one session can have many connections. To connect to an existing session, call "queue.identify(uuid)" with the previously obtained UUID.
Also, a "ttr" (time to release) option has been added to the queue.
"ttr" is the time in seconds after which a session with no active connections is released along with all its tasks.
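
A minimal sketch of the flow described above, assuming a net.box-style connection object whose `call(name, args)` method invokes a server-side function. `stub_connection` and the `uuid-a` / `uuid-b` values are hypothetical stand-ins so that the snippet runs without a server; they only mimic the `queue.identify()` behaviour stated in this description and are not the queue's implementation.

```lua
-- Hypothetical stand-in for a net.box connection. It models only what the
-- PR description states: identify() without arguments returns the UUID of
-- the current session, identify(uuid) attaches the connection to that session.
local function stub_connection(server, own_uuid)
    local conn = {session_uuid = own_uuid}
    function conn:call(name, args)
        assert(name == 'queue.identify', 'only queue.identify is modelled')
        local uuid = args and args[1]
        if uuid ~= nil then
            if not server.sessions[uuid] then
                error('unknown session UUID')
            end
            self.session_uuid = uuid
        end
        server.sessions[self.session_uuid] = true
        return self.session_uuid
    end
    return conn
end

local server = {sessions = {}}

-- First connection: ask the queue for the UUID of its session.
local conn_a = stub_connection(server, 'uuid-a')
local session_uuid = conn_a:call('queue.identify')

-- Second connection (for example, after a reconnect): join the same session.
local conn_b = stub_connection(server, 'uuid-b')
local joined_uuid = conn_b:call('queue.identify', {session_uuid})
```

With a real server the connection would presumably come from `require('net.box').connect(uri)`, and the returned UUID should be treated as an opaque value.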

@LeonidVas LeonidVas self-assigned this Nov 8, 2020
@LeonidVas LeonidVas force-pushed the lvasiliev/gh-85-add-possibility-work-after-reconnect branch from 9b89fe4 to 1d8f708 Compare November 25, 2020 14:41
Contributor Author

LeonidVas commented Nov 25, 2020

The benchmark was run before and after adding "shared" sessions.
Command used to run the test: rm -rf *.snap *.xlog && taskset 3 tarantool t/benchmark/multi_consumer_work.lua
Each parameter combination was run ten times, and the results were then averaged.
Result format: (time to fill the queue / time to confirm the tasks), in microseconds.

Master:

| consumers / batches | 1 | 100 | 10000 |
|---|---|---|---|
| 1 | 102 / 393 | 5479 / 10681 | 65914 / 322383 |
| 100 | 7272 / 9107 | 49043 / 224925 | 4962018 / 33961217 |
| 10000 | 41834 / 220479 | 3972183 / 25542455 | |

With "shared" sessions:

| consumers / batches | 1 | 100 | 10000 |
|---|---|---|---|
| 1 | 123 / 446 | 2776 / 12724 | 64437 / 509038 |
| 100 | 7900 / 8411 | 55467 / 274383 | 4899408 / 35677578 |
| 10000 | 54831 / 266131 | 4259797 / 30703081 | |

In most cases, performance drops by roughly 5-20%.
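
As a rough illustration of where that estimate comes from, the relative change can be computed per cell. The sketch below uses a few confirmation times copied from the two tables above; the `percent_change` helper and the row/column labels are introduced here only for illustration.

```lua
-- Relative change in percent between a baseline and a new measurement.
local function percent_change(before, after)
    return (after - before) / before * 100
end

-- Confirmation times in microseconds, copied from the tables above:
-- {cell label, master, with "shared" sessions}.
local samples = {
    {'row 1, column 1', 393, 446},
    {'row 1, column 100', 10681, 12724},
    {'row 100, column 1', 9107, 8411},
    {'row 100, column 10000', 33961217, 35677578},
}

for _, s in ipairs(samples) do
    print(string.format('%-24s %+.1f%%', s[1], percent_change(s[2], s[3])))
end
```

A positive value means the run with "shared" sessions was slower than master for that cell; a negative value means it was faster.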

@LeonidVas LeonidVas force-pushed the lvasiliev/gh-85-add-possibility-work-after-reconnect branch from 1d8f708 to d0b2b21 Compare November 30, 2020 20:28
@LeonidVas
Contributor Author

After refactoring.

Master:

| consumers / batches | 200 | 400 |
|---|---|---|
| 200 | 132887 / 659421 | 268149 / 1490263 |
| 400 | 247721 / 1538678 | 480667 / 3362063 |

With "shared" sessions:

| consumers / batches | 200 | 400 |
|---|---|---|
| 200 | 136281 / 620819 | 259882 / 1426101 |
| 400 | 249045 / 1361905 | 490443 / 2613279 |

@LeonidVas LeonidVas force-pushed the lvasiliev/gh-85-add-possibility-work-after-reconnect branch from d0b2b21 to c94c30b Compare November 30, 2020 21:57
Contributor

@olegrok olegrok left a comment

Thanks for your patch. Please consider the comments below. I'll give a more detailed review a bit later.

Also, I think we could push d8de8b9 out of order to simplify the patchset and reduce the diff.

@LeonidVas
Contributor Author

> Thanks for your patch. Consider several comments below. I'll give more detailed review a bit later.

Hi! Thanks for the review.

> Also I think we could push d8de8b9 out of order to simplify patchset and reduce diff.

No, it's needed for a3d6f4e.

@LeonidVas LeonidVas force-pushed the lvasiliev/gh-85-add-possibility-work-after-reconnect branch 2 times, most recently from 9c8bf69 to bf1faf5 Compare December 16, 2020 14:31
@@ -7,7 +7,7 @@ local qc = require('queue.compat')
 local num_type = qc.num_type
 local str_type = qc.str_type

-local session = box.session
+local connection = box.session
Member

It looks like a 'connection' module, which is a bit strange: we have no such server-side entity. How about defining a function to obtain a connection id, like so: local connection_id = box.session.id? Not sure here.

Contributor Author

I also use connection.on_disconnect().


-- methods
local method = {
get_uuid = get_uuid,
Member

We usually omit get_ for getters. How about connection_uuid()? It would also be more descriptive and more uniform if we renamed remove_connection() to connection_remove() (so that connection_ becomes a prefix).

What also disquiets me a bit: there is no add_connection() or connection_add(), but there are get_uuid() and identify(). Maybe I'm too tired, but it is hard to understand the difference between the latter two functions.

Contributor Author

Not connection, session.
I think this question is best discussed by voice.

on_session_remove = on_session_remove,
remove_connection = remove_connection,
set_settings = set_settings,
session_grant = session_grant,
Member

It gives the ability to manage sessions from a given user, right? How about just grant()?

Contributor Author

Accepted.

@@ -640,6 +664,13 @@ local function build_stats(space)
return stats
end

--- identify the session.
Member

This description is a bit confusing to me. I would say something like 'Associate the current connection with the given session'.

Contributor Author

Accepted.

 -- wakeup all waiters
 while true do
-waiter = box.space._queue_consumers.index.pk:min{id}
+local waiter = box.space._queue_consumers.index.pk:min{conn_id}
Member

Just to understand the idea: consumers are still identified by a connection id rather than a queue session id? Can you explain why this differs from how task ownership is identified?

Contributor Author

Several consumers can work in the same session with the same tasks. For example, one consumer can take a task and another consumer (in the same session) can ack this task.
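
The distinction can be sketched with a toy model in plain Lua, assuming that task ownership is recorded per session UUID while every connection merely maps to the session it joined. The tables and functions below are hypothetical and are not the queue's internal spaces or API.

```lua
-- Toy model: ownership is tracked per session, connections only point
-- at the session they belong to.
local conn_to_session = {}   -- connection id -> session UUID
local taken = {}             -- task id -> session UUID of the owner

local function identify(conn_id, session_uuid)
    conn_to_session[conn_id] = session_uuid
end

local function take(conn_id, task_id)
    assert(taken[task_id] == nil, 'task is already taken')
    taken[task_id] = conn_to_session[conn_id]
end

local function ack(conn_id, task_id)
    local owner = taken[task_id]
    if owner == nil or owner ~= conn_to_session[conn_id] then
        return false, 'task was not taken in this session'
    end
    taken[task_id] = nil
    return true
end

identify(1, 'session-x')
identify(2, 'session-x')   -- a second consumer shares the session
identify(3, 'session-y')

take(1, 42)
local foreign_ok = ack(3, 42)   -- different session: rejected
local shared_ok = ack(2, 42)    -- same session, another connection: accepted
```

In this model a waiting consumer would still be tracked by connection id, because the wake-up has to reach one particular connection, whereas the right to ack a task follows the session.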

README.md Outdated
Comment on lines 381 to 382
In the queue the session has a unique UUID and one session can have many
connections. Also, the consumer can reconnect to the existing session during the
Member

> one session can have many connections

I would say 'many connections may share one logical session' or something similar. It is a matter of taste, feel free to ignore.

Contributor Author

Accepted.

README.md Outdated
without parameters.
To connect to the existing session, call the `queue.identify(session_uuid)`
with the UUID of the session.
In case of failure, an error will be thrown.
Member

Just curious: what kinds of failures may appear here?

Contributor Author

For example: trying to use a UUID in an invalid format, or an expired UUID.
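
A caller could guard against those failures with `pcall`. The sketch below assumes a net.box-style `conn:call(name, args)` method that raises on a server-side error; `resume_or_start` and `fake_conn` are hypothetical names, and the stub only imitates an "expired UUID" failure so that the snippet runs without a server.

```lua
-- Try to rejoin a saved session; fall back to the connection's own session
-- when the server rejects the UUID (invalid format or expired).
-- Returns the session UUID in use and whether the old session was resumed.
local function resume_or_start(conn, saved_uuid)
    if saved_uuid ~= nil then
        local ok, res = pcall(conn.call, conn, 'queue.identify', {saved_uuid})
        if ok then
            return res, true
        end
    end
    return conn:call('queue.identify'), false
end

-- Hypothetical stub standing in for a real connection.
local fake_conn = {}
function fake_conn:call(name, args)
    local uuid = args and args[1]
    if uuid == 'expired' then
        error('session expired')
    end
    return uuid or 'fresh'
end

local uuid, resumed = resume_or_start(fake_conn, 'expired')
```

When the second return value is false, the caller should assume that tasks taken in the old session have been released and may be picked up by other consumers.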

README.md Outdated
without parameters.
To connect to the existing session, call the `queue.identify(session_uuid)`
with the UUID of the session.
In case of failure, an error will be thrown.
Member

I would add a net.box example to better describe the idea. I guess we can use the net.box on_connect trigger to transparently overcome a reconnect.

Contributor Author

As I said earlier, the examples will be added after the feature is pushed to master.
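
Until such examples exist, the trigger idea from this thread might look roughly like the sketch below. It assumes that net.box passes the connection object to `on_connect` triggers and allows requests from inside them, which this thread does not confirm. `keep_session` and the `stub` object are hypothetical; the stub replaces a real connection so that the snippet runs standalone.

```lua
-- Register a trigger that re-attaches the connection to a known session
-- every time the connection is (re)established.
local function keep_session(conn, session_uuid)
    conn:on_connect(function(c)
        c:call('queue.identify', {session_uuid})
    end)
end

-- Hypothetical stub that records calls and lets us simulate reconnects.
local stub = {triggers = {}, calls = {}}
function stub:on_connect(fn)
    table.insert(self.triggers, fn)
end
function stub:call(name, args)
    table.insert(self.calls, {name, args[1]})
end
function stub:simulate_connect()
    for _, fn in ipairs(self.triggers) do
        fn(self)
    end
end

keep_session(stub, 'uuid-1')
stub:simulate_connect()   -- initial connect
stub:simulate_connect()   -- implicit reconnect
```

The sketch does not handle the case where the session has already expired by the time the connection comes back; in that case `queue.identify(uuid)` is expected to raise an error.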

@@ -372,6 +373,20 @@ Available `options`:
* `ttr` - time to release in seconds. The time after which, if there is no active
connection in the session, it will be released with all its tasks.

## Session identify

Member

I would highlight the ability we want to provide with this function somewhere at the start of the section. For example:

> Sometimes we need the ability to acknowledge a task after a reconnect (because retrying it is undesirable) or even to acknowledge it using another connection.

Contributor Author

Accepted.

local queue_session_ids = box.space._queue_session_ids
local session_uuid = queue_session.get_uuid(conn_id)

queue_session_ids:delete{conn_id}
Member

Just a raw idea (maybe for further PRs): it may be helpful to be able to trace connections and sessions, and to issue a log entry (at verbose or even debug level) when a connection is added to or removed from a session, or moved to another session.

Contributor Author

Sounds good.

@@ -372,6 +373,20 @@ Available `options`:
* `ttr` - time to release in seconds. The time after which, if there is no active
connection in the session, it will be released with all its tasks.

## Session identify
Member

I'm not sure, but maybe it is worth naming the section 'Overcome a reconnect'. That is what a user will actually look for.

Contributor Author

'Overcome a reconnect' is just one of the ways the sessions can be used.

Member

@Totktonada Totktonada left a comment

I didn't verify the implementation thoroughly, but the idea looks okay to me.

I left some comments, mostly regarding the API and how the feature is presented in the readme. Some are just to better understand the idea.

The term "queue session" will be added to the queue while working
on #85. One "queue session" will include many connections (box.session).
For clarity, box.session has been renamed to connection.

Needed for #85
In the future, we will stop using the "pk" index of the
"_queue_taken" space to improve performance (the conclusion about
the performance improvement is based on the benchmarks that
will be added in one of the next commits). So, let's minimize
its use.

Needed for #85
The same time functions are used in several files, so move them
to a separate file.

Needed for #85
The patch adds the ability to set the queue settings (by calling
"queue.cfg(opts)").
For now only one setting is available: "ttr" (time to release).
"ttr" is the time in seconds after which a session with no active
connections is released along with all its tasks.

Part of #85
@LeonidVas LeonidVas force-pushed the lvasiliev/gh-85-add-possibility-work-after-reconnect branch from ac664e5 to 373139c Compare December 22, 2020 14:36
Before the patch, a connection to the server was synonymous with a queue
session. Now a session has a unique UUID (returned by the
"queue.identify()" method), and one session can have many
connections.
The session will be deleted (all session tasks will be released) after
"ttr" seconds have passed since the last connection was disconnected.
To connect to an existing session, call "queue.identify(uuid)"
with the previously obtained UUID.

"ttr" in seconds - the time after which, if there is no active
connection in the session, it will be released with all its tasks.

Also, the "_queue_taken" internal space has been replaced with
"_queue_taken_2", which has a changed format and different indexes (for
better performance).
Downgrading to the previous version works correctly.

Closes #85
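
The "ttr" rule from the commit message above can be illustrated with a toy model in plain Lua that uses an explicit, simulated clock. The functions, the `sessions` table and the value 60 are hypothetical; in the queue itself the setting would be passed through "queue.cfg(opts)" as the earlier commit message states.

```lua
-- Toy model of "ttr": a session with no active connections is released
-- once ttr seconds have passed since its last connection went away.
local ttr = 60
local sessions = {}   -- uuid -> {connections = n, inactive_since = time or nil}

local function connect(uuid)
    local s = sessions[uuid] or {connections = 0}
    s.connections = s.connections + 1
    s.inactive_since = nil          -- an active connection stops the countdown
    sessions[uuid] = s
end

local function disconnect(uuid, now)
    local s = sessions[uuid]
    s.connections = s.connections - 1
    if s.connections == 0 then
        s.inactive_since = now      -- the countdown starts here
    end
end

-- Release every session whose grace period has run out; returns their UUIDs.
local function expire(now)
    local released = {}
    for uuid, s in pairs(sessions) do
        if s.inactive_since and now - s.inactive_since >= ttr then
            sessions[uuid] = nil
            table.insert(released, uuid)
        end
    end
    return released
end

connect('s1')
disconnect('s1', 100)
local early = expire(130)   -- 30 seconds idle: the session survives
connect('s1')               -- reconnect within the grace period
disconnect('s1', 140)
local late = expire(200)    -- 60 seconds idle: the session is released
```

Releasing a session in this model stands for releasing all of its tasks back to the queue.
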
@LeonidVas LeonidVas force-pushed the lvasiliev/gh-85-add-possibility-work-after-reconnect branch from 373139c to 00365d6 Compare December 22, 2020 14:52
@LeonidVas LeonidVas merged commit 8a8818f into master Dec 24, 2020
@LeonidVas LeonidVas deleted the lvasiliev/gh-85-add-possibility-work-after-reconnect branch December 24, 2020 18:41
Successfully merging this pull request may close these issues.

Cannot :ack task after implicit net.box reconnect
6 participants