New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

conn vs data cleanup #3442

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
2 participants
@bagder
Copy link
Member

bagder commented Jan 5, 2019

Easy handles and connections

History

A common pattern in the libcurl code is having connections (struct
connectdata) point to its corresponding transfer (easy handle) using the
conn->data struct member.

We therefore pass around the 'conn' pointer only in a lot of places as we can
defer the transfer from it (using conn->data).

This made perfect sense back in the days when each transfer had a single
connection for the duration of a transfer and became less good when we added
Pipelining and downright ugly when we added support for multiplexed
connections.

Now, several simultaneous transfers can use the same connection so conn->data
has to be updated very frequently to make it accurate for every possible
situation. This is error-prone and is repeteadly a reason for issues and bugs.

Two transfers using two connections

    /------------\          /--------------\
    | Transfer 1 |   ===>   | Connection 1 |
    | 'data'     |          \--------------/
    \------------/

    /------------\
    | Transfer 2 |          /--------------\
    | 'data'     |   ===>   | Connection 2 |
    \------------/          \--------------/

Two transfers using one connection

    /------------\          /--------------\
    | Transfer 1 |   ===>   | Connection 1 |
    | 'data'     |     -->  \--------------/
    \------------/    /
                     /
    /------------\  /
    | Transfer 2 | /
    | 'data'     |
    \------------/

A better way

A transfer however only uses a single connection for a transfer (or none at
all) at any given time. data->conn gets assigned when a connection is setup
and it gets cleared when the connection is again closed or returned to the
connection pool. Several transfers can point to the same connection while they
multiplex over the same one.

Going forward

Use of 'conn->data' should be removed over time in preference to 'data->conn'.
Function prototypes should rather work with struct Curl_easy *data as the
identifier poiting out the specific transfer and 'data->conn' to get the
connection (which then is NULL if there is no associated connection for this
transfer), rather than the old style that passes in struct connectdata *conn
to most functions.

  • Curl_attach_connnection() - this is the new function that gets called when
    a connection is "attached" to a transfer. When the two gets their
    association. The association is then kept until detached at a later point.

  • Curl_detach_connnection() - this is the new function that gets called to
    separate the tranfer from its associated connection. From this point on,
    the transfer has no longer any associated connection.

Gradual, not single-shot

I previously made an attempt in making this change (basically removing
conn->data) in a single shot, a single pull-request, but even before it was
completed the changeset reached over 3000 modified lines. Quite simply too
large to be reviewable and landable in one atomic change.

That experience made me draw the conclusion that we need to make this change
gradually and over time.

Over time, data->conn should be used and conn->data use should be reduced.

@bagder bagder force-pushed the bagder/conn-data-cleanup branch from b74e1f0 to 5c022f8 Jan 5, 2019

@jay

This comment has been minimized.

Copy link
Member

jay commented Jan 7, 2019

I think it will probably be a bit rocky regardless. As it is it looks fine to me however CI re Curl_connect create_conn does not always allocate for conn.

@bagder

This comment has been minimized.

Copy link
Member Author

bagder commented Jan 7, 2019

I think it will probably be a bit rocky regardless.

Probably, yes. But I still think doing it gradually instead of a huge single merge will make a softer landing.

As it is it looks fine to me however CI re Curl_connect create_conn does not always allocate for conn.

Clearly some more fixes are needed to make the builds green...

@bagder bagder force-pushed the bagder/conn-data-cleanup branch 2 times, most recently from a60166a to 4dcc400 Jan 8, 2019

bagder added a commit that referenced this pull request Jan 9, 2019

urldata: rename easy_conn to just conn
We use "conn" everywhere to be a pointer to the connection.

Introduces two functions that "attaches" and "detaches" the connection
to and from the transfer.

Going forward, we should favour using "data->conn" (since a transfer
always only has a single connection or none at all) to "conn->data"
(since a connection can have none, one or many transfers associated with
it and updating conn->data to be correct is error prone and a frequent
reason for internal issues).

Closes #3442

@bagder bagder force-pushed the bagder/conn-data-cleanup branch from 4dcc400 to 3abec02 Jan 10, 2019

bagder added a commit that referenced this pull request Jan 10, 2019

urldata: rename easy_conn to just conn
We use "conn" everywhere to be a pointer to the connection.

Introduces two functions that "attaches" and "detaches" the connection
to and from the transfer.

Going forward, we should favour using "data->conn" (since a transfer
always only has a single connection or none at all) to "conn->data"
(since a connection can have none, one or many transfers associated with
it and updating conn->data to be correct is error prone and a frequent
reason for internal issues).

Side-effect: this also removes some old pipelining logic that isn't used and
is destined for removal in April 2019 according to DEPRECATE.md.

Closes #3442
urldata: rename easy_conn to just conn
We use "conn" everywhere to be a pointer to the connection.

Introduces two functions that "attaches" and "detaches" the connection
to and from the transfer.

Going forward, we should favour using "data->conn" (since a transfer
always only has a single connection or none at all) to "conn->data"
(since a connection can have none, one or many transfers associated with
it and updating conn->data to be correct is error prone and a frequent
reason for internal issues).

Side-effect: this also removes some old pipelining logic that isn't used and
is destined for removal in April 2019 according to DEPRECATE.md.

Closes #3442

@bagder bagder force-pushed the bagder/conn-data-cleanup branch from 3abec02 to 152a81e Jan 10, 2019

@bagder

This comment has been minimized.

Copy link
Member Author

bagder commented Jan 10, 2019

Fixed the merge conflicts #3448 caused.

@bagder bagder closed this in ba24323 Jan 11, 2019

@bagder bagder deleted the bagder/conn-data-cleanup branch Jan 13, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment