
RFC: Allow TCP notifiers to cause connections to yield while receiving #1343

Closed
SeanTAllen opened this issue Oct 20, 2016 · 3 comments

SeanTAllen commented Oct 20, 2016

Allow TCPConnectionNotify instances some level of control over the rate at which they receive data from TCPConnection.

This RFC requires documentation along with code before it is merged.

https://github.com/ponylang/rfcs/blob/master/text/0019-tcp-received-bool.md
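
For context, the shape of the change (per the linked RFC) is that `received` on `TCPConnectionNotify` goes from returning `None` to returning a `Bool`, with a default of `true` so existing notifiers keep their current behavior. A sketch of the method as the RFC describes it (abridged; not the full stdlib interface):

```pony
interface TCPConnectionNotify
  // Called when new data arrives on the connection. Return true to let
  // TCPConnection keep reading on this behavior run; return false to
  // make it yield and pick up any remaining data via _read_again.
  fun ref received(conn: TCPConnection ref, data: Array[U8] iso): Bool =>
    true
```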

SeanTAllen (Member Author)

@jemc I haven't been able to come up with a good way to test this. Any thoughts?

jemc (Member) commented Oct 20, 2016

Honestly, it doesn't seem very testable to me. It seems like it doesn't impact anything that is visible from within the program - just performance.

SeanTAllen (Member Author)

That was my thinking @jemc.

@SeanTAllen SeanTAllen self-assigned this Oct 21, 2016
SeanTAllen added a commit that referenced this issue Oct 21, 2016
This comes from direct experience using the existing `TCPConnection`
functionality at Sendence. We are heavy users of `expect` on `TCPConnection` in
order to support framed protocols. Our `received` methods on notifiers are
generally of the following form:

```pony
  fun ref received(conn: TCPConnection ref, data: Array[U8] iso) =>
    if _header then
      try // array reads are partial, so guard them
        // convert the 4 byte header into a value for expect, aka payload length
        let expect = Bytes.to_u32(data(0), data(1), data(2), data(3)).usize()
        conn.expect(expect)
        _header = false
      end
    else
      // do something with payload
      // ...

      // reset expect for next 4 byte header
      conn.expect(4)
      _header = true
    end
```

This sort of usage is why `expect` was initially added to `TCPConnection`.
In practice, we found a serious drawback with this approach. `TCPConnection`
will read up to 4k of data on a single behavior run and, if there is still
data available, it will then send itself a `_read_again` message to trigger
reading of additional data. It does this so that it doesn't hog the
scheduler while reading from the socket. This can work reasonably well in
some scenarios but not others.

In the framed protocol example above, if the message payloads are small, then
4k of data can result in a lot of messages being sent from our `received`
method to other actors in the application. In an application that is
continuously receiving data, this results in very bursty scheduling.

After consulting with Sylvan, we changed `received` and `TCPConnection` to
allow `received` to return a Boolean to indicate whether `TCPConnection`
should continue sending more data on this behavior run.

We've found that for some workloads, we are able to get equal performance
while greatly lowering latency by having `TCPConnection` call `_read_again`
earlier than it otherwise would.

Closes RFC #19
Closes #1343
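
To make the mechanism concrete, here is a minimal sketch of a framed-protocol notifier that uses the new return value to yield after each dispatched payload. It targets the 2016-era `net` package API; `Processor` and `FramedNotify` are illustrative names, not taken from the patch, and the inline big-endian decode stands in for the `Bytes.to_u32` helper used above:

```pony
use "net"

// Stand-in consumer for decoded payloads; illustrative only.
actor Processor
  be process(data: Array[U8] iso) =>
    None // real work would happen here

class FramedNotify is TCPConnectionNotify
  let _processor: Processor
  var _header: Bool = true

  new iso create(processor: Processor) =>
    _processor = processor

  fun ref received(conn: TCPConnection ref, data: Array[U8] iso): Bool =>
    if _header then
      try // array reads are partial, so guard them
        // big-endian 4 byte length prefix -> payload length for expect
        let len =
          (data(0).usize() << 24) or (data(1).usize() << 16) or
            (data(2).usize() << 8) or data(3).usize()
        conn.expect(len)
        _header = false
      end
      true // keep reading: the payload may already be buffered
    else
      _processor.process(consume data)
      conn.expect(4) // reset for the next 4 byte header
      _header = true
      false // yield so other actors get scheduled before the next read
    end

  fun ref connect_failed(conn: TCPConnection ref) =>
    None
```

Returning `false` once per payload means each behavior run hands off one message and then yields, which is the latency-for-throughput trade the commit message describes.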
jemc pushed a commit that referenced this issue Oct 21, 2016 (#1355)