
Commit 14a1eaa

sunilmut authored and davem330 committed
hv_sock: perf: loop in send() to maximize bandwidth
Currently, the hv_sock send() iterates once over the buffer, puts data into the VMBUS channel and returns. It does not take advantage of the case where a reader is simultaneously draining data from the channel. In that case, send() can maximize bandwidth (and consequently minimize CPU cycles) by iterating until the channel is found to be full.

Perf data:
Total data transfer: 10GB/iteration
Single-threaded reader/writer, Linux hvsocket writer with Windows hvsocket reader
Packet size: 64KB
CPU sys time was captured using the 'time' command for the writer to send 10GB of data.
'Send Buffer Loop' is with the patch applied.
The values below are over 10 iterations.

|--------------------------------------------------------|
|        |       Current        |   Send Buffer Loop      |
|--------------------------------------------------------|
|        | Throughput | CPU sys  | Throughput | CPU sys   |
|        |   (MB/s)   | time (s) |   (MB/s)   | time (s)  |
|--------------------------------------------------------|
| Min    |    407     |  7.048   |    401     |   5.958   |
|--------------------------------------------------------|
| Max    |    455     |  7.563   |    542     |   6.993   |
|--------------------------------------------------------|
| Avg    |    440     |  7.411   |    451     |   6.639   |
|--------------------------------------------------------|
| Median |    446     |  7.417   |    447     |   6.761   |
|--------------------------------------------------------|

Observations:
1. The average throughput does not change much with this patch for this scenario, most probably because the throughput bottleneck lies elsewhere.
2. The average system (kernel) CPU time goes down by 10%+ with this change, for the same amount of data transferred.

Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1 parent ac383f5 commit 14a1eaa

File tree

1 file changed: 31 additions, 14 deletions


net/vmw_vsock/hyperv_transport.c

Lines changed: 31 additions & 14 deletions
@@ -55,8 +55,9 @@ struct hvs_recv_buf {
 };
 
 /* We can send up to HVS_MTU_SIZE bytes of payload to the host, but let's use
- * a small size, i.e. HVS_SEND_BUF_SIZE, to minimize the dynamically-allocated
- * buffer, because tests show there is no significant performance difference.
+ * a smaller size, i.e. HVS_SEND_BUF_SIZE, to maximize concurrency between the
+ * guest and the host processing as one VMBUS packet is the smallest processing
+ * unit.
  *
  * Note: the buffer can be eliminated in the future when we add new VMBus
  * ringbuffer APIs that allow us to directly copy data from userspace buffer
@@ -674,28 +675,44 @@ static ssize_t hvs_stream_enqueue(struct vsock_sock *vsk, struct msghdr *msg,
 	struct hvsock *hvs = vsk->trans;
 	struct vmbus_channel *chan = hvs->chan;
 	struct hvs_send_buf *send_buf;
-	ssize_t to_write, max_writable, ret;
+	ssize_t to_write, max_writable;
+	ssize_t ret = 0;
+	ssize_t bytes_written = 0;
 
 	BUILD_BUG_ON(sizeof(*send_buf) != PAGE_SIZE_4K);
 
 	send_buf = kmalloc(sizeof(*send_buf), GFP_KERNEL);
 	if (!send_buf)
 		return -ENOMEM;
 
-	max_writable = hvs_channel_writable_bytes(chan);
-	to_write = min_t(ssize_t, len, max_writable);
-	to_write = min_t(ssize_t, to_write, HVS_SEND_BUF_SIZE);
-
-	ret = memcpy_from_msg(send_buf->data, msg, to_write);
-	if (ret < 0)
-		goto out;
+	/* Reader(s) could be draining data from the channel as we write.
+	 * Maximize bandwidth, by iterating until the channel is found to be
+	 * full.
+	 */
+	while (len) {
+		max_writable = hvs_channel_writable_bytes(chan);
+		if (!max_writable)
+			break;
+		to_write = min_t(ssize_t, len, max_writable);
+		to_write = min_t(ssize_t, to_write, HVS_SEND_BUF_SIZE);
+		/* memcpy_from_msg is safe for loop as it advances the offsets
+		 * within the message iterator.
+		 */
+		ret = memcpy_from_msg(send_buf->data, msg, to_write);
+		if (ret < 0)
+			goto out;
 
-	ret = hvs_send_data(hvs->chan, send_buf, to_write);
-	if (ret < 0)
-		goto out;
+		ret = hvs_send_data(hvs->chan, send_buf, to_write);
+		if (ret < 0)
+			goto out;
 
-	ret = to_write;
+		bytes_written += to_write;
+		len -= to_write;
+	}
 out:
+	/* If any data has been sent, return that */
+	if (bytes_written)
+		ret = bytes_written;
 	kfree(send_buf);
 	return ret;
 }
