
Conversation

@saviorand (Collaborator)

No description provided.

@saviorand saviorand self-assigned this May 28, 2024
Comment on lines 218 to 228
  fn read(self, inout buf: Bytes) raises -> Int:
-     var new_buf = Pointer[UInt8]().alloc(default_buffer_size)
+     var new_buf = UnsafePointer[UInt8]().alloc(default_buffer_size)
      var bytes_recv = recv(self.fd, new_buf, default_buffer_size, 0)
      if bytes_recv == -1:
          return 0
      if bytes_recv == 0:
          return 0
-     var bytes_str = String(new_buf.bitcast[Int8](), bytes_recv)
-     buf = bytes_str._buffer
+     var bytes_str = String(new_buf.bitcast[UInt8](), bytes_recv + 1)
+     buf = bytes(bytes_str, pop=False)
      return bytes_recv

@saviorand (Collaborator, Author)

This does not handle large payloads very well in combination with the request parsing logic in sys/server; it needs to be refactored.
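
One possible shape for such a refactor (illustrative only, not the change made in this PR): loop recv and append each chunk to the caller's buffer instead of relying on a single fixed-size read. The recv binding and default_buffer_size are the ones used above; Bytes is assumed here to behave like a List of UInt8.

fn read_all(self, inout buf: Bytes) raises -> Int:
    # Reuse one chunk for every recv call and free it once at the end.
    var chunk = UnsafePointer[UInt8]().alloc(default_buffer_size)
    var total = 0
    while True:
        var bytes_recv = recv(self.fd, chunk, default_buffer_size, 0)
        if bytes_recv <= 0:
            break  # error (-1) or connection closed (0)
        for i in range(bytes_recv):
            buf.append(chunk[i])
        total += bytes_recv
        if bytes_recv < default_buffer_size:
            break  # short read: nothing more pending for this request
    chunk.free()
    return total

In practice the stop condition would more likely come from the Content-Length parsed out of the request headers than from a short read.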

@thatstoasty (Collaborator) commented Jun 10, 2024

You could try reading directly into the ptr owned by buf if you want to avoid the additional string allocation. I'm not sure how much time it would save, but that's how I've been doing it for my net file descriptor implementation. It hasn't blown up yet, so it might be worth a try: https://github.com/thatstoasty/gojo/blob/main/gojo/net/fd.mojo#L47

fn read(inout self, inout dest: List[Byte]) -> (Int, Error):
      """Receive data from the file descriptor and write it to the buffer provided."""
      var bytes_received = recv(
          self.fd,
          DTypePointer[DType.uint8](dest.unsafe_ptr()).offset(dest.size),
          dest.capacity - dest.size,
          0,
      )
      if bytes_received == -1:
          return 0, Error("Failed to receive message from socket.")
      dest.size += bytes_received

      if bytes_received < dest.capacity:
          return bytes_received, Error(io.EOF)

      return bytes_received, Error()

@saviorand (Collaborator, Author)

Ah, cool, thanks! @alainrollejr I'm going to try this out today.

@saviorand (Collaborator, Author)

Done, seems to work nicely

@alainrollejr

To get a feel for the problem, I basically searched for all magic-number occurrences equal to 4096 (there are quite a lot) and replaced them all with 1048576. Maybe not all of them needed enlarging, but since I haven't grown with the code it's hard for me to analyse that.
With that change I can increase packet sizes without the client or the server crashing.
The fundamental problem remains, though, that speed still decreases rapidly with packet size. I suspect there is room for optimization in whether all those allocs really need to happen inside the function calls themselves, i.e. you could probably allocate a big buffer at struct instantiation time and then point to that one over and over again rather than alloc and free?
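
To sketch that idea (illustrative only, not Lightbug's actual code; the struct, field, and type names are assumptions): allocate the receive buffer once when the connection is constructed, reuse it on every read, and free it in the destructor.

struct Connection:
    var fd: Int32
    var recv_buffer: UnsafePointer[UInt8]

    fn __init__(inout self, fd: Int32):
        self.fd = fd
        # One allocation for the lifetime of the connection.
        self.recv_buffer = UnsafePointer[UInt8]().alloc(default_buffer_size)

    fn __del__(owned self):
        # A single matching free instead of one per read call.
        self.recv_buffer.free()

    fn read(self, inout buf: Bytes) raises -> Int:
        # No alloc/free here; recv writes into the struct-owned buffer.
        var bytes_recv = recv(self.fd, self.recv_buffer, default_buffer_size, 0)
        if bytes_recv <= 0:
            return 0
        for i in range(bytes_recv):
            buf.append(self.recv_buffer[i])
        return bytes_recv

The trade-off is that the buffer lives as long as the connection, so with many concurrent connections that memory stays allocated between reads.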
For what it's worth, output of my test client:

=======================
packet size 1000 Bytes:
=========================
Sent and received 1000 packets in 0.4144 seconds
Packet rate 2.41 kilo packets/s
Bit rate 19.3  Mbps
=======================
packet size 2000 Bytes:
=========================
Sent and received 1000 packets in 0.4268 seconds
Packet rate 2.34 kilo packets/s
Bit rate 37.5  Mbps
=======================
packet size 4000 Bytes:
=========================
Sent and received 1000 packets in 0.4800 seconds
Packet rate 2.08 kilo packets/s
Bit rate 66.7  Mbps
=======================
packet size 8000 Bytes:
=========================
Sent and received 1000 packets in 0.5553 seconds
Packet rate 1.80 kilo packets/s
Bit rate 115.2  Mbps
=======================
packet size 16000 Bytes:
=========================
Sent and received 1000 packets in 0.7618 seconds
Packet rate 1.31 kilo packets/s
Bit rate 168.0  Mbps
=======================
packet size 32000 Bytes:
=========================
Sent and received 1000 packets in 1.1680 seconds
Packet rate 0.86 kilo packets/s
Bit rate 219.2  Mbps
=======================
packet size 64000 Bytes:
=========================
Sent and received 1000 packets in 1.7495 seconds
Packet rate 0.57 kilo packets/s
Bit rate 292.7  Mbps
=======================
packet size 128000 Bytes:
=========================
Sent and received 1000 packets in 1.7782 seconds
Packet rate 0.56 kilo packets/s
Bit rate 575.9  Mbps
=======================
packet size 256000 Bytes:
=========================
Traceback (most recent call last):
  File "/Users/alainrolle/miniconda3/envs/py312/lib/python3.12/site-packages/urllib3/response.py", line 737, in _error_catcher
    yield

@saviorand (Collaborator, Author)

@alainrollejr ah, good to know, thanks. I'm doing more refactoring today and expecting some major improvements in both reliability and performance once it's done. Will keep you posted.

@saviorand (Collaborator, Author) commented Jun 9, 2024

@alainrollejr I did some major refactoring on this branch this weekend. Below is a selection of a couple of test runs I made. I no longer see a steep slowdown as the packet size grows, although as you can see the overall rate is lower now. This is with connection: close though, which is not how it should work by default. Setting tcp_keep_alive should improve performance by 2x-2.5x based on experience. Working on this currently.

=======================
packet size 1280 Bytes:
=========================
Sent and received 1000 packets in 1.4183 seconds
Packet rate 0.71 kilo packets/s
Bit rate 7.2  Mbps
=======================

=======================
packet size 8500 Bytes:
=========================
Sent and received 1000 packets in 2.3081 seconds
Packet rate 0.43 kilo packets/s
Bit rate 29.5  Mbps

=======================
packet size 10000 Bytes:
=========================
Sent and received 1000 packets in 2.2904 seconds
Packet rate 0.44 kilo packets/s
Bit rate 34.9  Mbps

packet size 12800 Bytes:
=========================
Sent and received 1000 packets in 2.5152 seconds
Packet rate 0.40 kilo packets/s
Bit rate 40.7  Mbps
=======================

=======================
packet size 100000 Bytes:
=========================
Sent and received 1000 packets in 2.0973 seconds
Packet rate 0.48 kilo packets/s
Bit rate 381.4  Mbps

=======================
packet size 128000 Bytes:
=========================
Sent and received 1000 packets in 2.1425 seconds
Packet rate 0.47 kilo packets/s
Bit rate 477.9  Mbps

packet size 128000 Bytes:
=========================
Sent and received 1000 packets in 1.9405 seconds
Packet rate 0.52 kilo packets/s
Bit rate 527.7  Mbps

@thatstoasty (Collaborator) commented Jun 11, 2024

@saviorand It's cool to see you're using some of the other structs from Gojo! I'm going through the code to see if I spot anything else that might help speed things up.

Copyable has been removed from these structs in the latest gojo release to prevent this from accidentally happening.
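
As a minimal illustration of the Mojo mechanism being described (not gojo's actual structs; the names here are made up): once a struct no longer conforms to Copyable and defines no __copyinit__, the compiler rejects implicit copies and forces an explicit move, so a buffer-owning value can't be duplicated by accident.

struct Reader:
    var data: UnsafePointer[UInt8]

    fn __init__(inout self):
        # Same allocation pattern as the read() snippet earlier in the thread.
        self.data = UnsafePointer[UInt8]().alloc(64)

    fn __moveinit__(inout self, owned existing: Self):
        self.data = existing.data

    fn __del__(owned self):
        self.data.free()

fn main():
    var a = Reader()
    var b = a^   # explicit transfer is allowed
    # var c = b  # rejected: Reader has no __copyinit__, so no implicit copy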

@saviorand saviorand merged commit e80c3b1 into main Jun 16, 2024
@thatstoasty thatstoasty deleted the feature/no-string-conversions branch January 13, 2025 19:59