Use lib.protocol recycle mechanism #14

Merged
merged 1 commit into from
Sep 1, 2015


@dpino dpino commented Sep 1, 2015

Makes objects from lib.protocol use their "recycle" mechanism. After using an object, call object:free() to add it to the list of unused objects in the corresponding class. This avoids calling the code protected under the "recycle" flag in the lib.protocol classes.

More info: http://comments.gmane.org/gmane.network.snabb.devel/1009

Performance improves slightly:

Processed 1.0 million packets in 5.00 seconds (658850640 bytes; 1.05 Gbps)
Made 4,012 breaths: 255.00 packets per breath; 1246.29 us per breath
Rate(Mpps): 0.205

dpino commented Sep 1, 2015

I refactored some functions reported as time consuming by the profiler:

====== apps/lwaftr/lwaftr.lua ======
@@ 60 @@
      | function LwAftr:binding_lookup_ipv4_from_pkt(pkt, pre_ipv4_bytes)
      |    local dst_ip_start = pre_ipv4_bytes + 16
      |    -- Note: ip is kept in network byte order, regardless of host byte order
  11% |    local ip = ffi.cast("uint32_t*", pkt.data + dst_ip_start)[0]
      |    -- TODO: don't assume the length of the IPv4 header; check IHL
      |    local ipv4_header_len = 20
      |    local dst_port_start = pre_ipv4_bytes + ipv4_header_len + 2
   9% |    local port = C.ntohs(ffi.cast("uint16_t*", pkt.data + dst_port_start)[0])
   3% |    return self:binding_lookup_ipv4(ip, port)
      | end
      |
      | -- Todo: make this O(1)
@@ 259 @@
      |    local ether_dst = self.b4_mac -- FIXME: this should probably use NDP
      |
      |    local ttl_offset = constants.ethernet_header_size + 8
   7% |    pkt.data[ttl_offset] = pkt.data[ttl_offset] - 1
   4% |    local ttl = pkt.data[ttl_offset]
      |    -- Do not encapsulate packets that now have a ttl of zero
      |    if ttl == 0 then -- TODO: make this conditional on icmp_policy?
      |       local icmp_config = {type = constants.icmpv4_time_exceeded,
@@ 359 @@
      |       local pkt = link.receive(i)
      |       if debug then print("got a pkt") end
      |       local ethertype_offset = 12
  16% |       local ethertype = C.ntohs(ffi.cast('uint16_t*', pkt.data + ethertype_offset)[0])
      |       local out_pkt = nil
      |
      |       if ethertype == constants.ethertype_ipv4 then -- Incoming packet from the internet
   7% |          ffi.copy(scratch_ipv4, pkt.data + constants.ethernet_header_size + constants.ipv4_src_addr, 4)
   4% |          out_pkt = self:_encapsulate_ipv4(pkt)
      |       elseif ethertype == constants.ethertype_ipv6 then
      |          -- decapsulate iff the source was a b4, and forward/hairpin
      |          out_pkt = self:from_b4(pkt)
      |       end -- FIXME: silently drop other types; is this the right thing to do?
      |       --if debug then print("encapsulated") end
      |       if out_pkt then
   3% |          if type(out_pkt) == type({}) then -- Fragmented
      |             for _,opkt in ipairs(out_pkt) do
      |                link.transmit(o, opkt)
      |             end
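
The hottest lines in the listing above read a 16-bit network-order field through ffi.cast and C.ntohs. The thread does not show the refactored code, so what follows is only a sketch of one possible alternative, not what this PR does: assembling the value from two bytes in Lua, which needs no C call. The helper name `read_u16_be` and the demo frame are hypothetical; the offset 12 comes from the listing, and 0x0800 and 0x86DD are the standard IPv4 and IPv6 ethertypes.

```lua
-- Hypothetical helper (not from the PR): read a 16-bit big-endian field
-- from a 0-indexed byte buffer. Works on an FFI uint8_t array under LuaJIT,
-- or on a plain table as in this demo.
local function read_u16_be(data, offset)
   return data[offset] * 256 + data[offset + 1]
end

local ETHERTYPE_OFFSET = 12   -- follows the two 6-byte MAC addresses
local ETHERTYPE_IPV4 = 0x0800
local ETHERTYPE_IPV6 = 0x86DD

-- Demo frame: a zeroed 14-byte Ethernet header carrying ethertype 0x86DD.
local frame = {}
for i = 0, 13 do frame[i] = 0 end
frame[12], frame[13] = 0x86, 0xDD

local ethertype = read_u16_be(frame, ETHERTYPE_OFFSET)
if ethertype == ETHERTYPE_IPV4 then
   print("IPv4")
elseif ethertype == ETHERTYPE_IPV6 then
   print("IPv6")
end
```

Whether this is actually faster than the FFI cast depends on how LuaJIT compiles the surrounding trace, so it would need to be checked against the profiler.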

Performance has increased to 0.45 Mpps.

Processed 2.3 million packets in 5.00 seconds (1457616720 bytes; 2.33 Gbps)
Made 8,876 breaths: 255.00 packets per breath; 563.37 us per breath
Rate(Mpps): 0.453
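
As a cross-check, the reported rates can be re-derived from the breath and byte counts in the two logs. This is a small sketch: the breath counts, byte counts, and the 5.00 second duration are copied from the logs, while treating every breath as exactly 255 packets is an approximation of the rounded "255.00 packets per breath" figure.

```lua
-- Figures copied from the two benchmark logs in this thread.
local runs = {
   { name = "recycle only",   breaths = 4012, bytes = 658850640 },
   { name = "after refactor", breaths = 8876, bytes = 1457616720 },
}
local packets_per_breath = 255  -- approximation of the logged 255.00
local seconds = 5.00

-- Derive the packet rate (Mpps) and bit rate (Gbps) for one run.
local function rates(run)
   local packets = run.breaths * packets_per_breath
   return packets / seconds / 1e6, run.bytes * 8 / seconds / 1e9
end

for _, run in ipairs(runs) do
   local mpps, gbps = rates(run)
   print(string.format("%-14s %.3f Mpps  %.2f Gbps", run.name, mpps, gbps))
end
```

Both runs reproduce the logged values (0.205 and 0.453 Mpps), which puts the speedup from the refactoring at roughly 2.2x.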

The profiler still reports interpreted code and garbage collection, so there's room for more optimization.

40%  Interpreted
  -- 48%  lwaftr.lua:method
  -- 24%  lwaftr.lua:_encapsulate_ipv4
  -- 17%  lwaftr.lua:binding_lookup_ipv4_from_pkt
37%  Compiled
  -- 10%  packet.lua:prepend
  --  7%  lwaftr.lua:fixup_tcp_checksum
  --  7%  datagram.lua:push
  --  6%  packet.lua:clone
  --  5%  class.lua:new
  --  5%  lwaftr.lua:_encapsulate_ipv4
  --  4%  packet.lua:shiftleft
  --  4%  link.lua:receive
  --  4%  class.lua:free
  --  3%  header.lua:header
  --  3%  lwaftr.lua:ethertype
  --  3%  class.lua:superClass
  --  3%  ethernet.lua:dst
  --  3%  lwaftr.lua:dst_ip
  --  3%  link.lua:transmit
14%  C code
  -- 87%  lwaftr.lua:method
  -- 13%  lwaftr.lua:_encapsulate_ipv4
 7%  Garbage Collector
  -- 100%  lwaftr.lua:method

kbara commented Sep 1, 2015

I'm interested in potentially reusing the datagrams/headers at some point rather than just recycling them, but this is a step in the right direction. The tests still pass with this change. Would you do similar work on icmp.lua before merge?

Calling :free() on objects from lib.protocol puts the object on a list of
unused objects in the corresponding class. The next call of the class's
constructor will return an object from the free list or create a new one if
the list is empty.
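
The free-list behaviour described in the commit message can be sketched in plain Lua. This is a minimal illustration, not the lib.protocol implementation: the `Header` class, its fields, and the `created` counter are hypothetical names used only to show the pattern.

```lua
-- Hypothetical stand-in for a lib.protocol class; all names are illustrative.
local Header = {}
Header.__index = Header
Header._free = {}     -- per-class list of unused objects
Header.created = 0    -- counts real allocations, for demonstration only

-- Constructor: reuse an object from the free list, or allocate a new one
-- if the list is empty.
function Header.new(src, dst)
   local free = Header._free
   local o = free[#free]
   if o then
      free[#free] = nil   -- pop the recycled object
   else
      o = setmetatable({}, Header)
      Header.created = Header.created + 1
   end
   o.src, o.dst = src, dst   -- reinitialise state in both cases
   return o
end

-- Return the object to its class's free list instead of leaving it
-- for the garbage collector.
function Header:free()
   local free = Header._free
   free[#free + 1] = self
end

local a = Header.new("a", "b")
a:free()
local b = Header.new("c", "d")
print(rawequal(a, b), Header.created)
```

The second call to `Header.new` returns the same table that was freed, so only one allocation takes place. The caller must not touch `a` after calling `a:free()`, since the object may be handed out again at any time.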
kbara added a commit that referenced this pull request Sep 1, 2015
Use lib.protocol recycle mechanism
@kbara kbara merged commit 00de183 into Igalia:ipv6_encap Sep 1, 2015
@dpino dpino deleted the ipv6_encap_opt branch September 22, 2015 08:46
takikawa added a commit that referenced this pull request Mar 20, 2017
Add l7fw app and `snabb wall filter`
wingo pushed a commit that referenced this pull request Mar 20, 2018
Add LuaJIT test suite as submodule & integrate with Travis-CI

2 participants