R3.0 #89

Open: wants to merge 166 commits into master
Conversation

jhickman8x3
No description provided.

vmahuli and others added 30 commits February 18, 2016 13:12
Change-Id: I7a24151be54ccc2c3c9505ea7afb8437752ef25f
Vrouter to generate an EEXIST error when a flow add request
fails due to a race between the agent and the datapath, where
the flow entry was already created because of a packet trap

Closes-Bug: 1547501
Change-Id: I0d2073622b06998dd5778fb94749958afc09bac0
(cherry picked from commit c17b4d4)
We cannot hash on L4 ports, as fragments other than the first have
no L4 header. As a result, fragments of the same packet may be
distributed onto different lcores and potentially routed out of order.

The fix was originally committed by Michal Dubiel (commit d59ca66).
For some reason the fix was dropped in commit 1056396.

Change-Id: Iac607e6e14a139a0c015eeb5c9ff8c3de04854c6
Closes-Bug: #1474432
If a flow operation is requested for an index which
is not active, vrouter should return an ENOENT error
for such an operation.

Closes-Bug: 1548678
Change-Id: I42bbd58f41a868b1ce630046ae40e911df0d9491
(cherry picked from commit 1cf3761)
Change-Id: Ia9ae864af98a645b60c5880db29ba8ff1b3c494f
Closes-Bug: #1517665
The root cause is mbuf starvation due to lots of fragments.
The mbufs get stuck in the fragment assembler.

To fix the issue, we disable fragment assembler allocations if the
number of allocated fragment elements is greater than 1024.
This temporarily drops out-of-order fragments.

+ Fix memory allocation statistics.

Change-Id: Ife775ac3073a9c5981672bd64952661c28380a7c
Closes-bug: #1545798
Add a new option "--match" to match flows based on certain criteria.
With the match option, one will be able to match based on source IP
or destination IP (with or without ports), VRF, protocol, or any
combination of these criteria.

e.g.:
--match 1.1.1.1:20
--match "1.1.1.1:20,2.2.2.2:22"
--match "[fe80::225:90ff:fec3:afa]:22"
--match "10.204.217.10:56910 & vrf 0 & proto tcp"
--match "10.204.217.10:56910,169.254.0.3:22 & vrf 0 & proto tcp"

Change-Id: Id70e6d60950babe44612e6aded036c5d1bec6f92
Partial-BUG: #1513001
Insufficient space in an external memory buffer is not actually
an error for the vRouter/DPDK, so we suppress those messages
by using -DSANDESH_QUIET.

Change-Id: I5bc0a51f7c4659f2733a72e9b45322d117265a87
Closes-bug: #1546291

If port mirroring is enabled, a cloned packet is subjected to mirroring
using the original packet's forwarding metadata. If the mirroring code
changes the metadata content, the original packet will be forwarded as
per the changed fmd, resulting in wrong forwarding.

In the current case, mirroring is to a VM in a different VN (hence a new
VRF), and the mirroring code is modifying the fmd's dvrf to the new VRF.
The original ARP packet's L2 and L3 lookups are happening on the
modified VRF, resulting in the ARP getting dropped.

Also, the type of the packet is identified using vr_pkt_type() after the
packet is mirrored. This results in the wrong pkt_type being used for the
mirrored packet, and hence the source IP of the mirrored packet is not
computed correctly.

As a fix, new FMD is used for mirrored packet and packet type is
identified before mirroring itself.

Change-Id: I1fbe7d7e83c7e66c3287b8082444483578117d14
closes-bug: #1549727
The number of TX/RX descriptors is also decreased to account
for this.

Change-Id: Ib29da97e229fbf6a77644e449d2a81281f438058
Closes-Bug: 1550073
If mirroring is enabled on the analyzer VM's port itself, one copy of the
original packet gets mirrored to that port. But this mirrored packet
should not be mirrored again. Right now, without verifying whether the
packet is already mirrored, it is attempted for mirroring again.

When a packet is mirrored it is marked with flag VP_FLAG_FROM_DP. If
this flag is set it is not mirrored again.

Change-Id: Ie4cca916945878b77bc4db7666992f3933db3e13
closes-bug: #1549761
Currently, the L2 header used for both L2 and L3 GRO processing is only
the nexthop id. Post GRO, the nexthop is retrieved from this id and the
packet is subjected to this nexthop for further processing. The ingress
interface filled in the packet is the nh_dev of this nexthop. This
results in the ingress interface being the Tap interface though the
packet actually ingressed on the Fabric interface. This can potentially
lead to issues in the packet's further processing, as the Tap interface
is not the real ingress interface.

As a fix, the ingress interface id is also pushed as L2 header.

Change-Id: I6fbe055fff7da4c957fde29d132b56ea19bd2f3f
closes-bug: #1550632
When transmit port mirroring is enabled, a packet received on the Fabric
interface is right now mirrored using the source IP of the inner packet.
This results in an RPF failure on the Analyzer VM's compute node, because
the compute node doing the port mirroring is using another compute
node's VM IP.

As a fix, if the mirroring is Tx mirroring, the inner packet's
destination IP is used rather than its source IP, so that the Analyzer
VM's RPF check will not have any issues.

Change-Id: I5beaa0dd0cc3c886a1e77f244c8003595ed348e2
closes-bug: #1550312
Vrouter will track TCP flags, but will not evict the flow
entry once the session is closed; it lets the vrouter agent
look at the flags to delete the flow eventually.

This is required until the vrouter agent handles the evict event
in a more stable way.

Partial-Bug: 1550759
Change-Id: Ic7074442b22fc2c11362274e4e390654f3f2400a
We call vr_dpdk_virtio_stop() from uvhm_client_munmap() which in turn
is called from vr_uvh_nl_vif_del_handler().

These are as-tiny-as-possible changes based on the abandoned
refactoring commit: https://review.opencontrail.org/#/c/17178/

Closes-bug: #1551093
Change-Id: I4653601635dbb7f29d7af9ce15a18193904f583f
was tunneled or not

Once the packet is submitted for GRO, all datapath information is
lost. Post GRO, only values that are saved in the packet are the
vif and the nexthop. vif is a recent addition to the saved
information that helped us to identify which interface the packet
came from originally. Once the vif value was set properly, the logic
that checked whether the packet should be trapped to agent or not
based on the presence of label (basically fabric arp responses should
be trapped or not), misbehaved since label information is not saved
pre-GRO and hence not available in the metadata post GRO. For now,
fix the specific logic by checking whether the egress vrf is different
from the ingress vrf, which will be the case since physical interface
vrf will not be the same as vm's vrf.

Change-Id: Iba000889039bc8a5020fc11a462ba1b1a68ce1c8
Closes-BUG: #1551382
A mirror action needs its own forwarding metadata, since the forwarding
parameters that are used for mirroring will be different from those
that are used for packet forwarding. In this particular case, the VRF
to which the packets are mirrored is different from the VRF in which the
packets need to be classified, and hence the packets were dropped in the
flow module with no-source-route as the drop reason.

Change-Id: I21930818a5cdec560818acb79f671ce29ac8bc85
Closes-BUG: #1552101
Currently, when a flow entry is deleted, the corresponding hash entry is
kept at the head of the free list, and the entry's flags are reset only
after it has been successfully inserted into the list. If any other core
is attempting to fetch a free entry, there is a likely chance that the
just-added entry will be handed out as a free entry before the flags are
reset, as this entry is at the head of the list. Because of this, there
is a race condition on the flags between deletion of the entry and
addition of the entry. If the deletion overwrites the addition's flags,
the Valid flag gets overwritten. Because of this, the entry would not be
in the free list but still would not be considered a valid entry.
Searching for this entry by index then fails, as it is not a valid entry.

As a fix, the flags are modified before inserting the entry into the
free list, so that there is no race condition.

Change-Id: I67b1077e75a8a9f4ca5ddcfe3a7513061fd20681
closes-bug: #1552544
I have no hardware with 16 cores, but I guess that with lots
of lcores assigned to vRouter/DPDK, some of them might go to
sleep due to having no RX queues assigned.

Change-Id: I7e37e14ef9aebfc3613ef5e908091ee35093d1bb
Closes-bug: #1505472
In case of ICMP error packets generated for ICMP errors, we were not
initializing the flow key and were trying to form a flow out of that
key, resulting in a wrong key length and corrupted flow entry(s).

We will drop such packets.

Change-Id: Idae46a7e128482ad89da8b5bd1bd0ef6b17ef28e
Closes-BUG: #1556363
For an ICMP packet, the port numbers remain the same in either direction,
i.e. the source port remains the same for both the forward and the
reverse flows, as does the destination port. Hence, we should not be
swapping ports in calculating the flow key while trying to tag an ICMP
error packet to a flow set up for the original stream.

Change-Id: Ic5df8aec5f1009441aefd3d177568d55f3cb0d2c
Closes-BUG: #1554236
anandhk-juniper and others added 30 commits September 30, 2016 15:04
In the process of indicating a flow miss to the agent, if a pclone
fails, vRouter returns without unsetting the NEW flag, a flag that
indicates that the datapath is in the process of creating a new flow.
The side effect of not unsetting this flag is that the agent cannot
modify the entry, and this leads to perennial HOLD flows in the
system.

To fix this problem, if the pclone fails, trap the original packet
to the agent without holding it in the HOLD queue. While this will
mean that the first packet does not reach the destination, it is
a better alternative than not trapping the packet and waiting for
the agent to do the audit after a minute or so. Such trapped packets
are captured under a new counter, "Original Packet Trapped", in the
dropstats output.

Change-Id: I34c19abf935d9b06f55e875b76a4859350743c2b
Partial-Bug: #1628175
This fix is IPV6 equivalent of
https://review.opencontrail.org/#/c/11506/.

In the in-network service instance case, when service scaling
is more than one, all the instances are going to have the
same MAC address and IP address. The Inet route points to an
ECMP composite nexthop, but the FDB route does not point to an
ECMP nexthop. So when a neighbour request is received from a
service VM, Vrouter needs to send the response on the incoming
interface rather than bridging the reply packet, as the FDB
route points arbitrarily to any one of the service VMs' encap
L2 nexthops.

After preparing the reply packet, a bridge lookup is done
and the outgoing interface is compared against the ingress
interface. If they are not the same, the reply packet is
transmitted on the ingress interface.

Change-Id: Ib0113227019d7a452541129843813a647d388b36
closes-bug: #1461882
While calculating the UDP source port for a multicast packet, the hash
needs to be on the Ethernet header and VRF id. For packets arriving from
a VM or ToR, there are no extra headers other than the transport
headers, so the hash is calculated correctly for these packets. But for
packets arriving from another compute node, there are additional headers
beyond the transport headers. The hash is wrongly calculated in this
case on these additional headers, leading to a different hash for the
same Ethernet header and thus a different UDP source port.

As a fix, the offset is correctly calculated and the right Ethernet
header is passed to the hash function.

Change-Id: I93ae0d96e536875988da6a525cdba4783c1e4d6e
closes-bug: #1635046
The neighbour request packets are typically multicast packets, and there
is no flow processing for these. When a neighbour request is converted
to a neighbour advertisement, we continue to use the same packet buffer
and the same packet flags for the advertisement too. This ends up not
creating a flow for the neighbour advertisement either, as the original
packet is marked as a multicast packet. But the fix
https://review.opencontrail.org/#/c/24973/ for bug #1461882 for V6
resulted in creating new packet flags for the advertisement, and this
results in a flow being created. The flow processing drops the
response, as the neighbour advertisement should ideally have come from
a different interface.

We can fix this issue either by manipulating the source interface to
match the interface on which the neighbour falls, so that flow
processing succeeds, or by disabling flow processing for
advertisements.

The fix now disables the flow processing for advertisements.

Change-Id: Ic91f0fb794c13912e43d8c96c726bd80e559b7fb
closes-bug: #1635931
For all types of virtual interfaces, the location from where we tap
packets for RX port mirroring is in a common function where all
the common virtual processing happens. For a VLAN-based sub-interface,
this means that the port mirroring happens after untagging the packet,
which is not correct behavior.

Hence, move the tap locations to the lowest layers of vif receive.

Also, during transmission, since the tap is in the lowest common
layer, sub-interface mirroring does not work. Hence, move the taps
to the highest layer of vif transmit.

Also, add taps for vhost and physical interfaces.

Change-Id: Ie42bfcedde2eba1167fa49e7ecd627b7762363e9
Closes-Bug: #1630889
In order to make sure that we mirror the packet the way we received
it, i.e. before removing any data from the packet, such as VLAN tags,
the mirroring tap points were moved to the very beginning of packet
processing in vRouter. Unfortunately, packet type classification
does not happen before the mirror tap point. Since no classification
happens, options like COPY SIP/DIP do not work, and hence packets
can get dropped in the flow block.

Fix this by explicitly classifying the packet in vif_mirror.

Change-Id: Id310b31bc1990e6671882b447b8243767ada24ee
Closes-Bug: #1630889
During transmit port mirroring, we seem to be using the mirror values
from the flow of the original packet. This is not the right behavior,
since port mirroring shouldn't depend on flow-based mirror
values.

Hence, reset the flow index for the port-mirrored packet.

Change-Id: I1aff969b08a2f247ccf1fc636d6689350e819f29
Closes-Bug: #1636638
Right now, the TTL of the packet is not overwritten by Vrouter; it is
only decremented like a hop for the required packets. But BGP packets
from a VM (due to BGP as a service) can come to Vrouter with a TTL of 1.
In a multi-hop environment, we require this TTL to be more than 1. For
this purpose, the flow entry is extended with a TTL field, and if the
Agent sets it, vrouter unconditionally sets this TTL in the packet and
recalculates the checksum.

Change-Id: I4619e81ad324827e29dc487857020ddd9e429f2c
partial-bug: #1567586
Change-Id: I31ff88e6f46600b917c62ac967d112f9bd0949c5
Closes-BUG: 1630772
Change-Id: Id7bd20a040a1aa3c1ad4b1c6f68c10a9836d6108
Closes-Bug: 1641254
For the sandesh message carrying vrouter build information, the null
termination character was not accounted for while populating the sandesh
message. Due to this, extra characters were observed at the collector
and contrail-alarm-gen. The string copy has been modified to take the
null termination character into consideration.

Change-Id: I249c5fc168d301b718e63530a2022bbe67e080f0
Closes-Bug: #1646693
A unicast neighbor solicitation/response undergoes a flow lookup.
In many cases the source IP address that is used will be the link
local IP which will get dropped because of RPF checks. Consistent
with the way we treat ARP requests, we will now make sure that
the solicitation and advertisements do not undergo flow lookup.

Change-Id: Ib2ab306b0bdd6f201c35faa1918f97324238be1e
Closes-Bug: #1643906
(cherry picked from commit 961a878)
with high number of flows.
Skip flow setup rate calculations for the first time.

Change-Id: Ib17bda69b7e1994a460512035be09888ad78500c
Closes-Bug:#1654941
__skb_checksum_complete calculates the checksum over the whole packet,
including the Ethernet padding, if any. This results in wrong
checksum verification at the receiving end and thus a wrong packet
drop.

Use __skb_checksum_complete_head instead, and pass the length
calculated from the IP header as the size over which the checksum has
to be calculated.

Change-Id: I48efbb66bc171c590784d6f1931be9abd81198ae
Closes-Bug: #1658576
The existing code that handles neighbor solicitations, bridges
all requests to link local destinations. This behavior does not
work for gateway link local address since

. a broadcast request to the link local address will not make it
  to the agent

. a unicast request to the link local address will not be handled
  by the agent.

Agent does not handle neighbor solicitations since vRouter is
expected to respond back with the proper mac address based on
the flags in the route.

Hence, make all link local neighbor solicitations to go through
vRouter's logic of proxying and flooding based on flags set on
the route.

Change-Id: I4f23d1c980e887d044776d49bbc71b9faaf2f0a9
Closes-Bug: #1657483

Set the 'Router' bit if proxying with Router MAC

For ICMPv6 advertisements that carry the Router MAC, set the
'Router' bit in the advertisements.

Change-Id: Iccc5d1ca025c71f55bfc85c6e12f4cf6944a9dfc
Closes-Bug: #1657483
Change-Id: I724e58d9ecee113ee7627f9d4b2b52f0033c2d16
Partial-Bug:1664734
While adding V4 and V6 routes to the Mtrie, bits that are set in the
prefix beyond the mask length are reset to 0. While doing this for V6
prefixes, it is assumed that the V6 mask length falls on a byte
boundary. If it does not, the prefix bits within the mask's partial
byte are also reset to zero.

As a fix, this constraint has been removed.

Change-Id: I2c61ea9829d37b78d8570411bc3db02c0ef1cded
closes-bug: #1657766
(cherry picked from commit 9b60ccf)
In a typical ECMP use case, by the time packet destined to the ECMP
destination hits the flow, the ECMP index to be used to select the
best among equals would already have been decided by the agent and
programmed accordingly in the vRouter. However, in some not so common
scenarios, the first packet can be trapped to agent for resolving the
nexthop to be used among the nexthops that form the ECMP destination.

In such cases, we should hold and trap the packet rather than just
trapping it, so that the loss of the first packet can be avoided.

Change-Id: I09e785254ee0ef187099152e2808b8a67850d31d
Closes-Bug: #1648027
(cherry picked from commit 21a6b46)
to define end of parsing by getopt_long hence preventing segfault.

Change-Id: I018e5af02e5842d25494b2d93412c8c37dd4b316
Closes-Bug:#1651153
Add all zeros to define end of long_options array.

Change-Id: I1ba57bc0ed6bb2d3ddce3f72dcbd87e2d4e30b53
Closes-Bug: #1679489
Right now, the NH's refcount is not incremented atomically, as the
reference to an NH is always taken from a single thread in netlink
context. This is because Routes, Labels, and other NHs access an NH only
when the Agent performs config changes. But with datapath MAC learning
in place because of PBB, references to an NH, especially an L2 Multicast
NH, happen from multiple threads, leading to the requirement of
incrementing it atomically. Not doing so causes the refcount to go bad
for that particular NH and results in a double free of the NH.

Change-Id: I0afa517b56eeac0371670ee76241e392be762f20
closes-bug: #1672293
closes-bug: #1672373
closes-bug: #1673663
On a given physical interface there can only be one sub-interface per
given VLAN id. If the Agent attempts to add multiple sub-interfaces with
the same VLAN id, vrouter rejects the addition with an -EEXIST error. As
part of this rejection, the temporary sub-interface needs to be properly
cleaned up. Right now the cleanup is not proper w.r.t. memory, and the
refcount of the physical interface is also not handled well. This leads
to either a memory leak or corruption.

Also, currently the refcounting of a VIF is not symmetric w.r.t.
increment and decrement. Decrement of the refcount is done in an atomic
way, but not increment. This as such can lead to issues when RCU
callbacks of an NH (which eventually decrement the VIF refcount in an
atomic manner) are executed in parallel to a new NH addition from the
Agent (which increments the VIF refcount in a non-atomic manner).

As a fix, the sub-VIF is cleaned up properly and the VIF refcount is
incremented in an atomic manner.

Change-Id: I2a9d41f2b0152f700e2c5c96c9f39825c22ce482
closes-bug: #1677571
If the composite nexthop does not have any component nexthops,
we set the reach_nh to discard. One side effect of this behavior
is that multicast ARPs are dropped, since ARP processing happens
in the composite nexthop handling. Hence, retain the composite
nexthop functionality, even if there are no component nexthops.

As part of this change, we also make sure that if a change in
any nexthop fails, we do not disrupt the way it works.

Change-Id: Ic9ea091e3f0cbe659389bc57bea2a1e214bbd771
Closes-Bug: #1686452
(cherry picked from commit d686b42)
Closes-Bug:#1630651

Change-Id: I99a090f26083b0804a2e0147d24f24a17d5be0ce
skb_segment casts part of skb->cb to a structure to determine the
tunnel header length. Having remnants of the packet structure in cb
results in a wrong tunnel header length calculation and thus wrong
memory copies to random memory locations. This problem occurs with
newer kernels (4.x).

Change-Id: Ifc2c7a4c5ea448cd57df88e51bb82a7f62c97cc6
Closes-Bug: #1685181
When skb_gro_receive merges skbs/fragments, it does not take
a reference to the underlying page, thus transferring the
original reference it had to the first packet of the packet
bundle. This will work fine for non-cloned packets.

However, for cloned packets, when the gro-ed packet is
eventually freed (because the original skb was not cloned and
hence reference was 1), the merged skb's frags also get freed
(put_page-ed) without taking into account the other references
that were held for the fragments (dataref). This will result in
bad page problems.

Hence, avoid GRO for cloned packets.

Change-Id: I867dc6eac82e1ebc7a0f496b928d66ff62409542
Closes-Bug: #1683305
When fragments are received out of order, the assembler code caches them
until the head fragment is received. While holding the packet, the
packet's nh is not cached, to avoid taking a reference to the NH. Once
the head fragment is received, these out-of-order fragments are released
for flow processing. While doing this, the packet's nh is not filled in
if the encap is a Vxlan packet, as this requires a MAC address lookup.
For Mpls packets, the label is looked up to extract the NH. Not filling
in the NH results in flow lookup failure, as the NH id is also a key for
the flow table. Failure to look up the original flow entry (of the head
fragment) results in either creating a new flow entry (with the wrong
key nh id) in Hold state, or dropping the packet without flow
processing.

To avoid this, a bridge lookup is done in the given VRF to extract the
NH.

closes-bug: #1698986

Conflicts:
	dp-core/vr_bridge.c
	dp-core/vr_flow.c
	include/vr_bridge.h

Change-Id: I62d1f0a2bd2112636a43ec00a25a9530838bd8ae
vif get is to be called only when the argument is a number.

Change-Id: Iac358d6e99e38fb3a08a61e39d325aed1d5f4d5e
Closes-Bug: #1752783