R3.0 #89

Open: wants to merge 166 commits into master
Conversation

jhickman8x3
No description provided.

vmahuli and others added 30 commits February 18, 2016 13:12
Change-Id: I7a24151be54ccc2c3c9505ea7afb8437752ef25f
Vrouter to generate an EEXIST error when a flow add request
fails due to a race between the agent and the datapath, where
the flow entry was already created because of a packet trap

Closes-Bug: 1547501
Change-Id: I0d2073622b06998dd5778fb94749958afc09bac0
(cherry picked from commit c17b4d4)
We cannot hash on L4 ports, as fragments other than the first have
no L4 header. As a result, fragments of the same packet may be
distributed onto different lcores and potentially routed out of order.

The fix was originally committed by Michal Dubiel (commit d59ca66).
For some reason the fix was dropped in commit 1056396.

Change-Id: Iac607e6e14a139a0c015eeb5c9ff8c3de04854c6
Closes-Bug: #1474432
If a flow operation is requested for an index which
is not active, vrouter should return an ENOENT error
for such an operation.

Closes-Bug: 1548678
Change-Id: I42bbd58f41a868b1ce630046ae40e911df0d9491
(cherry picked from commit 1cf3761)
Change-Id: Ia9ae864af98a645b60c5880db29ba8ff1b3c494f
Closes-Bug: #1517665
The root cause is mbuf starvation due to lots of fragments.
The mbufs get stuck in the fragment assembler.

To fix the issue, we disable fragment assembler allocations if the
number of allocated fragment elements is greater than 1024.
This temporarily drops out-of-order fragments.

+ Fix memory allocation statistics.

Change-Id: Ife775ac3073a9c5981672bd64952661c28380a7c
Closes-bug: #1545798
Add a new option "--match" to match flows based on certain criteria.
With the match option, one will be able to match based on source IP
or destination IP (with or without ports), VRF, protocol, or any
combination of these criteria.

e.g.:
--match 1.1.1.1:20
--match "1.1.1.1:20,2.2.2.2:22"
--match "[fe80::225:90ff:fec3:afa]:22"
--match "10.204.217.10:56910 & vrf 0 & proto tcp"
--match "10.204.217.10:56910,169.254.0.3:22 & vrf 0 & proto tcp"

Change-Id: Id70e6d60950babe44612e6aded036c5d1bec6f92
Partial-BUG: #1513001
Insufficient space in an external memory buffer is not actually
an error for the vRouter/DPDK, so we suppress those messages
by using -DSANDESH_QUIET.

Change-Id: I5bc0a51f7c4659f2733a72e9b45322d117265a87
Closes-bug: #1546291

If port mirroring is enabled, a cloned packet is subjected to mirroring
using the original packet's forwarding metadata. If the mirroring code
changes the metadata content, the original packet will be forwarded as
per the changed fmd, resulting in wrong forwarding.

In the current case, mirroring is to a VM in a different VN (hence a new
VRF), and the mirroring code is modifying the fmd's dvrf to the new VRF.
The original ARP packet's L2 and L3 lookups are happening on the
modified VRF, resulting in the ARP getting dropped.

Also, the type of the packet is identified using vr_pkt_type() after the
packet is mirrored. This results in the wrong pkt_type being used for the
mirrored packet, and hence the source IP of the mirrored packet is not
computed correctly.

As a fix, new FMD is used for mirrored packet and packet type is
identified before mirroring itself.

Change-Id: I1fbe7d7e83c7e66c3287b8082444483578117d14
closes-bug: #1549727
The number of TX/RX descriptors is also decreased to account
for this.

Change-Id: Ib29da97e229fbf6a77644e449d2a81281f438058
Closes-Bug: 1550073
If mirroring is enabled on the analyzer VM's port itself, one copy of the
original packet gets mirrored to that port. But this mirrored packet
should not be mirrored again. Right now, without verifying whether the
packet is already mirrored, it is attempted for mirroring again.

When a packet is mirrored it is marked with flag VP_FLAG_FROM_DP. If
this flag is set it is not mirrored again.

Change-Id: Ie4cca916945878b77bc4db7666992f3933db3e13
closes-bug: #1549761
Currently, the L2 header used for both L2 and L3 GRO processing is only
the nexthop id. Post GRO, the nexthop is retrieved from this id and the
packet is subjected to this nexthop for further processing. The ingress
interface filled in the packet is the nh_dev of this nexthop. This
results in the ingress interface being the Tap interface though the
packet actually ingressed on the Fabric interface. This can potentially
lead to issues in the packet's further processing, as the Tap interface
is not the real ingress interface.

As a fix, the ingress interface id is also pushed as L2 header.

Change-Id: I6fbe055fff7da4c957fde29d132b56ea19bd2f3f
closes-bug: #1550632
When transmit port mirroring is enabled, a packet received on the Fabric
interface is right now mirrored using the source IP of the inner packet.
This results in an RPF failure on the Analyzer VM's compute node, because
the compute node doing the port mirroring is using another compute
node's VM IP.

As a fix, if the mirroring is Tx mirroring, the inner packet's
destination IP is used rather than its source IP, so that the Analyzer
VM's RPF check will not have any issues.

Change-Id: I5beaa0dd0cc3c886a1e77f244c8003595ed348e2
closes-bug: #1550312
Vrouter will track TCP flags, but will not evict the flow
entry once the session is closed; it lets the vrouter agent
look at the flags to delete the flow eventually.

This is required until the vrouter agent handles the evict event
in a more stable way.

Partial-Bug: 1550759
Change-Id: Ic7074442b22fc2c11362274e4e390654f3f2400a
We call vr_dpdk_virtio_stop() from uvhm_client_munmap() which in turn
is called from vr_uvh_nl_vif_del_handler().

These are as-tiny-as-possible changes based on the abandoned
refactoring commit: https://review.opencontrail.org/#/c/17178/

Closes-bug: #1551093
Change-Id: I4653601635dbb7f29d7af9ce15a18193904f583f
was tunneled or not

Once the packet is submitted for GRO, all datapath information is
lost. Post GRO, only values that are saved in the packet are the
vif and the nexthop. vif is a recent addition to the saved
information that helped us to identify which interface the packet
came from originally. Once the vif value was set properly, the logic
that checked whether the packet should be trapped to agent or not
based on the presence of label (basically fabric arp responses should
be trapped or not), misbehaved since label information is not saved
pre-GRO and hence not available in the metadata post GRO. For now,
fix the specific logic by checking whether the egress vrf is different
from the ingress vrf, which will be the case since physical interface
vrf will not be the same as vm's vrf.

Change-Id: Iba000889039bc8a5020fc11a462ba1b1a68ce1c8
Closes-BUG: #1551382
A mirror action needs its own forwarding metadata, since the forwarding
parameters that are used for mirroring will be different from those
that are used for packet forwarding. In this particular case, the VRF
to which the packets are mirrored is different from the VRF in which the
packets need to be classified, and hence the packets were dropped in the
flow module with no-source-route as the drop reason.

Change-Id: I21930818a5cdec560818acb79f671ce29ac8bc85
Closes-BUG: #1552101
Currently, when a flow entry is deleted, the corresponding hash entry is
kept at the head of the free list, and the entry's flags are reset only
after it has been successfully inserted into the list. If any other core
is attempting to fetch a free entry, there is a likely chance that the
just-added entry will be handed out as a free entry before the flags are
reset, as this entry is at the head of the list. Because of this, there
is a race condition on the flags between deletion of the entry and
addition of the entry. If the deletion overwrites the addition's flags,
the Valid flag gets overwritten. Because of this, the entry would not be
in the free list but still would not be considered a valid entry.
Searching for this entry by index then fails, as it is not a valid entry.

As a fix, the flags are modified before inserting the entry into the
free list, so that there is no race condition.

Change-Id: I67b1077e75a8a9f4ca5ddcfe3a7513061fd20681
closes-bug: #1552544
I have no hardware with 16 cores, but I guess that with lots
of lcores assigned to vRouter/DPDK, some of them might go to
sleep due to having no RX queues assigned.

Change-Id: I7e37e14ef9aebfc3613ef5e908091ee35093d1bb
Closes-bug: #1505472
In case of ICMP error packets generated for ICMP errors, we were not
initializing the flow key and were trying to form a flow out of that
key, resulting in a wrong key length and corrupted flow entry(s).

We will drop such packets.

Change-Id: Idae46a7e128482ad89da8b5bd1bd0ef6b17ef28e
Closes-BUG: #1556363
For an ICMP packet, the port numbers remain the same in either direction,
i.e. the source port remains the same for both the forward and the
reverse flows, as does the destination port. Hence, we should not be
swapping ports in calculating the flow key while trying to tag an ICMP
error packet to a flow set up for the original stream.

Change-Id: Ic5df8aec5f1009441aefd3d177568d55f3cb0d2c
Closes-BUG: #1554236
anandhk-juniper and others added 30 commits September 30, 2016 15:04
In the process of indicating a flow miss to the agent, if a pclone
fails, vRouter returns without unsetting the NEW flag, a flag that
indicates that the datapath is in the process of creating a new flow.
The side effect of not unsetting this flag is that the agent cannot
modify the entry, and this leads to perennial HOLD flows in the
system.

To fix this problem, if the pclone fails, trap the original packet
to the agent without holding it in the HOLD queue. While this will
mean that the first packet does not reach the destination, it is
a better alternative than not trapping the packet and waiting for
the agent to do the audit after a minute or so. Such trapped packets
are captured under a new counter, "Original Packet Trapped", in the
dropstats output.

Change-Id: I34c19abf935d9b06f55e875b76a4859350743c2b
Partial-Bug: #1628175
This fix is IPV6 equivalent of
https://review.opencontrail.org/#/c/11506/.

In the in-network service instance case, when service scaling
is more than one, all the instances are going to have the
same MAC address and IP address. The Inet route points to an
ECMP composite nexthop, but the FDB route does not point to an
ECMP nexthop. So when a neighbour request is received from a
service VM, Vrouter needs to send the response on the incoming
interface rather than bridging the reply packet, as the FDB
route points arbitrarily to any one of the service VMs' encap
L2 nexthops.

After preparing the reply packet, a bridge lookup is done
and the outgoing interface is compared against the ingress
interface. If they are not the same, the reply packet is
transmitted on the ingress interface.

Change-Id: Ib0113227019d7a452541129843813a647d388b36
closes-bug: #1461882
While calculating the UDP source port for a multicast packet, the hash
needs to be on the Ethernet header and VRF id. For packets arriving from
a VM or ToR, there are no extra headers other than the transport
headers, so the hash is calculated correctly for these packets. But for
packets arriving from another compute node, there are additional headers
beyond the transport headers. The hash is wrongly calculated in this
case on these additional headers, leading to a different hash for the
same Ethernet header and thus a different UDP source port.

As a fix, the offset is correctly calculated and the right Ethernet
header is passed to the hash function.

Change-Id: I93ae0d96e536875988da6a525cdba4783c1e4d6e
closes-bug: #1635046
The neighbour request packets are typically multicast packets, and there
is no flow processing for these. When a neighbour request is converted
to a neighbour advertisement, we continue to use the same packet buffer
and the same packet flags for the advertisement too. This ends up not
creating a flow for the neighbour advertisement either, as the original
packet is marked as a multicast packet. But the fix
https://review.opencontrail.org/#/c/24973/ for bug #1461882 for V6
resulted in creating new packet flags for the advertisement, and this
results in a flow being created. The flow processing drops the
response, as the neighbour advertisement should ideally have come from
a different interface.

We can fix this issue either by manipulating the source interface to
match the interface on which the neighbour falls, so that flow
processing succeeds, or by disabling flow processing for
advertisements.

The fix now disables the flow processing for advertisements.

Change-Id: Ic91f0fb794c13912e43d8c96c726bd80e559b7fb
closes-bug: #1635931
For all types of virtual interfaces, the location from where we tap
packets for RX port mirroring is in a common function where all
the common virtual processing happens. For a VLAN-based sub-interface,
this means that the port mirroring happens after untagging the packet,
which is not correct behavior.

Hence, move the tap locations to the lowest layers of vif receive.

Also, during transmission, since the tap is in the lowest common
layer, sub-interface mirroring does not work. Hence, move the taps
to the highest layer of vif transmit.

Also, add taps for vhost and physical interfaces.

Change-Id: Ie42bfcedde2eba1167fa49e7ecd627b7762363e9
Closes-Bug: #1630889
In order to make sure that we mirror the packet the way we received
it, i.e. before removing any data from the packet, such as VLAN tags,
the mirroring tap points were moved to the very beginning of packet
processing in vRouter. Unfortunately, packet type classification
does not happen before the mirror tap point. Since no classification
happens, options like COPY SIP/DIP do not work, and hence packets
can get dropped in the flow block.

Fix this by explicitly classifying the packet in vif_mirror.

Change-Id: Id310b31bc1990e6671882b447b8243767ada24ee
Closes-Bug: #1630889
During transmit port mirroring, we seem to be using the mirror values
from the flow of the original packet. This is not the right behavior,
since port mirroring shouldn't depend on flow-based mirror
values.

Hence, reset the flow index for the port-mirrored packet.

Change-Id: I1aff969b08a2f247ccf1fc636d6689350e819f29
Closes-Bug: #1636638
Right now, the TTL of the packet is not overwritten by Vrouter; it is
only decremented like a hop for the required packets. But BGP packets
from a VM (due to BGP as a service) can come to Vrouter with a TTL of 1.
In a multi-hop environment, we require this TTL to be more than 1. For
this purpose, the flow entry is extended with a TTL field, and if the
Agent sets it, vrouter unconditionally sets this TTL in the packet and
recalculates the checksum.

Change-Id: I4619e81ad324827e29dc487857020ddd9e429f2c
partial-bug: #1567586
Change-Id: I31ff88e6f46600b917c62ac967d112f9bd0949c5
Closes-BUG: 1630772
Change-Id: Id7bd20a040a1aa3c1ad4b1c6f68c10a9836d6108
Closes-Bug: 1641254
For the sandesh message carrying vrouter build information, the null
termination character was not accounted for while populating the sandesh
message. Due to this, extra characters were observed at the collector
and contrail-alarm-gen. The string copy has been modified to take the
null termination character into consideration.

Change-Id: I249c5fc168d301b718e63530a2022bbe67e080f0
Closes-Bug: #1646693
A unicast neighbor solicitation/response undergoes a flow lookup.
In many cases the source IP address that is used will be the link
local IP which will get dropped because of RPF checks. Consistent
with the way we treat ARP requests, we will now make sure that
the solicitation and advertisements do not undergo flow lookup.

Change-Id: Ib2ab306b0bdd6f201c35faa1918f97324238be1e
Closes-Bug: #1643906
(cherry picked from commit 961a878)
with high number of flows.
Skip flow setup rate calculations for the first time.

Change-Id: Ib17bda69b7e1994a460512035be09888ad78500c
Closes-Bug:#1654941
__skb_checksum_complete calculates the checksum over the whole packet,
including the Ethernet padding, if any. This results in wrong
checksum verification at the receiving end and thus a wrong packet
drop.

Use __skb_checksum_complete_head instead, and pass the length
calculated from the IP header as the size over which the checksum has
to be calculated.

Change-Id: I48efbb66bc171c590784d6f1931be9abd81198ae
Closes-Bug: #1658576
The existing code that handles neighbor solicitations, bridges
all requests to link local destinations. This behavior does not
work for gateway link local address since

. a broadcast request to the link local address will not make it
  to the agent

. a unicast request to the link local address will not be handled
  by the agent.

Agent does not handle neighbor solicitations since vRouter is
expected to respond back with the proper mac address based on
the flags in the route.

Hence, make all link local neighbor solicitations to go through
vRouter's logic of proxying and flooding based on flags set on
the route.

Change-Id: I4f23d1c980e887d044776d49bbc71b9faaf2f0a9
Closes-Bug: #1657483

Set the 'Router' bit if proxying with Router MAC

For ICMPv6 advertisements that carry the Router MAC, set the
'Router' bit in the advertisements.

Change-Id: Iccc5d1ca025c71f55bfc85c6e12f4cf6944a9dfc
Closes-Bug: #1657483
Change-Id: I724e58d9ecee113ee7627f9d4b2b52f0033c2d16
Partial-Bug:1664734
While adding V4 and V6 routes to the Mtrie, bits that are set in the
prefix beyond the mask length are reset to 0. While doing this for V6
prefixes, it is assumed that the V6 mask length falls on a byte
boundary. If it does not, the prefix bits within the mask's partial
byte are also reset to zero.

As a fix, this constraint has been removed.

Change-Id: I2c61ea9829d37b78d8570411bc3db02c0ef1cded
closes-bug: #1657766
(cherry picked from commit 9b60ccf)
In a typical ECMP use case, by the time packet destined to the ECMP
destination hits the flow, the ECMP index to be used to select the
best among equals would already have been decided by the agent and
programmed accordingly in the vRouter. However, in some not so common
scenarios, the first packet can be trapped to agent for resolving the
nexthop to be used among the nexthops that form the ECMP destination.

In such cases, we should hold and trap the packet rather than just
trapping it, so that the loss of the first packet can be avoided.

Change-Id: I09e785254ee0ef187099152e2808b8a67850d31d
Closes-Bug: #1648027
(cherry picked from commit 21a6b46)
to define end of parsing by getopt_long hence preventing segfault.

Change-Id: I018e5af02e5842d25494b2d93412c8c37dd4b316
Closes-Bug:#1651153
Add all zeros to define end of long_options array.

Change-Id: I1ba57bc0ed6bb2d3ddce3f72dcbd87e2d4e30b53
Closes-Bug: #1679489
Right now, the NH's refcount is not incremented atomically, as the
reference to an NH is always taken from a single thread in netlink
context. This is because Routes, Labels, and other NHs access an NH only
when the Agent performs config changes. But with datapath MAC learning
in place because of PBB, references to an NH, especially an L2 Multicast
NH, happen from multiple threads, leading to the requirement of
incrementing it atomically. Not doing so causes the refcount to go bad
for that particular NH and results in a double free of the NH.

Change-Id: I0afa517b56eeac0371670ee76241e392be762f20
closes-bug: #1672293
closes-bug: #1672373
closes-bug: #1673663
On a given physical interface there can only be one sub-interface per
given VLAN id. If the Agent attempts to add multiple sub-interfaces with
the same VLAN id, vrouter rejects the addition with an -EEXIST error. As
part of this rejection, the temporary sub-interface needs to be properly
cleaned up. Right now the cleanup is not proper w.r.t. memory, and the
refcount of the physical interface is also not handled well. This leads
to either a memory leak or corruption.

Also, currently the refcounting of a VIF is not symmetric w.r.t.
increment and decrement. Decrement of the refcount is done in an atomic
way, but not increment. This as such can lead to issues when RCU
callbacks of an NH (which eventually decrement the VIF refcount in an
atomic manner) are executed in parallel to a new NH addition from the
Agent (which increments the VIF refcount in a non-atomic manner).

As a fix, the sub-VIF is cleaned up properly and the VIF refcount is
incremented in an atomic manner.

Change-Id: I2a9d41f2b0152f700e2c5c96c9f39825c22ce482
closes-bug: #1677571
If the composite nexthop does not have any component nexthops,
we set the reach_nh to discard. One side effect of this behavior
is that multicast ARPs are dropped, since ARP processing happens
in the composite nexthop handling. Hence, retain the composite
nexthop functionality, even if there are no component nexthops.

As part of this change, we also make sure that if a change in
any nexthop fails, we do not disrupt the way it works.

Change-Id: Ic9ea091e3f0cbe659389bc57bea2a1e214bbd771
Closes-Bug: #1686452
(cherry picked from commit d686b42)
Closes-Bug:#1630651

Change-Id: I99a090f26083b0804a2e0147d24f24a17d5be0ce
skb_segment casts part of skb->cb to a structure to determine the
tunnel header length. Having remnants of the packet structure in cb
results in a wrong tunnel header length calculation and thus wrong
memory copies to random memory locations. This problem occurs with
newer kernels (4.x).

Change-Id: Ifc2c7a4c5ea448cd57df88e51bb82a7f62c97cc6
Closes-Bug: #1685181
When skb_gro_receive merges skbs/fragments, it does not take
a reference to the underlying page, thus transferring the
original reference it had to the first packet of the packet
bundle. This will work fine for non-cloned packets.

However, for cloned packets, when the gro-ed packet is
eventually freed (because the original skb was not cloned and
hence reference was 1), the merged skb's frags also get freed
(put_page-ed) without taking into account the other references
that were held for the fragments (dataref). This will result in
bad page problems.

Hence, avoid GRO for cloned packets.

Change-Id: I867dc6eac82e1ebc7a0f496b928d66ff62409542
Closes-Bug: #1683305
When fragments are received out of order, the assembler code caches them
until the head fragment is received. While holding the packet, the
packet's nh is not cached, to avoid taking a reference to the NH. Once
the head fragment is received, these out-of-order fragments are released
for flow processing. While doing this, the packet's nh is not filled in
if the encap is a Vxlan packet, as this requires a MAC address lookup.
For Mpls packets, the label is looked up to extract the NH. Not filling
in the NH results in flow lookup failure, as the NH id is also a key for
the flow table. Failure to look up the original flow entry (of the head
fragment) results in either creating a new flow entry (with the wrong
key nh id) in Hold state, or dropping the packet without flow
processing.

To avoid this, a bridge lookup is done in the given VRF to extract the
NH.

closes-bug: #1698986

Conflicts:
	dp-core/vr_bridge.c
	dp-core/vr_flow.c
	include/vr_bridge.h

Change-Id: I62d1f0a2bd2112636a43ec00a25a9530838bd8ae
vif get is to be called only when the argument is a number.

Change-Id: Iac358d6e99e38fb3a08a61e39d325aed1d5f4d5e
Closes-Bug: #1752783