Add TcpSorter to support sorting TCP segments.#337
Add TcpSorter to support sorting TCP segments.#337rickyzhang82 wants to merge 2 commits intoseladb:masterfrom
Conversation
Signed-off-by: Ricky Zhang <rickyzhang@gmail.com>
|
Thanks for working on this! Could you please check the CI failures? I had a quick look and it seems it's failing because you're using C++11 features. PcapPlusPlus is intended to support older C++ versions as well |
|
C++11 is fully supported by GCC 4.8.1. Any Linux distribution and FreeBSD which are NOT in end of life provide GCC version above 4.8.1. I won't take away C++11 feature in this PR. It is up to you if you want to add compiler flag |
|
Unfortunately I don't think I'll add the |
|
It is just two weeks away that we will start our next decade. I'm surprised that if we still say no to C++11 standard. Anyway, it is a choice you made. I respect that. But I won't patch it by myself. So please drop the PR if nobody is going to make it C++11 free. I used the following C++11 features to save me from typing and troubles:
From 1-3, you can replace it mechanically. The 4th bullet point involves more work. I consider it is a crime in 2020 to ask any programmers to free up memory manually. |
|
Thanks @rickyzhang82 for all your work and efforts. Since the rest of the library is not C++11 compatible I prefer not to use C++11 in just specific modules. I'll try to make those changes you suggest to make it C++98 compatible. Thanks again for all of your contribution to this project |
Signed-off-by: Ricky Zhang <rickyzhang@gmail.com>
|
I patched one minor bug I found. The missing packet callback should pass in expected sequence number, instead. It is a pity that you don't enable C++11 for older GCC. I have to fork your repository to maintain mine now. |
|
@rickyzhang82: let me to propose you to use a |
|
But I need the ordering of (seq + TCP payload size) when flushing the packets by other’s side ACK. Does RB tree that slow? I’m curious about this. Because each TCP connection should not hold that many unacknowledged TCP packets at any given time. I did find that the longer callback function execution, the higher probability the libpcap loses the packet in capturing. I’m thinking about using multithreaded solution. See my discussion issue #342 |
|
The |
|
That’s true. Will change it. |
|
Change the signature of callback from typedef void (*OnTcpPacketReady)(int side, ConnectionData connData, SPRawPacket spRawPacket, void* userCookie);to typedef void (*OnTcpPacketReady)(int side, const ConnectionData& connData, SPRawPacket spRawPacket, void* userCookie); |
|
The following code is also heavyweight: // get packet's source and dest IP address
IPAddress* srcIP = nullptr;
IPAddress* dstIP = nullptr;
IPv4Address srcIP4Addr = IPv4Address::Zero; // !!! mem alloc
IPv6Address srcIP6Addr = IPv6Address::Zero; // !!! mem alloc
IPv4Address dstIP4Addr = IPv4Address::Zero; // !!! mem alloc
IPv6Address dstIP6Addr = IPv6Address::Zero; // !!! mem alloc
if (ipLayer->getProtocol() == IPv4)
{
srcIP4Addr = (dynamic_cast<IPv4Layer*>(ipLayer))->getSrcIpAddress(); // !!! mem alloc
srcIP = &srcIP4Addr;
dstIP4Addr = (dynamic_cast<IPv4Layer*>(ipLayer))->getDstIpAddress(); // !!! mem alloc
dstIP = &dstIP4Addr;
}
else if (ipLayer->getProtocol() == IPv6)
{
srcIP6Addr = (dynamic_cast<IPv6Layer*>(ipLayer))->getSrcIpAddress();
srcIP = &srcIP6Addr;
dstIP6Addr = (dynamic_cast<IPv6Layer*>(ipLayer))->getDstIpAddress();
dstIP = &dstIP6Addr;
} |
|
Look into PR #308 in which I have proposed the performance improvements. You can use it in your code in place where a hash is calculated. |
|
@rickyzhang82 I'll review the code soon, but one thing that is missing is UTs. Could you please add some to either You can take a look at the UTs for TcpReassembly: PcapPlusPlus/Tests/Pcap++Test/main.cpp Lines 4692 to 5370 in ed30c80 |
|
@seladb Please let me add UT and send another PR later. |
|
Do you have a empirical test to verify your claim that
Also, how about insertion and deletion? |
|
Regarding to the copy cost, it makes sense to squeeze the performance by passing as const reference. I also change all methods that pass in shared pointer into pass in as const reference as well. |
Yes, I did the tests but did not test the deletion. You can make the tests in your environment. Test results: Win10x64, VS2019, Core i7-9700K Memory usage (valgrind result): CentOS 7, g++ 8.3.1 key: uint32_t, value: uint64_t, size: 500 000 map: unordered_map: #include <vector>
#include <map>
#include <cstdint>
#include <chrono>
#include <iostream>
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <unordered_map>
#include <random>
void testHashMapAndMap()
{
constexpr auto tryCount = 15000000U;
constexpr uint32_t numToSearch = 1U;
constexpr auto maxSize = 100000U;
// create a shuffled unique set
std::unordered_map<uint32_t, uint32_t> shuffledMap, shuffledErrMap;
shuffledMap.reserve(maxSize);
shuffledErrMap.reserve(maxSize);
using RandomGenerator = std::mt19937;
std::random_device dev;
RandomGenerator randomGenerator(dev());
std::uniform_int_distribution<RandomGenerator::result_type> dist(1, maxSize * 3);
while(shuffledMap.size() < maxSize)
{
auto rndValue = dist(randomGenerator);
if(shuffledMap.find(rndValue) == shuffledMap.cend())
shuffledMap.insert(std::make_pair(rndValue, rndValue));
}
while(shuffledErrMap.size() < maxSize)
{
auto rndValue = dist(randomGenerator);
if(shuffledMap.find(rndValue) == shuffledMap.cend() && shuffledErrMap.find(rndValue) == shuffledErrMap.cend())
shuffledErrMap.insert(std::make_pair(rndValue, rndValue));
}
std::vector<uint32_t> keys, errKeys;
keys.reserve(shuffledMap.size());
errKeys.reserve(shuffledErrMap.size());
for(auto const &pair : shuffledMap)
keys.push_back(pair.first);
for(auto const &pair : shuffledErrMap)
errKeys.push_back(pair.first);
shuffledMap.clear();
shuffledErrMap.clear();
{
// fill the map
std::map<uint32_t, uint64_t> map;
auto startTime = std::chrono::steady_clock::now();
for(auto key : keys)
map.insert(std::make_pair(key, key));
auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - startTime);
std::cout << "Fill the map: duration(ms): " << duration.count() << ", count: " << map.size() << '\n';
// lookup in the map
auto count = 0U;
startTime = std::chrono::steady_clock::now();
for(auto i = 0U; i < tryCount; ++i)
if(map.find(keys[i % maxSize]) != map.cend())
++count;
duration = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - startTime);
std::cout << "Lookup in the map: duration(ms): " << duration.count() << ", iterations: " << count << '\n';
// insuccessfull lookup in the map
count = 0U;
startTime = std::chrono::steady_clock::now();
for(auto i = 0U; i < tryCount; ++i)
if(map.find(errKeys[i % maxSize]) == map.cend())
++count;
duration = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - startTime);
std::cout << "Insuccessfull lookup in the map: duration(ms): " << duration.count() << ", iterations: " << count << '\n';
std::cout << '\n';
}
{
// fill the unordered map
std::unordered_map<uint32_t, uint64_t> unorderedMap;
//hashMap.reserve(keys.size());
auto startTime = std::chrono::steady_clock::now();
for(auto key : keys)
unorderedMap.insert(std::make_pair(key, key));
auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - startTime);
std::cout << "Fill the unordered map: duration(ms): " << duration.count() << ", count: " << unorderedMap.size() << '\n';
// lookup in the hash map
auto count = 0U;
startTime = std::chrono::steady_clock::now();
for(auto i = 0U; i < tryCount; ++i)
if(unorderedMap.find(keys[i % maxSize]) != unorderedMap.cend())
++count;
duration = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - startTime);
std::cout << "Lookup in the unordered map: duration(ms): " << duration.count() << ", iterations: " << count << '\n';
// insuccessfull lookup
count = 0U;
startTime = std::chrono::steady_clock::now();
for(auto i = 0U; i < tryCount; ++i)
if(unorderedMap.find(errKeys[i % maxSize]) == unorderedMap.cend())
++count;
duration = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - startTime);
std::cout << "Insuccessfull lookup in the unordered map: duration(ms): " << duration.count() << ", iterations: " << count << '\n';
std::cout << '\n';
}
}
constexpr auto maxSize = 500000U;
void mapTestMem()
{
std::map<uint32_t, uint64_t> map;
for(auto i = 0U; i < maxSize; ++i)
map.emplace(i, i);
}
void unorderedMapTestMem()
{
std::unordered_map<uint32_t, uint64_t> map;
map.reserve(maxSize);
for(auto i = 0U; i < maxSize; ++i)
map.emplace(i, i);
}
int main(int argc, char *argv[])
{
testHashMapAndMap();
return 0;
} |
| return; | ||
| } | ||
| // parse the packet | ||
| Packet packet(spRawPacket.get(), false); |
There was a problem hiding this comment.
You can make a small optimization if it is applicable: Packet packet(spRawPacket.get(), false, TCP)
| // get IP layer | ||
| Layer* ipLayer = nullptr; | ||
| if (packet.isPacketOfType(IPv4)) | ||
| ipLayer = dynamic_cast<Layer*>(packet.getLayerOfType<IPv4Layer>()); | ||
| else if (packet.isPacketOfType(IPv6)) | ||
| ipLayer = dynamic_cast<Layer*>(packet.getLayerOfType<IPv6Layer>()); | ||
|
|
||
| if (ipLayer == nullptr) | ||
| return; | ||
|
|
||
| // Ignore non-TCP packets | ||
| TcpLayer* tcpLayer = packet.getLayerOfType<TcpLayer>(); | ||
| if (tcpLayer == nullptr) | ||
| return; |
There was a problem hiding this comment.
Possible optimizations:
// get IP layer
Layer* ipLayer;
if (packet.isPacketOfType(IPv4))
ipLayer = dynamic_cast<Layer*>(packet.getLayerOfType<IPv4Layer>());
else if (packet.isPacketOfType(IPv6))
ipLayer = dynamic_cast<Layer*>(packet.getLayerOfType<IPv6Layer>());
else
return;
// Ignore non-TCP packets
TcpLayer* tcpLayer = packet.getNextLayerOfType<TcpLayer>(ipLayer);
if (tcpLayer == nullptr)
return;| return; | ||
| } | ||
| // create a TcpSorterData object and add it to the active connection list | ||
| tcpSorterData = std::make_shared<TcpSorterData>(); |
There was a problem hiding this comment.
If a unique_ptr is acceptable in this case you should use it instead of shared_ptr
| else if (tcpSorterData->numOfSides == 2) | ||
| { | ||
| // check if packet matches side 0 | ||
| if (tcpSorterData->twoSides[0].srcIP->equals(srcIP) && tcpSorterData->twoSides[0].srcPort == srcPort) |
There was a problem hiding this comment.
Small optimization: change the order of the checks
| sndIdx = 0; | ||
| } | ||
| // check if packet matches side 1 | ||
| else if (tcpSorterData->twoSides[1].srcIP->equals(srcIP) && tcpSorterData->twoSides[1].srcPort == srcPort) |
There was a problem hiding this comment.
Small optimization: change the order of the checks
|
|
||
| // collect flowKeys from TCP connection list | ||
| auto iterKey = m_ConnectionList.begin(), iterKeyEnd = m_ConnectionList.end(); | ||
| std::vector<uint32_t> accessKeys; |
There was a problem hiding this comment.
Append the following: accessKeys.reserve(m_ConnectionList.size());
|
hi @rickyzhang82 I'm wondering if we can revive this PR. I think you almost completed the work, why shouldn't we complete the remaining work and merge it? |
|
There are 3 road blocks:
If you can live with all 3, I will send a PR again. |
|
with number 3 I can live and we can work together on numbers 1 and 2. Specifically for number 2 - I can help if you'd like me to. |
|
C++ 11 is a deal breaker to me. The app |
|
You can still run your own branch of |
|
I prefer to have one single code base. Otherwise, neither of us share the benefit of the merging. I may find bugs and patch it in future. But the fix may not be in your repository and vice versa. |
|
the issue itself is still open (#100) so at some point someone else might want to implement the TCP reorder/sort functionality. That will be a pity because you already implemented it, and we'll end up having 2 implementations of the same thing. Can we find any way to bridge this gap and get your code merged? If I remember correctly there aren't a lot of places you're using C++11 specific features so maybe we can work something out |
|
I'm not convinced that removing smart pointer is a good idea in year 2020. So I can't compromise on C++ 11. One of the callback function makes a copy of packet and passes it down to the end user application. It is better leave it to the smart pointer to do clean up. I have no interest to exercise memory leak detection again in Valgrind. |
|
Ok I give up. If you think of any way we can bridge the gaps and get your PR merged please let me know. Otherwise let's keep it as is |
|
Please let me know when you allow C++ 11 in PcapPlusPlus. All latest Linux and FreeBSD comes with GCC and Clang supports C++ 11. Thanks! |
|
hi @rickyzhang82 do you mind if I use your code and make some necessary changes to support older versions of C++? |
|
@seladb Please feel free to use it from my my-master branch: https://github.com/rickyzhang82/PcapPlusPlus Thanks! |
|
Thanks! since the branch you created the PR from is gone, the only way I see is to copy the code into my branch and then do the necessary changes. However that way your name won't be in it. Are you ok with this or do you want to create another PR that I'll later modify? |
|
@rickyzhang82 did you get my previous comment? |
|
No problem. Feel free to do whatever you want! Thanks! |
|
Thank you @rickyzhang82 , I'll keep you posted |
I have performed online capturing test and offline pcap file test. The result looks good.
Also, I ran valgrind to check memory leak for hours. The shared pointer makes my life easier. I did find pcap library itself has memory leak when lists NIC. I didn't look deeper into the issue since I can't fix libpcap here.
Regarding to the TCP sorter logic, the TCP packet is flushed to the user once another side sends ACK. You can find more technical detail in the Doxygen header.