Some questions about how modules work in simulator. #3

Open
btlcmr0702 opened this issue Nov 1, 2016 · 46 comments

@btlcmr0702

Hi.
I have run main.exe successfully, and now I want to implement an algorithm on the simulator.
Could you explain how the simulator works, e.g. the relationship between qbb-device and broadcom-node, and the role of qbb-device in the simulation?
I also want to know where, and by which component, the decision to send a PFC packet is made.
Thank you for your reply.

@bobzhuyb
Owner

bobzhuyb commented Nov 1, 2016

Each qbb-net-device is an L2 device, which can be a switch OR a NIC. If it's a switch, it will have a broadcom-node (m_broadcom) and a broadcom queue (m_queue) attached to it. (This is not accurate, see below)

If it's a switch, upon receiving packets, it will ask m_broadcom whether there is still space for this new packet. If so, it pushes this packet into m_queue and asks m_broadcom whether the PFC threshold is met. If so, it will send a PFC PAUSE to upstream.

Upon sending packets, it gets a packet from m_queue and asks m_broadcom whether the queue length has fallen below the PFC threshold. If so, it will send a PFC RESUME.

If it's a NIC, it will perform DCQCN rate control.

You can find most of these details in QbbNetDevice::Receive and QbbNetDevice::DequeueAndTransmit
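
The receive/transmit flow described in this comment can be sketched as a small self-contained model. This is only an illustration, not code from the simulator: the class IngressQueueModel, its members, and the single fixed byte threshold are all assumptions, and in the simulator the real checks are delegated to m_broadcom.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of the PFC bookkeeping described above.
// None of these names come from the simulator source.
enum class PfcAction { None, Pause, Resume };

class IngressQueueModel {
public:
    IngressQueueModel(uint32_t capacityBytes, uint32_t pfcThresholdBytes)
        : m_capacity(capacityBytes), m_threshold(pfcThresholdBytes),
          m_used(0), m_upstreamPaused(false) {}

    // On receive: admit the packet only if there is space, then check
    // whether the PFC threshold has been reached.
    // Returns false if the packet does not fit (it would be dropped).
    bool Enqueue(uint32_t packetBytes, PfcAction& action) {
        action = PfcAction::None;
        if (packetBytes > m_capacity - m_used) {
            return false;
        }
        m_used += packetBytes;
        if (!m_upstreamPaused && m_used >= m_threshold) {
            m_upstreamPaused = true;
            action = PfcAction::Pause;   // send PFC PAUSE upstream
        }
        return true;
    }

    // On transmit: release the space, then check whether the queue has
    // fallen below the threshold so that the upstream can be resumed.
    PfcAction Dequeue(uint32_t packetBytes) {
        assert(packetBytes <= m_used);
        m_used -= packetBytes;
        if (m_upstreamPaused && m_used < m_threshold) {
            m_upstreamPaused = false;
            return PfcAction::Resume;    // send PFC RESUME upstream
        }
        return PfcAction::None;
    }

private:
    uint32_t m_capacity;
    uint32_t m_threshold;
    uint32_t m_used;
    bool m_upstreamPaused;
};
```

With a 3000-byte queue and a 2000-byte threshold, the second 1000-byte packet triggers a PAUSE, and the RESUME is produced only once usage drops below 2000 bytes again.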

@btlcmr0702
Author

btlcmr0702 commented Nov 1, 2016

Oh, so it's very different from the switch mechanism in NS-3, like the bridge module, and each switch port is also abstract: it just counts how many bytes the port receives?
OK, but I found the pause_time parameter. So does the upstream resume sending based on the RESUME message from downstream, or on pause_time? Maybe they work together, I mean the pause time is the last barrier against dropping packets?
And what do terms like sp, pg, and rpr mean? SP for strict priority? pg for priority guarantee?
Thank you very much :)

@bobzhuyb
Owner

bobzhuyb commented Nov 1, 2016

Sorry, I think I made a small mistake above. qbb-net-device is either a NIC port or a switch port. Please bear with me... I wrote this code long ago. broadcom-node and broadcom queue are attached to the node, not the qbb-net-device. A node can have multiple qbb-net-devices (especially on a switch), which share the same m_broadcom and m_queue.

pg is Priority Group, some people call it priority class, or simply "priority".
sp is Service Pool. It's a shared buffer that multiple pg can share.
These are the terms from Broadcom.

The rpr... stuff is from the QCN standard. You can refer to http://www.cs.ucr.edu/~mart/204/802-1au-d2-4.pdf (page 96, figure 32-2)

@btlcmr0702
Author

Thanks first.
I studied the code, but I am confused.
I looked at QbbNetDevice::Receive but can't find where it asks m_broadcom to check for space. I found that in QbbNetDevice::send() instead, which seems unreasonable because it should be in Receive(), and I don't know who invokes send( ), or where.
I am also eager to know how and where PFC is triggered.
It seems to be very late in the USA. Have a good night.

@bobzhuyb
Owner

bobzhuyb commented Nov 1, 2016

You are right.. they are in send(), because send() is where the packets are put into the sending queue (MMU) and ready to be sent to the next hop. This is invoked by the upper layer.

The name "send" is inherited from point-to-point-device. The reason it's called "send" is that the packet is actually leaving this qbb-net-device (the ingress port). On a switch, the packet is heading towards another qbb-net-device (the egress port on the same switch). For this purpose, it must be buffered in the broadcom queue module and wait for the egress qbb-net-device to invoke dequeueandtransmit() and get this packet.

dequeueandtransmit() is where the packet actually leaves the switch.

@btlcmr0702
Author

When you say the upper layer invokes send(), do you mean broadcom-node or the broadcom queue? I can't find where send() is invoked.
To implement my algorithm, I am eager to know where PFC is triggered and how the PFC mechanism works.

@bobzhuyb
Owner

bobzhuyb commented Nov 1, 2016

PFC generation is in QbbNetDevice::CheckQueueFull.

QbbNetDevice::Receive handles received PFC.

The implemented PFC mechanism is just standard. Once a port is paused, it either waits for a RESUME, or waits for PFC pause timeout.

send() is invoked by the common NS-3 pipeline. Nothing special either. It just overrides its parent class method PointToPointNetDevice::Send()
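
The "wait for a RESUME, or wait for the PFC pause timeout" behaviour mentioned in this comment can be sketched as follows. This is a minimal illustration under stated assumptions: PausedPortModel, its methods, and the use of integer microseconds are hypothetical and do not mirror the simulator's actual members.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the paused side of one priority on one port.
// Time is plain integer microseconds, not ns-3 Time.
class PausedPortModel {
public:
    PausedPortModel() : m_paused(false), m_pauseExpiryUs(0) {}

    // A PFC PAUSE arrived: stop sending until RESUME or until the pause expires.
    void OnPause(uint64_t nowUs, uint64_t pauseTimeUs) {
        assert(pauseTimeUs > 0);
        m_paused = true;
        m_pauseExpiryUs = nowUs + pauseTimeUs;
    }

    // A PFC RESUME arrived: sending may continue immediately.
    void OnResume() { m_paused = false; }

    // Called before each transmission attempt. The pause also ends on its
    // own once the pause time has elapsed, even if no RESUME was received.
    bool CanTransmit(uint64_t nowUs) {
        if (m_paused && nowUs >= m_pauseExpiryUs) {
            m_paused = false;
        }
        return !m_paused;
    }

private:
    bool m_paused;
    uint64_t m_pauseExpiryUs;
};
```

Either event unblocks the port; whichever comes first wins.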

@btlcmr0702
Author

Yeah, I found that BroadcomNode::GetPauseClasses decides whether a priority queue should send PFC.
I also noticed dynamtic_pfc_threshold. What is the difference between dynamic and ordinary PFC? What does the parameter m_pg_shared_alpha_cell mean in the dynamic case?
Can I know what this code means?
image

@bobzhuyb
Owner

bobzhuyb commented Nov 1, 2016

It's dynamic PFC threshold from Broadcom. In DCQCN paper http://yibozhu.com/doc/dcqcn-sigcomm15.pdf, "The Trident II chipset in our switch allows us to configure a parameter β such that"

The β is the m_pg_shared_alpha_cell. The paper calls it β because DCQCN already has an \alpha. Broadcom calls this value alpha.

If you don't understand it and don't need it, you may disable dynamic threshold in config.txt
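
As a rough sketch of what a dynamic threshold means: the threshold shrinks as the shared buffer fills up. The expression below follows the one quoted from the code later in this thread, m_pg_shared_alpha_cell multiplied by (m_buffer_cell_limit_sp minus the used ingress service-pool bytes). The function names and the "pause when usage exceeds the threshold" comparison are assumptions for illustration, not the simulator's exact logic.

```cpp
#include <cassert>

// alpha plays the role of m_pg_shared_alpha_cell (called β in the DCQCN paper).
// sharedLimitBytes plays the role of m_buffer_cell_limit_sp.
// usedSharedBytes plays the role of the used ingress service-pool bytes.
double DynamicPfcThreshold(double alpha, double sharedLimitBytes,
                           double usedSharedBytes) {
    assert(alpha > 0.0);
    return alpha * (sharedLimitBytes - usedSharedBytes);
}

// Assumption: a priority group should be paused once its own usage
// exceeds the current dynamic threshold.
bool ExceedsDynamicThreshold(double pgUsedBytes, double alpha,
                             double sharedLimitBytes, double usedSharedBytes) {
    return pgUsedBytes >
           DynamicPfcThreshold(alpha, sharedLimitBytes, usedSharedBytes);
}
```

For example, with alpha = 0.125, a limit of 8000 bytes and 4000 bytes in use, the threshold is 500 bytes; as usage grows, the threshold drops, so PAUSE is sent earlier when the buffer is under pressure.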

@bobzhuyb
Owner

bobzhuyb commented Nov 2, 2016

The first picture you show is in the UDP client. It is pushing packets to the lower layer (where they are buffered), instead of actually transmitting the packets at layer 2. Just think of what happens in a real OS: when you call a socket send(), the packet is not immediately sent out. It is buffered in the local OS, waiting for the NIC to actually handle it.

Because the simulator is testing full throughput, we keep the buffer at the end host non-empty. As a result, the NIC at layer 2 will just transmit the packets at line rate, regardless of when the UDP client sends these packets. I added a random interval here just to help the case where multiple UDP flows start from the same end host.

If you want to test applications that randomly send out a few packets, you need to edit the application layer.
If you want to test when a NIC or switch misbehaves and makes the packet interval random, you need to edit qbb-net-device (layer 2).

Pick the right layer to work on... Keep in mind that when you see a send(), it just means sending from THIS layer to another layer (unless you are at layer 1/2). It does not mean sent out from a device.

@btlcmr0702
Author

Hi, it's nice of you to answer me every time.
Now I want to reproduce the experiment from your DCQCN paper (SIGCOMM 2015). Besides the topology and dataRate that I need to set in config.txt, are there any other parameters I need to set, such as those in broadcom-node.h? There are many threshold parameters.

@bobzhuyb
Owner

bobzhuyb commented Nov 2, 2016

You don't need to modify anything in the code. Just edit config files.

@btlcmr0702
Author

OK, I have tried to modify flow.txt, topology.txt, and trace.txt.
Can I know the meaning of the output in mix.tr?
For example, this is a line in mix.tr from the default config:
2.000002 /1 1.2>1.1 u 32795 0 3
What does each field mean?
By the way, where can I modify the output content above?

@btlcmr0702
Author

btlcmr0702 commented Nov 3, 2016

hi
I want to know where to set the parameter "feedback_delay".
Is it NP_SAMPLING_INTERVAL? If not, what is NP_SAMPLING_INTERVAL used for?

@bobzhuyb
Owner

bobzhuyb commented Nov 4, 2016

What is "feedback_delay"? You can add link latency in topology.txt. If you want the NIC to delay sending ACKs, you can take a look at qbb-net-device and edit the place where the NIC generates ACKs.

NP_SAMPLING_INTERVAL was for modeling older (and weaker) NICs. Sometimes they cannot capture all ECN marks. For example, when they capture one ECN mark, they have to spend some time processing it and cannot capture another one within some interval. New NICs do not have this limitation. So just keep it at 0.

@bobzhuyb
Owner

bobzhuyb commented Nov 4, 2016

The trace format:
timestamp, node_being_traced, src_ip>dst_ip, u=udp, port#, sequence#, priority

The code for trace output is scattered in different xxx-header.cc files. For example, at IP layer, you can find the src_ip>dst_ip part here:

https://github.com/bobzhuyb/ns3-rdma/blob/master/src/internet/model/ipv4-header.cc

in Ipv4Header::Print() method

If you configure a node to be traced in trace.txt, NS-3 automatically calls all headers' Print() function on every packet on this node.
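
To post-process mix.tr, a line in the format above can be split into its fields. The sketch below assumes exactly the seven whitespace-separated fields listed in this comment and uses the sample line from the question; TraceRecord and ParseTraceLine are hypothetical names, and real trace lines may carry additional fields that this parser ignores.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// One parsed trace line: timestamp, node_being_traced, src_ip>dst_ip,
// protocol letter (u = udp), port, sequence number, priority.
struct TraceRecord {
    double timestamp;
    std::string node;
    std::string srcIp;
    std::string dstIp;
    char proto;
    uint32_t port;
    uint32_t seq;
    uint32_t priority;
};

// Returns true on success; on failure the contents of 'out' are unspecified.
bool ParseTraceLine(const std::string& line, TraceRecord& out) {
    std::istringstream in(line);
    std::string addrs;  // holds "src_ip>dst_ip" before it is split
    if (!(in >> out.timestamp >> out.node >> addrs >> out.proto >>
          out.port >> out.seq >> out.priority)) {
        return false;
    }
    const std::string::size_type gt = addrs.find('>');
    if (gt == std::string::npos) {
        return false;
    }
    out.srcIp = addrs.substr(0, gt);
    out.dstIp = addrs.substr(gt + 1);
    return true;
}
```

For the sample line `2.000002 /1 1.2>1.1 u 32795 0 3`, this yields node `/1`, source `1.2`, destination `1.1`, port 32795, sequence 0, and priority 3.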

@btlcmr0702
Author

btlcmr0702 commented Nov 8, 2016

hi
I want to get the latency of every packet. Where should I work on this, and how do I get a packet's timestamp?
I have obtained the packet_id and timestamp from the seqTs header, but I don't know the format of the timestamp. It seems to be a Timestep class; how do I convert it to a value comparable with Simulator::now()?

image

@bobzhuyb
Owner

bobzhuyb commented Nov 8, 2016

TimeStep is NS-3's data structure. You should not ask it here. Search online... Or use Visual Studio's "go to definition" or "find all references"

@btlcmr0702
Author

hi
I found that the value m_pg_shared_alpha_cell * ((double)m_buffer_cell_limit_sp - m_usedIngressSPBytes[GetIngressSP(port, qIndex)]) becomes negative when I have 10 40Gb flows and set QCN to 0 and dynamic PFC to 1. Why does it become negative?
Also, is there any instruction or manual for the switch configuration? I don't know the actual meaning of these variables.

@bobzhuyb
Owner

total buffer = guaranteed buffer + shared buffer + headroom buffer (you can search "PFC headroom" to learn more about it)

buffer_cell_limit_sp is the threshold for guaranteed + shared buffer. When the buffer is very full, and some headroom is used, this will become negative.

I am sorry that I cannot send you Broadcom's confidential documents. You could directly ask them for the document of the chipset you want.
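
The buffer accounting in this reply can be written out as a tiny model to show why the difference goes negative. The struct and function names are hypothetical, and the byte values in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical split of the switch buffer, following
// total buffer = guaranteed buffer + shared buffer + headroom buffer.
struct BufferLayoutModel {
    int64_t guaranteedBytes;
    int64_t sharedBytes;
    int64_t headroomBytes;
};

int64_t TotalBufferBytes(const BufferLayoutModel& b) {
    return b.guaranteedBytes + b.sharedBytes + b.headroomBytes;
}

// Plays the role of buffer_cell_limit_sp: the threshold that covers the
// guaranteed + shared part only, without headroom.
int64_t LimitSpBytes(const BufferLayoutModel& b) {
    return b.guaranteedBytes + b.sharedBytes;
}

// Space left below the limit. Once usage spills into headroom, the used
// bytes exceed the limit and this value becomes negative.
int64_t RemainingBelowLimit(const BufferLayoutModel& b, int64_t usedBytes) {
    assert(usedBytes >= 0 && usedBytes <= TotalBufferBytes(b));
    return LimitSpBytes(b) - usedBytes;
}
```

With 1000 guaranteed, 6000 shared, and 2000 headroom bytes, 5000 bytes in use leaves 2000 bytes below the limit, while 7500 bytes in use gives -500: the buffer is not overflowing, it is using 500 bytes of headroom.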

@btlcmr0702
Author

hi
I noticed there are different mechanisms for packet dequeue, like dequeueNIC, dequeueQCN, and dequeueRR.
Can I know the differences and how they work together?
Thank you!

@bobzhuyb
Owner

Check qbb-net-device.cc, where these functions are called.

Some of them are for NIC, some of them are for switches. Some of them use round robin (i.e., RR) to decide which priority should send the packet, some of them use strict priority.
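
The two scheduling policies mentioned here differ only in how the next priority queue is chosen. The sketch below is a generic illustration, not the simulator's dequeue functions: the function names are hypothetical, and treating a lower index as higher priority is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strict priority: always serve the highest-priority non-empty queue.
// Assumption: index 0 is the highest priority. Returns -1 if all are empty.
int PickStrictPriority(const std::vector<std::size_t>& queueLengths) {
    for (std::size_t i = 0; i < queueLengths.size(); ++i) {
        if (queueLengths[i] > 0) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Round robin: start just after the queue served last time and take the
// first non-empty queue, wrapping around. Pass lastServed = -1 initially.
// Returns -1 if all queues are empty.
int PickRoundRobin(const std::vector<std::size_t>& queueLengths, int lastServed) {
    const int n = static_cast<int>(queueLengths.size());
    assert(lastServed >= -1 && lastServed < n);
    for (int step = 1; step <= n; ++step) {
        const int idx = (lastServed + step) % n;
        if (queueLengths[static_cast<std::size_t>(idx)] > 0) {
            return idx;
        }
    }
    return -1;
}
```

With queue lengths {0, 2, 0, 5}, strict priority keeps choosing queue 1 until it drains, whereas round robin alternates between queues 1 and 3.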

@btlcmr0702
Author

hi
I want to know when a packet enters and departs the switch. Where should I print the time?
Thank you

@bobzhuyb
Owner

Check the NS-3 tracing methods, like m_phyTxEndTrace, m_snifferTrace, ... etc. in qbb-net-device.cc

@btlcmr0702
Author

hi
I met a strange problem.
I thought the delay between when a packet is generated and when it enters the switch would equal the link_delay I set in the configuration file (only one hop), but I found it does not.
I tried to find the reason, but I can't understand some code in the function send() in udp-client.cc:
image
Why does send_interval vary each time? Is this for adjusting the speed? And what does the parameter buffer mean?
Thank you!

@bobzhuyb
Owner

bobzhuyb commented Dec 5, 2016

This piece of the code is controlling how the packets enter the udp sender's local buffer. Since this buffer usage can change, the time it takes for a packet to reach the first switch can vary.

Please check again this answer: #3 (comment)

@btlcmr0702
Author

Thank you for your patient reply :)
Now, to do some more experiments, I need to run the simulation on Linux.
How can I do this? Can I just copy the source code into the original NS-3 code on Linux and compile?

@bobzhuyb
Owner

bobzhuyb commented Dec 7, 2016

You need to edit the wscript file of each module, e.g., https://github.com/bobzhuyb/ns3-rdma/blob/master/src/point-to-point/wscript

There may also be slight differences between gcc and vc++. But it should not be hard to fix the code.

The most convenient way is actually to use WINE and run the exe binary directly. WINE 1.6.2 (you can install it using apt-get in Ubuntu 16.04) should work just fine. Put the binary and config files in the same folder, and run:

wine64 main.exe config.txt

@btlcmr0702
Author

Oh, do you mean I still use VS to fix and compile the source code, getting main.exe each time, and then copy the .exe file and config file to Linux and run it with WINE?
I will try, thank you.

@bobzhuyb
Owner

bobzhuyb commented Dec 8, 2016

Yes.

@btlcmr0702
Author

btlcmr0702 commented Dec 8, 2016

Hi, I tried to use wine, but it doesn't run correctly, as shown here (if it ran correctly, it would print the many logs I set):
image
I have copied the .exe, .lib, and config files into the same folder. It still shows me:
image
What's the problem? By the way, my Ubuntu is 14.04.

@bobzhuyb
Owner

bobzhuyb commented Dec 8, 2016

These warnings should not matter.

Have you fixed the paths of TOPOLOGY_FILE, FLOW_FILE, etc? If you put all config files in the same folder, the path should be
TOPOLOGY_FILE xxx.txt
instead of
TOPOLOGY_FILE mix/xxx.txt

Also, I don't know whether other WINE versions except 1.6.2 would work or not.

@btlcmr0702
Author

I changed the path of TOPOLOGY_FILE, etc., and it seems to work well.
I use wine 1.7.
Thank you! 👍
But having to switch from Windows (to fix the code) to Linux (to run the simulation) every time is inconvenient. I will try the other way you mentioned before. Is fixing the wscript file enough, and how do I fix it?

@bobzhuyb
Owner

bobzhuyb commented Dec 9, 2016

You need to add the files that were not in the original NS-3 version to wscript. This is more of an NS-3 problem. You can search questions like "how to add a new module to NS-3" online. I haven't compiled it in Linux for years. So you are pretty much on your own.

@btlcmr0702
Author

btlcmr0702 commented Dec 13, 2016

OK, I will try, and maybe I can write a guide on how to run the project on Linux.
Today I found that some parameters look strange, such as:
m_pg_hdrm_limit = 100 * 1030; //ingress pg headroom
The switch you use in NS-3 has 64 ports and 8 priorities (queues), so when every port and queue uses its headroom, the space will be 64 * 8 * 100 * 1030 = 52,736,000 bytes (about 52.7 MB)! But the total buffer of the switch is 9 MB. It seems unreasonable.
Thank you for your reply.

@bobzhuyb
Owner

You can lower the threshold, but be careful it's related to link delay. If it's too low, PFC will still drop packets.

Because I never run that many ports and priorities, it did not matter for me. This is what happens in reality -- the number of lossless priorities is very limited in practice due to lack of buffer!
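
The worst-case headroom arithmetic discussed here is easy to check. The helper below is a hypothetical name; the only inputs taken from the thread are 64 ports, 8 priorities, and m_pg_hdrm_limit = 100 * 1030 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Headroom needed if every (port, priority) pair reserves its full
// per-priority-group headroom at the same time.
uint64_t WorstCaseHeadroomBytes(uint64_t ports, uint64_t losslessPriorities,
                                uint64_t headroomPerPgBytes) {
    assert(headroomPerPgBytes > 0);
    return ports * losslessPriorities * headroomPerPgBytes;
}
```

`WorstCaseHeadroomBytes(64, 8, 100 * 1030)` gives 52,736,000 bytes, far more than a 9 MB buffer, while the same switch with 2 lossless priorities needs 13,184,000 bytes and with 1 needs 6,592,000 bytes. This is the practical reason the number of lossless priorities stays small.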

@btlcmr0702
Author

hi
I can't understand some parameters and code, and there are no comments about them. Could you explain their meaning?
qbb-net-device.h:
uint32_t m_findex_udpport_map[fCnt];
uint32_t m_findex_qindex_map[fCnt];
QbbNetDevice::GetUsedBuffer(uint32_t port, uint32_t qIndex) (when qcnenabled)
broadcom-egress-queue.h:
uint32_t m_fcount;
uint32_t m_rrlast;
uint32_t m_qlast;

Thank you!

@bobzhuyb
Owner

These are data structures that help track some internal states in the NICs/switches. You can read the meanings from the names. "f" means flow, "q" means queue. e.g., findex is flow index. fcount is flow count. rrlast is the last queue in round robin scheduler. qlast is the last queue.

@btlcmr0702
Author

hi
I noticed that in your DCQCN paper you set pMAX to 1%,
but in the simulation you set pMAX to 1.
Why are they different?

@bobzhuyb
Owner

There is no particular reason. The config.txt is just an example. You can edit it to make it consistent with the paper, or try your own parameters.

@btlcmr0702
Author

Thank you for your reply.
I am confused by some parameters like inDev and m_ifindex; it seems both of them refer to the port number.
In the function BroadcomNode::ShouldSendCN(uint32_t indev, uint32_t ifindex, uint32_t qIndex),
you don't use the parameter indev, but you use ifindex in m_usedEgressQSharedBytes[ifindex][qIndex].
Does this parameter count all the packets in the egress queue? And what does m_usedEgressSPBytes[GetEgressSP(indev, qIndex)] mean?

@bobzhuyb
Owner

bobzhuyb commented Jan 3, 2017

inDev is the ingress port, ifindex is the egress port.

m_usedEgressSPBytes[GetEgressSP(indev, qIndex)] is Service Pool (SP) buffer usage. This is some Broadcom stuff.

@btlcmr0702
Author

btlcmr0702 commented Jan 6, 2017

Thank you for your reply first 👍
Now I find there is some delay between when a packet is generated and when it enters the switch. The delay can be up to hundreds of microseconds. I wonder why, so I tried to follow the transmission process of a packet.
First, the packet is generated in udp-client.cc, which then invokes m_socket->Send (p).
Then it goes to socket.cc and does this:
image
I can't find the next step. VS tells me it invokes the function in packet-socket.cc, but that does not seem to be the case. It confuses me :(

I find that a packet waits for a long time (hundreds of microseconds) in the NIC before it leaves.

@btlcmr0702
Author

btlcmr0702 commented Jan 8, 2017

hi
Sorry to ask you again.
I traced the number of packets in m_queue and found that it often suddenly becomes 0, and that this is caused by initialization of m_queue. Why is it initialized so often, and where does this happen?

@btlcmr0702
Author

hi
I have a problem: how can I set the path I want a packet to take to its destination?
I mean, if node A connects to B, and B connects to C and D, how can I make a packet go A-B-D rather than A-B-C?
I want to test a victim-flow example.
Thank you!

@bobzhuyb
Owner

You can add static routes to the routers.

https://www.nsnam.org/doxygen/classns3_1_1_ipv4_static_routing.html
