Some questions about how modules work in simulator. #3

btlcmr0702 · 2016-11-01T02:11:56Z

Hi.
I have run main.exe successfully, and now I want to implement an algorithm on the simulator.
Can you explain how the simulator works, for example the relationship between qbb-device and broadcom-node, and the role of qbb-device in the simulation?
I also want to know where, and by which component, the decision to send a PFC packet is made.
Thank you for your reply.

bobzhuyb · 2016-11-01T02:59:08Z

Each qbb-net-device is an L2 device, which can be a switch OR a NIC. If it's a switch, it will have a broadcom-node (m_broadcom) and a broadcom queue (m_queue) attached to it. (This is not accurate, see below)

If it's a switch, upon receiving a packet, it asks m_broadcom whether there is still space for the new packet. If so, it pushes the packet into m_queue and asks m_broadcom whether the PFC threshold is met. If so, it sends a PFC PAUSE upstream.

Upon sending a packet, it gets the packet from m_queue and asks m_broadcom whether the queue length has fallen below the PFC threshold. If so, it sends a PFC RESUME.

If it's a NIC, it will perform DCQCN rate control.

You can find most of these details in QbbNetDevice::Receive and QbbNetDevice::DequeueAndTransmit
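The enqueue/dequeue checks described above can be sketched as a toy model. This is only an illustration, not the simulator's code: the class name, the byte-based accounting, and the use of two separate thresholds (one for PAUSE, one for RESUME) are all assumptions. Setting both thresholds to the same value gives the single-threshold behaviour described in the text.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of one ingress queue in the shared switch buffer.
// Every name and threshold here is a hypothetical stand-in, not simulator code.
class ToyIngressQueue {
public:
    enum class Action { None, SendPause, SendResume, Drop };

    ToyIngressQueue(uint32_t capacityBytes, uint32_t pauseAtBytes, uint32_t resumeAtBytes)
        : capacity_(capacityBytes), pauseAt_(pauseAtBytes), resumeAt_(resumeAtBytes) {
        assert(resumeAtBytes <= pauseAtBytes && pauseAtBytes <= capacityBytes);
    }

    // A packet arrives: admit it if there is space, then check the PAUSE threshold.
    Action OnEnqueue(uint32_t packetBytes) {
        if (packetBytes > capacity_ - used_) return Action::Drop;  // no space left
        used_ += packetBytes;
        if (!upstreamPaused_ && used_ >= pauseAt_) {
            upstreamPaused_ = true;
            return Action::SendPause;
        }
        return Action::None;
    }

    // A packet leaves: release its bytes, then check the RESUME threshold.
    Action OnDequeue(uint32_t packetBytes) {
        assert(packetBytes <= used_);
        used_ -= packetBytes;
        if (upstreamPaused_ && used_ <= resumeAt_) {
            upstreamPaused_ = false;
            return Action::SendResume;
        }
        return Action::None;
    }

private:
    uint32_t capacity_;
    uint32_t pauseAt_;
    uint32_t resumeAt_;
    uint32_t used_ = 0;
    bool upstreamPaused_ = false;
};
```

In this sketch a PAUSE is emitted only once per congestion episode, and a RESUME only after a PAUSE has been sent.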

btlcmr0702 · 2016-11-01T03:14:36Z

Oh, so it's very different from the switch mechanisms in NS-3 such as the bridge module, and every switch port is also abstract: it just counts how many bytes the port receives?
OK, but I found the pause_time parameter. So whether the upstream resumes sending depends on either the RESUME message from downstream or pause_time. Do they work together? I mean, is the pause time the last safeguard against dropping packets?
And what do terms like sp, pg, and rpr mean? Does SP stand for strict priority? Does pg stand for priority guarantee?
Thank you very much :)

bobzhuyb · 2016-11-01T06:55:09Z

Sorry, I think I made a small mistake above. qbb-net-device is either a NIC port or a switch port. Please bear with me... I wrote this code long ago. broadcom-node and broadcom queue are attached to the node, not the qbb-net-device. A node can have multiple qbb-net-devices (especially on a switch), which share the same m_broadcom and m_queue.

pg is Priority Group, some people call it priority class, or simply "priority".
sp is Service Pool. It's a shared buffer that multiple pg can share.
These are the terms from Broadcom.

The rpr... terms come from the QCN standard. You can refer to http://www.cs.ucr.edu/~mart/204/802-1au-d2-4.pdf (page 96, figure 32-2)

btlcmr0702 · 2016-11-01T07:10:53Z

Thanks first.
I have studied the code, but I am confused.
I looked at QbbNetDevice::Receive but can't find where it asks m_broadcom to check the available space. I found that logic in QbbNetDevice::send() instead, which seems unreasonable because it should be in Receive(), and I don't know who invokes send() or where.
I am also eager to know how and where PFC is triggered.
It seems to be very late in the USA. Have a good night.

bobzhuyb · 2016-11-01T07:28:30Z

You are right... they are in send(), because send() is where the packets are put into the sending queue (MMU), ready to be sent to the next hop. It is invoked by the upper layer.

The name "send" is inherited from point-to-point-device. It's called "send" because the packet is actually leaving this qbb-net-device (the ingress port). On a switch, the packet is heading towards another qbb-net-device (the egress port on the same switch). For this purpose, it must be buffered in the broadcom queue module, where it waits for the egress qbb-net-device to invoke dequeueandtransmit() and get this packet.

dequeueandtransmit() is where the packet actually leaves the switch.

btlcmr0702 · 2016-11-01T08:01:44Z

When you say the upper layer invokes send(), do you mean broadcom-node or broadcom queue? I can't find where send() is invoked.
To implement my algorithm, I am eager to know where PFC is triggered and how the PFC mechanism works.

bobzhuyb · 2016-11-01T09:59:20Z

PFC generation is in QbbNetDevice::CheckQueueFull.

QbbNetDevice::Receive handles received PFC.

The implemented PFC mechanism is just the standard one. Once a port is paused, it waits either for a RESUME or for the PFC pause timeout.
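A minimal sketch of that paused-port behaviour, assuming a microsecond clock supplied by the caller. The class and method names are hypothetical and are not taken from qbb-net-device.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pause state for one priority on one port.
class PauseState {
public:
    // PAUSE received at nowUs: stop sending until RESUME or until the timer expires.
    void OnPause(uint64_t nowUs, uint64_t pauseTimeUs) {
        paused_ = true;
        expiresAtUs_ = nowUs + pauseTimeUs;
    }

    // RESUME received: sending may continue immediately.
    void OnResume() { paused_ = false; }

    // True if the port may transmit at nowUs.
    bool CanSend(uint64_t nowUs) {
        if (paused_ && nowUs >= expiresAtUs_) {
            paused_ = false;  // pause timeout reached without a RESUME
        }
        return !paused_;
    }

private:
    bool paused_ = false;
    uint64_t expiresAtUs_ = 0;
};
```

Whichever event comes first, the RESUME or the timeout, lifts the pause in this sketch.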

send() is invoked by the common NS-3 pipeline. Nothing special there either. It just overrides its parent class method PointToPointNetDevice::Send().

btlcmr0702 · 2016-11-01T10:15:10Z

Yeah, I found that BroadcomNode::GetPauseClasses decides whether a priority queue should send PFC.
I also noticed dynamtic_pfc_threshold. What's the difference between dynamic and ordinary PFC? What does the parameter m_pg_shared_alpha_cell mean in the dynamic case?
Can you explain what this code means?

bobzhuyb · 2016-11-01T16:40:02Z

It's the dynamic PFC threshold from Broadcom. The DCQCN paper (http://yibozhu.com/doc/dcqcn-sigcomm15.pdf) says: "The Trident II chipset in our switch allows us to configure a parameter β such that"

The β is the m_pg_shared_alpha_cell. The paper calls it β because DCQCN already has an \alpha. Broadcom calls this value alpha.

If you don't understand it and don't need it, you may disable the dynamic threshold in config.txt.
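As a rough illustration of a dynamic threshold, the sketch below scales the remaining shared space by alpha, following the shape of the expression quoted later in this thread (alpha times the pool limit minus the pool usage). The function names and the pause comparison are assumptions, not the simulator's exact logic.

```cpp
#include <cassert>

// Dynamic threshold: a fraction alpha of the shared space that is still free.
// The fuller the shared pool gets, the lower the threshold becomes.
double DynamicPfcThreshold(double alpha, double poolLimitBytes, double poolUsedBytes) {
    return alpha * (poolLimitBytes - poolUsedBytes);
}

// Hypothetical pause check: pause when this queue's own usage exceeds the threshold.
bool ShouldPause(double queueUsedBytes, double alpha,
                 double poolLimitBytes, double poolUsedBytes) {
    return queueUsedBytes > DynamicPfcThreshold(alpha, poolLimitBytes, poolUsedBytes);
}
```

With a static threshold the comparison value is a constant; here it shrinks as the shared pool fills, so a queue is paused earlier when the switch is under pressure.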

bobzhuyb · 2016-11-02T02:47:06Z

The first picture you show is in the udp client. It is pushing packets to the lower layer (where they are buffered), instead of actually transmitting the packets at layer 2. Just think of what happens in a real OS: when you call a socket send(), the packet is not immediately sent out. It is buffered in the local OS, waiting for the NIC to actually handle it.

Because the simulator is testing full throughput, we keep the buffer at the end host non-empty. As a result, the NIC at layer 2 will just transmit the packets at line rate, regardless of when the UDP client sends them. I added a random interval here just to help the case where multiple UDP flows start from the same end host.

If you want to test applications that randomly send out a few packets, you need to edit the application layer.
If you want to test what happens when a NIC or switch misbehaves and makes the packet interval random, you need to edit qbb-net-device (layer 2).

Pick the right layer to work on... Keep in mind that when you see a send(), it just means sending from THIS layer to another layer (unless you are at layer 1/2). It does not mean sent out from a device.

btlcmr0702 · 2016-11-02T11:35:41Z

Hi, it's nice of you to answer me every time.
Now I want to reproduce the experiments from your DCQCN paper (SIGCOMM 2015). Apart from the topology and data rate, which I need to set in config.txt, are there any other parameters I need to set? For example, broadcom-node.h has many threshold parameters.

bobzhuyb · 2016-11-02T21:43:13Z

You don't need to modify anything in the code. Just edit config files.

btlcmr0702 · 2016-11-03T07:15:05Z

OK, I have tried modifying flow.txt, topology.txt, and trace.txt.
Can you explain the meaning of the output in mix.tr?
For example, this is a line from mix.tr with the default config:
2.000002 /1 1.2>1.1 u 32795 0 3
What does each field mean?
By the way, where can I modify the output content above?

btlcmr0702 · 2016-11-03T13:49:15Z

Hi,
I want to know where to set the parameter "feedback_delay".
Is it NP_SAMPLING_INTERVAL? If not, what is NP_SAMPLING_INTERVAL used for?

bobzhuyb · 2016-11-04T03:59:02Z

What is "feedback_delay"? You can add link latency in topology.txt. If you want the NIC to delay sending ACKs, take a look at qbb-net-device and edit the place where the NIC generates ACKs.

NP_SAMPLING_INTERVAL was for modeling older (and weaker) NICs. Sometimes they cannot capture all ECN marks. For example, when they capture one ECN mark, they have to spend some time processing it and cannot capture another one within some interval. New NICs do not have this limitation. So just keep it at 0.
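The sampling limitation described above can be pictured with a small sketch. This is an assumption about how such a limit could be modeled, not the simulator's implementation. The class name and the microsecond clock are hypothetical, and timestamps are assumed to be non-decreasing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a NIC that can only capture one ECN mark per interval.
class EcnSampler {
public:
    explicit EcnSampler(uint64_t intervalUs) : intervalUs_(intervalUs) {}

    // Returns true if the ECN mark arriving at nowUs is captured,
    // false if the NIC is still busy processing the previous one.
    bool OnEcnMark(uint64_t nowUs) {
        if (hasCaptured_ && nowUs - lastCaptureUs_ < intervalUs_) {
            return false;  // arrived inside the blind interval: missed
        }
        hasCaptured_ = true;
        lastCaptureUs_ = nowUs;
        return true;
    }

private:
    uint64_t intervalUs_;
    uint64_t lastCaptureUs_ = 0;
    bool hasCaptured_ = false;
};
```

With an interval of 0 every mark is captured, which corresponds to the "new NIC" case.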

bobzhuyb · 2016-11-04T04:07:40Z

The trace format:
timestamp, node_being_traced, src_ip>dst_ip, u=udp, port#, sequence#, priority

The code for trace output is scattered in different xxx-header.cc files. For example, at IP layer, you can find the src_ip>dst_ip part here:

https://github.com/bobzhuyb/ns3-rdma/blob/master/src/internet/model/ipv4-header.cc

in Ipv4Header::Print() method

If you configure a node to be traced in trace.txt, NS-3 automatically calls all headers' Print() function on every packet on this node.
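For post-processing, a line in the format listed above could be split into fields along these lines. This is a sketch that assumes whitespace-separated fields exactly as in the sample line quoted earlier in the thread (`2.000002 /1 1.2>1.1 u 32795 0 3`). The struct and function names are hypothetical, and other packet types in mix.tr may print different fields.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// One parsed trace line; field names follow the format described above.
struct TraceRecord {
    double timestamp;
    std::string node;   // node being traced, e.g. "/1"
    std::string srcIp;
    std::string dstIp;
    char proto;         // 'u' = udp
    uint32_t port;
    uint32_t seq;
    uint32_t priority;
};

// Returns false if the line does not match the expected layout.
bool ParseTraceLine(const std::string& line, TraceRecord& out) {
    std::istringstream in(line);
    std::string addrs;  // "src_ip>dst_ip"
    if (!(in >> out.timestamp >> out.node >> addrs >> out.proto
             >> out.port >> out.seq >> out.priority)) {
        return false;
    }
    const std::string::size_type gt = addrs.find('>');
    if (gt == std::string::npos) return false;
    out.srcIp = addrs.substr(0, gt);
    out.dstIp = addrs.substr(gt + 1);
    return true;
}
```

Lines that do not fit this layout are simply rejected, so the caller can skip them.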

btlcmr0702 · 2016-11-08T13:14:31Z

Hi,
I want to get the latency of every packet. Where should I work on this, and how do I get the timestamp of a packet?
I have got the packet_id and timestamp from the seqTs header, but I don't know the format of the timestamp. It seems to be a TimeStep class. How do I convert it into a value comparable with Simulator::Now()?

bobzhuyb · 2016-11-08T16:20:50Z

TimeStep is an NS-3 data structure, so this is not the place to ask about it. Search online, or use Visual Studio's "go to definition" or "find all references".

btlcmr0702 · 2016-11-10T11:32:57Z

Hi,
I found that the value m_pg_shared_alpha_cell * ((double)m_buffer_cell_limit_sp - m_usedIngressSPBytes[GetIngressSP(port, qIndex)]) becomes negative when I have ten 40Gb flows and set QCN to 0 and dynamic PFC to 1. Why does it become negative?
Can I get some instructions or a manual about the switch configuration? I don't know the actual meaning of these variables.

bobzhuyb · 2016-11-10T14:55:21Z

total buffer = guaranteed buffer + shared buffer + headroom buffer (you can search "PFC headroom" to learn more about it)

buffer_cell_limit_sp is the threshold for guaranteed + shared buffer. When the buffer is very full, and some headroom is used, this will become negative.
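A small sketch of why the difference goes negative. The function name and the sample numbers used with it are made up for illustration. The point is only that usage can exceed the guaranteed + shared limit once headroom is being consumed, and that the subtraction has to be done in a signed (or floating-point) type to show it.

```cpp
#include <cassert>
#include <cstdint>

// Space left below the guaranteed + shared limit.
// A negative result means the excess is sitting in headroom.
int64_t RemainingBelowLimit(uint32_t limitBytes, uint32_t usedBytes) {
    // Convert before subtracting: uint32_t arithmetic would wrap around
    // to a huge positive number instead of going negative.
    return static_cast<int64_t>(limitBytes) - static_cast<int64_t>(usedBytes);
}
```

For example, a limit of 1000 bytes with 1200 bytes in use gives -200, meaning 200 bytes of headroom are occupied.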

I am sorry that I cannot send you Broadcom's confidential documents. You could directly ask them for the document of the chipset you want.

btlcmr0702 · 2016-11-14T05:05:45Z

Hi,
I noticed there are different mechanisms for packet dequeue, like dequeueNIC, dequeueQCN, and dequeueRR.
Can you explain the differences and how they work together?
Thank you!

bobzhuyb · 2016-11-14T05:18:43Z

Check qbb-net-device.cc, where these functions are called.

Some of them are for NIC, some of them are for switches. Some of them use round robin (i.e., RR) to decide which priority should send the packet, some of them use strict priority.
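The two scheduling policies mentioned above differ only in how the next queue is picked. The sketch below is generic and not taken from qbb-net-device.cc. It assumes index 0 is the highest priority, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strict priority: always serve the highest-priority non-empty queue.
// Assumption: a lower index means a higher priority. Returns -1 if all are empty.
int PickStrictPriority(const std::vector<std::size_t>& queueLengths) {
    for (std::size_t i = 0; i < queueLengths.size(); ++i) {
        if (queueLengths[i] > 0) return static_cast<int>(i);
    }
    return -1;
}

// Round robin: start searching at the queue after the one served last,
// so every non-empty queue gets a turn. Returns -1 if all are empty.
int PickRoundRobin(const std::vector<std::size_t>& queueLengths, std::size_t lastServed) {
    const std::size_t n = queueLengths.size();
    if (n == 0) return -1;
    assert(lastServed < n);
    for (std::size_t k = 1; k <= n; ++k) {
        const std::size_t i = (lastServed + k) % n;
        if (queueLengths[i] > 0) return static_cast<int>(i);
    }
    return -1;
}
```

Under strict priority a busy high-priority queue can starve the others, while round robin keeps rotating among all queues that have packets.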

btlcmr0702 · 2016-11-22T13:43:42Z

Hi,
I want to know when a packet enters and departs the switch. Where should I print the time?
Thank you.

bobzhuyb · 2016-11-23T22:11:01Z

Check the NS-3 tracing methods, like m_phyTxEndTrace, m_snifferTrace, ... etc. in qbb-net-device.cc

btlcmr0702 · 2016-12-05T13:46:10Z

Hi,
I have run into some strange problems.
I thought the delay between when a packet is generated and when it enters the switch would equal the link_delay I set in the configuration file (only one hop), but I found that it does not.
I then tried to find the reason, but I can't understand some of the code in the function send() in udp-client.cc:

Why does send_interval vary each time? Is this for adjusting the speed? And what does the parameter buffer mean?
Thank you!

bobzhuyb · 2016-12-05T18:19:59Z

This piece of code controls how packets enter the udp sender's local buffer. Since the usage of this buffer can change, the time it takes for a packet to reach the first switch can vary.

Please check this answer again: #3 (comment)

btlcmr0702 · 2016-12-07T12:51:20Z

Thank you for your patient reply :)
Now, to do some more experiments, I need to run the simulation on Linux.
How can I do this? Can I just copy the src code into the original NS-3 code on Linux and compile?

bobzhuyb · 2016-12-07T18:43:34Z

You need to edit the wscript file of each module, e.g., https://github.com/bobzhuyb/ns3-rdma/blob/master/src/point-to-point/wscript

There may also be slight differences between gcc and vc++. But it should not be hard to fix the code.

The most convenient way is actually to use WINE and run the exe binary directly. WINE 1.6.2 (you can install it using apt-get in Ubuntu 16.04) should work just fine. Put the binary and config files in the same folder, and run:

wine64 main.exe config.txt

btlcmr0702 · 2016-12-08T04:33:40Z

Oh, do you mean I still use VS to fix and compile the source code, so I get a new main.exe each time, and then I copy the .exe file and config files to Linux and use WINE to run it?
I will try. Thank you.

bobzhuyb · 2016-12-08T05:15:49Z

Yes.

btlcmr0702 · 2016-12-08T08:45:50Z

Hi, I tried to use WINE, but it does not run correctly, like this (if it ran correctly it would print the many logs I set):

I have copied the .exe, .lib, and config files into the same folder. It still shows me:

What's the problem? By the way, my Ubuntu is 14.04.

bobzhuyb · 2016-12-08T08:52:36Z

These warnings should not matter.

Have you fixed the paths of TOPOLOGY_FILE, FLOW_FILE, etc? If you put all config files in the same folder, the path should be
TOPOLOGY_FILE xxx.txt
instead of
TOPOLOGY_FILE mix/xxx.txt

Also, I don't know whether WINE versions other than 1.6.2 would work.

btlcmr0702 · 2016-12-08T09:22:39Z

I changed the paths of TOPOLOGY_FILE, etc., and it seems to work well.
I use WINE 1.7.
Thank you! 👍
But every time I have to switch from Windows (to fix the code) to Linux (to run the simulation), which is inconvenient. I will try the other way you mentioned before. Is fixing the wscript files enough, and how do I fix them?

bobzhuyb · 2016-12-09T08:24:16Z

You need to add the files that were not in the original NS-3 version to wscript. This is more of an NS-3 problem. You can search questions like "how to add a new module to NS-3" online. I haven't compiled it in Linux for years. So you are pretty much on your own.

btlcmr0702 · 2016-12-13T02:27:18Z

OK, I will try, and maybe I can write a guide on how to run the project on Linux.
Today I found that some parameters look strange, such as:
m_pg_hdrm_limit = 100 * 1030; //ingress pg headroom
The switch you use in NS-3 has 64 ports and 8 priorities (queues), so when every port and queue uses its headroom, the space will be 64 * 8 * 100 * 1030 = 52,736,000 bytes, about 52.7 MB! But the total buffer of the switch is 9MB. It seems unreasonable.
Thank you for your reply.

bobzhuyb · 2016-12-13T03:44:12Z

You can lower the threshold, but be careful: it's related to link delay. If it's too low, packets will still be dropped even with PFC.

Because I never ran that many ports and priorities, it did not matter for me. This is what happens in reality: the number of lossless priorities is very limited in practice due to lack of buffer!
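The worst-case headroom arithmetic from the question above can be checked with a one-line helper. This is only a sketch: the function name is hypothetical, and the result is treated as a byte count, following the reading in the question.

```cpp
#include <cassert>
#include <cstdint>

// Worst-case headroom if every (port, priority) pair fills its own headroom.
constexpr uint64_t WorstCaseHeadroom(uint64_t ports, uint64_t priorities,
                                     uint64_t perPgHeadroom) {
    return ports * priorities * perPgHeadroom;
}

// Values quoted in the thread: 64 ports, 8 priorities, 100 * 1030 per priority group.
static_assert(WorstCaseHeadroom(64, 8, 100 * 1030) == 52736000ULL,
              "64 * 8 * 100 * 1030 = 52,736,000");
```

The total scales linearly with the number of priorities that actually reserve headroom, which is why limiting the number of lossless priorities keeps the requirement manageable.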

btlcmr0702 · 2016-12-15T03:21:56Z

Hi,
I can't understand some parameters and code, and there are no comments about them. Could you explain what they mean?
qbb-net-device.h:
uint32_t m_findex_udpport_map[fCnt];
uint32_t m_findex_qindex_map[fCnt];
QbbNetDevice::GetUsedBuffer(uint32_t port, uint32_t qIndex) (when qcnenabled)
broadcom-egress-queue.h:
uint32_t m_fcount;
uint32_t m_rrlast;
uint32_t m_qlast;

Thank you!

bobzhuyb · 2016-12-16T06:55:37Z

These are data structures that help track some internal states in the NICs/switches. You can read the meanings from the names. "f" means flow, "q" means queue. e.g., findex is flow index. fcount is flow count. rrlast is the last queue in round robin scheduler. qlast is the last queue.

btlcmr0702 · 2016-12-29T13:06:45Z

Hi,
I noticed that in your DCQCN paper you set pMAX to 1%,
but in the simulation you set pMAX to 1.
Why are they different?

bobzhuyb · 2016-12-29T19:09:41Z

There is no particular reason. The config.txt is just an example. You can edit it to make it consistent with the paper, or try your own parameters.

btlcmr0702 · 2017-01-02T02:43:12Z

Thank you for your reply.
I am confused by some parameters like inDev and m_ifindex; both seem to refer to the port number.
In the function BroadcomNode::ShouldSendCN(uint32_t indev, uint32_t ifindex, uint32_t qIndex),
you don't use the parameter indev, but you do use ifindex in m_usedEgressQSharedBytes[ifindex][qIndex].
Does this value count all the packets in the egress queue? And what does m_usedEgressSPBytes[GetEgressSP(indev, qIndex)] mean?

bobzhuyb · 2017-01-03T20:51:27Z

inDev is the ingress port, ifindex is the egress port.

m_usedEgressSPBytes[GetEgressSP(indev, qIndex)] is Service Pool (SP) buffer usage. This is some Broadcom stuff.

btlcmr0702 · 2017-01-06T07:26:36Z

Thank you for your reply first 👍
Now I find there is some delay between when a packet is generated and when it enters the switch. The delay can be up to hundreds of microseconds. I wonder why, so I tried to follow the transmission of a packet.
First the packet is generated in udp-client.cc, which then invokes the function m_socket->Send (p).
Then it goes to socket.cc and does this:

I can't find the next step. VS tells me it invokes the function in packet-socket.cc, but that doesn't seem right. It confuses me :(

I find that a packet waits a long time (hundreds of microseconds) in the NIC before it leaves.

btlcmr0702 · 2017-01-08T13:08:44Z

Hi,
Sorry to ask you again.
I traced the number of packets in m_queue and found that it often suddenly becomes 0, and that this is caused by m_queue being initialized. Why is it initialized so often, and where does this happen?

btlcmr0702 · 2017-05-20T01:34:15Z

Hi,
I have a problem: how can I set the path a packet takes to its destination?
I mean, if node A connects to B, and B connects to C and D, how can I make a packet go A-B-D rather than A-B-C?
I want to test a victim-flow example.
Thank you!

bobzhuyb · 2017-05-21T08:26:58Z

You can add static routes to the routers.

https://www.nsnam.org/doxygen/classns3_1_1_ipv4_static_routing.html

