# Problem 1

nasdaq.pcap is a packet capture (PCAP) of market data: a series of UDP packets carrying
NASDAQ ITCH 5.0 market data over the MoldUDP64 protocol. Each MoldUDP64 packet has a
sequence number. Python has libraries for reading PCAP files as well as for packing and
unpacking binary data. Documentation for these formats is available online, including on
NASDAQ's website.

Write a program that checks if the sequence numbers of the MoldUDP64 packets are
ordered. Determine the first and last sequence numbers of the sample data, as well as any
missing sequence numbers/gaps.

In [47]:
import pandas as pd
import sys
import re
from scapy.all import *
import struct

In [2]:
p = rdpcap("nasdaq.pcap")

In [3]:
p

<nasdaq.pcap: TCP:0 UDP:2047 ICMP:0 Other:0>

In [4]:
len(p)

2047

In [126]:
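pkt = p[1]
hexdump(pkt)               # full frame: Ethernet + IP + UDP + MoldUDP64 payload
hexdump(pkt.load)          # UDP payload only: MoldUDP64 header, then message blocks
hexdump(pkt.load[11:18])   # low 7 bytes of the 8-byte sequence number (full field: load[10:18])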

0000  01 00 5E 36 0C 6F 00 1C 73 26 69 1D 08 00 45 00  ..^6.o..s&i...E.
0010  00 56 F9 AC 40 00 15 11 27 F2 CE C8 7F 8A E9 36  .V..@...'......6
0020  0C 6F CB 76 67 6D 00 42 98 B9 30 30 30 30 30 37  .o.vgm.B..000007
0030  35 32 35 42 00 00 00 00 1C DE 03 BF 00 01 00 24  525B...........$
0040  41 1F E4 00 00 37 24 E0 59 E1 88 00 00 00 00 1A  A....7$.Y.......
0050  0A F5 B8 42 00 00 02 BC 58 49 56 20 20 20 20 20  ...B....XIV     
0060  00 04 F3 30                                      ...0
0000  30 30 30 30 30 37 35 32 35 42 00 00 00 00 1C DE  000007525B......
0010  03 BF 00 01 00 24 41 1F E4 00 00 37 24 E0 59 E1  .....$A....7$.Y.
0020  88 00 00 00 00 1A 0A F5 B8 42 00 00 02 BC 58 49  .........B....XI
0030  56 20 20 20 20 20 00 04 F3 30                    V     ...0
0000  00 00 00 1C DE 03 BF                             .......
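
The payload dump above lines up with the MoldUDP64 downstream header: a 10-byte session, an 8-byte big-endian sequence number, and a 2-byte message count. Below is a minimal sketch of decoding those 20 bytes with `struct`. The helper name `parse_mold_header` and the constant `MOLD_HEADER` are illustrative, not part of scapy or of the notebook, and the sample bytes are copied from the dump of `p[1]`.

```python
import struct

# Big-endian, no padding: 10-byte session, 8-byte sequence number, 2-byte message count.
MOLD_HEADER = struct.Struct(">10sQH")

def parse_mold_header(payload):
    """Decode the first 20 bytes of a UDP payload as a MoldUDP64 header (hypothetical helper)."""
    session, sequence, count = MOLD_HEADER.unpack_from(payload, 0)
    return session.decode("ascii"), sequence, count

# First 20 payload bytes of p[1], taken from the hexdump above.
sample = bytes.fromhex("30303030303735323542" "000000001CDE03BF" "0001")
print(parse_mold_header(sample))  # ('000007525B', 484311999, 1)
```

Reading the full 8-byte field at offset 10 avoids dropping the most significant byte, which only happens to be zero in this capture.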


In [133]:
setE = {'a', 'a', 'b', 'v'}
setE.add('b')
setE

{'a', 'b', 'v'}

In [154]:
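seq = []          # sequence number of every packet, in capture order
seqClean = []     # same, without the packets whose sequence number is 0
missingData = []  # 0-based indices of packets whose sequence number is 0
print("these are the missing numbers by index in dataset:")
for i in range(len(p)):
    # MoldUDP64 header: 10-byte session, then an 8-byte big-endian sequence number
    tmp = p[i].load[10:18]
    tmp = int.from_bytes(tmp, "big")
    if tmp == 0:
        print(i + 1)  # printed as a 1-based index
        missingData.append(i)
    else:
        seqClean.append(tmp)
    seq.append(tmp)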


these are the missing numbers by index in dataset:
1
94
161
224
358
520
574
734
886
1052
1341
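
The loop above flags packets whose sequence field is 0. A different way to look for gaps, sketched here as an alternative and not as the notebook's method, uses the message count that follows the sequence number in the MoldUDP64 header: a packet with sequence number `s` and count `n` carries messages `s` through `s + n - 1`, so the next packet is expected to start at `s + n`. The function name `find_gaps`, the `END_OF_SESSION` constant, and the choice to skip zero sequence numbers (mirroring the loop above) are assumptions of this sketch.

```python
END_OF_SESSION = 0xFFFF  # assumption: this count marks end of session and carries no messages

def find_gaps(headers):
    """Return inclusive (first_missing, last_missing) ranges.

    headers: iterable of (sequence_number, message_count) in capture order.
    """
    gaps = []
    expected = None  # next sequence number we expect to see
    for seq, count in headers:
        if seq == 0:
            continue  # mirror the notebook: ignore packets with a zero sequence number
        if count == END_OF_SESSION:
            count = 0
        if expected is not None and seq > expected:
            gaps.append((expected, seq - 1))
        next_seq = seq + count
        if expected is None or next_seq > expected:
            expected = next_seq
    return gaps

# Toy input: messages 103 and 104 never arrive.
print(find_gaps([(100, 2), (102, 1), (105, 1)]))  # [(103, 104)]
```

On the capture, the `(sequence_number, message_count)` pairs could be built with `struct.unpack(">QH", pkt.load[10:20])` for each packet.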


In [148]:
print(f"first sequence number: {min(seqClean)}")
print(f"last sequence number: {max(seqClean)}")

first sequence number: 484311999
last sequence number: 484314230


In [151]:
if sorted(seqClean) == seqClean:
    print("the sequence numbers are ordered")
else:
    print("the sequence numbers are not sorted")

the sequence numbers are ordered
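
The `sorted(seqClean) == seqClean` comparison above also passes when a sequence number repeats. A stricter pairwise check is sketched below; the helper name `is_strictly_increasing` is hypothetical, and whether repeats should count as "ordered" is a judgment call the problem statement leaves open.

```python
def is_strictly_increasing(values):
    """True if every value is greater than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))

print(is_strictly_increasing([5, 6, 9]))  # True
print(is_strictly_increasing([5, 6, 6]))  # False
```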


### Sites
<ul>
    <li><a href="https://github.com/boundary/wireshark/blob/master/epan/dissectors/packet-moldudp64.c">Wireshark MoldUDP64 dissector code</a></li>
    <li><a href="https://www.nasdaqtrader.com/content/technicalsupport/specifications/dataproducts/moldudp64.pdf">NASDAQ MoldUDP64 specification</a></li>
    <li><a href="https://www.youtube.com/watch?v=oKUkbMz5q7Y">YouTube tutorial</a></li>
    <li><a href="https://github.com/Amay22/NASDAQ-ITCH-5.0-Parser">NASDAQ ITCH 5.0 parser</a></li>
    <li><a href="https://www.uv.mx/personal/angelperez/files/2018/10/sniffers_texto.pdf">Packet sniffer example</a></li>
</ul>
    
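### Other functions
<ul>
    <li>ls(pkt): list the fields of each layer in the packet</li>
    <li>pkt.show(): print a layer-by-layer view of the packet</li>
    <li>pkt.summary(): return a one-line summary of the packet</li>
</ul>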

# Problem 2

Let’s define a trading strategy by the tuple of parameters (X, Y, Z), where X is the starting price of a
security, Y is the “adjustment” per unit of position, and Z is the “required edge”. At each quantum of
time, the strategy maintains a two-sided quote at distance Z around X – P * Y, where P is its current
position: it bids at X – P * Y – Z and offers at X – P * Y + Z. Assume the price starts at X and in each
period ticks up or down by 1 unit. If a tick brings the price to one of the strategy’s orders, that order
is filled: the position P is updated and the two-sided quote moves accordingly.

Every time the price reverts to X, the strategy has P = 0 and has made a positive
profit. Suppose that at the end of the simulation the security ends at price E. The strategy
must then sell (or buy, if P < 0) all of its position at this price.
Assume no position limits (P is unconstrained). You can sell the security without owning it (that is,
hold a negative position), and you may make any other assumptions necessary to get an elegant answer.

Example 1: (X = 0, Y = 1, Z = 1)

Take a path of length L = 4. Any combination of buys and sells is possible and equally likely,
for example (B,S,B,B). This example uses four consecutive sells, (S,S,S,S), which happens when the
price ticks up four times in a row.

Step 1
We pay = -1, theoretical value = 0, we sell = 1
We sell here at a price of 1

Step 2
We pay = 0, theoretical value = 1, we sell = 2
We sell here at a price of 2

Step 3
We pay = 1, theoretical value = 2, we sell = 3
We sell here at a price of 3

Step 4
We pay = 2, theoretical value = 3, we sell = 4
We sell here at a price of 4

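After this sequence the position is -4 (four sells, no buys), so we have to buy 4 units at the ending
price E = 4 to get back to having no position. The sells brought in 1 + 2 + 3 + 4 = 10 and the buyback
costs 4 × 4 = 16, so the profit on this path is -6.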
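
The walkthrough above can be replayed in code. This is a minimal sketch under stated assumptions, not a reference implementation: each fill is for one unit, at most one order fills per tick, fills happen at the quoted price, and the final position is closed out at the ending price. The function name `simulate` and the trace layout are illustrative.

```python
def simulate(x, y, z, ticks):
    """Run the (x, y, z) strategy over a path of +1 / -1 ticks.

    Returns (pnl, trace), where each trace row is
    (bid, theoretical_value, ask, new_price, action).
    """
    price, position, cash = x, 0, 0
    trace = []
    for tick in ticks:
        theo = x - position * y          # quote centre moves against the position
        bid, ask = theo - z, theo + z
        price += tick
        if price >= ask:                 # price reached our offer: we sell one unit
            cash += ask
            position -= 1
            action = "sell"
        elif price <= bid:               # price reached our bid: we buy one unit
            cash -= bid
            position += 1
            action = "buy"
        else:
            action = "none"
        trace.append((bid, theo, ask, price, action))
    pnl = cash + position * price        # close out the position at the ending price
    return pnl, trace

pnl, trace = simulate(0, 1, 1, [+1, +1, +1, +1])
for step, row in enumerate(trace, start=1):
    print(step, row)
print("PnL:", pnl)  # PnL: -6
```

The four trace rows reproduce the "we pay / theoretical value / we sell" figures of Steps 1 to 4.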

Determine the profitability of the (0, 1, 1) strategy as a function of its ending price E and its
path length L (the price moves up or down every quantum of time, so L is the number of
quanta considered).
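
One way to explore this question is to enumerate every path of a given length and compare the resulting profit against a candidate formula. The sketch below assumes that for (0, 1, 1) every tick fills exactly one unit at the new price (an up tick hits the offer, a down tick hits the bid), and it tests the candidate closed form profit = (L - E²) / 2. That formula is offered here as a conjecture to check, not as something stated in the problem. The function names are hypothetical.

```python
from itertools import product

def pnl_for_path(ticks):
    """Profit and ending price of the (0, 1, 1) strategy over a path of +1 / -1 ticks."""
    price, position, cash = 0, 0, 0
    for tick in ticks:
        price += tick
        if tick > 0:        # up tick reaches our offer: sell one unit at the new price
            cash += price
            position -= 1
        else:               # down tick reaches our bid: buy one unit at the new price
            cash -= price
            position += 1
    return cash + position * price, price  # close out at the ending price

def check_closed_form(max_length):
    """True if 2 * profit == L - E**2 for every path with 1 <= L <= max_length."""
    for length in range(1, max_length + 1):
        for ticks in product((+1, -1), repeat=length):
            pnl, end_price = pnl_for_path(ticks)
            if 2 * pnl != length - end_price ** 2:
                return False
    return True

print(pnl_for_path([+1, +1, +1, +1]))  # (-6, 4)
print(check_closed_form(10))
```

Because L and E always have the same parity, L - E² is even and the candidate profit is an integer.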