Skip to content

Feature/marine dissect all packet fields#59

Merged
tomlegkov merged 3 commits intotomlegkov:masterfrom
ScarletMetal:feature/marine-dissect-all-packet-fields
Jan 15, 2022
Merged

Feature/marine dissect all packet fields#59
tomlegkov merged 3 commits intotomlegkov:masterfrom
ScarletMetal:feature/marine-dissect-all-packet-fields

Conversation

@ScarletMetal
Copy link
Copy Markdown
Contributor

@ScarletMetal ScarletMetal commented Dec 20, 2021

Should add parse_all_fields to simple_marine by tomorrow.
Please review the rest, as it would just be a small wrapper around Marine.parse_all_fields.

@ScarletMetal ScarletMetal force-pushed the feature/marine-dissect-all-packet-fields branch from 4cb6a4c to 20c0291 Compare December 25, 2021 12:59
Copy link
Copy Markdown
Collaborator

@yehonatanz yehonatanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work here, and in marine-core too.

marine/marine.py Outdated
self,
packet: bytes,
encapsulation_type: Optional[int] = encap_consts.ENCAP_ETHERNET,
) -> dict:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dict[str, WhateverTheValueCanBe]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

marine/marine.py Outdated
return bool(value.bool_value)

if value_type == MarinePacketFieldValueType.STR:
return value.str_value.decode()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specify encoding explicitly please.
Is the encoding identical for fields?

}


def test_parse_all_fields_int_value(tcp_packet_fields):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test all types please

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot float :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I've seen, nearly no built-in dissector utilizes float values.
I would remove this value type, and make it fall back to string.

@yehonatanz
Copy link
Copy Markdown
Collaborator

Rebase this one too, make each commit atomic, meaningful and blacked

@ScarletMetal ScarletMetal force-pushed the feature/marine-dissect-all-packet-fields branch from 20c0291 to 4c0415d Compare January 5, 2022 14:46
@ScarletMetal
Copy link
Copy Markdown
Contributor Author

Rebase this one too, make each commit atomic, meaningful and blacked

Done

@ScarletMetal ScarletMetal force-pushed the feature/marine-dissect-all-packet-fields branch 2 times, most recently from c4a078e to 9a71adc Compare January 5, 2022 14:59
Copy link
Copy Markdown
Owner

@tomlegkov tomlegkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice feature!

The main thing that worries me here is reducing the amount of consts we duplicate from the C to the Python, so that writing clients in new languages won't be too annoying :)

def test_parse_all_fields_list_value(tcp_packet_fields):
ip_addr = tcp_packet_fields["ip"]["ip.addr"]
assert isinstance(ip_addr, list)
assert "10.0.0.255" in ip_addr
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you know exactly what the list looks like, consider comparing it directly to the list

assert ["10.0.0.255", "21.53.78.255"] == ip_addr

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot know ahead of time the order of the list produced by the dissector, So I though checking it that way would be safer.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it always src, dst?

def test_parse_all_fields_bool_value(tcp_packet_fields):
tcp_ack_flag = tcp_packet_fields["tcp"]["tcp.flags_tree"]["tcp.flags.ack"]
assert isinstance(tcp_ack_flag, bool)
assert tcp_ack_flag
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For safety, maybe add another assert here for a flag that is 0?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in - the bit extracted from the dissector is 0?
Doesn't isn't the assert on line 1052 enough? it would fail if the flag is False.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant checking that a different tcp flag is 0

}


def test_parse_all_fields_int_value(tcp_packet_fields):
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot float :)

marine/marine.py Outdated
def _load_child_fields(cls, children):
child_fields = {}
for child in children.data[: children.len]:
name = child.name.decode("ascii")
Copy link
Copy Markdown
Owner

@tomlegkov tomlegkov Jan 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really always ascii? Probably yes, but let's make sure :)

Maybe going for utf-8 is safer?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to utf-8.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is slow in benchmarks, consider adding a cache :)

marine/marine.py Outdated

@classmethod
def _load_marine_packet(cls, field):
children = field.children
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this variable?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

]


class MarinePacketFieldValueType:
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use the enum that is defined in marine.h for this? Similar to how we use MARINE_ALREADY_INITIALIZED_ERROR_CODE for example.

Copy link
Copy Markdown
Contributor Author

@ScarletMetal ScarletMetal Jan 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, this doesn't seem to be very trivial :(
The value you referenced, MARINE_ALREADY_INITIALIZED_ERROR_CODE is an int const, while marine_value_type is an enum, and I would prefer it to remain that way (separate type, automatic index generation etc).
I didn't find a simple way to expose all values of this enum from the shared library (I even found some resources that outright suggest duplicating the enum), So doing this would most likely be the lesser evil in this situation.

@ScarletMetal ScarletMetal force-pushed the feature/marine-dissect-all-packet-fields branch from 9a71adc to 1aebd7a Compare January 8, 2022 14:01

if value_type == MarinePacketFieldValueType.LIST:
return [cls._load_field_value(v) for v in value.list_value[:value_len]]

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want an else here to throw an exception or something?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

def test_parse_all_fields_bool_value(tcp_packet_fields):
tcp_ack_flag = tcp_packet_fields["tcp"]["tcp.flags_tree"]["tcp.flags.ack"]
assert isinstance(tcp_ack_flag, bool)
assert tcp_ack_flag
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant checking that a different tcp flag is 0

@ScarletMetal ScarletMetal force-pushed the feature/marine-dissect-all-packet-fields branch from d1e7919 to b854ead Compare January 10, 2022 20:23
Copy link
Copy Markdown
Collaborator

@yehonatanz yehonatanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I had some more notes but nothing serious.

I'm worried about the CI, not running, so when you're done fixing stuff can you please attach a screenshot of the linter and tests pass?

def test_parse_all_fields_list_value(tcp_packet_fields):
ip_addr = tcp_packet_fields["ip"]["ip.addr"]
assert isinstance(ip_addr, list)
assert "10.0.0.255" in ip_addr
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it always src, dst?

+ ip.IP(src_s="10.0.0.255", dst_s="21.53.78.255")
+ tcp.TCP(sport=16424, dport=41799)
+ tcp.TCP(
sport=16424, dport=41799, flags=0b00010000, body_bytes=tcp_payload
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some tcp.ACK const somewhere in pypacker, use it and remove the comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


def parse_all_fields(
self, packets: List[bytes], encapsulation_type: Optional[int] = None
) -> List[Dict[str, Any]]:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you use ParsedPacket here too?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@classmethod
def _parse_all_fields(
cls, packet: bytes, encapsulation_type: Optional[int] = None
) -> Dict[str, Any]:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ParsedPacket?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


def parse_all_packet_fields(
packet: bytes, encapsulation_type: Optional[int] = None
) -> Dict[str, Any]:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ParsedPacket?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@ScarletMetal ScarletMetal force-pushed the feature/marine-dissect-all-packet-fields branch from b854ead to 30f7da3 Compare January 14, 2022 08:33
@ScarletMetal ScarletMetal force-pushed the feature/marine-dissect-all-packet-fields branch from 30f7da3 to 5bdb79e Compare January 14, 2022 09:37
@ScarletMetal
Copy link
Copy Markdown
Contributor Author

ScarletMetal commented Jan 14, 2022

@yehonatanz regarding your qusetion about src, dst.
This is technically an implementation detail of Wireshark's IP dissector, So I think it is better to depend on it as little as possible, So I would rather not guess the order of it's output.

@tomlegkov tomlegkov merged commit b847143 into tomlegkov:master Jan 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants