Proposal for more straightforward network metrics fields. #179

Merged 2 commits into elastic:master on Dec 6, 2018

webmat
Contributor

@webmat webmat commented Nov 16, 2018

Prior to this PR, the network metrics are defined like this:

  • network.inbound.bytes
  • network.inbound.packets
  • network.outbound.bytes
  • network.outbound.packets
  • network.total.bytes
  • network.total.packets

Discussions around ambiguity of inbound/outbound naming have come up multiple times,
more or less directly in #2, #51, #63, and at various other times internally.

The issues with inbound/outbound metrics:

  • Inbound/outbound can mean relative to the host, or to the organization's network.
  • From a host POV, mapping of source => destination to either "inbound" or "outbound"
    has to change, depending on whether the host is receiving the connection inbound,
    or initiating a connection outbound.
  • From the POV of a network device, inbound/outbound to the organization's network
    can usually be determined accurately, but the device has nowhere to store
    metrics about internal only traffic.
    • Similarly, an ISP handling traffic between two external endpoints has nowhere
      to store metrics about the traffic.

The solution proposed here is to store metrics in fields that carry less ambiguous
meaning.

  • source.bytes = sent by source

  • destination.bytes = sent by destination

  • network.bytes = total

  • source.packets = sent by source

  • destination.packets = sent by destination

  • network.packets = total
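
As a minimal sketch of how an event could carry the fields listed above, the helper below builds a flat dict keyed by the dotted field names. The function name `build_metrics` is hypothetical, and leaving `destination.*` unset when no reverse traffic was observed is an assumption made for illustration, not something this PR mandates.

```python
from typing import Optional


def build_metrics(source_bytes: int, source_packets: int,
                  destination_bytes: Optional[int] = None,
                  destination_packets: Optional[int] = None) -> dict:
    """Return the proposed metric fields as a flat dict of dotted names."""
    event = {
        "source.bytes": source_bytes,      # sent by source
        "source.packets": source_packets,  # sent by source
    }
    # Assumption: destination.* is only set when reverse traffic was seen.
    if destination_bytes is not None:
        event["destination.bytes"] = destination_bytes
    if destination_packets is not None:
        event["destination.packets"] = destination_packets
    # network.* holds the total across both directions.
    event["network.bytes"] = source_bytes + (destination_bytes or 0)
    event["network.packets"] = source_packets + (destination_packets or 0)
    return event


# A small request that triggers a large response, e.g. a download.
download = build_metrics(300, 4, destination_bytes=45000, destination_packets=32)
```

Keeping both per-side counters next to the total means the question "which side generated most of the traffic?" can be answered without knowing anything about network boundaries.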

The case where source and destination cannot be accurately determined is still
not fully addressed, other than setting network.direction:unknown. It may be
useful to eventually allow for storing the heuristic used to define which end
was assigned the name "source" and "destination".

This PR introduces breaking changes by removing the old fields.

  • The new field for total went from network.total.bytes/packets to network.bytes/packets,
    to be as concise as the other new fields.
  • A case could be made that the inbound/outbound fields could remain, and only be
    populated when the situation permits. If they can be populated reliably in a
    given environment, they may help at visualizing inbound/outbound more directly.

@farrp

farrp commented Nov 16, 2018

Mathieu, I disagree with your contention that the use of inbound and outbound change in the context of the connection direction. In normal network usage, including when looking at a host, the direction of the connection setup is distinct from the direction of the traffic. What you are measuring here is traffic: from the POV of the measured device, did it send the packet or did it receive the packet? The direction of the connection (i.e. who initiated it, and who was listening for a connection) is irrelevant. I find the new definitions much more vague and confusing.

@tsg
Contributor

tsg commented Nov 16, 2018

@farrp The issue we had with the inbound/outbound terminology is that several people on the team interpreted them as "inbound/outbound to my network" or "inbound/outbound to my service". That is generally harder to establish, and we can't know it in Packetbeat, for example. From what I understand, that wasn't actually the intention of the fields in ECS, but the chosen names made it seem that way.

We considered several alternative options, for example source.sent.bytes and destination.sent.bytes, which might have made it a bit clearer, but figured that the extra sent. is not that important. The definition is clear, though: it's the number of bytes sent by the source. Does that make sense?

@webmat
Contributor Author

webmat commented Nov 16, 2018

@farrp Your answer seems to focus more on the host-based monitoring point of view.

The situation is different for a network device (e.g. router or firewall), where packets are received and passed along, so they are essentially both inbound and outbound :-) One could say that only the total should be stored in this case, but this approach loses information. Which side is generating the most traffic? (e.g. small request that triggers a big transfer, such as a download vs big request like an upload, that triggers a small response). The proposed approach supports storing the details of both sides trivially.

I agree that the more low-level metrics may make it harder to figure out the big picture. For example, when inbound/outbound can be accurately determined for a given organization's situation, it will be more straightforward to figure out total inbound/outbound traffic.

Nothing prevents someone from determining inbound/outbound however they like (host-based, network boundary-based), adding that to their events in addition to these raw fields. ECS is different from most other schemas, in that people can add fields around the "official" fields :-)

@webmat
Contributor Author

webmat commented Nov 16, 2018

Ping for opinions: @robgil, @urso, @dcode, @ave19, @willemdh and @robcowart.

@farrp

farrp commented Nov 16, 2018

@webmat - not at all. My focus is actually more network-centric than server-centric. From the perspective of a network interface on a router, switch, server, whatever - the concept of receive and send are very straightforward and never change context. This appears to be what you are measuring so I fail to see the source of the confusion. You seem to be conflating the concepts of session and traffic. A session connection has a direction (from initiator to listener) and a relationship (client-server or peer-peer), but bytes and packets flow either in or out regardless of the session characteristics.

@farrp

farrp commented Nov 16, 2018

@tsg I do get the difficulty in the packet beat scenario since it is watching the middle of a session. In that case though the source and destination make even less sense, unless you put it in the context of a session... but then what about ICMP and UDP? I agree you have a problem in this case with send and receive, but don't throw away perfectly good terms for one specific use case. In all situations except packet beats the terminology works fine.

@tsg
Contributor

tsg commented Nov 16, 2018

> I do get the difficulty in the packet beat scenario since it is watching the middle of a session. [...] In all situations except packet beats the terminology works fine. I agree you have a problem in this case with send and receive, but don't throw away perfectly good terms for one specific use case.

It's not just Packetbeat, it's the same for Suricata, Zeek, and any other tool based on capturing the traffic.

Also, consider for example the traffic between two Docker containers on the same host. Is that incoming or outgoing? It's really neither, it's "internal" from the host PoV, so using incoming/outgoing is going to be a source of confusion. Depending on what you consider the network border, the same can be applied to the communication between internal hosts, switches, etc.

> In that case though the source and destination make even less sense, unless you put it in the context of a session... but then what about ICMP and UDP?

Not sure I understand this, the new fields are not tied to sessions/connections any more than the previous fields were. In case of uni-directional traffic, only source.bytes would be filled.

> What you are measuring here is traffic: from the POV of the measured device, did it send the packet or did it receive the packet? The direction of the connection (i.e. who initiated it, and who was listening for a connection) is irrelevant. I find the new definitions much more vague and confusing.

We're just making a rename of network.inbound.bytes -> source.bytes and network.outbound.bytes -> destination.bytes so that the field names make sense in more scenarios. So there really is no change in how we intend these fields to be used. Perhaps the new descriptions should be a bit more verbose and make that more clear?

@dcode
Contributor

dcode commented Nov 17, 2018

I like the new approach. I'm looking at this primarily from a network-centric (i.e. network sensor via tap) point of view. As has been discussed in other issues, source and destination are always available on a network connection as they are typically tied to unidirectional traffic (i.e. packet level). What I like about the proposed schema is that it keeps all details per direction in a single object and doesn't require a lot of black magic at the passive sensor level to figure out which is inbound and which is outbound (or which is neither!). Something I've done in the past is create a Kibana calculated field that populates the inbound/outbound byte count based on the Zeek conn log boolean fields local_orig and local_resp, which Zeek determines based on a CIDR lookup against a list of known local networks. This would be trivial to do in Logstash (just the bool field) on a common ECS field schema (...which I'm going to add to RockNSM now that I've thought of it 💡!... maybe source.is_local )
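
The CIDR-based approach described above could be sketched roughly as follows. This is an illustration under stated assumptions, not how Zeek, Kibana or Logstash implement it: the `LOCAL_NETWORKS` list, the function names, and the output keys `inbound_bytes` / `outbound_bytes` are all hypothetical, and `source.is_local` / `destination.is_local` are only the tentative names floated in this comment, not ECS fields.

```python
import ipaddress

# Hypothetical list of networks treated as "local"; adjust per deployment.
LOCAL_NETWORKS = [ipaddress.ip_network(n)
                  for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_local(ip: str) -> bool:
    """True if the address falls inside any configured local network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in LOCAL_NETWORKS)


def tag_local_traffic(event: dict) -> dict:
    """Derive inbound/outbound byte counts from the raw per-side counters."""
    src_local = is_local(event["source.ip"])
    dst_local = is_local(event["destination.ip"])
    src_bytes = event.get("source.bytes", 0)
    dst_bytes = event.get("destination.bytes", 0)

    inbound = outbound = 0
    if src_local and not dst_local:
        # Local source: its bytes leave the network, the replies come in.
        outbound, inbound = src_bytes, dst_bytes
    elif dst_local and not src_local:
        # Remote source: its bytes come in, the replies leave the network.
        inbound, outbound = src_bytes, dst_bytes
    # Both local (internal) or both remote (transit): neither in nor out.

    return {
        "source.is_local": src_local,
        "destination.is_local": dst_local,
        "inbound_bytes": inbound,
        "outbound_bytes": outbound,
    }
```

Because the derived values sit next to the raw `source.bytes` / `destination.bytes` counters rather than replacing them, internal-only and transit flows keep their metrics even though they are neither inbound nor outbound.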

Even on an endpoint, when you have the netstat table, there's still a source and destination, though it's often labelled local/remote. How that translates to network data is probably implementation dependent. I'm still in favor of maintaining client/server for sessionized data when the context is available (i.e. Zeek, server access logs, sessionized flow data). I think the convention you're proposing could equally apply to that model.

It all comes out in the wash when using community_id, which is going to be about the only reliable way to line up network connections from multiple heterogeneous log sources.
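
To illustrate why a flow hash such as community_id sidesteps the source/destination ordering question, here is a rough sketch of a Community ID v1 style hash for TCP/UDP flows. It follows my understanding of the published Community ID spec (seed, ordered endpoints, protocol, padding, ports, SHA-1, base64) and has not been checked against the reference implementation; ICMP handling is omitted, and the function name is hypothetical.

```python
import base64
import hashlib
import ipaddress
import struct


def community_id_v1(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                    proto: int, seed: int = 0) -> str:
    """Direction-independent flow hash for TCP (proto 6) or UDP (proto 17)."""
    src = ipaddress.ip_address(src_ip).packed
    dst = ipaddress.ip_address(dst_ip).packed
    # Put the "smaller" endpoint first so both directions hash identically.
    if (src, src_port) > (dst, dst_port):
        src, dst = dst, src
        src_port, dst_port = dst_port, src_port
    data = (struct.pack("!H", seed) + src + dst
            + struct.pack("!BBHH", proto, 0, src_port, dst_port))
    digest = hashlib.sha1(data).digest()
    return "1:" + base64.b64encode(digest).decode("ascii")


# The same flow seen from either side yields the same identifier.
forward = community_id_v1("10.0.0.5", 51234, "93.184.216.34", 443, 6)
reverse = community_id_v1("93.184.216.34", 443, "10.0.0.5", 51234, 6)
```

Since the endpoints are sorted before hashing, two sensors that disagree about which side is the "source" still produce the same value, which is what makes it usable as a join key across log sources.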

@farrp

farrp commented Nov 17, 2018

I went back into the source and looked at it all in context, and I realize I misunderstood the original comments. My apologies.

@webmat
Contributor Author

webmat commented Nov 19, 2018

Thanks for your feedback, @dcode.

Yes, we realize that even if this may create duplication, adding client/server besides source/destination will likely simplify the consumption of the data later on. So this is still under serious consideration.

Let us know what comes out of your experiments in tagging the local side of the connection. This is also something we need to look into, and find a good way to represent.

@ave19

ave19 commented Nov 19, 2018

I like source and destination. Usually, those are pretty straightforward. inbound and outbound are harder to know if you're just a device. (A reverse shell is... uh... )

I'm not as keen on ditching the network prefix. I mean, do you think that networks are the only thing that have sources and destinations? I think if you go source as top level, this means you expect all source properties to always be the same, right?

@urso

urso commented Nov 19, 2018

For tools dealing with bidirectional streams, but having no exact idea about originator/inbound/server, neither really makes sense and is always up for interpretation.

E.g. packetbeat flows use source/destination, but it has another field to identify the originator of the current stream (at least internally), because it doesn't really know the difference between the two addresses. Field names like source/destination/inbound/outbound/client/server also imply some kind of known 'direction' or 'order'. This is where the confusion in packetbeat comes from. Packetbeat treats streams as bidirectional, so as to have both directional stats in one document (some kind of denormalization). It has no real idea about 'order', but rather tracks both addresses as endpoints (which form a 'session'). Unfortunately we didn't use "endpoints": [ { ... }, { ... }] in packetbeat.

There seems to be a preference in one or the other naming depending on actual use-case in mind (kind of device/network/software in use). Do we have a collection of these use-cases + proposed namings (+ documentation why)? Even if something sounds natural to you (for now), there will always be someone being confused if the scenario/deployment type is unknown.

E.g. I really like the idea of inbound/outbound (or local/remote on a network level), but the order of these might very well be different from client/server or source/destination (on the connection level).

@MikePaquette
Contributor

@dcode I'm having a look at mapping the Zeek conn log to ECS. Question: how do you populate the ECS fields source.ip, destination.ip, source.port, and destination.port? Are you extracting these from the 'id:' field? And if so, how do you know which is source.* and which is destination.*? Also, would you add these IPs to related.ip if you had that field available?

Member

@andrewkroh andrewkroh left a comment

LGTM. I think this should work well for unidirectional and bidirectional flow.

@dcode
Contributor

dcode commented Dec 2, 2018

@MikePaquette Zeek/Bro actually uses source and destination in most of the detection types of logs and also unified2.log. It's used instead of (or in the case of notice.log to supplement) originator and responder for signature events in the signature.log, traceroute.log, notice.log, and notice_alarm.log in order to specify the direction of the connection in which the triggered event took place.

And I do add these to the related.ip field if an IP appears anywhere in a log (and is parsable). Other notable locations are tx/rx hosts (who sent and who received files, could be multiple in protocols like BitTorrent) in the files.log, answers in dns.log, and san_ip (subject alternate name as an IP) in the x509.log. Adding these all to the related.ip field makes it easy to find this data that doesn't necessarily involve a connection with that IP as an endpoint, but rather the conversation is about the IP.
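
The "add every parsable IP to related.ip" idea could look roughly like the sketch below. The helper name `collect_related_ips` and the flat dict layout are assumptions for illustration; the candidate field names in the usage example are loosely modelled on the Zeek fields mentioned above and are not an official mapping.

```python
import ipaddress


def collect_related_ips(event: dict, candidate_fields: list) -> list:
    """Gather every parsable IP found in the given fields, without duplicates."""
    related = []
    for field in candidate_fields:
        value = event.get(field)
        values = value if isinstance(value, list) else [value]
        for item in values:
            if not isinstance(item, str):
                continue  # missing field or non-string value
            try:
                ip = str(ipaddress.ip_address(item))
            except ValueError:
                continue  # e.g. a hostname in a DNS answer
            if ip not in related:
                related.append(ip)
    return related


# Hypothetical DNS event: the answers mix a hostname with addresses.
event = {
    "source.ip": "10.0.0.5",
    "destination.ip": "10.0.0.53",
    "answers": ["example.com", "93.184.216.34", "10.0.0.5"],
}
event["related.ip"] = collect_related_ips(
    event, ["source.ip", "destination.ip", "answers", "tx_hosts"])
```

Values that do not parse as an address are skipped silently, which matches the "(and is parsable)" caveat: the field ends up listing IPs the event is about, not only the connection endpoints.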

@webmat
Contributor Author

webmat commented Dec 3, 2018

@robgil @MikePaquette @ruflin I'm requesting a review from each of you, to make sure this is in line with everyone's expectations, based on recent discussions. Does this work for you? I'd like to merge this in this week.

@@ -5,6 +5,11 @@ All notable changes to this project will be documented in this file based on the

### Breaking changes

* Rename `network.total.bytes` to `network.bytes` and `network.total.packets`
Member

I'm worried about this breaking change.

@andrewkroh How much will this affect packetbeat / auditbeat?
@webmat Did you check how much this affects Metricbeat / Filebeat?

Member

It wasn’t previously used by Auditbeat or Packetbeat AFAIK. I have
elastic/beats#9121 to use the new network.bytes in packetbeat.

@ruflin
Member

ruflin commented Dec 4, 2018

@MikePaquette @webmat @robgil With this PR we would break our previous promise not to introduce any further breaking changes. Are all of you aware of that?

@webmat
Contributor Author

webmat commented Dec 4, 2018

@ruflin The current state is pretty difficult to use, since it's using the terms inbound/outbound, which mean different things in different situations, and cannot be used in some other situations (e.g. networking device reporting on internal traffic).

So in a sense yeah this is a breaking change vs the first Beta of ECS. However I would argue that this part of the spec was not really useable.

I'd rather do this breaking change now, while we're still in Beta.

@webmat webmat merged commit 983befa into elastic:master Dec 6, 2018
@webmat webmat deleted the raw-network-metrics branch December 6, 2018 16:13
9 participants