Argus Monitor edited this page Jul 24, 2024 · 31 revisions

Welcome

Welcome to the Argus clients wiki! Here we'll try to use the powers of GitHub to develop and manage new features of argus data processing.

The Argus project is composed of two efforts.

  1. Network flow data generation
  2. Network flow data processing

The network flow data generation is referred to as the Argus Server or Sensor, and the processing component is referred to as the Argus Clients. Over the years, we've moved to referring to Argus as the network flow sensor, and the Clients as the sensor data processing elements.

Argus clients can be categorized into 5 basic groups:

  1. Data Collection and Distribution
  2. Data Management
  3. Data Processing
  4. Data Analytics
  5. Data Visualization

Argus Clients 5.0

In Argus 5.0 we made contributions to all 5 groups, with a focus on analytics. We moved some of the commercial ArgusPro features into the open source, and we've added a significant amount of technology to the clients distribution to demonstrate the features. This includes 128-bit Argus source IDs, Argus events, expanded behavioral analytics, JSON processing, converting foreign flow data into the Argus processing system, enhanced content capture and processing, and new tunnel support.

Argus 5.0 is focused on generating argus data at as many points in the network as possible, including external and internal high speed links, workgroup edges, endpoints and wireless access points. This is important for addressing the cyber security challenges that enterprises face today, which demand granular visibility inside the enterprise to support effective cyber detection and forensics. With increased network visibility inside the enterprise, there are new opportunities for sophisticated detections by correlating data from multiple points in the network at or near the same time.

Because Argus has already been ported to most endpoint operating systems and OpenWRT access points, we have a good start on getting a lot of sensors into an environment. As a part of improving visibility throughout the network, we're also going to import data from other flow systems. Argus already processes NetFlow and IPFIX records, but there are a lot of other flow data strategies out there. In particular, we'll want to import Zeek connection logs, since many organizations generate Zeek data, as well as Google VPC flows, and possibly some of the single-letter flows, like Qflow, Jflow, and maybe Kflow records.

Endpoint Argus Support

The open source argus code is very portable, and runs on a number of operating systems, including Linux and its variants (RHEL, Rocky, Ubuntu, Debian, Kali, CentOS, Fedora, OpenSUSE, and their sub-variants), FreeBSD, Windows, MacOS, AIX, SunOS, HPUX, Solaris, IRIX, CrayOS, VxWorks, PSoS, and OpenWRT, so we have a good start.

Argus 5.0 sensors run great on endpoints. They have a very small footprint, and the CPU resources needed are very small. With a few changes to libpcap, Argus can attain < 0.5% average CPU utilization for an argus daemon on most commercial endpoints (Windows, MacOS, Linux). There are specific features that are useful for achieving complete network accountability on endpoints, as there can be a lot of physical and virtual interface types that we would all like to monitor. Bluetooth interfaces, RadioTaps, USB devices, VPNs, even VMs and Docker interfaces are fair game for monitoring on an endpoint. And of course there are a lot of different types of endpoints now ... cloud based VMs and containers are an important part of the mix.

The keys to a successful endpoint argus are reliability, low resource utilization, and zero-configuration / management. The Argus Project has addressed these issues, and Argus should be a good candidate for generating network visibility on the endpoint.

Argus Sensor Reliability

The Open Argus sensor is a very mature network flow monitor with exceptional stability, performance and reliability. The source code has matured, is well reviewed, and is running in hundreds of operational environments, including national critical infrastructure. It has been operational in the US DoD for over 20 years, and operates at 100Gbps in several leading FFRDCs, Supercomputer Centers, and US Universities. It is this same sensor that we are advocating to run in endpoints, workstations, laptops, tablets, wireless routers and Android phones, generating flow data for every packet sent and received, on all interfaces, physical and virtual, with local storage of the network audit data.

Low Resource Utilization

The Open Argus sensor is a very mature network flow monitor with a very low resource footprint. The vanilla open source binary is only 600K bytes in size when stripped, and its footprint can be reduced to below 120K with aggressive minimization of static memory allocations.

The Argus Project offers binary MacOS and Windows sensors. On an Apple iMac running macOS Sonoma 14.5 with 21 network interfaces, the average CPU utilization is 0.5%. That seems like a lot of interfaces for an iMac, but this is pretty standard for a Mac. On Windows 11, the average utilization is around 0.25%.

Memory usage is based on the number of concurrent flows that argus is tracking, plus the fixed memory needed to monitor each interface (hash tables, input buffers). On the iMac mentioned above, when tracking about 225 simultaneous flows, vanilla open source argus uses 125MB of RAM. On Windows, memory usage reporting is different (shared vs dynamic memory?), and the Task Manager reports 36MB of memory for the same environment within a VM.

Zero-Configuration

Argus Source ID Modifications

To improve managing large numbers of endpoint sensors, argus supports using the hostuuid as the argus source id. With this feature, argus can be deployed as a zero-configuration daemon (no conf file mods needed), and to improve visibility on endpoints, argus will generally add the interface name ('inf') to the flow key of every flow it monitors. This means that argus-clients should expect flow data from endpoint argi that has a 128-bit source id and a 4-char interface identifier indicating where the flow was monitored.

128-bit ARGUS_MONITOR_IDs are pretty unwieldy. To make data processing easier, all ra* programs can use an RA_SRCID_ALIAS file to alias short names for the big uuid identifiers. The aliases are "node"s, and can be printed, filtered, etc ...

[carter@red clients]$ ra -S localhost -up 3 -s stime dur proto saddr dir daddr pkts bytes node inf
       StartTime    Dur  Proto          SrcAddr   Dir            DstAddr  TotPkts                                    Sid  Node  Inf 
  1719085355.895  0.000    arp    192.168.1.254   who       192.168.1.49        2   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719085355.895  4.343    tcp    192.168.1.254   <?>       192.168.1.49       36   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719085356.150  0.000   igmp     192.168.1.17    ->    239.255.255.250        1   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719085358.107  0.000    udp    192.168.1.131    ->    239.255.255.250        1   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
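To give a feel for what the alias file carries, here is a minimal sketch of an RA_SRCID_ALIAS mapping. The exact syntax is defined by the ra* programs (check the sample files shipped with argus-clients); the format and the second uuid below are assumptions for illustration only.

```
# RA_SRCID_ALIAS sketch: short node name to 128-bit source id
# (format assumed for illustration -- consult your distribution's sample file)
red:5eeb2183-59e2-45e8-848f-e48f5d30157e
blue:7a1c0f22-1b3d-4e55-9c6a-0d2f4b8e9a11
```

With a mapping like this, the 'node' column in the ra output above prints as "red" instead of the full uuid.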

And filtering using the 'node' is supported in all ra* programs ... HOWEVER ... when reading realtime data from a remote argus, the remote argus may not have access to the RA_SRCID_ALIAS file, so you should apply the filter in the calling ra* program, using the 'local' filter directive.

[carter@red clients]$ ra -S remote:561 -up 3 -s stime dur:6 proto saddr:16 dir daddr pkts sid:38 node:5 inf - local node red
       StartTime    Dur  Proto          SrcAddr   Dir            DstAddr  TotPkts                                    Sid  Node  Inf 
  1719089205.139  4.061    tcp    192.168.1.254   <?>       192.168.1.49       50   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719089205.347  3.005    udp    192.168.1.131    ->    239.255.255.250        7   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719089206.771  1.934 ipv6-*               ::    ->                 ::        3   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719089209.139  4.004    udp    192.168.1.131    ->    239.255.255.250        3   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719089209.207  0.001    udp     192.168.1.49   <->        192.168.1.1        2   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719089209.207  0.158    udp     192.168.1.49   <->        192.168.1.1        2   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5

JSON Input / Output Format

All ra* programs now support reading and writing data as JSON. This is in contrast to argus-3.x, which supported space-separated, comma-separated, and arbitrary character-delimited files, along with XML output. XML has lost its usefulness with the introduction of JSON, and with CSV and JSON as the primary input formats for AI/ML routines, argus-clients have made the shift to JSON as the primary output format.
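As an illustration of why JSON output is convenient downstream, here is a minimal Python sketch that reads newline-delimited JSON flow records and tallies packets per protocol. The field names (proto, saddr, daddr, pkts) are assumptions for illustration, not the exact ra output schema.

```python
import json
from collections import Counter

# Hypothetical newline-delimited JSON flow records, as a JSON-emitting
# ra* program might produce; field names are assumed for illustration.
records = """
{"proto": "tcp", "saddr": "192.168.1.254", "daddr": "192.168.1.49", "pkts": 36}
{"proto": "udp", "saddr": "192.168.1.131", "daddr": "239.255.255.250", "pkts": 7}
{"proto": "tcp", "saddr": "192.168.1.49", "daddr": "192.168.1.1", "pkts": 2}
""".strip().splitlines()

# Tally total packets per protocol across all records.
pkts_by_proto = Counter()
for line in records:
    flow = json.loads(line)
    pkts_by_proto[flow["proto"]] += flow["pkts"]

print(dict(pkts_by_proto))  # {'tcp': 38, 'udp': 7}
```

The same three lines would need a column-position-aware parser in the fixed-width output above; with JSON, any field is addressable by name.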

Argus has also modified its ArgusLabel structure to support JSON formatted buffers. ra* programs still support the basic Metadata standards for Labels.

Converting Foreign Flow Data to Argus

Argus 5.0 clients provide a raconvert.1 program to convert ascii flow data to argus binary format. This has been used to control flow data export: argus binary data is converted to an ascii format so that it can be inspected and reduced, and then converted back to a binary format for processing. One would expect this in highly controlled environments, or when sharing flow data with an external partner. By converting an ascii format back to binary, you 'know' what data will be in the binary file, or rather, you know what is not in the binary. This is important for excluding DSRs that may contain content, DNS names, etc ...

This facility supports converting CSV files, as well as JSON formatted data.
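The inspect-and-reduce step of the round trip is easy to script. Here is a hedged Python sketch that drops a content-bearing column from a CSV export before the file would be handed back for reconversion to binary; the column names, including 'suser' as the sensitive field, are illustrative and not the exact ra field set.

```python
import csv
import io

# Illustrative CSV flow export; 'suser' stands in for a content-bearing
# column you might want to exclude before reconversion (names assumed).
src = io.StringIO(
    "stime,proto,saddr,daddr,pkts,suser\n"
    "1719085355.895,tcp,192.168.1.254,192.168.1.49,36,GET /index.html\n"
    "1719085358.107,udp,192.168.1.131,239.255.255.250,1,\n"
)
dst = io.StringIO()

reader = csv.DictReader(src)
keep = [f for f in reader.fieldnames if f != "suser"]  # drop content column

# extrasaction="ignore" silently discards the removed column's values.
writer = csv.DictWriter(dst, fieldnames=keep, extrasaction="ignore")
writer.writeheader()
for row in reader:
    writer.writerow(row)

print(dst.getvalue().splitlines()[0])  # prints: stime,proto,saddr,daddr,pkts
```

After a reduction like this, you know the content column cannot appear in the reconverted binary, because it no longer exists in the ascii input.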

Zeek conn.logs to Argus Records

Argus can natively read NetFlow v4 and v5 and flow-tools flow formats. And as of argus-clients-3.0.8.4, argus can convert JSON formatted Zeek conn.logs into Argus binary formats using our existing raconvert.1 program ... JSON, because we added JSON processing into the argus client library, but we can just as easily handle non-JSON formats as well.

We extended raconvert.1 to take a conversion map, using the '-f conversion.map' command-line option. The specific support for converting zeek conn.logs is provided through the support/Config/raconvert.zeek.conf file. This sample raconvert conversion map should work for all the basic zeek conn.log variables; as new variables are added, this file will need to be updated.
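To give a feel for what such a conversion map must express, here is a small Python sketch mapping standard Zeek conn.log JSON fields to generic flow-record fields. The Zeek field names (ts, id.orig_h, proto, orig_pkts, ...) are the real conn.log names; the output field names and the mapping itself are illustrative, since raconvert's actual targets come from raconvert.zeek.conf.

```python
import json

# One JSON-formatted Zeek conn.log entry (standard Zeek field names;
# the values here are made up for illustration).
zeek_line = json.dumps({
    "ts": 1719085355.895, "uid": "CAbcde12345",
    "id.orig_h": "192.168.1.49", "id.orig_p": 52344,
    "id.resp_h": "192.168.1.254", "id.resp_p": 443,
    "proto": "tcp", "duration": 4.343,
    "orig_pkts": 20, "resp_pkts": 16,
    "orig_bytes": 2200, "resp_bytes": 8800,
})

# Zeek name -> generic flow field; output names are illustrative only.
FIELD_MAP = {
    "ts": "stime", "proto": "proto",
    "id.orig_h": "saddr", "id.orig_p": "sport",
    "id.resp_h": "daddr", "id.resp_p": "dport",
    "duration": "dur",
}

def zeek_to_flow(line: str) -> dict:
    rec = json.loads(line)
    flow = {dst: rec[src] for src, dst in FIELD_MAP.items() if src in rec}
    # Zeek splits counters by direction; sum them for flow totals.
    flow["pkts"] = rec.get("orig_pkts", 0) + rec.get("resp_pkts", 0)
    flow["bytes"] = rec.get("orig_bytes", 0) + rec.get("resp_bytes", 0)
    return flow

print(zeek_to_flow(zeek_line))
```

A conversion map file encodes exactly this kind of rename-and-combine table, which is why new Zeek variables require updating the .conf rather than the code.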

Converting Google VPC Logs to Argus Records

raconvert.1 can convert any JSON formatted string into a flow record, if it contains a minimum set of flow identifiers. A start time, an IP address or name, some metrics, and optionally some metadata are all that is needed.

This approach should work very well with Google VPC flow logs. If we can find some real examples of VPC flow logs, we can generate a raconvert.google.conf conversion map. Should be pretty easy ...
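As a starting point, here is a Python sketch extracting that minimum set of identifiers from a VPC-flow-log-like record. The field names (connection.src_ip, bytes_sent, etc.) follow Google's published VPC Flow Logs record format as best understood here, and the record itself is fabricated; both should be verified against real exports before writing the conversion map.

```python
import json

# A VPC-flow-log-like record. Field names follow Google's published
# record format as best understood here; values are made up.
vpc_record = json.dumps({
    "connection": {
        "src_ip": "10.128.0.2", "src_port": 54321,
        "dest_ip": "10.128.0.3", "dest_port": 443,
        "protocol": 6,
    },
    "start_time": "2024-06-22T19:42:35.895Z",
    "end_time": "2024-06-22T19:42:40.238Z",
    "bytes_sent": 8800, "packets_sent": 36,
})

def vpc_to_flow(line: str) -> dict:
    """Pull out the minimum a converter needs: start time, addresses, metrics."""
    rec = json.loads(line)
    conn = rec["connection"]
    return {
        "stime": rec["start_time"],
        "saddr": conn["src_ip"], "sport": conn["src_port"],
        "daddr": conn["dest_ip"], "dport": conn["dest_port"],
        "proto": conn["protocol"],  # IP protocol number (6 == TCP)
        "pkts": rec.get("packets_sent", 0),
        "bytes": rec.get("bytes_sent", 0),
    }

print(vpc_to_flow(vpc_record))
```

If the real exports match this shape, the raconvert.google.conf map reduces to a rename table much like the Zeek one.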