Reducing the Compliance reports sent for Automate ingestion #3087

Closed
alexpop opened this issue Mar 12, 2020 · 3 comments · Fixed by #3302

alexpop commented Mar 12, 2020

I looked into the reports we ingest in Automate from InSpec, trying to understand what it would take to reduce the size of these reports.

Reducing the size of the reports will:

  • Help with the Compliance report ingestion limitation of 4 MB
  • Increase ingestion performance for a shared Automate or large enterprise setup.
  • Reduce the data transfer fees and bottlenecks that can be caused by lots of data coming in.

Let's have a look at how Automate is getting the compliance reports.
The audit cookbook or effortless executes an InSpec scan using profiles pulled from Automate, an HTTP URL, git, supermarket, or the local filesystem. The scans use the InSpec json-automate or automate reporters. When Automate receives the scan report, it checks whether the profile used for the scan already exists in the compliance-profiles Elasticsearch index. If it doesn't, Automate extracts the profile metadata from the report (i.e. profile title, license, copyright, control titles, descriptions, code, impact) and stores it in the compliance-profiles index. Then only the report-specific data (date of the scan, control run status, and failure message) is stored in the daily scan indices.
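The ingestion flow described above can be sketched roughly as follows. This is only an illustration, not Automate's actual code: plain Python containers stand in for the Elasticsearch indices, and the names `ingest`, `PROFILE_FIELDS`, and `CONTROL_META_FIELDS` are hypothetical. The field lists are taken from the sample report in this issue.

```python
# Hypothetical sketch: a dict and a list stand in for the compliance-profiles
# index and the daily scan indices. This is not Automate's real implementation.
PROFILE_FIELDS = ("name", "version", "title", "maintainer", "summary",
                  "license", "copyright", "copyright_email", "supports",
                  "attributes")
CONTROL_META_FIELDS = ("title", "desc", "descriptions", "impact", "refs",
                       "tags", "code", "source_location")


def ingest(report, profiles_index, scans_index):
    """Store profile metadata once per sha256; always store the scan results."""
    for profile in report["profiles"]:
        sha = profile["sha256"]
        if sha not in profiles_index:
            # First time this profile is seen: keep its static metadata.
            meta = {k: profile[k] for k in PROFILE_FIELDS if k in profile}
            meta["controls"] = {
                c["id"]: {k: c[k] for k in CONTROL_META_FIELDS if k in c}
                for c in profile.get("controls", [])
            }
            profiles_index[sha] = meta
        # Report-specific data is stored for every scan.
        scans_index.append({
            "profile_sha256": sha,
            "controls": [
                {"id": c["id"], "results": c.get("results", [])}
                for c in profile.get("controls", [])
            ],
        })
```

Ingesting the same report twice leaves a single metadata entry and two scan entries, which is why the metadata in the second report is redundant.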

Let's look at an example. I ran cis-windows2012r2-level2-memberserver v2.3.0-14 against a newly launched Windows 2012r2 server in AWS. This stock server produced a json-automate InSpec report of 1322 KB.

Here's a section of it, with just one control:

{
  "platform": {
    "name": "windows_server_2012_r2_standard",
    "release": "6.3.9600"
  },
  "profiles": [
    {
      "name": "cis-windows2012r2-level2-memberserver",
      "version": "2.3.0-14",
      "sha256": "9af6461af74932a0cc0ca27d0d37f19c12f326171048a2793e9dbf217671c4cd",
      "title": "CIS Microsoft Windows Server 2012 R2 Benchmark Level 2 - Member Server",
      "maintainer": "Chef Software, Inc.",
      "summary": "CIS Microsoft Windows Server 2012 R2 Benchmark Level 2 - Member Server translated from SCAP",
      "license": "Proprietary, All rights reserved",
      "copyright": "Chef Software, Inc.",
      "copyright_email": "support@chef.io",
      "supports": [
        {
          "platform-family": "windows"
        }
      ],
      "attributes": [],
      "controls": [
        {
          "id": "xccdf_org.cisecurity.benchmarks_rule_1.1.1_L1_Ensure_Enforce_password_history_is_set_to_24_or_more_passwords",
          "title": "(L1) Ensure 'Enforce password history' is set to '24 or more password(s)'",
          "desc": "This policy setting determines the number of renewed, unique passwords that have to be associated with a user account before you can reuse an old password. The value for this policy setting must be between 0 and 24 passwords. The default value for Windows Vista is 0 passwords, but the default setting in a domain is 24 passwords. To maintain the effectiveness of this policy setting, use the Minimum password age setting to prevent users from repeatedly changing their password.\n\nThe recommended state for this setting is: 24 or more password(s).\n\nRationale: The longer a user uses the same password, the greater the chance that an attacker can determine the password through brute force attacks. Also, any accounts that may have been compromised will remain exploitable for as long as the password is left unchanged. If password changes are required but password reuse is not prevented, or if users continually reuse a small number of passwords, the effectiveness of a good password policy is greatly reduced.\n\nIf you specify a low number for this policy setting, users will be able to use the same small number of passwords repeatedly. If you do not also configure the Minimum password age setting, users might repeatedly change their passwords until they can reuse their original password.",
          "descriptions": [
            {
              "label": "default",
              "data": "This policy setting determines the number of renewed, unique passwords that have to be associated with a user account before you can reuse an old password. The value for this policy setting must be between 0 and 24 passwords. The default value for Windows Vista is 0 passwords, but the default setting in a domain is 24 passwords. To maintain the effectiveness of this policy setting, use the Minimum password age setting to prevent users from repeatedly changing their password.\n\nThe recommended state for this setting is: 24 or more password(s).\n\nRationale: The longer a user uses the same password, the greater the chance that an attacker can determine the password through brute force attacks. Also, any accounts that may have been compromised will remain exploitable for as long as the password is left unchanged. If password changes are required but password reuse is not prevented, or if users continually reuse a small number of passwords, the effectiveness of a good password policy is greatly reduced.\n\nIf you specify a low number for this policy setting, users will be able to use the same small number of passwords repeatedly. If you do not also configure the Minimum password age setting, users might repeatedly change their passwords until they can reuse their original password."
            }
          ],
          "impact": 1,
          "refs": [],
          "tags": {
            "cce": "CCE-37166-6"
          },
          "code": "control \"xccdf_org.cisecurity.benchmarks_rule_1.1.1_L1_Ensure_Enforce_password_history_is_set_to_24_or_more_passwords\" do\n  title \"(L1) Ensure 'Enforce password history' is set to '24 or more password(s)'\"\n  desc  \"\n    This policy setting determines the number of renewed, unique passwords that have to be associated with a user account before you can reuse an old password. The value for this policy setting must be between 0 and 24 passwords. The default value for Windows Vista is 0 passwords, but the default setting in a domain is 24 passwords. To maintain the effectiveness of this policy setting, use the Minimum password age setting to prevent users from repeatedly changing their password.\n    \n    The recommended state for this setting is: 24 or more password(s).\n    \n    Rationale: The longer a user uses the same password, the greater the chance that an attacker can determine the password through brute force attacks. Also, any accounts that may have been compromised will remain exploitable for as long as the password is left unchanged. If password changes are required but password reuse is not prevented, or if users continually reuse a small number of passwords, the effectiveness of a good password policy is greatly reduced.\n    \n    If you specify a low number for this policy setting, users will be able to use the same small number of passwords repeatedly. If you do not also configure the Minimum password age setting, users might repeatedly change their passwords until they can reuse their original password.\n  \"\n  impact 1.0\n  tag cce: \"CCE-37166-6\"\n  describe security_policy do\n    its(\"PasswordHistorySize\") { should be >= 24 }\n  end\nend\n",
          "source_location": {
            "line": 3,
            "ref": "controls/translated-controls.rb"
          },
          "waiver_data": {},
          "results": [
            {
              "status": "failed",
              "code_desc": "Security Policy PasswordHistorySize is expected to be >= 24",
              "run_time": 0.007939,
              "start_time": "2020-03-09T15:33:59+00:00",
              "message": "expected: >= 24\n     got:    0"
            }
          ]
        }
      ]
    }
  ]
}

As can be seen, much of the information in the JSON report is static profile information. This metadata is ignored if Automate already has the profile or saved the metadata from a previously ingested report. So, if we already have the profile metadata, we could ingest only the report-specific information and offer the same functionality in Automate. This is what such a reduced JSON report could look like:

{
  "platform": {
    "name": "windows_server_2012_r2_standard",
    "release": "6.3.9600"
  },
  "profiles": [
    {
      "name": "cis-windows2012r2-level2-memberserver",
      "version": "2.3.0-14",
      "sha256": "9af6461af74932a0cc0ca27d0d37f19c12f326171048a2793e9dbf217671c4cd",
      "controls": [
        {
          "id": "xccdf_org.cisecurity.benchmarks_rule_1.1.1_L1_Ensure_Enforce_password_history_is_set_to_24_or_more_passwords",
          "results": [
            {
              "status": "failed",
              "code_desc": "Security Policy PasswordHistorySize is expected to be >= 24",
              "run_time": 0.007939,
              "start_time": "2020-03-09T15:33:59+00:00",
              "message": "expected: >= 24\n     got:    0"
            }
          ]
        }
      ]
    }
  ]
}

After removing the metadata, the 1322 KB json-automate report comes down to just 247 KB, a reduction of over 81%. Let's call this json-automate-reduced.
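As a rough sketch of the reduction described above, the function below strips everything from a parsed report except the profile identifiers, control ids, and results. The names `reduce_report` and `reduction_percent` are hypothetical and are not part of InSpec or Automate. The kept fields mirror the reduced example in this issue.

```python
IDENTITY_KEYS = ("name", "version", "sha256")


def reduce_report(report):
    """Return a copy of the report without static profile metadata."""
    # Top-level keys other than "profiles" (e.g. "platform") are kept as-is.
    reduced = {k: v for k, v in report.items() if k != "profiles"}
    reduced["profiles"] = []
    for profile in report.get("profiles", []):
        slim = {k: profile[k] for k in IDENTITY_KEYS if k in profile}
        slim["controls"] = [
            {"id": c["id"], "results": c.get("results", [])}
            for c in profile.get("controls", [])
        ]
        reduced["profiles"].append(slim)
    return reduced


def reduction_percent(original_size, reduced_size):
    """Size reduction as a percentage, rounded to one decimal place."""
    return round(100 * (1 - reduced_size / original_size), 1)
```

With the sizes quoted above, `reduction_percent(1322, 247)` evaluates to 81.3, matching the "over 81%" figure.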

I also tried the json-min reporter, designed in InSpec to provide minimal, report-only output without the profile metadata. The json-min output doesn't provide the run_time and start_time fields for the controls.

In Automate we store the run_time and start_time fields for each control result, but we don't expose them in the UI at all. The json-automate-reduced report can be further reduced to 208 KB if we remove run_time and start_time from control results. This would be a reduction of approx. 85% from the original size of 1322 KB. We discussed that the two fields might be valuable in the future for flagging long-running controls. A good compromise is to report the run_time and start_time for a control result only if run_time exceeds a threshold, say 1 second by default, changeable via an audit cookbook / effortless attribute.
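The threshold compromise could look something like the sketch below. The function name `drop_fast_timings` and the parameter `threshold_seconds` are hypothetical. In practice the threshold would come from an audit cookbook / effortless attribute that does not exist yet.

```python
def drop_fast_timings(results, threshold_seconds=1.0):
    """Keep run_time/start_time only on results slower than the threshold."""
    trimmed = []
    for result in results:
        if result.get("run_time", 0.0) > threshold_seconds:
            # Long-running result: keep the timing fields for later analysis.
            trimmed.append(dict(result))
        else:
            # Fast result: drop both timing fields to save space.
            trimmed.append({k: v for k, v in result.items()
                            if k not in ("run_time", "start_time")})
    return trimmed
```

For the sample result above, with a run_time of 0.007939 seconds, both timing fields would be dropped under the default 1 second threshold.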

To handle the case where a scan uses a profile that Automate doesn't have the metadata for, audit cookbook / effortless could query an Automate endpoint dedicated to checking whether the metadata for a profile (by sha256) exists. If it doesn't, the client sends the full 1322 KB report (minus the run_time and start_time fields). If Automate already has the profile metadata, the client sends the 208 KB report without the metadata. This will also allow legacy audit cookbook / effortless deployments to continue to work, while those that want to benefit from these improvements will have to upgrade both Automate and the effortless / audit cookbook version used.
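A minimal client-side sketch of that decision follows. The proposed Automate endpoint does not exist yet, so it is represented by a hypothetical callable `profile_is_known(sha256)` that returns True when Automate already has the metadata. `build_payload` is likewise an illustrative name. Trimming of run_time and start_time is left out to keep the example short.

```python
def build_payload(report, profile_is_known):
    """Strip metadata only from profiles the server already knows about.

    profile_is_known is a stand-in for the proposed (not yet existing)
    Automate lookup: it takes a profile sha256 and returns a bool.
    """
    payload = {k: v for k, v in report.items() if k != "profiles"}
    payload["profiles"] = []
    for profile in report.get("profiles", []):
        if profile_is_known(profile["sha256"]):
            # Metadata already stored server-side: send ids and results only.
            slim = {k: profile[k] for k in ("name", "version", "sha256")
                    if k in profile}
            slim["controls"] = [
                {"id": c["id"], "results": c.get("results", [])}
                for c in profile.get("controls", [])
            ]
            payload["profiles"].append(slim)
        else:
            # Unknown profile: send it in full so the metadata can be stored.
            payload["profiles"].append(profile)
    return payload
```

Deciding per profile rather than per report means a scan that mixes known and unknown profiles still sends the metadata that Automate is missing.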

I'm attaching here all the reports I mentioned above:

@trickyearlobe

Hi @alexpop, I have a customer that also has a problem with large reports, but the majority of the data is diffs (in their case, a single CIS control with 9 MB of data about packages that can be upgraded).

I think we could consider automatic field truncation in diffs (and maybe other output) when the report size exceeds 4 MB (probably with some kind of truncation marker). Thoughts?
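One way the truncation idea might be sketched, assuming the bulky diff text lives in each result's message field (as in the sample report in this issue). The function name, the marker text, and the 1024 character cap are all hypothetical choices and not existing InSpec or Automate behaviour.

```python
import copy
import json

LIMIT_BYTES = 4 * 1024 * 1024          # the 4 MB ingestion limit
TRUNCATION_MARKER = "... [truncated]"  # hypothetical marker text


def truncate_messages(report, limit_bytes=LIMIT_BYTES, max_chars=1024,
                      marker=TRUNCATION_MARKER):
    """Shorten long result messages, but only if the report is over the limit."""
    if len(json.dumps(report).encode("utf-8")) <= limit_bytes:
        return report  # small enough: leave the report untouched
    trimmed = copy.deepcopy(report)
    for profile in trimmed.get("profiles", []):
        for control in profile.get("controls", []):
            for result in control.get("results", []):
                message = result.get("message")
                if isinstance(message, str) and len(message) > max_chars:
                    result["message"] = message[:max_chars] + marker
    return trimmed
```

This single pass does not guarantee the result ends up under the limit. A real implementation would likely need to re-measure and truncate further, or cover other large fields as well.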

Would it make more sense to do that in core InSpec, or in the tools that pass it back to Automate?

alexpop commented Apr 8, 2020

Very good point, Richard. Could you obtain such a report, or the part of it that causes it to grow to that size? We could do it in core InSpec or in the software that sends the report for ingestion (e.g. audit / effortless).

@trickyearlobe

@alexpop I’ll send a private link for the problematic report, as it likely has customer info in it.
