[Reporting]: Check if CSV cells (including headers) start with known formula characters #37930

Merged
merged 5 commits into from
Jul 1, 2019

Contributor

@joelgriffith joelgriffith commented Jun 3, 2019

Hover effect in reporting list

Screen Shot 2019-06-14 at 3 38 08 PM

Toast notification with text

Screen Shot 2019-06-14 at 3 38 46 PM

The CSV download button now warns when the output may contain a formula, i.e. when a cell (including headers) starts with one of the =, -, +, or @ characters. See OWASP: https://www.owasp.org/index.php/CSV_Injection

interface IFlattened {
[header: string]: string;
}
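
For readers skimming the thread, here is a minimal sketch of the kind of check the description refers to. It is not the code from this PR: the helper names `startsWithFormulaChar` and `rowContainsFormula` are hypothetical, and it assumes each row has already been flattened into the `IFlattened` shape above.

```typescript
// Characters that spreadsheet applications may treat as the start of a formula
// (the set named in the PR description).
const FORMULA_CHARS: string[] = ['=', '-', '+', '@'];

interface IFlattened {
  [header: string]: string;
}

// Hypothetical helper: true when the first character of the value is a formula character.
function startsWithFormulaChar(value: string): boolean {
  return FORMULA_CHARS.indexOf(value.charAt(0)) !== -1;
}

// Hypothetical helper: checks both the headers (keys) and the cell values of one flattened row.
function rowContainsFormula(row: IFlattened): boolean {
  return Object.keys(row).some(
    (header) => startsWithFormulaChar(header) || startsWithFormulaChar(row[header])
  );
}
```

Because only the first character is inspected, a plain value such as an ID beginning with - is flagged too, so a check like this can produce false positives.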

This comment was marked as outdated.

@elasticmachine
Contributor

💚 Build Succeeded

@elasticmachine
Contributor

💚 Build Succeeded

@joelgriffith joelgriffith changed the title from "Foundations for doing CSV formula checks" to "[Reporting]: Check if CSV cells (including headers) start with known formula characters" Jun 5, 2019
@joelgriffith joelgriffith marked this pull request as ready for review June 5, 2019 15:41
@kertal kertal self-requested a review June 7, 2019 17:00
Member

@kertal kertal left a comment

For some reason this doesn't work, but I'm sure it's me. I generated a CSV without a warning:

"=1","_id","_index","_score","_type","test.test",test4
"=323223",1,csv,0,"_doc",,
,2,csv,0,"_doc","=323",
,3,csv,0,"_doc","=323",
,4,csv,0,"_doc","=SUM323",
,5,csv,0,"_doc",,"+ 32SUM323"

Example Payload:

PUT csv/_doc/2
{
  "=2":{"=323223": "?323"}
}

PUT csv/_doc/4
{
  "test":{"test": "=SUM323"}
}

PUT csv/_doc/5
{
  "test4":"=SUM323"
}

PUT csv/_doc/5
{
  "test4":"+ 32SUM323"
}

BTW: when I tried to export a CSV from Discover, the Report List link didn't work (in Chrome).
Screenshot 2019-06-07 at 20 06 01

@joelgriffith
Contributor Author

Good find! I did have it succeed on most of the stuff I threw at it. Let me check here.

@joelgriffith
Contributor Author

@kertal I'm able to get the warnings in the UI with your docs. What are you using to generate the CSV (Discover or the API)? You might also have to start with a fresh ES instance, since this does include index changes.

@kertal
Member

kertal commented Jun 7, 2019 via email

@joelgriffith
Contributor Author

It might be... I'm not sure how Kibana does these kinds of upgrades (I'll have to check)

@joelgriffith
Contributor Author

OK, so I've verified that the new mapping does get added when KBN starts up. However, the warning only applies to CSVs generated after the mapping has been updated. Here's what I did:

  • Started KBN/ES on master, created @kertal's index, generated a CSV -- no warnings (as expected).
  • Killed KBN, changed to my branch, started KBN.
  • Navigated to reporting, no warnings on prior report (as expected).
  • Re-generated report using Discover -- same settings.
  • Navigated back to reporting, and the newly minted CSV report has a warning.

I think this is ready for one last pass @kertal, thanks again for the thoroughness!

@joelgriffith
Contributor Author

Reports are listed at /api/reporting/jobs/list?page=0 and have a shape like:

[
  {
    "_index": ".reporting-2019.06.09",
    "_type": "_doc",
    "_id": "jwqnkoev0alq9d006270fn2f",
    "_score": null,
    "_source": {
      "kibana_name": "eMac.local",
      "browser_type": "chromium",
      "created_at": "2019-06-10T17:31:09.079Z",
      "priority": 10,
      "jobtype": "csv",
      "created_by": "elastic",
      "timeout": 120000,
      "kibana_id": "5b2de169-2785-441b-ae8c-186a1936b17d",
      "output": {
        "content_type": "text/csv",
        "size": 144,
        "warnings": [
          {
            "code": "security_csv_contains_known_formulas",
            "message": "Your CSV contains characters which spreadsheet applications interpret as formulas."
          }
        ],
        "max_size_reached": false
      },
      "process_expiration": "2019-06-10T17:33:12.066Z",
      "completed_at": "2019-06-10T17:31:12.135Z",
      "payload": {
        "headers": "ufgpQmopcODWEDp//axe7fLgOMPcikV2GWF6v6AtdA+mywgevajb4vkcIH3UBX3bhl/ZeEhaTo7/yw3ZUUi9vVwCHvswXDDud9XCjisywCB+AygRjIar3tCp8E/lTyH6BQ3Dh55WBO3JDBIhtD7bcwkqiVGB2M/IZcrz2r14lBlsGyTC/ugu3ymgGVf9PweuF8nPuV2HkikHE0TRYbNGmr4EsQfK64MO0YaOXbxXj7x/mjKiH+J3kBUFLqM1S11rYXvjTF0p4GHgbC77CNfm7sMtbQKh8vBZ9Jk42FoL0LBSTssbZU0Kz9bQwOr9HTim+bR70E0vFExtiFTNh/Vei7CX31tnV2BUjJPVWJ0cluKAMbo+l5oN4ehc9ZiqxgaQc+wTZanlE7ElY6QPTHxUW6rs42yaLQ/j7ZQDtoNkFe+9yQJSBxV9Y0P3LZHWnVA6D9wCGBriGNV6vD2H4OidGmNg4QrBrRt3Uk25M3WtmH/DnLmU/3mOIj0D8flcsc5wI6LjsTGh/HlYTWbj5Cy18eyBA+E3P9hFPr3tfYlbTLdEak3mY9/BM+z0rplq4QSSC44HlSqMXJ28KviN3UQ7CjaquygkAdVFaimA0a3mcb9OQ+T3tfpFc7v9vfpDn9nqaWCoP6ThUTbj/OEd4fXamZNrm7/sUs71W+G0gsKsVKoke98HQr5lST4JpB7Gpu1JuGslWsE27Wwv4HGvGQSZIIZ72MHkt9JPCTTehy4ftphBVSmobwZ9IviYsxKPMpW12remQ+4qqFhJPJEUSA93f/czlg087o/28LGWmzS7mrlw/UQGrVwN70mM971ViUe4K4Aa9Bj21ne07BCe1jWJ3U/kiK7Vc4ol630xBi86LY3AJpzSq6HX3TLqNfsQaPqIaqCOwyLsxfgv+l2H+kpJWtgA4EpRT7pzfuvNb6cZmJCYpd8+23HxoighoNnDFZ5cMRPlhtd1/H70K/at+IuKDPMW2E0/2fGj0GczxNaEmcaeokvbqxgzhz2A52hV8U5OhhQCO3h1xh5sFixzsrtmePTuy0/q8kISwi7/+u3kJizlzCquC2S5FB9RFqBqXv9qpcvbjkTheQvTSZcmo55WfOPZvKXko7IJwtLibUzbiXo+3fXKa6cT+GrcqJAuhf8+yyk/XXyl0p5TVhwla6umwYworbmzQazeiDYFmP1DkiGXwWBMoFbe63vJPAWxVS1YUYraIAQEo5TrDu7HyE2afEN6RUS/kvAxDcXGzDwe7j93YzK5ob2mAWMpz82uNXnLL45evfU9rfnUV/m1zzxDojM=",
        "searchRequest": {
          "index": "csv*",
          "body": {
            "stored_fields": [
              "=2.=323223",
              "_id",
              "_index",
              "_score",
              "_type",
              "test.test",
              "test4"
            ],
            "query": {
              "bool": {
                "filter": [
                  {
                    "match_all": {}
                  }
                ],
                "must_not": [],
                "should": [],
                "must": []
              }
            },
            "script_fields": {},
            "_source": {
              "excludes": [],
              "includes": [
                "=2.=323223",
                "_id",
                "_index",
                "_score",
                "_type",
                "test.test",
                "test4"
              ]
            },
            "docvalue_fields": [],
            "sort": [
              {
                "_score": {
                  "order": "desc"
                }
              }
            ],
            "version": true
          }
        },
        "indexPatternSavedObject": {
          "migrationVersion": {
            "index-pattern": "6.5.0"
          },
          "updated_at": "2019-06-10T17:30:58.877Z",
          "references": [],
          "attributes": {
            "title": "csv*",
            "fields": "[{\"name\":\"=2.=323223\",\"type\":\"string\",\"esTypes\":[\"text\"],\"count\":1,\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"name\":\"=2.=323223.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"=2.=323223\",\"subType\":\"multi\"},{\"name\":\"_id\",\"type\":\"string\",\"esTypes\":[\"_id\"],\"count\":1,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"name\":\"_index\",\"type\":\"string\",\"esTypes\":[\"_index\"],\"count\":1,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"name\":\"_score\",\"type\":\"number\",\"count\":1,\"scripted\":false,\"searchable\":false,\"aggregatable\":false,\"readFromDocValues\":false},{\"name\":\"_source\",\"type\":\"_source\",\"esTypes\":[\"_source\"],\"count\":0,\"scripted\":false,\"searchable\":false,\"aggregatable\":false,\"readFromDocValues\":false},{\"name\":\"_type\",\"type\":\"string\",\"esTypes\":[\"_type\"],\"count\":1,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"name\":\"test.test\",\"type\":\"string\",\"esTypes\":[\"text\"],\"count\":1,\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"name\":\"test.test.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"test.test\",\"subType\":\"multi\"},{\"name\":\"test4\",\"type\":\"string\",\"esTypes\":[\"text\"],\"count\":1,\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"name\":\"test4.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"test4\",\"subType\":\"multi\"}]"
          },
          "id": "797387b0-8ba5-11e9-b7d5-93bc205b27db",
          "type": "index-pattern",
          "version": "WzEyLDFd"
        },
        "basePath": "",
        "conflictedTypesFields": [],
        "indexPatternId": "797387b0-8ba5-11e9-b7d5-93bc205b27db",
        "metaFields": [
          "_source",
          "_id",
          "_type",
          "_index",
          "_score"
        ],
        "fields": [
          "=2.=323223",
          "_id",
          "_index",
          "_score",
          "_type",
          "test.test",
          "test4"
        ],
        "title": "New Saved Search",
        "type": "search"
      },
      "meta": {
        "layout": "none",
        "objectType": "search"
      },
      "max_attempts": 3,
      "started_at": "2019-06-10T17:31:12.066Z",
      "attempts": 1,
      "status": "completed"
    },
    "sort": [
      1560187869079
    ]
  }
]

Path should be _source.output.warnings -- where warnings might be missing on older versions.
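
As an illustration of reading that path defensively, here is a small sketch for a client of the jobs list. The type and function names (`ReportJob`, `getWarnings`, `hasFormulaWarning`) are hypothetical, and only the fields used here are modelled; the warning code is the one from the example response above.

```typescript
interface ReportWarning {
  code: string;
  message: string;
}

// Only the parts of a jobs/list entry that this sketch reads.
interface ReportJob {
  _id: string;
  _source: {
    jobtype: string;
    // Jobs that have not finished may have no output yet, and reports
    // generated by older versions may have no warnings array.
    output?: { content_type: string; warnings?: ReportWarning[] };
  };
}

// Returns the warnings at _source.output.warnings, or an empty list when absent.
function getWarnings(job: ReportJob): ReportWarning[] {
  const output = job._source.output;
  return output && output.warnings ? output.warnings : [];
}

// True when the job carries the formula warning code shown in the example response.
function hasFormulaWarning(job: ReportJob): boolean {
  return getWarnings(job).some((w) => w.code === 'security_csv_contains_known_formulas');
}
```

Treating a missing array as "no warnings" keeps older reports rendering as they did before.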

@elasticmachine
Contributor

💚 Build Succeeded

@kertal kertal self-requested a review June 11, 2019 11:41
Member

@kertal kertal left a comment

Thank you, it works. Code LGTM; tested with Chrome and Firefox.
One addition: the warning would also make sense when you download the CSV directly in Discover.

@joelgriffith
Contributor Author

Going to wait for @jakelandis to comment that this will work before we commit to it :)

@jakelandis
Contributor

I'm testing this out and was curious whether the following download link (bottom right in Discover) should also carry the warning.
image

Also, I'm pretty sure that this report does not contain formulas (it is the logs sample data built into Kibana), and pretty sure it only started to show this message after I added a report that included data to trigger the warning.
image

@joelgriffith
Contributor Author

@jakelandis great point on the toast notification, I'll look to rectify that.

The sample data for logs I think does contain a - in some cell values (which is one of the characters listed here https://www.owasp.org/index.php/CSV_Injection). @kobelb how do we want to handle that -- my feeling is that we'll likely get a few false positives with the - char.

@kobelb
Contributor

kobelb commented Jun 12, 2019

@joelgriffith, is the - appearing at the start of a "cell"?

@jakelandis
Contributor

@joelgriffith - I am starting to look at implementing something similar in the Watcher email. After taking a closer look, it seems that we don't ever call the jobs/list api, rather we make the initial call to reporting API [1] and grab the URL for the final payload [2], then busy wait (poll) for the response to be a 200. Once we find a success we just pull the bytes and use that as the attachment.

Is it possible to include the warnings as part of the response body?

Perhaps "kb_reporting_warnings" : "comma separated list of messages"

I realize that in prior conversations I asked for an array of objects. However, for a variety of reasons (happy to go into more detail if you care), I don't think I will be able to make use of "code", and a flat list of strings is sufficient.

I can make an additional call to get the warnings, but would love if I could just pull them from the response header of payload response [2].

example requests
[1] http://localhost:5601/oqh/api/reporting/generate/csv?jobParams=%28conflictedTypesFields%3A%21%28%29%2Cfields%3A%21%28_id%2C_index%2C_score%2C_type%2Cfoo%29%2CindexPatternId%3A%27161601e0-8c89-11e9-b1b9-391c1d6eacce%27%2CmetaFields%3A%21%28_source%2C_id%2C_type%2C_index%2C_score%29%2CsearchRequest%3A%28body%3A%28_source%3A%28excludes%3A%21%28%29%29%2Cdocvalue_fields%3A%21%28%29%2Cquery%3A%28bool%3A%28filter%3A%21%28%29%2Cmust%3A%21%28%28match_all%3A%28%29%29%29%2Cmust_not%3A%21%28%29%2Cshould%3A%21%28%29%29%29%2Cscript_fields%3A%28%29%2Csort%3A%21%28%28_score%3A%28order%3Adesc%29%29%29%2Cstored_fields%3A%21%28%27*%27%29%2Cversion%3A%21t%29%2Cindex%3A%27attack*%27%29%2Ctitle%3Aattackme%2Ctype%3Asearch%29
[2] http://localhost:5601/oqh/api/reporting/jobs/download/jwtk4y7l0ao664cc335v5ikv

@jakelandis
Contributor

In addition to the above request to return the warning as part of the download response, could you return an abbreviated warning? For example:

Your CSV contains characters which spreadsheet applications interpret as formulas.

Doesn't read very well in the context of an email's attachment. I was thinking the email attachment should read closer to

* Warning - The attached file [report.csv] has been flagged as suspicious due to a potential for [CSV injection]. Use caution to open the attachment.

In this case you would pass the string "CSV injection" back with the payload.

@joshbressers - Can you provide some guidance here on how the email version of this should read?

@joelgriffith
Contributor Author

@kobelb yes, some logs tend to start with -, which seems to be a common character for IDs (some variation of GUID).

@jakelandis I'll take a look at your comments tomorrow -- currently on support duty today but will follow up with details tomorrow.

@joelgriffith
Contributor Author

but would love if I could just pull them from the response header of payload response

I think doing it in a header makes the most sense; otherwise the overall structure of the payload wouldn't work (since it's a CSV, there are no other places to store meta-data). I think for programmatic usage a new header is a good idea. I'll see if we have a convention in Kibana for doing so.

@kobelb
Contributor

kobelb commented Jun 13, 2019

yes, some logs tend to start with -, which seems to be a common character for IDs (some variation of GUID).

Interesting, I know at some point we talked about leaving these warnings "off" by default, and allowing users to opt-in to them. If this does end up being super common, perhaps we should consider this approach?

@joelgriffith
Contributor Author

OK, given our usage of this and the findings from it, I'm going to do the following:

  • Update the JSON payload to have a unique property csvHasFormulas: true on it. This makes it easier to write docs and parse out for downstream clients vs having an array of warnings.

  • I'll also add a header to the response (TBD), so that programmatic usage can get a sense of the issue. Also might add a header for max-sized-reached as well just to have parity.

  • Finally, I'll throw a config flag at this and make it configurable since we do get some false positives.

Sound good?

@kobelb
Contributor

kobelb commented Jun 14, 2019

Sounds good to me!

@jakelandis
Contributor

Update the JSON payload to have a unique property csvHasFormulas: true on it. This makes it easier to write docs and parse out for downstream clients vs having an array of warnings.

Which JSON payload? For the final payload we consume api/reporting/jobs/download/<id>, which I am pretty sure we just read as bytes and attach as-is (no parsing).

I'll also add a header to the response (TBD), so that programmatic usage can get a sense of the issue. Also might add a header for max-sized-reached as well just to have parity.

+1

Finally, I'll throw a config flag at this and make it configurable since we do get some false positives.

+1 - ES will also have a config to disable/enable this. I will follow your lead on the default.

@joelgriffith
Contributor Author

Which JSON payload? For the final payload we consume api/reporting/jobs/download/<id>, which I am pretty sure we just read as bytes and attach as-is (no parsing).

Yeah -- not on the CSV payload itself, but the meta-data that comes from the GET on all reports (which is a JSON API). CSV API will have no changes.

@joelgriffith
Contributor Author

OK folks -- I'm out all next week, so I tried to get this as far as possible. My TODOs are still:

  • Add config flag
  • Add tests for the new headers

Other than that this is pretty much feature-complete. Headers are:

kbn-csv-contains-formulas: true | false and
kbn-max-size-reached: true | false

Let me know if it's easier to just detect their presence versus parsing out their values...
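
For a programmatic client, reading those two headers could look roughly like the sketch below. The `HeaderReader` interface and `readCsvFlags` function are hypothetical names; the interface is the minimal subset of the standard Fetch `Headers` object this needs, so a real `Response.headers` could be passed in. The sketch parses the value rather than detecting presence, and treats a missing header as false, which is an assumption about how responses from older Kibana versions should be handled.

```typescript
// Minimal subset of the Fetch API's Headers object that this sketch relies on.
interface HeaderReader {
  get(name: string): string | null;
}

interface CsvDownloadFlags {
  containsFormulas: boolean;
  maxSizeReached: boolean;
}

// Reads the two headers named above; a missing header or any value other
// than "true" is treated as false.
function readCsvFlags(headers: HeaderReader): CsvDownloadFlags {
  const isTrue = (name: string): boolean =>
    (headers.get(name) || '').trim().toLowerCase() === 'true';
  return {
    containsFormulas: isTrue('kbn-csv-contains-formulas'),
    maxSizeReached: isTrue('kbn-max-size-reached'),
  };
}
```

Parsing the value keeps the client correct whether the server sends the header as false or omits it.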

@elasticmachine
Contributor

💚 Build Succeeded

@joelgriffith
Contributor Author

@jakelandis let me know if there's anything you needed from this PR that I haven't covered. Going to get it up to speed and hopefully merge soon

@elasticmachine
Contributor

💚 Build Succeeded

@joelgriffith
Contributor Author

The config flag is now added and defaults to true. To turn it off, set:

xpack.reporting.csv.checkForFormulas: false

With the flag set to false, reporting does not check for CSV formulas during report generation, so no meta-data about whether a CSV contains formulas is stored.
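
The effect of the flag can be sketched as follows. This is an illustration, not the code in this PR: `CsvConfig`, `OutputMeta`, and `buildOutputMeta` are hypothetical names, `csvHasFormulas` is the property name proposed earlier in this thread (the stored field may be named differently), and `max_size_reached` is taken from the example response above.

```typescript
// Mirrors xpack.reporting.csv.checkForFormulas from kibana.yml.
interface CsvConfig {
  checkForFormulas: boolean;
}

interface OutputMeta {
  max_size_reached: boolean;
  // Omitted entirely when the check is disabled.
  csvHasFormulas?: boolean;
}

// Hypothetical helper: runs the (possibly expensive) formula detection only
// when the flag is on, and stores no formula meta-data otherwise.
function buildOutputMeta(
  config: CsvConfig,
  maxSizeReached: boolean,
  detectFormulas: () => boolean
): OutputMeta {
  const meta: OutputMeta = { max_size_reached: maxSizeReached };
  if (config.checkForFormulas) {
    meta.csvHasFormulas = detectFormulas();
  }
  return meta;
}
```

Under this sketch, a consumer has to distinguish "checked and clean" (false) from "not checked" (property absent).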

@elasticmachine
Contributor

💚 Build Succeeded

@joelgriffith
Contributor Author

@jakelandis not sure if you've had time to look at this, but let me know when you're comfortable with it.

@elasticmachine
Contributor

💚 Build Succeeded

@joelgriffith joelgriffith merged commit 7b67613 into elastic:master Jul 1, 2019
joelgriffith added a commit that referenced this pull request Jul 1, 2019
…known formula characters (#37930) (#40081)

* [Reporting]: Check if CSV cells (including headers) start with known formula characters (#37930)

* Re-working csv injection issue into master

* Config flag for checking if CSV's contain formulas

* Fixing snapshots

* Fixing bad merge conflict with get_document_payload
jakelandis added a commit to elastic/elasticsearch that referenced this pull request Aug 14, 2019
This commit introduces a Warning message to the emails generated by 
Watcher's reporting action. This change complements Kibana's CSV 
formula notifications (see elastic/kibana#37930). 

This is implemented by reading a header (kbn-csv-contains-formulas) 
provided by Kibana to notify to attach the Warning to the email. 
The wording of the warning is borrowed from Kibana's UI and may 
be overridden by a dynamic setting
xpack.notification.reporting.warning.kbn-csv-contains-formulas.text.
This warning is enabled by default, but may be disabled via a 
dynamic setting xpack.notification.reporting.warning.enabled.
jakelandis added a commit to jakelandis/elasticsearch that referenced this pull request Aug 14, 2019
…c#44460)

jakelandis added a commit to elastic/elasticsearch that referenced this pull request Aug 26, 2019
#45557)

* Watcher add email warning if CSV attachment contains formulas (#44460)
