Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data Table - CSV export formulas warning #56081

Closed
kobelb opened this issue Jan 27, 2020 · 17 comments · Fixed by #105221
Closed

Data Table - CSV export formulas warning #56081

kobelb opened this issue Jan 27, 2020 · 17 comments · Fixed by #105221
Labels
enhancement New value added to drive a business result Feature:Data Table Data table visualization feature Feature:Inspector Inspector infrastructure and implementations Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@kobelb
Copy link
Contributor

kobelb commented Jan 27, 2020

When exporting a CSV using the Data Table visualization, we should warn users when the CSV contains characters which some spreadsheet applications interpret as formulas. Any cells, including the header, which start with the following characters can be interpreted as a formula: =, +, -, @.

Reporting implemented a similar protection in #37930, where the following modal is displayed after a report completes processing which contains formulas, and the user much click on "Download" after confirmation:

Screen Shot 2020-01-27 at 3 05 24 PM

@kobelb kobelb added enhancement New value added to drive a business result Team:Visualizations Visualization editors, elastic-charts and infrastructure labels Jan 27, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-app (Team:KibanaApp)

@arisonl
Copy link
Contributor

arisonl commented Mar 31, 2020

In a recent customer call, the customer asked for the ability to escape these characters, cc: @AlonaNadler

@AlonaNadler
Copy link

Adding @alexfrancoeur as he is responsible for reporting. I see Kibana behavior similar to most solutions in the market in this case

@alexfrancoeur
Copy link

@AlonaNadler when you say Kibana behavior, are you referring to what Kibana's behavior is today with no warning? Or what's being proposed here?

@arisonl when you say escape, do you mean escaping them like this\=? Or actually excluding these types of symbols with the export.

@arisonl
Copy link
Contributor

arisonl commented Apr 8, 2020

@alexfrancoeur my understanding is excluding them, but I can reach out to clarify

@AlonaNadler
Copy link

Most products in the market don't support this request, they go ahead and export the CSV even if there are special characters. I don't see us reviewing all data and removing these, seem to like a rare request.

@arisonl
Copy link
Contributor

arisonl commented Apr 8, 2020

Agreed but if we detect them like what this issue describes, how much extra effort is it to exclude them? It could be small (depending on the details of course). I think that if we implement the warning that this issue describes, we can consider exclusion in a subsequent iteration as an option for the user provided that it is a small incremental step.

@joshbressers
Copy link

CSV data is meant to be escaped by adding a single quote to the start of the field.

So for example

=cmd|' /C calc'!A0

Becomes

'=cmd|' /C calc'!A0

You can find more details about this problem here
https://owasp.org/www-community/attacks/CSV_Injection

It is fundamentally a spreadsheet and policy problem. Spreadsheets launching executables is ridiculous.

@alexfrancoeur
Copy link

I agree that generally, this is a spreadsheet problem, but we're also seeing members of our community putting ownership back on the stack as well. It feels like there is some middle ground here. As Alona suggested, for the majority of our users and competing products in the analytics space, this is not common practice. That being said, more and more the stack is being deployed into financial, government and health organizations in which the primary consumer of Kibana are end users that may not fully understand the technical implications of CSV export.

I propose we consider a Kibana config setting that would enable escaping characters at the beginning of a cell that could be interpreted as formulas ( =, +, -, @). This makes more sense over an advanced setting as it's applied across Kibana, likely the level of control an administrator of Kibana would want.

@elastic/kibana-reporting-services what would the level of effort be to introduce this type of functionality? I don't believe we manipulate data exported anywhere at the moment.

@alexfrancoeur
Copy link

Linking to @joelgriffith's PR: #63645

@legrego
Copy link
Member

legrego commented Apr 21, 2020

@joelgriffith's PR solved this for Reporting's CSV generation, but this the Data Table CSV export is something else altogether. Data Table's CSV is actually located here, and does not benefit from the work that Joel did in #37930 or #63645

@legrego
Copy link
Member

legrego commented Apr 21, 2020

@elastic/kibana-app how much effort do you think it would take to replicate Reporting's functionality for Data Table export?

@timroes timroes added Feature:Data Table Data table visualization feature Feature:Inspector Inspector infrastructure and implementations labels Apr 21, 2020
@timroes
Copy link
Contributor

timroes commented Apr 21, 2020

@legrego Should be a medium sized effort to get this into Kibana App code (it's not only the data table, but also the inspector data CSV export I assume that would need that). Since we're basically running clientside we would also need to check the full table to render that button with a warning (in case we want similar behavior) for every datatable. That could have significant performance impacts, since we know users are creating data tables with a lot of columns, and we sometimes even run into problems parsing the resulting nested JSON returned by ES for that. So I'd assume doing another time of iterating over every cell might cause some performance drawbacks on every dashboard having a data table visible, so not sure if we would want to show that warning, or just take that advanced setting into account.

@flash1293
Copy link
Contributor

There's a config flag for reporting escaping cell which might include formulas:

const formulasEscaped = escapeFormulas && cellHasFormulas(val) ? "'" + val : val;

We could use the same setup for all of the remaining client side CSV exports.

@timroes
Copy link
Contributor

timroes commented Feb 22, 2021

currently that setting is coming from a reportnig specific names setting. We should either generalize that, or: I would neither be opposed to just always escape formulars. We know that Kibana itself will never create "purposfully" any formular in a CSV, and I don't see any use-case why a user might want to do that via a field value. Thus I think we'd be fine just escaping all cells beginning with = via ' in any export even without a config flag. @legrego any thoughts around it?

@flash1293
Copy link
Contributor

I guess for reporting the decision to solve it via flag was made because previously it would be possible to export CSVs with embedded formulas - something a user could do on purpose via a number of ways (e.g. by using scripted fields to embed functioning hyperlinks in generated reports for use from within Excel). Defaulting to escaping without providing a way to turn it off would cut off those users. The flag should definitely be set to true by default. IMHO this case is so similar to the reporting one we should behave in the same way.

@legrego
Copy link
Member

legrego commented Feb 22, 2021

currently that setting is coming from a reportnig specific names setting. We should either generalize that, or: I would neither be opposed to just always escape formulars. We know that Kibana itself will never create "purposfully" any formular in a CSV, and I don't see any use-case why a user might want to do that via a field value. Thus I think we'd be fine just escaping all cells beginning with = via ' in any export even without a config flag. @legrego any thoughts around it?

This is only partially related, but I've long wanted to consolidate the CSV generation logic, so that data tables, reporting, and others(?) could share the same code for creating CSVs.

I think we need to make the escaping of formulas optional across the board, so I'm +1 to generalizing the configuration setting. I think we should generate CSVs without escaping by default, but allow administrators to opt-in to this behavior if they require it. Our CSVs should strive to be an unadulterated export of the user's data (while still conforming to the CSV specification). However, if an administrator wants us to escape certain character sequences to prevent certain types of attacks, then that's something we should allow them to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New value added to drive a business result Feature:Data Table Data table visualization feature Feature:Inspector Inspector infrastructure and implementations Team:Visualizations Visualization editors, elastic-charts and infrastructure
Projects
None yet
10 participants