# Video: Security Concerns with Query and Eval

This video points out some security risks from carelessly constructed query strings.

In [None]:
import pandas as pd
import sys

In [None]:
abalone = pd.read_csv("https://raw.githubusercontent.com/bu-cds-omds/dx602-examples/main/data/abalone.tsv", sep="\t")


## Referencing Python Variables is Convenient

```
target_sexes = ['M', 'F']
abalone.query("Sex in @target_sexes")
```

## Never Insert Outside Data in Your Query String

If you construct your query string from someone else's data, you risk running arbitrary Python code of someone else's choice.

* <font color="red">abalone.query(f"""Sex in ({','.join("'" + s + "'" for s in target_sexes)})""")</font>
* <font color="red">abalone.query(f"...")</font>
* <font color="red">abalone.query("... == " + user_data)</font>

## Code Example: An Evil Request

In [None]:
evil_request = "', @sys.stderr.write('hi'), '"

In [None]:
abalone.query("Sex =='" + evil_request + "'")

hi

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.
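
To see why `hi` was printed, it can help to look at the string that the concatenation actually hands to `query`. The sketch below rebuilds that string without running it, then uses Python's `ast` module as a stand-in parser. Stripping the `@` prefix first is an assumption made only so that plain Python can parse the text; it is not a description of pandas internals.

```python
import ast

evil_request = "', @sys.stderr.write('hi'), '"

# The same concatenation as in the cell above, but only building the string.
query_string = "Sex =='" + evil_request + "'"
print(query_string)
# Sex =='', @sys.stderr.write('hi'), ''

# Assumption: removing "@" approximates how the local name would be resolved,
# so Python's own parser can show the structure. Nothing is evaluated here.
tree = ast.parse(query_string.replace("@", ""), mode="eval")
element_types = [type(node).__name__ for node in tree.body.elts]
print(type(tree.body).__name__, len(element_types))
# Tuple 3
```

The outside data closed the quotation mark early, so the query became a three-element tuple: the comparison `Sex ==''`, a call to `sys.stderr.write`, and an empty string. The call ran while the expression was being evaluated, and the mismatched tuple is the likely source of the `ValueError` about a shape of `(3,)`.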

## So What?

What if the evil request instead accessed
* `os.remove` to remove files?
* `os.environ` to access security credentials in your production environment?
* `dict.update` to change some of your program data?

... and cleverly did so while still returning a legitimate value, so there was no crash to catch?

## But Why Would I Do That?

* Automatic analysis finds an interesting value in one data set...
* Automated follow-up or verification copies that value to run more queries in another data set.

## But No One is Out to Get Me...

* Do you want your analysis to crash because of data with an @ or quotation mark included?
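
A minimal sketch of that failure mode, using a small made-up table. The name `toy` and the value `O'Brien` are illustrative assumptions and are not part of the abalone data. The broad `except Exception` is deliberate, since the exact error type may vary by pandas and Python version.

```python
import pandas as pd

# Hypothetical stand-in table so the sketch runs without downloading anything.
toy = pd.DataFrame({"Sex": ["M", "F", "I", "O'Brien"], "Rings": [15, 7, 9, 10]})

# Harmless data that happens to contain a quotation mark.
awkward_value = "O'Brien"

# Concatenation: the quotation mark in the data ends the string literal early,
# so the query text is no longer a valid expression.
try:
    toy.query("Sex == '" + awkward_value + "'")
    concatenation_failed = False
except Exception as exc:
    concatenation_failed = True
    print("concatenated query failed:", type(exc).__name__)

# @variable reference: the value is passed as data and never parsed as code.
safe_rows = toy.query("Sex == @awkward_value")
print(len(safe_rows), "matching row(s)")
```

The concatenated version fails on perfectly innocent data, while the `@` reference treats the value as a plain string.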


## Code Example

In [None]:
abalone.query("Sex == @evil_request")

Unnamed: 0,Sex,Length,Diameter,Height,Whole_weight,Shucked_weight,Viscera_weight,Shell_weight,Rings


## Just Use @variable References

They
* Are usually easier to read
* Are usually easier to write
* Will not execute code unexpectedly.