Python: Private Data Cleartext Storage/Logging #3899

Open
wants to merge 2 commits into
base: main
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<include src="PrivateCleartextStorage.qhelp" /></qhelp>
@@ -0,0 +1,36 @@
/**
* @name Clear-text logging of private information
* @description Logging private information without encryption or hashing can
* expose it to an attacker.
* @kind path-problem
* @problem.severity error
* @id py/clear-text-logging-private-data
* @tags security
* external/cwe/cwe-312
* external/cwe/cwe-315
* external/cwe/cwe-359
*/

import python
import semmle.python.security.Paths
import semmle.python.security.TaintTracking
import experimental.semmle.python.security.PrivateData
import semmle.python.security.ClearText

class CleartextLoggingConfiguration extends TaintTracking::Configuration {
  CleartextLoggingConfiguration() { this = "ClearTextLogging" }

  override predicate isSource(DataFlow::Node src, TaintKind kind) {
    src.asCfgNode().(PrivateData::Source).isSourceOf(kind)
  }

  override predicate isSink(DataFlow::Node sink, TaintKind kind) {
    sink.asCfgNode() instanceof ClearTextLogging::Sink and
    kind instanceof PrivateData
  }
}

from CleartextLoggingConfiguration config, TaintedPathSource source, TaintedPathSink sink
where config.hasFlowPath(source, sink)
select sink.getSink(), source, sink, "Private data returned by $@ is logged here.",
  source.getSource(), source.getCfgNode().(PrivateData::Source).repr()
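
For illustration, here is a minimal Python sketch of the pattern this query is meant to report (a value returned by a call such as `get_bankaccount()` reaching a logging call), together with one way to keep the cleartext value out of the log. The helpers `mask` and `fingerprint`, the logger name, and the sample account number are hypothetical and not part of this PR. Whether the query would treat such helpers as sanitizers is not stated here and should not be assumed.

```python
import hashlib
import logging

logger = logging.getLogger("payments")  # hypothetical logger name


def mask(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with '*'."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def fingerprint(value: str) -> str:
    """Return a short SHA-256 prefix for correlating log entries.

    Assumption: an unsalted hash is acceptable for this illustration only.
    Low-entropy values can be brute-forced, so a keyed hash (HMAC) is the
    safer choice in practice.
    """
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def get_bankaccount() -> str:
    # Stand-in for a call whose name matches the "bank.*account" heuristic.
    # The returned value is made up for the example.
    return "DE89370400440532013000"


account = get_bankaccount()

# Pattern the query is designed to flag (left commented out on purpose):
# logger.info("account: %s", account)

# Logging only a masked form and a fingerprint instead of the cleartext value:
logger.info("account: %s (id %s)", mask(account), fingerprint(account))
```

The masked form keeps enough of the value for support staff to recognise it, while the fingerprint lets log entries be correlated without storing the value itself.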
@@ -0,0 +1,42 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>

<overview>
<p>
Private information that is stored unencrypted is accessible to an attacker
who gains access to the storage. This is particularly important for cookies,
which are stored on the machine of the end-user.
</p>
</overview>

<recommendation>
<p>
Ensure that private information is always encrypted before being stored.
If possible, avoid placing private information in cookies altogether.
Instead, prefer storing, in the cookie, a key that can be used to look up the
private information.
</p>
<p>
In general, decrypt private information only at the point where it is
necessary for it to be used in cleartext.
</p>

<p>

Be aware that external processes often store the <code>standard
out</code> and <code>standard error</code> streams of the application,
causing logged private information to be stored as well.

</p>

</recommendation>

<references>

<li>M. Dowd, J. McDonald and J. Schuh, <i>The Art of Software Security Assessment</i>, 1st Edition, Chapter 2 - 'Common Vulnerabilities of Encryption', p. 43. Addison Wesley, 2006.</li>
<li>M. Howard and D. LeBlanc, <i>Writing Secure Code</i>, 2nd Edition, Chapter 9 - 'Protecting Secret Data', p. 299. Microsoft, 2002.</li>

</references>
</qhelp>
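
As a rough illustration of the recommendation above (store a lookup key in the cookie rather than the private value itself), here is a minimal Python sketch using only the standard library. The names `_private_store`, `issue_cookie_value` and `lookup` are hypothetical, and the in-memory dictionary stands in for whatever server-side storage a real application would use.

```python
import secrets
from typing import Dict, Optional

# Hypothetical server-side store. A real application would use a database or
# cache with expiry; a module-level dict is an assumption made for brevity.
_private_store: Dict[str, str] = {}


def issue_cookie_value(private_value: str) -> str:
    """Keep the private value on the server and return an opaque token.

    Only the token is meant to be written to the cookie.
    """
    token = secrets.token_urlsafe(32)
    _private_store[token] = private_value
    return token


def lookup(cookie_value: str) -> Optional[str]:
    """Resolve a cookie token back to the private value, or None if unknown."""
    return _private_store.get(cookie_value)


# Usage: the cookie carries the token, never the social security number.
cookie_value = issue_cookie_value("123-45-6789")  # made-up sample value
```

In a Flask handler like the ones in the test file, the token would be what gets passed to `resp.set_cookie(...)` in place of the raw request parameter.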
@@ -0,0 +1,36 @@
/**
* @name Clear-text storage of private information
* @description Storing private information without encryption or hashing can expose it to an
* attacker.
* @kind path-problem
* @problem.severity error
* @id py/clear-text-storage-private-data
* @tags security
* external/cwe/cwe-312
* external/cwe/cwe-315
* external/cwe/cwe-359
*/

import python
import semmle.python.security.Paths
import semmle.python.security.TaintTracking
import experimental.semmle.python.security.PrivateData
import semmle.python.security.ClearText

class CleartextStorageConfiguration extends TaintTracking::Configuration {
  CleartextStorageConfiguration() { this = "PrivateClearTextStorage" }

  override predicate isSource(DataFlow::Node src, TaintKind kind) {
    src.asCfgNode().(PrivateData::Source).isSourceOf(kind)
  }

  override predicate isSink(DataFlow::Node sink, TaintKind kind) {
    sink.asCfgNode() instanceof ClearTextStorage::Sink and
    kind instanceof PrivateData
  }
}

from CleartextStorageConfiguration config, TaintedPathSource source, TaintedPathSink sink
where config.hasFlowPath(source, sink)
select sink.getSink(), source, sink, "Private data from $@ is stored here.", source.getSource(),
  source.getCfgNode().(PrivateData::Source).repr()
222 changes: 222 additions & 0 deletions python/ql/src/experimental/semmle/python/security/PrivateData.qll
@@ -0,0 +1,222 @@
/**
* Provides classes and predicates for identifying private data and methods for security.
*
* 'Private' data in general is anything that should not be sent around in unencrypted form. This
* library tries to guess where private data may either be stored in a variable or produced by a
* method.
*
* In addition, there are methods that ought not to be executed at all, or at least not in a way
* that the user can control. These include authorization methods such as logins, and methods that
* send data.
*/

import python
import semmle.python.security.TaintTracking
import semmle.python.web.HttpRequest

/**
* Provides heuristics for identifying names related to private information.
*
* INTERNAL: Do not use directly.
* This is copied from the sensitive data library (which was itself copied from a JavaScript library), but should be language independent.
*/
private module HeuristicNames {
  /**
   * Gets a regular expression that identifies strings that may indicate the presence of private data.
   */
  string maybeSocialSecurityNumber() { result = "(?is).*social.*security.*" }
  string maybePostCode() { result = "(?is).*postcode.*" }
  string maybeZipCode() { result = "(?is).*zipcode.*" }
  string maybeTelephone() { result = "(?is).*telephone.*" }
  string maybeLatitude() { result = "(?is).*latitude.*" }
  string maybeLongitude() { result = "(?is).*longitude.*" }
  string maybeCreditCard() { result = "(?is).*credit.*card.*" }
  string maybeSalary() { result = "(?is).*salary.*" }
  string maybeBankAccount() { result = "(?is).*bank.*account.*" }
  string maybeEmail() { result = "(?is).*email.*" }
  string maybeMobile() { result = "(?is).*mobile.*" }
  string maybeEmployer() { result = "(?is).*employer.*" }
  string maybeMedical() { result = "(?is).*medical.*" }

  /**
   * Gets a regular expression that identifies strings that may indicate the presence
   * of private data, with `data` describing the kind of private data involved.
   */
  string maybePrivate(PrivateData data) {
    result = maybeSocialSecurityNumber() and data instanceof PrivateData::SocialSecurityNumber
    or
    result = maybePostCode() and data instanceof PrivateData::PostCode
    or
    result = maybeZipCode() and data instanceof PrivateData::ZipCode
    or
    result = maybeTelephone() and data instanceof PrivateData::Telephone
    or
    result = maybeLatitude() and data instanceof PrivateData::Latitude
    or
    result = maybeLongitude() and data instanceof PrivateData::Longitude
    or
    result = maybeCreditCard() and data instanceof PrivateData::CreditCard
    or
    result = maybeSalary() and data instanceof PrivateData::Salary
    or
    result = maybeBankAccount() and data instanceof PrivateData::BankAccount
    or
    result = maybeEmail() and data instanceof PrivateData::Email
    or
    result = maybeMobile() and data instanceof PrivateData::Mobile
    or
    result = maybeEmployer() and data instanceof PrivateData::Employer
    or
    result = maybeMedical() and data instanceof PrivateData::Medical
  }

  /**
   * Gets a regular expression that identifies strings that may indicate the presence of data
   * that is hashed or encrypted, and hence rendered non-private.
   */
  string notPrivate() {
    result = "(?is).*(redact|censor|obfuscate|hash|md5|sha|((?<!un)(en))?(crypt|code)).*"
  }

  bindingset[name]
  PrivateData getPrivateDataForName(string name) {
    name.regexpMatch(HeuristicNames::maybePrivate(result)) and
    not name.regexpMatch(HeuristicNames::notPrivate())
  }
}

abstract class PrivateData extends TaintKind {
  bindingset[this]
  PrivateData() { this = this }
}

module PrivateData {
  class SocialSecurityNumber extends PrivateData {
    SocialSecurityNumber() { this = "private.data.socialsecuritynumber" }

    override string repr() { result = "a social security number" }
  }

  class PostCode extends PrivateData {
    PostCode() { this = "private.data.postcode" }

    override string repr() { result = "a postcode" }
  }

  class ZipCode extends PrivateData {
    ZipCode() { this = "private.data.zipcode" }

    override string repr() { result = "a zipcode" }
  }

  class Telephone extends PrivateData {
    Telephone() { this = "private.data.telephone" }

    override string repr() { result = "a telephone number" }
  }

  class Latitude extends PrivateData {
    Latitude() { this = "private.data.latitude" }

    override string repr() { result = "a latitude" }
  }

  class Longitude extends PrivateData {
    Longitude() { this = "private.data.longitude" }

    override string repr() { result = "a longitude" }
  }

  class CreditCard extends PrivateData {
    CreditCard() { this = "private.data.creditcard" }

    override string repr() { result = "a credit card" }
  }

  class Salary extends PrivateData {
    Salary() { this = "private.data.salary" }

    override string repr() { result = "a salary" }
  }

  class BankAccount extends PrivateData {
    BankAccount() { this = "private.data.bankaccount" }

    override string repr() { result = "bank account related information" }
  }

  class Email extends PrivateData {
    Email() { this = "private.data.email" }

    override string repr() { result = "an email address" }
  }

  class Mobile extends PrivateData {
    Mobile() { this = "private.data.mobile" }

    override string repr() { result = "a mobile phone number" }
  }

  class Employer extends PrivateData {
    Employer() { this = "private.data.employer" }

    override string repr() { result = "an employer" }
  }

  class Medical extends PrivateData {
    Medical() { this = "private.data.medical" }

    override string repr() { result = "medical information" }
  }

  private PrivateData fromFunction(Value func) {
    result = HeuristicNames::getPrivateDataForName(func.getName())
  }

  abstract class Source extends TaintSource {
    abstract string repr();
  }

  /** A call to a function whose name suggests that it returns private data. */
  private class PrivateCallSource extends Source {
    PrivateData data;

    PrivateCallSource() {
      exists(Value callee | callee.getACall() = this | data = fromFunction(callee))
    }

    override predicate isSourceOf(TaintKind kind) { kind = data }

    override string repr() { result = "a call returning " + data.repr() }
  }

  /** An access to an attribute or property that might contain private data. */
  private class PrivateVariableAccess extends PrivateData::Source {
    PrivateData data;

    PrivateVariableAccess() {
      data = HeuristicNames::getPrivateDataForName(this.(AttrNode).getName())
    }

    override predicate isSourceOf(TaintKind kind) { kind = data }

    override string repr() { result = "an attribute or property containing " + data.repr() }
  }

  /**
   * A call to a method named `get` with a string argument whose text suggests private data,
   * for example `request.args.get("social security number")`.
   */
  private class PrivateRequestParameter extends PrivateData::Source {
    PrivateData data;

    PrivateRequestParameter() {
      this.(CallNode).getFunction().(AttrNode).getName() = "get" and
      exists(StringValue private |
        this.(CallNode).getAnArg().pointsTo(private) and
        data = HeuristicNames::getPrivateDataForName(private.getText())
      )
    }

    override predicate isSourceOf(TaintKind kind) { kind = data }

    override string repr() { result = "a request parameter containing " + data.repr() }
  }
}

// Backwards compatibility
class PrivateDataSource = PrivateData::Source;
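
To make the name heuristic easier to reason about, here is a rough Python sketch of what `getPrivateDataForName` computes: a name is classified when it fully matches one of the `maybe...` patterns and does not match `notPrivate()`. The function name `private_kinds` and the dictionary are hypothetical, only a subset of the patterns is reproduced, and the sketch assumes that QL's `regexpMatch` behaves like a whole-string match with the same pattern semantics as Python's `re.fullmatch`.

```python
import re
from typing import List

# Subset of the patterns from HeuristicNames, keyed by the repr() text of the
# corresponding PrivateData class.
MAYBE_PRIVATE = {
    "a social security number": r"(?is).*social.*security.*",
    "a postcode": r"(?is).*postcode.*",
    "a zipcode": r"(?is).*zipcode.*",
    "a salary": r"(?is).*salary.*",
    "bank account related information": r"(?is).*bank.*account.*",
}

# Copied from notPrivate().
NOT_PRIVATE = (
    r"(?is).*(redact|censor|obfuscate|hash|md5|sha|((?<!un)(en))?(crypt|code)).*"
)


def private_kinds(name: str) -> List[str]:
    """Return the kinds of private data that `name` suggests, if any."""
    if re.fullmatch(NOT_PRIVATE, name):
        # The name looks hashed, encrypted, or otherwise redacted.
        return []
    return [
        kind
        for kind, pattern in MAYBE_PRIVATE.items()
        if re.fullmatch(pattern, name)
    ]


for candidate in ("get_bankaccount", "get_salary", "salary_hash", "postcode"):
    print(candidate, "->", private_kinds(candidate))
```

Under these assumptions the sketch classifies `get_bankaccount` and `get_salary` as expected and rejects `salary_hash`. It also returns an empty list for `postcode` and `zipcode`, because both names contain `code`, which is one of the alternatives in `notPrivate()`. If the QL regex engine behaves the same way, the `PostCode` and `ZipCode` kinds would never be produced by these heuristics, which may be worth checking in review.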
@@ -0,0 +1,7 @@
edges
| test.py:7:19:7:35 | an ID | test.py:8:38:8:48 | an ID |
| test.py:7:19:7:35 | bank account related information | test.py:8:38:8:48 | bank account related information |
#select
| test.py:8:38:8:48 | bankaccount | test.py:7:19:7:35 | bank account related information | test.py:8:38:8:48 | bank account related information | Private data returned by $@ is logged here. | test.py:7:19:7:35 | get_bankaccount() | a call returning bank account related information |
| test.py:14:32:14:43 | get_salary() | test.py:14:32:14:43 | a salary | test.py:14:32:14:43 | a salary | Private data returned by $@ is logged here. | test.py:14:32:14:43 | get_salary() | a call returning a salary |
| test.py:17:11:17:27 | get_bankaccount() | test.py:17:11:17:27 | bank account related information | test.py:17:11:17:27 | bank account related information | Private data returned by $@ is logged here. | test.py:17:11:17:27 | get_bankaccount() | a call returning bank account related information |
@@ -0,0 +1 @@
experimental/Security/CWE-312/PrivateCleartextLogging.ql
@@ -0,0 +1,9 @@
edges
| ssn_in_cookie.py:7:22:7:63 | a social security number | ssn_in_cookie.py:9:47:9:60 | a social security number |
| ssn_in_cookie.py:14:22:14:63 | a social security number | ssn_in_cookie.py:16:47:16:60 | a social security number |
| test.py:7:19:7:35 | an ID | test.py:8:38:8:48 | an ID |
| test.py:20:14:20:25 | a salary | test.py:22:20:22:25 | a salary |
#select
| ssn_in_cookie.py:9:47:9:60 | socialsecurity | ssn_in_cookie.py:7:22:7:63 | a social security number | ssn_in_cookie.py:9:47:9:60 | a social security number | Private data from $@ is stored here. | ssn_in_cookie.py:7:22:7:63 | Attribute() | a request parameter containing a social security number |
| ssn_in_cookie.py:16:47:16:60 | socialsecurity | ssn_in_cookie.py:14:22:14:63 | a social security number | ssn_in_cookie.py:16:47:16:60 | a social security number | Private data from $@ is stored here. | ssn_in_cookie.py:14:22:14:63 | Attribute() | a request parameter containing a social security number |
| test.py:22:20:22:25 | salary | test.py:20:14:20:25 | a salary | test.py:22:20:22:25 | a salary | Private data from $@ is stored here. | test.py:20:14:20:25 | get_salary() | a call returning a salary |
@@ -0,0 +1 @@
experimental/Security/CWE-312/PrivateCleartextStorage.ql
@@ -0,0 +1 @@
semmle-extractor-options: -p ../lib/ --max-import-depth=3
@@ -0,0 +1,17 @@
from flask import Flask, make_response, render_template, request, Response

app = Flask("Leak social security number")

@app.route('/')
def index():
    socialsecurity = request.args.get("social security number")
    resp = make_response(render_template(...))
    resp.set_cookie("social security number", socialsecurity)
    return resp

@app.route('/')
def index2():
    socialsecurity = request.args.get("social security number")
    resp = Response(...)
    resp.set_cookie("social security number", socialsecurity)
    return resp