Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Taint Analysis with Psalm #611

Open
paragonie-scott opened this issue Mar 21, 2018 · 5 comments
Open

Taint Analysis with Psalm #611

paragonie-scott opened this issue Mar 21, 2018 · 5 comments

Comments

@paragonie-scott
Copy link
Contributor

@paragonie-scott paragonie-scott commented Mar 21, 2018

https://en.wikipedia.org/wiki/Taint_checking

A lot of PHP security scanners will correctly identify $_GET['foo'] as user input that shouldn't be used directly without validation. However, they will fail to understand that ServerRequest::fromGlobals()->getQueryParams()['foo'] is just as bad.

Implementing taint checking isn't a trivial undertaking, but it would be a huge value add to projects like Psalm.

A very straightforward implementation will require adding custom annotations that mark data as tainted or safe, and/or tell Psalm "whatever it returns should be marked as tainted".

For example:

  • @psalm-tainted-var $foo marks a variable as tainted
  • @psalm-untainted-var $foo marks a variable as safe
  • @psalm-tainted-return tells Psalm this function/method returns untrusted data
  • @psalm-untainted-return tells Psalm this function/method returns trustworthy data

We'd also need to be able to mark function parameters (including for builtin functions) as risky for tainted data to land.

This may sound simple, but considering that "safe against SQLi" doesn't imply "safe against XSS" or "safe against directory traversal", it's going to require a bit of creativity to make it usable.

@muglug

This comment has been minimized.

Copy link
Member

@muglug muglug commented Mar 21, 2018

@paragonie-scott wait what's wrong with doing

$pdo->query('select * from users where id = ' . $_GET['id'])

Buy VIAGARA CHEAP

@muglug

This comment has been minimized.

Copy link
Member

@muglug muglug commented Mar 21, 2018

Anyway, the issue I see is that Psalm can't verify that a method actually does the job of making a given input safe. And, more importantly, Psalm's analysis doesn't follow a particular input through all possible paths (because it would be inordinately slow).

So any analysis would involve two steps:

  • creating an algebraic representation of each input in every function, like Flow does, where its value at each step is represented as a product of its previous values e.g.
    $c_ref0->tainted = $b_ref0->tainted | $a_ref0->tainted
    $c_ref1->tainted = $c_ref1->tainted
  • join everything together and look for tainted paths into critical functions (e.g. PDO::query, echo)

That would be a fairly big undertaking

@paragonie-scott

This comment has been minimized.

Copy link
Contributor Author

@paragonie-scott paragonie-scott commented Mar 21, 2018

Yeah, I suspected as much. I might end up having to build this as its own tool (using PHP-Parser instead of the AST extension).

@muglug

This comment has been minimized.

Copy link
Member

@muglug muglug commented Jul 25, 2019

Facebook just published this: https://cacm.acm.org/magazines/2019/8/238344-scaling-static-analyses-at-facebook/fulltext

And I think it bears thinking about a bit more.

@muglug muglug reopened this Jul 25, 2019
@muglug muglug modified the milestone: Taint analysis Jul 31, 2019
muglug added a commit that referenced this issue Aug 4, 2019
Ref #611
@muglug

This comment has been minimized.

Copy link
Member

@muglug muglug commented Aug 5, 2019

Additional wrinkle:

Given the code

class Utils {
  /**
   * @psalm-pure
   */
  public static function shorten(string $str) : string {
    if (strlen($str) > 30) {
      return substr($str, 0, 30) . "...";
    }
    return $str;
  }
}

class A {
  public function foo() : void {
    echo(Utils::shorten((string) $_GET["user_id"]));
  }

  public function bar() : void {
    echo(Utils::shorten("hello"));
  }
}

The path in A::foo should raise an issue, but the path in A::bar should not. Psalm should take the purity of Utils::shorten into consideration, insofar as it does not rely on state.

Edit: above code now recognised: https://psalm.dev/r/adceef6b3a

@muglug muglug added this to To do in Taint analysis via automation Aug 7, 2019
@muglug muglug moved this from To do to In progress in Taint analysis Aug 7, 2019
2e3s added a commit to 2e3s/psalm that referenced this issue Sep 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Taint analysis
  
In progress
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.