Skip to content

jwt.decode sonar codemod#326

Merged
clavedeluna merged 7 commits intomainfrom
jwt-sonar
Mar 5, 2024
Merged

jwt.decode sonar codemod#326
clavedeluna merged 7 commits intomainfrom
jwt-sonar

Conversation

@clavedeluna
Copy link
Copy Markdown
Contributor

@clavedeluna clavedeluna commented Mar 2, 2024

Overview

Add codemod support for the sonar rule for jwt.decode

Description

  • The challenge with this codemod is matching sonar results column to decode call node column. Sonar returns column start/end for ....verify=False or ...."verify_signature": True while the semgrep node we care about is jwt.decode which contains these verify keywords. To get the codemod to run, I had to implement a codemod specific match_location that I called fuzzy.

Additional Details

  • some minor cleanup. /documentation

Closes #298

@clavedeluna clavedeluna changed the title Jwt sonar jwt.decode sonar codemod Mar 2, 2024
@clavedeluna clavedeluna marked this pull request as ready for review March 2, 2024 15:20

class TestSonarExceptionWithoutRaise(BaseIntegrationTest):
codemod = ExceptionWithoutRaise
codemod = SonarExceptionWithoutRaise
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized this int test was importing the wrong codemod

pos_to_match = self.node_position(node)
if self.results is None:
# Some codemods must run without results existing.
return True
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I documented this because it was strange to return True here (do we filter by result or not filter by result?). Then I realized the logic here is that this method is used for all codemods - those with and those without results. We still want to analyze codemods even if there are no results. Maybe later on we can have a filter_by_result that's more relevant. The one I implemented for jwt sonar returns False here, since there should always be results if we want to run the codemod.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I encountered this too. We're using None as a sentinel for codemods that do not have detectors, and we do not want to short-circuit those cases. If you don't mind, could you update the comment to say something along those lines?

Comment on lines +30 to +31
same_line(pos, location) and fuzzy_column_match(pos, location)
for location in result.locations
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you encounter any examples where fuzzy_column_match is needed? From my observations most nodes reported by sonar would match libcst ones with one exception: Tuples. If you look at the match_location from SonarResult you can see the exception treated there. We should discuss how to treat location match at some point.

Copy link
Copy Markdown
Contributor Author

@clavedeluna clavedeluna Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're going to have to give me an example because adding fuzzy_column_match is the entire point of this PR. As I stated in the PR description, sonar returns index for the keyword=value, while we need jwt.decode(...keyword=value). Unit tests have exact sonar results, too. So inspect those as well.

Or perhaps Im' not understanding your comment?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the simplest terms, does anything break if you remove fuzzy_column_match and replace it by exact match?
If so, does it break consistently? (e.g. always with the same type of node)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there are no location matches without fuzzy matching.

Copy link
Copy Markdown
Member

@drdavella drdavella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Just a few smallish comments.

results returned have a start/end column for the verify keyword
within the `decode` call, not for the entire call like semgrep returns.
"""
if self.results is None:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this will actually never occur in the Sonar case and you can remove this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep very good point. I was trying to keep this method as similar to its base one, but you're right it's nonsensical in this case.

pos_to_match = self.node_position(node)
if self.results is None:
# Some codemods must run without results existing.
return True
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I encountered this too. We're using None as a sentinel for codemods that do not have detectors, and we do not want to short-circuit those cases. If you don't mind, could you update the comment to say something along those lines?

"""
if self.results is None:
return False
match node:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solution makes sense to me and it works for now. In the longer term, I think we should filter the Sonar/SAST results before they ever reach the transformer. This means we would add some logic to each detector that requires it. This would mean implementing some kind of visitor that validates each of the detected locations before passing the results to the transformer. There's a bit of a performance impact in this case since it effectively means another pass on the file, but it only applies to files where there are already results.

I'm saying this not because I think we need this change right now but because I've encountered similar issues with remediating another SAST tool and I think that updating the filtering logic here for each case is going to become cumbersome.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very interesting! The detector for this particular codemod is semgrep so that's also a little wrinkle.

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud bot commented Mar 5, 2024

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@clavedeluna clavedeluna added this pull request to the merge queue Mar 5, 2024
Merged via the queue into main with commit badb2da Mar 5, 2024
@clavedeluna clavedeluna deleted the jwt-sonar branch March 5, 2024 12:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Sonar: fix "JWT should be signed and verified" with jwt-decode-verify

3 participants