Skip to content

Python: model os path file accesses #6741

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 13, 2021

Conversation

yoff
Copy link
Contributor

@yoff yoff commented Sep 23, 2021

This PR includes a number of functions in os.path into the concept FileSystemAccess. These all test for the existence of files in some way.

There are a few more functions that could reveal information about the file system, by telling the last modification time of a file, e.g. getmtime. This will raise an error if the file does not exist, so could also reveal the existence of files depending on the applications error handling. These are currently not included.

@yoff yoff requested a review from a team as a code owner September 23, 2021 13:30
Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few more functions that could reveal information about the file system, by telling the last modification time of a file, e.g. getmtime. This will raise an error if the file does not exist, so could also reveal the existence of files depending on the applications error handling. These are currently not included.

Can we please model these as well? they fit the FileSystemAccess concept as far as I can tell.

@yoff yoff requested a review from RasmusWL September 30, 2021 11:38
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
@yoff
Copy link
Contributor Author

yoff commented Oct 4, 2021

There was already a taint-step for os.path.realpath, but we were missing some other ones...

@yoff yoff requested a review from RasmusWL October 4, 2021 17:13
Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few minor things

Comment on lines 254 to 258
private string pathComputation() {
result in [
"abspath", "basename", "commonpath", "dirname", "expanduser", "expandvars", "join",
"normcase", "normpath", "realpath", "relpath", "split"
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private string pathComputation() {
result in [
"abspath", "basename", "commonpath", "dirname", "expanduser", "expandvars", "join",
"normcase", "normpath", "realpath", "relpath", "split"
]
private string pathComputation() {
result in [
"abspath", "basename", "commonpath", "commonprefix", "dirname", "expanduser", "expandvars", "join",
"normcase", "normpath", "realpath", "relpath", "split", "splitdrive", "splitext"
]

or was there a reason not to add commonprefix? I'm wondering if you did not include it since it might be used as a form of sanitizer? -- I don't feel doubt about splitdrive and splitext should have taint-steps

NIT: Since this predicate is only used once (in the char-pred of OsPathComputation), could we please inline it there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reasons for not including these are given in the list "Functions that need summaries". These functions all do not map tainted input to tainted output but rather map some content to some content.

Comment on lines 249 to 253
// Functions that need summaries:
// - os.path.commonpath(paths): takes a sequence
// - os.path.commonprefix(list): takes a list argument
// - os.path.splitdrive: retunrs a tuple
// - os.path.splittext: returns a tuple
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list is incomplete (for example, missing split). Since all our current taint-modeling would need proper flow-summaries in the future, I would suggest we remove this, and deal with it once we remodel things. it just seems a bit too brittle to my liking 😐

Suggested change
// Functions that need summaries:
// - os.path.commonpath(paths): takes a sequence
// - os.path.commonprefix(list): takes a list argument
// - os.path.splitdrive: retunrs a tuple
// - os.path.splittext: returns a tuple

@yoff yoff requested a review from RasmusWL October 12, 2021 13:17
Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@codeql-ci codeql-ci merged commit 2b0415e into github:main Oct 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants