Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue where Windows drive name is stripped from Jekyll.sanitized_path incorrectly #5256

Merged
merged 6 commits into from Oct 5, 2016

Conversation

@kwokfu
Copy link
Contributor

@kwokfu kwokfu commented Aug 18, 2016

Strip drive name only when necessary.

Strip drive name only when necessary.
@ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Aug 18, 2016

This works on Windows, so a 👍 from me.

Loading

@kwokfu kwokfu changed the title Proposed fix for #5192 Proposed fix for issue #5192 Aug 18, 2016
@parkr
Copy link
Member

@parkr parkr commented Aug 22, 2016

If you're fixing #5192, can you please add a test for this to our test suite so we know it's fixed?

This is very sensitive code so I'd love a 👍 from @jekyll/security as well. Thank you!

Loading

@parkr parkr changed the title Proposed fix for issue #5192 Fix issue where Windows drive name is stripped from Jekyll.sanitized_path Aug 22, 2016
@parkr parkr changed the title Fix issue where Windows drive name is stripped from Jekyll.sanitized_path Fix issue where Windows drive name is stripped from Jekyll.sanitized_path incorrectly Aug 22, 2016
@benbalter
Copy link
Contributor

@benbalter benbalter commented Aug 22, 2016

Ran into this bug this week. Thanks for fixing this! 😄

Loading

clean_path
else
clean_path.sub!(%r!\A\w:/!, "/")
Copy link

@ptoomey3 ptoomey3 Aug 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a test/scenario that shows where this would be useful/can we just drop the whole clean_path.sub!(%r!\A\w:/!, "/") bit? I'm trying to think of a legit case where we receive questionable_path with a leading drive letter AND clean_path.start_with?(base_directory) returns false AND where File.join(base_directory, clean_path.sub!(%r!\A\w:/!, "/")) is expected to do something sensible. It seems like, if clean_path contains a drive letter, it is a fully absolute path by definition. So, doing this substitution and then appending it on to base_directory would always result in an invalid path.

Loading

Copy link
Contributor

@envygeeks envygeeks Aug 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ptoomey3 it'll be a valid path in the sense that it'll work for home users, but it'll break every one of my installs of Jekyll. Our Jekyll users here have 4 drives, plus another 8 on the network depending on who they are. If they are working from their secondary drive (which is where we expect them to place their personal data as that is their personal drive to keep at no matter what) and this strips the path, Jekyll now thinks C:/ is the path, it is not, Jekyll just broke.

This fixes a problem while exacerbating a larger problem for other users and I can see potential security issues downstream (I'd have to test this because I'm not quite sure but I have a feeling this would affect Jekyll Assets which relies on this to ensure nothing silly is happening in a round-about-way and to keep Sprockets in check, along with the users.)

Loading

Copy link
Member

@ashmaroli ashmaroli Sep 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd have to test this because I'm not quite sure but I have a feeling...

@envygeeks : Did you get the time to test if this had any side-effects..?

Loading

@ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Aug 25, 2016

why dont we apply a patch only for windows users and leave as is for every other platform, by using conditional statements?

Loading

@parkr
Copy link
Member

@parkr parkr commented Aug 25, 2016

In any event, we should write a test that demonstrates the breakage on AppVeyor so we can confirm this fixes things.

Loading

end

should "not strip base path" do
assert_equal "D:/site/sitemap.xml",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still unclear on when this test's behavior is desirable. It seems like the path to sitemap.xml is absolute, so under what condition does it make sense to transform it into a relative path that is appended on to the base path? I understand that this is largely just testing existing behavior, but I'm wondering if that behavior is needed.

Loading

Copy link
Contributor Author

@kwokfu kwokfu Aug 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ptoomey3, I'm adding this test based on #5276.

The scenario is if we configured source to D:/ and destination to D:/site, the file sitemap.xml will be created in the source path before copying to destination (i.e copy D:/sitemap.xml to D:/site/sitemap.xml). That is where Jekyll build fail in my test bed.

Loading

Copy link
Member

@parkr parkr Aug 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ptoomey3 this method is like File.join, but enforces the base path :) it was written for that!

Loading

@parkr parkr self-assigned this Aug 26, 2016
@parkr
Copy link
Member

@parkr parkr commented Aug 26, 2016

I shudder to think of how many plugins and such we'd break if we changed Jekyll.sanitized_path to not just gracefully bolt the base_directory before the questionable_path, so I think we need to keep that behaviour.

I agree that the code at present is hella difficult to keep track of and reason about. A lot of that confusion comes down to the fact that the mysterious regexp and so on is not described through comments. For this critical piece of the codebase, maybe a line-by-line explanation would be helpful.

Loading

@kwokfu
Copy link
Contributor Author

@kwokfu kwokfu commented Aug 27, 2016

@parkr I think the fix for #5276 should obsolete the regex that strip away drive name. I'm not sure if there is a valid reason to strip away any drive name. I think it is also good to remove the regex line that strip drive name, as it really doesn't make any sense.

Run a couple test locally in Windows machine and failed miserably. After all, removing the regex isn't really a good idea for Windows user.

Loading

@ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Sep 23, 2016

Hello @kwokfu, any updates on this?

Loading

Copy link
Member

@parkr parkr left a comment

Both of the two tests are failing:

Failure:
TestPathSanitization#test_: on Windows with file path has matching prefix should not strip base path.  [/home/travis/build/jekyll/jekyll/test/test_path_sanitization.rb:53]
Minitest::Assertion: Expected: "D:/site/sitemap.xml"
  Actual: "D:/site/D:/sitemap.xml"
Failure:
TestPathSanitization#test_: on Windows with absolute path should strip just the clean path drive name.  [/home/travis/build/jekyll/jekyll/test/test_path_sanitization.rb:40]
Minitest::Assertion: Expected: "D:/demo/_site"
  Actual: "D:/demo/D:/demo/_site"

Loading

@kwokfu
Copy link
Contributor Author

@kwokfu kwokfu commented Sep 23, 2016

@parkr @ashmaroli, these 2 test cases passed in Windows environment but failed *nix environment.

The reason behind that 2 fail cases is because of how the File.expand_path behaves differently between Windows (gives "D:/sitemap.xml") and *nix (gives "/D:/sitemap.xml").

I do not want to add too much decision in Jekyll.sanitized_path, which might make it harder to read. I am thinking if there is a better way to determine if it is running in Windows environment, so that the same tests can be skipped in *nix environment.

*side: I was on a vacation in Australia.

Loading

@ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Sep 23, 2016

@kwokfu, why not use conditional filtering?
See: jekyll/utils/platforms.rb => really_windows? method

Loading

@ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Sep 24, 2016

Loading

@kwokfu
Copy link
Contributor Author

@kwokfu kwokfu commented Sep 24, 2016

@ashmaroli, I think serving shouldn't be a problem because of the changes barely introducing any new feature, but additional checking on the path prefix before actually stripping the drive letter.

Do let me know if you have any scenario pop out that I've missed, especially related to theme where files are sitting on another path.

Loading

@parkr
Copy link
Member

@parkr parkr commented Sep 24, 2016

@XhmikosR Would you be willing to give this branch a try using a gem-based theme?

Loading

@parkr
Copy link
Member

@parkr parkr commented Sep 24, 2016

An alternative here, which might make things a lot easier to grok, is to have a Windows-specific sanitized_path method:

def windows_sanitized_path(base, questionable)
  # Assume drive names, do not strip them.
end

def sanitized_path(base, questionable)
  if Jekyll::Utils::Platforms.windows?
    return windows_sanitized_path(base, questionable)
  end

  # We know we're on Linux or a Mac now.
  # Assume no drive names.
end

What do you think about that?

Loading

@parkr parkr added this to the 3.3 milestone Sep 24, 2016
@ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Sep 24, 2016

Would you be willing to give this branch a try using a gem-based theme?

The branch works fine with minima on Windows..
What I doubted was if this branch would work equally fine on a Linux or like platforms..

Loading

@parkr
Copy link
Member

@parkr parkr commented Sep 28, 2016

@kwokfu Hoping to release v3.3.0 out here soon, what do you think about my suggestion above?

Loading

@kwokfu
Copy link
Contributor Author

@kwokfu kwokfu commented Sep 29, 2016

@parkr, I'm still exploring this and thinking what if someone really did create a directory in *nix with a name that looks like Windows drive name (for whatever reason) and what will be the case that goes into the sanitized_path.

Loading

@ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Sep 29, 2016

what if someone really did create a directory in *nix with a name that looks like Windows drive name

I dont think someone can do something like that to affect the regex. Even if they did, the santized_path method strips only the initial drive name. It should be similar to this block in the test_file

Loading

@kwokfu
Copy link
Contributor Author

@kwokfu kwokfu commented Sep 29, 2016

@ashmaroli, thanks for pointing that out, but what I'm really interested is if the questionable path in the test start with a drive name instead of what it is now.

I'm still working on to have all my new tests passed in both platforms. Lets hope that God strike me an idea later today so that I can get a better test results tonight.

Loading

@kwokfu
Copy link
Contributor Author

@kwokfu kwokfu commented Sep 29, 2016

@parkr, @ashmaroli, I believe current changes made is sufficient to handle both the reported bugs without altering the original design of the function. I think introducing the suggested logic for Windows-only will introduce duplicate codes or changing the designate behaviour of the function, hence, not really wanted to incorporate in this patch (may be in another patch with discussion on how this function should behave?).

Loading

@parkr
Copy link
Member

@parkr parkr commented Sep 29, 2016

I think introducing the suggested logic for Windows-only will introduce duplicate codes or changing the designate behaviour of the function, hence, not really wanted to incorporate in this patch (may be in another patch with discussion on how this function should behave?).

@kwokfu My primary concern is ensuring that all possible path traversal vectors are covered. This function exists to prevent path traversal. My secondary concern is doing The Right Thing with regard to drive names. What if you request C:/mysite and your site source is at D:/Users/My Documents/mysite? What is the expected behaviour there? Being able to tailor our path traversal checking to also respect drive names (instead of ignoring them as this PR does) is worth the potential code duplication.

@ptoomey3 If you have time to load this up and try to crack it, I'd love to have your approval before I merge. FWIW, it passes our checks in this test suite. Thanks!

Loading

@kwokfu
Copy link
Contributor Author

@kwokfu kwokfu commented Sep 29, 2016

I would also like to know is there any documentation or discussion restricting where can we create a new Jekyll site, like under /css? I'm getting error while generating site under that path. If that were not discussed previously, I might just file a bug and work from there.

Loading

@ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Sep 30, 2016

I'm getting error while generating site under that path.

@kwokfu Can you please elaborate further? What exactly was the error raised?

Loading

@kwokfu
Copy link
Contributor Author

@kwokfu kwokfu commented Sep 30, 2016

@ashmaroli, just ran the same test in Windows and getting the same error, looks like same is happening to both platform. Following is the output for Windows. I'll paste the trace in issue if confirmed bug.

D:\>jekyll new css
D:\>cd css
D:\css> jekyll build
..ignored..
jekyll 3.2.1 | Error:  No such file or directory @ rb_file_s_stat - D:/css/css/Gemfile

No issue if I'm not using css as the path name.

Loading

@ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Sep 30, 2016

@kwokfu That bug is not related to Jekyll 3.2.1 but rather this branch alone.
I'm able to build and serve a new site created at css

Loading

@kwokfu
Copy link
Contributor Author

@kwokfu kwokfu commented Sep 30, 2016

@ashmaroli, thats weird. I'm not creating a new site using this branch. Are you creating the new site under root? It has to be under root directory, in either Windows or *nix.

Update: It happens that I was using this branch in Windows, but not in MacOS. Following output for MacOS.

MacBook-Pro:css kwokfu$ jekyll build
Configuration file: /css/_config.yml
            Source: /css
       Destination: /css/_site
 Incremental build: disabled. Enable with --incremental
      Generating... 
jekyll 3.2.1 | Error:  No such file or directory @ rb_file_s_stat - /css/css/Gemfile

Loading

@ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Sep 30, 2016

Interesting find.. was able to reproduce the bug on Windows at C:\ using 3.2.1 and master. Surely warrants a dedicated ticket. Feel free to do so, if you have time..

Loading

@ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Sep 30, 2016

@kwokfu, the bug you discovered seems to happen due to the presence of a subdirectory also named css. Deleting that directory resolves this edge-case-scenario.
Jekyll 3.3 no longer requires that directory to be present in the root. It'll be handled by the theme gem.

Loading

@ptoomey3
Copy link

@ptoomey3 ptoomey3 commented Oct 4, 2016

@ptoomey3 If you have time to load this up and try to crack it, I'd love to have your approval before I merge. FWIW, it passes our checks in this test suite. Thanks!

This looks ok. In the end, the path either has to be a prefix of the base or gets joined to the base. 👍

Loading

@parkr parkr mentioned this pull request Oct 5, 2016
9 tasks
@parkr
Copy link
Member

@parkr parkr commented Oct 5, 2016

This looks ok. In the end, the path either has to be a prefix of the base or gets joined to the base. 👍

Thank you SO much, @ptoomey3! 🔒

Loading

parkr
parkr approved these changes Oct 5, 2016
@parkr
Copy link
Member

@parkr parkr commented Oct 5, 2016

@jekyllbot: merge +bug

Loading

@jekyllbot jekyllbot merged commit 275f5a6 into jekyll:master Oct 5, 2016
1 of 2 checks passed
Loading
@kwokfu kwokfu deleted the patch-1 branch Oct 30, 2016
@jekyll jekyll locked and limited conversation to collaborators Jul 11, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

7 participants