Fix issue where Windows drive name is stripped from Jekyll.sanitized_path incorrectly #5256

Merged
merged 6 commits into from Oct 5, 2016

@kwokfu
Contributor
kwokfu commented Aug 18, 2016

Strip drive name only when necessary.

@kwokfu kwokfu Proposed fix for #5192
Strip drive name only when necessary.
596f5d1
@ashmaroli
Contributor

This works on Windows, so a 👍 from me.

@kwokfu kwokfu changed the title from Proposed fix for #5192 to Proposed fix for issue #5192 Aug 18, 2016
@parkr
Member
parkr commented Aug 22, 2016

If you're fixing #5192, can you please add a test for this to our test suite so we know it's fixed?

This is very sensitive code so I'd love a 👍 from @jekyll/security as well. Thank you!

@parkr parkr changed the title from Proposed fix for issue #5192 to Fix issue where Windows drive name is stripped from Jekyll.sanitized_path Aug 22, 2016
@parkr parkr changed the title from Fix issue where Windows drive name is stripped from Jekyll.sanitized_path to Fix issue where Windows drive name is stripped from Jekyll.sanitized_path incorrectly Aug 22, 2016
@benbalter
Contributor

Ran into this bug this week. Thanks for fixing this! 😄

@ptoomey3 ptoomey3 and 2 others commented on an outdated diff Aug 23, 2016
lib/jekyll.rb
clean_path
else
+ clean_path.sub!(%r!\A\w:/!, "/")
@ptoomey3
ptoomey3 Aug 23, 2016 edited

Is there a test/scenario that shows where this would be useful/can we just drop the whole clean_path.sub!(%r!\A\w:/!, "/") bit? I'm trying to think of a legit case where we receive questionable_path with a leading drive letter AND clean_path.start_with?(base_directory) returns false AND where File.join(base_directory, clean_path.sub!(%r!\A\w:/!, "/")) is expected to do something sensible. It seems like, if clean_path contains a drive letter, it is a fully absolute path by definition. So, doing this substitution and then appending it on to base_directory would always result in an invalid path.

@envygeeks
envygeeks Aug 23, 2016 Member

@ptoomey3 it'll be a valid path in the sense that it'll work for home users, but it'll break every one of my installs of Jekyll. Our Jekyll users here have 4 drives, plus another 8 on the network depending on who they are. If they are working from their secondary drive (which is where we expect them to place their personal data, as that is their personal drive to keep no matter what) and this strips the drive name, Jekyll now thinks C:/ is the path. It is not, so Jekyll just broke.

This fixes a problem while exacerbating a larger problem for other users and I can see potential security issues downstream (I'd have to test this because I'm not quite sure but I have a feeling this would affect Jekyll Assets which relies on this to ensure nothing silly is happening in a round-about-way and to keep Sprockets in check, along with the users.)

@ashmaroli
ashmaroli Sep 23, 2016 Contributor

I'd have to test this because I'm not quite sure but I have a feeling...

@envygeeks: Did you get time to test whether this had any side effects?

@ashmaroli
Contributor

Why don't we apply the patch only for Windows users, using a conditional statement, and leave every other platform as is?

@kwokfu kwokfu Fix issue #5276, where path strips root destination dir if filename m…
…atches
7892c5e
@parkr
Member
parkr commented Aug 25, 2016

In any event, we should write a test that demonstrates the breakage on AppVeyor so we can confirm this fixes things.

kwokfu added some commits Aug 26, 2016
@kwokfu kwokfu Test case for issue #5276, where Jekyll.sanitized_path strips base pa…
…th incorrectly if file path has matching prefix
0d8796f
@kwokfu kwokfu Test case for issue #5192, where Jekyll.sanitized_path strips drive n…
…ame on Windows incorrectly
23d7929
@ptoomey3 ptoomey3 and 2 others commented on an outdated diff Aug 26, 2016
test/test_path_sanitization.rb
+
+ should "strip just the clean path drive name" do
+ assert_equal "D:/demo/_site",
+ Jekyll.sanitized_path(@base_path, @file_path)
+ end
+ end
+
+ context "on Windows with file path has matching prefix" do
+ setup do
+ @base_path = "D:/site"
+ @file_path = "D:/sitemap.xml"
+ allow(Dir).to receive(:pwd).and_return("D:/")
+ end
+
+ should "not strip base path" do
+ assert_equal "D:/site/sitemap.xml",
@ptoomey3
ptoomey3 Aug 26, 2016

I'm still unclear on when this test's behavior is desirable. It seems like the path to sitemap.xml is absolute, so under what condition does it make sense to transform it into a relative path that is appended on to the base path? I understand that this is largely just testing existing behavior, but I'm wondering if that behavior is needed.

@kwokfu
kwokfu Aug 26, 2016 Contributor

@ptoomey3, I'm adding this test based on #5276.

The scenario: if source is configured as D:/ and destination as D:/site, the file sitemap.xml is created in the source path before being copied to the destination (i.e. D:/sitemap.xml is copied to D:/site/sitemap.xml). That is where the Jekyll build fails in my test bed.
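
To make the #5276 scenario concrete, here is a minimal sketch of why a plain prefix test misfires on these two paths. The helper names naive_inside? and inside? are hypothetical and are not taken from lib/jekyll.rb; they only contrast comparing against the base with and without a trailing separator.

```ruby
# Hypothetical helpers for illustration; not the code in lib/jekyll.rb.

# Plain string-prefix test: treats "D:/sitemap.xml" as if it lived under "D:/site".
def naive_inside?(base, path)
  path.start_with?(base)
end

# Prefix test against the base plus a trailing "/", so only real children match.
def inside?(base, path)
  path == base || path.start_with?(base.chomp("/") + "/")
end

base = "D:/site"
puts naive_inside?(base, "D:/sitemap.xml")  # true  (false positive)
puts inside?(base, "D:/sitemap.xml")        # false
puts inside?(base, "D:/site/sitemap.xml")   # true
```

With the naive check, D:/sitemap.xml is returned untouched because it looks like it is already under the base, which is the failure described above.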

@parkr
parkr Aug 26, 2016 Member

@ptoomey3 this method is like File.join, but enforces the base path :) it was written for that!

@parkr parkr self-assigned this Aug 26, 2016
@parkr
Member
parkr commented Aug 26, 2016

I shudder to think of how many plugins and such we'd break if we changed Jekyll.sanitized_path to not just gracefully bolt the base_directory before the questionable_path, so I think we need to keep that behaviour.

I agree that the code at present is hella difficult to keep track of and reason about. A lot of that confusion comes down to the fact that the mysterious regexp and so on is not described through comments. For this critical piece of the codebase, maybe a line-by-line explanation would be helpful.

@kwokfu
Contributor
kwokfu commented Aug 27, 2016 edited

@parkr I think the fix for #5276 should make the regex that strips the drive name obsolete. I'm not sure there is a valid reason to strip any drive name, so I think it would also be good to remove that regex line, as it really doesn't make sense.

I ran a couple of tests locally on a Windows machine with the regex removed, and they failed miserably. Removing the regex isn't a good idea for Windows users after all.

@ashmaroli
Contributor

Hello @kwokfu, any updates on this?

@parkr

Both tests are failing:

Failure:
TestPathSanitization#test_: on Windows with file path has matching prefix should not strip base path.  [/home/travis/build/jekyll/jekyll/test/test_path_sanitization.rb:53]
Minitest::Assertion: Expected: "D:/site/sitemap.xml"
  Actual: "D:/site/D:/sitemap.xml"
Failure:
TestPathSanitization#test_: on Windows with absolute path should strip just the clean path drive name.  [/home/travis/build/jekyll/jekyll/test/test_path_sanitization.rb:40]
Minitest::Assertion: Expected: "D:/demo/_site"
  Actual: "D:/demo/D:/demo/_site"
@kwokfu
Contributor
kwokfu commented Sep 23, 2016 edited

@parkr @ashmaroli, these two test cases pass in a Windows environment but fail in a *nix environment.

The two cases fail because File.expand_path behaves differently on Windows (gives "D:/sitemap.xml") and on *nix (gives "/D:/sitemap.xml").
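
The failing output can be traced by hand. The sketch below replays three steps (expand, strip a leading drive name, join) on the test inputs. The replay helper is hypothetical, and the steps are an assumption inferred from the diff and the failure messages, not copied from lib/jekyll.rb.

```ruby
# Hypothetical replay of the steps; inferred from the diff, not copied from Jekyll.
def replay(base, questionable)
  expanded = File.expand_path(questionable, "/")  # platform-dependent result
  stripped = expanded.sub(%r!\A\w:/!, "/")        # only matches a leading "X:/"
  { expanded: expanded, joined: File.join(base, stripped) }
end

result = replay("D:/site", "D:/sitemap.xml")
puts result[:expanded]
puts result[:joined]
```

On a *nix host this prints /D:/sitemap.xml and then D:/site/D:/sitemap.xml, which matches the Travis failure: the leading slash means the drive-name pattern never matches, so the whole string is joined onto the base.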

I do not want to add too many branches to Jekyll.sanitized_path, which might make it harder to read. I am wondering whether there is a better way to detect a Windows environment, so that these tests can be skipped in a *nix environment.

Aside: I was on vacation in Australia.

@ashmaroli
Contributor
ashmaroli commented Sep 23, 2016 edited

@kwokfu, why not use conditional filtering?
See: jekyll/utils/platforms.rb => really_windows? method

kwokfu added some commits Sep 24, 2016
@kwokfu kwokfu Merge remote-tracking branch 'jekyll/master' into patch-1 db53213
@kwokfu kwokfu Skip Windows tests in non-Windows environment.
bacb300
@ashmaroli
Contributor

When I mentioned conditional filtering, I meant that the changes you made should be applied conditionally, not that the tests should run conditionally.
How can we be sure that this will not break on a Unix-like user's system?

@kwokfu
Contributor
kwokfu commented Sep 24, 2016

@ashmaroli, that's because both tests were written specifically with a Windows environment in mind. Or maybe I should just assert differently on Windows and *nix (i.e. D:/site/sitemap.xml for Windows and D:/site/D:/sitemap.xml for *nix). What do you say?

@ashmaroli
Contributor

Say a non-Windows user runs jekyll serve using a release with this change incorporated. What do you expect to happen?
Will it continue serving, or abort like Travis previously did?

@kwokfu
Contributor
kwokfu commented Sep 24, 2016

@ashmaroli, I think serving shouldn't be a problem, because the changes don't introduce any new feature; they only add a check on the path prefix before actually stripping the drive letter.

Do let me know if any scenario comes to mind that I've missed, especially related to themes, where files sit on another path.

@parkr
Member
parkr commented Sep 24, 2016

@XhmikosR Would you be willing to give this branch a try using a gem-based theme?

@parkr
Member
parkr commented Sep 24, 2016

An alternative here, which might make things a lot easier to grok, is to have a Windows-specific sanitized_path method:

def windows_sanitized_path(base, questionable)
  # Assume drive names, do not strip them.
end

def sanitized_path(base, questionable)
  if Jekyll::Utils::Platforms.windows?
    return windows_sanitized_path(base, questionable)
  end

  # We know we're on Linux or a Mac now.
  # Assume no drive names.
end

What do you think about that?

@parkr parkr added this to the 3.3 milestone Sep 24, 2016
@ashmaroli
Contributor

Would you be willing to give this branch a try using a gem-based theme?

The branch works fine with minima on Windows.
What I doubted was whether this branch would work equally well on Linux or similar platforms.

@parkr
Member
parkr commented Sep 28, 2016

@kwokfu Hoping to release v3.3.0 out here soon, what do you think about my suggestion above?

@kwokfu
Contributor
kwokfu commented Sep 29, 2016

@parkr, I'm still exploring this and wondering: what if someone really did create a directory in *nix with a name that looks like Windows drive name (for whatever reason), and what would then be passed into sanitized_path?

@ashmaroli
Contributor

what if someone really did create a directory in *nix with a name that looks like Windows drive name

I don't think someone can do something like that to affect the regex. Even if they did, the sanitized_path method strips only the initial drive name. It should be similar to this block in the test file.
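
A quick way to check that claim is to run the substitution from the diff over a few strings. This is only a sketch on plain strings: the sample paths are made up, and strip_drive is a hypothetical wrapper. It shows that \A anchors the match to the very start, so a drive-like segment deeper in the path is left alone.

```ruby
# Same pattern as in the diff under review; strip_drive is a hypothetical wrapper.
def strip_drive(path)
  path.sub(%r!\A\w:/!, "/")
end

["C:/foo", "/C:/foo", "/home/me/C:/foo", "CD:/foo"].each do |path|
  puts "#{path} -> #{strip_drive(path)}"
end
# C:/foo -> /foo
# /C:/foo -> /C:/foo
# /home/me/C:/foo -> /home/me/C:/foo
# CD:/foo -> CD:/foo
```

Only the first sample changes. A directory literally named C: on *nix would reach the pattern as /C:/foo after expansion against "/", so it would not be stripped.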

@kwokfu
Contributor
kwokfu commented Sep 29, 2016

@ashmaroli, thanks for pointing that out, but what I'm really interested in is what happens if the questionable path in the test starts with a drive name instead of what it is now.

I'm still working on getting all my new tests to pass on both platforms. Let's hope God strikes me with an idea later today so that I can get better test results tonight.

@kwokfu
Contributor
kwokfu commented Sep 29, 2016

@parkr, @ashmaroli, I believe the current changes are sufficient to handle both reported bugs without altering the original design of the function. I think introducing the suggested logic for Windows-only will introduce duplicate codes or changing the designate behaviour of the function, hence, not really wanted to incorporate in this patch (may be in another patch with discussion on how this function should behave?).

@parkr
Member
parkr commented Sep 29, 2016

I think introducing the suggested logic for Windows-only will introduce duplicate codes or changing the designate behaviour of the function, hence, not really wanted to incorporate in this patch (may be in another patch with discussion on how this function should behave?).

@kwokfu My primary concern is ensuring that all possible path traversal vectors are covered. This function exists to prevent path traversal. My secondary concern is doing The Right Thing with regard to drive names. What if you request C:/mysite and your site source is at D:/Users/My Documents/mysite? What is the expected behaviour there? Being able to tailor our path traversal checking to also respect drive names (instead of ignoring them as this PR does) is worth the potential code duplication.
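
To see what ignoring the drive name means for that example, here is a string-only sketch. It assumes the Windows behaviour described earlier in the thread (expansion keeps the C:/ prefix) and deliberately skips File.expand_path so that it runs the same on any host. The name join_ignoring_drive is hypothetical.

```ruby
# Hypothetical, string-only approximation; takes an already-expanded path.
def join_ignoring_drive(base, expanded)
  return expanded if expanded == base
  return expanded if expanded.start_with?(base.chomp("/") + "/")

  # Not under the base: drop a leading drive name and bolt the rest onto the base.
  File.join(base, expanded.sub(%r!\A\w:/!, "/"))
end

puts join_ignoring_drive("D:/Users/My Documents/mysite", "C:/mysite")
# D:/Users/My Documents/mysite/mysite
```

Under these assumptions the request for C:/mysite quietly lands inside the D: source tree, which is safe with respect to traversal but does not respect the drive the caller asked for.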

@ptoomey3 If you have time to load this up and try to crack it, I'd love to have your approval before I merge. FWIW, it passes our checks in this test suite. Thanks!

@kwokfu
Contributor
kwokfu commented Sep 29, 2016

I would also like to know whether there is any documentation or discussion restricting where we can create a new Jekyll site, such as under /css. I'm getting error while generating site under that path. If that has not been discussed previously, I might just file a bug and work from there.

@ashmaroli
Contributor

I'm getting error while generating site under that path.

@kwokfu Can you please elaborate further? What exactly was the error raised?

@kwokfu
Contributor
kwokfu commented Sep 30, 2016 edited

@ashmaroli, I just ran the same test on Windows and got the same error, so it looks like the same thing happens on both platforms. The output for Windows follows. I'll paste the trace in an issue if this is confirmed as a bug.

D:\>jekyll new css
D:\>cd css
D:\css> jekyll build
..ignored..
jekyll 3.2.1 | Error:  No such file or directory @ rb_file_s_stat - D:/css/css/Gemfile

No issue if I'm not using css as the path name.

@ashmaroli
Contributor

@kwokfu That bug is not related to Jekyll 3.2.1 but rather this branch alone.
I'm able to build and serve a new site created at css

@kwokfu
Contributor
kwokfu commented Sep 30, 2016 edited

@ashmaroli, that's weird. I'm not creating a new site using this branch. Are you creating the new site under root? It has to be under the root directory, on either Windows or *nix.

Update: It turns out that I was using this branch on Windows, but not on macOS. The output for macOS follows.

MacBook-Pro:css kwokfu$ jekyll build
Configuration file: /css/_config.yml
            Source: /css
       Destination: /css/_site
 Incremental build: disabled. Enable with --incremental
      Generating... 
jekyll 3.2.1 | Error:  No such file or directory @ rb_file_s_stat - /css/css/Gemfile
@ashmaroli
Contributor

Interesting find.. was able to reproduce the bug on Windows at C:\ using 3.2.1 and master. Surely warrants a dedicated ticket. Feel free to do so, if you have time..

@ashmaroli
Contributor

@kwokfu, the bug you discovered seems to happen due to the presence of a subdirectory also named css. Deleting that directory resolves this edge-case-scenario.
Jekyll 3.3 no longer requires that directory to be present in the root. It'll be handled by the theme gem.

@ptoomey3
ptoomey3 commented Oct 4, 2016

@ptoomey3 If you have time to load this up and try to crack it, I'd love to have your approval before I merge. FWIW, it passes our checks in this test suite. Thanks!

This looks ok. In the end, the path either has to be a prefix of the base or gets joined to the base. 👍
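
That invariant can be spot-checked with a small sketch. sketch_sanitized_path below is an approximation written for this illustration, not the merged code, and the inputs are made-up *nix-style paths. The point is only that every result is either the base itself or sits under it.

```ruby
# Approximation for illustration only; not the merged Jekyll implementation.
def sketch_sanitized_path(base, questionable)
  clean = File.expand_path(questionable, "/")  # resolves "." and ".." against "/"
  return clean if clean == base || clean.start_with?(base.chomp("/") + "/")

  File.join(base, clean.sub(%r!\A\w:/!, "/"))  # otherwise bolt it onto the base
end

base = "/srv/site"
inputs = ["index.html", "../../etc/passwd", "/etc/passwd",
          "/srv/sitemap.xml", "/srv/site/css/../a.css"]

inputs.each do |input|
  result = sketch_sanitized_path(base, input)
  under_base = result == base || result.start_with?(base + "/")
  puts "#{input} -> #{result} (under base: #{under_base})"
end
```

Traversal attempts such as ../../etc/passwd are first collapsed against "/" and then joined, so they cannot climb above the base in this sketch.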

@parkr parkr referenced this pull request Oct 5, 2016
Closed

Jekyll 3.3 Release Gameplan #5400

9 of 9 tasks complete
@parkr
Member
parkr commented Oct 5, 2016

This looks ok. In the end, the path either has to be a prefix of the base or gets joined to the base. 👍

Thank you SO much, @ptoomey3! 🔒

@parkr
parkr approved these changes Oct 5, 2016
@parkr
Member
parkr commented Oct 5, 2016

@jekyllbot: merge +bug

@jekyllbot jekyllbot merged commit 275f5a6 into jekyll:master Oct 5, 2016

1 of 2 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
continuous-integration/travis-ci/pr The Travis CI build passed
@jekyllbot jekyllbot added bug fix labels Oct 5, 2016
@kwokfu kwokfu deleted the kwokfu:patch-1 branch Oct 30, 2016