Add a Jekyll doctor warning for URLs that only differ by case #3171

Merged
merged 4 commits into from Dec 13, 2015

Conversation

Projects
None yet
6 participants
@akoeplinger
Contributor

akoeplinger commented Nov 29, 2014

Those URLs are problematic on case-insensitive file systems because one of the URLs is overwritten by the other. Fixes #3035.

Note: I can't seem to find tests for doctor, I'm assuming there are none?

Add a Jekyll doctor warning for URLs that only differ by case
Those URLs are problematic on case-insensitive file systems because one of the URLs is overwritten by the other.
Fixes #3035
lib/jekyll/commands/doctor.rb
+ urls[dest.downcase] << dest
+ else
+ urls[dest.downcase] = [dest]
+ end

This comment has been minimized.

@parkr

parkr Nov 30, 2014

Member

This if block can be simplified to:

(urls[dest.downcase] ||= []) << dest
@parkr

parkr Nov 30, 2014

Member

This if block can be simplified to:

(urls[dest.downcase] ||= []) << dest

This comment has been minimized.

@akoeplinger

akoeplinger Nov 30, 2014

Contributor

Sure, but this would make it inconsistent with collect_urls() above. I'd say consistency is more important than short code? Or should I change the other method too?

@akoeplinger

akoeplinger Nov 30, 2014

Contributor

Sure, but this would make it inconsistent with collect_urls() above. I'd say consistency is more important than short code? Or should I change the other method too?

This comment has been minimized.

@parkr

parkr Dec 1, 2014

Member

Consistency is not important to me here. Theoretically every method should do entirely different things :)

@parkr

parkr Dec 1, 2014

Member

Consistency is not important to me here. Theoretically every method should do entirely different things :)

+ urls_only_differ_by_case = true
+ Jekyll.logger.warn "Warning:", "The following URLs only differ" +
+ " by case. On a case-insensitive file system one of the URLs" +
+ " will be overwritten by the other: #{real_urls.join(", ")}"

This comment has been minimized.

@parkr

parkr Nov 30, 2014

Member

Is there a way to detect whether the current filesystem is case-sensitive and only check if so?

@parkr

parkr Nov 30, 2014

Member

Is there a way to detect whether the current filesystem is case-sensitive and only check if so?

This comment has been minimized.

@akoeplinger

akoeplinger Nov 30, 2014

Contributor

As @pathawks correctly mentions, the current FS doesn't matter as the site might be used on a different FS too. This is exactly what happened to us and made me raise the issue: one dev worked on Linux and didn't notice URLs only differed by case. Then another dev on OSX wanted to work on the site, but the page got overwritten.

@akoeplinger

akoeplinger Nov 30, 2014

Contributor

As @pathawks correctly mentions, the current FS doesn't matter as the site might be used on a different FS too. This is exactly what happened to us and made me raise the issue: one dev worked on Linux and didn't notice URLs only differed by case. Then another dev on OSX wanted to work on the site, but the page got overwritten.

@pathawks

This comment has been minimized.

Show comment
Hide comment
@pathawks

pathawks Nov 30, 2014

Member

Is there a way to detect whether the current filesystem is case-sensitive and only check if so?

Is it safe to always assume that the current system is the ultimate deployment target?

Member

pathawks commented Nov 30, 2014

Is there a way to detect whether the current filesystem is case-sensitive and only check if so?

Is it safe to always assume that the current system is the ultimate deployment target?

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Nov 30, 2014

Member

This is true! Ok, cool. @mattr- / @alfredxing, what do you think about this? Thanks for writing this up, @akoeplinger!

Member

parkr commented Nov 30, 2014

This is true! Ok, cool. @mattr- / @alfredxing, what do you think about this? Thanks for writing this up, @akoeplinger!

@alfredxing

This comment has been minimized.

Show comment
Hide comment
@alfredxing

alfredxing Dec 1, 2014

Member

👍 This is definitely useful, thanks for the PR!

Member

alfredxing commented Dec 1, 2014

👍 This is definitely useful, thanks for the PR!

lib/jekyll/commands/doctor.rb
+ urls[dest.downcase] = [dest]
+ end
+ end
+ urls

This comment has been minimized.

@parkr

parkr Dec 1, 2014

Member

How about

things.inject(urls) do |memo, thing|
  dest = thing.destination(destination)
  (memo[dest.downcase] ||= []) << dest
  memo
end
@parkr

parkr Dec 1, 2014

Member

How about

things.inject(urls) do |memo, thing|
  dest = thing.destination(destination)
  (memo[dest.downcase] ||= []) << dest
  memo
end

This comment has been minimized.

@akoeplinger

akoeplinger Dec 1, 2014

Contributor

Looks good, pushed it.

@akoeplinger

akoeplinger Dec 1, 2014

Contributor

Looks good, pushed it.

lib/jekyll/commands/doctor.rb
+ (memo[dest.downcase] ||= []) << dest
+ memo
+ end
+ urls

This comment has been minimized.

@parkr

parkr Dec 1, 2014

Member

This isn't necessary – the #inject call creates the memo based on urls already. Ideally we'd have:

def urls_only_differ_by_case(site)
  urls = case_insensitive_urls(site.pages + site.posts, destination)
  # etc
end

def case_insensitive_urls(things, destination)
  things.inject(Hash.new) do |memo, thing|
    dest = thing.destination(destination)
    (memo[dest.downcase] ||= []) << dest
    memo
  end
end

The inject call returns the memo object, and we can do it just once for all pages and posts rather than mutating things.

@parkr

parkr Dec 1, 2014

Member

This isn't necessary – the #inject call creates the memo based on urls already. Ideally we'd have:

def urls_only_differ_by_case(site)
  urls = case_insensitive_urls(site.pages + site.posts, destination)
  # etc
end

def case_insensitive_urls(things, destination)
  things.inject(Hash.new) do |memo, thing|
    dest = thing.destination(destination)
    (memo[dest.downcase] ||= []) << dest
    memo
  end
end

The inject call returns the memo object, and we can do it just once for all pages and posts rather than mutating things.

This comment has been minimized.

@akoeplinger

akoeplinger Dec 1, 2014

Contributor

Cool, thanks for walking me through this, Ruby is still very new for me 😄 Updated.

@akoeplinger

akoeplinger Dec 1, 2014

Contributor

Cool, thanks for walking me through this, Ruby is still very new for me 😄 Updated.

This comment has been minimized.

@parkr

parkr Dec 1, 2014

Member

No problem, thanks for listening!

@parkr

parkr Dec 1, 2014

Member

No problem, thanks for listening!

@akoeplinger

This comment has been minimized.

Show comment
Hide comment
@akoeplinger

akoeplinger Dec 15, 2014

Contributor

Is there anything else here I need to do?

Contributor

akoeplinger commented Dec 15, 2014

Is there anything else here I need to do?

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 15, 2014

Member

Looks good to me! @alfredxing? Thoughts?

Member

parkr commented Dec 15, 2014

Looks good to me! @alfredxing? Thoughts?

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 15, 2014

Member

I'm a bit concerned about performance but that's it.

Member

parkr commented Dec 15, 2014

I'm a bit concerned about performance but that's it.

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 15, 2014

Member

Want to write a couple quick unit tests for this?

Member

parkr commented Dec 15, 2014

Want to write a couple quick unit tests for this?

@akoeplinger

This comment has been minimized.

Show comment
Hide comment
@akoeplinger

akoeplinger Dec 15, 2014

Contributor

I'm a bit concerned about performance but that's it.

I tested on our website with ~2200 files and jekyll doctor takes 1.208s with this change and 1.192s without, so it's barely noticeable.

I'll take a look at adding some unit tests.

Contributor

akoeplinger commented Dec 15, 2014

I'm a bit concerned about performance but that's it.

I tested on our website with ~2200 files and jekyll doctor takes 1.208s with this change and 1.192s without, so it's barely noticeable.

I'll take a look at adding some unit tests.

@alfredxing

This comment has been minimized.

Show comment
Hide comment
@alfredxing

alfredxing Dec 15, 2014

Member

Looks good! I was a bit concerned about performance as well but it doesn't seem to bad based on @akoeplinger's stats.

Member

alfredxing commented Dec 15, 2014

Looks good! I was a bit concerned about performance as well but it doesn't seem to bad based on @akoeplinger's stats.

@akoeplinger

This comment has been minimized.

Show comment
Hide comment
@akoeplinger

akoeplinger Dec 15, 2014

Contributor

Hey guys, I added some tests here: akoeplinger@34ff0bb

I'm not really happy with them as I needed to make these two separate folders for the site source (I couldn't get anything in-memory working, basically cause I have zero insight into how the Jekyll internals work together), would be glad to hear a better approach.

Contributor

akoeplinger commented Dec 15, 2014

Hey guys, I added some tests here: akoeplinger@34ff0bb

I'm not really happy with them as I needed to make these two separate folders for the site source (I couldn't get anything in-memory working, basically cause I have zero insight into how the Jekyll internals work together), would be glad to hear a better approach.

+
+ should 'return success on a valid site/page' do
+ @site = Site.new(Jekyll.configuration({
+ "source" => File.join(source_dir, '/_urls_differ_by_case_valid'),

This comment has been minimized.

@parkr

parkr Dec 16, 2014

Member

You can use source_dir('_urls_differ_by_case_valid') here – it does all the joining for you :)

@parkr

parkr Dec 16, 2014

Member

You can use source_dir('_urls_differ_by_case_valid') here – it does all the joining for you :)

+ " will be overwritten by the other: #{real_urls.join(", ")}"
+ end
+ end
+ urls_only_differ_by_case

This comment has been minimized.

@parkr

parkr Dec 16, 2014

Member

Could you please add a test for this method to ensure it returns the proper boolean value?

@parkr

parkr Dec 16, 2014

Member

Could you please add a test for this method to ensure it returns the proper boolean value?

@@ -77,6 +92,13 @@ def collect_urls(urls, things, destination)
urls
end
+ def case_insensitive_urls(things, destination)

This comment has been minimized.

@parkr

parkr Dec 16, 2014

Member

Can you please add a test for this method to ensure it reduces properly into a hash?

@parkr

parkr Dec 16, 2014

Member

Can you please add a test for this method to ensure it reduces properly into a hash?

@mattr-

This comment has been minimized.

Show comment
Hide comment
@mattr-

mattr- Jan 2, 2015

Member

domain names are case sensitive according to RFC 4353. The rest is up to how the server wants to handle GET requests, but I'd rather eliminate the possibility of any duplicates due to case-sensitivity, so 👍 on this.

@akoeplinger It looks like there are some missing tests to add. Would you mind taking care of those so we can merge this?

Member

mattr- commented Jan 2, 2015

domain names are case sensitive according to RFC 4353. The rest is up to how the server wants to handle GET requests, but I'd rather eliminate the possibility of any duplicates due to case-sensitivity, so 👍 on this.

@akoeplinger It looks like there are some missing tests to add. Would you mind taking care of those so we can merge this?

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Jan 12, 2015

Member

@akoeplinger It looks like there are some missing tests to add. Would you mind taking care of those so we can merge this?

👍

Member

parkr commented Jan 12, 2015

@akoeplinger It looks like there are some missing tests to add. Would you mind taking care of those so we can merge this?

👍

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Jan 17, 2015

Member

Still interested in seeing this @akoeplinger?

Member

parkr commented Jan 17, 2015

Still interested in seeing this @akoeplinger?

@akoeplinger

This comment has been minimized.

Show comment
Hide comment
@akoeplinger

akoeplinger Jan 18, 2015

Contributor

@parkr ahh sorry guys, I totally missed the notifications here and was busy with other stuff. I'll add the tests tomorrow :)

Contributor

akoeplinger commented Jan 18, 2015

@parkr ahh sorry guys, I totally missed the notifications here and was busy with other stuff. I'll add the tests tomorrow :)

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Feb 14, 2015

Member

@akoeplinger Any updates?

Member

parkr commented Feb 14, 2015

@akoeplinger Any updates?

@parkr parkr merged commit 34ff0bb into jekyll:master Dec 13, 2015

1 check passed

continuous-integration/travis-ci The Travis CI build passed
Details
@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr Dec 13, 2015

Member

Merged in fdcd761.

Member

parkr commented Dec 13, 2015

Merged in fdcd761.

parkr added a commit that referenced this pull request Dec 13, 2015

@akoeplinger

This comment has been minimized.

Show comment
Hide comment
@akoeplinger

akoeplinger Dec 14, 2015

Contributor

@parkr omg, no idea how this fell off my radar, sorry for the lack of updates 😭 Thanks for merging!

Contributor

akoeplinger commented Dec 14, 2015

@parkr omg, no idea how this fell off my radar, sorry for the lack of updates 😭 Thanks for merging!

@akoeplinger akoeplinger deleted the akoeplinger:doctor-permalink-same-case-warning branch Dec 14, 2015

@jekyll jekyll locked and limited conversation to collaborators Feb 27, 2017

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.