Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Underscore can cause duplicate file inclusion. #83

Closed
donquixote opened this Issue · 18 comments

6 participants

@donquixote

The way that PSR-0 is defined, it can happen that two classes map to the same file.

class Aa\Bb\Cc -> file Aa/Bb/Cc.php
class Aa\Bb_Cc -> file Aa/Bb/Cc.php

Having require / include instead of require_once / include_once in the class loader, as suggested in the example implementations (and as implemented e.g. in Symfony2 or other projects) can crash, if you run

class_exists('Aa\\Bb\\Cc');
class_exists('Aa\\Bb_Cc');

And that respective file Aa/Bb/Cc.php contains the no-underscore version.
The second class_exists() will cause a duplicate file inclusion.

This can be avoided by using class_exists() with care. So it might be a wontfix.
Or, if we really want to solve it, we should include_once instead of include / require.

This issue is primarily meant for documentation and discussion, not necessarily asking for a change. I don't expect much more than a "seen it, thank you".

@henrikbjorn

the PSR says that the last part of a namespaced file is considered to be the filename and that _ in that part must not be replaced by a directory separator.

@donquixote

https://github.com/php-fig/fig-standards/blob/master/accepted/PSR-0.md
it says the opposite.
There is one section explicitly about underscores in the class name.

@philsturgeon
Owner

When would this ever be a problem in the real world? I can definitely do this, but if I have control over my own package then why would I ever build it like this?

@donquixote

I suppose it can only be a problem of

  • you rename a file or class, and something in the system still has the old name, or if
  • you have an automated process that guesses the class name based on the file name (and guesses wrong).

E.g.
1. your file contains a class Aa\Bb_Cc.
2. something in your code does a "new \Aa\Bb_Cc", knowing that this class does exist.
3. an automated process scans the directory and assumes Aa\Bb\Cc, so it does "class_exists('Aa\Bb\Cc')" and the file gets included again.

You can avoid this by using require_once on the 3rd step (or whenever you are not sure that the class does actually exist). So, as mentioned, this issue is only for documentation and "told you so".

@philsturgeon
Owner
@AmyStephen

@donquixote I looked at this and also didn't know for certain what to do with the information. I think this might be a good example of the type of specific data that would be best suited for the comments users can add at the bottom of a PHP page. If it happened to you, it will likely happen to someone else. Since it is fairly non-standard, adding it to the standard would likely confuse more people than it assist.

I agree with @philsturgeon - can you recommend very specifically what you think would be useful, by that I mean "Add this statement as it is written below (then put it in) to this page (then link to it)" keeping in mind the need to keep the standard very high level.

As time goes on, adding user notes or a Wiki could provide a place for these. @donquixote you might end up deciding there isn't a place for it right now, too. If so, that feedback would also be helpful.

@philsturgeon
Owner

@AmyStephen it wouldn't go on the PHP Comments in the manual, is that what you meant? This is not a "php problem", its known and expected behavior that some people find odd.

Anthony Ferrera pointed this out:

With PSR-0, multiple classes actually map to the same file. For example, all of the following map to the same file (Foo/Bar/Baz.php)

  • \Foo\Bar\Baz
  • \Foo\Bar_Baz
  • \Foo_Bar_Baz

Right, so don't have three classes called Bar and Baz. Done. Next! :)

@bobthecow

@philsturgeon She was pointing out the usefulness of the php.net user comments in situations like this. They are great for listing gotchas or workarounds or implementation details that are beyond the scope of the official documentation. I'd say those comments are one of PHP's greatest strengths :)

@philsturgeon
Owner
@donquixote

@Amy:
For my taste, it can be any place that can be found in google, and that can be refered to with a link. It doesn't need to be prominently advertised or anything.

This issue alone might already be good enough of a place. Maybe with an added summary saying "yes, we confirm that your observation is correct. 99% of developers don't need to worry about it. Here is some notes for the remaining 1%."

The main audience would be people who implement a class discovery mechanism based on directory scans.

@philsturgeon:

you have an automated process that guesses the class name based on the file name (and guesses wrong).

Improve your logic.

This is assuming that the people who design the discovery process actually know about this subtlety. In my experience, a lot of people are either not aware of it, or they don't care. Or even if the actual programmer knows, then it is likely that a number of people join the discussion who don't.

If you design such a system, there can be different ways to deal with this "issue":

  1. "We don't want to think about it." and hope that it doesn't blow up.
    This will probably work ok most of the time.
  2. You make an explicit policy saying that in those folders that you are going to scan for class discovery, you don't want any underscore after the last namespace separator.
    (a variation of that would be to say "noone is going to do that", without actually nailing it down in a policy)
    Or, more strictly, you require in your policy that any file in these folders must define the class that you would expect by the filename (otherwise, running the discovery again would cause the duplicate inclusion, independent of the underscore thingie)
  3. You circumvent the regular autoload mechanic in your discovery process. Instead, you require_once the file, then use class_exists($class, FALSE), to check if the class exists.
  4. You let your class loader do require_once / include_once instead of just require / include. This has a minor but measurable price in performance.
    (most existing implementations I have seen use include, not include_once)

I personally prefer option 3, which is easy enough to implement.

But I can understand if someone prefers option 1. In this case, this issue would simply be a place that explains why your code is blowing up (in the rare case that this happens).

I think Drupal 8 (discovery of test classes) is somewhere within 1 or 2. They adapted a lot from Symfony 2, so might be that sf2 contains some similar logic.

@AmyStephen

+1, Thanks @donquixote
This issue alone might already be good enough of a place.

@bobthecow - thanks for explaining my poorly made point.

@philsturgeon
Owner

I wrote up a more verbose explanation of the oddity here:

https://github.com/philsturgeon/psr0-naming-oddity

@donquixote

@philsturgeon:
I think your explanation is a bit misleading. It can be understood as saying that "new Package\Foo_Bar_Baz()" does actually create an instance of Package\Foo\Bar\Baz(). This is not the case. Instead it says the class does not exist.

I think it is more useful to play with class_exists(), because that will trigger the class loader and include the file, but it will not fatal out if the class does not exist.

@philsturgeon
Owner
@dongilbert

After giving this some thought, the only real FIX for this would be to set a preference, but this would have performance implications. The autoloader would have to change require to include, so that if the class wasn't found, it could continue to the next iteration of the autoloader, and it would have to make 2 attempts at including the file / class. The first time, just try to load the class by replacing namespace separators with directory separators, but no _'s. If it's included and the requested class exists, then return true. If it's included but the class does not exist, then try again, check again, and finally return false.

https://gist.github.com/dongilbert/5113138

Finally, the benefits of this is that it allows the end user to set up another fall back autoloader if they so desired. I can understand now why PSR-0 uses require instead of include, but it doesn't make it right. IMO, the autoloader needs to try to load the file, and then not return true just if the file has been included, because that's just assuming the matched file contained the desired class. At the very least it should check if the class exists and only return true if it has been loaded.

Hope that all makes sense, and I'm not just rambling.

@philsturgeon
Owner

The fix has already been mentioned I think. If you've got a weird piece of "class discovery" code which you know is going to look for things in a messed up way then you probably just want to manually include that one. I think thats probably good enough as this is a real edge-case anyway.

@philsturgeon

This is a known issue which has not current caused the internet to explode, and the issue has had no progress in 3 months. PSR-X is being worked on (a new autoloader) which will not allow these underscores, so the issue will be resolved that way.

@donquixote
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.