Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serve collections without serve single files #102

Open
chrvadala opened this issue Aug 30, 2013 · 31 comments
Open

Serve collections without serve single files #102

chrvadala opened this issue Aug 30, 2013 · 31 comments

Comments

@chrvadala
Copy link

Hi,
I'm using AssetManager with a collection resolver, but I think there is a security issue.

I have a situation like this

   'resolver_configs' => array(
        'collections' => array(
            'css/style.css' => array(
                'css/lessfile1.css',
                'css/lessfile2.css',
            ),
           ),

        'paths' => array(
            __DIR__ . '/../public',
        ),
    ),

       'filters' => array(
        'css/style.css' => array(
            array(
                'filter' => 'Lessphp',
            ),
        ),

This works correctly. I can see /css/style.css correctly. The security problem is related to paths. To use a collection I need to configure a path that expose my single files.
I want to serve only files merged and compressed, because in single files there are a lot of team's comments.

An easy solution could be like this:

     'resolver_configs' => array(
        'collections' => array(
            'css/style.css' => array(
                  __DIR__ . '/../public/css/lessfile1.css',
                  __DIR__ . '/../public/css/lessfile2.css',
            ),
           ),
    ),

       'filters' => array(
        'css/style.css' => array(
            array(
                'filter' => 'Lessphp',
            ),
        ),

But this doesn't work because I think that a collection expect an alias and not a path. Is there another solution?

@Ocramius
Copy link
Contributor

@work-out-web you can simply configure a MapResolver instead of a path stack resolver.

@chrvadala
Copy link
Author

I think that it's the same thing. If I use a MapResolver all single files are however exposed in public.

I want that nobody can access from an url like this http://www.example.com/css/lessfile1.css

@RWOverdijk
Copy link
Owner

@work-out-web I see your problem. It's not a "risk", it's just too much freedom; and you're right. What we could do is add a config key for restrictions (essentially blacklisting or whitelisting assets; for internal or external use).

Is this a big problem to you?

@Ocramius
Copy link
Contributor

Ah, I see the problem. I don't think it's a security concern though.

@chrvadala
Copy link
Author

@RWOverdijk I think that add a whitelist can be an extra work. In my opinion add a method to specify a full path in collections is the easist way to solve this problem and also avoiding BC.

 'css/style.css' => array(
    __DIR__ . '/../public/css/lessfile1.css',
    __DIR__ . '/../public/css/lessfile2.css',
    '/css/aliasfile.css'
),

@Ocramius
You're right but I think that expose files with a lot of team's comment can sometime be risky.
Not a serious risk, but still a risk. :D

@RWOverdijk
Copy link
Owner

@work-out-web We don't want to avoid BC, we want to avoid BC-breaks. If we add white/black-listing it has to be for dirs, paths, and aliases. While on that, we might as well add the possibility to add existing files to collections (without re-resolving them). This sounds like a nice update to me, personally. Can you work out a proposal?

@Ocramius What are your thoughts on this?

@Ocramius
Copy link
Contributor

@work-out-web mapping assets means the assets are available, and that must be clear to the developer.
Also, we're really just talking about assets. I don't think it's a security issue.

This is an overkill in my opinion, and requires a lot of rewrites and probably isn't even compatible with the current architecture.

I'd just suggest to mark it as "won't fix", since the benefit is near to 0, especially compared to the effort required to implement this.

@chrvadala
Copy link
Author

@RWOverdijk Your idea is not simply to realize, ad I accord with @Ocramius that it require a lot of rewrites.

Effectly my idea goes against this principle "mapping assets means the assets are available, and that must be clear to the developer" and this is bad.

I'm going to solve my problem with a custom resolver.
I hope however that this discussion has brought out an interesting point to be explored.

@RWOverdijk
Copy link
Owner

I guess it has. Though we actually bundle our assets ourselves, and prefer them to be accessible to the public (in case we need a file for a specific component on a completely different page).

@Thinkscape
Copy link

Meh, this just hit me in the face.

The way I usually work with asset collections (see assetic and node.js stuff) is that collections might be accessible in debug mode (devel mode). However, by default, individual files must not be accessible. I'd also call it a security risk - I had no idea that creating a collection exposes my files... so it doesn't fulfill the "clear to the developer" requirement.

What should be accessible is the collection itself and nothing more.

If I wanted individual files to be accessible, I'd expect one of the following:

  • I'd just us path/map resolver together with collection resolver, or
  • I'd use some special flag/option in collection resolver, i.e. expose_individual_files => true.

The above promotes explicitness and brevity.

@Thinkscape
Copy link

I see the problem in architecture... collection resolver is an aggregate, so it is able to use other resolvers for stackability. That's actually quite cool, but is source of the problem here.

One of the ways to solve this is to have a separate (private) resolver aggregate for collections. Other idea is to have 2 user-defined stacks for resolvers - i.e. public and private. The former would be used for MVC, while the latter would be used for aggregates. Whatcha think ?

@Ocramius
Copy link
Contributor

The idea is good, but it sounds like the implementation will be a hack...

@Thinkscape
Copy link

Another idea is separation of production and devel by moving out "compilation" (merging, filtering) outside of the standard asset serving.

The standard workflow (for most websites) is:

  1. devel : fiddling with tens source files, lessc/sass, images, js/coffee, require.js (or other AMD) because it helps with debugging.
  2. deploy : files get grouped, merged and minified.
  3. production : we get just a handful of compiled sprite/css/js files that are put into public/ or uploaded to CDN. Source files stay outside of webroot (i.e. in module/*/assets)

I've played with the module for a few hours but dumped it in favor of zf2-assetic-module because I could not easily do the above :\

@RWOverdijk
Copy link
Owner

It's not a big deal. If you want to serve combined files without exposing the individual files, add a compile step (as you really should). But, I see that not everyone does this. So it's probably an idea to add something for those people. (I don't really want to, as it goes against what I believe is the way of serving assets, but clearly there's need for it).

So how about this:

  • We add support for directly addressing files in collections. We'll add a resolver to the stack, being filePathResolver which will simply check if the file exists. Like this, you won't need the map.
  • We'll add simple acl. You can then by default specify the resource is available internal or public. To make this easy, we'll allow for it to be sent along with the config in the map / path / whatever alias you've used.

I'd say, let's implement both. The latter has the added benefit of less resolving, as we can already exclude files based on these rules.

@Thinkscape
Copy link

@RWOverdijk you're missing my point mate.

Acl is a terrible idea... Compilation is part of the asset management. Why would you use jsmin filter with AssetManager in development ?

The primary problem with this module is that it does not allow to clearly divide between production and development. Right now, in order to have compilation I'd have to use separate module, write my own or have separate sets of configs for assetmanager (which still wouldn't work as expected).

Here are the things that are missing:

  • Production means compilation.
  • Devel shouldn't use compilation or compiled versions of files.
  • I want to define my assets once and then use them in both production and devel.
  • I want helpers to expand collections in devel, so devel uses individual unminified files.
  • I want helpers to use minified files in production, so production uses big monolitic optimized files.
  • Individual files, that have been compiled (or has not been moved), must not be visible in production.

@Ocramius
Copy link
Contributor

@Thinkscape AssetManager never worked like that.

We still miss a compilation step, and a compilation step would simply go through all the defined resolvers and basically warm up the cache.

The primary problem with this module is that it does not allow to clearly divide between production and development. Right now, in order to have compilation I'd have to use separate module, write my own or have separate sets of configs for assetmanager (which still wouldn't work as expected).

That is _NOT_ a problem. That is how it was _DESIGNED_

Production means compilation.

Production does NOT mean compilation with AM. With AM, it's cache warnup

Devel shouldn't use compilation or compiled versions of files.

devel CAN and SHOULD use compilation, that's why we use AM and not BaconAssetLoader. Obviously, you don't want that to run at each page load

I want to define my assets once and then use them in both production and devel.

that's how it currently works - not sure what the problem is here

I want helpers to use minified files in production, so production uses big monolitic optimized files.

nope. Helpers simply require the collection smashed together. We don't override helpers like that, and it's just more work and more and more bugs potentially introduced for no real benefit

Individual files, that have been compiled (or has not been moved), must not be visible in production.

That's entirely up to the resolvers. If your resolver does not map to these files, then you're fine

@Ocramius
Copy link
Contributor

Again, asset manager does not _COMPILE_ a deployable set of assets. It works as a cache. And you can eventually warm it up.

@Ocramius
Copy link
Contributor

Additional note: please stay on topic and forget about the compilation part. Let's move that to another issue eventually.

@Thinkscape
Copy link

Additional note: please stay on topic and forget about the compilation part. Let's move that to another issue eventually.

Additional note: I am on topic :-)

That is NOT a problem. That is how it was DESIGNED

So the design is FLAWED.

WHY SO MANY CAPS @Ocramius ?

Production does NOT mean compilation with AM. With AM, it's cache warnup

Yes, I see this problem. Any time of the day, serving static compiled files (via httpd) is hundreds % faster than serving compiled files from some invisible directory via PHP process... so cache warmup is essentially quite useless. Cache would only make sense for compilable languages, i.e. lessCSS, because we need a css version from .less to be able to feed it to the browser. For collections however we should end up with a monolithic file for production (which you can call any way you like, as long as it's static and not fed through php)

devel CAN and SHOULD use compilation, that's why we use AM and not BaconAssetLoader. Obviously, you don't want that to run at each page load

Question: Why would you use minified js for development ?

that's how it currently works - not sure what the problem is here

It doesn't work.

Helpers simply require the collection smashed together. We don't override helpers like that, and it's just more work and more and more bugs potentially introduced for no real benefit

Really Marco ?

Let me spell it out for you: The benefit is debugging 30x individual JS/CSS files, as opposed to debugging a single monolithic 4MB file with 50.000 lines of code :-) You've mentioned once you've been playing with ExtJS. Have you ever tried debugging an app with ext.all.js (more than 100.000+ lines of code) ? both firefox and chrome usually hang and stutter when trying to display source or set a break point.

Individual files, that have been compiled (or has not been moved), must not be visible in production.

That's entirely up to the resolvers. If your resolver does not map to these files, then you're fine

I know that, however there are other problems that make this hard to accomplish with this module.

@Ocramius
Copy link
Contributor

It's not flawed, it's exactly how it should work. You basically run in an environment that reproduces the production environment as it should be. If you need to work in a dev environment with an unminified, uncompiled version of ExtJs, then configure your dev environment accordingly, but don't expect AssetManager to do the magic for you. That's not what it does, and you're just asking the wrong thing to the wrong project :)

As for the caps, they weren't about screaming at you - I was just underlining some keywords.

@Ocramius
Copy link
Contributor

And no, you're not on-topic. The topic is "how do we hide files from some resolvers?"

@Thinkscape
Copy link

And no, you're not on-topic. The topic is "how do we hide files from some resolvers?"

Yes I am. If the compilation/warmup worked as expected, we wouldn't have access to individual files from collection.

You basically run in an environment that reproduces the production environment as it should be. If you need to work in a dev environment with an unminified, uncompiled version of ExtJs, then configure your dev environment accordingly, but don't expect AssetManager to do the magic for you.

Good. You're slowly approaching the gist of the problem :-)

I know I could have 2 totally separate configs for compilation and "reproducing production environment" but that's stupid as a box of rocks.

The module is called "AssetManager" so I want it to manage my assets. The premise is, I define assets once and then I can easily control how they are served in production and development (using config merging and different files for production).

That's actually simplified use case. What you called "reproducing production env" is sometimes called "staging".

Let's take the ExtJS example:

Development

  • I have 200 JS files in a collection.
  • Individual files from collection are visible under a web path of /js/ext/*
  • Local path for files is module/Application/assets/js/ext/*
  • Files are fed by MVC via AssetManager.

Staging

  • I have 200 JS files in a collection.
  • I want AssetManager to simulate production
  • File is fed by MVC via AssetManager.
  • Collection (merged and minified) is accessible under web path /js/ext.js
  • The collection is merged and minified on-the-fly by AssetManager
  • Caching prevents compilation on each request

Production

  • I have 200 JS files in a collection
  • AssetManager merged and compiled the collection once and stored it under public/js/ext.js
  • Collection (merged and minified) is accessible under web path /js/ext.js
  • Collection is served as static file via httpd (no php).
  • AssetManager is inactive (no config? disabled?) - Individual files from collection result in 404

@Ocramius
Copy link
Contributor

No, you simply define the assets 3 times here

  1. for development
  2. for staging
  3. for production

Additionally, for production, you can just warm it up manually via varnish in a deployment script

@Thinkscape
Copy link

sigh 😪

If it was feasible and fulfilled the requirements I've stated 2 screens above, there wouldn't be any discussion.

@Ocramius
Copy link
Contributor

@Thinkscape what I'm saying is: if you are requiring from this module to be BaconAssetLoader, then use BaconAssetLoader :P

AssetManager works as a cache, and as a MVC fallback on 404s.

As for the settings for ExtJs, for the ~6 months this year that I've been using ExtJs (not right now) we simply had a context switch to use the dev or prod settings for assets.

@work-out-web and that's more interesting for you: each of the files in the ExtJs installation was accessible, even when serving the compiled version...

@RWOverdijk just my 2 cents on all the issue after thinking a bit more about it. Since we state somewhere in the docs that anything that is defined in the asset resolvers is basically accessible via HTTP unless there's a route matching and not producing a 404, I fail to see a security related issue in here. It's just a limitation, and seen the complexity introduced by just trying to solve this particular problem I'd just say that we should deal with it as a "won't fix". If a collection resolver has to produce assets from unaccessible assets, then it should be manually configured by the end user.

As for all the other questions in the issue, that's other issues that should eventually be reported.

I repeat myself for clearness: exposing the single files that are part of a collection resolver is not a problem. If it is, it is up to the end developer to manually build a collection resolver and register it with AssetManager.

@Thinkscape
Copy link

AssetManager works as a cache, and as a MVC fallback on 404s.

Then call it a AssetCache* (the asterisk is there so you can explain in detail why exactly it isn't doing what you're expecting it to do :-)

For some reason, you keep ignoring my use cases, examples and reasoning. I'm wasting my time. Good luck and goodbye.

@Ocramius
Copy link
Contributor

@Thinkscape call it whatever you want. Not going to change a project name for religious beliefs. AssetManager manages assets. How? By acting as a fallback cache.

As for the use cases, as said, I've been using it with ExtJs, with Varnish and both in dev and prod. I'm not ignoring your use cases, I'm just telling it how you're supposed to get shit running with THIS module (and not with what you think this module is).

If you can't wrap your head around how the module works then that's your problem, and you may as well just write a simple bash script that works for your use case. No real need to modify a module that works fine for quite a bit of people to handle everyone's use case.

@RWOverdijk
Copy link
Owner

I'm not going to read everything now, as I'm in a bit of chaos @ work. But @Thinkscape acl is fine, actually. You basically specify that a file is for internal use only; makes sense. Otherwise you cannot get those files in your collection. If you want a collection, your individual files are going to be publicly accessible. Unless you specify rules.

How about local (dist) config files by the way? That's a thing.

Otherwise, create an actual build step with growl (or something). Honestly, the acl is as close as you can get. (It's not actual acl btw. It's just a flag. I'm calling it acl because the flag will be used to decide if you have access, in a list of assets. Making it an access control list).

Also, kids, stop the yelling. :)

@stefanotorresi
Copy link

It's been a while but I wanted to add something to the discussion.

I've been using the module happily for quite some time now, but there are a few catches that could be addressed (which I'm gonna try to do with some PR in the future), and one of them is this very issue.

IMHO this is a potential security issue:
Let's say I have assets installed via Bower, and then want to configure some collections to pass those assets through some Assetic filters.
Having to map every single file of the bower packages can be a PITA. To make things easier I add a path stack for each package, or (worse) the whole bower vendor path. Now any file of those bower packages is actually served, there is no point in keeping them outside public. The security depends on how I trust those packages. I may trust them now, but who knows what bower is gonna deliver in a future update? You can see it's a liability.

Basically, the use case here is being able to decide what AssetManager is going to serve to a more fine grained degree. I don't think it's a request to be marked as 'a developer concern'.

Having the possibility to determine whether or not a resolved assets can be made available to public or just for internal use (i.e. collections) would be a very nice feature.

@RWOverdijk
Copy link
Owner

@stefanotorresi I agree. That would be a very nice feature. :) It can be solved with direct paths to files for collections, or like you said, having the option to specify the accessibility of the asset.

@leogr
Copy link

leogr commented Mar 16, 2014

IMHO, I agree with @Ocramius about the AssetManager's design, but I still think it's a potential security issue, because it's not easy to understand for a new developer what AssetManager is doing.

For instance, first time I played with AM, just trying to create a collection with some JS libs included via composer, I did this in the Application module config:

        'paths' => array(
           __DIR__ . '/../../../vendor',
        ),

After a couple of hours I just realized I exposed all files contained into vendor, including AM's sources :)
I know, it was my fault, but when I was reading the doc I really thought that the specified files in collection will be accessibile, not all files in __DIR__ . '/../../../vendor'

Maybe someone else could be the same mistake.

So I agree with @RWOverdijk and @stefanotorresi that having the possibility to specify the accessibility of the asset would be a really nice feature, also I suggest that by default AM should expose only files declared in collection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants