Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix remote-path normalization (#127) #128

Closed
wants to merge 2 commits into from

Conversation

bezoerb
Copy link

@bezoerb bezoerb commented Mar 1, 2017

No description provided.

@@ -9,7 +9,6 @@ var cloneable = require('cloneable-readable');
var replaceExt = require('replace-ext');
var cloneStats = require('clone-stats');
var cloneBuffer = require('clone-buffer');
var removeTrailingSep = require('remove-trailing-separator');
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to normalize to allow remote paths with trailing slash

return str;
}

if (urlRegex().test(str) && !/^www\./.test(str)) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

url-regex matches on strings like www.google.de which are also a valid directory.
So let path.normalize do the normalization

"replace-ext": "^1.0.0"
"replace-ext": "^1.0.0",
"normalize-url": "1.6.1",
"url-regex": "3.2.0"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix versions for normalize-urland url-regex as the latest version only support node >= 4.
Could be updated when vinyl drops node 0.10...0.12 support

@bezoerb bezoerb mentioned this pull request Mar 1, 2017
@yocontra
Copy link
Member

yocontra commented Mar 1, 2017

Didn't look yet, but we should have tests for:

  • custom protocols
  • http + https
  • ftp
  • no protocol
  • network drives
  • relative paths
  • absolute paths

Will review later today

@phated
Copy link
Member

phated commented Mar 1, 2017

Definitely needs all of the tests stated by @contra

I'd also like to wait on feedback from @darsain @jonschlinkert and @doowb

@jonschlinkert
Copy link

jonschlinkert commented Mar 1, 2017

Hmm, since path handling in vinyl is completely geared towards file system paths, maybe this is an opportunity to rethink the URL handling so that it's more idiomatic for URLs.

For example:

  • the "domain" (protocol plus hostname etc) of a URL is functionally equivalent to file.cwd or file.base
  • the path or pathname of a URL if functionally equivalent to the relative part of a file path

Maybe we should add support for file.url as an alternative to file.cwd? I'm not sure what the best approach is, but my gut reaction is that either a) URL normalization doesn't really belong here, so maybe we should create an adapter like vinyl-url or something (although this isn't really about adapters, it's about the actual virtual object being created, so even if there was an adapter for handling requests, it still wouldn't address how the URL segments should map over to file path segments) or b) if it does belong here, we should do it the right way and add support for all of the URL segments that are created by url.parse() (@contra is touching on this I think). Just my 2c.

edit: idea... we could just create conventions for how the URL segments created by url.parse() map over to the existing path segments. Then, if a user defines {url: true} (and/or we can use the regex check to determine this) when creating a new file, we can parse it as a URL into those segments. So the result would be a vinyl file object that has the same path segments and conventions as existing vinyl files, but using a URL and normalization logic that is idiomatic for URLs.

edit edit: regardless of the approach, it seems like the code in the pr doesn't make sense for this lib. If we're going to go to these measures for normalization, maybe it's better to allow a normalize function to be passed so that users can deal with edge cases. The normalize-url module does some questionable things on urls.

@darsain
Copy link
Contributor

darsain commented Mar 1, 2017

This feels like an entrance to a very deep and complicated rabbit hole that will lead to a lot of edge cases and weird behavior as we try to unify 2 different resource pointing concepts into one ambiguous monster.

For start, how would we handle search and hash parts of the url? or ports, or username:password pairs, or tons and tons of other URI stuff... the RFC 3987 is a very nasty beast to tame: http://www.faqs.org/rfcs/rfc3987.html

I think it would be simpler to create a separate "virtual online resource format" project, differentiate between them with isVinyl/isUril (example name) or something like that, and have a separate code paths for each, since that will be necessary anyways as you need different APIs and tools to deal with files and network requests.

@yocontra
Copy link
Member

yocontra commented Mar 1, 2017

It worked fine prior to path normalization being added - vinyl-http, vinyl-ftp, etc. are all things that exist (now broken due to normalization addition)

Worst case we can revert the normalization and do it a layer up

@jonschlinkert
Copy link

This feels like an entrance to a very deep and complicated rabbit hole that will lead to a lot of edge cases and weird behavior as we try to unify 2 different resource pointing concepts into one ambiguous monster.

yeah, this sums up what I was thinking as well.

@phated
Copy link
Member

phated commented Mar 2, 2017

@contra it was a major bump because it changed path semantics

@bezoerb
Copy link
Author

bezoerb commented Mar 2, 2017

it was a major bump because it changed path semantics

That's why i asked if the dropped support for remote paths was the intended behaviour and vinyl should focus on file system paths. What was the the goal in introducing normalization? What's the advantage in making this lib more strictly focused on filesystems?

edit edit: regardless of the approach, it seems like the code in the pr doesn't make sense for this lib. If we're going to go to these measures for normalization, maybe it's better to allow a normalize function to be passed so that users can deal with edge cases.

I personally like the idea. This would address all concerns regarding this pr and let's the user opt-in to the opinionated normalization process which in my case would be totally fine to fix my issue.

Maybe i am in the wrong position to make decisions here but i would be fine to propose another PR for the normalization overloading ;)

@yocontra
Copy link
Member

yocontra commented Mar 2, 2017

If the user is passing in a normalize function then why wouldn't they normalize the path before passing it in? Not a fan of that. I don't see why it would be a huge deal to mandate people normalize before passing it in (previous behavior, better docs) and leave it out of the base level library. What was the normalization fixing? IIRC every adapter does it's own normalization.

@phated
Copy link
Member

phated commented Mar 2, 2017

I completely disagree. However, I've actually thought of an elegant way to handle both. I'll write up my thoughts in the morning.

@phated
Copy link
Member

phated commented Mar 2, 2017

I actually believe this problem can and should be solved by my idea of an Enhanced Stat Object (#105). Usage would look something like:

var file = new Vinyl({
  path: 'http://google.com',
  stat: {
    isRemote: true
  }
})

The isRemote flag in the Enhanced Stat Object would indicate that a Vinyl resource is being constructed with a remote path and would disable normalization (or do URL normalization). Also, Enhanced Stat Objects would allow for property changes and track history, a non-remote file could be converted to a remote file and users would know where it happened in a pipeline.

@darsain
Copy link
Contributor

darsain commented Mar 2, 2017

@phated yup, that'd be a pretty good solution

@bezoerb
Copy link
Author

bezoerb commented Mar 2, 2017

Closing this PR because it won't make it into master. Thanks for the discussion/feedback.
In my opinion the enhanced stat object solution entails more maintenance work as there will always be edge-cases which need to be fixed and it takes away the control from the user.

I would go for a) "drop the normalization + enhance documentation" or b) something like #130.

But no matter what option you'll choose. I'm sure it will work out ;)

@bezoerb bezoerb closed this Mar 2, 2017
@bezoerb bezoerb deleted the bug/remote-paths branch March 2, 2017 22:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants