Added more supporting methods for URN paths. #201

Merged
merged 6 commits into from
Mar 31, 2015
159 changes: 112 additions & 47 deletions src/URI.js
@@ -324,6 +324,42 @@
'%3D': '='
}
}
},
urnpath: {
// The characters under `encode` are the characters called out by RFC 2141 as being acceptable
// for usage in a URN. RFC2141 also calls out "-", ".", and "_" as acceptable characters, but
// these aren't encoded by encodeURIComponent, so we don't have to call them out here. Also
// note that the colon character is not featured in the encoding map; this is because URI.js
// gives the colons in URNs semantic meaning as the delimiters of path segments, and so it
Member

If : is neither the URN path delimiter, nor the "industry default path delimiter", we need to talk about what is.

// should not appear unencoded in a segment itself.
// See also the note above about RFC3986 and capitalized hex digits.
encode: {
expression: /%(21|24|27|28|29|2A|2B|2C|3B|3D|40)/ig,
map: {
'%21': '!',
'%24': '$',
'%27': '\'',
'%28': '(',
'%29': ')',
'%2A': '*',
'%2B': '+',
'%2C': ',',
'%3B': ';',
'%3D': '=',
'%40': '@'
}
},
// These characters are the characters called out by RFC2141 as "reserved" characters that
// should never appear in a URN, plus the colon character (see note above).
decode: {
expression: /[\/\?#:]/g,
map: {
'/': '%2F',
'?': '%3F',
'#': '%23',
':': '%3A'
}
}
}
};
URI.encodeQuery = function(string, escapeQuerySpace) {
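To make the role of the `urnpath` table concrete, here is a minimal, self-contained sketch of how such an `encode`/`decode` pair could be applied to a single URN path segment. It rests on assumptions that the diff does not show: that the generated accessor first runs a strict `encodeURIComponent`-style encoder (or `decodeURIComponent`), then applies the table's `expression` and `map`, and that malformed input is returned unchanged. The names `urnpathTable`, `strictEncode`, `encodeUrnSegment`, and `decodeUrnSegment` are hypothetical stand-ins, not URI.js API.

```javascript
// Hypothetical copy of the `urnpath` table from the hunk above.
const urnpathTable = {
  encode: {
    expression: /%(21|24|27|28|29|2A|2B|2C|3B|3D|40)/ig,
    map: {
      '%21': '!', '%24': '$', '%27': '\'', '%28': '(', '%29': ')', '%2A': '*',
      '%2B': '+', '%2C': ',', '%3B': ';', '%3D': '=', '%40': '@'
    }
  },
  decode: {
    expression: /[\/\?#:]/g,
    map: { '/': '%2F', '?': '%3F', '#': '%23', ':': '%3A' }
  }
};

// Assumption: escape the characters encodeURIComponent leaves alone (!'()*),
// so that the table alone decides which escapes are turned back into literals.
function strictEncode(string) {
  return encodeURIComponent(string).replace(/[!'()*]/g, function(c) {
    return '%' + c.charCodeAt(0).toString(16).toUpperCase();
  });
}

// Encode everything, then restore the characters the table marks as acceptable.
function encodeUrnSegment(string) {
  return strictEncode(string + '').replace(urnpathTable.encode.expression, function(c) {
    return urnpathTable.encode.map[c.toUpperCase()];
  });
}

// Decode everything, then re-escape the characters that must never appear
// literally inside a segment (including the colon delimiter).
function decodeUrnSegment(string) {
  try {
    return decodeURIComponent(string + '').replace(urnpathTable.decode.expression, function(c) {
      return urnpathTable.decode.map[c];
    });
  } catch (e) {
    // Malformed escape sequence: hand the input back untouched.
    return string;
  }
}

console.log(encodeUrnSegment('Magic: the Gathering'));
console.log(decodeUrnSegment('Magic%3A%20the%20Gathering'));
```

Under these assumptions a colon inside a segment stays escaped in both directions, which is what keeps it from being mistaken for a segment delimiter.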
@@ -366,6 +402,22 @@

return segments.join('/');
};
URI.recodeURNPath = function(string) {
Member

URI.recodeUrnPath

This method looks identical to recodePath() except for the delimiter and the recode callback. It should probably be refactored into a generator, the way generatePrefixAccessor() is used for accessors.

Contributor Author

I'll do that. Interestingly, both of those look like their decodePath counterparts, so I might be able to make a generator for those as well.

var segments = (string + '').split(':');
for (var i = 0, length = segments.length; i < length; i++) {
segments[i] = URI.encodeURNPathSegment(URI.decode(segments[i]));
}

return segments.join(':');
};
URI.decodeURNPath = function(string) {
var segments = (string + '').split(':');
for (var i = 0, length = segments.length; i < length; i++) {
segments[i] = URI.decodeURNPathSegment(segments[i]);
}

return segments.join(':');
};
// generate encode/decode path functions
var _parts = {'encode':'encode', 'decode':'decode'};
var _part;
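The generator refactoring discussed in the review thread above could look roughly like the following sketch. It is not the code that was eventually committed. `generateSegmentedCoder`, `recodeSegment`, `recodeUrnPathSketch`, and `recodePathSketch` are hypothetical names, and plain `encodeURIComponent`/`decodeURIComponent` stand in for `URI.encodeURNPathSegment` and `URI.decode` so that the block runs on its own.

```javascript
// Hypothetical generator: returns a function that applies `codec` to every
// `separator`-delimited segment of a path and joins the results again.
function generateSegmentedCoder(separator, codec) {
  return function(string) {
    var segments = (string + '').split(separator);
    for (var i = 0, length = segments.length; i < length; i++) {
      segments[i] = codec(segments[i]);
    }
    return segments.join(separator);
  };
}

// Stand-in for encodeURNPathSegment(decode(segment)); an assumption made
// only to keep the sketch self-contained.
function recodeSegment(segment) {
  return encodeURIComponent(decodeURIComponent(segment));
}

// The same generator serves both delimiters.
var recodeUrnPathSketch = generateSegmentedCoder(':', recodeSegment);
var recodePathSketch = generateSegmentedCoder('/', recodeSegment);

console.log(recodeUrnPathSketch('games:cards:Magic%3A the Gathering'));
console.log(recodePathSketch('/a b/c'));
```

Because only the separator and the per-segment callback vary, the decode counterparts could be produced by the same generator with a different `codec`.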
@@ -389,6 +441,10 @@
URI[_part + 'PathSegment'] = generateAccessor('pathname', _parts[_part]);
}

for (_part in _parts) {
Member

is there a reason we need two loops here?

Contributor Author

Well, that's embarrassing. Fixed in upcoming commit.

URI[_part + 'URNPathSegment'] = generateAccessor('urnpath', _parts[_part]);
}

URI.encodeReserved = generateAccessor('reserved', 'encode');

URI.parse = function(string, parts) {
@@ -946,9 +1002,13 @@
 p.pathname = function(v, build) {
 if (v === undefined || v === true) {
 var res = this._parts.path || (this._parts.hostname ? '/' : '');
-return v ? URI.decodePath(res) : res;
+return v ? (this._parts.urn ? URI.decodeURNPath : URI.decodePath)(res) : res;
 } else {
-this._parts.path = v ? URI.recodePath(v) : '/';
+if (this._parts.urn) {
+this._parts.path = v ? URI.recodeURNPath(v) : '';
+} else {
+this._parts.path = v ? URI.recodePath(v) : '/';
+}
 this.build(!build);
 return this;
 }
@@ -1624,6 +1684,7 @@
if (this._parts.urn) {
return this
.normalizeProtocol(false)
.normalizePath(false)
.normalizeQuery(false)
.normalizeFragment(false)
.build();
@@ -1670,63 +1731,67 @@
return this;
};
p.normalizePath = function(build) {
if (this._parts.urn) {
return this;
}

if (!this._parts.path || this._parts.path === '/') {
return this;
}

var _was_relative;
var _path = this._parts.path;
var _leadingParents = '';
var _parent, _pos;

// handle relative paths
if (_path.charAt(0) !== '/') {
_was_relative = true;
_path = '/' + _path;
}
if (this._parts.urn) {
Member

can this be structured in a way that avoids the "christmas tree" effect of indentation?

Contributor Author

It certainly can be; in order to do that, we'd have to duplicate the this.build(!build) line. That's probably not so bad, but I think I can just pull the parent-resolving part of the code into its own subroutine, and it won't be so hideously overindented then.

if (!_path) {
return this;
}
_path = URI.recodeURNPath(this._parts.path);
} else {
if (!_path || _path === '/') {
return this;
}
var _was_relative;
var _leadingParents = '';
var _parent, _pos;

// handle relative paths
if (_path.charAt(0) !== '/') {
_was_relative = true;
_path = '/' + _path;
}

// resolve simples
_path = _path
.replace(/(\/(\.\/)+)|(\/\.$)/g, '/')
.replace(/\/{2,}/g, '/');
// resolve simples
_path = _path
.replace(/(\/(\.\/)+)|(\/\.$)/g, '/')
.replace(/\/{2,}/g, '/');

// remember leading parents
if (_was_relative) {
_leadingParents = _path.substring(1).match(/^(\.\.\/)+/) || '';
if (_leadingParents) {
_leadingParents = _leadingParents[0];
// remember leading parents
if (_was_relative) {
_leadingParents = _path.substring(1).match(/^(\.\.\/)+/) || '';
if (_leadingParents) {
_leadingParents = _leadingParents[0];
}
}
}

// resolve parents
while (true) {
_parent = _path.indexOf('/..');
if (_parent === -1) {
// no more ../ to resolve
break;
} else if (_parent === 0) {
// top level cannot be relative, skip it
_path = _path.substring(3);
continue;
// resolve parents
while (true) {
_parent = _path.indexOf('/..');
if (_parent === -1) {
// no more ../ to resolve
break;
} else if (_parent === 0) {
// top level cannot be relative, skip it
_path = _path.substring(3);
continue;
}

_pos = _path.substring(0, _parent).lastIndexOf('/');
if (_pos === -1) {
_pos = _parent;
}
_path = _path.substring(0, _pos) + _path.substring(_parent + 3);
}

_pos = _path.substring(0, _parent).lastIndexOf('/');
if (_pos === -1) {
_pos = _parent;
// revert to relative
if (_was_relative && this.is('relative')) {
_path = _leadingParents + _path.substring(1);
}
_path = _path.substring(0, _pos) + _path.substring(_parent + 3);
}

// revert to relative
if (_was_relative && this.is('relative')) {
_path = _leadingParents + _path.substring(1);
_path = URI.recodePath(_path);
}

_path = URI.recodePath(_path);
this._parts.path = _path;
this.build(!build);
return this;
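The subroutine mentioned in the review thread above ("pull the parent-resolving part of the code into its own subroutine") might be sketched as follows. The loop body is taken from the hunk as shown; the function name `resolveParents` and its standalone form are hypothetical, and this is not necessarily how the follow-up commit was written.

```javascript
// Hypothetical helper: the "resolve parents" loop from the hunk above, lifted
// out so the caller no longer needs an extra level of nesting.
function resolveParents(path) {
  var parent, pos;
  while (true) {
    parent = path.indexOf('/..');
    if (parent === -1) {
      // no more ../ to resolve
      break;
    } else if (parent === 0) {
      // top level cannot be relative, skip it
      path = path.substring(3);
      continue;
    }

    // drop the segment immediately before the "/.."
    pos = path.substring(0, parent).lastIndexOf('/');
    if (pos === -1) {
      pos = parent;
    }
    path = path.substring(0, pos) + path.substring(parent + 3);
  }
  return path;
}

console.log(resolveParents('/a/b/../c'));
console.log(resolveParents('/../../www'));
```

With the loop isolated like this, the URN branch and the regular branch of `normalizePath` can each stay one level deep.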
29 changes: 29 additions & 0 deletions test/test.js
@@ -236,6 +236,20 @@
equal(u.pathname(), '/', 'empty absolute path');
equal(u.toString(), '/', 'empty absolute path to string');
});
test('URN paths', function() {
var u = new URI('urn:uuid:6e8bc430-9c3a-11d9-9669-0800200c9a66?foo=bar');
u.pathname('uuid:de305d54-75b4-431b-adb2-eb6b9e546013');
equal(u.pathname(), 'uuid:de305d54-75b4-431b-adb2-eb6b9e546013');
equal(u + '', 'urn:uuid:de305d54-75b4-431b-adb2-eb6b9e546013?foo=bar');

u.pathname('');
equal(u.pathname(), '', 'changing pathname ""');
equal(u+'', 'urn:?foo=bar', 'changing url ""');

u.pathname('music:classical:Béla Bártok%3a Concerto for Orchestra');
equal(u.pathname(), 'music:classical:B%C3%A9la%20B%C3%A1rtok%3A%20Concerto%20for%20Orchestra', 'path encoding');
equal(u.pathname(true), 'music:classical:Béla Bártok%3A Concerto for Orchestra', 'path decoded');
});
test('query', function() {
var u = new URI('http://example.org/foo.html');
u.query('foo=bar=foo');
@@ -1050,6 +1064,20 @@
u = URI('/../../../../../www/common/js/app/../../../../www_test/common/js/app/views/view-test.html');
u.normalize();
equal(u.path(), '/www_test/common/js/app/views/view-test.html', 'parent absolute');

// URNs
u = URI('urn:people:authors:poets:Shel Silverstein');
u.normalize();
equal(u.path(), 'people:authors:poets:Shel%20Silverstein');

u = URI('urn:people:authors:philosophers:Søren Kierkegaard');
u.normalize();
equal(u.path(), 'people:authors:philosophers:S%C3%B8ren%20Kierkegaard');

// URNs path separator preserved
u = URI('urn:games:cards:Magic%3A the Gathering');
u.normalize();
equal(u.path(), 'games:cards:Magic%3A%20the%20Gathering');
});
test('normalizeQuery', function() {
var u = new URI('http://example.org/foobar.html?');
@@ -1559,6 +1587,7 @@

equal(URI.decodeQuery('%%20'), '%%20', 'malformed URI component returned');
equal(URI.decodePathSegment('%%20'), '%%20', 'malformed URI component returned');
equal(URI.decodeURNPathSegment('%%20'), '%%20', 'malformed URN component returned');
});
test('encodeQuery', function() {
var escapeQuerySpace = URI.escapeQuerySpace;