New README, removed old implementation and tests.

1 parent db2bce2 commit da88ae4b26e65f6a64d2c25b2423374c6e19eb7a @zacharyvoase committed Feb 11, 2012
Showing with 131 additions and 542 deletions.
  1. +128 −63 README.md
  2. +3 −2 setup.py
  3. +0 −142 test.py
  4. +0 −335 urlobject.py
191 README.md
@@ -1,73 +1,138 @@
-# urlobject.py v0.5
+# URLObject 2
-`URLObject` is a utility class for manipulating URLs.
+`URLObject` is a utility class for manipulating URLs. The latest incarnation of
+this library builds upon the ideas of its predecessor, but aims for a clearer
+API, focusing on proper method names over operator overrides. It's also being
+developed from the ground up in a test-driven manner, and has full Sphinx
+documentation.
-## Example Usage
+## Tour
-Here is how you use the library:
-
>>> from urlobject import URLObject
- >>> url = URLObject(scheme='http', host='example.com')
- >>> print url
- http://example.com/
- >>> print url / 'some' / 'path'
- http://example.com/some/path
- >>> print url & ('key', 'value')
- http://example.com/?key=value
- >>> print url & ('key', 'value') & ('key2', 'value2')
- http://example.com/?key=value&key2=value2
- >>> print url * 'fragment'
- http://example.com/#fragment
- >>> print url / u'\N{LATIN SMALL LETTER N WITH TILDE}'
- http://example.com/%C3%B1
- >>> url
- <URLObject(u'http://example.com/') at 0x...>
- >>> new_url = url / 'place'
- >>> new_url
- <URLObject(u'http://example.com/place') at 0x...>
- >>> new_url &= 'key', 'value'
- >>> new_url
- <URLObject(u'http://example.com/place?key=value') at 0x...>
- >>> new_url &= 'key2', 'value2'
- >>> new_url
- <URLObject(u'http://example.com/place?key=value&key2=value2') at 0x...>
- >>> new_url |= 'key', 'newvalue'
- >>> new_url
- <URLObject(u'http://example.com/place?key2=value2&key=newvalue') at 0x...>
-
-## Important points to note
-
-* URLObjects are completely unicode-aware (they subclass `unicode`). This also means that international hostnames will be encoded to IDNA format, and international characters in pathnames will be automatically escaped. You should continue using unicode values for everything; the various components will be en/decoded on-the-fly.
-
-* `url & (key, value)` adds `key=value` to URL, even if `key` is already present as a query parameter. This allows you to have multiple appearances of `key` in the query.
-
-* `url | (key, value)` adds `key=value` to URL, removing any previous appearance of `key` in the query parameters.
-
-* `url & dictionary` and `url | dictionary` work similarly to their `(key, value)` counterparts, only they add every key, value pair in the dictionary to the query string. You can also pass in a list of key, value pairs.
-
-* `url / 'path'` adds `'path'` to the current path, quoting special characters if necessary.
-
-* `url // 'path'` sets the path to `'path'`, removing the current path if present.
-
-* `url * 'fragment'` sets the fragment to `'fragment'`.
-* `url ^ 123` sets the port number to `123`.
+Create a `URLObject` with a string representing a URL. `URLObject` is a regular
+subclass of `unicode`; it just adds several properties and methods which make it
+easier to manipulate URLs. All the basic slots from `urlsplit` are there:
-* `url.with_*(value)` can be done with scheme, host, port, path, query and fragment, returning a new URL object with the value in that place.
-
-* `url.without_port()`, `url.without_path()`, `url.without_query()` and `url.without_fragment()` all exist and do something obvious.
-
-* Operations return a *new* URL object (URL objects are immutable).
-
-## Hints and tips
-
-* If a URL's scheme is `'http'` and you try to set the port to 80, it is equivalent to not specifying the port (same goes for `'https'`, `'ftp'` and `'ftps'` for their appropriate ports).
-
-* If you need to end the path with `'/'`, you can do either `url / ''` or `url / 'last_component/'`.
-
-* The query parameters are available as a list through the `query_list()` method and as a dictionary via `query_dict()`. By default, the latter method will return a dictionary with lists as the values, corresponding to potential multiple occurrences of the same key. You can just take the last value by passing the `seq=False` keyword argument to the method.
+ >>> url = URLObject("https://github.com/zacharyvoase/urlobject?spam=eggs#foo")
+ >>> url
+ URLObject('https://github.com/zacharyvoase/urlobject?spam=eggs#foo')
+ >>> unicode(url)
+ u'https://github.com/zacharyvoase/urlobject?spam=eggs#foo'
+ >>> url.scheme
+ u'https'
+ >>> url.netloc
+ u'github.com'
+ >>> url.hostname
+ u'github.com'
+ >>> (url.user, url.password)
+ (None, None)
+ >>> print url.port
+ None
+ >>> url.path
+ URLPath(u'/zacharyvoase/urlobject')
+ >>> url.query
+ QueryString(u'spam=eggs')
+ >>> url.fragment
+ u'foo'
+
+You can replace any of these slots using a `with_*()` method. Remember
+that, because `unicode` (and therefore `URLObject`) is immutable, these methods
+all return new URLs:
+
+ >>> url.with_scheme('http')
+ URLObject('http://github.com/zacharyvoase/urlobject?spam=eggs#foo')
+ >>> url.with_netloc('example.com')
+ URLObject('https://example.com/zacharyvoase/urlobject?spam=eggs#foo')
+ >>> url.with_auth('alice', '1234')
+ URLObject('https://alice:1234@github.com/zacharyvoase/urlobject?spam=eggs#foo')
+ >>> url.with_path('/some_page')
+ URLObject('https://github.com/some_page?spam=eggs#foo')
+ >>> url.with_query('funtimes=yay')
+ URLObject('https://github.com/zacharyvoase/urlobject?funtimes=yay#foo')
+ >>> url.with_fragment('example')
+ URLObject('https://github.com/zacharyvoase/urlobject?spam=eggs#example')
+
+For the query and fragment, `without_*()` methods also exist:
+
+ >>> url.without_query()
+ URLObject('https://github.com/zacharyvoase/urlobject#foo')
+ >>> url.without_fragment()
+ URLObject('https://github.com/zacharyvoase/urlobject?spam=eggs')
+
+
+### Path
+
+The `path` property is an instance of `URLPath`, which has several methods and
+properties for manipulating the path string:
+
+ >>> url.path
+ URLPath(u'/zacharyvoase/urlobject')
+ >>> url.path.parent
+ URLPath(u'/zacharyvoase')
+ >>> url.path.segments
+ ('zacharyvoase', 'urlobject')
+ >>> url.path.add_segment('subnode')
+ URLPath(u'/zacharyvoase/urlobject/subnode')
+ >>> url.path.root
+ URLPath(u'/')
+
+Some of these are aliased on the URL itself:
+
+ >>> url.parent
+ URLObject('https://github.com/zacharyvoase?spam=eggs#foo')
+ >>> url.add_path_segment('subnode')
+ URLObject('https://github.com/zacharyvoase/urlobject/subnode?spam=eggs#foo')
+ >>> url.root
+ URLObject('https://github.com/?spam=eggs#foo')
+
+
+### Query string
+
+The `query` property is an instance of `QueryString`, whose attributes give
+richer representations of the query string:
+
+ >>> url.query
+ QueryString(u'spam=eggs')
+ >>> url.query.list
+ [(u'spam', u'eggs')]
+ >>> url.query.dict
+ {u'spam': u'eggs'}
+ >>> url.query.multi_dict
+ {u'spam': [u'eggs']}
+
+Modifying the query string is easy, too. You can 'add' or 'set' parameters: any
+method beginning with `add_` will allow you to use the same parameter name
+multiple times in the query string; methods beginning with `set_` will only
+allow one value for a given parameter name. Don't forget that each method will
+return a *new* `QueryString` instance:
+
+ >>> url.query.add_param(u'spam', u'ham')
+ QueryString(u'spam=eggs&spam=ham')
+ >>> url.query.set_param(u'spam', u'ham')
+ QueryString(u'spam=ham')
+ >>> url.query.add_params({u'spam': u'ham', u'foo': u'bar'})
+ QueryString(u'spam=eggs&spam=ham&foo=bar')
+ >>> url.query.set_params({u'spam': u'ham', u'foo': u'bar'})
+ QueryString(u'spam=ham&foo=bar')
+
+Delete parameters with `del_param()` and `del_params()`. These will remove all
+appearances of the requested parameter name from the `QueryString`:
+
+ >>> url.query.del_param(u'spam')
+ QueryString(u'')
+ >>> url.query.add_params({u'foo': u'bar'}).del_params(['spam', 'foo'])
+ QueryString(u'')
+
+Again, some of these methods are aliased on the `URLObject` directly:
+
+ >>> url.add_query_param(u'spam', u'ham')
+ URLObject('https://github.com/zacharyvoase/urlobject?spam=eggs&spam=ham#foo')
+ >>> url.set_query_param(u'spam', u'ham')
+ URLObject('https://github.com/zacharyvoase/urlobject?spam=ham#foo')
+ >>> url.del_query_param(u'spam')
+ URLObject('https://github.com/zacharyvoase/urlobject#foo')
-* Since `URLObject` subclasses directly from Python's built-in `unicode`, you can pass URL objects straight into `urllib2.urlopen()`, JSON serializers, templating systems, etc. If you need a plain-old string or Unicode object, you can just call `str` or `unicode` on it.
## (Un)license
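The new README stresses that every `with_*()` and `without_*()` call returns a new URL because the underlying string type is immutable. The following is a minimal sketch of that idea, not the library's implementation: it assumes Python 3 and the standard library's `urllib.parse`, and the class name `SketchURL` and its helper `_replace` are hypothetical names chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


class SketchURL(str):
    """Hypothetical stand-in for URLObject: an immutable str subclass."""

    def _replace(self, **parts):
        # urlsplit() returns a namedtuple, so its own _replace() builds a
        # modified copy; the original string is never touched.
        return type(self)(urlunsplit(urlsplit(self)._replace(**parts)))

    def with_scheme(self, scheme):
        return self._replace(scheme=scheme)

    def with_path(self, path):
        return self._replace(path=path)

    def with_query(self, query):
        return self._replace(query=query)

    def without_query(self):
        return self._replace(query="")

    def without_fragment(self):
        return self._replace(fragment="")


url = SketchURL("https://github.com/zacharyvoase/urlobject?spam=eggs#foo")
new_url = url.with_scheme("http").without_fragment()
```

Because each method returns a fresh `SketchURL`, calls can be chained, and `url` itself still holds the original string afterwards.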
@@ -5,10 +5,11 @@
setup(
name='URLObject',
- version='0.6.0',
+ version='2.0.0a1',
description='A utility class for manipulating URLs.',
author='Zachary Voase',
author_email='z@zacharyvoase.com',
url='http://github.com/zacharyvoase/urlobject',
- py_modules=['urlobject'],
+ package_dir={'': 'lib'},
+ packages=['urlobject'],
)
142 test.py
@@ -1,142 +0,0 @@
-# -*- coding: utf-8 -*-
-
-import doctest
-import unittest
-
-import urlobject
-from urlobject import URLObject
-
-
-class URLObjectTest(unittest.TestCase):
-
- def test_netloc(self):
- self.assertEqual(URLObject(host='example.com'), u'//example.com/')
-
- def test_scheme(self):
- self.assertEqual(URLObject(scheme='http', path='/hello/'), u'http:///hello/')
- self.assertEqual(URLObject(scheme='http', host='example.com'), u'http://example.com/')
- self.assertEqual(URLObject(scheme='', path='/hello/'), u'/hello/')
-
- def test_path(self):
- url = URLObject(path='/hello/')
- self.assertEqual(url, u'/hello/')
- # Using / operator syntax.
- self.assertEqual(url / 'foo', u'/hello/foo')
- # Two ways to do trailing slashes.
- self.assertEqual(url / 'foo/', u'/hello/foo/')
- self.assertEqual(url / 'foo' / '', u'/hello/foo/')
-
- def test_fragment(self):
- url = URLObject(path='/hello', fragment='world')
- self.assertEqual(url, u'/hello#world')
- self.assertEqual(url / 'fun', u'/hello/fun#world')
-
- def test_query(self):
- url = URLObject(scheme='http', host='www.google.com')
- url |= ('q', 'query')
- self.assertEqual(url, u'http://www.google.com/?q=query')
- self.assertEqual(url | ('q', 'another'), u'http://www.google.com/?q=another')
- self.assertEqual(url & ('q', 'another'), u'http://www.google.com/?q=query&q=another')
-
- def test_query_list(self):
- url = URLObject(scheme='http', host='www.google.com')
- url |= ('q', 'query')
- self.assertEqual(url.query_list(), [(u'q', u'query')])
-
- self.assertEqual(
- (url | ('q', 'another')).query_list(),
- [(u'q', u'another')])
-
- self.assertEqual(
- (url & ('q', 'another')).query_list(),
- [(u'q', u'query'), (u'q', u'another')])
-
- def test_query_dict_seq(self):
- url = URLObject(scheme='http', host='www.google.com')
- url |= ('q', 'query')
- self.assertEqual(url.query_dict(), {u'q': [u'query']})
-
- self.assertEqual(
- (url | ('q', 'another')).query_dict(),
- {u'q': [u'another']})
-
- self.assertEqual(
- (url & ('q', 'another')).query_dict(),
- {u'q': [u'query', u'another']})
-
- def test_query_dict_noseq(self):
- url = URLObject(scheme='http', host='www.google.com')
- url |= ('q', 'query')
- self.assertEqual(url.query_dict(seq=False), {u'q': u'query'})
-
- self.assertEqual(
- (url | ('q', 'another')).query_dict(seq=False),
- {u'q': u'another'})
-
- self.assertEqual(
- (url & ('q', 'another')).query_dict(seq=False),
- {u'q': u'another'})
-
- def test_unicode_query_strings(self):
- url = URLObject(scheme='http', host='example.com', path='/')
- url |= {'a': u'é'}
- self.assertEqual(str(url), 'http://example.com/?a=%C3%A9')
- url |= {'b': 'c'}
- self.assertEqual(str(url), 'http://example.com/?a=%C3%A9&b=c')
-
-
-class URLObjectParseTest(unittest.TestCase):
-
- def setUp(self):
- self.url = URLObject.parse(u'http://www.google.com/search?q=something&hl=en#frag')
-
- def test_scheme(self):
- self.assertEqual(self.url.scheme, 'http')
-
- def test_host(self):
- self.assertEqual(self.url.host, 'www.google.com')
-
- def test_host_idna_encoding_is_parsed(self):
- url = URLObject.parse(u'http://xn--hllo-bpa.com/')
- self.assertEqual(url.host, u'héllo.com')
-
- def test_host_idna_encoding_is_preserved(self):
- url = URLObject.parse(u'http://xn--hllo-bpa.com/')
- self.assertEqual(unicode(url), u'http://xn--hllo-bpa.com/')
-
- def test_path(self):
- self.assertEqual(self.url.path, '/search')
-
- def test_path_is_not_double_escaped(self):
- url = URLObject.parse('http://www.google.com/path%20with%20spaces')
- self.assertEqual(unicode(url), 'http://www.google.com/path%20with%20spaces')
- self.assertEqual(url.path, '/path with spaces')
-
- def test_fragment(self):
- self.assertEqual(self.url.fragment, 'frag')
-
- def test_fragment_is_not_double_escaped(self):
- url = URLObject.parse('http://google.com/#frag%20with%20escapes')
- self.assertEqual(unicode(url), 'http://google.com/#frag%20with%20escapes')
- self.assertEqual(url.fragment, 'frag with escapes')
-
- def test_query(self):
- self.assertEqual(self.url.query, 'q=something&hl=en')
-
- def test_query_is_not_double_escaped(self):
- url = URLObject.parse('http://www.google.com/search?q=a%20string%20with%20escapes')
- self.assertEqual(unicode(url), 'http://www.google.com/search?q=a%20string%20with%20escapes')
- self.assertEqual(url.query, 'q=a%20string%20with%20escapes')
-
- def test_multiple_parses_are_idempotent(self):
- url = u'http://xn-hllo-bpa.com/path%20withspaces?query=es%25capes&foo=bar#frag%28withescapes%29'
- parse1 = URLObject.parse(url)
- self.assertEqual(unicode(url), unicode(parse1))
- parse2 = URLObject.parse(unicode(parse1))
- self.assertEqual(unicode(url), unicode(parse2))
- self.assertEqual(unicode(parse1), unicode(parse2))
-
-
-if __name__ == '__main__':
- doctest.testmod(urlobject, optionflags=doctest.ELLIPSIS)
- unittest.main()
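The removed `test_query` cases exercised two behaviours: `|` replaced any existing value for a key, while `&` appended another one. The new README names these "set" and "add". The sketch below illustrates that distinction under stated assumptions: it uses Python 3 and the standard library's `urllib.parse` rather than the library itself, and the free functions `add_param` and `set_param` are hypothetical helpers, not the `QueryString` methods.

```python
from urllib.parse import parse_qsl, urlencode


def add_param(query, name, value):
    """Append name=value, keeping any existing values for name."""
    pairs = parse_qsl(query, keep_blank_values=True)
    return urlencode(pairs + [(name, value)])


def set_param(query, name, value):
    """Drop every existing value for name, then append name=value."""
    pairs = [
        (k, v)
        for k, v in parse_qsl(query, keep_blank_values=True)
        if k != name
    ]
    return urlencode(pairs + [(name, value)])


added = add_param("q=query", "q", "another")     # "q=query&q=another"
replaced = set_param("q=query", "q", "another")  # "q=another"
```

Both helpers return a new string and leave the input unchanged, mirroring the README's note that each method returns a new `QueryString`.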