furl API

Basics

furl objects let you access and modify the components of a URL

scheme://username:password@host:port/path?query#fragment
  • scheme is the scheme string, all lowercase.
  • username is the username string for authentication.
  • password is the password string for authentication with username.
  • host is the domain name, IPv4, or IPv6 address as a string. Domain names are all lowercase.
  • port is an integer or None. A value of None means no port specified and the default port for the given scheme should be inferred, if possible.
  • path is a Path object comprised of path segments.
  • query is a Query object comprised of query arguments.
  • fragment is a Fragment object comprised of a Path and Query object separated by an optional '?' separator.
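
As a rough point of comparison, the standard library's urllib.parse.urlsplit() breaks a URL into similar components. The sketch below uses only the standard library, not furl, and the example URL is made up for illustration. Unlike furl, it leaves path, query, and fragment as plain strings rather than objects.

```python
from urllib.parse import urlsplit

# Example URL invented for illustration.
parts = urlsplit('http://user:pass@www.google.com:99/a/path?one=1#frag')

components = {
    'scheme': parts.scheme,
    'username': parts.username,
    'password': parts.password,
    'host': parts.hostname,
    'port': parts.port,  # an int, or None when the URL has no port
    'path': parts.path,
    'query': parts.query,
    'fragment': parts.fragment,
}
for name, value in components.items():
    print(name, '=', repr(value))
```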

Scheme, Username, Password, Host, Port, and Network Location

scheme, username, password, and host are strings. port is an integer or None.

>>> f = furl('http://user:pass@www.google.com:99/')
>>> f.scheme, f.username, f.password, f.host, f.port
('http', 'user', 'pass', 'www.google.com', 99)

furl infers the default port for common schemes.

>>> f = furl('https://secure.google.com/')
>>> f.port
443

>>> f = furl('unknown://www.google.com/')
>>> print(f.port)
None

netloc is the string combination of username, password, host, and port, not including port if it is None or the default port for the provided scheme.

>>> furl('http://www.google.com/').netloc
'www.google.com'

>>> furl('http://www.google.com:99/').netloc
'www.google.com:99'

>>> furl('http://user:pass@www.google.com:99/').netloc
'user:pass@www.google.com:99'
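
The netloc rule described above can be sketched in plain Python. This is an illustration only, not furl's implementation: build_netloc and the DEFAULT_PORTS table are hypothetical names, and the set of schemes furl actually knows default ports for may differ.

```python
# Assumed table of default ports; furl's own table may differ.
DEFAULT_PORTS = {'http': 80, 'https': 443, 'ftp': 21}


def build_netloc(scheme, host, port=None, username=None, password=None):
    """Join username, password, host, and port into a netloc string."""
    userinfo = ''
    if username is not None:
        userinfo = username
        if password is not None:
            userinfo += ':' + password
        userinfo += '@'
    netloc = userinfo + host
    # Leave the port out when it is unset or is the scheme's default.
    if port is not None and port != DEFAULT_PORTS.get(scheme):
        netloc += ':' + str(port)
    return netloc


print(build_netloc('http', 'www.google.com'))
print(build_netloc('http', 'www.google.com', 80))
print(build_netloc('http', 'www.google.com', 99, 'user', 'pass'))
```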

Path

URL paths in furl are Path objects that have segments, a list of zero or more path segments that can be manipulated directly. Path segments in segments are maintained decoded, and all interaction with segments should take place with decoded segment strings.

>>> f = furl('http://www.google.com/a/larg%20ish/path')
>>> f.path
Path('/a/larg ish/path')
>>> f.path.segments
['a', 'larg ish', 'path']
>>> str(f.path)
'/a/larg%20ish/path'

Manipulation

>>> f.path.segments = ['a', 'new', 'path', '']
>>> str(f.path)
'/a/new/path/'

>>> f.path = 'o/hi/there/with%20some%20encoding/'
>>> f.path.segments
['o', 'hi', 'there', 'with some encoding', '']
>>> str(f.path)
'/o/hi/there/with%20some%20encoding/'

>>> f.url
'http://www.google.com/o/hi/there/with%20some%20encoding/'

>>> f.path.segments = ['segments', 'are', 'maintained', 'decoded', '^`<>[]"#/?']
>>> str(f.path)
'/segments/are/maintained/decoded/%5E%60%3C%3E%5B%5D%22%23%2F%3F'
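
The relationship between decoded segments and the encoded path string can be sketched with the standard library. The helpers split_path and join_segments are hypothetical names used for illustration; they approximate the behavior shown above and are not furl's own code.

```python
from urllib.parse import quote, unquote


def split_path(path):
    """Split an encoded path string into a list of decoded segments."""
    return [unquote(segment) for segment in path.lstrip('/').split('/')]


def join_segments(segments):
    """Percent-encode each decoded segment and join into an absolute path."""
    # safe='' forces a '/' inside a segment to be encoded as %2F.
    return '/' + '/'.join(quote(segment, safe='') for segment in segments)


segments = split_path('/a/larg%20ish/path')
print(segments)
print(join_segments(segments))
print(join_segments(['decoded', '^`<>[]"#/?']))
```

Encoding each segment separately is what lets a segment contain characters such as '/' or '?' without changing the structure of the URL.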

A Path can be absolute or not, as specified by the boolean isabsolute. While URL paths are always absolute if they aren't empty, isabsolute is useful for fragment paths.

>>> f = furl('http://www.google.com/a/directory/#/absolute/fragment/path/')
>>> f.path.isabsolute
True
>>> f.path.isabsolute = False
Traceback (most recent call last):
  ...
AttributeError: Path.isabsolute is read only for URL paths. URL paths are always
absolute if not empty.
>>> f.fragment.path.isabsolute
True
>>> f.fragment.path.isabsolute = False
>>> f.url
'http://www.google.com/a/directory/#absolute/fragment/path/'

A path that ends with '/' is considered a directory; any other path is considered a file. The Path attribute isdir is True if the path is a directory, False otherwise. Conversely, the attribute isfile is True if the path is a file, False otherwise.

>>> f = furl('http://www.google.com/a/directory/')
>>> f.path.isdir
True
>>> f.path.isfile
False

>>> f = furl('http://www.google.com/a/file')
>>> f.path.isdir
False
>>> f.path.isfile
True
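
In terms of segments, a trailing '/' corresponds to an empty final segment, so the directory test can be sketched as below. The functions is_dir and is_file are hypothetical, and how an empty path is classified here is an assumption of this sketch rather than a statement about furl.

```python
def is_dir(segments):
    """A path is a directory when its last segment is the empty string."""
    return bool(segments) and segments[-1] == ''


def is_file(segments):
    """Anything that is not a directory is treated as a file."""
    return not is_dir(segments)


print(is_dir(['a', 'directory', '']), is_file(['a', 'directory', '']))
print(is_dir(['a', 'file']), is_file(['a', 'file']))
```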

Query

URL queries in furl are Query objects that have params, a one dimensional ordered multivalue dictionary of query keys and values. Query keys and values in params are maintained decoded and all interaction with params should take place with decoded strings.

>>> f = furl('http://www.google.com/?one=1&two=2')
>>> f.query
Query('one=1&two=2')
>>> f.query.params
omdict1D([('one', '1'), ('two', '2')])
>>> str(f.query)
'one=1&two=2'

furl objects and Fragment objects (covered below) contain a Query object, and args is provided as a shortcut on these objects to access query.params.

>>> f = furl('http://www.google.com/?one=1&two=2')
>>> f.query.params
omdict1D([('one', '1'), ('two', '2')])
>>> f.args
omdict1D([('one', '1'), ('two', '2')])
>>> id(f.query.params) == id(f.args)
True

Manipulation

params is a one dimensional ordered multivalue dictionary that maintains method parity with Python's standard dictionary.

>>> f.query = 'silicon=14&iron=26&inexorable%20progress=vae%20victus'
>>> f.query.params
omdict1D([('silicon', '14'), ('iron', '26'), ('inexorable progress', 'vae victus')])
>>> del f.args['inexorable progress']
>>> f.args['magnesium'] = '12'
>>> f.args
omdict1D([('silicon', '14'), ('iron', '26'), ('magnesium', '12')])

params can also store multiple values for the same key.

>>> f = furl('http://www.google.com/?space=jams&space=slams')
>>> f.args['space']
'jams'
>>> f.args.getlist('space')
['jams', 'slams']
>>> f.args.addlist('repeated', ['1', '2', '3'])
>>> f.querystr
'space=jams&space=slams&repeated=1&repeated=2&repeated=3'
>>> f.args.popvalue('space')
'slams'
>>> f.args.popvalue('repeated', '2')
'2'
>>> f.querystr
'space=jams&repeated=1&repeated=3'

params is one dimensional. If a list of values is provided as a query value, that list is interpreted as multiple values.

>>> f = furl()
>>> f.args['repeated'] = ['1', '2', '3']
>>> f.add(args={'space':['jams', 'slams']})
>>> f.querystr
'repeated=1&repeated=2&repeated=3&space=jams&space=slams'

This makes sense: URL queries are inherently one dimensional, so query values cannot have subvalues.
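
The same one-dimensional, multivalue view of a query string can be seen with the standard library alone. The sketch below uses urllib.parse rather than furl or omdict, so it only mirrors the idea; the variable names are chosen for illustration.

```python
from urllib.parse import parse_qsl, urlencode

# parse_qsl keeps every key/value pair, in order, including repeated keys.
pairs = parse_qsl('space=jams&space=slams')
print(pairs)

# Collect all values for one key, similar in spirit to getlist('space').
space_values = [value for key, value in pairs if key == 'space']
print(space_values)

# With doseq=True a list value is flattened into repeated key=value pairs.
flattened = urlencode({'repeated': ['1', '2', '3']}, doseq=True)
print(flattened)
```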

See the omdict documentation for more information on interacting with the ordered multivalue dictionary params.

encode(delimeter='&') can be used to encode query strings with delimiters other than '&', such as ';'.

>>> f.query = 'space=jams&woofs=squeeze+dog'
>>> f.query.encode()
'space=jams&woofs=squeeze+dog'
>>> f.query.encode(';')
'space=jams;woofs=squeeze+dog'
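
A minimal sketch of delimiter-aware query encoding, assuming the query is held as decoded (key, value) pairs. The function encode_query is a hypothetical stand-in written with the standard library, not furl's Query.encode().

```python
from urllib.parse import parse_qsl, quote_plus


def encode_query(pairs, delimiter='&'):
    """Encode (key, value) pairs, joining them with the given delimiter."""
    return delimiter.join(
        quote_plus(key) + '=' + quote_plus(value) for key, value in pairs)


# parse_qsl decodes '+' to a space, so the pairs hold decoded strings.
pairs = parse_qsl('space=jams&woofs=squeeze+dog')
print(encode_query(pairs))
print(encode_query(pairs, ';'))
```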

Fragment

URL fragments in furl are Fragment objects that have a Path path and Query query separated by an optional '?' separator.

>>> f = furl('http://www.google.com/#/fragment/path?with=params')
>>> f.fragment
Fragment('/fragment/path?with=params')
>>> f.fragment.path
Path('/fragment/path')
>>> f.fragment.query
Query('with=params')
>>> f.fragment.separator
True

Manipulation of Fragments is done through the Fragment's Path and Query instances, path and query.

>>> f = furl('http://www.google.com/#/fragment/path?with=params')
>>> str(f.fragment)
'/fragment/path?with=params'
>>> f.fragment.path.segments.append('file.ext')
>>> str(f.fragment)
'/fragment/path/file.ext?with=params'

>>> f = furl('http://www.google.com/#/fragment/path?with=params')
>>> str(f.fragment)
'/fragment/path?with=params'
>>> f.fragment.args['new'] = 'yep'
>>> str(f.fragment)
'/fragment/path?with=params&new=yep'

Creating hash-bang fragments with furl illustrates the use of Fragment's separator. When separator is False, the '?' separating path and query isn't included.

>>> f = furl('http://www.google.com/')
>>> f.fragment.path = '!'
>>> f.fragment.args = {'a':'dict', 'of':'args'}
>>> f.fragment.separator
True
>>> str(f.fragment)
'!?a=dict&of=args'

>>> f.fragment.separator = False
>>> str(f.fragment)
'!a=dict&of=args'
>>> f.url
'http://www.google.com/#!a=dict&of=args'
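
The effect of separator can be sketched as a small string-joining function. fragment_string is a hypothetical helper that takes already-encoded path and query strings; its handling of an empty path or an empty query is an assumption of this sketch and may not match furl.

```python
def fragment_string(path, query, separator=True):
    """Join an encoded fragment path and query, with an optional '?'."""
    if path and query and separator:
        return path + '?' + query
    return path + query


print(fragment_string('!', 'a=dict&of=args'))
print(fragment_string('!', 'a=dict&of=args', separator=False))
```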

Encoding

furl handles encoding automatically, and its philosophy on encoding is simple.

Whole path, query, and fragment strings should always be encoded.

>>> f = furl()
>>> f.path = 'supply%20encoded/whole%20path%20strings'
>>> f.path.segments
['supply encoded', 'whole path strings']

>>> f.set(query='supply+encoded=query+strings,+too')
>>> f.query.params
omdict1D([('supply encoded', 'query strings, too')])

>>> f.set(fragment='encoded%20path%20string?and+encoded=query+string+too')
>>> f.fragment.path.segments
['encoded path string']
>>> f.fragment.args
omdict1D([('and encoded', 'query string too')])

Path, Query, and Fragment subcomponent strings should always be decoded.

>>> f = furl()
>>> f.set(path=['path segments are', 'decoded', '<>[]"#'])
>>> f.pathstr
'/path%20segments%20are/decoded/%3C%3E%5B%5D%22%23'

>>> f.set(args={'query parameters':'and values', 'are':'decoded, too'})
>>> f.querystr
'query+parameters=and+values&are=decoded,+too'

>>> f.fragment.path.segments = ['decoded', 'path segments']
>>> f.fragment.args = {'and decoded':'query parameters and values'}
>>> f.fragmentstr
'decoded/path%20segments?and+decoded=query+parameters+and+values'

Python's urllib.quote() and urllib.unquote() can be used to encode and decode path strings. Similarly, urllib.quote_plus() and urllib.unquote_plus() can be used to encode and decode query strings. In Python 3, these functions live in the urllib.parse module.
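
A minimal sketch of those helpers using their Python 3 import path, urllib.parse. The sample strings are made up for illustration.

```python
from urllib.parse import quote, unquote, quote_plus, unquote_plus

# Path strings: spaces become %20 and '/' is left alone by default.
encoded_path = quote('/path segments are/decoded')
print(encoded_path)
print(unquote(encoded_path))

# Query strings: spaces become '+'.
encoded_value = quote_plus('query strings too')
print(encoded_value)
print(unquote_plus(encoded_value))
```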

Inline manipulation

For quick, single-line URL manipulation, the add(), set(), and remove() methods of furl objects manipulate various components of the URL and return the furl object for method chaining.

>>> url = 'http://www.google.com/#fragment' 
>>> furl(url).add(args={'example':'arg'}).set(port=99).remove(fragment=True).url
'http://www.google.com:99/?example=arg'
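
Chaining works because each method returns the object it was called on. The class below is a deliberately tiny, hypothetical stand-in (TinyUrl is not part of furl, supports only a few arguments, and does no encoding) that shows the pattern.

```python
class TinyUrl:
    """Hypothetical stand-in for a URL object with chainable mutators."""

    def __init__(self, host, port=None):
        self.host = host
        self.port = port
        self.args = {}
        self.fragment = ''

    def add(self, args=None):
        self.args.update(args or {})
        return self  # returning self is what makes chaining possible

    def set(self, port=None, fragment=None):
        if port is not None:
            self.port = port
        if fragment is not None:
            self.fragment = fragment
        return self

    def remove(self, fragment=False):
        if fragment:
            self.fragment = ''
        return self

    @property
    def url(self):
        # No percent-encoding is done here; this is only a sketch.
        netloc = self.host if self.port is None else f'{self.host}:{self.port}'
        url = f'http://{netloc}/'
        query = '&'.join(f'{k}={v}' for k, v in self.args.items())
        if query:
            url += '?' + query
        if self.fragment:
            url += '#' + self.fragment
        return url


result = (TinyUrl('www.google.com')
          .set(fragment='fragment')
          .add(args={'example': 'arg'})
          .set(port=99)
          .remove(fragment=True)
          .url)
print(result)
```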

add() adds items to a furl object with the optional arguments

  • args: Shortcut for query_params.
  • path: A list of path segments to add to the existing path segments, or a path string to join with the existing path string.
  • query_params: A dictionary of query keys and values to add to the query.
  • fragment_path: A list of path segments to add to the existing fragment path segments, or a path string to join with the existing fragment path string.
  • fragment_args: A dictionary of query keys and values to add to the fragment's query.
>>> url = 'http://www.google.com/' 
>>> furl(url).add(path='/index.html', fragment_path='frag/path',
                  fragment_args={'frag':'args'}).url
'http://www.google.com/index.html#frag/path?frag=args'

set() sets items of a furl object with the optional arguments

  • args: Shortcut for query_params.
  • path: List of path segments or a path string to adopt.
  • scheme: Scheme string to adopt.
  • netloc: Network location string to adopt.
  • query: Query string to adopt.
  • query_params: A dictionary of query keys and values to adopt.
  • fragment: Fragment string to adopt.
  • fragment_path: A list of path segments to adopt for the fragment's path or a path string to adopt as the fragment's path.
  • fragment_args: A dictionary of query keys and values for the fragment's query to adopt.
  • fragment_separator: Boolean whether or not there should be a '?' separator between the fragment path and the fragment query.
  • host: Host string to adopt.
  • port: Port number to adopt.
  • username: Username string to adopt.
  • password: Password string to adopt.
>>> furl().set(scheme='https', host='secure.google.com', port=99,
               path='index.html', args={'some':'args'}, fragment='great job').url
'https://secure.google.com:99/index.html?some=args#great%20job'

remove() removes items from a furl object with the optional arguments

  • args: Shortcut for query_params.
  • path: A list of path segments to remove from the end of the existing path segments list, or a path string to remove from the end of the existing path string, or True to remove the path entirely.
  • fragment: If True, remove the fragment portion of the URL entirely.
  • query: If True, remove the query portion of the URL entirely.
  • query_params: A list of query keys to remove from the query, if they exist.
  • port: If True, remove the port from the network location string, if it exists.
  • fragment_path: A list of path segments to remove from the end of the fragment's path segments, or a path string to remove from the end of the fragment's path string, or True to remove the fragment path entirely.
  • fragment_args: A list of query keys to remove from the fragment's query, if they exist.
  • username: If True, remove the username, if it exists.
  • password: If True, remove the password, if it exists.
>>> url = 'https://secure.google.com:99/a/path/?some=args#great job'
>>> furl(url).remove(args=['some'], path='path/', fragment=True, port=True).url
'https://secure.google.com/a/'

Miscellaneous

copy() creates and returns a new furl object with an identical URL.

>>> f = furl('http://www.google.com')
>>> f.copy().set(path='/new/path').url
'http://www.google.com/new/path'
>>> f.url
'http://www.google.com'

join() joins the furl object's url with the provided relative or absolute URL and returns the furl object for method chaining.

>>> f = furl('http://www.google.com')
>>> f.join('new/path').url
'http://www.google.com/new/path'
>>> f.join('replaced').url
'http://www.google.com/new/replaced'
>>> f.join('../parent').url
'http://www.google.com/parent'
>>> f.join('path?query=yes#fragment').url
'http://www.google.com/path?query=yes#fragment'
>>> f.join('unknown://www.yahoo.com/new/url/').url
'unknown://www.yahoo.com/new/url/'
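
For comparison, the standard library's urllib.parse.urljoin() resolves the same sequence of references to the same URLs. The sketch below uses urljoin, not furl; because urljoin returns a new string instead of modifying an object, each result is fed back in as the next base.

```python
from urllib.parse import urljoin

url = 'http://www.google.com'
history = []
references = ['new/path', 'replaced', '../parent',
              'path?query=yes#fragment',
              'unknown://www.yahoo.com/new/url/']
for reference in references:
    # Each join resolves the reference against the previous result.
    url = urljoin(url, reference)
    history.append(url)
    print(url)
```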