data is a small Python module that lets you handle input in a uniform way,
leaving it up to the caller to supply a byte string, a unicode object, a
file-like object or a filename.
>>> open('helloworld.txt', 'w').write('hello, world from a file')
>>> from data import Data as I
>>> a = I(u'hello, world')
>>> b = I(file='helloworld.txt')
>>> c = I(open('helloworld.txt'))
>>> print unicode(a)
hello, world
>>> print unicode(b)
hello, world from a file
>>> print unicode(c)
hello, world from a file
This can be made even more convenient using the data decorator:
>>> from data.decorators import data
>>> @data('buf')
... def parse_buffer(buf, magic_mode=False):
...     return 'buf passed in as ' + repr(buf)
...
>>> parse_buffer('hello')
"buf passed in as Data(data='hello', encoding='utf8')"
>>> rv = parse_buffer(open('helloworld.txt'))
>>> assert 'file=' in rv
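To illustrate the idea behind such a decorator, here is a minimal, self-contained sketch written for Python 3. It is not the module's implementation: the names Wrapped and wrap_arg are hypothetical stand-ins, and the real decorator may behave differently. The sketch only shows how a named argument can be wrapped before the decorated function sees it.

```python
import functools
import inspect
import io


class Wrapped(object):
    """Hypothetical stand-in for a Data-like wrapper (not the real class)."""

    def __init__(self, data=None, file=None):
        self.data = data
        self.file = file

    def __repr__(self):
        if self.file is not None:
            return 'Wrapped(file=%r)' % (self.file,)
        return 'Wrapped(data=%r)' % (self.data,)


def wrap_arg(name):
    """Return a decorator that wraps the argument called `name`."""
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Resolve positional and keyword arguments to parameter names.
            bound = sig.bind(*args, **kwargs)
            value = bound.arguments[name]
            if not isinstance(value, Wrapped):
                # Objects with a read() method are treated as file-likes,
                # everything else as in-memory data.
                if hasattr(value, 'read'):
                    value = Wrapped(file=value)
                else:
                    value = Wrapped(data=value)
                bound.arguments[name] = value
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return decorator


@wrap_arg('buf')
def parse_buffer(buf, magic_mode=False):
    return 'buf passed in as ' + repr(buf)


print(parse_buffer('hello'))
print(parse_buffer(io.StringIO(u'from a file-like')))
```

The decorated function always receives a wrapper object, no matter whether the caller passed the argument positionally or by keyword.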
All instances support methods like read or __str__ that make it easy to fit
them into existing APIs:
>>> d = I('some data')
>>> d.read(4)
u'some'
>>> d.read(4)
u' dat'
>>> d.read(4)
u'a'
>>> e = I(u'more data')
>>> str(e)
'more data'
Note that read returns unicode. For bytes, readb is also available:
>>> f = I(u'I am \xdcnicode.')
>>> f.readb()
'I am \xc3\x9cnicode.'
Every data object has an encoding attribute, which is used for converting
to and from unicode.
>>> g = I(u'I am \xdcnicode.', encoding='latin1')
>>> g.readb()
'I am \xdcnicode.'
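The byte strings in the two readb examples above are simply what the standard codecs produce for the same text. The following snippet uses only built-in str.encode and does not involve the data module at all; it is included to make the effect of the encoding attribute easier to see.

```python
text = u'I am \xdcnicode.'

# U+00DC takes two bytes in UTF-8 but a single byte in Latin-1.
as_utf8 = text.encode('utf8')
as_latin1 = text.encode('latin1')

print(repr(as_utf8))
print(repr(as_latin1))

# Decoding with the matching codec restores the original text.
round_trip = as_latin1.decode('latin1')
```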
Iteration and line reading are also supported:
>>> h = I('I am\nof many\nlines')
>>> h.readline()
u'I am\n'
>>> h.readlines()
[u'of many\n', u'lines']
>>> i = I('line one\nline two\n')
>>> list(iter(i))
[u'line one\n', u'line two\n']
Some useful convenience methods are available:
>>> j = I('example')
>>> j.save_to('example.txt')
The save_to method will use the most efficient way possible to save the
data to a file (copyfileobj or write()). It can also be passed a
file-like object:
>>> k = I('example2')
>>> with open('example2.txt', 'wb') as out:
...     k.save_to(out)
...
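The choice between copyfileobj and write() mentioned above can be sketched roughly as follows. This is an assumption about the general approach, not the module's actual code: the standalone save_to function and the _copy helper here are hypothetical, work on bytes only, and are written for Python 3.

```python
import io
import shutil


def _copy(source, out):
    """Copy `source` (bytes or a binary file-like) into the open file `out`."""
    if hasattr(source, 'read'):
        # Stream in chunks instead of loading everything into memory.
        shutil.copyfileobj(source, out)
    else:
        out.write(source)


def save_to(source, dest):
    """Save `source` to `dest`, which is a filename or a binary file-like."""
    if hasattr(dest, 'write'):
        _copy(source, dest)
    else:
        with open(dest, 'wb') as out:
            _copy(source, out)


buf = io.BytesIO()
save_to(b'example2', buf)

streamed = io.BytesIO()
save_to(io.BytesIO(b'from a stream'), streamed)
```

Streaming with shutil.copyfileobj avoids holding a large file in memory, while data that already lives in memory is written in a single call.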
If you need the output inside a secure temporary file, temp_saved is
available:
>>> l = I('goes into tmp')
>>> with l.temp_saved() as tmp:
...     print tmp.name.startswith('/tmp/tmp')
...     print l.read()
...
True
goes into tmp
temp_saved functions almost identically to tempfile.NamedTemporaryFile,
with one difference: there is no delete argument. The file is removed only
when the context manager exits.
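One way to get the behaviour described above is to create the file with delete=False and remove it when the context manager exits. The sketch below is a hypothetical standalone function for Python 3 that takes the payload as bytes. It is not the method shipped with the module, which may be implemented differently.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temp_saved(payload, **kwargs):
    """Write `payload` (bytes) to a named temporary file, remove it on exit.

    Remaining keyword arguments are forwarded to
    tempfile.NamedTemporaryFile; `delete` is fixed by this function.
    """
    tmp = tempfile.NamedTemporaryFile(delete=False, **kwargs)
    try:
        tmp.write(payload)
        tmp.flush()  # make the content visible to other readers of the path
        yield tmp
    finally:
        tmp.close()
        os.remove(tmp.name)


with temp_saved(b'goes into tmp', suffix='.txt') as tmp:
    saved_name = tmp.name
    with open(saved_name, 'rb') as fh:
        content = fh.read()
```

Because the file is not deleted when it is closed, its name stays valid for the whole with block and can be handed to other code that expects a path.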
data can be used on both sides of an API, either while passing values in:
>>> import json
>>> from data import Data as I
>>> m = I('{"this": "json"}')
>>> json.load(m)
{u'this': u'json'}
or when receiving values (see the data decorator example above). If necessary, you can also support APIs that allow users to pass in filenames:
>>> class Parser(object):
...     @data('input')
...     def parse(self, input, parser_opt=False):
...         return input
...     def parse_file(self, input_file, *args, **kwargs):
...         return self.parse(I(file=input_file), *args, **kwargs)
...
>>> p = Parser()
>>> p.parse_file('/dev/urandom')
Data(file='/dev/urandom', encoding='utf8')
See the documentation at http://pythonhosted.org/data for an API reference.
data works the same on Python 2 and 3 thanks to six, a few compatibility
functions and a test suite.
Python 3 is supported from 3.3 onwards, Python 2 from 2.6.