implement loads more of format() spec

implement closer to the format() spec
1 parent d26ba29 commit 23b89610f7a991f98271d10dc31393f74a3ee3b8 @r1chardj0n3s committed Nov 17, 2011
Showing with 411 additions and 147 deletions.
  1. +1 −1 MANIFEST.in
  2. +88 −0 README.rst
  3. +0 −44 README.txt
  4. +318 −101 parse.py
  5. +4 −1 setup.py
MANIFEST.in
@@ -1,2 +1,2 @@
-include README.txt
+include README.rst
include *.py
README.rst
@@ -0,0 +1,88 @@
+Parse strings using a specification based on the Python format() syntax.
+
+ parse() is the opposite of format()
+
+The `Format String Syntax`_ is supported, with anonymous (fixed-position),
+named and formatted values::
+
+ {[field name]:[format spec]}
+
+Field names must be a single Python identifier word. No attributes or
+element indexes are supported (as they would make no sense.)
+
+Numbered fields are also not supported: the result of parsing will include
+the parsed fields in the order they are parsed.
+
+The conversion of values to types other than strings is not yet supported.
+
+Some simple parse() format string examples:
+
+ >>> parse("Bring me a {}", "Bring me a shrubbery")
+ <Result ('shrubbery',) {}>
+ >>> parse("The {} who say {}", "The knights who say Ni!")
+ <Result ('knights', 'Ni!') {}>
+ >>> parse("Bring out the holy {item}", "Bring out the holy hand grenade")
+ <Result () {'item': 'hand grenade'}>
+
+Most of the `Format Specification Mini-Language`_ is supported::
+
+ [[fill]align][sign][#][0][width][,][.precision][type]
+
+The align operators will cause spaces (or specified fill character)
+to be stripped from the value. The alignment character "=" is not yet
+supported.
+
+The comma "," separator is not yet supported.
+
+The types supported are not the full set of format() types but rather some of
+them (b, o, x, X), a hexadecimal type h that accepts either case, and the
+regular expression character group types d, D, w, W, s, S. The format()
+types n, f, F, e, E, g and G are not yet supported.
+
+===== ==========================================
+Type  Characters Matched
+===== ==========================================
+ w    Letters and underscore
+ W    Non-letter and underscore
+ s    Whitespace
+ S    Non-whitespace
+ d    Digits (effectively integer numbers)
+ D    Non-digit
+ b    Binary numbers
+ o    Octal numbers
+ h    Hexadecimal numbers (lower and upper case)
+ x    Lower-case hexadecimal numbers
+ X    Upper-case hexadecimal numbers
+===== ==========================================
+
+Do remember though that most often a straight type-less {} will suffice
+where a more complex type specification might have been used.
+
+So, for example, here is some typed parsing, with None returned if the
+typing does not match:
+
+ >>> parse('Hello {:d} {:w}', 'Hello 12 people')
+ <Result ('12', 'people') {}>
+ >>> print parse('Hello {:d} {:w}', 'Hello twelve people')
+ None
+
+And messing about with alignment:
+
+ >>> parse('hello {:<} world', 'hello there world')
+ <Result ('there',) {}>
+ >>> parse('hello {:^} world', 'hello there world')
+ <Result ('there',) {}>
+
+Note that the "center" alignment does not test to make sure the value is
+actually centered. It just strips leading and trailing whitespace.
+
+See also the unit tests at the end of the module for some more
+examples. Run the tests with "python -m parse".
+
+.. _`Format String Syntax`: http://docs.python.org/library/string.html#format-string-syntax
+.. _`Format Specification Mini-Language`: http://docs.python.org/library/string.html#format-specification-mini-language
+
+----
+
+This code is copyright 2011 eKit.com Inc (http://www.ekit.com/)
+See the end of the source file for the license of use.
README.txt
@@ -1,44 +0,0 @@
-Parse strings using a specification based on the Python format() syntax.
-
-Anonymous (fixed-position), named and typed values are supported. Also the
-alignment operators will cause whitespace (or another alignment character)
-to be stripped from the value.
-
-You may not use both fixed and named values in your format string.
-
-The types supported in ":type" expressions are the regular expression
-character group types d, D, w, W, s, S and not the string format types.
-
-So, for example, some fixed-position parsing:
-
- >>> r = parse('hello {}', 'hello world')
- >>> r.fixed
- ('world', )
-
- >>> r = parse('hello {:d} {:w}', 'hello 12 people')
- >>> r.fixed
- ('12', 'people')
-
-And some named parsing:
-
- >>> r = parse('{greeting} {name}', 'hello world')
- >>> r.named
- {'greeting': 'hello', 'name': 'world'}
-
- >>> r = parse('hello {^} world', 'hello there world')
- >>> r.fixed
- ('there', )
-
-None will be returned if there is no match:
-
- >>> r = parse('hello {name:w}', 'hello 12')
- >>> print r
- None
-
-See also the unit tests at the end of the module for some more
-examples. Run those with "python -m parse".
-
-----
-
-This code is copyright 2011 eKit.com Inc (http://www.ekit.com/)
-See the end of the source file for the license of use.
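The parse.py diff is not shown above, so what follows is only a rough sketch of how the behaviour described in README.rst could be built on regular expressions. It is not the code from this commit: `sketch_parse`, `TYPE_PATTERNS` and `FIELD_RE` are hypothetical names, the result is a plain `(fixed, named)` pair rather than a Result object, and any fill, align or width in the spec is ignored. Literal braces in the format string are not handled.

```python
import re

# Hypothetical mapping from the README's type codes to regex fragments.
# The fragments are assumptions chosen to match the table's descriptions.
TYPE_PATTERNS = {
    "w": r"\w+",
    "W": r"\W+",
    "s": r"\s+",
    "S": r"\S+",
    "d": r"\d+",
    "D": r"\D+",
    "b": r"[01]+",
    "o": r"[0-7]+",
    "h": r"[0-9a-fA-F]+",
    "x": r"[0-9a-f]+",
    "X": r"[0-9A-F]+",
}

# Matches "{}", "{name}", "{:spec}" and "{name:spec}".
FIELD_RE = re.compile(r"\{(\w*)(?::([^}]*))?\}")


def sketch_parse(fmt, text):
    """Return (fixed, named) for a match, or None if text does not fit fmt."""
    parts = []
    names = []  # one entry per field: None for anonymous, else the name
    pos = 0
    for m in FIELD_RE.finditer(fmt):
        # Literal text between fields must match exactly.
        parts.append(re.escape(fmt[pos:m.start()]))
        name, spec = m.group(1), m.group(2) or ""
        # Only a trailing type code is honoured; the rest of the spec is ignored.
        type_code = spec[-1:] if spec[-1:] in TYPE_PATTERNS else ""
        parts.append("(" + TYPE_PATTERNS.get(type_code, r".+?") + ")")
        names.append(name or None)
        pos = m.end()
    parts.append(re.escape(fmt[pos:]))

    match = re.match("^" + "".join(parts) + "$", text)
    if match is None:
        return None
    values = match.groups()
    fixed = tuple(v for n, v in zip(names, values) if n is None)
    named = {n: v for n, v in zip(names, values) if n is not None}
    return fixed, named


if __name__ == "__main__":
    print(sketch_parse("The {} who say {}", "The knights who say Ni!"))
    print(sketch_parse("Hello {:d} {:w}", "Hello twelve people"))
```

Untyped fields fall back to a non-greedy `.+?`, which is why a bare `{}` is usually enough: the surrounding literal text does most of the work of delimiting the value.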