# Chapter 11: Formatting Strings

Storing strings in variables is nice, but being able to compose strings from other strings and manipulate them is also necessary. One way to achieve this is to use string formatting.

In Python 2.6 and above, the preferred way to format strings is the format method of strings. Here is an example.

In [1]:
name = 'matt'
print 'Hello {0}'.format(name)

Hello matt


When a string contains curly braces, { and }, with an integer inside them, the braces serve as a placeholder for a value passed into format. In this case you are telling Python to replace {0} with the contents of name, the string 'matt'.

Another useful property of formatting is that you can also format non-string objects, such as numbers:



In [3]:
print 'I:{0} R:{1} S:{2}'.format(1,2.5,'foo')

I:1 R:2.5 S:foo


If you pay attention you will notice that the numbers in the curly braces are incrementing. They tell the format operation which object to insert and where. Many computer languages start counting from zero, so {0} corresponds to the integer 1, {1} corresponds to 2.5, and {2} corresponds to the string 'foo'.
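
Because the numbers select arguments by position, they do not have to appear in order, and the same number can be used more than once. Here is a minimal sketch; the variable names and strings are illustrative only.

```python
# Indexes pick arguments by position, so they can be reordered or repeated.
swapped = '{1} {0}'.format('first', 'second')    # 'second first'
repeated = '{0}-{0}-{1}'.format('a', 'b')        # 'a-a-b'
```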


## 11.1 Format String Syntax

Format strings have a special syntax for replacement fields. In the previous examples integers were used to represent positional argument locations. Keyword arguments can be referenced by name, as in {name}. If an object is passed into format, its attributes can be looked up using .attribute_name syntax. Index-able items can also be pulled out using [index].


In [5]:
'Name: {0}'.format('Paul')

'Name: Paul'

In [6]:
'Name: {name}'.format(name='John')

'Name: John'
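
The attribute and index lookups described at the start of this section can be sketched as follows. The values point and band are made up for illustration; a complex number is used only because it has the ready-made attributes real and imag.

```python
point = 3 + 4j                                  # complex numbers expose .real and .imag
band = ['John', 'Paul', 'George', 'Ringo']      # illustrative list

# .attribute_name looks up an attribute on the argument.
by_attribute = 'Real: {0.real} Imag: {0.imag}'.format(point)

# [index] pulls an item out of an index-able argument.
by_index = 'Drummer: {0[3]}'.format(band)

# Lookups also work on keyword arguments.
by_keyword = 'Bassist: {members[1]}'.format(members=band)
```

Here by_attribute is 'Real: 3.0 Imag: 4.0', by_index is 'Drummer: Ringo', and by_keyword is 'Bassist: Paul'.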

There is a whole language for formatting strings. The form is:

## :[ [ fill ] align ] [ sign ] [ # ] [ 0 ] [ width ] [ , ] [ .precision ] [ type ]

The following tables list the fields and their meaning.

Some examples:

In [7]:
"Name: {0:*^24}".format("Ringo")

'Name: *********Ringo**********'

In [11]:
# Format a percent using a width of 10, one decimal place, and the
# sign before the width padding:

"Percent: {0:=10.1%}".format(-44./100)

'Percent: -    44.0%'

In [12]:
# Binary and Hex Conversions
"Binary: {0:b}".format(12)

'Binary: 1100'

In [13]:
# Binary and Hex Conversions
"Hex: {0:x}".format(12)

'Hex: c'
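
To see how the individual fields of the format specification behave, it can help to try them one at a time. The following is a sketch, not an exhaustive reference; the variable names are illustrative, and the , field assumes Python 2.7 or later.

```python
right = '{0:>8}'.format('abc')         # width 8, right aligned  -> '     abc'
filled = '{0:-<8}'.format('abc')       # fill '-', left aligned  -> 'abc-----'
signed = '{0:+d}'.format(12)           # always show the sign    -> '+12'
padded = '{0:05d}'.format(42)          # zero padding to width 5 -> '00042'
grouped = '{0:,}'.format(1234567)      # thousands separator     -> '1,234,567'
rounded = '{0:.2f}'.format(3.14159)    # precision of 2 places   -> '3.14'
prefixed = '{0:#x}'.format(255)        # '#' adds the 0x prefix  -> '0xff'
```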

## Note:

The format method on a string replaces the % operator, which is similar to C's printf. This operator is still available, and some users prefer it because it requires less typing for simple statements and because it resembles C. The placeholders %s, %d, and %x are replaced by the string, integer, and hex representation of the corresponding value. Here are some examples.



In [14]:
"Num: %d Hex: %x" % (12,13)

'Num: 12 Hex: d'

In [15]:
"%s %s" % ('hello','hello')

'hello hello'
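
For comparison, here is the same string built both ways. This is only a sketch with made-up values; both lines produce 'matt has 3 items'.

```python
name, count = 'matt', 3    # illustrative values

old_style = '%s has %d items' % (name, count)           # % operator
new_style = '{0} has {1} items'.format(name, count)     # format method
```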

# Chapter 12: dir, help and pdb

dir lists all the attributes of the object passed into it. When you pass the string 'Matt' to dir, the function displays the attributes of that string. This handy feature illustrates Python's "batteries included" philosophy: Python gives you an easy mechanism to discover the attributes of any object. Other languages might require separate websites or documentation.

The attribute list is displayed in alphabetical order. You can normally ignore the first group of attributes starting with _. Later in the list you will see attributes such as capitalize (a method that capitalizes a string), format (which, as illustrated, allows you to format strings), and lower (which allows you to ensure a string is lowercase). These attributes happen to be methods, which are easy to invoke on a string.



In [16]:
dir('Matt')

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__getslice__',
 '__gt__',
 '__hash__',
 '__init__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '_formatter_field_name_split',
 '_formatter_parser',
 'capitalize',
 'center',
 'count',
 'decode',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'index',
 'isalnum',
 'isalpha',
 'isdigit',
 'islower',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']

In [17]:
print "matt".capitalize()

Matt


In [18]:
print "Hi {0}".format('there')

Hi there


In [19]:
print "Yikes".lower()


yikes


## 12.1 Dunder Methods

You might be wondering what all the attributes starting with __ are.

People call them magic methods or dunder methods, since they start and end with double underscores (Double UNDERscores). "Dunder add" is one way to say __add__; "the add magic method" is another. Special methods determine what happens under the covers when operations are performed on an object. For example, when you use the + operator on a string, the __add__ method is invoked. Likewise, using / on a number invokes its __div__ method.

Beginner Pythonistas can usually ignore dunder methods. When you start programming your own classes and want them to react to operations such as + or /, you can define them.
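
As a minimal sketch of defining a dunder method, the class below reacts to the + operator. Money and its cents attribute are hypothetical names invented for this illustration.

```python
class Money(object):
    """Hypothetical class used only for illustration."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Python calls this method to evaluate: self + other
        return Money(self.cents + other.cents)


total = Money(150) + Money(275)    # invokes Money.__add__
same = 'ab'.__add__('cd')          # what 'ab' + 'cd' does under the covers
```

After this runs, total.cents is 425 and same is 'abcd'.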

## 12.2 help

help is another built-in function that is useful in combination with the REPL. It provides documentation for methods, modules, classes, and functions (if it exists). For example, if you are curious what the upper attribute of a string does, the following gives you the documentation:





In [20]:
help('some string'.upper)

Help on built-in function upper:

upper(...)
    S.upper() -> string
    
    Return a copy of the string S converted to uppercase.



In [25]:
a = 'some string'.upper()
print a

SOME STRING
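
The documentation that help displays comes from an object's docstring, so your own functions can be documented the same way. The function shout below is a made-up example.

```python
def shout(text):
    """Return text converted to uppercase with a trailing '!'."""
    return text.upper() + '!'


doc = shout.__doc__     # the docstring text that help(shout) would display
loud = shout('hi')      # 'HI!'
```

Calling help(shout) in the REPL shows the function signature followed by that docstring.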


## 12.3 pdb 

Python includes a debugger named pdb for stepping through code. This library is modeled somewhat after gdb, the GNU debugger used for C programs.

To drop into the debugger at any point in a Python program, insert the code import pdb; pdb.set_trace(). When this line is executed it presents a (Pdb) prompt, which is similar to the REPL. Code can be evaluated and inspected live. Breakpoints can also be set for further inspection.
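
Here is a sketch of where such a line might go. The function average and its debug flag are hypothetical; the flag exists only so the example runs straight through unless you turn debugging on.

```python
import pdb


def average(values, debug=False):
    total = sum(values)
    if debug:
        # Execution pauses here with a (Pdb) prompt. Try `p total` to
        # print a variable, `n` to run the next line, `c` to continue.
        pdb.set_trace()
    return total / float(len(values))


result = average([1, 2, 3])    # debug is off, so no prompt appears
```

Calling average([1, 2, 3], debug=True) instead would stop inside the function so that total and values can be inspected.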

## Note:

Many Python developers use print debugging. They insert print statements to provide clarity as to what is going on. This is often sufficient. Just make sure to remove the debug statements or change them to logging statements before releasing the code. When more exploration is required, the pdb module can be useful.




# Chapter 13: Strings and Methods

In the previous chapter you learned about the built-in dir function and saw some methods you can call on string objects. Strings allow you to capitalize them, format them, and make them lowercase, as well as many other actions. These attributes of strings are methods. Methods are functions that are called on an instance of a type. Try to parse that last sentence a little. The string type allows you to call a method by placing a period and the method name directly after the variable name holding the data (or the data itself), followed by parentheses with any arguments inside them. Here is another example using capitalize to illustrate this.



In [3]:
name = 'matt'

In [10]:
correct = name.capitalize()

print correct

Matt


In [8]:
print 'fred'.capitalize()

Fred


In Python, methods and functions are first-class objects. If the parentheses are left off, Python will not throw an error; it will simply show a reference to the method.

In [11]:
print "fred".capitalize


<built-in method capitalize of str object at 0x106efc1b0>
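
Because a method is an object like any other, a reference to it can be stored in a variable or handed to another function and called later. A small sketch with illustrative names:

```python
make_upper = 'fred'.upper    # no parentheses: a reference, nothing is called yet
result = make_upper()        # the call happens here -> 'FRED'

# str.capitalize is passed to map as an ordinary value.
names = list(map(str.capitalize, ['matt', 'fred']))    # ['Matt', 'Fred']
```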


In [12]:
dir(5)

['__abs__',
 '__add__',
 '__and__',
 '__class__',
 '__cmp__',
 '__coerce__',
 '__delattr__',
 '__div__',
 '__divmod__',
 '__doc__',
 '__float__',
 '__floordiv__',
 '__format__',
 '__getattribute__',
 '__getnewargs__',
 '__hash__',
 '__hex__',
 '__index__',
 '__init__',
 '__int__',
 '__invert__',
 '__long__',
 '__lshift__',
 '__mod__',
 '__mul__',
 '__neg__',
 '__new__',
 '__nonzero__',
 '__oct__',
 '__or__',
 '__pos__',
 '__pow__',
 '__radd__',
 '__rand__',
 '__rdiv__',
 '__rdivmod__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rfloordiv__',
 '__rlshift__',
 '__rmod__',
 '__rmul__',
 '__ror__',
 '__rpow__',
 '__rrshift__',
 '__rshift__',
 '__rsub__',
 '__rtruediv__',
 '__rxor__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '__trunc__',
 '__xor__',
 'bit_length',
 'conjugate',
 'denominator',
 'imag',
 'numerator',
 'real']

In [20]:
(5+35j).conjugate()


(5-35j)

## 13.1 Common String Methods 

Here are a few string methods that are commonly used or found in the wild. Feel free to explore others using dir and help or the online documentation.



## 13.2 endswith
If you have a variable holding a filename, you might want to check the extension.


In [24]:
xl = 'Oct2000.xls'
xl.endswith('.xls')

True

In [25]:
xl = 'Oct2000.xls'
xl.endswith('.xlsx')

False

## Note: 
Notice that you had to pass a parameter, '.xls', into the method.
Methods have a signature, which is a funky way of saying that they need to be called with the correct number (and type) of parameters. For endswith it makes sense: if you want to know whether a string ends with another string, you have to tell Python which ending to check for. This is done by passing the ending to the method.

## Tip

Again, it is usually easy to find the answers via help. The documentation should tell you which parameters are required as well as any optional parameters.


In [26]:
help(xl.endswith)

Help on built-in function endswith:

endswith(...)
    S.endswith(suffix[, start[, end]]) -> bool
    
    Return True if S ends with the specified suffix, False otherwise.
    With optional start, test S beginning at that position.
    With optional end, stop comparing S at that position.
    suffix can also be a tuple of strings to try.



Notice that the parameters between the square brackets, [ and ], are optional. In this case start and end allow you to check a portion of the string. If you wanted to check whether the characters starting at index 0 and ending before index 3 end with Oct, you could do the following.

In [27]:
xl.endswith("Oct", 0 ,3)

True
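
The help text also mentions that suffix can be a tuple of strings. The sketch below tries several extensions at once; the variable names are illustrative.

```python
filename = 'Oct2000.xls'    # same illustrative filename as above

# True if the string ends with any suffix in the tuple.
is_spreadsheet = filename.endswith(('.xls', '.xlsx'))    # True
is_text = filename.endswith(('.txt', '.csv'))            # False

# start and end restrict the check to filename[3:7], which is '2000'.
year_match = filename.endswith('2000', 3, 7)             # True
```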

## 13.3 find

The find method allows you to find substrings inside other strings. It returns the index (offset starting at 0) of the first match. If no substring is matched, the returned value is -1.




In [28]:
word = 'grateful'
word.find('ate')

2

In [29]:
word.find("great")

-1
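
One thing to watch for: a match at the very start of the string returns 0, which Python treats as false, while -1 is treated as true. The sketch below compares against -1 explicitly; the variable names are illustrative.

```python
word = 'grateful'

position = word.find('ful')     # 5
missing = word.find('xyz')      # -1

# Compare with -1 rather than writing `if word.find(...)`.
found = word.find('grate') != -1    # True, even though the index is 0

# When only a yes/no answer is needed, the in operator is more direct.
contains = 'ate' in word            # True
```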

## 13.4 format

format allows for easy creation of new strings by combining existing variables. The variables replace {X} (where X is an integer):


In [35]:
print 'name: {0}, age:{1}'.format("matt",10)

name: matt, age:10


## Note:
A statement can be spread across multiple lines. By placing a \ following a . you indicate to Python that you want to continue on the next line. If you have an open parenthesis, (, you can also place the arguments on multiple lines without a \.

In [37]:
print "word".\
find('ord')

1


In [38]:
print "word".find(
'ord')

1


Why spread the code over multiple lines? Because most code standards limit line length; PEP 8, Python's style guide, recommends at most 79 characters per line.


In [46]:

print '{0} {1} {2} {3} {4}'.format(
    'hello',
    'to',
    'you',
    'and',
    'you'
)

hello to you and you


## 13.5 join
join creates a new string from a sequence by inserting a string between every member of the sequence:

In [47]:
','.join(["1","2","3"])

'1,2,3'

## Tip

For most Python interpreters, using join is faster than repeated concatenation with the + operator, so the above idiom is common.
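
To make the comparison concrete, here is a sketch that builds the same sentence both ways. The word list is made up, and no timing claim is intended beyond what the tip says.

```python
words = ['hello', 'to', 'you']    # illustrative data

# Repeated concatenation creates a new string on every pass of the loop.
sentence = ''
for w in words:
    sentence = sentence + w + ' '
sentence = sentence.strip()

# join builds the result in one step.
joined = ' '.join(words)

# join expects strings, so other types must be converted first.
numbers = ','.join(str(n) for n in [1, 2, 3])
```

Both sentence and joined end up as 'hello to you', and numbers is '1,2,3'.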

## 13.6 startswith

startswith is analogous to the endswith method, except that it checks whether a string starts with another string.


In [48]:
'book'.startswith('B')

False

In [50]:
'book'.startswith("b")

True
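
As the two results show, the comparison is case sensitive. If case should not matter, one option is to lowercase the string first; another is to pass a tuple of acceptable prefixes. A short sketch with an illustrative value:

```python
title = 'Book'

exact = title.startswith('b')                    # False: 'B' is not 'b'
ignoring_case = title.lower().startswith('b')    # True
either = title.startswith(('B', 'b'))            # True: any prefix in the tuple
```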

## 13.7 strip 
strip removes leading and trailing whitespace (spaces, tabs, newlines) from a string. This may come in handy if you have to normalize data or parse input from a user (or the web).


In [52]:
"    hello there    ".strip()

'hello there'

Alternatively, you can strip only the left or right side with lstrip or rstrip.
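
The three variants side by side, as a small sketch. The string raw is made up; the last line shows the optional argument that names the characters to remove instead of whitespace.

```python
raw = '  \thello there \n'

both = raw.strip()      # 'hello there'
left = raw.lstrip()     # 'hello there \n'
right = raw.rstrip()    # '  \thello there'

dashes = '--id--'.strip('-')    # 'id'
```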