# The Python Programming Language

From Coursera: Intro to Data Science, Week 1   
Patricia Schuster, University of Michigan  
Feb. 2017

# More on Strings

In Python 3, strings are Unicode, so they can include many characters beyond the English alphabet, including math symbols, characters from other languages, and emoji. 

You can write a string with placeholders and then supply the values to substitute into those placeholders. This is often a cleaner option than converting each number to a string with the `str()` function and concatenating the pieces.

For example, create a dictionary including a sales record.

In [2]:
sales_record = {'price':3.24,
                'num_items':4,
                'person':'Chris'}

Now create a string containing placeholders that will be replaced with specified values. Use `{}` to mark the position of each placeholder, and call the string's `format` method with the replacement values as arguments, in the order in which they should appear.

In [4]:
sales_statement = '{} bought {} items at {} each for a total of {}'

In [5]:
sales_statement.format(sales_record['person'],
                       sales_record['num_items'],
                       sales_record['price'],
                       sales_record['num_items']*sales_record['price'])

'Chris bought 4 items at 3.24 each for a total of 12.96'
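
As a sketch of an alternative, the placeholders can also be given names that match the dictionary keys, so the values no longer depend on argument order. The name `named_statement` is my own, chosen for illustration, and this version leaves out the total because that value is not stored in the dictionary.

```python
# Same sales record as above, repeated so this cell runs on its own.
sales_record = {'price': 3.24,
                'num_items': 4,
                'person': 'Chris'}

# Named placeholders are filled from keyword arguments;
# ** unpacks the dictionary into those keyword arguments.
named_statement = '{person} bought {num_items} items at {price} each'
result_named = named_statement.format(**sales_record)
print(result_named)  # Chris bought 4 items at 3.24 each
```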

# `join()`ing strings

Strings have a built-in `join()` method that works as the opposite of `split()`: it joins a sequence of strings and inserts a specified delimiter between them. 

`join()` is called on the delimiter string and takes a list (or any other iterable) of strings as its input parameter, so if you have strings defined separately, put brackets around them to create a list of those strings.

Try this out.

In [4]:
str1 = 'First'
str2 = 'Second'
str3 = 'Third'

''.join([str1,str2,str3])

'FirstSecondThird'

In many cases, I need to specify a delimiter between the strings.

In [5]:
', '.join([str1,str2,str3])

'First, Second, Third'
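
One thing to watch for: `join()` only accepts strings, so numbers have to be converted first. A minimal sketch, assuming a hypothetical list called `numbers`:

```python
numbers = [1, 2, 3]  # hypothetical data, not from the course example

# join() raises a TypeError on non-string items,
# so convert each one with str() first.
joined_numbers = ', '.join(str(n) for n in numbers)
print(joined_numbers)  # 1, 2, 3
```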

Now show how `split()` does the opposite of `join()`.

In [7]:
str_joined = ', '.join([str1,str2,str3])
print(str_joined)

First, Second, Third


Note that in the following case, the `,` will remain attached to the strings because I specify only a single space, `' '`, as the delimiter.

In [9]:
str_joined.split(' ')

['First,', 'Second,', 'Third']

Now get rid of the `,` as well by including it in the delimiter.

In [10]:
str_joined.split(', ')

['First', 'Second', 'Third']

Tadaa!
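
To check that the two really are opposites, here is a small round-trip sketch. The variable names are my own, and the round trip only works this cleanly when the delimiter does not also appear inside one of the strings.

```python
parts = ['First', 'Second', 'Third']
delimiter = ', '

joined = delimiter.join(parts)

# Splitting on the same delimiter recovers the original list.
round_trip = joined.split(delimiter)
print(round_trip == parts)  # True

# With no argument, split() breaks on any run of whitespace,
# so the commas stay attached to the words.
whitespace_split = joined.split()
print(whitespace_split)  # ['First,', 'Second,', 'Third']
```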

# Formatting number of digits

I'm going to take this a little further and look at how to format the number of digits in the values assigned to each placeholder.

For instance, the price of each item that Chris bought in the previous section was \$3.24, but what if the price was \$3.20? Common convention is to print both digits after the decimal point, but Python prints the float as \$3.2, without the trailing zero. In the context of a receipt, \$3.2 looks strange. Let's see...

*By the way, Jupyter automatically interprets my dollar sign symbols in this markdown cell as the beginning of an equation. In order to tell Jupyter that they are, in fact, dollar signs, use a backslash before the dollar sign:* `\$3.2`

In [7]:
sales_record = {'price':3.20,
                'num_items':4,
                'person':'Chris'}

sales_statement.format(sales_record['person'],
                       sales_record['num_items'],
                       sales_record['price'],
                       sales_record['num_items']*sales_record['price'])

'Chris bought 4 items at 3.2 each for a total of 12.8'

How can I tell Python to print both decimal places? A useful resource for this is https://pyformat.info/

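I can tell Python how many decimal places to keep by putting a format specification inside the braces of the placeholder: `{:.2f}` means fixed-point notation with two digits after the decimal point. I will need to redefine `sales_statement`: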

In [14]:
sales_statement = '{} bought {} items at {:.2f} each for a total of {}'

In [16]:
sales_record = {'price':3.20,
                'num_items':4,
                'person':'Chris'}

sales_statement.format(sales_record['person'],
                       sales_record['num_items'],
                       sales_record['price'],
                       sales_record['num_items']*sales_record['price'])

'Chris bought 4 items at 3.20 each for a total of 12.8'
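
The total still shows only one decimal place because its placeholder was left as `{}`. Here is a sketch of the same statement with both money values formatted. `sales_statement_2f` is a name I made up so that `sales_statement` above is left untouched.

```python
sales_record = {'price': 3.20,
                'num_items': 4,
                'person': 'Chris'}

# Both the price and the total get two digits after the decimal point.
sales_statement_2f = '{} bought {} items at {:.2f} each for a total of {:.2f}'

result_2f = sales_statement_2f.format(
    sales_record['person'],
    sales_record['num_items'],
    sales_record['price'],
    sales_record['num_items'] * sales_record['price'])
print(result_2f)  # Chris bought 4 items at 3.20 each for a total of 12.80
```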

Explore this a little further for each type separately.

## Integers
With an integer, you can specify the minimum total number of characters in the output. 

Try saving an integer, and tell Python to format it as a string of four characters.

In [28]:
x = 45
'{:4d}'.format(x)

'  45'

In some cases, I want 0s rather than empty spaces at the beginning of the string. Tell Python to zero-pad the string.

In [29]:
'{:04d}'.format(x)

'0045'
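
Note that the width is a minimum, not a limit. A short sketch with a hypothetical value `big` that is longer than the requested width, plus the `<` flag to left-align a number:

```python
big = 123456  # hypothetical value, longer than the width below

# The number needs six characters, so it is printed in full
# even though only four were requested.
wide = '{:4d}'.format(big)
print(repr(wide))  # '123456'

# Numbers are right-aligned by default; '<' left-aligns them instead.
left = '{:<4d}'.format(45)
print(repr(left))  # '45  '
```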

If I want to force Python to print the sign of the variable, whether it is positive or negative, add a `+` to the format specification:

In [32]:
'{:+d}'.format(42)

'+42'

In [34]:
'{:+d}'.format(-42)

'-42'

If I want Python to print a minus sign for a negative number and leave a blank space in place of the sign for a positive number, put a space in the format specification.

In [35]:
'{: d}'.format(42)

' 42'

In [36]:
'{: d}'.format(-42)

'-42'
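
These options can be combined. As a sketch, here is a sign flag together with zero-padding. The sign counts toward the total width, and the zeros go between the sign and the digits.

```python
# '+' forces a sign, '0' zero-pads, and 5 is the minimum total width.
signed_positive = '{:+05d}'.format(42)
signed_negative = '{:+05d}'.format(-42)

print(signed_positive)  # +0042
print(signed_negative)  # -0042
```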

## Floats
With floats, you can tell Python the minimum total number of characters to print (counting the decimal point) and how many digits should come after the decimal point. Try it out.

In [21]:
x = 45.6789
'{:06.2f}'.format(x)

'045.68'

This printed six total characters (including the decimal point), with two digits after the decimal point. Because I included the 0, it is zero-padded. Try it without the zero-padding. Also, ask Python to print more decimal places than I specified in the value of `x`, so that it appends a trailing zero.

In [26]:
'{:9.5f}'.format(x)

' 45.67890'
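
The width part is optional, and as with integers it is only a minimum. A small sketch of both points; the second value is a hypothetical number chosen to be wider than the requested width.

```python
x = 45.6789

# Only a precision: two digits after the decimal point, no padding.
two_places = '{:.2f}'.format(x)
print(two_places)  # 45.68

# This value needs seven characters, so all seven are printed
# even though the requested width is six.
too_wide = '{:6.2f}'.format(1234.5678)
print(too_wide)  # 1234.57
```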

## Strings

*Padding:* Similar to integers and floats, you can pad strings by specifying the total width in characters and the alignment: `<` for left, `>` for right, and `^` for center.

In [44]:
'{:<10}'.format('test')

'test      '

In [39]:
'{:>10}'.format('test')

'      test'

In [40]:
'{:^10}'.format('test')

'   test   '

You can also modify the padding character.

In [43]:
'{:_>10}'.format('test')

'______test'

In [45]:
'{:1>10}'.format('test')

'111111test'
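
One place where padding is handy is lining up columns of text. A minimal sketch, assuming some made-up `rows` of item names and counts:

```python
rows = [('apple', 3), ('kiwi', 12)]  # hypothetical data

# Left-align the name in 8 characters and right-align the count in 4,
# so every line comes out 12 characters wide.
lines = ['{:<8}{:>4d}'.format(name, count) for name, count in rows]

for line in lines:
    print(line)
```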

*Truncating:* It is possible to truncate a long string to a specific number of characters by giving a precision such as `.6`. This always keeps the characters at the start of the string; the format specification has no option for keeping the end instead. *Note: one can, alternatively, use indexing (slicing) to select any part of the string, as if it were a list of characters.*

In [50]:
'{:.6}'.format('bananaphone')

'banana'

In [56]:
'bananaphone'[6:]

'phone'
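
To keep the end of a string rather than the start, slicing with a negative index is one option. The sketch below also shows width and precision used together. The variable names are my own.

```python
word = 'bananaphone'

# Precision keeps the first six characters.
first_six = '{:.6}'.format(word)
print(first_six)  # banana

# A negative slice start keeps the last five characters instead.
last_five = word[-5:]
print(last_five)  # phone

# Width and precision together: truncate to six characters, then pad
# to ten (strings are left-aligned by default).
padded = '{:10.6}'.format(word)
print(repr(padded))  # 'banana    '
```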