> "What I hear I forget.

> What I see I remember.

> What I do I understand."
**Confucius**

**What is an Algorithm?**

The formal definition is:
`algorithm`: a process or a set of rules to be followed in calculations or other problem-solving operations.

Informally, an algorithm is sometimes described as

`algorithm`: a recipe for solving a problem.

Example: *chocolate cake*
![](figures/algorithm-example.png)

Recipes can be used not only to make food but to make calculations as well.

Babylonian Square Root Algorithm: 
1. Guess the square root of the number. 
2. Divide the number by the guess. 
3. Average the quotient (from step 2) and the guess. 
4. Make the new guess the average from step 3. 
5. If the new guess is different than the previous guess, go back to step 2; otherwise, stop.
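
The five steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function name `babylonian_sqrt`, the default starting guess of 1.0, and the `tolerance` parameter are assumptions made for illustration. Step 5 is relaxed slightly, because floating-point guesses may never become exactly equal, so the loop stops once two successive guesses agree to within a small relative tolerance.

```python
def babylonian_sqrt(number, guess=1.0, tolerance=1e-12):
    """Approximate the square root of a non-negative number."""
    if number < 0:
        raise ValueError('number must be non-negative')
    if number == 0:
        return 0.0
    while True:
        quotient = number / guess            # step 2: divide the number by the guess
        new_guess = (quotient + guess) / 2   # step 3: average the quotient and the guess
        # step 5 (assumption): "no longer different" means "within a small tolerance"
        if abs(new_guess - guess) <= tolerance * new_guess:
            return new_guess
        guess = new_guess                    # step 4: the average becomes the new guess


print(babylonian_sqrt(2))
print(babylonian_sqrt(100))
```

The starting guess (step 1) only affects how many iterations are needed, as long as it is a positive number.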

An algorithm should be:
* Detailed
* Effective
* Specific
* General purpose

A program should be:
* Readable
* Robust
* Correct

> **Rule 6**: If it was hard to write, it is probably hard to read. Add a comment.

> The basic tool for the manipulation of reality is the manipulation of words. **Philip K. Dick, author**

# Working with `string`

Much of the time spent on computers involves working with words. We write emails and essays, we send text messages and instant messages, we post to blogs, we create Facebook pages, we Google for information, and we read web pages. In programming languages, **any sequence of printable characters is referred to as a `string`**.

The string type is one of the many collection types provided by Python. As first discussed in Section 2.1.4, a collection is a group of Python objects that can be treated as a single object. In particular, a string type is a special kind of collection called a `sequence`.

A Python string object can be constructed either by using the string constructor `str` or, as a shortcut, by enclosing a group of characters in either two single quotes `'` or two double quotes `"`.
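
As a small illustration of the ways to construct a string, the sketch below builds strings with the `str` constructor and with both quote styles. The variable names are hypothetical and chosen only for this example.

```python
from_int = str(42)     # the constructor converts another object to a string
single = 'spam'        # single quotes
double = "spam"        # double quotes build the same string
apostrophe = "it's"    # double quotes make it easy to include a single quote

print(from_int, type(from_int))
print(single == double)
print(apostrophe)
```

Which quote style to use is a matter of convenience; the resulting objects are ordinary strings either way.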

**The Triple-Quote String**

This string preserves all the format information of the string. If the string spans multiple lines, the line breaks between those lines are preserved. If there are quotes, tabs, any information at all, it is preserved. In this way, you can capture a whole paragraph as a single string.

```python
zen_str = '''Beautiful is better than ugly.
             Explicit is better than implicit.
             Simple is better than complex.
             Complex is better than complicated.'''
```
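
To see exactly what a triple-quoted string keeps, it can help to print its representation with `repr`. The sketch below uses a short, hypothetical string named `poem`; note that the leading spaces on the second line are part of the string as well.

```python
poem = '''Roses are red,
    violets are blue.'''

print(repr(poem))        # shows the newline as \n and the indentation as spaces
print(poem.count('\n'))  # number of line breaks stored in the string
```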

**Non-Printable Characters**

Some characters perform necessary operations but show up as whitespace in the output.

* `\n` - newline
* `\t` - tab

In [4]:
print('first line \n second line')
print('first then tab \t second')

first line 
 second line
first then tab 	 second


**String Representation**

What is the difference between a string and other Python types? For example, what is the difference between the integer 1 and the string '1'? One answer is obvious: they are different types! As we have discussed previously, the type of an object determines both the attributes of an object and the kinds of operations that can be performed on the object. A string is a collection type that has multiple parts; an integer is the representation of a number. Integers are created with the constructor `int` or with a number (without decimal points); strings are created with the constructor `str` or by enclosing characters with a pair of quotes. Just as important as its creation, the type of an object determines much of what you can do with that object.

Single-character strings, like all other data, are represented in a computer as numbers. When you type on your keyboard, the individual characters are stored in a special computer representation known as Unicode (UTF-8 is the Unicode default for Python 3). UTF-8, and any Unicode set in general, maps each character to an integer. By “map,” we mean that in the UTF-8 set, each character is associated with a particular integer value. That integer is what gets stored in a computer, and Python labels its type as str. Because that integer is stored as a string type, Python knows to map that integer to a particular character in the UTF-8 character set. Take a look at the UTF-8 character mapping shown in Appendix D. Note that, at least within the groups of lowercase letters, uppercase letters, and numbers, the order of the characters is as you would expect: 'a' comes before 'b', '1' before '2', etc.

You can experiment with the mapping yourself. Python provides two special functions, `ord` and `chr`, to capture the relationship between UTF-8 and a character. The ord function shows the UTF-8 integer associated with a character. For example, ord('a') yields a value 97, because 'a' is associated with 97 in the UTF-8 table. Similarly, the chr function takes an integer and yields the character associated with that integer in the UTF-8 table. Thus chr(97) yields the value 'a' because 'a' is associated with the integer 97 in the UTF-8 table.

In [5]:
ord('a')

97

In [6]:
chr(109)

'm'

In [8]:
# print the characters for the integers 90 through 109 (110 is excluded)
for i in range(90, 110, 1):
    print(chr(i), end=' ')

Z [ \ ] ^ _ ` a b c d e f g h i j k l m 

**Strings as a Sequence**

Because a string is a sequence, each of its characters has a position. This position is called the index of the character within the string. In Python, and other languages as well, the first index (the first position) in a sequence is index 0.

Python lets us look at the individual characters in the string sequence using the `indexing` operator, represented by the square brackets operator `[]`. The indexing operator works by associating square brackets with a string, with an integer within the brackets.

Python also allows **indexing from the back end of the sequence**. Thus, if you want to index from the string end, Python starts indexing the last character of the string with -1 and subtracts one from the index for each character to the left.

In [9]:
hello_str = 'Hello World'
hello_str

'Hello World'

In [10]:
hello_str[0]

'H'

In [11]:
hello_str[5]

' '

In [12]:
hello_str[-1]

'd'

In [13]:
hello_str[11]

IndexError: string index out of range

**More Indexing and Slicing**

Indexing in Python allows you to indicate more than just a single character string. You can also select subsequences of the string with the proper indices. Python calls such a subsequence a slice. Remember, just like for a single index, a slice returns a new string and does not change the original string in any way (even though slice sounds like it would!).

To index a subsequence, you indicate a range of indices within the square brackets by providing a pair of indices separated by a colon (:). The colon within the index operator brackets indicates that, instead of a single position being selected, a range of indices is being selected.

![](figures/string-indexing.png)

In [14]:
hello_str[6:10]

'Worl'

In [19]:
hello_str[6:] # no ending value defaults to the end of string

'World'

In [20]:
hello_str[:5] # no start value defaults to beginning of string

'Hello'

In [21]:
hello_str[-1] # negative index works back from the end

'd'

In [18]:
hello_str[3:-2]

'lo Wor'

Slicing allows a third parameter that specifies the step in the slice. This means that you can have as many as three numbers in the index operator brackets separated by two colon characters: the first number is the beginning of the sequence, the second number specifies the end of the sequence, and the third is the step to take along the sequence. As with the first two arguments, the step number has a default if not indicated: a step of 1. The step value indicates the step size through the sequence.

In [22]:
hello_str[::2]

'HloWrd'

In [25]:
hello_str[::3] # every third letter

'HlWl'

In [26]:
hello_str[::-1] # step backwards from the end to the beginning

'dlroW olleH'

In [27]:
hello_str[::-2] # backwards, every other letter

'drWolH'

In [28]:
digits = '0123456789'

digits[::2] # even digits

'02468'

In [30]:
digits[1::2] # odd digits (starts at 1)

'13579'

In [31]:
digits[::-1] # reverse digits

'9876543210'

In [32]:
digits[::-2] # reverse odds

'97531'

In [33]:
digits[-2::-2] # reverse evens

'86420'

`Copy Slice` - a new string is yielded as the result of a slice; the original string is not modified. Thus a copy slice is indeed a new copy of the original string.

In [42]:
name_one = 'Monty'
name_two = name_one[:]
name_two

'Monty'

**Strings are Iterable**

A data type that is iterable means that the individual elements can be “iterated through” using a for loop (or other methods). A string is indeed an iterable data type, and you can iterate through the individual elements of a string using a for loop. Because strings are also a sequence, iteration through a string yields the elements of the string in the order in which they appear in the string.

In [48]:
for char in 'Hi Baku':
    print(char, type(char))

H <class 'str'>
i <class 'str'>
  <class 'str'>
B <class 'str'>
a <class 'str'>
k <class 'str'>
u <class 'str'>


**Concatenation and Repetition**

* concatenate `+`: The operator + requires two string objects and creates a new string object. The new string object is formed by concatenating copies of the two string objects together: the first string joined at its end to the beginning of the second string.
* repeat `*`: The * takes a string object and an integer and creates a new string object. The new string object has as many copies of the string as is indicated by the integer.

In [50]:
my_str = 'Hello'
your_str = 'World'

my_str + your_str # concatenation

'HelloWorld'

In [51]:
your_str + my_str # order does matter in concatenation

'WorldHello'

In [52]:
my_str + ' ' + your_str # add a space between

'Hello World'

In [53]:
my_str * 3 # replication

'HelloHelloHello'

In [54]:
3 * my_str # order does not matter in replication

'HelloHelloHello'

In [58]:
(my_str + ' ') * 3 # parentheses force ordering

'Hello Hello Hello '

In [57]:
my_str + ' ' * 3 # without parentheses, only the space is repeated

'Hello   '

In [59]:
'hello' + 3 # wrong types for concatenation, requires two strings

TypeError: can only concatenate str (not "int") to str

In [60]:
'hello' * 'world' # wrong types for replication, requires string and int

TypeError: can't multiply sequence by non-int of type 'str'

How does Python know whether to do concatenation or addition when it sees a + operator? The answer is that the types of the operands indicate the operation to be performed.

In general, the fact that a single operator can perform multiple tasks is called operator `overloading`. By overloading, we mean that a single operator, such as +, will perform different operations depending on the types of its operands.

When the Python interpreter sees a + operator, it examines the types of the operands. If the operands are numbers (integers or floating-point numbers), the interpreter will perform addition. If the operands are strings, the interpreter will perform concatenation. If the operands are a mixture of numbers and strings, the Python interpreter will generate an error.
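
The short sketch below shows the same `+` operator doing different work depending on its operand types, and one common way around the mixed-type error: converting the number with `str` first. The names `age` and `message` are hypothetical.

```python
print(1 + 2)        # both operands are numbers: addition gives 3
print('1' + '2')    # both operands are strings: concatenation gives '12'

age = 25
message = 'Age: ' + str(age)   # convert the integer so both operands are strings
print(message)                 # Age: 25
```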

**Comparison Operators**

**Single-Character String Compares**
Let’s start easy and work with only single-character strings. You can compare two single-character strings using the equality operator ==, as in 'a' == 'a'. If the two single characters are the same, the expression returns True. Note that the expression 'a' == 'A' returns False as those are indeed two different strings.

What about the greater than (>) or less than (<) operators? The easy example would be 'a' > 'a', which is obviously False. What is the result of 'a' > 'A'? If you type it into the shell, you will get the result True. Why? We introduced the functions ord and chr in Section 4.1.3. These two functions help us relate a character and its integer representation in the Unicode UTF-8 table. All comparisons between two single characters are done on the basis of their UTF-8 integer mapping. When we compare 'a' > 'A', Python fetches the associated UTF-8 number for both characters and compares those two numbers. Because ord('a') is 97 and ord('A') is 65, the question becomes whether 97 > 65, which yields True. Conveniently, the lowercase letters are all sequentially ordered, so that 'a' < 'b', 'b' < 'c', and so on. Similarly, the capital letters are sequentially ordered, so that 'A' < 'B', 'B' < 'C', and so on. Finally, the numeric characters are also ordered, so that '0' < '1', '1' < '2', and so on. However, only the runs of lowercase, uppercase, and numeric characters follow the assumed order. It is also True that '0' < 'a' and 'A' < 'a'. If you wonder about character ordering, the UTF-8 table or the associated functions ord and chr should resolve the question.

In [61]:
'a' > 'A'

True

In [62]:
'0' < '1'

True

In [63]:
'0' < 'a'

True

**Comparing Strings with More than One Character**

String comparison—in fact, any sequence comparison—works as follows. The basic idea is to, in parallel, examine both string characters at some index and then walk through both strings until a difference in characters is found.

1.	Start at index 0, the beginning of both strings.
2.	Compare the two single characters at the present index of each string.
    * If the two characters are equal, increase the present index of both strings by 1 and go back to the beginning of step 2.
    * If the two characters are not equal, return the result of comparing those two characters as the result of the string comparison.
3.	If both strings are equal up to some point but one is shorter than the other, then the longer string is always greater. For example, `'ab' < 'abc'` returns `True`.
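
The three steps above can be sketched as a small function. This is only an illustration of the procedure, not how Python implements comparison internally; the function name `less_than` and its parameter names are hypothetical.

```python
def less_than(first, second):
    """Return True if first < second, comparing character by character."""
    index = 0                                        # step 1: start at index 0
    while index < len(first) and index < len(second):
        if first[index] != second[index]:            # step 2: first difference decides
            return first[index] < second[index]
        index += 1                                   # characters equal: move on
    # step 3: one string is a prefix of the other, so the shorter one is smaller
    return len(first) < len(second)


print(less_than('abc', 'abd'))   # True
print(less_than('ab', 'abc'))    # True
print(less_than('abc', 'abc'))   # False
```

The results agree with the built-in `<` operator for the examples shown in this section.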


In [64]:
'abc' < 'cde'

True

In [65]:
'abc' < 'abd'

True

In [66]:
'' < 'a'

True

**The `in` Operator**

The `in` operator is useful for checking membership in a collection. An example of its use is `'a' in 'abcd'`. The operator takes two arguments: the collection we are testing and the element we are looking for in the collection. The expression returns `True` if the element is found in the collection and `False` otherwise. For strings, the test string sequence must be found **exactly**.

In [67]:
vowels = 'aeiou'

'a' in vowels

True

In [68]:
'x' in vowels

False

In [69]:
'eio' in vowels

True

In [70]:
'aiu' in vowels

False

**String Collections are `immutable`**

Given that a string is a collection—a sequence, in fact—it is tempting to try the following kind of operation: create a string and then try to change a particular character in that string to a new character. In Python, that would look something like the following session:

In [71]:
my_str = 'Hello'
my_str[0] = 'J'

TypeError: 'str' object does not support item assignment

What is wrong? The problem is a special characteristic of some collections in Python. The string type, as well as some other types, is `immutable`. This means that once the object is created, usually by assignment, its contents cannot be modified. Having an index expression on the left side of an assignment statement is an attempt to do exactly that: change one of the elements of the string sequence. Such a modification is not allowed with the string type.

There are some efficiency reasons for this restriction. By making strings immutable, the Python interpreter is faster. However, immutable strings are an advantage for the programmer as well. No matter what you do to a string, you are guaranteed that the original string is not changed. By definition, it cannot be changed; it is immutable. As a result, all Python string operators must generate a new string. Once you create a string, you cannot change it. You must create a new string to reflect any changes you desire.
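
One way to get the effect of changing a character, sketched below, is to build a new string from a replacement character and a slice of the original. The name `new_str` is hypothetical.

```python
my_str = 'Hello'
new_str = 'J' + my_str[1:]   # new first character plus everything after index 0

print(new_str)   # Jello
print(my_str)    # Hello (the original string is unchanged)
```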

**String `Functions` and `Methods`**

**Functions**

Think of a function as a small program that performs a specific task. That program is packaged up, or encapsulated, and made available for use. The function can take some input values, perform some task by executing statements and evaluating expressions, and, when finished, potentially return a value. Functions are useful because we use them to perform commonly needed tasks. Instead of writing the same code over and over again, we encapsulate that code in a function, making it easier to use.

Functions should not be a completely new concept for you, as we use functions frequently in mathematics. One example is the square root function. It takes a real number as an argument and then returns a number that is the square root. How the square root is calculated is not important to us. We care only that it works and that it works correctly. In this way, the square root operation is encapsulated; the details of its operation are hidden from you.

In [72]:
my_str = 'Hello World'
len(my_str)

11

**Methods**

A method is a variation on a function. It looks very similar. It has a name and it has a list of arguments in parentheses. It differs, however, in the way it is invoked. Every method is called in conjunction with a particular object. The kinds of methods that can be used in conjunction with an object depend on the object’s type. String objects have a set of methods suited for strings, just as integers have integer methods, and floats have float methods. The invocation is done using what is called the dot notation.

In [77]:
my_str = 'Python rules!'
my_str.upper() # converts all characters to uppercase

'PYTHON RULES!'

In [76]:
my_str.find('m') # returns the index where the substring first occurs, or -1 if it is not found

-1

In [75]:
my_str.find('P')

0

**Chaining of Methods**

In [78]:
my_str.upper().find('O')

4

**Nesting of Methods**

In [79]:
a_str = 'He had the bat.'
a_str.find('t')

7

In [81]:
a_str.find('t', 8) # start at index 8 = 7 + 1

13

In [84]:
a_str.find('t', a_str.find('t') + 1) # start at one after the first 't'

13
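
The same idea of restarting the search one position after the previous match can be repeated in a loop to collect every occurrence. The sketch below is one possible approach; it stores the results in a list named `positions`, which is a hypothetical name, and relies on `find` returning -1 when nothing more is found.

```python
a_str = 'He had the bat.'
positions = []

index = a_str.find('t')                  # first occurrence, or -1
while index != -1:
    positions.append(index)
    index = a_str.find('t', index + 1)   # continue one past the last match

print(positions)   # [7, 13]
```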

![](figures/string-methods.png)

**Formatted Output for Strings**

Using the default print function is easy, but it provides no control of what is called the `format` of the output. By format, we mean a low-level kind of typesetting to better control how the output looks on the console. Python provides a finer level of control that gives us, the programmer, the option to provide “prettier,” more readable, output. Conveniently, the control of console typesetting is done through the use of the string format method.

In [85]:
'{} is {} years old.'.format('Bill', 25)

'Bill is 25 years old.'

In [86]:
import math
'{} is nice but {} is divine!'.format(1, math.pi)

'1 is nice but 3.141592653589793 is divine!'

The way each object is formatted in the string is done by default based on its type, as was shown in the previous session. However, each brace can include formatting commands that provide directives about how a particular object is to be printed. The four pieces of information that one can provide for a particular object are a descriptor code, an alignment number, a width number, and a precision descriptor. We will review each of those in the sections below.

**Descriptor Codes**

The formatting commands include a set of descriptor codes that dictate the type of object to be placed at that location in the string and formatting operations that can be performed on that type. The descriptor can control how an individual object of that type is printed to the screen.

The most commonly used descriptor codes:
* `s` - string
* `d` - decimal integer
* `f` - floating-point decimal
* `e` - floating-point exponential
* `%` - floating-point as percent
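
As a quick illustration of the codes listed above, the sketch below applies each one with no other formatting options. The sample values are arbitrary and chosen only for this example.

```python
print('{:s}'.format('spam'))       # spam
print('{:d}'.format(42))           # 42
print('{:f}'.format(3.5))          # 3.500000 (six digits after the point by default)
print('{:e}'.format(12345.678))    # 1.234568e+04
print('{:%}'.format(0.25))         # 25.000000%
```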

**Width and Alignment Descriptors**

A field width can be specified for each data item. It specifies a printing-field width, counted as the number of spaces the object occupies. By default, formatted strings are left justified and formatted numbers are right justified. If the specification includes a less than `<`, the data are placed left justified within the indicated width; a greater than `>` forces right justification. Centering can be done using `^`.

* `<` - left
* `>` - right
* `^` - center

![](figures/string-formatting.png)
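
A minimal sketch of the three alignment characters, using a field width of 8. The square brackets are only there to make the padding visible in the output.

```python
print('[{:<8s}]'.format('abc'))   # [abc     ]
print('[{:>8s}]'.format('abc'))   # [     abc]
print('[{:^8s}]'.format('abc'))   # [  abc   ]
```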

In [88]:
for i in range(5):
    print('{:10d} --> {:4d}'.format(i, i**2))

         0 -->    0
         1 -->    1
         2 -->    4
         3 -->    9
         4 -->   16


**Floating-Point Precision Descriptor**

When printing floating-point values, it is desirable to control the number of digits to the right of the decimal point, that is, the precision. Precision is specified in the format descriptor using a decimal point followed by an integer to specify the precision.

In [89]:
import math
print(math.pi)

3.141592653589793


In [90]:
print('Pi is {:.4f}'.format(math.pi))

Pi is 3.1416


In [91]:
print('Pi is {:8.4f}'.format(math.pi))

Pi is   3.1416


In [92]:
print('Pi is {:8.2f}'.format(math.pi))

Pi is     3.14


Implementing the `find` method using what we know so far.

In [96]:
river = 'Mississippi'
target = input('Input a character to find: ')

for index in range(len(river)):
    if river[index] == target:
        print('Letter found at index: ', index)
        break
else:
    print('Letter', target, 'not found in', river)

Input a character to find: M
Letter found at index:  0


**Palindrome**

In [101]:
pal_1 = "Madam, I'm Adam"
pal_2 = "A man, a plan, a canal, Panama"

print('Forward: {} \nBackward: {}'.format(pal_1, pal_1[::-1]))
print()
print('Forward: {} \nBackward: {}'.format(pal_2, pal_2[::-1]))

Forward: Madam, I'm Adam 
Backward: madA m'I ,madaM

Forward: A man, a plan, a canal, Panama 
Backward: amanaP ,lanac a ,nalp a ,nam A


In [102]:
import string

string.punctuation

'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

In [103]:
string.digits

'0123456789'

In [106]:
string.ascii_lowercase

'abcdefghijklmnopqrstuvwxyz'

In [107]:
string.whitespace

' \t\n\r\x0b\x0c'
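
The reversed strings shown earlier do not match the originals because of capitalization, punctuation, and spaces. One possible way to test for a palindrome, sketched below, is to lowercase the text, drop every character found in `string.punctuation` or `string.whitespace`, and compare the result with its reverse. The function name `is_palindrome` is hypothetical.

```python
import string


def is_palindrome(text):
    """Return True if text reads the same both ways, ignoring case, punctuation, and spaces."""
    cleaned = ''
    for char in text.lower():
        # keep only characters that are neither punctuation nor whitespace
        if char not in string.punctuation and char not in string.whitespace:
            cleaned += char
    return cleaned == cleaned[::-1]


print(is_palindrome("Madam, I'm Adam"))                  # True
print(is_palindrome('A man, a plan, a canal, Panama'))   # True
print(is_palindrome('Hello World'))                      # False
```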

**More String Formatting**

* `<` - left (default for most objects)
* `>` - right (default for numbers)
* `^` - center
* `=` - force fill between sign and digits (numeric only)

In [109]:
print('{0:.>12s} | {1:0=+10d} | {2:->5d}'.format(' abc ',35,22))

....... abc  | +000000035 | ---22
