### A compound data type
Types that comprise smaller pieces are called compound data types.

In [4]:
word = 'banana'
for i in range(0, len(word)):
    print(word[i], end=" . ")  # i is the index of the current character

b . a . n . a . n . a . 

To obtain the last letter in a word:

In [5]:
word[len(word)-1]

'a'

Alternatively, we can use **negative indices**.

In [7]:
print(word[-1], word[-2], end=" ")

a n 
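To make the link between the two approaches explicit, here is a minimal sketch assuming the same `word = 'banana'`. The helper name `from_end` is hypothetical and not part of the original notebook.

```python
word = 'banana'

def from_end(s, k):
    """Return the k-th character counting from the end (k=1 is the last one)."""
    # s[-k] is shorthand for s[len(s) - k]
    return s[len(s) - k]

# Pair each negative-index lookup with its positive-index counterpart.
pairs = [(word[-k], from_end(word, k)) for k in range(1, len(word) + 1)]
print(pairs)
```

Each pair holds the same character twice, so `word[-k]` and `word[len(word) - k]` agree for every valid `k`.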

### Traversal and the for loop
* Traversal with a `while` loop, then with a `for` loop:

In [9]:
idx = 0
while idx < len(word): # on the last iteration, idx = 5 < len('banana') = 6
    letter = word[idx]
    idx += 1
    print(letter, end=" ")

b a n a n a 

In [10]:
for i in word:
    print(i, end=" ") # loop continues until no characters are left

b a n a n a 
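When both the index and the character are needed, the built-in `enumerate` can stand in for the manual counter. A minimal sketch, again assuming `word = 'banana'`; the name `positions` is only illustrative.

```python
word = 'banana'

# enumerate yields (index, character) pairs, so no manual counter is needed
positions = []
for idx, letter in enumerate(word):
    positions.append((idx, letter))

print(positions)
```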

* Concatenation 

In [12]:
prefixes = "JKLMNOPQ"
suffix = "ack"

for i in prefixes:
    print(i + suffix)

Jack
Kack
Lack
Mack
Nack
Oack
Pack
Qack


In [30]:
"_".join(['J','ack','ooo'])

'J_ack_ooo'

* Slice

The slice operator [n:m] returns the part of the string from index n up to, but not including, index m. Omitting n starts the slice at the beginning of the string; omitting m runs it to the end.

In [16]:
word[:3], word[3:], word[:]

('ban', 'ana', 'banana')
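A few properties follow from the "include the first, exclude the last" rule. The sketch below assumes `word = 'banana'`; the variable names are illustrative only.

```python
word = 'banana'

n, m = 1, 4
piece = word[n:m]                 # characters at indices 1, 2 and 3
print(piece)                      # ana

# For 0 <= n <= m <= len(word), the slice holds exactly m - n characters.
length_matches = len(piece) == m - n

# Splitting at any position and joining the halves restores the original.
rejoined = word[:n] + word[n:]

# Unlike indexing, slicing past the end is clipped instead of raising an error.
clipped = word[4:100]
print(clipped)                    # na
```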

* String Comparison

Comparison operators are useful for putting words in lexicographical order (a generalization of alphabetical ordering, in which words are compared letter by letter).

In [17]:
def compare_words(word):
    if word < "banana":
        print("Your word, " + word + ", comes before banana.")
    elif word > "banana":
        print("Your word, " + word + ", comes after banana.")
    else:
        print("Yes, we have no bananas!")
        
compare_words('pear')

Your word, pear, comes after banana.


Uppercase letters come before all lowercase letters, so a capitalized word sorts before any lowercase word.

In [19]:
compare_words('Zebra')

Your word, Zebra, comes before banana.
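The ordering comes from the characters' code points, which `ord` exposes. One common way to get a dictionary-like order is to compare lowercased copies. This is a sketch only, and `comes_before` is a hypothetical helper name.

```python
# ord() returns the code point that string comparison is based on
print(ord('Z'), ord('b'))                  # 90 98

def comes_before(a, b):
    """Case-insensitive test of whether a sorts before b."""
    return a.lower() < b.lower()

words = ['pear', 'Zebra', 'banana']
plain = sorted(words)                      # uppercase words first
folded = sorted(words, key=str.lower)      # case ignored while sorting

print(plain)
print(folded)
```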


### Strings are immutable
Strings are immutable, which means you can’t change an existing string. 

In [25]:
greeting = "Hello, world!"
greeting[0] = 'J'            # ERROR!
print(greeting)

TypeError: 'str' object does not support item assignment

In [26]:
# The best we can do is create a new string that is a variation on the original
greeting = "Hello, world!"
print('J' + greeting[1:])

Jello, world!
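The same idea generalizes to any position. The sketch below uses a hypothetical helper, `replace_at`, that builds a new string from two slices and leaves the original untouched.

```python
def replace_at(s, index, ch):
    """Return a copy of s with the character at index replaced by ch."""
    if not 0 <= index < len(s):
        raise IndexError("index out of range")
    # Everything before index, the new character, everything after index
    return s[:index] + ch + s[index + 1:]

greeting = "Hello, world!"
jello = replace_at(greeting, 0, 'J')

print(jello)       # Jello, world!
print(greeting)    # Hello, world!  (the original is unchanged)
```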


### The `in` operator
* `in` tests whether one string is a substring of another.
* A string is a substring of itself.

In [27]:
'apple' in "apple"

True

###### Function to remove all vowels in a string:

In [29]:
def remove_vowels(input_str):
    vowels = 'aeiouAEIOU'
    output_str = ""
    for i in input_str:
        if i not in vowels:
            output_str += i
    print(output_str)
    
remove_vowels("duplicated")

dplctd
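A variant that returns the result instead of printing it is easier to reuse and test. This sketch keeps the same vowel set; the name `without_vowels` is hypothetical.

```python
def without_vowels(input_str):
    """Return input_str with all ASCII vowels removed."""
    vowels = 'aeiouAEIOU'
    # Keep only the characters that are not in the vowel set
    return "".join(ch for ch in input_str if ch not in vowels)

result = without_vowels("duplicated")
print(result)
```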


### The `find` function

In [39]:
def find_ind(input_str, ch):
    ind = 0
    while ind < len(input_str):
        if input_str[ind] == ch:
            return ind
        ind += 1
    return "The character you're looking for is not in the string."

find_ind("crossyroad", 'r')

1

In [36]:
myString = 'American'
myString.lower().find('a')

0

In [38]:
[idx for idx, cha in enumerate('American'.lower()) if cha == 'a']

[0, 6]

In [42]:
# Add a starting point to the find_ind function (the "start" argument is optional)
def find_ind_2(input_str, ch, start=0):
    ind = start
    while ind < len(input_str):
        if input_str[ind] == ch:
            return ind
        ind += 1
    return "The character you're looking for is not in the string."

find_ind_2("crossyroad", 'r', 2), find_ind_2("crossyroad", 'r')

(6, 1)
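Returning a message string on failure makes the result awkward to test, because a caller has to tell an index apart from a sentence. The built-in `str.find` signals "not found" with -1. The sketch below follows that convention and uses the `start` argument to collect every match. The names `find_ind_3` and `find_all` are hypothetical.

```python
def find_ind_3(input_str, ch, start=0):
    """Return the first index of ch at or after start, or -1 if absent."""
    ind = start
    while ind < len(input_str):
        if input_str[ind] == ch:
            return ind
        ind += 1
    return -1

def find_all(input_str, ch):
    """Return the indices of every occurrence of ch."""
    found = []
    ind = find_ind_3(input_str, ch)
    while ind != -1:
        found.append(ind)
        # Resume the search just past the previous match
        ind = find_ind_3(input_str, ch, ind + 1)
    return found

print(find_all("crossyroad", 'r'))
```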

### The "string" module

In [43]:
import string

In [44]:
dir(string)

['Formatter',
 'Template',
 '_ChainMap',
 '_TemplateMetaclass',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_re',
 '_string',
 'ascii_letters',
 'ascii_lowercase',
 'ascii_uppercase',
 'capwords',
 'digits',
 'hexdigits',
 'octdigits',
 'printable',
 'punctuation',
 'whitespace']

In [48]:
string.digits.__doc__

"str(object='') -> str\nstr(bytes_or_buffer[, encoding[, errors]]) -> str\n\nCreate a new string object from the given object. If encoding or\nerrors is specified, then the object must expose a data buffer\nthat will be decoded using the given encoding and error handler.\nOtherwise, returns the result of object.__str__() (if defined)\nor repr(object).\nencoding defaults to sys.getdefaultencoding().\nerrors defaults to 'strict'."

In [53]:
print(string.hexdigits)

0123456789abcdefABCDEF


In [57]:
# In Python 3, find is a string method; the optional second argument is the start index:
'mybirthday'.find('b', 3)

-1
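The result is -1 because the only 'b' sits at index 2, before the starting point. The sketch below contrasts `str.find` with `str.index`, which raises `ValueError` instead of returning -1. The variable names are illustrative.

```python
s = 'mybirthday'

first_b = s.find('b')        # search from the beginning
later_b = s.find('b', 3)     # search starts after the only 'b'
next_y = s.find('y', 2)      # skips the 'y' at index 1

# str.index behaves like find but raises ValueError when nothing is found
try:
    s.index('b', 3)
    raised = False
except ValueError:
    raised = True

print(first_b, later_b, next_y, raised)
```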

### Character Classification

In [61]:
print(string.ascii_lowercase)
print(string.ascii_uppercase)
print(string.digits)

abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
0123456789


In [63]:
def is_lower(ch):
    return 'a' <= ch <= 'z'

is_lower('g'), is_lower('G') # upper case is smaller than lower case

(True, False)
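The range test above also accepts longer strings such as 'abc'. A stricter sketch, assuming only single ASCII characters should count, can use `string.ascii_lowercase`. The built-in `str.islower` is shown for comparison, since it also recognizes non-ASCII lowercase letters. The name `is_lower_ascii` is hypothetical.

```python
import string

def is_lower_ascii(ch):
    """True only for a single ASCII lowercase letter."""
    return len(ch) == 1 and ch in string.ascii_lowercase

# '\u00e9' is a lowercase e with an acute accent (non-ASCII)
samples = ['g', 'G', '7', '\u00e9']
checks = [(c, is_lower_ascii(c), c.islower()) for c in samples]
print(checks)
```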

### String Formatting
"FORMAT" % (VALUES)
* Conversion Specifications:
    * %s: string
    * %d: decimal integer
    * %f: floating-point number

In [64]:
"My name is %s." % 'Linna'

'My name is Linna.'

In [65]:
n1 = 4
n2 = 5
"2**10 = %d and %d * %d = %f" % (2**10, n1, n2, n1 * n2)

'2**10 = 1024 and 4 * 5 = 20.000000'

Set the **width of columns** independently:
* The - after each % in the conversion specifications indicates left justification.

In [72]:
i = 1
print("%-4s%-5s%-6s%-8s%-13s%-15s" % \
      ('i', 'i**2', 'i**3', 'i**5', 'i**10', 'i**20'))
while i <= 10:
    print("%-4d%-5d%-6d%-8d%-13d%-15d" % (i, i**2, i**3, i**5, i**10, i**20)) # %-4d, %-5d, %-6d,...
    i += 1

i   i**2 i**3  i**5    i**10        i**20          
1   1    1     1       1            1              
2   4    8     32      1024         1048576        
3   9    27    243     59049        3486784401     
4   16   64    1024    1048576      1099511627776  
5   25   125   3125    9765625      95367431640625 
6   36   216   7776    60466176     3656158440062976
7   49   343   16807   282475249    79792266297612001
8   64   512   32768   1073741824   1152921504606846976
9   81   729   59049   3486784401   12157665459056928801
10  100  1000  100000  10000000000  100000000000000000000
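The width in each specification is a minimum, which is why the last column spills over once `i**20` needs more than 15 characters. For comparison, the same layout can be sketched with f-strings, where `<` requests left justification. This assumes Python 3.6 or later, and `power_row` is a hypothetical helper name.

```python
def power_row(i):
    """Format one table row with left-justified, fixed-minimum-width columns."""
    return f"{i:<4d}{i**2:<5d}{i**3:<6d}{i**5:<8d}{i**10:<13d}{i**20:<15d}"

# The %-style equivalent of the same row, for comparison
old_style = "%-4d%-5d%-6d%-8d%-13d%-15d" % (2, 2**2, 2**3, 2**5, 2**10, 2**20)

for i in range(1, 4):
    print(power_row(i))
```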


### Exercise

##### 1. 

In [74]:
'pineapple' < 'Peach'

False