<img src="images/lasalle_logo.png" style="width:375px;height:110px;">

# Week 4 – Strings

### WIM250 - Introduction to Scripting Languages 
### Instructor: Ivaldo Tributino

Source:

- Automate the boring stuff with Python: practical programming for total beginners by Sweigart, A.
- Python Cookbook, 3rd Edition by David Beazley and Brian K. Jones
- Python for Everybody Exploring Data Using Python 3 by Dr. Charles R. Severance

## A string is a sequence

```python
college = 'LASALLE COLLEGE'
```

<img src="images/lasalle_college.png" style="width:475px;height:100px;">

You can access the characters one at a time with the bracket operator:

```python
letter = college[0]
print(letter)  # prints: L
```

The expression in brackets is called an `index`. The `index` indicates which character in the sequence you want.

**Observation:** In Python, sequences such as strings and lists are `0-indexed`: the first element has index `0`, the second has index `1`, and so on. So if there are `n` elements in a sequence, the last element has index `n-1`. Remember this!


In [None]:
college = 'LASALLE COLLEGE'
idx = 4
letter = college[idx]
print(letter)

## Multiline Strings with Triple Quotes / Multiline Comments

A multiline string in Python begins and ends with either three single quotes or three double quotes. A triple-quoted string that is not assigned to a variable is evaluated and then discarded, so it is often used as a comment that spans multiple lines.

In [None]:
take_a_tour = '''LaSalle College Vancouver is located in an 80,000-square-foot, 
state-of-the-art learning space. 
Our modern learning facilities have everything you need to train on real-world 
equipment and technology.
'''

"""Join us for our upcoming Virtual and In-person Open House to learn about 
our different schools, available programs, creative careers, and cutting-edge 
online and on-campus facilities!
"""

print(take_a_tour)
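
As a small sketch of what a triple-quoted string actually stores, the example below uses a short made-up string (the name `note` is only illustrative) and shows that the line breaks become part of the value:

```python
# A short triple-quoted string; the line breaks are part of the value.
note = '''line one
line two
'''

print(repr(note))           # repr() makes the embedded \n characters visible
lines = note.splitlines()   # split the text back into separate lines
print(lines)
```

`splitlines()` is one convenient way to get the individual lines back when you need to process them one at a time.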

## Getting the length of a string using `len`

Do you remember the built-in function `len()`? We can use it to get the number of characters in a string:

In [None]:
length = len(college)
length

In [None]:
i=0
while i < length:
    letter = college[i]
    i = i + 1               # Increment
    print(letter, end=" ")  # end=" " keeps the output on the same line

print()    

# Can we use a for loop?

for letter in college:
    print(letter, end = " ") 

If we execute a cell with the following code, we will receive a Traceback. What is the reason for the error?

```python
college[length]

```

In [None]:
try:
    x = college[length]
    print(x)
except IndexError:  # catch only the error this cell is meant to show
    print('What is the reason for the Traceback?')

If you want to know the `last` letter of a string, you can use `negative indices`, which count `backward` from the end of the string.

In [None]:
college[-1]
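
The relationship between negative and non-negative indices can be sketched as follows. The block restates `college` so that it runs on its own; the names `last` and `first` are only illustrative.

```python
college = 'LASALLE COLLEGE'
length = len(college)        # 15

last = college[-1]           # same character as college[length - 1]
first = college[-length]     # same character as college[0]

print(last, first)
print(college[-1] == college[length - 1])
```

In general, for a valid negative index `-k`, `college[-k]` refers to the same character as `college[length - k]`.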

**Exercise 1:** Write a `while loop` that starts at the `last character` in the string and works its way `backwards` to the first character in the string, printing each letter on a separate line.

Output:
```
E
G
E
L
L
O
C
 
E
L
L
A
S
A
L
```


In [None]:
i=-1
while i >= -length: # -15
    letter = college[i]
    i -= 1              # Decrement
    print(letter) 

## String slices

The operator `[n:m]` returns the part of the string from the `n-th` character to the `m-th` character, `including the first` but `excluding the last`.

If you `omit the first` index (before the colon), the slice `starts at the beginning` of the string. If you `omit the second` index, the slice `goes to the end` of the string:


In [None]:
college[:7] # omitting the first index

In [None]:
college[8:] # omitting the second

If the first index is greater than or equal to the second the result is an `empty` string, represented by two quotation marks:

In [None]:
college[0:0]

In [None]:
college[9:8]
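
The slicing rules above can be checked with a short sketch. The block restates `college` so that it is self-contained, and the names `first_word` and `second_word` are only illustrative.

```python
college = 'LASALLE COLLEGE'

first_word = college[:7]     # indices 0 through 6
second_word = college[8:]    # index 8 through the end
print(first_word)
print(second_word)

# With valid indices n <= m, the slice [n:m] contains m - n characters.
print(len(college[2:6]))

# Splitting at any index and joining the two parts gives back the original.
print(college[:7] + college[7:] == college)
```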

**Exercise 2:** Since college is a string, what does `college[:]` mean?

In [None]:
college[:]

## Looping and counting

It is tempting to use the `[]` operator on the left side of an assignment, with the intention of `changing` or `adding` a character in a string. For example:

In [None]:
try:
    college[0] = 'l'  # also college[15] = 'V'
except TypeError:     # item assignment on a str raises TypeError
    print('str object does not support item assignment')

The reason for the error is that `strings are immutable`, which means you can’t change an existing string. The best you can do is create a new string that is a variation on the original:

In [None]:
college = 'LASALLE COLLEGE'
print(college)
college = college + ' VANCOUVER' # concatenates
print(college)
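
One way to get the effect of "changing" a character is to combine slicing and concatenation into a new string. This is only a sketch; the block restates `college` so that it runs on its own, and `lower_first` is an illustrative name.

```python
college = 'LASALLE COLLEGE'

# Build a new string: a replacement first character plus the rest of the original.
lower_first = 'l' + college[1:]

print(lower_first)
print(college)      # the original string is unchanged
```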

Let's count the number of times the letter "A" appears in the string  college:

In [None]:
count = 0
for letter in college:
    if letter == 'A':
        count +=1
print(count)        

**Exercise 3:** Encapsulate this code in a function named count, and generalize it so that it accepts the string and the letter as arguments.

In [None]:
def count(letter, string):
    total = 0             # separate name so the function name is not shadowed
    for l in string:
        if l == letter:
            total += 1
    return total

In [None]:
count('a', college)

In [None]:
count('A', college)

Python does not handle `uppercase` and `lowercase` letters the same way that people do. All the uppercase letters come before all the lowercase letters, so:

```
'Z' < 'a'
```

In [None]:
'Z' < 'a'

In [None]:
'a' < 'b'

In [None]:
'Lasalle' < 'LAsalle'

In [None]:
# sorted() is a built-in function that builds a new sorted list from an iterable.
sorted('Lasalle') 
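
The ordering comes from the numeric code point of each character, which the built-in `ord()` exposes. The sketch below also shows two common ways to ignore case, using `str.lower()`. The variable names are only illustrative.

```python
# ord() returns the integer code point of a single character.
code_Z = ord('Z')
code_a = ord('a')
print(code_Z, code_a)    # 90 97, so 'Z' < 'a'

# Comparing lowercased copies ignores the difference in case.
same_word = 'Lasalle'.lower() == 'LAsalle'.lower()
print(same_word)

# sorted() accepts a key function; str.lower orders letters without regard to case.
ignoring_case = sorted('Lasalle', key=str.lower)
print(ignoring_case)
```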

## String methods

Strings are an example of Python `objects`. An object contains both data (the actual string itself) and `methods`, which are effectively `functions` that are `built into the object`. 

Calling a `method` is similar to calling a `function` (it takes arguments and returns a value) but the syntax is different. We call a method by appending the method name to the variable name using the period as a delimiter.

Example:

`str.capitalize()`
Return a copy of the string with its first character capitalized and the rest lowercased.

The documentation for string methods is available at https://docs.python.org/library/stdtypes.html#string-methods.

In [None]:
# The empty parentheses indicate that this method takes no argument.

'ivaldo'.capitalize()

As another example, the method `upper()` takes a string and returns a new string with all `uppercase` letters. Instead of the function syntax `upper(str)`, it uses the method syntax `str.upper()`.

In [None]:
'lasalle'.upper()

Let's look at a method that takes an argument, for example the string method named `find`, which searches for the position of one string within another:

<img src="images/lasalle_college.png" style="width:475px;height:100px;">

In [None]:
college.find("a")

In [None]:
x = college.find("SALLE")
college[x:]

In [None]:
# It can take as a second argument the index where it should start:
college.find("L", 12)
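
When the substring is not present, `find` returns `-1` rather than raising an error, so it is worth checking the result before slicing with it. A minimal sketch follows; it restates `college` so that it runs on its own, and `pos` is an illustrative name.

```python
college = 'LASALLE COLLEGE'

print(college.find('a'))      # -1: the string has no lowercase 'a'
print(college.find('A'))      # 1: index of the first 'A'
print(college.find('L', 1))   # 4: the search starts at index 1

pos = college.find('COLLEGE')
if pos != -1:                 # only slice when the substring was found
    print(college[pos:])
```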

One common task is to remove white space (spaces, tabs, or newlines) from the beginning and end of a string using the `strip` method:


In [None]:
' WIM250 '.strip()

Some methods such as `startswith()`, `isupper()` and `islower()` return Boolean values. There are several string methods that have names beginning with the word `is`. These methods return a Boolean value that describes the nature of the string. Here are some common `is-string` methods:

- `isalpha()` returns True if the string consists only of letters and is not blank.
- `isalnum()` returns True if the string consists only of letters and numbers and is not blank.
- `isdecimal()` returns True if the string consists only of numeric characters and is not blank.
- `isspace()` returns True if the string consists only of spaces, tabs, and newlines and is not blank.
- `istitle()` returns True if the string consists only of words that begin with an uppercase letter followed by only lowercase letters.

In [None]:
'Lasalle'.startswith('l')

In [None]:
'lasalle'.startswith('l')

In [None]:
'lasalle'.islower()

In [None]:
college.isupper()
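
The `is-string` methods listed earlier can be tried in the same way. The sample strings below are only illustrative.

```python
print('Lasalle'.isalpha())           # True: letters only
print('WIM250'.isalpha())            # False: contains digits
print('WIM250'.isalnum())            # True: letters and numbers only
print('250'.isdecimal())             # True: numeric characters only
print(' \t\n'.isspace())             # True: space, tab and newline only
print('Lasalle College'.istitle())   # True: each word is capitalized
print(''.isalpha())                  # False: a blank string
```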

**Exercise 4:** There is a string method called `count` that is similar to the function in the previous exercise. Let's see it below.

In [None]:
letter = "A"
start = 1          # counts within college[start:end]; end itself is excluded
end = 20
college.count(letter, start, end)
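
The optional `start` and `end` arguments of `str.count` behave like slice indices, so the character at `end` is not included. A short sketch follows; it restates `college` so that it runs on its own.

```python
college = 'LASALLE COLLEGE'

print(college.count('A'))         # 2: every 'A' in the string
print(college.count('A', 2))      # 1: the 'A' at index 1 is skipped
print(college.count('L', 0, 4))   # 1: indices 0 to 3 only
print(college.count('L', 0, 5))   # 2: index 4 is now included
print(college.count('LL'))        # 2: substrings are counted too
```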

## Format operator

The format operator, `%` allows us to construct strings, replacing parts of the strings with the data stored in variables. When applied to integers, `%` is the modulus operator. But when the first operand is a string, `%` is the format operator.
The first operand is the format string, which contains one or more format sequences that specify how the second operand is formatted. The result is a string.

The following example uses `%d` to format an integer, `%g` to format a floating-point number `(don’t ask why)`, and `%s` to format a string:


In [None]:
'For %d years, I have eaten %g hamburgers at %s a year' % (10, 0.5, 'McDonalds')
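
As a sketch of how the operands line up: a single value can follow `%` directly, while several values must be given as a tuple with one element per format sequence. The variable names here are only illustrative.

```python
years = 10
rate = 0.5
place = 'McDonalds'

single = '%d years' % years                      # one format sequence, one value
several = '%d, %g, %s' % (years, rate, place)    # three sequences, a tuple of three
print(single)
print(several)

# Too few values for the format sequences raises a TypeError.
try:
    '%d %d' % (1,)
except TypeError as err:
    print('TypeError:', err)
```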

## Justifying Text with `rjust()`, `ljust()` and `center()`

These string methods return a padded version of the string they are called on, with spaces (or another fill character, given as the second argument) inserted to justify the text.

In [None]:
c_college = college.center(40, '*') # string.center(length, character)
r_college = college.rjust(40, '&')
l_college = college.ljust(40, '*')

print(c_college)
print(r_college)
print(l_college)
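
Two details are worth checking with a small sketch: the fill character defaults to a space, and a width smaller than the string leaves the string unchanged instead of truncating it. The block restates `college` so that it runs on its own; `padded` is an illustrative name.

```python
college = 'LASALLE COLLEGE'          # 15 characters

padded = college.rjust(20)           # no fill character given: spaces are used
print(repr(padded))
print(len(padded))                   # 20

# A width smaller than the string returns the string unchanged.
print(college.center(5) == college)
```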

These methods are especially useful when you need to print tabular data with the correct spacing. Run the following code:

In [None]:
def printPicnic(itemsDict, leftWidth, rightWidth):
    print('PICNIC ITEMS'.center(leftWidth + rightWidth, '-'))
    for k, v in itemsDict.items():
        print(k.ljust(leftWidth, '.') + str(v).rjust(rightWidth)) 
        

In [None]:
picnicItems = {'sandwiches': 4, 'apples': 12, 'cups': 4, 'cookies': 8000}
printPicnic(picnicItems, 12, 5)

**Exercise 5**: Given the string below:

st = 'WDIM150/WIM250 : INTROduction TO SCRIPTING LANGUAGES'

Use `find` and string `slicing` to extract the portions of the string before and after the slash and colon characters, and then use the `upper()`, `rjust()` and `center()` methods to print:

```
****** INTRODUCTION TO SCRIPTING LANGUAGES ******
--- WDIM150
--- WIM250

```

In [None]:
st = 'WDIM150/WIM250 : INTROduction TO SCRIPTING LANGUAGES'

slash = st.find('/')           # 7    
c1 = st[:slash]               # WDIM150
c2 = st[slash+1:slash+7]      # WIM250

c1 = c1.rjust(len(c1)+1).rjust(len(c1)+4,'-')   # --- WDIM150
c2 = c2.rjust(len(c2)+1).rjust(len(c2)+4,'-')   # --- WIM250

colon = st.find(':')                         #15
name = st[colon+2:].upper()                 # INTRODUCTION TO SCRIPTING LANGUAGES           
name = name.center(len(name)+2)             # " INTRODUCTION TO SCRIPTING LANGUAGES "  
name = name.center(len(name)+12,'*')        # ****** INTRODUCTION TO SCRIPTING LANGUAGES ******

print(name)
print(c1)
print(c2)

For the above exercise, `.split()` can be used instead of string slicing, and the built-in `format()` function can also be used to align text easily. All you need to do is use the `<`, `>`, or `^` character along with a desired width. For example:

In [None]:
course = st.split(':')
name = course[1].upper() 
name = format(name, '<{length}'.format(length = len(name)+1))
name = format(name, '*^{length}'.format(length = len(name)+12))
name
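
The three alignment characters can be compared side by side in a short sketch. The same format specification also works inside an f-string, where the width can come from a variable. The names `code`, `left`, `right`, `centered`, `width` and `banner` are only illustrative.

```python
code = 'WIM250'                    # 6 characters

left = format(code, '<10')         # pad on the right
right = format(code, '>10')        # pad on the left
centered = format(code, '*^10')    # pad both sides with '*'
print(repr(left))
print(repr(right))
print(repr(centered))

# The same specification inside an f-string, with the width taken from a variable.
width = 12
banner = f'{code:-^{width}}'
print(banner)
```

Nesting `{width}` inside the specification avoids building the format string with a separate call to `.format()`.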