<p style="text-align:center">
PSY 341K <b>Python Coding for Psychological Sciences</b>, Fall 2019

<img src="https://github.com/sathayas/JupyterPythonFall2019/blob/master/Images/PythonLogo.png?raw=true" alt="Python logo" width="400">
</p>

<h1 style="text-align:center"> String manipulation </h1>

<h4 style="text-align:center"> October 1, 2019 </h4>
<hr style="height:5px;border:none" />
<p>

# 1. Escape characters
<hr style="height:1px;border:none" />

At the beginning of the semester, we saw that a string containing a single quotation
mark (`'`) can be printed by enclosing the string in double quotation marks (`"`).

In [1]:
print("O'Reilly Auto Parts")

O'Reilly Auto Parts


But what if we want to print double quotation marks? Here is how to do it.

In [2]:
print('The conductor said, \"the door is closing.\"')

The conductor said, "the door is closing."


In other words, we write `\"` in place of `"` to indicate a double quotation mark. This
is an example of an **escape character**: a backslash followed by another character,
which together stand for a single special character in a string. Some commonly used
escape characters are listed below.

|Escape character |Printed as       |
|:--              |:--              |
|`\'`             |Single quotation |
|`\"`             |Double quotation |
|`\t`             |Tab              |
|`\n`             |New line         |
|`\\`             |Backslash        |



Here are some examples of the escape characters:

In [3]:
print('O\'Reilly Auto Parts')

O'Reilly Auto Parts


In [4]:
print('This statement \nis a multi-line \nstatement.')

This statement 
is a multi-line 
statement.


In [5]:
print('Key\tFunction\nF11\tShow desktop\nF3\tMission control')

Key	Function
F11	Show desktop
F3	Mission control
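
The table also lists `\\`, which the examples above do not use. Here is a minimal sketch, assuming a made-up Windows-style folder path (the path is hypothetical and only for illustration). It also shows that each escape character counts as a single character in the string.

```python
# A backslash must itself be escaped to appear in the output.
# The folder path below is a made-up example.
folder = 'C:\\Users\\student\\Documents'
print(folder)

# Each escape character counts as one character in the string.
tab_length = len('\t')
backslash_length = len('\\')
print(tab_length, backslash_length)
```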


### Exercise
Write a print statement to print out the following:
1. `A writer's block is a "myth."`
2. 
```
Short cut    Function
CTRL+C       Copy
CTRL+V       Paste
```


# 2. Changing cases
<hr style="height:1px;border:none" />

Suppose you have a program that asks the user to enter their favorite fruit.

[`<FavoriteFruits.py>`](https://github.com/sathayas/PythonClassFall2019/blob/master/stringExamples/FavoriteFruits.py)

In [None]:
print('What is your favorite fruit?')
userFruit = input()

if userFruit == 'apple':
    print('I like ' + userFruit + ' too.')
else:
    print('I eat ' + userFruit + ' occasionally')

You can try entering `'apple'` and `'Apple'` and see what happens. If you enter
`'apple'`, then the program prints out `'I like apple too.'`. However, if you
enter `'Apple'`, it prints out `'I eat Apple occasionally'` instead.
Why? This is because string comparisons are case sensitive in Python. In other words,
`'apple'` and `'Apple'` are treated as two different strings.

One way to overcome this challenge is to use the **`upper()`** and **`lower()`**
methods for strings. As you can guess from their names, you can use the
`upper()` method with a string to make it all CAPS.

In [2]:
'apple'.upper()

'APPLE'

In [3]:
'APPLE'.lower()

'apple'

Likewise, the `lower()` method converts a string to all lower case. With the
`lower()` method, we can revise the previous program as

In [None]:
print('What is your favorite fruit?')
userFruit = input().lower()

if userFruit == 'apple':
    print('I like ' + userFruit + ' too.')
else:
    print('I eat ' + userFruit + ' occasionally')

and the program becomes case insensitive since the input is converted to
lower case.
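
The same idea can be wrapped in a function so that it can be tried without typing any input. This is a minimal sketch; the function name `fruit_reply` is hypothetical and is not part of `FavoriteFruits.py`.

```python
def fruit_reply(fruit):
    """Return the program's reply, ignoring the case of the input."""
    fruit = fruit.lower()  # 'Apple', 'APPLE', and 'apple' all become 'apple'
    if fruit == 'apple':
        return 'I like ' + fruit + ' too.'
    return 'I eat ' + fruit + ' occasionally'


print(fruit_reply('Apple'))
print(fruit_reply('BANANA'))
```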

There are also methods to check cases, **`isupper()`** and **`islower()`**. You can use
them to check whether a word consists entirely of upper or lower case letters.

In [4]:
word = 'NCAA'
word.isupper()

True

In [5]:
word = 'Python'
word.isupper()

False

In [6]:
word = 'austin'
word.islower()

True
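
Strings that mix letters with digits or spaces can give surprising results, because only letters have a case. Here is a short sketch with a few made-up strings.

```python
# Digits and spaces are ignored when checking the case,
# but the string must contain at least one letter.
samples = ['NCAA 2019', 'Python', '2019']
for s in samples:
    print(s, s.isupper(), s.islower())

upper_results = [s.isupper() for s in samples]
lower_results = [s.islower() for s in samples]
```

Note that `'2019'` is neither upper nor lower case, since it contains no letters at all.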

Since a string can be indexed and sliced like a list, you can manipulate or evaluate a substring (a part of a string).

In [7]:
newWord = word[0].upper() + word[1:]
newWord

'Austin'

In [8]:
newWord[0].isupper()

True
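
Indexing and slicing work on strings the same way they do on lists. Here is a brief sketch, using a made-up variable `course`.

```python
course = 'psychology'

first_letter = course[0]    # first character
last_letter = course[-1]    # last character
first_four = course[:4]     # characters at positions 0 through 3
rest = course[4:]           # everything from position 4 onward

print(first_letter, last_letter, first_four, rest)
print(len(course))          # number of characters in the string
```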

### Exercise
1. **Day of the week**. Write a program that asks the user to enter a day of the week. Regardless of upper case or lower case letters in the user input, the program prints out the day of the week in the following format:
```
Monday
Thursday
...
```
In other words, the first character is in upper case, and the rest of the string is in lower case.

# 3. Examining character types
<hr style="height:1px;border:none" />

There are some string methods that are useful for examining user input.
  * **`isalpha()`** - `True` when a string consists only of letters, with no blanks or numbers
  * **`isalnum()`** - `True` when a string consists only of letters and numbers, *with no blanks or special characters*
  * **`isdecimal()`** - `True` when a string consists only of digits

All three return `False` for an empty string. Here are some examples.

[`<InputCheck.py>`](https://github.com/sathayas/PythonClassFall2019/blob/master/stringExamples/InputCheck.py)

In [None]:
while True:
    print('Please enter your 5-digit ZIP code')
    yourZIP = input()
    if yourZIP.isdecimal():
        break
    print('A ZIP code can only have numbers')

while True:
    print('Please enter a word')
    yourWord = input()
    if yourWord.isalpha():
        break
    print('That is not a proper word. Try again')

while True:
    print('Choose a new password (letters and numbers only)')
    yourPassword = input()
    if yourPassword.isalnum():
        break
    print('That is not a proper password. Try again')

You can try these by intentionally entering invalid input.
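
To see how the three methods differ without typing input, the checks can be run on a few made-up strings. The sketch below also adds a length test, since `isdecimal()` alone accepts a digit string of any length. The helper name `is_valid_zip` is hypothetical and is not part of `InputCheck.py`.

```python
# Compare the three methods on several made-up strings.
samples = ['Austin', 'or54321', '78712', 'hello world', '']
for s in samples:
    print(repr(s), s.isalpha(), s.isalnum(), s.isdecimal())


def is_valid_zip(text):
    """Return True only for a string of exactly five digits."""
    return text.isdecimal() and len(text) == 5


print(is_valid_zip('78712'))
print(is_valid_zip('787'))
print(is_valid_zip('7871a'))
```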

### Exercise
1. **Last name or UT EID**. Write a program that asks the user to enter their last name or UT EID. The program determines whether the input is a last name or a UT EID, and prints out one of the following:
 * Your last name is REDENBACHER
 * Your UT EID is 'or54321'
 
Notice that all the letters are in upper case for the last name, whereas all the letters are in lower case for the UT EID. You can assume that the last name consists solely of letters (no hyphens or apostrophes).

# 4. `split()` and `join()` methods
<hr style="height:1px;border:none" />

When you have a long string, you can split it into smaller pieces with the **`split()`** method.

In [19]:
text = 'Python is a widely used programming language.'
text.split()

['Python', 'is', 'a', 'widely', 'used', 'programming', 'language.']

By default, the `split()` method splits a string by white space characters
(space, new line `\n`, and tab `\t`). The `split()` method produces a list of substrings.
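
The default behavior differs from passing a single space as the delimiter. Here is a small sketch with a made-up string that contains two spaces in a row, a tab, and a new line.

```python
messy = 'Python  is\ta widely\nused language'

# With no argument, any run of white space counts as one separator.
words = messy.split()
print(words)

# With an explicit ' ' delimiter, every single space is a separator,
# so two spaces in a row produce an empty string, and the tab and
# new line characters are left inside the pieces.
pieces = messy.split(' ')
print(pieces)
```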

You can also specify the character separating sub-strings (also known as a
*delimiter*). Here are examples.

In [20]:
listAnimals = 'cat,dog,horse,lion,moose'
listAnimals.split(',')

['cat', 'dog', 'horse', 'lion', 'moose']

In [22]:
longText = '''This is a multi-line text.
This includes multiple sentences on multiple lines.
Some lines may be short.
However, other lines may be longer, containing a larger number of words.
You can split this text, line by line, using the split() method.
'''
longText.split('\n')

['This is a multi-line text.',
 'This includes multiple sentences on multiple lines.',
 'Some lines may be short.',
 'However, other lines may be longer, containing a larger number of words.',
 'You can split this text, line by line, using the split() method.',
 '']
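
The last element of the list above is an empty string, because the text ends with a new line character. If that is not wanted, one option is the built-in `splitlines()` method, sketched here with a shorter made-up text.

```python
shortText = 'First line.\nSecond line.\n'

# split('\n') leaves an empty string after the final new line.
with_split = shortText.split('\n')
print(with_split)

# splitlines() does not produce that trailing empty string.
with_splitlines = shortText.splitlines()
print(with_splitlines)
```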

If you have a list of strings, then you can join them into a single string using the
**`join()`** method. For example,

In [23]:
animals = ['cat', 'dog', 'horse', 'lion', 'moose']
', '.join(animals)

'cat, dog, horse, lion, moose'

In [24]:
' & '.join(animals)

'cat & dog & horse & lion & moose'
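
Note that `join()` is called on the separator string, and every item in the list must itself be a string. Here is a small sketch, assuming a made-up list of numbers `scores`.

```python
scores = [90, 85, 77]

# Numbers must be converted to strings before joining.
score_text = ', '.join(str(s) for s in scores)
print(score_text)

# split() followed by join() tidies up irregular spacing.
untidy = 'too    many   spaces'
tidy = ' '.join(untidy.split())
print(tidy)
```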

### Exercise
1. **Word counter**. Write a ***function*** that takes a sentence as the input and counts the number of words in that sentence. You may assume that the sentence only includes letters and spaces. The number of words is returned from this function.

# 5. `replace()` method
<hr style="height:1px;border:none" />

Suppose you want to replace a certain substring in a string with another substring.
For example, you want to replace the word "is" with "was". You can use
the **`replace()`** method to do this.

In [25]:
text = 'This time, his bagel is freshly baked'
text.replace(' is ', ' was ')

'This time, his bagel was freshly baked'

Notice that a space was intentionally included before and after "is". Without
these spaces, every occurrence of the letters "is" is replaced, including those
inside "This" and "his".

In [26]:
text.replace('is', 'was')

'Thwas time, hwas bagel was freshly baked'
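
Two further details of `replace()` may help here: it returns a new string and leaves the original unchanged, and an optional third argument limits how many replacements are made. Here is a short sketch.

```python
text = 'This time, his bagel is freshly baked'

# Only the first occurrence of 'is' (inside 'This') is replaced.
first_only = text.replace('is', 'was', 1)
print(first_only)

# The original string is not modified.
print(text)
```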

Earlier you saw a sentence split into a list of words. However, the list includes
punctuation marks.

In [27]:
statement = 'Thus, students enjoy learning how to program in Python.'
statement.split()

['Thus,',
 'students',
 'enjoy',
 'learning',
 'how',
 'to',
 'program',
 'in',
 'Python.']

Notice `'Thus,'` instead of `'Thus'`, and `'Python.'` instead of `'Python'`. You
can eliminate punctuation marks with the `replace()` method.

In [28]:
statement.replace(',','').replace('.','').split()

['Thus',
 'students',
 'enjoy',
 'learning',
 'how',
 'to',
 'program',
 'in',
 'Python']

### Exercise

1. **De-punctuation**. Write a function that takes a sentence as an input parameter, and removes any of the following punctuation marks:
  * `','` (comma)
  * `'.'` (period)
  * `'?'` (question mark)
  * `'!'` (exclamation mark)
  * `';'` (semicolon)
  * `':'` (colon)
  * `'/'` (forward slash)

The function returns the processed sentence without punctuation marks.