# Manipulating Strings in Python
***
## Learning Objectives
In this lesson you will:

1. Learn the fundamentals of processing text stored in string values
2. Apply various methods to strings


## Links to topics and functions:
>- <a id='Lists'></a>[String Literals](#String-Literals)
>- <a id='methods'></a>[String Methods](#String-Methods)


### References:
>- Groner, Chapter 8
>- Sweigart (2015, pp. 123-143)
>- w3Schools: https://www.w3schools.com/python/python_strings.asp

#### Don't forget about the Python visualizer tool: http://pythontutor.com/visualize.html#mode=display

## Table of String Methods:
|Methods/Functions  |Description    |
|:-----------:      |:-------------|
|upper()            |Returns a new string with all UPPER CASE LETTERS|
|lower()            |Returns a new string with all lower case letters|
|isupper()          |Checks whether all the letters in a string are UPPER CASE|
|islower()          |Checks whether all the letters in a string are lower case|
|isalpha()          |Checks whether a string only has letters and is not blank|
|isalnum()          |Checks whether the string only has letters and numbers and is not blank|
|isdecimal()        |Checks whether the string only consists of numeric characters and is not blank|
|isspace()          |Checks whether the string only contains spaces, tabs, and new lines and is not blank|
|istitle()          |Checks whether every word in the string starts with an upper case letter followed by lower case letters|
|startswith()       |Checks whether the string value begins with the string passed to the method|
|endswith()         |Checks whether the string value ends with the string passed to the method|
|join()             |Concatenates a list of strings into one string|
|split()            |Splits a string into a list of strings (the reverse of `join()`)|
|rjust()            |Right justifies a string within a total width given as an integer, padding with spaces by default|
|ljust()            |Left justifies a string within a total width given as an integer, padding with spaces by default|
|center()           |Centers a string within a total width given as an integer, padding with spaces by default|
|strip()            |Removes whitespace characters at the beginning and end of the string|
|rstrip()           |Removes whitespace from the right end of the string|
|lstrip()           |Removes whitespace from the left end of the string|
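
The three `strip` methods at the end of the table can be sketched with a short example. The `raw` string below is a made-up value for illustration; `repr()` is used only so the remaining whitespace is visible.

```python
# Hypothetical input with stray whitespace on both ends
raw = "  \tGo Buffs!\n "

both = raw.strip()    # whitespace removed from both ends
right = raw.rstrip()  # whitespace removed from the right end only
left = raw.lstrip()   # whitespace removed from the left end only

print(repr(both))   # 'Go Buffs!'
print(repr(right))  # '  \tGo Buffs!'
print(repr(left))   # 'Go Buffs!\n '
```

None of these methods change `raw` itself; each one returns a new string.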

# String Literals
>- A string literal tells Python where a string begins and ends
>- We have already used single (`'`) and double (`"`) quotes, but what if we want to mix them?


### Using double quotes
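>- Wrapping the string in double quotes lets it contain a single quote (apostrophe); wrapping it in single quotes, as in `'Ralphie is CU's mascot'`, would end the string early and raise a `SyntaxError`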

In [4]:
ralphie = "Ralphie is CU's mascot"

#### Another way using escape characters

In [None]:
ralphie = 'Ralphie is CU\'s mascot'

### Escape characters allow us to put characters in a string that would otherwise be impossible to include

#### Here are some common escape characters

|Escape Character   | Prints as     |
|:-----------:      |:----------:   |
|`\'`               |Single quote   |
|`\"`               |Double quote   |
|`\t`               |Tab            |
|`\n`               |New line       |
|`\\`               |Backslash      |
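
A small sketch of the escape characters in the table above. The variable names and the Windows-style path are made up for illustration.

```python
# Each escape sequence below counts as ONE character in the string
quote = 'Ralphie is CU\'s mascot'   # \' keeps the single quote inside the string
tabbed = "Major\tSalary"            # \t inserts a tab
two_lines = "Go\nBuffs!"            # \n starts a new line
path = "C:\\Users\\ralphie"         # \\ produces one backslash (hypothetical path)

print(quote)
print(tabbed)
print(two_lines)
print(path)
print(len("\n"))  # 1
```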

### Multi-line Strings
>- Use triple quotes
>- All text within triple quotes is considered part of the string
>- Triple-quoted strings are sometimes used to comment out blocks of code, although Python still treats them as strings rather than true comments
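
A minimal sketch of a multi-line string. The text of `note` is made up for illustration.

```python
# Everything between the triple quotes, including the line break, is part of the string
note = '''Dear Dean:
We are learning Python.'''

print(note)

# The line break is stored as a single \n character
line_count = note.count('\n') + 1
print(line_count)  # 2
```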

### Indexing and Slicing Strings
>- Recall how we used indexes and slicing with lists: `list[1]`, `list[0:3]`, etc
>- Also recall how we said strings are "list-like"
>- We can think of a string as a list with each character having an index

#### Let's slice up some strings
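
A short sketch of indexing and slicing, reusing the `ralphie` string from earlier in the lesson (it is redefined here so the cell runs on its own). The other variable names are made up for illustration.

```python
ralphie = "Ralphie is CU's mascot"

first = ralphie[0]    # 'R' (indexes start at 0)
last = ralphie[-1]    # 't' (negative indexes count from the end)
name = ralphie[0:7]   # 'Ralphie' (the character at index 7 is not included)
tail = ralphie[-6:]   # 'mascot' (from 6 characters before the end to the end)

print(first, last, name, tail)
print(len(ralphie))   # 22
```

Slicing returns a new string and leaves `ralphie` unchanged.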

### How many times does each character appear in `ralphie`?

In [5]:
char_count = {}

for char in ralphie:
  # Start the count at 0 the first time a character is seen
  char_count.setdefault(char, 0)
  char_count[char] += 1

char_count

{'R': 1,
 'a': 2,
 'l': 1,
 'p': 1,
 'h': 1,
 'i': 2,
 'e': 1,
 ' ': 3,
 's': 3,
 'C': 1,
 'U': 1,
 "'": 1,
 'm': 1,
 'c': 1,
 'o': 1,
 't': 1}

#### How many times does 'p' appear in our `ralphie` variable?

In [6]:
char_count['p']

1

#### Recall: get a sorted count of characters from `char_count`

In [7]:
sorted(char_count.items(), key=lambda x:x[1], reverse=True)

[(' ', 3),
 ('s', 3),
 ('a', 2),
 ('i', 2),
 ('R', 1),
 ('l', 1),
 ('p', 1),
 ('h', 1),
 ('e', 1),
 ('C', 1),
 ('U', 1),
 ("'", 1),
 ('m', 1),
 ('c', 1),
 ('o', 1),
 ('t', 1)]
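
As an alternative sketch (not the approach used in this lesson), the standard library's `collections.Counter` can do the same counting and sorting. The names `counts` and `top_two` are made up for illustration.

```python
from collections import Counter

ralphie = "Ralphie is CU's mascot"

# Counter builds the character-count dictionary in one step
counts = Counter(ralphie)

# most_common(n) returns the n highest counts as (character, count) tuples
top_two = counts.most_common(2)
print(top_two)

# Unlike a plain dictionary, a Counter returns 0 for a missing key instead of raising KeyError
print(counts['f'])  # 0
```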

## String Methods

### upper(), lower(), isupper(), islower()

In [9]:
ralphie.upper()

"RALPHIE IS CU'S MASCOT"

##### Are all the letters uppercase?

In [10]:
ralphie.isupper()

False

##### Are all the letters lowercase?

In [11]:
ralphie.islower()

False

#### We can also call these methods directly on a string literal

In [12]:
'HELLO'.isupper()

True

In [13]:
'hello'.islower()

True

In [14]:
'hello123'.islower()

True

In [15]:
'123'.islower()

False

>- `islower()` and `isupper()` ignore digits and punctuation, but the string must contain at least one letter for the result to be `True`

### `isalpha()`, `isalnum()`, `isdecimal()`, `isspace()`, `istitle()`

>- These can be useful for data validation

##### Does the string only contain letters with no space characters?

In [17]:
ralphie.isalpha()

False

##### Does the string only contain letters or numbers with no spaces?

##### Does the string only contain numbers?
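
A short sketch answering the two questions above with `isalnum()` and `isdecimal()`. The `ralphie` string is redefined so the cell runs on its own, and the result variable names are made up for illustration.

```python
ralphie = "Ralphie is CU's mascot"

sentence_alnum = ralphie.isalnum()       # False: spaces and the apostrophe are not letters or numbers
word_alnum = 'hello123'.isalnum()        # True
spaced_alnum = 'hello 123'.isalnum()     # False: contains a space

digits_decimal = '123'.isdecimal()       # True
price_decimal = '12.5'.isdecimal()       # False: the period is not a digit
empty_decimal = ''.isdecimal()           # False: an empty string fails the check

print(sentence_alnum, word_alnum, spaced_alnum)
print(digits_decimal, price_decimal, empty_decimal)
```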

##### Does the string contain only words that start with a capital followed by lowercase letters (i.e., Title Case)?

In [18]:
ralphie.istitle()

False

#### Example showing how the `isX` methods are useful
>- Task: create a program that will ask a user for their age and print their age to the screen
>>- Create data validation for age requiring only numbers for the input
>>- If the user does not enter a number, ask them to enter one.

In [20]:
while True:
  age = input("What is your age?")
  # isdecimal() is True only when every character is a digit
  if age.isdecimal():
    break
  else:
    print("Please enter a number for your age")

print(age)

What is your age?thirty
Please enter a number for your age
What is your age?30
30


### `startswith()` and `endswith()` methods

##### Does the string start/end with a particular string?

In [21]:
ralphie = "Go Buffs!"

In [22]:
ralphie.startswith('Go')

True

In [24]:
'Hello'.startswith('he')

False

In [26]:
ralphie.startswith('Buffs')

False

In [27]:
ralphie.endswith('!')

True
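
Two patterns that build on the cells above, sketched with made-up values (`greeting` and `filename` are hypothetical). Both methods are case-sensitive, and both also accept a tuple of strings to test several options at once.

```python
greeting = 'Hello'

# startswith() is case-sensitive, so normalize the case first if case should not matter
matches_exact = greeting.startswith('he')             # False
matches_any_case = greeting.lower().startswith('he')  # True

# A tuple of options checks several suffixes in one call
filename = 'salaries.csv'  # hypothetical file name
is_data_file = filename.endswith(('.csv', '.txt'))    # True

print(matches_exact, matches_any_case, is_data_file)
```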

### `join()` and `split()` methods

#### `join()`
>- Take a list of strings and concatenate them into one string
>- The join method is called on a string value and is usually passed a list value

In [28]:
cu_leeds = ['marketing', 'finance', 'management', 'analytics']

In [29]:
cu_leeds_join = ', '.join(cu_leeds)

In [31]:
cu_leeds_join

'marketing, finance, management, analytics'

In [32]:
' and '.join(cu_leeds)

'marketing and finance and management and analytics'

#### `split()`
>- Commonly used to split a multi-line string along its newline characters
>- The split method is called on a string value and returns a list of strings

In [33]:
# Run this cell to define dean_letter

dean_letter = '''
Dear Dean:
We have been working really hard
to learn Python this semester.
The skills we are learning in
the analytics program will
translate into highly demanded
jobs and higher salaries than
those without anlaytics skills.
'''

#### Split `dean_letter` based on the line breaks
>- Will result in a list of all the string values based on line breaks

In [34]:
dean_letter.split('\n')

['',
 'Dear Dean:',
 'We have been working really hard',
 'to learn Python this semester.',
 'The skills we are learning in',
 'the analytics program will',
 'translate into highly demanded',
 'jobs and higher salaries than',
 'those without anlaytics skills.',
 '']

##### Splitting on another character

In [36]:
dean_letter.split(':')

['\nDear Dean',
 '\nWe have been working really hard\nto learn Python this semester.\nThe skills we are learning in\nthe analytics program will\ntranslate into highly demanded\njobs and higher salaries than\nthose without anlaytics skills.\n']

##### The default separator is any white space (new lines, spaces, tabs, etc)

In [None]:
dean_letter.split()
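
To see how the default separator differs from an explicit one, here is a sketch on a shorter, made-up string (`text` is hypothetical). With no argument, runs of whitespace count as a single separator and no empty strings are produced.

```python
text = 'Go  Buffs!\nGo CU'  # two spaces after the first 'Go', then a new line

default_split = text.split()     # splits on any run of whitespace
space_split = text.split(' ')    # splits on every single space character

print(default_split)  # ['Go', 'Buffs!', 'Go', 'CU']
print(space_split)    # ['Go', '', 'Buffs!\nGo', 'CU']

# Combining split() and join() collapses all whitespace to single spaces
cleaned = ' '.join(text.split())
print(cleaned)        # Go Buffs! Go CU
```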

##### We can limit the number of splits by passing a second argument (`maxsplit`)

In [38]:
dean_letter.split(' ', 3)

['\nDear',
 'Dean:\nWe',
 'have',
 'been working really hard\nto learn Python this semester.\nThe skills we are learning in\nthe analytics program will\ntranslate into highly demanded\njobs and higher salaries than\nthose without anlaytics skills.\n']

### Justifying Text with `rjust()`, `ljust()`, and `center()`
>- General syntax: `string.rjust(length, character)` where:
>>- length is required and represents the total width of the returned string, including the original text
>>- character is optional and represents the single character used to fill the extra space (a space by default)

In [39]:
'Hello'.rjust(10)

'     Hello'

##### We can insert another character for the spaces

In [40]:
'Hello'.rjust(10, '-')

'-----Hello'

In [41]:
'Hello'.ljust(10, '$')

'Hello$$$$$'

##### `center()` pads both sides, with or without a fill character

In [42]:
'Hello'.center(10)

'  Hello   '

In [43]:
'Hello'.center(20, '*')

'*******Hello********'

### Justifying Text Example
>- Task: write a function that accepts 3 parameters: itemsDict, leftWidth, rightWidth and prints a table for majors and salaries
>>- itemsDict will be a dictionary variable storing salaries (the values) for majors (the keys)
>>- leftWidth is an integer parameter that will get passed to the ljust() method to define the column width of majors
>>- rightWidth is an integer parameter that will get passed to the rjust() method to define the column width of salaries

In [47]:
def print_salary(itemsDict, leftWidth, rightWidth):
  # Header row: both column titles are left-justified
  print('Major'.ljust(leftWidth), 'Salary'.ljust(rightWidth))
  print('-' * (leftWidth + rightWidth))
  # One row per major: name padded with dots, salary right-justified
  for key, value in itemsDict.items():
    print(key.ljust(leftWidth, '.') + str(value).rjust(rightWidth))

In [49]:
salary_dict= {'Marketing' : 50000,
              'Accounting' : 55000,
              'Analytics': 57000,
              'Management': 60000}

print_salary(salary_dict, 15, 7)

Major           Salary 
----------------------
Marketing......  50000
Accounting.....  55000
Analytics......  57000
Management.....  60000


### HW: Some basic analytics on our salary table
>- How many total majors were analyzed? Name the variable `sampSize`
>- What was the average salary of all majors? Name the variable `avgSal`

<a id='top'></a>[TopPage](#Teaching-Notes)