# Welcome to the Dark Art of Coding:
## Introduction to Python
Strings

<img src='../images/dark_art_logo.600px.png' width='300' style="float:right">

# Objectives
---

In this session, students should expect to:

* Learn to create strings
* Learn to perform basic string manipulations, such as indexing and slicing
* Explore using string methods


<h1>What is a string?</h1>
<img src='pearl_necklace.jpg'>

Photo by <a href="https://www.flickr.com/photos/tiararama/3530304167">tiararama</a><br>
Attribution http://creativecommons.org/licenses/by-nd/4.0/


# Strings
---

In [None]:
# Using single quotes

phrase = 'python rocks'
print(phrase)           # print() displays the value on the "screen"
                        #     and will not display the quotes
phrase                  # in Jupyter/IPython, leaving off the print() function
                        #     simply evaluates the variable and displays it as
                        #     a string, with quotes 

In [None]:
# Using double quotes is also fine (but physically harder to type)

phrase = "python rocks"
           
phrase                  

In [None]:
# Watch out when a single-quoted string contains an apostrophe:
#     the apostrophe ends the string early and Python raises a SyntaxError

apostrophe = 'You've written code'

In [None]:
# Using double quotes to encapsulate the single quote
#     solves this problem

apostrophe = "You've written code"
apostrophe

In [None]:
# Using a backslash to "escape" the apostrophe also works

escape = 'You\'ve written code'
escape

In [None]:
# Numbers encapsulated by quotes are still strings:

num = '42'
num

In [None]:
# To convert a numeric string to an integer
integer_form = int('42')
integer_form

In [None]:
# Or convert a numeric string to a float

float_form = float('42.0')
float_form

# Escape characters
---

Character  | Displays
----------|----------
\'        | Single quote
\"        | Double quote
\t        | Tab 
\n        | Newline (line break, return character)
\\\\        | Backslash

In [None]:
# Printing using a newline escape character
#     enables you to put content on separate lines

print("Python\nPython 3\nPython 3.6")

In [None]:
# Raw strings let you preserve a string "as is",
#    backslashes and all

print(r'You\'re gonna be a great programmer!')
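
Raw strings are most useful when backslashes are meant literally. The following is a minimal sketch using a made-up Windows-style path (the path is hypothetical and only serves as an illustration):

```python
# Hypothetical path, used only for illustration
regular = 'C:\new_folder\table.txt'   # \n and \t are treated as escape characters
raw = r'C:\new_folder\table.txt'      # backslashes are kept exactly as typed

print(regular)
print(raw)

# The raw version is two characters longer because each
# backslash survives as its own character
print(len(regular), len(raw))
```

Printing both versions side by side shows the regular string broken across lines at the `\n`, while the raw string prints exactly as written.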

# Experience Points!
---

On the **IPython interpreter** do each of the following:

Task | Sample Object(s)
:---|:---
Create a string with your first name, a tab, your last name and a newline in it; `print()` the string | `\t`, `\n`
Create a string with a triple tab in it; `print()` the string | `\t\t\t`
Create a string with a single quote in it; `print()` the string | `'`
Recreate the same string using an alternate method|

When you complete this exercise, please put your green post-it on your monitor. 

If you want to continue on at your own pace, please feel free to do so.

<img src='../images/green_sticky.300px.png' width='200' style='float:left'>

# Multiline strings
---

In [None]:
# Multiline strings: using triple quotes (either ' OR ")
#     preserve natural newlines and leading spaces, etc.

print("""Multiline Strings!

multiline strings will preserve
    the nuances
        including the newlines and leading spaces and 

yes, you're still gonna be a great programmer!

""")

In [None]:
# One great place to use multiline quotes is as the 
#     first string in a function. This string is
#     automatically used by Python as the documentation
#     or Docstring for your function.

def getRandomNumber():
    '''This function returns a random number
    
    chosen by fair dice roll.
    guaranteed to be random.
    
    hat tip to http://xkcd.com/221/'''
    return 4

In [None]:
help(getRandomNumber)

## xkcd
<img src='random_number.png'>

Attribution http://xkcd.com/221/

# Indexing and slicing
---

## Indexing

Each character in a string has an index:
Indexes start at zero (0)

```Python
language = 'P  Y  T  H  O  N'
            0  1  2  3  4  5
```

Indexes also exist counting backwards from the end:
NOTE: reverse indexing starts counting at -1.


```Python
language = 'P  Y  T  H  O  N'
            0  1  2  3  4  5
           -6 -5 -4 -3 -2 -1 
```

In [None]:
# To reference a specific character in a string, you
#     simply use bracket notation and the index
#     of the character you are interested in.

phrase = 'Pyladies!'
print(phrase[0])

In [None]:
print(phrase[8])

In [None]:
print(phrase[-3])
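
One way to make sense of negative indexes is to relate them to the length of the string. This small sketch assumes the same `phrase` value used above; the names `from_end` and `from_start` are only illustrative:

```python
phrase = 'Pyladies!'

# A negative index counts back from the end, so phrase[-3]
# names the same character as phrase[len(phrase) - 3]
from_end = phrase[-3]
from_start = phrase[len(phrase) - 3]

print(from_end, from_start)
print(from_end == from_start)
```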

## Slicing

In [None]:
# To reference more than one character, you 
#     use the same bracketed notation and the
#     * starting index
#     * ending index
# WARNING: Python slices up to, but NOT including, the
#          ending index.

print(phrase[0:2])

Why **zero indexing**?

Why the principle of **'up to but not including'**?

[See the note by Dijkstra](https://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF)    

In [None]:
print(phrase[-7:-1])

In [None]:
print(phrase[3:-1])

In [None]:
# Leaving the index blank on either side of the colon defaults to
#     the start of the string (left side) or to the end of the
#     string, INCLUDING the last character (right side).

print(phrase[-7:])
print(phrase[:])


In [None]:
# This shortcut is a common way to create a copy of a sequence...

new_phrase = phrase[:]
print("new_phrase is:", new_phrase)


# Exceeding bounds
---

In [None]:
# If you attempt to exceed the index bounds when accessing a 
#     specific character, you will get an error condition:
#     'index out of range'

print(phrase[9000])
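
Slices behave differently from single indexes when they run out of bounds. The sketch below, which assumes the same `phrase` as above, catches the error from indexing and then shows that an oversized slice is quietly trimmed:

```python
phrase = 'Pyladies!'

# Indexing past the end raises an IndexError ...
try:
    phrase[9000]
except IndexError as error:
    message = str(error)
    print('IndexError:', message)

# ... but a slice that runs past the end is quietly trimmed
trimmed = phrase[3:9000]
print(trimmed)

# A slice that starts past the end is simply empty
empty = phrase[9000:]
print(repr(empty))
```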

# Finding the length of a string
---

In [None]:
# to find the length of a string, use the len() function

len(phrase)

# Parsing each character, one at a time
---

In [None]:
for char in phrase:
    print(char) 
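
If the index of each character is needed alongside the character itself, the built-in `enumerate()` function can supply both. This is a minimal sketch; the list `pairs` is only there to collect the results for inspection:

```python
phrase = 'Pyladies!'

# enumerate() pairs each character with its index
pairs = []
for index, char in enumerate(phrase):
    pairs.append((index, char))
    print(index, char)
```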

# `in` as a logical operator
---

In [None]:
# `in` allows you to test a string to see if a 
#      substring is present anywhere in the string

'Py' in phrase

In [None]:
# The `in` test is case-sensitive: 'pyladies!' (lowercase p)
#     does not match 'Pyladies!'

'pyladies!' in phrase

In [None]:
'Fortran' not in phrase

In [None]:
# DANGER WILL ROBINSON:
# WARNING: the empty string `''` is always 
#          considered to be present in a string


'' in phrase

# Experience Points!
---

In your **text editor** create a simple script called:

```bash
my_strings_01.py
```

Execute your script in the **IPython interpreter** using the command:

```bash
run my_strings_01.py
```

I suggest that as you add each feature to your script that you run it right away to test it incrementally.

1. Assign the label `myname` to a string with your first and last names
1. Assign the label `shortname` to a slice of `myname` that has all letters **except** the first and last letters
1. `print()` `shortname` to the screen
1. Assign the label `result` to a test of whether `python` is present in `myname`
1. `print()` "python is embedded in my name: " and `result` to the screen 

When you complete this exercise, please put your green post-it on your monitor. 

If you want to continue on at your own pace, please feel free to do so.

<img src='../images/green_sticky.300px.png' width='200' style='float:left'>

# Comparisons
---

In [None]:
term1 = 'Python'
term2 = 'python'

In [None]:
# Python interprets string equality based on 
#     case sensitivity:
#     Thus 'Python' and 'python' are different


term1 == term2

In [None]:
term3 = 'abcde'
term4 = 'abcdz'

In [None]:
# Python uses lexicographical order (loosely, alphabetical order)
#     when determining whether a string comes before or
#     after another string

term3 < term4

In [None]:
# lexicographical order puts capital letters first

term1 < term2

Lexicographical order, what?
---

<img src='ascii-0-127.gif' width='600'>

Attribution: [ascii chart: http://www.jimprice.com/ascii-0-127.gif](http://www.jimprice.com/ascii-0-127.gif)
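
The numbers in the chart can be inspected directly with the built-in `ord()` function. The following sketch shows why capital letters sort first; the word list is made up for illustration:

```python
# ord() returns the numeric code for a single character
print(ord('P'), ord('p'))

# Capital letters have smaller codes than lowercase letters,
# which is why 'Python' sorts before 'python'
capital_first = ord('P') < ord('p')
print(capital_first)
print('Python' < 'python')

# Sorting a list of strings follows the same rule
words = sorted(['python', 'Zebra', 'apple', 'Python'])
print(words)
```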

# Object Methods
## >>> particularly string methods
---

In [None]:
# Every string in Python is also an object

phrase

Objects in Python are associated with a programming paradigm called "Object Oriented Programming"

Each object has attributes and behaviors associated with it, even if it is not inherently obvious at first.

Behaviours are often also called methods (or functions).

Strings are no exception.

Attributes and behaviours are accessed via dot.notation:

```python
phrase.some_attribute
phrase.some_behaviour()
```

Attributes are accessed using the form `name_of_the_object.name_of_the_attribute`.

Behaviours are accessed similarly, but because they are methods/functions, they need to be **called** by using parentheses.

**All** functions/methods are called using parentheses.
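
To see what the parentheses actually do, it may help to compare a method with and without them. This is a small sketch using the `.upper()` method of a string; the names `method` and `result` are illustrative only:

```python
phrase = 'Pyladies!'

# Without parentheses we get the method object itself, not a result
method = phrase.upper
print(callable(method))

# With parentheses the method runs and returns a new string
result = phrase.upper()
print(result)

# The stored method can still be called later
print(method() == result)
```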

In [None]:
# Strings in Python come with behaviors such as the 
#     .upper() method which makes a copy of the string
#     and converts it to uppercase.

phrase.upper()

In [None]:
# Strings in Python come with behaviors such as the 
#     .lower() method which makes a copy of the string
#     and converts it to lowercase.

phrase.lower()

Generically referencing strings:

The generic version of a string is called `str` so it is common to see string methods prefixed by str:

```python
str.upper()
str.lower()
```
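
The `str` prefix is more than a naming convention: the methods really do live on the `str` class. The sketch below suggests that calling a method on a string object is equivalent to calling it on `str` and passing the string in explicitly:

```python
phrase = 'Pyladies!'

# Calling the method on the object ...
via_object = phrase.upper()

# ... is equivalent to calling it on the str class and
# passing the string in explicitly
via_class = str.upper(phrase)

print(via_object, via_class)
print(type(phrase) is str)
```

In everyday code the first form, `phrase.upper()`, is the one you will normally see.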


In [None]:
# One very common place where str.upper() or str.lower() is used is
#     in normalizing inputs for easy comparison
#     In the following example, you don't know if your
#     User will input with all CAPS, all lower, or MiXeD CaSe...


choice = input('Who is your favorite superhero? ')
if choice.lower() == 'selina kyle':
    print('I like Catwoman, too')
else:
    print("Catwoman is better!")

# Finding an index
---

Sometimes we want to know: 

* whether a character is present in a string
* where that character may be found

We will look at `phrase.find()` and leave it as an exercise for the student to research `phrase.index()`

In [None]:
sentence = 'Python Rox'
sentence

In [None]:
# If we want to find where the 'on' substring is located, we 
#     can use the find method. It returns the index where
#     the substring starts.

sentence.find('on')

# P y t h o n   R o x
# 0 1 2 3 4 5 6 7 8 9

In [None]:
# If we want to find where the 'o' string is located, we 
#     can use the find function.
# But this shows us the location of the FIRST 'o'

sentence.find('o')

# P y t h o n   R o x
# 0 1 2 3 4 5 6 7 8 9

In [None]:
sentence.find?


In [None]:
# To keep searching for other occurrences of the character 'o',
#     we can start the next search ONE character past the index
#     where the previous 'o' was found.

sentence.find('o', 5)

# P y t h o n   R o x
# 0 1 2 3 4 5 6 7 8 9

In [None]:
# To automate such processing users often use a placeholder 
#     variable to help identify where to start the next search

start = sentence.find('o')
print(start)
next_index = sentence.find('o', start + 1)
print(next_index)

# P y t h o n   R o x
# 0 1 2 3 4 5 6 7 8 9
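
The placeholder idea extends naturally to a loop that collects every match. This is a minimal sketch that relies on `find()` returning `-1` when the substring is not found; the list name `positions` is an assumption made for this example:

```python
sentence = 'Python Rox'

positions = []
start = sentence.find('o')

# find() returns -1 once there are no more matches
while start != -1:
    positions.append(start)
    start = sentence.find('o', start + 1)

print(positions)
```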

# Testing string characteristics
---

In [None]:
# There are methods to test for characteristics:
# These types of methods almost always start with
#     "is" >> isupper(), islower()


heroine = 'CATWOMAN'
heroine.isupper()

In [None]:
heroine.islower()

In [None]:
# How do you find out which methods exist for strings?
# Use tab-completion: type the name, then a dot, then press Tab

sentence.

Method | Purpose
-------|--------
.isalpha()     | Verifies whether ALL the characters are alphabetic
.isalnum()     | Verifies whether ALL the characters are alphabetic or numeric
.isdecimal()     | Verifies whether ALL the characters are decimal digits (0-9)
.isspace()     | Verifies whether ALL the characters are whitespace (\t, \n, ' ', etc)
.istitle()     | Verifies whether the string is in 'Title Case'

In [None]:
# NOTE: even strings that have not been assigned a label
#       have the same methods and attributes associated with
#       them.

'bullwhip'.islower()

In [None]:
# NOTE: numerical characters in a string do not have a sense
#       of upper or lowercase.

'42'.isupper()

In [None]:
# Method chaining... 
# is allowed AND is very Pythonic.

heroine = 'C4TW0M4N'

heroine.lower().isalpha()

In [None]:
# As noted above, str is the base class (i.e. blueprint) for 
#     all string objects.

help(str.isspace)

In [None]:
# any str object will have the same methods

help(heroine.isspace)

In [None]:
while True:
    num_burglaries = input('How many cat burglaries did you commit this week? ')
    if num_burglaries.isdecimal():
        break
    print('That was not a number, please input a number\n')
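
Once the input passes the `.isdecimal()` test it is safe to convert with `int()`. The sketch below wraps that check in `parse_count`, a hypothetical helper that is not part of the lesson, so that a few edge cases can be tried without typing input by hand:

```python
def parse_count(text):
    """Return text as an int if it holds only decimal digits, else None.

    parse_count is a hypothetical helper used for illustration.
    """
    if text.isdecimal():
        return int(text)
    return None

print(parse_count('12'))
print(parse_count('twelve'))
print(parse_count('-3'))     # the minus sign is not a decimal digit
print(parse_count('4.5'))    # neither is the decimal point
print(parse_count(''))       # an empty string fails the test too
```

Note that `.isdecimal()` rejects negative numbers and floats, so this check only suits non-negative whole numbers.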
        

Method | Purpose
------------------|--------
.startswith()     | Verifies whether the string STARTS with a substring
.endswith()     | Verifies whether string ENDS with a substring
.join()     | Combine all the elements of a sequence (like a list) using the string
.split()     | Separate the string into substrings by splitting on a given string

In [None]:
address = 'bishop street|honolulu|hawaii'

results = address.split('|')
print(results)

new_string = ' <---> '.join(results)
print(new_string)
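
The table also lists `.startswith()` and `.endswith()`, which the cell above does not exercise. Here is a short sketch using the same `address` string; it also shows that splitting and joining on the same separator round-trips back to the original:

```python
address = 'bishop street|honolulu|hawaii'

print(address.startswith('bishop'))
print(address.endswith('hawaii'))
print(address.startswith('Bishop'))   # these tests are case-sensitive

# Splitting and then joining on the same separator
# gives back the original string
parts = address.split('|')
rebuilt = '|'.join(parts)
print(rebuilt == address)
```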

Method | Purpose
-------|--------
.rjust()     | Right justify the string in a field of given width
.ljust()     | Left justify the string in a field of given width
.center()     | Center the string in a field of given width

In [None]:
print('*', 'aloha'.center(40), '*')

In [None]:
print('Date'.ljust(13), 'Name'.center(20), 'Address'.rjust(20))

Method | Purpose
-------|--------
.strip()     | Remove all the given characters from the string (on both ends)
.rstrip()     | Remove all the given characters from the right end of the string 
.lstrip()     | Remove all the given characters from the left end of the string

In [None]:
newline_str = '***\n\n\n*this string\nhas\nnewlines and stars*\n\n\n***'
print(newline_str)

print('-' * 60)

clean_version = newline_str.strip('\n*')
print(clean_version)
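
The strip methods only work on the ends of a string; matching characters in the middle are left alone. The following sketch compares the three variants on a made-up string (`padded` is an assumption for this example):

```python
padded = '***aloha * world***'

both = padded.strip('*')
left = padded.lstrip('*')
right = padded.rstrip('*')

print(both)    # the star in the middle is untouched
print(left)
print(right)

# With no argument, strip() removes surrounding whitespace
no_whitespace = '  \t aloha \n'.strip()
print(repr(no_whitespace))
```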

# Experience Points!
---

In your **text editor** create a simple script called:

```bash
my_strings_02.py
```

Execute your script in the **IPython interpreter** using the command:

```bash
run my_strings_02.py
```

I suggest that as you add each feature to your script that you run it right away to test it incrementally.

1. Assign the label `myname` to a string with your first and last names separated by a space
1. Assign a label to the result of splitting (`.split()`) the variable `myname` based on spaces
1. Test whether `myname` is ALL uppercase: IF you don't remember which method to use, research string functions.
1. Create a field 42 characters wide and right justify your name in the field: IF you don't remember which method to use, research string functions.

When you complete this exercise, please put your green post-it on your monitor. 

If you want to continue on at your own pace, please feel free to do so.

<img src='../images/green_sticky.300px.png' width='200' style='float:left'>

# String Formatting

Method | Purpose
-------|--------
.format()     | return a formatted version of a string

In [None]:
help(str.format)

In [None]:
boilerplate = '''Catwoman is a complex character with many plot lines. Several women have
used the name Catwoman, including: {}, {}, and {}.
'''.format('Selina Kyle', 'Holly Robinson', 'Eiko Hasigawa')

print(boilerplate)

In [None]:
positionals = '{1} {0}'.format('last', 'first')

print(positionals)

In [None]:
headline = '''In a battle of the superheroes between {0} and {1},
in this round, {0} clearly came out on top, getting away with 
the jewels. {0} snuck away in the dark of night.'''.format('Catwoman', 'Batman')

print(headline)

In [None]:
# Alignment

left_align = '{:20}'.format('Selina')
print(left_align)

left_align = '{:<20}'.format('Selina')
print(left_align)

right_align = '{:>20}'.format('Holly')
print(right_align)

center_align = '{:^20}'.format('Eiko')
print(center_align)

In [None]:
# Padding

left_pad = '{:*<20}'.format('Selina')
print(left_pad)

right_pad = '{:_>20}'.format('Kyle')
print(right_pad)

In [None]:
# Truncation

trunc = '{:.6}'.format('Selina Kyle')
print(trunc)

# Combining them all together

trunc_pad = '{:->20.6}'.format('Selina Kyle')
print(trunc_pad)

In [None]:
# Numbers

print('{:d}'.format(42))

print('{:f}'.format(2.71))

print('{:20d}'.format(42))

print('{:6.2f}'.format(2.718281828459045))

print('{:+d}'.format(42))

print('{:d}'.format(-42))

In [None]:
# Named Placeholders

name = '{first} {last}'.format(first='Selina', last='Kyle')

print(name)
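
Named placeholders can be combined with the alignment and number formats shown earlier. This is a minimal sketch that lines up two rows of made-up data in columns; the field names `heists` and `rate` and their values are invented for illustration:

```python
# One template, reused for every row:
#   name   -> left-aligned in 10 characters
#   heists -> integer, right-aligned in 6 characters
#   rate   -> float with 2 decimal places, right-aligned in 8 characters
template = '{name:<10}|{heists:>6d}|{rate:>8.2f}'

line1 = template.format(name='Selina', heists=42, rate=0.9)
line2 = template.format(name='Holly', heists=7, rate=0.5)

print(line1)
print(line2)
```

Because every row uses the same template, the `|` separators end up in the same columns regardless of how long each value is.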

# Want to learn more?

https://pyformat.info