# Welcome to the Dark Art of Coding:
## Introduction to Python
Strings

<img src='../images/dark_art_logo.600px.png' width='300' style="float:right">

# Objectives
---

In this session, students should expect to:

* Learn to create strings
* Learn to perform basic manipulations on strings
* Explore string methods


<h1>What is a string?</h1>
<img src='pearl_necklace.jpg'>

Photo by <a href="https://www.flickr.com/photos/tiararama/3530304167">tiarama</a><br>
Attribution http://creativecommons.org/licenses/by-nd/4.0/


# Strings
---

In [None]:
# Using single quotes

phrase = 'python rocks'
print(phrase)           # print() displays the value on the "screen"
                        #     and will not display the quotes
phrase                  # in Jupyter/IPython, leaving off the print() function
                        #     simply evaluates the variable and displays it as
                        #     a string, with quotes 

In [None]:
# Using double quotes is also fine (but physically harder to type)

phrase = "python rocks"
           
phrase                  

In [None]:
# Watch out when mixing apostrophes and single quotes:
#     the line below raises a SyntaxError

apostrophe = 'You've written code'

In [None]:
# Using double quotes to encapsulate the single quote
#     solves this problem

apostrophe = "You've written code"
apostrophe

In [None]:
# Using a backslash to "escape" the apostrophe also works

escape = 'You\'ve written code'
escape

# Escape characters
---

Character  | Displays
----------|----------
\'        | Single quote
\"        | Double quote
\t        | Tab 
\n        | Newline (line break, return character)
\\\\        | Backslash

In [None]:
# Printing using a newline escape character
#     enables you to put content on separate lines

print("Python\nPython 3\nPython 3.6")

In [None]:
# Raw strings let you preserve a string "as is"
#    escape slashes and all

print(r'You\'re gonna be a great programmer!')
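
To see what an escape sequence actually stores, it can help to compare the length of a regular string with the length of a raw string. The following is a minimal sketch; the variable names `tabbed` and `raw` are only for illustration.

```python
# A regular string turns \t into a single tab character
tabbed = 'name\tvalue'

# A raw string keeps the backslash and the letter t as two characters
raw = r'name\tvalue'

print(len(tabbed))      # 10
print(len(raw))         # 11
print(tabbed == raw)    # False
```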

# Multiline strings
---

In [None]:
# Multiline strings: using triple quotes (' OR ")
#     preserve natural newlines and leading spaces, etc.

print('''Multiline Strings!

multiline strings will preserve
    the nuances
        including the newlines and leading spaces and 

yes, you're still gonna be a great programmer!

''')

In [None]:
# One great place to use multiline quotes is as the 
#     first string in a function. This string is
#     automatically used by Python as the documentation
#     or Docstring for your function.

def getRandomNumber():
    '''This function returns a random number
    
    chosen by fair dice roll.
    guaranteed to be random.
    
    hat tip to http://xkcd.com/221/'''
    return 4

In [None]:
help(getRandomNumber)

## xkcd
<img src='random_number.png'>

Attribution http://xkcd.com/221/

# Indexing and slicing
---

## Indexing

Each character in a string has an index. Indexes start at zero (0):

```Python
language = 'P  Y  T  H  O  N'
            0  1  2  3  4  5
```

Indexes also exist counting backwards from the end.

NOTE: reverse indexing starts counting at -1.


```Python
language = 'P  Y  T  H  O  N'
            0  1  2  3  4  5
           -6 -5 -4 -3 -2 -1 
```

In [None]:
# To reference a specific character in a string, you
#     simply use bracket notation and the index
#     of the character you are interested in.

phrase = 'Pyladies!'
print(phrase[0])

In [None]:
print(phrase[8])

In [None]:
print(phrase[-1])
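
Forward and reverse indexes point at the same characters, which the short sketch below tries to make concrete. It uses a separate illustrative variable, `word`, and also shows what happens when an index falls past the end of the string.

```python
word = 'PYTHON'

# A negative index counts back from the end, so index -1
#     is the same position as index len(word) - 1
print(word[-1] == word[len(word) - 1])    # True
print(word[-6] == word[0])                # True

# Asking for an index past the end raises an IndexError
try:
    word[6]
except IndexError as error:
    print('IndexError:', error)
```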

## Slicing

In [None]:
# To reference more than one character, you 
#     use the same bracketed notation and the
#     * starting index
#     * ending index
# WARNING: Python slices up to, but NOT including, the
#          ending index.

print(phrase[0:2])

Why **zero indexing**?

Why the principle of **'up to but not including'**?

[See the note by Dijkstra](https://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF)
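
A few practical payoffs of these two conventions can be checked directly. The sketch below is only an illustration of the idea, not a summary of the note itself.

```python
phrase = 'Pyladies!'

# The length of a slice is simply end - start
print(len(phrase[2:8]) == 8 - 2)            # True

# Splitting at any index gives two pieces that fit
#     back together with no overlap and no gap
print(phrase[:2] + phrase[2:] == phrase)    # True

# With zero indexing, the first n characters are phrase[:n]
print(phrase[:2])                           # Py
```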

In [None]:
print(phrase[-7:-1])

In [None]:
# Leaving the index blank before the colon defaults to the
#     start of the string; leaving it blank after the colon
#     defaults to going all the way to AND including the last
#     character.

print(phrase[-7:])
print(phrase[:])


In [None]:
# This shortcut is a common way to create copies of sequences
#     (it is most useful for mutable sequences, such as lists)

new_phrase = phrase[:]
print("new_phrase is:", new_phrase)
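
Copying matters most for sequences that can be changed in place, such as lists. The sketch below assumes you have already met lists (they are not covered in this session) and uses the illustrative names `heroes` and `heroes_copy`.

```python
heroes = ['Catwoman', 'Batman']

# [:] builds a new list with the same elements
heroes_copy = heroes[:]

# Changing the copy leaves the original untouched
heroes_copy.append('Robin')

print(heroes)         # ['Catwoman', 'Batman']
print(heroes_copy)    # ['Catwoman', 'Batman', 'Robin']
```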


# `in` as a logical operator
---

In [None]:
# `in` allows you to test a string to see if a 
#      substring is present anywhere in the string

'Py' in phrase

In [None]:
# The `in` test is case-sensitive

'py' in phrase

In [None]:
'Fortran' not in phrase

In [None]:
# DANGER WILL ROBINSON:
# WARNING: the empty string `''` is always 
#          considered to be present in a string


'' in phrase

# Object Methods
## >>> particularly string methods
---

In [None]:
# Every string in Python is also an object

phrase

Objects in Python are associated with a programming paradigm called "Object Oriented Programming".

Each object has attributes and behaviors associated with it, even if that is not obvious at first.

Behaviors are usually called methods (functions that belong to an object).

Strings are no exception.

Attributes and behaviors are accessed via dot notation:

```python
phrase.some_attribute
phrase.some_behavior()
```

Attributes are accessed using the form `name_of_the_object.name_of_the_attribute`.

Behaviors are accessed similarly, but because they are methods/functions, they need to be **called** by using parentheses.

**All** functions/methods are called using parentheses.
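
The difference between naming a method and calling it is easy to miss, so here is a minimal sketch using the real `.upper()` string method. The variable name `method` is only for illustration.

```python
phrase = 'Pyladies!'

# Without parentheses, you get the method itself, not its result
method = phrase.upper
print(callable(method))    # True

# With parentheses, the method is called and returns a new string
print(phrase.upper())      # PYLADIES!

# The original string is unchanged
print(phrase)              # Pyladies!
```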

In [None]:
# Strings in Python come with behaviors such as the 
#     .upper() method which makes a copy of the string
#     and converts it to uppercase.

phrase.upper()

In [None]:
# Strings in Python come with behaviors such as the 
#     .lower() method which makes a copy of the string
#     and converts it to lowercase.

phrase.lower()

Generically referencing strings:

The generic version of a string is called `str` so it is common to see string methods prefixed by str:

```python
str.upper()
str.lower()
```
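
Written as `str.upper()`, the notation is shorthand for "the `upper` method of any string". You can also call the method through `str` by passing the string in yourself, as in this small sketch:

```python
phrase = 'Pyladies!'

# Calling the method through the str class, passing the string in
print(str.upper(phrase))                      # PYLADIES!

# This gives the same result as calling it on the string itself
print(str.upper(phrase) == phrase.upper())    # True
```

In everyday code, the `phrase.upper()` form is the one you will normally see.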


In [None]:
# One very common place where str.upper() or str.lower() is used is
#     in normalizing inputs for easy comparison
#     In the following example, you don't know if your
#     User will input with all CAPS, all lower, or MiXeD CaSe...


choice = input('Who is your favorite superhero? ')
if choice.lower() == 'selina kyle':
    print('I like Catwoman, too')
else:
    print("Catwoman is better!")

In [None]:
# There are methods to test for characteristics:
# These types of methods almost always start with
#     "is" >> isupper(), islower()


heroine = 'CATWOMAN'
heroine.isupper()

In [None]:
heroine.islower()

Method | Purpose
-------|--------
.isalpha()     | Verifies whether ALL the characters are alphabetic
.isalnum()     | Verifies whether ALL the characters are alphabetic or numeric
.isdecimal()     | Verifies whether ALL the characters are decimal digits
.isspace()     | Verifies whether ALL the characters are whitespace (\t, \n, ' ', etc)
.istitle()     | Verifies whether the string is in 'Title Case'
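
The methods in the table can be tried on a few sample strings. The strings below are made up for illustration; note in particular that a minus sign or a period is enough to make `.isdecimal()` return `False`.

```python
print('Catwoman'.isalpha())        # True
print('Cat woman'.isalpha())       # False: the space is not alphabetic
print('Python3'.isalnum())         # True
print('42'.isdecimal())            # True
print('-42'.isdecimal())           # False: the minus sign is not a digit
print('4.2'.isdecimal())           # False: the period is not a digit
print(' \t\n'.isspace())           # True
print('The Dark Art'.istitle())    # True
print('The dark art'.istitle())    # False
```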

In [None]:
# NOTE: even strings that have not been assigned a label
#       have the same methods and attributes associated with
#       them.

'bullwhip'.islower()

In [None]:
# NOTE: numerical characters in a string do not have a sense
#       of upper or lowercase.

'42'.isupper()

In [None]:
# Method chaining... 
# is allowed AND is very Pythonic.


heroine.lower().islower()

In [None]:
# As noted above, str is the base class (i.e. blueprint) for 
#     all string objects.

help(str.isspace)


str.isspace?

In [None]:
# any str object will have the same methods

help(heroine.isspace)

In [1]:
while True:
    num_burglaries = input('How many cat burglaries did you commit this week? ')
    if num_burglaries.isdecimal():
        break
    print('That was not a number, please input a number\n')
        

How many cat burglaries did you commit this week? a
That was not a number, please input a number

How many cat burglaries did you commit this week? b
That was not a number, please input a number

How many cat burglaries did you commit this week? 3


Method | Purpose
------------------|--------
.startswith()     | Verifies whether the string STARTS with a substring
.endswith()     | Verifies whether the string ENDS with a substring
.join()     | Combine all the elements of a sequence (like a list), using the string as the separator
.split()     | Separate the string into substrings by splitting on a given string
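
A short sketch of these four methods in action follows. The sample values, such as the `filename` variable, are hypothetical.

```python
filename = 'burglary_report.txt'
print(filename.startswith('burglary'))     # True
print(filename.endswith('.csv'))           # False

# With no argument, split() separates on whitespace
words = 'python rocks my socks'.split()
print(words)                               # ['python', 'rocks', 'my', 'socks']

# join() is called on the separator string
print('-'.join(words))                     # python-rocks-my-socks

# split() can also separate on a string you provide
print('Selina,Holly,Eiko'.split(','))      # ['Selina', 'Holly', 'Eiko']
```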

Method | Purpose
-------|--------
.rjust()     | Right justify the string in a field of given width
.ljust()     | Left justify the string in a field of given width
.center()     | Center the string in a field of given width
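
The justification methods pad with spaces by default and accept an optional fill character. In the sketch below, `repr()` is used only so that the padding spaces are visible inside the quotes.

```python
print(repr('cat'.rjust(7)))       # '    cat'
print(repr('cat'.ljust(7)))       # 'cat    '
print(repr('cat'.center(7)))      # '  cat  '

# A second argument sets the fill character
print('cat'.center(7, '*'))       # **cat**
print('cat'.ljust(7, '.'))        # cat....
```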

Method | Purpose
-------|--------
.strip()     | Remove the given characters (whitespace by default) from both ends of the string
.rstrip()     | Remove the given characters (whitespace by default) from the right end of the string
.lstrip()     | Remove the given characters (whitespace by default) from the left end of the string
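
Here is a small sketch of the stripping methods with made-up sample strings. One detail that often surprises people: the argument is treated as a set of characters to remove, not as a whole substring.

```python
# With no argument, whitespace is removed
print(repr('  cat  '.strip()))       # 'cat'

# With an argument, only those characters are removed
print('xxcatxx'.rstrip('x'))         # xxcat
print('xxcatxx'.lstrip('x'))         # catxx

# The characters can appear in any order at the ends
print('abcHelloabc'.strip('cba'))    # Hello
```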

# String Formatting

Method | Purpose
-------|--------
.format()     | Return a formatted version of a string

In [None]:
help(str.format)

In [None]:
boilerplate = '''Catwoman is a complex character with many plot lines. Several women have
used the name Catwoman, including: {}, {}, and {}.
'''.format('Selina Kyle', 'Holly Robinson', 'Eiko Hasigawa')

print(boilerplate)

In [None]:
positionals = '{1} {0}'.format('last', 'first')

print(positionals)

In [None]:
headline = '''In a battle of the superheroes between {0} and {1},
in this round, {0} clearly came out on top, getting away with 
the jewels. {0} snuck away in the dark of night.'''.format('Catwoman', 'Batman')

print(headline)

In [None]:
# Alignment

left_align = '{:20}'.format('Selina')
print(left_align)

left_align = '{:<20}'.format('Selina')
print(left_align)

right_align = '{:>20}'.format('Holly')
print(right_align)

center_align = '{:^20}'.format('Eiko')
print(center_align)

In [None]:
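# Padding: the character before the alignment symbol is used as the fill

left_pad = '{:*<20}'.format('Selina')    # left-aligned, filled with * on the right
print(left_pad)

right_pad = '{:_>20}'.format('Kyle')     # right-aligned, filled with _ on the left
print(right_pad)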

In [None]:
# Truncation

trunc = '{:.6}'.format('Selina Kyle')
print(trunc)

# Combining them all together

trunc_pad = '{:->20.6}'.format('Selina Kyle')
print(trunc_pad)

In [None]:
# Numbers

print('{:d}'.format(42))

print('{:f}'.format(2.71))

print('{:20d}'.format(42))

print('{:6.2f}'.format(2.718281828459045))

print('{:+d}'.format(42))

print('{:d}'.format(-42))

In [None]:
# Named Placeholders

name = '{first} {last}'.format(first='Selina', last='Kyle')

print(name)
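
Named placeholders can be combined with the alignment and number formats shown earlier, which is handy for lining up columns. The sketch below is one possible way to do it; the template `line_format` and the sample values are hypothetical.

```python
# name: left-aligned in 10 characters
# count: right-aligned integer in 5 characters
# value: right-aligned in 10 characters with 2 decimal places
line_format = '{name:<10}|{count:>5d}|{value:>10.2f}'

first_line = line_format.format(name='Selina', count=3, value=1250.5)
second_line = line_format.format(name='Holly', count=12, value=87.25)

print(first_line)     # Selina    |    3|   1250.50
print(second_line)    # Holly     |   12|     87.25
```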

# Want to learn more?

https://pyformat.info