# Agenda

- Slices
- Strings are immutable
- Methods

# Strings

We can create strings with quotes (either single or double). The strings can contain any characters we want from Unicode (which is a fancy way of saying: all languages + emojis + icons + flags). We can also have special characters, such as `\n` (newline) or `\t` (tab). We write each of those with two characters (backslash + something), but it really takes up just one character in the string.

In [1]:
s = 'abcd\nefgh'   # abcd + newline + efgh

print(s)

abcd
efgh


In [2]:
len(s)   # how many characters?

9

In [3]:
s = 'a  b  c  d'
len(s)

10

In [4]:
# spaces are characters, too -- they take up space!

In [5]:
# when I get input from the user, I'm getting a string

s = input('Enter something: ')



Enter something: this is something


In [6]:
# I can retrieve a character from that string

s[5]   # character #6, because character #0 is the first one

'i'

In [7]:
s = '\n'
len(s)

1

In [8]:
# if you want a literal backslash, then you need to escape it with ... a backslash

s = 'abcd\\nefgh'
print(s)


abcd\nefgh


In [9]:
len(s)

10

In [10]:
# if you're using Windows, and if you want to work with files, then
# you'll undoubtedly have all sorts of filenames and paths like this:

filename = 'c:\abcd\efgh\ijkl.txt'

print(filename)

c:bcd\efgh\ijkl.txt


In [11]:
# \a is also a special character, the "alarm bell"
# what can we do?

# option 1: double the backslash, so that \a becomes \\a then the \ isn't special any more
filename = 'c:\\abcd\\efgh\\ijkl.txt'

# now we can be sure that we have the literal backslashes, and not special characters
print(filename)


c:\abcd\efgh\ijkl.txt


In [13]:
# why work so hard?
# Python provides "raw strings", in which every backslash is taken literally
# this is perfect for working with Windows filenames

# a raw string has r before the opening quote

filename = r'c:\abcd\efgh\ijkl.txt'   # thanks to the opening r, each \ is treated as if it were \\
print(filename)


c:\abcd\efgh\ijkl.txt
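
As a quick check of the raw-string idea, here is a small sketch comparing the doubled-backslash spelling with the raw-string spelling. The variable names `doubled` and `raw` are my own, chosen just for this illustration.

```python
# Two spellings of the same Windows path (variable names are illustrative)
doubled = 'c:\\abcd\\efgh\\ijkl.txt'   # each \\ is one literal backslash
raw = r'c:\abcd\efgh\ijkl.txt'         # raw string: backslashes are kept as-is

print(doubled == raw)   # True: the two strings are identical
print(len(raw))         # each backslash counts as exactly one character
```

Either spelling works; the raw string is usually easier to read when there are many backslashes.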


In [14]:
# when we use print, we're asking Python to display things on the screen
# this means that \n becomes an actual newline, etc.

s = 'abcd\nefgh'
print(s)

abcd
efgh


In [15]:
# so what happens in Jupyter when I just type s?  What am I seeing?
s

'abcd\nefgh'

In [16]:
# that's known as the "printed representation" of the data
# it's really meant for programmers to see precisely what the value is
# the quotes show us that we have a string
# the \n shows us it's a newline character, without actually moving to a new line
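
The built-in `repr` function returns that same representation as a string, so we can put it next to what `print` normally shows. This is just a sketch to make the difference visible outside of Jupyter as well.

```python
s = 'abcd\nefgh'

print(s)         # shows the text itself: the newline splits it into two lines
print(repr(s))   # shows the representation: quotes and a visible \n

# repr(s) is longer than s, because it spells out the quotes
# and writes the newline as two characters (backslash + n)
print(len(s), len(repr(s)))
```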

In [17]:
# I can use an index to retrieve any character I want from a string
# what if I want multiple characters?

# that's where a "slice" comes in

In [18]:
s = 'abcdefghijklmnopqrstuvwxyz'

s[10]  # this means: give me index 10 (i.e., character #11)

'k'

In [23]:
s[10:20]  # this means: give me the slice starting at index 10, until (not including) index 20

'klmnopqrst'

In [21]:
# the syntax is always: [start:end+1]  
# meaning, the end index is always 1 beyond what we'll see

s[0:5] # start at index 0, go up to (and not including) index 5

'abcde'

In [22]:
s[100]   # this index is too big!

IndexError: string index out of range

In [24]:
# some slices

s[10:20]   # start at index 10, up to and not including index 20

'klmnopqrst'

In [25]:
s[:20]    # start at the beginning (index 0), up to and not including index 20

'abcdefghijklmnopqrst'

In [26]:
s[20:]   # start at index 20, go through the end of the string

'uvwxyz'

In [27]:
# remember, if we were to specify index 25 (i.e., 'z') as the end, the slice would go up to and not including it
s[20:25]

'uvwxy'

In [28]:
# it turns out, though, that slices are forgiving if you go beyond the edge
s[20:2000]

'uvwxyz'
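
To pull the slice rules together, here is a short sketch that checks a few of them against the same alphabet string. Nothing here is new; it only restates the cells above as comparisons.

```python
s = 'abcdefghijklmnopqrstuvwxyz'

# a slice from start to end contains end - start characters
print(len(s[10:20]))            # 20 - 10 = 10 characters

# leaving out the start or the end means "from the beginning" / "to the end"
print(s[:5] + s[5:] == s)       # the two pieces rebuild the original string

# an index that is too big is an error, but a slice just stops at the edge
print(s[20:2000] == s[20:])
print(repr(s[100:200]))         # an empty string, not an IndexError
```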

In [29]:
# other string things
# search in a string with "in" operator

'j' in s

True

In [30]:
'!' in s

False

In [31]:
'fgh' in s  # this means: is the sequence 'fgh' in s?

True

In [32]:
'fhy' in s

False

# Strings are immutable

That is: You can never change a string once you have defined it.

This *doesn't* mean that you're prevented from assigning a new string value to a variable that already referred to a string. Rather, it means that you can't modify the string itself: anything that looks like a change actually creates a new string.

In [33]:
s[0] = '!'     # can I assign a new character to s at index 0?

TypeError: 'str' object does not support item assignment

In [34]:
s = 'abcde'
s = s + 'fghij'   # isn't this modifying a string?  NO: it creates a new string and assigns it to s

In [35]:
s

'abcdefghij'

In [36]:
s += '!'   # doesn't this change the string?  No: it also creates a new string and assigns it to s
s

'abcdefghij!'
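
One way to convince ourselves that `+=` builds a new string is to keep a second variable pointing at the old one. In this sketch, `original` is a name I'm adding just for the demonstration.

```python
s = 'abcde'
original = s      # a second name for the very same string

s += '!'          # builds a new string and makes s refer to it

print(s)          # abcde!
print(original)   # abcde -- the original string never changed
```

If strings could be modified in place, both names would show the exclamation point.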

# Exercise: Pig Latin

1. Ask the user to enter a word. (One word, all lowercase, no punctuation, no spaces.)
2. Print the word's translation into Pig Latin.

The rules of Pig Latin:
- Check the first letter of a word.
- If the first letter is a vowel (a, e, i, o, or u) then add `way` to the word.
- In all other cases, move the first letter to the end, and then add `ay`.

Some examples:
- `table` -> `abletay`
- `computer` -> `omputercay`
- `elephant` -> `elephantway`
- `papaya` -> `apayapay`


In [39]:
word = input('Enter a word: ')

# does the word start with a vowel?
if word[0] == 'a' or word[0] == 'e' or word[0] == 'i' or word[0] == 'o' or word[0] == 'u':
    print(word + 'way')

Enter a word: computer


In [44]:
# a tempting shortcut, which doesn't work:

word = input('Enter a word: ')

# each non-empty string ('e', 'i', ...) counts as True all by itself, so this is effectively
# if word[0] == 'a' or True or True or True or True:    which is always True!
if word[0] == 'a' or 'e' or 'i' or 'o' or 'u':
    print(word + 'way')
    
# why does it think that every word starts with a vowel?

Enter a word: computer
computerway
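
Here is a small sketch of what Python does with that condition. It relies on two facts: a non-empty string counts as true, and `or` hands back the first operand that counts as true. The variable `result` is only there so we can look at the value.

```python
word = 'computer'

# a non-empty string counts as True in an "if" or an "or"
print(bool('e'))    # True
print(bool(''))     # False: only the empty string counts as False

# word[0] == 'a' is False here, so "or" moves on and gives back 'e'
result = word[0] == 'a' or 'e' or 'i' or 'o' or 'u'
print(repr(result))
```

Since `'e'` counts as true, the `if` body runs no matter what the word is.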


In [45]:
# there is an even better way

word = input('Enter a word: ')

if word[0] in 'aeiou':
    print(word + 'way')  
#     print(f'{word}way')

# move the first letter to the end
# add ay
else:
    print(word[1:] + word[0] + 'ay')    # all but the first + the first + ay
#     print(f'{word[1:]}{word[0]}ay')

Enter a word: computer
omputercay
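
To check the rules against all four examples at once, we could wrap the same logic in a function. This goes a bit beyond what we've covered (`def` and `for` haven't come up yet), and the name `pig_latin` is my own invention, so treat it as a sketch.

```python
def pig_latin(word):
    """Translate one lowercase word, using the rules from the exercise."""
    if word[0] in 'aeiou':
        return word + 'way'              # starts with a vowel: just add "way"
    return word[1:] + word[0] + 'ay'     # otherwise: first letter to the end + "ay"

for word in ['table', 'computer', 'elephant', 'papaya']:
    print(word, '->', pig_latin(word))
```

Like the cell above, this assumes the user entered at least one character; `word[0]` on an empty string raises an `IndexError`.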


# Methods

So far, all of the verbs we've seen in Python have been functions:

- `len`
- `print`
- `input`
- `int`
- `str`

One of the problems with these functions is that you can accidentally call a function with an inappropriate argument. For example, `len(6)` gives us an error, because an integer has no length. There isn't any obvious connection between a function and the arguments it can take. You can look in the documentation, but the function itself won't tell you.

That's where *methods* come in. They are also verbs, but they are verbs connected to the data types. So it's much more obvious how they're supposed to be used.

They have a different syntax:

    FUNCTION(DATA)   # this is how we call a function
    DATA.METHOD()    # this is how we call a method
    
You can't call a method without indicating what data structure it's being run on. This means that if you get it wrong, the error will say that the method doesn't exist for that kind of data, rather than that the value you passed is incorrect.

There are many *many* more methods than functions in Python.   
    
This comes from the world of object-oriented programming, which some people have called "noun-oriented programming."  Data first, and verbs second -- as we see with methods.    
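
The two kinds of mistakes produce two different errors. The sketch below uses `try`/`except`, which we haven't covered, only so that both errors can be shown in one cell without stopping the program. The names `function_error` and `method_error` are mine.

```python
# wrong argument to a function: Python complains about the value we passed
try:
    len(6)
except TypeError as error:
    function_error = str(error)
    print('TypeError:', function_error)

# a string method on an integer: Python complains that the method isn't there
try:
    (6).strip()
except AttributeError as error:
    method_error = str(error)
    print('AttributeError:', method_error)
```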

In [47]:
# let's say that I ask the user to enter their name, and then print a greeting

name = input('Enter your name: ')

print(f'Hello, {name}!')

Enter your name:             Reuven           
Hello,             Reuven           !


In [48]:
name

'            Reuven           '

In [50]:
# what I'd like is to remove the whitespace from the start and end of the string
# we can do this with the str.strip method -- meaning, we can call 'strip' on a string

name.strip()   # this returns a new string, one based on name, but without any leading/trailing spaces

'Reuven'

In [51]:
# have I changed name? No, strings are immutable
name

'            Reuven           '

In [52]:
# how can I sort of change name? By assigning back to it:
name = name.strip()
name

'Reuven'
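
It's worth knowing that `str.strip` removes all kinds of whitespace from the edges, including tabs and newlines, and that it leaves whitespace in the middle alone. A quick sketch, with made-up values:

```python
name = '\t  Reuven \n'
cleaned = name.strip()           # tabs and newlines at the edges go away, too

print(repr(cleaned))
print(repr('  a b  '.strip()))   # the space between a and b is untouched
print(repr(name))                # name itself is unchanged, as always
```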

In [53]:
# strings have tons of methods
# you can see them in many editors, including Jupyter, by typing a string name, then . then tab
# or something similar

# name.<TAB>    <- press Tab right after the dot to see the list of methods
#                  (don't run the cell: a bare "name." is a syntax error)

In [54]:
s = 'aBcD eFgH'
s.lower()   # this returns a new string, based on s, with only lowercase letters

'abcd efgh'

In [55]:
s.upper()   # this returns a new string, based on s, with only capital letters

'ABCD EFGH'

In [56]:
s.capitalize()   # this returns a new string, based on s, with the first letter capitalized and the rest lowercase


'Abcd efgh'

In [57]:
s.title()  # capitalize the start of every word

'Abcd Efgh'

In [58]:
s.swapcase()  # this is the most useless method in all of Python

'AbCd EfGh'

In [59]:
# is method chaining a good idea?

name = '     Reuven     '
name.strip().lower()     

'reuven'

In [60]:
# or is it better to do it in parts?
name = name.strip()
name = name.lower()

# the answer: it depends... with strings, this is totally fine

In [61]:
# Python requires one command per line, normally
# you can get away with more if you have open parentheses
# a sneaky trick is to open parentheses just to split things up over several lines

s = '  aBcD eFgH   '

s.strip().lower()[2:4]

'cd'

In [62]:
# I can also do this, which is increasingly common and popular among data scientists:

(
    s
    .strip()
    .lower()
    [2:4]
)

'cd'

In [63]:
# we've seen something like this before:

x = input('Enter a number: ')

print(x + 5)


Enter a number: 3


TypeError: can only concatenate str (not "int") to str

In [65]:
# converting with int fixes that, but what if the user doesn't enter a number?

x = input('Enter a number: ')

n = int(x)    # get an integer based on x
print(n + 5)  # add that int + 5


Enter a number: hello


ValueError: invalid literal for int() with base 10: 'hello'

In [67]:
# we can check to see if a string can be turned into an integer
# I can run a string method, str.isdigit, which returns True
# if a string is non-empty and contains only digits

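# in other words: if str.isdigit returns True for a string of ordinary digits (0-9), then int() will work on it
# (the reverse isn't guaranteed: int() accepts some strings, such as '-123', for which str.isdigit returns False)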


x = input('Enter a number: ')

if x.isdigit():
    n = int(x)    # get an integer based on x
    print(n + 5)  # add that int + 5
else:
    print(f'Hey! {x} is not numeric!')

Enter a number: hello
Hey! hello is not numeric!


In [68]:
int('-123')   # this works fine, but str.isdigit() will return False

-123
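
If we wanted to accept negative numbers, one possible approach is to peel off a leading sign before calling `str.isdigit`. The function below is a hypothetical helper of my own (the name `looks_like_int` isn't part of Python), and it only aims to handle ordinary digits with an optional `+` or `-` in front.

```python
def looks_like_int(text):
    """Hypothetical helper: True for strings such as '123', '-123', or ' 42 '."""
    text = text.strip()
    if text.startswith('-') or text.startswith('+'):
        text = text[1:]          # drop one leading sign before checking the digits
    return text.isdigit()        # False for '' as well, so a lone '-' is rejected

print(looks_like_int('-123'))
print(looks_like_int('hello'))
print(looks_like_int('-'))
```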

# Exercise: Calculator

1. Ask the user to enter two integers, and assign them to two variables.
2. If both of the inputs are indeed integers, then print their sum.
3. If one or both is not, then print an error message for the user.

Example:

    Enter first number: 10
    Enter second number: 20
    10 + 20 = 30
    
    Enter first number: hello
    Enter second number: out there
    hello is not numeric
    out there is not numeric
    
Remember: You can have `if` statements inside of `if` blocks!

In [72]:
first = input('Enter first number: ').strip()
second = input('Enter second number: ').strip()

if first.isdigit() and second.isdigit():
    first = int(first)
    second = int(second)
    total = first + second

    print(f'{first} + {second} = {total}')
else:
    if not first.isdigit():
        print(f'{first} is not numeric!')
    if not second.isdigit():
        print(f'{second} is not numeric!')

Enter first number: asdfasfdsa
Enter second number: asdfasfdadsdfasfafd
asdfasfdsa is not numeric!
asdfasfdadsdfasfafd is not numeric!


In [None]:
# let's make it a more general-purpose calculator

first = input('Enter first number: ').strip()
second = input('Enter second number: ').strip()
op = input('Enter an operator: ').strip()

if first.isdigit() and second.isdigit():
    first = int(first)
    second = int(second)

    if op == '+':
        total = first + second
    elif op == '-':
        total = first - second
    elif op == '*':
        total = first * second
    elif op == '/':
        if second == 0:
            total = '(Cannot divide by zero)'   # avoid a ZeroDivisionError
        else:
            total = first / second
    else:
        total = '(Unsupported operator)'

    print(f'{first} {op} {second} = {total}')
else:
    if not first.isdigit():
        print(f'{first} is not numeric!')
    if not second.isdigit():
        print(f'{second} is not numeric!')
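
Because that cell depends on `input`, it's hard to try many combinations quickly. One way around that is to move the arithmetic into a function and call it directly. The function name `calculate` and its argument order are my own choices, and it returns a message instead of raising an error when asked to divide by zero.

```python
def calculate(first, op, second):
    """Hypothetical helper: apply op ('+', '-', '*', '/') to two integers."""
    if op == '+':
        return first + second
    if op == '-':
        return first - second
    if op == '*':
        return first * second
    if op == '/':
        if second == 0:
            return '(Cannot divide by zero)'
        return first / second        # true division, so the result is a float
    return '(Unsupported operator)'

print(calculate(10, '+', 20))
print(calculate(10, '/', 4))
print(calculate(10, '/', 0))
print(calculate(10, '%', 3))
```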