# Strings

## We can do things with strings

In Data 8 we have already seen some operations that can be done with strings.

In [1]:
first_name = "Franz"
last_name = "Kafka"
full_name = first_name + last_name
print(full_name)

FranzKafka


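Remember that computers don't understand context: Python joins the two strings exactly as written, so if we want a space between the names we have to add it ourselves.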

In [2]:
full_name = first_name + " " + last_name
print(full_name)

Franz Kafka
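As an aside, `+` is not the only way to put the pieces together. The sketch below uses the `join` method (covered again in Challenge 2) to place a space between the two names; the variable name `joined` is made up for this illustration.

```python
first_name = "Franz"
last_name = "Kafka"

# join places the separator (here a single space) between the items of the list
joined = " ".join([first_name, last_name])
print(joined)

# The result is the same string we built with + above
print(joined == first_name + " " + last_name)
```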


## Strings are made up of sub-strings

You can think of strings as a [sequence](https://github.com/dlab-berkeley/python-intensive/blob/master/Glossary.md#sequence) of smaller strings or characters. We can access a piece of that sequence using square brackets `[]`.

In [3]:
full_name[1]

'r'

<div class="alert alert-danger">
Don't forget, Python (like many other languages) starts counting from 0.
</div>

In [4]:
full_name[0]

'F'

In [5]:
full_name[4]

'z'
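One way to see which index belongs to which character is to print them side by side. This is only a small sketch; it uses the built-in `enumerate`, which has not been introduced in this lesson, and the loop variable names are arbitrary.

```python
first_name = "Franz"

# enumerate pairs each character with its index, starting at 0
pairs = list(enumerate(first_name))
for index, character in pairs:
    print(index, character)
```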

## You can slice strings using  `[ : ]`

If you want a range (or "slice") of a sequence, you get everything from the first index up to, but not including, the second index; i.e., Python slicing is *exclusive* of the end index:

In [6]:
full_name[0:4]

'Fran'

In [7]:
full_name[0:5]

'Franz'

You can see some of the logic for this when we consider implicit indices.

In [8]:
full_name[:5]

'Franz'

In [9]:
full_name[5:]

' Kafka'
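The two slices above fit together without overlap, which is one practical benefit of excluding the end index. A minimal sketch of that idea (the names `head` and `tail` are illustrative):

```python
full_name = "Franz Kafka"

head = full_name[:5]   # indices 0 through 4
tail = full_name[5:]   # index 5 through the end

# Because the end index is excluded, the two pieces never share a character
print(head + tail == full_name)

# The length of a slice is simply end minus start
print(len(full_name[2:7]) == 7 - 2)
```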

If we want to find out how long a string is, we can use the `len` function:

In [10]:
len(full_name)

11
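Since counting starts at 0, the last valid index is one less than the length. The sketch below checks this and shows what happens when we ask for the index equal to the length; the variable names are illustrative.

```python
full_name = "Franz Kafka"

length = len(full_name)
last_char = full_name[length - 1]   # the last valid index is length - 1
print(length, last_char)

# Asking for index `length` itself goes past the end of the string
try:
    full_name[length]
except IndexError as error:
    print("IndexError:", error)
```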

## Strings have methods

* There are other operations defined on string data. These are called **string [methods](https://github.com/dlab-berkeley/python-intensive/blob/master/Glossary.md#method)**. 
* Jupyter Notebook lets you use tab-completion after a dot ('.') to see what methods an [object](https://github.com/dlab-berkeley/python-intensive/blob/master/Glossary.md#object) (i.e., a defined variable) has to offer. Try it now!

In [14]:
str.

SyntaxError: invalid syntax (<ipython-input-14-58171b034407>, line 1)
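Tab-completion only works interactively; if the cell is actually run, `str.` on its own is a syntax error, as the output above shows. A rough equivalent that works anywhere is the built-in `dir` function. The sketch below filters out names that start with an underscore; `public_names` is just an illustrative variable name.

```python
# dir() lists the attribute names of an object; names that start with
# an underscore are internal, so we leave them out here
public_names = [name for name in dir(str) if not name.startswith("_")]
print(public_names)
```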

Let's look at the `upper` method. What does it do? Let's take a look at the documentation. Jupyter Notebooks let us do this with a question mark ('?') before *or* after an object (again, a defined variable).

In [15]:
str.upper?

So we can use it to convert a string to upper case.

In [16]:
full_name.upper()

'FRANZ KAFKA'

You have to use the parentheses at the end because `upper` is a method of the string class, and the parentheses are what actually call it.
<p></p>
<div class="alert alert-danger">
Don't forget: simply calling the method does not change the original variable. You must *reassign* the variable:
</div>

In [19]:
print(full_name)

Franz Kafka


In [20]:
full_name = full_name.upper()
print(full_name)

FRANZ KAFKA
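The same behavior can be sketched in a single cell. Strings cannot be changed in place, so `upper()` returns a new string and leaves the original alone. The names `shouted` and `before` are illustrative.

```python
full_name = "Franz Kafka"

shouted = full_name.upper()    # upper() returns a new string
before = full_name             # the original is still mixed case
print(before)
print(shouted)

full_name = full_name.upper()  # reassigning makes the name refer to the new string
print(full_name)
```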


For what it's worth, you don't need a variable to use the `upper()` method; you can call it on the string itself.

In [21]:
"Franz Kafka".upper()

'FRANZ KAFKA'

What do you think should happen when you call `upper` on an int? What about a string representation of an int?

In [22]:
1.upper()

SyntaxError: invalid syntax (<ipython-input-22-b26feabcdc28>, line 1)

In [23]:
"1".upper()

'1'
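The `SyntaxError` above arises because Python reads `1.` as the start of a decimal number before it ever looks for a method. If the integer is stored in a variable first, the code parses, and the failure becomes an `AttributeError` because integers have no `upper` method. A small sketch, with illustrative variable names:

```python
number = 1

# Integers have no upper method, so the attribute lookup fails
try:
    number.upper()
except AttributeError as error:
    print("AttributeError:", error)

# Converting to a string first works; digits have no case, so nothing changes
as_text = str(number).upper()
print(as_text)
```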

## Challenge 1: Write your name

1. Make two string variables, one with your first name and one with your last name.
2. Concatenate both strings to form your full name and [assign](https://github.com/dlab-berkeley/python-intensive/blob/master/Glossary.md#assign) it to a variable.
3. Assign a new variable that has your full name in all upper case.
4. Slice that string to get your first name again.

In [37]:
firstname ='Alleanna'
lastname = 'Clark'
fullname = firstname+ " " +lastname
fullname = fullname.upper()
print(fullname)
print(fullname[0:8])

ALLEANNA CLARK
ALLEANNA


## Challenge 2: Try seeing what the following string methods do:

* `split`
* `join`
* `replace`
* `strip`
* `find`

In [17]:
my_string = "It was a Sunday morning at the height of spring."
words = my_string.split()
words

['It', 'was', 'a', 'Sunday', 'morning', 'at', 'the', 'height', 'of', 'spring.']

In [24]:
my_string.find('was')

3

In [53]:
my_string.join?

In [50]:
my_string.join('!!')

'!It was a Sunday morning at the height of spring.!'

In [54]:
'-'.join(words)

'It-was-a-Sunday-morning-at-the-height-of-spring.'
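The two `join` results above look very different because `join` is called on the *separator*, and its argument is the sequence of pieces to glue together. A string passed as the argument is treated as a sequence of characters. The sketch below uses a short made-up list to show both cases.

```python
words = ["It", "was", "a", "Sunday"]

# The separator goes between the items of the list
dashed = "-".join(words)
print(dashed)

# A string is itself a sequence of characters, so "ab".join("!!")
# places "ab" between the two exclamation marks
between = "ab".join("!!")
print(between)
```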

In [55]:
my_string.split('t')

['I', ' was a Sunday morning a', ' ', 'he heigh', ' of spring.']

In [57]:
my_string.replace("a", '4', 1)

'It w4s a Sunday morning at the height of spring.'

In [58]:
my_string.strip(".")

'It was a Sunday morning at the height of spring'

In [59]:
my_string.strip()

'It was a Sunday morning at the height of spring.'
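`my_string.strip()` returned the string unchanged because, with no argument, `strip` removes whitespace from both ends and `my_string` has none there. A sketch with a made-up padded string shows the effect, along with the one-sided variants `lstrip` and `rstrip`:

```python
padded = "   It was a Sunday morning.  \n"

both = padded.strip()     # whitespace removed from both ends
left = padded.lstrip()    # left end only
right = padded.rstrip()   # right end only

# repr shows the quotes and any remaining whitespace explicitly
print(repr(both))
print(repr(left))
print(repr(right))
```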

## Challenge 3: Working with strings

Below is a string of Edgar Allan Poe's "A Dream Within a Dream":

In [2]:
poem = '''Take this kiss upon the brow!
And, in parting from you now,
Thus much let me avow —
You are not wrong, who deem
That my days have been a dream;
Yet if hope has flown away
In a night, or in a day,
In a vision, or in none,
Is it therefore the less gone?  
All that we see or seem
Is but a dream within a dream.

I stand amid the roar
Of a surf-tormented shore,
And I hold within my hand
Grains of the golden sand —
How few! yet how they creep
Through my fingers to the deep,
While I weep — while I weep!
O God! Can I not grasp 
Them with a tighter clasp?
O God! can I not save
One from the pitiless wave?
Is all that we see or seem
But a dream within a dream?'''

In [3]:
poem.strip("?")

'Take this kiss upon the brow!\nAnd, in parting from you now,\nThus much let me avow —\nYou are not wrong, who deem\nThat my days have been a dream;\nYet if hope has flown away\nIn a night, or in a day,\nIn a vision, or in none,\nIs it therefore the less gone?  \nAll that we see or seem\nIs but a dream within a dream.\n\nI stand amid the roar\nOf a surf-tormented shore,\nAnd I hold within my hand\nGrains of the golden sand —\nHow few! yet how they creep\nThrough my fingers to the deep,\nWhile I weep — while I weep!\nO God! Can I not grasp \nThem with a tighter clasp?\nO God! can I not save\nOne from the pitiless wave?\nIs all that we see or seem\nBut a dream within a dream'

In [4]:
poem.replace("?","")

'Take this kiss upon the brow!\nAnd, in parting from you now,\nThus much let me avow —\nYou are not wrong, who deem\nThat my days have been a dream;\nYet if hope has flown away\nIn a night, or in a day,\nIn a vision, or in none,\nIs it therefore the less gone  \nAll that we see or seem\nIs but a dream within a dream.\n\nI stand amid the roar\nOf a surf-tormented shore,\nAnd I hold within my hand\nGrains of the golden sand —\nHow few! yet how they creep\nThrough my fingers to the deep,\nWhile I weep — while I weep!\nO God! Can I not grasp \nThem with a tighter clasp\nO God! can I not save\nOne from the pitiless wave\nIs all that we see or seem\nBut a dream within a dream'

What is the difference between `poem.strip("?")` and `poem.replace("?", "")`?

`poem.strip("?")` removes question marks only from the beginning and end of the string, so here it drops just the final one; question marks in the middle are left alone. `poem.replace("?", "")` replaces every question mark in the string with the second argument, which in this case is the empty string.
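A short made-up string makes the contrast easier to see than the full poem:

```python
text = "?Why? Why not?"

stripped = text.strip("?")        # only the two ends are trimmed
replaced = text.replace("?", "")  # every occurrence is removed

print(stripped)
print(replaced)
```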

At what index does the word "*and*" first appear? Where does it last appear?

In [34]:
poem.find('And')


30

In [38]:
poem.rfind('And')

359

How can you answer the above accounting for upper- and lowercase?

You can convert the whole string to one case with the `.upper()` or `.lower()` method, then apply `.find()` (or `.rfind()`) to the result, searching for the word in that same case.
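A minimal sketch of that approach, using a made-up sentence instead of the poem. Keep in mind that `find` matches any substring, so in real text it would also match "and" inside words such as "hand" or "stand".

```python
text = "And so it began, and so it ended. And yet..."

# Lowercase a copy of the text so that "And" and "and" look the same
lowered = text.lower()

first = lowered.find("and")    # index of the first match, ignoring case
last = lowered.rfind("and")    # index of the last match, ignoring case
case_sensitive = text.find("and")

print(first, last, case_sensitive)
```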

## Challenge 4: Counting Text

Below is a string of Robert Frost's "The Road Not Taken":

In [None]:
poem = '''Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.'''

Using the `len` function and the string methods, answer the following questions:

How many characters (letters) are in the poem?

In [7]:
len(poem)

657
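Note that `len` counts every character in the string, including spaces, punctuation and line breaks, not only letters. If only alphabetic characters are wanted, one option is the string method `isalpha`; the sketch below uses a short made-up string and illustrative variable names.

```python
line = "Two roads,\ndiverged"

total = len(line)                            # every character, including "\n"
letters = sum(ch.isalpha() for ch in line)   # alphabetic characters only

print(total, letters)
```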

How many words?

In [44]:
split = poem.split()
split

['Take',
 'this',
 'kiss',
 'upon',
 'the',
 'brow!',
 'And,',
 'in',
 'parting',
 'from',
 'you',
 'now,',
 'Thus',
 'much',
 'let',
 'me',
 'avow',
 '—',
 'You',
 'are',
 'not',
 'wrong,',
 'who',
 'deem',
 'That',
 'my',
 'days',
 'have',
 'been',
 'a',
 'dream;',
 'Yet',
 'if',
 'hope',
 'has',
 'flown',
 'away',
 'In',
 'a',
 'night,',
 'or',
 'in',
 'a',
 'day,',
 'In',
 'a',
 'vision,',
 'or',
 'in',
 'none,',
 'Is',
 'it',
 'therefore',
 'the',
 'less',
 'gone?',
 'All',
 'that',
 'we',
 'see',
 'or',
 'seem',
 'Is',
 'but',
 'a',
 'dream',
 'within',
 'a',
 'dream.',
 'I',
 'stand',
 'amid',
 'the',
 'roar',
 'Of',
 'a',
 'surf-tormented',
 'shore,',
 'And',
 'I',
 'hold',
 'within',
 'my',
 'hand',
 'Grains',
 'of',
 'the',
 'golden',
 'sand',
 '—',
 'How',
 'few!',
 'yet',
 'how',
 'they',
 'creep',
 'Through',
 'my',
 'fingers',
 'to',
 'the',
 'deep,',
 'While',
 'I',
 'weep',
 '—',
 'while',
 'I',
 'weep!',
 'O',
 'God!',
 'Can',
 'I',
 'not',
 'grasp',
 'Them',
 'with',


In [46]:
words = len(split)
words

144

How many lines? (HINT: A line break is represented as `\n`.)

In [56]:
20

20

How many stanzas?

In [57]:
4

4
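The two answers above were typed in by hand. One way to compute them with string methods is sketched below on a short made-up text. It assumes stanzas are separated by exactly one empty line (two consecutive `\n` characters) and that blank lines should not count as lines of the poem.

```python
sample = "first line\nsecond line\n\nthird line\nfourth line"

# splitlines breaks the text at each line break; blank lines come back
# as empty strings, which we filter out
lines = [line for line in sample.splitlines() if line.strip()]

# A blank line between stanzas shows up as two line breaks in a row
stanzas = sample.split("\n\n")

print(len(lines), len(stanzas))
```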

How many unique words? (HINT: look up what a `set` is)

In [47]:
len(set(poem))

41
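Note that `set(poem)` iterates over the string one character at a time, so `len(set(poem))` counts unique *characters* rather than unique words. To count unique words, split the string first. A sketch on a short made-up phrase:

```python
text = "a dream within a dream"

unique_characters = set(text)       # iterating a string yields characters
unique_words = set(text.split())    # iterating a list of words yields words

print(len(unique_characters), len(unique_words))
```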

Remove commas and check the number of unique words again. Why is it different?

In [54]:
new_poem=poem.replace("," , "")
new_poem

'Take this kiss upon the brow!\nAnd in parting from you now\nThus much let me avow —\nYou are not wrong who deem\nThat my days have been a dream;\nYet if hope has flown away\nIn a night or in a day\nIn a vision or in none\nIs it therefore the less gone?  \nAll that we see or seem\nIs but a dream within a dream.\n\nI stand amid the roar\nOf a surf-tormented shore\nAnd I hold within my hand\nGrains of the golden sand —\nHow few! yet how they creep\nThrough my fingers to the deep\nWhile I weep — while I weep!\nO God! Can I not grasp \nThem with a tighter clasp?\nO God! can I not save\nOne from the pitiless wave?\nIs all that we see or seem\nBut a dream within a dream?'

In [55]:
len(set(new_poem))

40

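The counts differ because `split()` breaks the text only at whitespace, so a comma stays attached to the word before it: `'now,'` and `'now'` would count as two different words until the commas are removed. Note, however, that `set(poem)` and `set(new_poem)` iterate over individual characters, so the drop from 41 to 40 above reflects only the comma character itself disappearing; to compare unique words, apply `set` to `poem.split()` and `new_poem.split()`.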
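The effect of commas on a word count can be sketched with a made-up phrase rather than the poem:

```python
text = "wood, and wood"

with_commas = set(text.split())                      # "wood," and "wood" are distinct
without_commas = set(text.replace(",", "").split())  # both become "wood"

print(len(with_commas), len(without_commas))
```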