# Strings

Python strings are just pieces of text.

In [1]:
our_string = "Hello World!"

In [2]:
our_string

'Hello World!'

So far we know how to add them together.

In [3]:
"I said: " + our_string

'I said: Hello World!'

We also know how to repeat them multiple times.

In [4]:
our_string * 3

'Hello World!Hello World!Hello World!'

Python strings are [immutable](https://docs.python.org/3/glossary.html#term-immutable).
That's just a fancy way to say that
they cannot be changed in-place, and we need to create a new string to
change them. Even `some_string += another_string` creates a new string.
Python will treat that as `some_string = some_string + another_string`,
so it creates a new string but it puts it back to the same variable.
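Here is a minimal sketch of what that means in practice. The names `greeting` and `same_greeting` are made up for this illustration. If `+=` changed the string in-place, both names would show the new text, but only the one we reassigned does.

```python
greeting = "Hello"
same_greeting = greeting   # both names refer to the same string
greeting += " World!"      # builds a new string and stores it in greeting

print(greeting)            # Hello World!
print(same_greeting)       # Hello
```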

`+` and `*` are nice, but what else can we do with strings?

## Slicing

Slicing means getting a part of the string. Slice positions are counted
between the characters, starting from 0 before the first character. For
example, to get the characters between position 2 and position 5, we can do this:

In [5]:
our_string[2:5]

'llo'

So the syntax is like `some_string[start:end]`.

This picture explains how the slicing works:

![image.png](attachment:image.png)

But what happens if we slice with negative values?

In [6]:
our_string[-5:-2]

'orl'

It turns out that slicing with negative values simply starts counting
from the end of the string.

![Slicing with negative values](../images/slicing2.png)
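As a small sketch, a negative position can be turned into a positive one by adding the length of the string to it. This uses `len`, which is introduced later in this chapter, and the variable names are only for illustration.

```python
our_string = "Hello World!"
length = len(our_string)   # 12

negative = our_string[-5:-2]
positive = our_string[length - 5:length - 2]   # same as our_string[7:10]

print(negative)   # orl
print(positive)   # orl
```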

If we don't specify the beginning it defaults to 0, and if we don't
specify the end it defaults to the length of the string. For example, we
can get everything except the first or last character like this:

In [7]:
our_string[1:]

'ello World!'

In [8]:
our_string[:-1]


'Hello World'
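One way to check these defaults is to cut a string in two at the same position and glue the halves back together. This is just a sketch, and the names `first_part` and `rest` are made up for it.

```python
our_string = "Hello World!"

first_part = our_string[:5]   # same as our_string[0:5]
rest = our_string[5:]         # same as our_string[5:12]

print(first_part)                        # Hello
print(first_part + rest == our_string)   # True
```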

Remember that strings can't be changed in-place.

In [9]:
our_string[:5] = 'Howdy'

TypeError: 'str' object does not support item assignment
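If we want the changed text anyway, one possible approach is to build a new string out of the new beginning and a slice of the old string. The name `new_string` is only an example.

```python
our_string = "Hello World!"
new_string = "Howdy" + our_string[5:]   # new beginning + rest of the old string

print(new_string)   # Howdy World!
print(our_string)   # Hello World!  (the original is unchanged)
```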

There's also a step argument we can give to our slices, but I'm not
going to talk about it now.

## Indexing

So now we know how slicing works. But what happens if we forget the `:`?

In [10]:
our_string[1]

'e'

That's interesting. We got a string that is only one character long. But
the first character of `Hello World!` should be `H`, not `e`, so why did
we get an e?

In Python, as in most programming languages, counting starts at zero,
and so does string indexing. The first character is `our_string[0]`,
the second character is `our_string[1]`, and so on.

In [11]:
our_string[0]

'H'

In [12]:
our_string[1]

'e'

In [13]:
our_string[2]

'l'

In [14]:
our_string[3]

'l'

In [15]:
our_string[4]

'o'

So string indexes work like this:

![image.png](attachment:image.png)

How about negative values?

In [16]:
our_string[-1]

'!'

We got the last character.

But why didn't that start at zero? `our_string[-1]` is the last
character, but `our_string[1]` is not the first character!

That's because 0 and -0 are equal, so indexing with -0 would do the same
thing as indexing with 0. Negative indexes have to start at -1 instead.

Indexing with negative values works like this:

![Indexing with negative values](../images/indexing2.png)
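A short sketch that compares negative and positive indexes on our 12-character string. Each negative index picks the same character as the positive index we get by adding 12 to it.

```python
our_string = "Hello World!"

print(our_string[-1])    # !
print(our_string[11])    # !  (-1 + 12 == 11)
print(our_string[-12])   # H
print(our_string[0])     # H  (-12 + 12 == 0)
```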

## String methods

Python's strings have many useful methods.
[The official documentation](https://docs.python.org/3/library/stdtypes.html#string-methods)
covers them all, but I'm going to just show some of the most commonly
used ones briefly. Python also comes with built-in documentation about
the string methods and we can run `help(str)` to read it. We can also
get help about one string method at a time, like `help(str.upper)`.

Again, nothing can modify strings in-place. Most string methods
return a new string instead, so things like `our_string = our_string.upper()`
work because the new string is assigned to the old variable.
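As a small sketch, calling a method without storing the result leaves the variable as it was. The name `shouted` is made up for this example.

```python
our_string = "Hello World!"

our_string.upper()             # the new string is created and thrown away
print(our_string)              # Hello World!

shouted = our_string.upper()   # this time we keep the new string
print(shouted)                 # HELLO WORLD!
```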

Also note that all of these methods are used like `our_string.stuff()`,
not like `stuff(our_string)`. The idea is that the string itself knows
how to do all these things, so we don't need a separate function like
`stuff(our_string)` for each of them.
We'll learn more about methods [later](classes.md).

Here's an example with some of the most commonly used string methods:

In [17]:
our_string.upper()

'HELLO WORLD!'

In [18]:
our_string.lower()

'hello world!'

In [19]:
our_string.startswith('Hello')

True

In [20]:
our_string.endswith('World!')

True

In [21]:
our_string.endswith('world!')  # Python is case-sensitive

False

In [22]:
our_string.replace('World', 'there')

'Hello there!'

In [23]:
our_string.replace('o', '@', 1)   # only replace one o

'Hell@ World!'

In [24]:
'  hello 123  '.lstrip()    # left strip

'hello 123  '

In [25]:
'  hello 123  '.rstrip()    # right strip

'  hello 123'

In [26]:
'  hello 123  '.strip()     # strip from both sides

'hello 123'

In [27]:
'  hello abc'.rstrip('cb')  # strip c's and b's from right

'  hello a'

In [28]:
our_string.ljust(30, '-')

'Hello World!------------------'

In [29]:
our_string.rjust(30, '-')

'------------------Hello World!'

In [30]:
our_string.center(30, '-')

'---------Hello World!---------'

In [31]:
our_string.count('o')   # it contains two o's

2

In [32]:
our_string.index('o')   # the first o is our_string[4]

4

In [33]:
our_string.rindex('o')  # the last o is our_string[7]

7

In [34]:
'-'.join(['hello', 'world', 'test'])

'hello-world-test'

In [35]:
'hello-world-test'.split('-')

['hello', 'world', 'test']

In [36]:
our_string.upper()[3:].startswith('LO WOR')  # combining multiple things

True

The things in square brackets that the split method gave us and
we gave to the join method were lists. We'll talk more about
them [later](lists-and-tuples.md).
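The split and join methods are often used together. Here is a rough sketch that tidies up extra spaces. It relies on calling `split` without an argument, which splits at any whitespace, and the `sentence` and `words` names are made up for this example.

```python
sentence = "  Python   strings are   fun "

words = sentence.split()   # no argument: split at any whitespace
print(words)               # ['Python', 'strings', 'are', 'fun']

print(" ".join(words))     # Python strings are fun
```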

## String formatting

To add a string in the middle of another string, we can do something
like this:

In [37]:
name = 'Juan'
'My name is ' + name + '.'

'My name is Juan.'

But that gets complicated if we have many things to add.

In [38]:
location = 'office'
day = 'Wednesday'

In [39]:
"My name is " + name + " and I learn Python in the " + location + " every " + day + "."

'My name is Juan and I learn Python in the office every Wednesday.'

Instead it's recommended to use string formatting. It means putting
other things in the middle of a string.

Python has multiple ways to format strings. One is not necessarily
better than the others; they are just different. Here are a few ways to
solve our problem:

- `.format()`-formatting, also known as new-style formatting. This
    formatting style has a lot of features, but it's a little bit more
    typing than `%s`-formatting.

In [40]:
"Hello {}.".format(name)

'Hello Juan.'

In [41]:
"My name is {} and I learn Python in the {} every {}.".format(name, location, day)

'My name is Juan and I learn Python in the office every Wednesday.'

- `%s`-formatting, also known as old-style formatting. This has fewer
    features than `.format()`-formatting, but `'Hello %s.' % name` is
    shorter and faster to type than `'Hello {}.'.format(name)`. I like
    to use `%s` formatting for simple things and `.format` when I need
    more powerful features.

In [42]:
"Hello %s." % name

'Hello Juan.'

In [43]:
"My name is %s and I learn Python in the %s every %s." % (name, location, day)

'My name is Juan and I learn Python in the office every Wednesday.'

In the second example we had `(name, location, day)` on the right
side of the `%` sign. It was a tuple, and we'll talk more about them later.

If we have a variable that may be a tuple we need to wrap it in another
tuple when formatting:

In [44]:
thestuff = (1, 2, 3)

In [45]:
"we have %s" % thestuff

TypeError: not all arguments converted during string formatting

In [46]:
"we have %s and %s" % ("hello", thestuff)

'we have hello and (1, 2, 3)'

In [47]:
"we have %s" % (thestuff,)

'we have (1, 2, 3)'

Here `(thestuff,)` was a tuple that contained nothing but `thestuff`.

- f-strings need even less typing, but they are new in Python 3.6. **Use
    them only if you know that nobody will need to run your code on Python
    versions older than 3.6.** Here the f is short for "format", and the
    content of the string is the same as it would be with `.format()`, but
    we can use variables directly.

In [48]:
f"My name is {name} and I learn Python in the {location} every {day}."

'My name is Juan and I learn Python in the office every Wednesday.'

All of these formatting styles have many other features also:

In [49]:
'Three zeros and number one: {:04d}'.format(1)

'Three zeros and number one: 0001'

In [50]:
'Three zeros and number one: %04d' % 1

'Three zeros and number one: 0001'
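The same kind of format codes can also be used in f-strings, after a `:` inside the braces. The sketch below shows the zero-padding example as an f-string, plus rounding a float to two decimals in all three styles. The variables `number` and `pi` are made up for this example, and the f-string lines need Python 3.6 or newer.

```python
number = 1
print(f"Three zeros and number one: {number:04d}")   # Three zeros and number one: 0001

pi = 3.14159
print("{:.2f}".format(pi))   # 3.14
print("%.2f" % pi)           # 3.14
print(f"{pi:.2f}")           # 3.14
```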

If you need to know more about formatting I recommend reading
[this](https://pyformat.info/).

## Other things

We can use `in` and `not in` to check if a string contains another
string.

In [51]:
our_string = "Hello World!"

In [52]:
"Hello" in our_string

True

In [53]:
"Python" in our_string

False

In [54]:
"Python" not in our_string

True
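Like `endswith` earlier, `in` is case-sensitive. If the case shouldn't matter, one possible approach is to lowercase the string before checking, as in this small sketch:

```python
our_string = "Hello World!"

print("hello" in our_string)           # False, because of the uppercase H
print("hello" in our_string.lower())   # True
```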

We can get the length of a string with the `len` function. The name
`len` is short for "length".

In [55]:
len(our_string)   # 12 characters

12

In [56]:
len('')     # no characters

0

In [57]:
len('\n')    # Python treats \n as one character

1

We can convert between strings, integers and floats with
`str`, `int` and `float`. They aren't actually functions, but they
behave a lot like functions. We'll learn more about what they really
are [later](classes.md).

In [58]:
str(3.14)

'3.14'

In [59]:
float('3.14')

3.14

In [60]:
str(123)

'123'

In [61]:
int('123')

123
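These conversions matter because `+` does different things to strings and numbers. In the sketch below, `age_text` is a made-up variable that holds a number as text.

```python
age_text = "25"

print(age_text + "1")        # 251  (strings are joined together)
print(int(age_text) + 1)     # 26   (numbers are added)

# Convert back to a string before adding the number to other text.
print("Next year: " + str(int(age_text) + 1))   # Next year: 26
```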

Giving an invalid string to `int` or `float` produces an error
message.

In [62]:
int('lol')

ValueError: invalid literal for int() with base 10: 'lol'

In [63]:
float('hello')

ValueError: could not convert string to float: 'hello'

## Summary

- Slicing returns a new string that contains the part of the original
    string between two indexes. The indexes work like this:

![image.png](attachment:image.png)

- Indexing returns one character of a string. Remember that we don't
    need a `:` with indexing. The indexes work like this:

![image.png](attachment:image.png)

- Python has many string methods. Use
    [the documentation](https://docs.python.org/3/library/stdtypes.html#string-methods)
    or `help(str)` when you don't remember something about them.
- String formatting means adding other things to the middle of a string.
    There are multiple ways to do this in Python. You should know how to
    use at least one of these ways.
- The `in` keyword can be used for checking if a string contains another
    string.
- `len(string)` returns the string's length.
- We can use `str`, `int` and `float` to convert values to different
    types.

## Exercises

1. Create a program that asks the user to enter their name and their age. 
Print out a message addressed to them that tells them the year that they will turn 100 years old.

2. Add on to the previous program by asking the user for another number and printing out that many copies of the previous message.

3. Print out that many copies of the previous message on separate lines. (Hint: the string "\n" works like pressing the Enter key.)

4. This program is supposed to say something loudly. Fix it.

In [64]:
message = input("What do you want me to say? ")
message.upper
print(message, "!!!")

5. Clean the `ugly_mixed_case` string:
    - remove the whitespace from both ends,
    - capitalise the initial letter and make everything else lowercase,
    - replace the word bad with good,
    - concatenate the result with the second string, `concatenate`.

In [65]:
ugly_mixed_case = '   ThIS LooKs BAd '
concatenate = ', and this goes at the end'