<a id="strings"></a>
# Strings

Strings are text made up of *characters*: letters, numbers, punctuation, and some special characters called *escape characters*. 
Strings have operations and functions that combine, remove, search, and replace parts of text. 
The picture below shows the string `"hello"`. Each character has a *position* in the string, and the positions start counting from `0` rather than `1`.

<img src="string1.jpg" width="400">

## Creating strings
A string is text written between two double quotes `"` or two single quotes `'`. 
The same kind of quote must start and end the string. 

In [1]:
print("this is a double-quoted string.")
print('this is a single-quoted string.')

this is a double-quoted string.
this is a single-quoted string.


If you need to put a double quote in a string, you can use a single-quoted string. 
If you need to put a single quote in a string, you can use a double-quoted string.

In [2]:
print("this is a double-quoted string with a single quote ' in it.")
print('this is a single-quoted string with a double quote " in it.')

this is a double-quoted string with a single quote ' in it.
this is a single-quoted string with a double quote " in it.


### Escape characters
An *escape character* is a character that follows a *backslash* character `\` in a string. 
If you need to put both a double quote and a single quote in a string, you can use either a single-quoted or a double-quoted string
and put a backslash `\` in front of the quote that would otherwise end the string.

In [None]:
print("this double-quoted string has both a double quote \" and single quote ' inside.")
print('this single-quoted string has both a double quote " and single quote \' inside.')

The backslash gives a special meaning to the character that follows it. 
`\t` is the tab character, and `\n` is the character made by the enter key, called a [*newline*](#glossary_newline). 
Because the backslash `\` changes the meaning of the character that follows it, to really get a backslash you need to use `\\`.

In [None]:
print('this is the tab "\t" character')
print('this is the enter "\n" character')
print('this is the backslash "\\" character')
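
As a small sketch of the point above, the `len` function (described later in this section) can be used to check that an escape sequence is stored as a single character, even though it is typed as two. The variable names here are only for illustration.

```python
# Each escape sequence is typed as two characters but stored as one.
tab = "\t"
newline = "\n"
backslash = "\\"

print(len(tab))        # 1
print(len(newline))    # 1
print(len(backslash))  # 1

# "a\tb" holds three characters: a, tab, b
print(len("a\tb"))     # 3
```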

You can put a double quote in a double-quoted string by putting a backslash in front of it. The *escaped* double quote becomes part of the string instead of ending it. 

In [None]:
print("this is a double-quoted string with a double quote \" in it")

An escaped single quote can be put in a single-quoted string the same way.

In [None]:
print('this is a single-quoted string with a single quote \' in it')

Strings beginning and ending with three double quotes `"""` can go across lines. 

In [1]:
print("""
This string is on 
two lines.
""")


This string is on 
two lines.



### `input` function
The `input` function can print some text on the console and wait for you to type a string, then give that string as the value.

In [38]:
print("The text you typed is ", input('Enter some text: '))

Enter some text: sd
The text you typed is  sd


## Operations on strings
Strings can be combined with the `+` add operator. 

In [4]:
s = "Again and "
s = s + "again ..."
print("s is ", s)

s is  Again and again ...


Strings and numbers *cannot* be combined with the `+` operator.
This gives an error.

In [5]:
s = "abc" + 3

TypeError: can only concatenate str (not "int") to str

The number has to be converted to a string first.

In [6]:
s = "abc" + str(3)
print(s)

abc3


The `+=` shortcut works as well.

In [5]:
s = "Again and "
s += "again ..."
print("s is ", s)

s is  Again and again ...


Strings can be repeated with the `*` multiplication operator. 

In [6]:
s = "hello " * 3
print("s is ", s)

s is  hello hello hello 


The `*=` shortcut works as well.

In [24]:
s = "hello "
s *= 3
print("s is ", s)

s is  hello hello hello 


## Getting data from strings
Characters in strings can be accessed with `[` and `]`, where *string*`[`*position*`]` gives the character at *position* in the string. 
Python operators and functions count character positions from 0 rather than 1. 
A *position* counts from the beginning of the string, or, if it is negative, from the back of the string, so `-1` is the last character. 

In [10]:
s = "A red hat"
print("s starts as ",s)
print("positions:   012345678")
print("s[0] is ", s[0])
print("s[3] is ", s[3])
print("s[-1] is ", s[-1])

s starts as  A red hat
positions:   012345678
s[0] is  A
s[3] is  e
s[-1] is  t
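
The following sketch, using the same example string, illustrates how positive and negative positions relate to each other and what happens when a position is past the end of the string. The variable names are only for illustration.

```python
s = "A red hat"

# The last character can be reached from either end of the string.
last_from_front = s[len(s) - 1]
last_from_back = s[-1]
print(last_from_front == last_from_back)  # True

# A position past the end of the string is an error.
try:
    s[9]
    out_of_range = False
except IndexError:
    out_of_range = True
print("s[9] raised IndexError:", out_of_range)
```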


### Slices
Strings that are part of other strings are *slices* or *substrings*. 
The slice operator looks like *variable*`[`*start*`:`*end*`]` where the slice has all the characters from the character in the *start* position to the character *just before* the *end* position. 

A slice that has no *start* starts from the beginning of the string. 
A slice that has no *end* goes to the end of the string. 

In [11]:
s = "A red hat"
print("s starts as ", s)
print("positions:   012345678")
print("s[:1] is ", s[:1])
print("s[2:5] is ", s[2:5])
print("s[6:] is ", s[6:])

s starts as  A red hat
positions:   012345678
s[:1] is  A
s[2:5] is  red
s[6:] is  hat
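
Negative positions can be used in slices as well. The sketch below, with illustrative variable names, also shows that cutting a string at any position and joining the two slices gives back the original string.

```python
s = "A red hat"

# Negative positions count from the back of the string.
last_three = s[-3:]      # the last three characters
all_but_last = s[:-1]    # everything except the last character

# Splitting at any position k and joining the parts restores the string.
k = 5
rejoined = s[:k] + s[k:]

print(last_three)
print(all_but_last)
print(rejoined == s)
```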


### `center` function
The `center` function adds spaces to the beginning and end of a string so that it is centered in a given width.

In [10]:
print("|"+"middle".center(20)+"|")

|       middle       |


The character to use instead of a space can be specified.

In [11]:
print("|"+"middle".center(20, "*")+"|")

|*******middle*******|


### `count` function
The `count` function will count how many times a substring can be found in a string.

In [46]:
s = "the cat in the hat is at the store."
print("s starts as ", s)
print('s.count("the") is ', s.count("the"))
print('s.count("at") is ', s.count("at"))

s starts as  the cat in the hat is at the store.
s.count("the") is  3
s.count("at") is  3


### `find` function
The `find` function gives the first position where you can find a substring in a string. 
The link https://docs.python.org/3/library/stdtypes.html#string-methods gives a full list of string functions.

In [43]:
s = "A red hat"
print("s starts as ", s)
print("positions:   012345678")
print('s.find("red") is ', s.find("red"))

s starts as  A red hat
positions:   012345678
s.find("red") is  2
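
When the substring is not in the string, `find` gives `-1`. It also accepts an optional second argument, the position to start searching from. The sketch below uses both facts to collect every position where a substring occurs; the list `positions` and the loop are only one possible way to do this.

```python
s = "the cat in the hat is at the store."

positions = []
start = s.find("at")
# find gives -1 once there are no more matches
while start != -1:
    positions.append(start)
    # continue searching just after the match that was found
    start = s.find("at", start + 1)

print(positions)
print(s.find("dog"))
```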


### `float` function
If you have a string but would like to use it as a number, the `float` function will make a real number out of it.

In [37]:
s = "3.1"
x = float(s)
print("float(", s, ") is ", x)

float( 3.1 ) is  3.1


### `index` function
The `index` function returns the first position, starting at 0, where a substring occurs in a string. It works like `find`, but gives an error if the substring is not found.

In [12]:
print("one two three".index("two"))

4
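
As a brief sketch of how `index` and `find` differ when the substring is missing: `index` raises a `ValueError`, while `find` gives `-1`. The variable names are only for illustration.

```python
text = "one two three"

# find reports a missing substring with -1
found = text.find("four")

# index reports a missing substring with an error
try:
    position = text.index("four")
except ValueError:
    position = None

print(found)
print(position)
```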


### `int` function
If you have a string but would like to use it as a number, the `int` function will make an integer out of it.

In [35]:
s = "5"
x = int(s)
print("int(", s, ") is ", x)

int( 5 ) is  5


If the string is not a number, you get an error.

In [47]:
s = "invalid"
x = int(s)
print("int(", s, ") is ", x)

ValueError: invalid literal for int() with base 10: 'invalid'
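
One way to handle this error, sketched here under the assumption that a fallback value is acceptable, is to catch the `ValueError`. The helper `to_int` is a hypothetical name, not a built-in function.

```python
def to_int(text, default=None):
    """Return int(text), or default if text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default

print(to_int("5"))
print(to_int("invalid"))
print(to_int("invalid", 0))
```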

### `len` function
The `len` function gives the length of a string, which is the number of characters in it.

In [15]:
s = "A red hat"
length = len(s)
print("len(", s, ") is ", length)

len( A red hat ) is  9


### `lower` function
The `lower` function gives a copy of a string with all letters changed to small letters or [*lowercase*](#glossary_lowercase).

In [27]:
s = "THE CAT IN THE HAT IS AT THE STORE"
print("s starts as ", s)
print("s.lower() is ", s.lower())

s starts as  THE CAT IN THE HAT IS AT THE STORE
s.lower() is  the cat in the hat is at the store


### `reversed` function
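The `reversed` function gives the characters of a string in reverse order. It does not change the string itself, so the characters have to be joined back together with `"".join` to get the reversed string.

In [None]:
s = "abcdef"
r = "".join(reversed(s))
print(r)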
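
The sketch below compares two common ways to get a reversed copy of a string: joining the result of `reversed`, and a slice with a step of `-1`. The step form of the slice is not covered above and is shown only as an alternative. In both cases the original string is left unchanged.

```python
s = "abcdef"

# reversed gives the characters one at a time; join them into a string
joined = "".join(reversed(s))

# a slice with step -1 walks the string backwards
sliced = s[::-1]

print(joined)
print(sliced)
print(s)  # the original string is unchanged
```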

### `str` function 
If you have a number, the `str` function will let you use it as a string. This is taking str() of an integer.

In [31]:
x = 5
s = str(x)
print("str(", x, ") is ", s)

str( 5 ) is  5


This is taking str() of a real number.

In [32]:
x = 3.1
s = str(x)
print("str(", x, ") is ", s)

str( 3.1 ) is  3.1


### `upper` function
The `upper` function gives a copy of a string with all letters changed to capitals or [*uppercase*](#glossary_uppercase).

In [26]:
s = "the cat in the hat is at the store"
print("s starts as ", s)
print("s.upper() is ", s.upper())

s starts as  the cat in the hat is at the store
s.upper() is  THE CAT IN THE HAT IS AT THE STORE


## String tests
Various functions test conditions on strings.

### `endswith` function
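The `endswith` function will check whether a string ends with a substring. In the example below both checks are `False`, because the string ends with `"store."`, including the period.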

In [6]:
s = "the cat in the hat is at the store."
print('s.endswith("store") is ', s.endswith("store"))
print('s.endswith("hat") is ', s.endswith("hat"))

s.endswith("store") is  False
s.endswith("hat") is  False


### `in` test
`in` checks if a substring is present in a string.

In [13]:
print("two" in "one two three")
print("four" in "one two three")

True
False


### `isalpha` function
`str.isalpha()` is true if `str` only contains letters, `a` to `z` or `A` to `Z`.

In [4]:
print("onlyletters".isalpha())
print("not only letters".isalpha())

True
False


### `isdigit` function
`str.isdigit()` is true if `str` only contains digits `0` to `9`.

In [5]:
print("1776".isdigit())
print("the answer is 42".isdigit())

True
False


### `isalnum` function
`str.isalnum()` is true if `str` only contains digits `0` to `9` or letters `a` to `z`, `A` to `Z`.

In [9]:
print("abc123".isalnum())
print("has-non-alphanumerics".isalnum())

True
False


### `islower` function
`str.islower()` is true if `str` has at least one letter and all of its letters are lowercase, `a` to `z`.

In [10]:
print("lowercase".islower())
print("notAllLowercase".islower())

True
False


### `isspace` function
`str.isspace()` is true if `str` only contains whitespace characters, such as spaces, tabs, and newlines.

In [11]:
print("    ".isspace())
print("not all spaces".isspace())

True
False


### `isupper` function
`str.isupper()` is true if `str` has at least one letter and all of its letters are uppercase, `A` to `Z`.

In [12]:
print("UPPERCASE".isupper())
print("notAllUppercase".isupper())

True
False


### `not in` test
`not in` checks if a substring is not present in a string.

In [15]:
print("two" not in "one two three")
print("four" not in "one two three")

False
True


### `startswith` function
The `startswith` function will check whether a string begins with a substring.

In [14]:
s = "the cat in the hat is at the store."
print('s.startswith("the") is ', s.startswith("the"))
print('s.startswith("cat") is ', s.startswith("cat"))

s.startswith("the") is  True
s.startswith("cat") is  False


## Functions that change strings
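Various functions make a changed copy of a string. A string cannot be changed in place, so these functions return a new string and leave the original as it was.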

### `replace` function
The `replace` function finds every place where a substring occurs in a string and replaces it with another string.

In [1]:
s = "the red hat"
print('s.replace("red", "blue") is ', s.replace("red", "blue"))

s.replace("red", "blue") is  the blue hat
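
A short sketch of two details of `replace`: the original string is not modified, and an optional third argument limits how many occurrences are replaced. The example string and variable names are made up for illustration.

```python
s = "the red hat and the red coat"

# replace every occurrence
all_replaced = s.replace("red", "blue")

# replace only the first occurrence
first_only = s.replace("red", "blue", 1)

print(all_replaced)
print(first_only)
print(s)  # the original string is unchanged
```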


### `strip` function
The `strip` function removes spaces, tabs, and *newlines* from the beginning and end of a string. Lines read from a text file will end with a newline. Printing such a line without stripping it will add an extra blank line, because `print` adds a newline of its own.

<img src="strip1.jpg" width="400">

In [40]:
s = "line from file\n"
print(s, "before stripping")
print(s.strip(), "after stripping")

line from file
 before stripping
line from file after stripping
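
The related functions `lstrip` and `rstrip` strip only one side of the string. The sketch below, with an example line that also has leading spaces, shows the difference. Passing `"\n"` to `rstrip` removes only trailing newlines and keeps other whitespace.

```python
s = "  line from file\n"

both = s.strip()               # strips both ends
left = s.lstrip()              # strips the beginning only
right = s.rstrip()             # strips the end only
only_newline = s.rstrip("\n")  # strips only trailing newlines

print("[" + both + "]")
print("[" + right + "]")
print("[" + only_newline + "]")
```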


## Other operations on strings
A `for` loop can go through the characters of a string one at a time. This example loops through the letters in the word "banana".

In [None]:
for x in "banana":
    print(x)
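
As a further sketch, a loop over a string can be combined with the `in` test described earlier, for example to count the vowels in a word. The variable names are only for illustration.

```python
word = "banana"

vowel_count = 0
for letter in word:
    # "in" checks whether the letter is one of the vowels
    if letter in "aeiou":
        vowel_count += 1

print(vowel_count)
```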

## Formatting strings -- the `format` function
The `format` function can make output look just the way you want. 
The `format` function looks like this.

&nbsp;&nbsp;&nbsp;&nbsp;`'{:`*format*`} {:`*format*`} ...'.format(`*item*, *item*`)`

where each *format* is one of the formats described below, and each pair of braces is filled in by the matching *item*. 

This is an example of a `format` output that includes more than one *format*.

&nbsp;&nbsp;&nbsp;&nbsp;`'{:17s}  ${:7.2f}'.format(name, total)`
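
A runnable sketch of this format string is shown below. The list `items` and its contents are made up for illustration; only the format string comes from the example above. Each name is padded to 17 characters and each total is shown 7 characters wide with 2 digits after the decimal point, so the columns line up.

```python
# Hypothetical data: (name, total) pairs
items = [("coffee", 3.5), ("sandwich", 7.25), ("tea", 2.0)]

lines = []
for name, total in items:
    # name: at least 17 characters wide, padded on the right
    # total: at least 7 characters wide, 2 digits after the decimal point
    lines.append('{:17s}  ${:7.2f}'.format(name, total))

for line in lines:
    print(line)
```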

The `:3d` format will make a string out of an integer where the string is at least 3 characters wide, with spaces on the left if the integer is less than 3 characters long.

In [22]:
s ='{:3d}'.format(2)
t ='{:3d}'.format(65)
u ='{:3d}'.format(138)
print(s, ' is the number 2 with format ":3d"')
print(t, ' is the number 65 with format ":3d"')
print(u, ' is the number 138 with format ":3d"')

  2  is the number 2 with format ":3d"
 65  is the number 65 with format ":3d"
138  is the number 138 with format ":3d"


The `:<3d` format will make a string out of an integer where the string is at least 3 characters wide, with spaces on the right if the integer is less than 3 characters long.

In [21]:
s ='{:<3d}'.format(2)
t ='{:<3d}'.format(65)
u ='{:<3d}'.format(138)
print(s, ' is the number 2 with format ":<3d"')
print(t, ' is the number 65 with format ":<3d"')
print(u, ' is the number 138 with format ":<3d"')

2    is the number 2 with format ":<3d"
65   is the number 65 with format ":<3d"
138  is the number 138 with format ":<3d"


The `:^3d` format will make a string out of an integer where the string is at least 3 characters wide but with the number in the middle if the integer is less than 3 characters long.

In [20]:
s ='{:^3d}'.format(2)
t ='{:^3d}'.format(65)
u ='{:^3d}'.format(138)
print(s, ' is the number 2 with format ":^3d"')
print(t, ' is the number 65 with format ":^3d"')
print(u, ' is the number 138 with format ":^3d"')

 2   is the number 2 with format ":^3d"
65   is the number 65 with format ":^3d"
138  is the number 138 with format ":^3d"


For real numbers, the `:.2f` format will make a string out of a real number where the string has exactly 2 digits after the decimal point, with zeros on the right if the fraction of the real number has fewer than 2 digits.

In [41]:
s ='{:.2f}'.format(4)
t ='{:.2f}'.format(1.9)
u ='{:.2f}'.format(8.26)
print(s, ' is the 4 with format ":.2f"')
print(t, ' is the 1.9 with format ":.2f"')
print(u, ' is the 8.26 with format ":.2f"')

4.00  is the 4 with format ":.2f"
1.90  is the 1.9 with format ":.2f"
8.26  is the 8.26 with format ":.2f"


For strings, the `:5s` format will make a string at least 5 characters wide, with spaces on the right if the string is less than 5 characters long. 
The `:>5s` format will make the string at least 5 characters wide but with spaces on the left, and `:^5s` will make the string at least 5 characters wide with the text in
the center of the string.

In [17]:
s ='{:5s}'.format("one")
t ='{:>5s}'.format("one")
u ='{:^5s}'.format("one")
v ='{:^5s}'.format("eight")
print(s, ' is the "one" with format ":5s"')
print(t, ' is the "one" with format ":>5s"')
print(u, ' is the "one" with format ":^5s"')
print(v, ' is the "eight" with format ":^5s"')

one    is the "one" with format ":5s"
  one  is the "one" with format ":>5s"
 one   is the "one" with format ":^5s"
eight  is the "eight" with format ":^5s"
