## Lecture 7

The objectives of this lecture are to:

1. Learn how Python handles text -- the `str` object.
2. Print text -- user output.
3. Query text -- user input.

# The `str` object

Creation and manipulation of text is unavoidable when writing computer programs. One of the strengths of Python, compared to other programming languages, is that these tasks are relatively simple. For example, Python does not distinguish between a single character `'a'` and a "string" of characters `"abc"`. These are both considered strings and may be represented as values of type `str`,

In [1]:
char = "a"
string = "abc"

print(type(char), type(string))

<class 'str'> <class 'str'>


Consequently, Python allows the use of either single or double quotes to create string values, as long as the opening and closing quotes match,

In [2]:
'This will work'
"This will also work"
"This will not work'

SyntaxError: EOL while scanning string literal (<ipython-input-2-9a5cb8d0a1c8>, line 3)

The length of a string is not limited, other than by the memory accessible by the interpreter. The shortest string is the *empty string* and we may determine the length (in number of characters) using the built-in `len()`,

In [3]:
empty_string = ""
not_empty_string = "Hello! "

print(len(empty_string), len(not_empty_string))

0 7


Strings are *sequences* of characters, which allows us to do many things that we do not yet have the background to understand. For now we will learn a subset of string manipulations that do not require a detailed understanding of sequences.

### Operations on Strings

Initially, we will learn several basic operations on strings:

* Concatenation -- creating a single string out of many.
* Repetition -- creating a single string out of multiples of another single string.
* Conversion -- converting a string to another value and vice versa.

The syntax for concatenation of strings is intuitive and involves the `+` operator,

In [4]:
"Nikola" + "Tesla"

'NikolaTesla'

Concatenation maintains the order of the original strings in the newly created one,

In [5]:
# I forgot a " "
"Nikola" + " " + "Tesla"

'Nikola Tesla'

We may use the `*` operator to repeat a string an integer number of times,

In [6]:
"METALLICA! " * 10

'METALLICA! METALLICA! METALLICA! METALLICA! METALLICA! METALLICA! METALLICA! METALLICA! METALLICA! METALLICA! '

Typically this is done with a positive integer value; zero and negative integers are allowed but not especially useful, since both produce the empty string,

In [7]:
"More cowbell!" * 0

''

In [8]:
"Mighty Morphin Power Rangers" * -10

''
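Repetition becomes more useful when the count comes from another value. The following is a minimal sketch (the variable names `title` and `underline` are made up for illustration) that combines `*` with `len()` to underline a title with exactly as many dashes as it has characters,

```python
# Hypothetical example: underline a title with a row of dashes
title = "Lecture 7"
underline = "-" * len(title)  # one dash per character in the title

print(title)
print(underline)
```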

Frequently we would like to convert a string to another value or vice versa. This is accomplished using built-in functions for each type we are interested in,

In [9]:
number_of_cookies = 3

# Wrong way! Why?
message = "There are " + number_of_cookies + " cookies."

print(message)

TypeError: Can't convert 'int' object to str implicitly

In [10]:
# Correct way, convert from an int to str first!
message = "There are " + str(number_of_cookies) + " cookies."

print(message)

There are 3 cookies.


The opposite conversion can be done, as well. This is very convenient when inputting data from the user or a file, which will always be string-valued,

In [11]:
input_string = "4.56893"

input_value = float(input_string)

input_value

4.56893

This functionality is not "smart"; the string must be directly convertible without pre-processing,

In [14]:
input_string = "I want 3 cheeseburgers!"

number_of_cheeseburgers = int(input_string)

print(number_of_cheeseburgers)

ValueError: invalid literal for int() with base 10: 'I want 3 cheeseburgers!'
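For contrast, here is a short sketch of strings that *are* directly convertible. The values are made up for illustration; the point is that the string may contain only the number itself, although surrounding whitespace is tolerated and `float()` also accepts scientific notation,

```python
# Only the number itself: converts cleanly
number_of_cheeseburgers = int("3")

# Leading and trailing whitespace (including a newline) is ignored
padded_value = int("  42\n")

# float() also understands scientific notation
large_value = float("1e3")

print(number_of_cheeseburgers, padded_value, large_value)
```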

### Special Characters

Frequently you will need to create strings that contain characters that are part of Python syntax. This makes it impossible for the interpreter to tell where the string ends,

In [15]:
string = "Have you heard of the book "Charlotte's Web" ?"

SyntaxError: invalid syntax (<ipython-input-15-4b1ed0bc5bdf>, line 1)

In this case, we want to use double quotations both to create a string and as character values within the string. When we want to use special characters as values, we must use the *escape character* which is a backslash `\` in Python,

In [17]:
string = "Have you heard of the book \"Charlotte\'s Web\"?"

print(string)

Have you heard of the book "Charlotte's Web"?

When we use the escape character to represent a special character, the combination is called an *escape sequence*. Each escape sequence represents a single character, and several of them are related to multiline strings (next section):

|**Escape Sequence**|**Description**|
|-------------------|---------------|
|`\'`               |single quote   |
|`\"`               |double quote   |
|`\\`               |backslash      |
|`\t`               |tab            |
|`\n`               |newline        |
|`\r`               |carriage return|


In [18]:
# escape sequence for the backslash value
string = "\\"
len(string)

1
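The same holds for the other escape sequences in the table. As a quick check (the example strings are made up for illustration), each escape sequence below adds exactly one to the length of its string,

```python
tabbed = "a\tb"         # a, tab, b
quoted = "\"hi\""       # double quote, h, i, double quote
two_lines = "one\ntwo"  # o, n, e, newline, t, w, o

print(len(tabbed), len(quoted), len(two_lines))
```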

### Multiline Strings

Using the string creation syntax that you learned, you are limited to strings that span one line of text (although the length of the line is not constrained),

In [19]:
"First line
Second line"

SyntaxError: EOL while scanning string literal (<ipython-input-19-264a72bea14f>, line 1)

In order to create multiline strings, you must use triple quotes (three single or three double quote characters) at the beginning and end of the string definition,

In [23]:
string = """First Line
Second Line"""

string

'First Line\nSecond Line'

You should have noticed that Python inserted a *newline* escape sequence in between the two lines of text in the string. This escape sequence intuitively represents a new line in a string. This is an additional point of complexity that programmers must deal with, but is transparent to the user,

In [24]:
print(string)

First Line
Second Line
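Because the line break is stored as an ordinary `\n` character, the same value can also be written on a single line with the escape sequence typed out. The sketch below assumes you are comfortable with the `==` comparison, which has not been introduced in this lecture; it simply reports whether the two strings are identical,

```python
triple_quoted = """First Line
Second Line"""

# The same text written with an explicit newline escape sequence
single_line = "First Line\nSecond Line"

print(triple_quoted == single_line)
```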


In the past, most operating systems had their own (different) format for indicating new lines. Programmers would need to take these different formats into account depending on where their program was going to be executed. Modern operating systems have largely converged on `"\n"` for indicating a new line; Windows operating systems still use their own format, the combined escape sequence `"\r\n"`.

These specific details are transparent to Python programmers, who simply use the `"\n"` escape sequence regardless of the operating system. The Python interpreter takes care of adding OS-specific formatting when printing strings!
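If you are curious which convention your own operating system uses, the standard library exposes it as `os.linesep`. This is only a sketch: it relies on the `import` statement, which is assumed here rather than introduced in this lecture, and what it prints depends on the machine it runs on,

```python
import os

# The line separator used by the operating system running this code
line_separator = os.linesep

# repr() shows the escape sequences instead of acting on them
print(repr(line_separator))
```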


# Printing Text -- user output

The basic user output function in Python is `print()` which you have seen used frequently throughout the course so far. In this section we will learn more about it,

In [25]:
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.



The `file` and `flush` keyword arguments involve concepts that are beyond this course, so for now just understand that their default values result in `print()` printing output to your screen. The `sep` argument specifies the string to insert between arguments that are to be printed,

In [26]:
# an example of using something other than a space as a separator
print(1, 2, 3, sep=", ")

1, 2, 3
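Any string can serve as the separator. The sketch below (the date values are made up for illustration) prints three strings joined by dashes and then builds the same text by hand with concatenation, to show what `sep` is doing for you,

```python
year = "2024"
month = "01"
day = "15"

# Let print() insert the dashes
print(year, month, day, sep="-")

# The same text built manually with concatenation
date_string = year + "-" + month + "-" + day
print(date_string)
```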


The `end` argument specifies the string to append after the last value,

In [28]:
print("Hello", end="")
print(" World!")

Hello World!


Finally, each value argument is either a string or is converted to one; the resulting strings are printed in order, separated by `sep` and followed by `end`. The arguments do not need to be of the same type!

In [29]:
print(print, 1, 2, "three!", 4, sep="; ")

<built-in function print>; 1; 2; three!; 4


In [30]:
# Let's practice with a more complex example
def convert_to_celsius(fahrenheit):
    return (fahrenheit - 32.0) * 5.0 / 9.0

value1 = 60.0
value2 = 30.0
value3 = 0.0

# Method 1, use print to do the formatting
print(value1, ", ", value2, ", and ", value3, " degrees Fahrenheit are equal to ", sep="", end="")
print(convert_to_celsius(value1), convert_to_celsius(value2), convert_to_celsius(value3), sep=", ", end=", ")
print("respectively.")

# Method 2, use string manipulations
message = str(value1) + ", " + str(value2) + ", and " + str(value3) + " degrees Fahrenheit are equal to "
message += str(convert_to_celsius(value1)) + ", " 
message += str(convert_to_celsius(value2)) + ", " 
message += str(convert_to_celsius(value3)) + ", respectively."
print(message)

60.0, 30.0, and 0.0 degrees Fahrenheit are equal to 15.555555555555555, -1.1111111111111112, -17.77777777777778, respectively.
60.0, 30.0, and 0.0 degrees Fahrenheit are equal to 15.555555555555555, -1.1111111111111112, -17.77777777777778, respectively.


Both methods are tedious, but using `print()` is less so because the type conversions are implicit.
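The long decimal expansions also make the output hard to read. One possible tidy-up, sketched here for a single value, is to pass the result through the built-in `round()` before printing; the choice of two decimal places and the name `celsius1` are assumptions made for this illustration,

```python
def convert_to_celsius(fahrenheit):
    return (fahrenheit - 32.0) * 5.0 / 9.0

value1 = 60.0

# Keep only two decimal places of the converted temperature
celsius1 = round(convert_to_celsius(value1), 2)

print(value1, "degrees Fahrenheit is equal to", celsius1, "degrees Celsius.")
```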


# Text Input -- user input

The function `input()` is the simplest way to query text input from the user. Calling the function makes the interpreter wait for your input until you press "Enter",

In [33]:
string = input()

print("You typed: ", string)

"Hello World"
You typed:  "Hello World"


For convenience, you can supply a string as an input argument to "prompt" the user,

In [34]:
string = input("Input? ")

print("You typed:", string)

Input? Hello!
You typed: Hello!


The `input()` function always returns a string, so if you want the user to input numeric data you must attempt a type conversion,

In [37]:
age_string = input("Input your age: ")

age = int(age_string)
age

Input your age: 37.75


ValueError: invalid literal for int() with base 10: '37.75'
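The conversion fails because `'37.75'` is not directly convertible to an `int`. One way around this, sketched below, is to convert to a `float` first and then to an `int` if a whole number is needed. The literal string stands in for what `input()` returned above so that the example runs without waiting for the user, and the names `age_years` and `whole_years` are made up for illustration,

```python
# Stands in for the string returned by input("Input your age: ")
age_string = "37.75"

# A decimal string converts to float without trouble
age_years = float(age_string)

# int() then drops the fractional part
whole_years = int(age_years)

print(age_years, whole_years)
```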

# Exercises



### PragProg Section 4.7

**1.** What value does each of the following expressions evaluate to?

In [None]:
'Computer' + 'Science'

In [None]:
'Darwin\'s'

In [None]:
'H2O' * 3

In [None]:
'CO2' * 3


***2.*** Express each of the following phrases as Python strings using the
appropriate type of quotation marks (single, double, or triple) and, if
necessary, escape sequences. There is more than one correct answer for
each of these phrases.

a. They’ll hibernate during the winter.

b. “Absolutely not,” he said.

c. “He said, ‘Absolutely not,’” recalled Mel.

d. hydrogen sulfide

e. left\right

***3.*** Rewrite the following string using single or double quotes instead of triple
quotes:

In [None]:
'''A
B
C'''

***4.*** Use the built-in function `len` to find the length of the empty string.

***5.*** Given variables `x` and `y`, which refer to values 3 and 12.5, respectively,
use the print function to print the following messages. When numbers
appear in the messages, variables `x` and `y` should be used.

a. The rabbit is 3.

b. The rabbit is 3 years old.

c. 12.5 is average.

d. 12.5 * 3

e. 12.5 * 3 is 37.5.


***6.*** What is printed by the following code?

In [None]:
>>> first = 'John'
>>> last = 'Doe'
>>> print(last + ', ' + first)

***7.*** Use `input` to prompt the user for a number, store the number entered as
a float in a variable named `num`, and then print the contents of `num`.

***8.*** Complete the examples in the docstring and then write the body of the
following function:

In [None]:
def repeat(s, n):
    """ (str, int) -> str
    Return s repeated n times; if n is negative, return the empty string.
    >>> repeat('yes', 4)
    'yesyesyesyes'
    >>> repeat('no', 0)
    >>> repeat('no', -2)
    >>> repeat('yesnomaybe', 3)
    """

***9.*** Complete the examples in the docstring and then write the body of the
following function:

In [None]:
def total_length(s1, s2):
    """ (str, str) -> int
    Return the sum of the lengths of s1 and s2.
    >>> total_length('yes', 'no')
    5
    >>> total_length('yes', '')
    >>> total_length('YES!!!!', 'Noooooo')
    """