# CHAPTER 3 - STRINGS

## STRING BASICS

- In Python, text is represented as a string, which is a **sequence** of characters (letters, digits, and symbols).
- In Python, we indicate that a value is a string by putting either single or double quotes around it.

### Operations on Strings
- `len(string)` -- to get length of a string
- `+` -- concatenates two strings; adding a str to an int or float raises a `TypeError`
- `*` -- repeats a string a given number of times
- `+=` -- appends a string to an existing string variable and stores the result
- `int("3")` -- convert str to int
- `float("3.4")` -- convert str to float
- `str(<input>)` -- convert the input to a string data type. Useful for converting numbers to strings so you can concatenate strings and numbers.
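
The operations listed above can be tried together in a short sketch. The variable names (`greeting`, `message`) are illustrative only.

```python
# Illustrative sketch of the basic string operations; names are hypothetical.
greeting = 'Hello'
print(len(greeting))            # 5

greeting += ', world'           # same as greeting = greeting + ', world'
print(greeting)                 # Hello, world

print('ab' * 3)                 # ababab

# Adding a string and a number raises TypeError; convert the number first.
try:
    message = 'Total: ' + 42
except TypeError:
    message = 'Total: ' + str(42)
print(message)                  # Total: 42

print(int('3') + 1)             # 4
print(float('3.4') * 2)         # 6.8
```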

In [1]:
# String notation using single quotes.
string1 = 'Aristotle'
print(string1)

# String notation using double quotes.
string2 = "Isaac Newton"
print(string2)

Aristotle
Isaac Newton


### String Literals

A string literal is created by writing text (a group of characters) surrounded by single quotes (`'...'`), double quotes (`"..."`), or triple quotes (`'''...'''` or `"""..."""`). Triple quotes let us write multi-line strings that keep their line breaks and indentation exactly as typed.

In [2]:
m = """
This is a string literal. 
        Spaces are printed as is. 
  Hello there!"""
print(m)


This is a string literal. 
        Spaces are printed as is. 
  Hello there!


### Printing in Python

In [3]:
print('abbcd', 2, 3)
print('abbcd', 2, 3, sep='\n') # default separator is space

abbcd 2 3
abbcd
2
3


### Concatenating Strings

In [4]:
string3 = 'Four score and ' + str(7) + ' years ago'
print(string3)

Four score and 7 years ago


### Converting strings to numbers

In [5]:
print(int('0'), type(int('0')))
print(int('11'), type(int('11')))
print(int('-324'), type(int('-324')))
print(float('-324.40'), type(float('-324.40')))

0 <class 'int'>
11 <class 'int'>
-324 <class 'int'>
-324.4 <class 'float'>


### Repeating Strings

In [6]:
string4 = 'AT' * 5
string5 = '-' * 5
print(string4)
print(string5)

ATATATATAT
-----


### Using Special Characters in Strings
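- How would you put a single quote inside a string that is declared using single quotes? Use an escape sequence, which starts with a backslash.
- `'\\'` -- backslash; how would you print `/\/\`?
- `'\n'` -- newline
- `'\''` -- single quote
- `'\"'` -- double quote
- `'\t'` -- tab; useful for parsing TSV files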
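
A short sketch of the escape sequences above in use. The sample strings are made up for illustration, and `split` is a string method used here only to show how a tab-separated line can be taken apart.

```python
# Each escape sequence below is a single character inside the string.
print('It\'s here')        # It's here
print("It's here")         # alternative: use double quotes around the string
print('She said \"hi\"')   # She said "hi"
print('col1\tcol2')        # a tab separates the two words
print('line1\nline2')      # printed on two lines

zigzag = '/\\/\\'
print(zigzag)              # /\/\
print(len(zigzag))         # 4, because each \\ is one backslash

# A tab-separated line can be split back into its fields.
fields = 'name\tage\tcity'.split('\t')
print(fields)              # ['name', 'age', 'city']
```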

### Getting information from the Keyboard

In [7]:
number = input('Please enter a number: ')
print(f'The number is {number}.')

The number is 7.
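
`input` always returns a string, so the value must be converted before doing arithmetic with it. The sketch below assumes a hypothetical helper, `to_number`, and uses a fixed string in place of a real `input` call so that it runs without a keyboard.

```python
def to_number(text):
    """Convert text to an int, falling back to float (hypothetical helper)."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        return float(text)  # still raises ValueError for non-numeric text


# In a real program this string would come from input('Please enter a number: ').
typed = ' 7 '
number = to_number(typed)

print(f'The number is {number}.')            # The number is 7.
print(f'Twice the number is {number * 2}.')  # Twice the number is 14.
print(type(typed), type(number))             # <class 'str'> <class 'int'>
```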


## STRING FORMATTING

### What can you format?
String formatting allows you to do at least five things. The format specifiers come after a colon inside the replacement field.
1. specify width
2. align data to the left, right, or center (by default, strings are left aligned and numbers are right aligned)
3. specify a padding character to use when the specified width is greater than the width of the string or number
4. specify precision for floating-point values
5. add commas to numbers for easier viewing

### Oldest Method

In [8]:
PI = 3.14159265359 
name = 'PI'
print('%s is %.2f' % (name, PI))  # oldest way; the format specifier is %<width>.<precision><type>

PI is 3.14


### Newest Method
The f-string is the newest and generally the fastest method for string formatting.
A replacement field has the form `{<expression>:<format-specifier>}` (with `str.format`, an index or nothing takes the place of the expression), where the format specifier is: `<padding_character><alignment><width><comma>.<precision><type>`
- padding_character can be any character, and it must be followed by an alignment character. The most common are
  - space (default)
  - `-`
  - `0` 
- alignment
  - `<` -- left aligned
  - `^` -- center aligned
  - `>` -- right aligned
- width is a number. If the width is greater than the number of characters and no alignment is given, then:
  - strings are left aligned
  - numbers are right aligned 
- comma adds thousands separators to large numbers for easier viewing; it comes before the `.` of the precision
- precision controls how many decimal places to show. **NOTE** The value is rounded, and trailing zeros are kept.
- type specifies the data type
  - `f` -- float
  - `d` -- integer
  - `s` -- string
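
The pieces of the format specifier can be combined as in the following sketch. The values are made up for illustration, and the square brackets are there only to make the padding visible.

```python
value = 1234567.891
word = 'abc'

print(f'[{word:10}]')        # [abc       ]
print(f'[{42:10}]')          # [        42]
print(f'[{word:*^9}]')       # [***abc***]
print(f'[{value:,.2f}]')     # [1,234,567.89]
print(f'[{value:>15,.1f}]')  # [    1,234,567.9]
print(f'[{2.5:.3f}]')        # [2.500]
print(f'[{42:05d}]')         # [00042]
```

The first two lines show the default alignment (strings left, numbers right), and the `.3f` line shows that trailing zeros are kept.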

In [10]:
PI = 3.14159265359 
name = 'PI'
# {<name_of_variable>:<format-specifier>} where the format specifier is <width>.<precision><type>
print(f'{name} is {PI:.2f}') # newest way

PI is 3.14


### Examples

In [11]:
## Example 1
course_number = 'EAS503'
class_size = 113
class_average = 92.3

str_format = '{}'.format(course_number)
f_string = f'{course_number}'

print(str_format)
print(f_string)

EAS503
EAS503


In [12]:
## Example 2

course_number = 'EAS503'
class_size = 113
class_average = 92.3

str_format = 'The course number is {}.'.format(course_number)
f_string = f'The course number is {course_number}.'

print(str_format)
print(f_string)

The course number is EAS503.
The course number is EAS503.


In [13]:
## Example 3 use index

course_number = 'EAS503'
class_size = 113
class_average = 92.3
str_format = 'The course number is {}. It has {} students.'.format(course_number, class_size)
str_format = 'The course number is {0}. It has {1} students.'.format(course_number, class_size)
f_string = f'The course number is {course_number}. It has {class_size} students.'

print(str_format)
print(f_string)

The course number is EAS503. It has 113 students.
The course number is EAS503. It has 113 students.


In [14]:
## Example 4 change index

course_number = 'EAS503'
class_size = 113
class_average = 92.3
str_format = 'The course number is {1}. It has {0} students.'.format(course_number, class_size)
f_string = f'The course number is {class_size}. It has {course_number} students.'

print(str_format)
print(f_string)

The course number is 113. It has EAS503 students.
The course number is 113. It has EAS503 students.


In [15]:
## Example 5 adding a float

course_number = 'EAS503'
class_size = 113
class_average = 92.3
str_format = 'The course number is {0}. It has {1} students. The class average is {2}.'.format(course_number, class_size, class_average)
f_string = f'The course number is {course_number}. It has {class_size} students. The class average is {class_average}.'

print(str_format)
print(f_string)

The course number is EAS503. It has 113 students. The class average is 92.3.
The course number is EAS503. It has 113 students. The class average is 92.3.


In [16]:
## Example 6 specify number of spaces to use -- width

course_number = 'EAS503'
class_size = 113
class_average = 92.3
str_format = 'The course number is {0:10}. It has {1:10} students. The class average is {2:10}.'.format(course_number, class_size, class_average)
f_string = f'The course number is {course_number:10}. It has {class_size:10} students. The class average is {class_average:10}.'

print(str_format)
print(f_string)

The course number is EAS503    . It has        113 students. The class average is       92.3.
The course number is EAS503    . It has        113 students. The class average is       92.3.


In [17]:
## Example 7 right align

course_number = 'EAS503'
class_size = 113
class_average = 92.3
str_format = 'The course number is {0:>10}. It has {1:>10} students. The class average is {2:>10}.'.format(course_number, class_size, class_average)
f_string = f'The course number is {course_number:>10}. It has {class_size:>10} students. The class average is {class_average:>10}.'

print(str_format)
print(f_string)

The course number is     EAS503. It has        113 students. The class average is       92.3.
The course number is     EAS503. It has        113 students. The class average is       92.3.


In [18]:
## Example 8 left align

course_number = 'EAS503'
class_size = 113
class_average = 92.3
str_format = 'The course number is {0:<10}. It has {1:<10} students. The class average is {2:<10}.'.format(course_number, class_size, class_average)
f_string = f'The course number is {course_number:<10}. It has {class_size:<10} students. The class average is {class_average:<10}.'

print(str_format)
print(f_string)

The course number is EAS503    . It has 113        students. The class average is 92.3      .
The course number is EAS503    . It has 113        students. The class average is 92.3      .


In [19]:
## Example 9 center align
course_number = 'EAS503'
class_size = 113
class_average = 92.3
str_format = 'The course number is {0:^10}. It has {1:^10} students. The class average is {2:^10}.'.format(course_number, class_size, class_average)
f_string = f'The course number is {course_number:^10}. It has {class_size:^10} students. The class average is {class_average:^10}.'

print(str_format)
print(f_string)

The course number is   EAS503  . It has    113     students. The class average is    92.3   .
The course number is   EAS503  . It has    113     students. The class average is    92.3   .


In [20]:
## Example 10 Padding with zeros
## Zero padding does not require an alignment specifier

student_id = 223333

str_format = 'The number {} padded with zeros is {:08}'.format(student_id, student_id)
f_string = f'The number {student_id} padded with zeros is {student_id:08}'

print(str_format)
print(f_string)

The number 223333 padded with zeros is 00223333
The number 223333 padded with zeros is 00223333


In [21]:
## Example 11 Padding with dashes
student_id = 223333

str_format = 'The number {} padded with dashes is {:->8}'.format(student_id, student_id)
f_string = f'The number {student_id} padded with dashes is {student_id:->8}'

print(str_format)
print(f_string)

The number 223333 padded with dashes is --223333
The number 223333 padded with dashes is --223333


In [22]:
## Example 12 Adding Commas
number = 123456
print(f'{number:,}')
number = 123456.2345
print(f'{number:,.2f}')

123,456
123,456.23


## CREATING A TABLE

In [23]:
title = '|' + '{:^51}'.format('Cereal Yields (kg/ha)') + '|'
line = '+' + '-'*15 + '+' + ('-'*8 + '+')*4
row = '| {:<13} |' + ' {:6,d} |'*4
header = '| {:^13s} |'.format('Country') + (' {:^6d} |'*4).format(1980, 1990,
                                                                  2000, 2010)
print('+' + '-'*(len(title)-2) + '+',
      title,
      line,
      header,
      line,
      row.format('China', 2937, 4321, 4752, 5527),
      row.format('Germany', 4225, 5411, 6453, 6718),
      row.format('United States', 3772, 4755, 5854, 6988),
      line,
      sep='\n')

+---------------------------------------------------+
|               Cereal Yields (kg/ha)               |
+---------------+--------+--------+--------+--------+
|    Country    |  1980  |  1990  |  2000  |  2010  |
+---------------+--------+--------+--------+--------+
| China         |  2,937 |  4,321 |  4,752 |  5,527 |
| Germany       |  4,225 |  5,411 |  6,453 |  6,718 |
| United States |  3,772 |  4,755 |  5,854 |  6,988 |
+---------------+--------+--------+--------+--------+


### Writing to a file
Use a `with` statement (a context manager) so that the file is closed automatically, even if an error occurs while writing.

In [24]:
with open('test_file.txt', 'w') as file:
    file.write('line1\nline2')

In [25]:
title = '|' + '{:^51}'.format('Cereal Yields (kg/ha)') + '|'
line = '+' + '-'*15 + '+' + ('-'*8 + '+')*4
row = '| {:<13} |' + ' {:6,d} |'*4
header = '| {:^13s} |'.format('Country') + (' {:^6d} |'*4).format(1980, 1990,
                                                                  2000, 2010)
file_content = '\n'.join(('+' + '-'*(len(title)-2) + '+',
      title,
      line,
      header,
      line,
      row.format('China', 2937, 4321, 4752, 5527),
      row.format('Germany', 4225, 5411, 6453, 6718),
      row.format('United States', 3772, 4755, 5854, 6988),
      line))

with open('test_file2.txt', 'w') as file:
    file.write(file_content)

In [26]:
import os
os.remove('test_file.txt')
os.remove('test_file2.txt')
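
The following sketch checks that the `with` statement really does close the file, and reads the content back. It assumes a temporary directory (from the standard `tempfile` module) and a hypothetical file name, `demo.txt`, so that no files are left behind.

```python
import os
import tempfile

# Write into a temporary directory that is deleted at the end of the block.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.txt')  # hypothetical file name

    with open(path, 'w') as file:
        file.write('line1\nline2')
    print(file.closed)            # True: the with block closed the file

    with open(path) as file:      # default mode is 'r' (read text)
        content = file.read()
    print(content.split('\n'))    # ['line1', 'line2']
```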

## STRING METHODS

- We have already encountered functions: built-in functions and functions we have defined. A method is another kind of function that is attached to a particular type. This section covers the methods that are attached to the string type.
- A method call such as `'browning'.capitalize()` is shorthand for `str.capitalize('browning')`.
- Methods are like functions, except that the first argument must be an object of the class in which the method is defined.
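
A minimal sketch of the two equivalent call forms described above; the variable names are illustrative.

```python
name = 'browning'

via_method = name.capitalize()      # usual method-call syntax
via_class = str.capitalize(name)    # same call written through the str type

print(via_method)                   # Browning
print(via_method == via_class)      # True
print(name)                         # browning (the original string is unchanged)
```

String methods return a new string instead of modifying the original, which is why `name` still prints in lower case.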

### Strip Methods

In [27]:
# strip() -- Strip whitespace (spaces, tabs, newlines) from both ends of the string
my_string = '   ABBCCC  '
print(my_string, 'Dummy')
print(my_string.strip())

   ABBCCC   Dummy
ABBCCC


In [28]:
# strip(chars) -- Strip any of the given characters from both ends of the string
my_string = 'ABBCCCA'
print(my_string.strip('A'))
my_string = '  ABBCCCA'
print(my_string.strip('A'))

BBCCC
  ABBCCC
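
One detail worth a quick sketch: the argument to `strip`, `lstrip`, and `rstrip` is treated as a set of characters to remove, not as a prefix or suffix to match. The sample strings are made up for illustration.

```python
# Characters are removed from each end until one is found that is not in the set.
print('ABBCCCA'.strip('AB'))             # CCC
print('ABBCCCA'.lstrip('AB'))            # CCCA
print('ABBCCCA'.rstrip('AC'))            # ABB
print('www.example.com'.strip('w.com'))  # example
```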


In [29]:
# lstrip() -- Strip whitespace from the left of the string
my_string = '   ABBCCC  '
print(my_string, 'Dummy')
print(my_string.lstrip(), 'Dummy')

   ABBCCC   Dummy
ABBCCC   Dummy


In [30]:
# rstrip() -- Strip whitespace from the right of the string
my_string = '   ABBCCC  '
print(my_string, 'Dummy')
print(my_string.rstrip(), 'Dummy')

   ABBCCC   Dummy
   ABBCCC Dummy


### Case Methods

In [31]:
# islower() -- Check if all alphabetic characters are lower case
my_string = 'EAS503'
print(my_string.islower())

False


In [32]:
# isupper() -- Check if all alphabetic characters are upper case
my_string = 'EAS503'
print(my_string.isupper())

True


In [33]:
# lower() -- Lower case the string; returns a new string
my_string = 'EAS503'
print(my_string.lower())

eas503


In [34]:
# upper() -- Upper case the string; returns a new string
my_string = 'eas503'
print(my_string.upper())

EAS503


In [35]:
# title() -- Make the first letter of each word upper case
my_string = 'the lazy dog jumped over the quick brown fox'
print(my_string.title())

The Lazy Dog Jumped Over The Quick Brown Fox


In [36]:
# capitalize() -- Make the first character upper case and the rest lower case
my_string = 'the lazy dog jumped over the quick brown fox'
print(my_string.capitalize())

The lazy dog jumped over the quick brown fox


In [37]:
# swapcase() -- Make upper case lower case and lower case upper case
my_string = 'tHe laZy dOg Jumped oveR thE quIck bRown Fox'
print(my_string.swapcase())

ThE LAzY DoG jUMPED OVEr THe QUiCK BrOWN fOX


### Content Methods
- `isalpha()` -- Returns `True` if all the characters are alphabetic
- `isdecimal()` -- Returns `True` if all the characters are decimal digits (0-9); USE THIS!
- `isdigit()` -- Returns `True` if all the characters are decimal digits (0-9) or other digit characters such as superscripts (`"\u00B2"`); fractions such as `'\u00BC'` give `False`
- `isnumeric()` -- Returns `True` if all the characters are decimal digits (0-9), superscripts (`"\u00B2"`), fractions (`'\u00BC'`), or Unicode Roman numeral characters (`'\u2167'`)
- `isalnum()` -- Returns `True` if all the characters are alphanumeric (letters or numbers)
- `startswith(substring)` -- Returns `True` if the string starts with the given substring
- `endswith(substring)` -- Returns `True` if the string ends with the given substring
- `find(substring)` -- Returns the index of the first occurrence of the substring; otherwise returns `-1`
- `index(substring)` -- Returns the index of the first occurrence of the substring; otherwise **raises a `ValueError`**
- `in` operator -- Returns `True` if the operand on the left occurs in the string on the right
- `count(substring)` -- Returns the number of non-overlapping times the substring occurs
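
The content methods can be compared side by side in a short sketch. The sample strings are illustrative; note that the fraction character `'\u00BC'` counts as numeric but not as a digit.

```python
print('abc'.isalpha(), 'abc1'.isalpha())      # True False
print('abc1'.isalnum(), 'abc 1'.isalnum())    # True False

# The three numeric checks differ on superscripts and fractions.
for text in ('42', '\u00B2', '\u00BC'):
    print(repr(text), text.isdecimal(), text.isdigit(), text.isnumeric())
# '42' True True True
# '²' False True True
# '¼' False False True

course = 'EAS503'
print(course.startswith('EAS'), course.endswith('3'))  # True True
print(course.find('503'), course.find('xyz'))          # 3 -1
print('503' in course)                                 # True
print('banana'.count('an'))                            # 2

# index() behaves like find(), but raises ValueError when nothing is found.
try:
    course.index('xyz')
except ValueError as error:
    print('index() raised ValueError:', error)
```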

### Modification Methods
- `replace(old, new)` -- Replaces every occurrence of `old` with `new`; returns a new string
- You can **chain** methods! **Chaining** lets you avoid having to save intermediate results. 
```{code-cell} ipython3
my_string = '(EAS503)'
print(my_string.replace('(', '').replace(')', ''))
```
- `zfill(width)` -- pads the string on the left with zeros until it reaches the given total width; returns a new string
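
A brief sketch of `zfill`, whose argument is the total width of the result and not the number of zeros to add. The `student_id` value is made up for illustration.

```python
print('42'.zfill(5))        # 00042  (three zeros added to reach width 5)
print('-42'.zfill(5))       # -0042  (the sign stays in front)
print('123456'.zfill(3))    # 123456 (already wider than 3, so unchanged)

# Chaining with another method: clean the text, then pad it.
student_id = ' 7731 '.strip().zfill(8)
print(student_id)           # 00007731
```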