# Strings

Python strings are **immutable** (as in Java), which means they cannot be changed after they are created.


## Creating a String
You can create a string with **either single quotes or double quotes**.

In [1]:
# Single word
'hello'
# We can also use double quote
"String built with double quotes. Isn't it!!"

# String literals inside triple quotes, """ or ''', can span multiple lines of text.
"""Multiple
Line
String"""

'Multiple\nLine\nString'

In [4]:
# Assign s as a string
s = 'Hello World'

In [5]:
len(s)

11

In [6]:
# Show first element (in this case a letter)
s[0]

'H'

In [7]:
s[1]

'e'

## String Slicing

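We can use a <code>:</code> to perform *slicing*, which grabs a portion of a string. In <code>s[start:stop]</code> the character at <code>start</code> is included and the character at <code>stop</code> is excluded; leaving out either side means "from the beginning" or "to the end". For example: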
![image.png](attachment:image.png)

In [8]:
# Grab everything from index 1 to the end of s
s[1:]

'ello World'

In [9]:
# Note that there is no change to the original s
s

'Hello World'

In [10]:
# Grab everything UP TO, but not including, index 3
s[:3]

'Hel'

In [11]:
# Grab everything, but go in step sizes of 2
s[::2]

'HloWrd'

In [12]:
# We can use this to print a string backwards
s[::-1]

'dlroW olleH'
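
As a small sketch of the general `s[start:stop:step]` form, the following combines a start, a stop and a step, and uses a negative index to count from the end. The variable names (`word`, `middle`, `last_five`, `every_other`) are illustrative and not part of the original notebook.

```python
word = 'Hello World'

# start is included, stop is excluded
middle = word[3:8]          # characters at indexes 3, 4, 5, 6, 7

# negative indexes count back from the end of the string
last_five = word[-5:]       # the final five characters

# start, stop and step can be combined
every_other = word[0:5:2]   # indexes 0, 2, 4

print(middle)       # lo Wo
print(last_five)    # World
print(every_other)  # Hlo
```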

## String Immutability
Strings have an important property known as *immutability*. This means that once a string is created, the elements within it cannot be changed or replaced. For example:

In [25]:
s

'Hello World'

In [26]:
# Let's try to change the first letter to 'x'
s[0] = 'x'

TypeError: 'str' object does not support item assignment

Notice how the error tells us directly what we can't do: assign to an item of a string!

Something we *can* do is concatenate strings!

In [27]:
s

'Hello World'

In [29]:
# We can reassign s completely though!
# For example, the expression ('hello' + 'there') takes the 2 strings 'hello' and 'there' and builds a new string 'hellothere'
s = s + ' concatenate me!'
print(s)

We can use the multiplication symbol to create repetition!

In [32]:
letter = 'z'

In [33]:
letter*10

'zzzzzzzzzz'
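
Because item assignment fails, "changing" a character really means building a new string from pieces of the old one. A minimal sketch, using slicing and concatenation from the cells above (the names `original` and `changed` are illustrative):

```python
original = 'Hello World'

# Build a new string: the replacement letter plus everything after index 0
changed = 'x' + original[1:]

print(changed)    # xello World
print(original)   # Hello World (the original is untouched)
```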

## Basic Built-in String methods

In [12]:
#Unlike Java, the '+' does not automatically convert numbers or other types to string form. 
# The str() function converts values to a string form so they can be combined with other strings
pi = 3.14

print('The value of pi is ' + str(pi) )

The value of pi is 3.14


In [20]:
# Upper Case a string
s.upper()

'HELLO WORLD'

In [21]:
# Lower case
s.lower()

'hello world'

In [22]:
# Split a string by blank space (this is the default)
s.split()

['Hello', 'World']

In [23]:
# Split by a specific element (doesn't include the element that was split on)
s.split('W')

['Hello ', 'orld']

In [17]:
# s.isalpha()/s.isdigit()/s.isspace()... -- tests if all the string chars are in the various character classes
s[5].isspace()

True

In [25]:
s.join(['a','bb','cccc'])  # Join the list items, using s as the separator between them

'aHello WorldbbHello Worldcccc'
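
In practice `join()` is usually called on a short separator string rather than on a long string like `s`. A minimal sketch, with illustrative variable names, showing that `join()` and `split()` undo each other:

```python
words = ['a', 'bb', 'cccc']

# The string that join() is called on goes between the items
joined = '-'.join(words)
print(joined)              # a-bb-cccc

# Splitting on the same separator recovers the original list
restored = joined.split('-')
print(restored)            # ['a', 'bb', 'cccc']

# An empty separator simply glues the items together
glued = ''.join(words)
print(glued)               # abbcccc
```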

## Changing case
We can use methods to capitalize the first character of a string, or change the case of the entire string.

In [26]:
# Capitalize the first character and lowercase the rest
s.capitalize()

'Hello world'

In [27]:
s.upper()

'HELLO WORLD'

In [28]:
s.lower()

'hello world'

Remember, strings are immutable. None of the above methods change the string in place; they only return modified copies of the original string.

In [29]:
s

'Hello World'

To change a string requires reassignment:

In [30]:
s = s.upper()
s

'HELLO WORLD'

In [31]:
s = s.lower()
s

'hello world'

## Location and Counting

In [32]:
s.count('o') # returns the number of occurrences, without overlap

2

In [33]:
s.find('o') # returns the starting index position of the first occurrence

4
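
The cell above only shows a successful search. As a hedged sketch of the other cases (the variable names are illustrative), `find()` returns `-1` on a miss, accepts an optional start position, and has a right-to-left counterpart `rfind()`; `index()` works like `find()` but raises `ValueError` instead of returning `-1`.

```python
text = 'hello world'

# find() returns -1 when the substring is absent
missing = text.find('z')
print(missing)        # -1

# An optional second argument sets where the search starts
second_o = text.find('o', 5)
print(second_o)       # 7

# rfind() reports the last occurrence instead of the first
last_o = text.rfind('o')
print(last_o)         # 7

# index() behaves like find() but raises ValueError on a miss
try:
    text.index('z')
    found = True
except ValueError:
    found = False
print(found)          # False
```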

## Formatting
The <code>center()</code> method pads your string on both sides with a fill character until it reaches the requested total length. Personally, I've never actually used this in code as it seems pretty esoteric...

In [34]:
s.center(20,'*')

'****hello world*****'
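
`center()` has two siblings, `ljust()` and `rjust()`, that pad on one side only. A small sketch for comparison (the name `label` and the width of 9 are arbitrary choices for illustration):

```python
label = 'hello'

left = label.ljust(9, '.')     # pad on the right
right = label.rjust(9, '.')    # pad on the left
middle = label.center(9, '.')  # pad on both sides

print(repr(left))    # 'hello....'
print(repr(right))   # '....hello'
print(repr(middle))  # '..hello..'
```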

The <code>expandtabs()</code> method will expand tab notations <code>\t</code> into spaces:

In [35]:
'hello\thi'.expandtabs()

'hello   hi'

## is check methods
The methods below check whether the characters of a string belong to a particular class or case. Let's explore them:

In [36]:
s = 'hello'

<code>isalnum()</code> will return True if all characters in **s** are alphanumeric

In [37]:
s.isalnum()

True

<code>isalpha()</code> will return True if all characters in **s** are alphabetic

In [38]:
s.isalpha()

True

<code>islower()</code> will return True if all cased characters in **s** are lowercase and there is
at least one cased character in **s**, False otherwise.

In [39]:
s.islower()

True

<code>isspace()</code> will return True if all characters in **s** are whitespace.

In [40]:
s.isspace()

False

<code>istitle()</code> will return True if **s** is a title cased string and there is at least one character in **s**, i.e. uppercase characters may only follow uncased characters and lowercase characters only cased ones. It returns False otherwise.

In [41]:
s.istitle()

False

<code>isupper()</code> will return True if all cased characters in **s** are uppercase and there is
at least one cased character in **s**, False otherwise.

In [42]:
s.isupper()

False
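
The string `'hello'` passes most of these checks, so a few contrasting inputs may make the rules clearer. This is only a sketch; the sample strings are chosen for illustration.

```python
# Each check looks at every character in the string
print('abc123'.isalnum())       # True: letters and digits only
print('abc123'.isalpha())       # False: contains digits
print('hello world'.isalpha())  # False: the space is not a letter
print('12345'.isdigit())        # True: digits only
print('Hello World'.istitle())  # True: each word starts with a capital
print(''.isalpha())             # False: an empty string fails these checks
```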

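Another method is <code>endswith()</code>, which checks whether the string ends with a given suffix. For a single-character suffix this is essentially the same as a boolean check on <code>s[-1]</code>, but the suffix can be any length.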

In [43]:
s.endswith('o')

True
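
A short sketch of `endswith()` with longer suffixes, plus its counterpart `startswith()`. The `filename` value is a made-up example, not from the original notebook.

```python
filename = 'report.csv'

# A suffix may be longer than one character
is_csv = filename.endswith('.csv')
print(is_csv)          # True

# A tuple of suffixes matches if any one of them fits
is_text = filename.endswith(('.txt', '.csv'))
print(is_text)         # True

# startswith() is the matching check for the beginning
is_report = filename.startswith('report')
print(is_report)       # True
```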

## Built-in Reg. Expressions
Strings have some built-in methods that can resemble regular expression operations.
We can use <code>split()</code> to split the string at a certain element and return a list of the results.
We can use <code>partition()</code> to return a tuple of three parts: the text before the first occurrence of the separator, the separator itself, and the text after it.

In [44]:
s.split('e')

['h', 'llo']

In [45]:
s.partition('l')

('he', 'l', 'lo')
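
Two edge cases are worth a quick sketch: what `partition()` returns when the separator is missing, and how to split at the last occurrence or limit the number of splits. The variable names are illustrative.

```python
text = 'hello'

# partition() always returns a 3-tuple, even when the separator is missing
no_match = text.partition('z')
print(no_match)        # ('hello', '', '')

# rpartition() splits at the last occurrence instead of the first
from_right = text.rpartition('l')
print(from_right)      # ('hel', 'l', 'o')

# split() takes an optional maximum number of splits
one_split = text.split('l', 1)
print(one_split)       # ['he', 'lo']
```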

# String Formatting

    player = 'Thomas'
    points = 33
    
    'Last night, '+player+' scored '+str(points)+' points.'  # concatenation
    
    f'Last night, {player} scored {points} points.'          # string formatting


There are three ways to perform string formatting.
* The oldest method involves placeholders using the modulo `%` character.
* An improved technique uses the `.format()` string method.
* The newest method, introduced with Python 3.6, uses formatted string literals, called *f-strings*.
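
As a side-by-side sketch of the three approaches, here is the `player` / `points` sentence from above written each way. The variable names `old_style`, `format_method` and `f_string` are illustrative, and the last line assumes Python 3.6 or later.

```python
player = 'Thomas'
points = 33

# 1. Placeholders with the % operator
old_style = 'Last night, %s scored %d points.' % (player, points)

# 2. The .format() string method
format_method = 'Last night, {} scored {} points.'.format(player, points)

# 3. A formatted string literal (f-string)
f_string = f'Last night, {player} scored {points} points.'

print(old_style)
print(format_method)
print(f_string)
```

All three lines print the same sentence.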

## Print Formatting

We can use the .format() method to add formatted objects to printed string statements. 

The easiest way to show this is through an example:

In [19]:
'Insert another string with curly brackets: {}'.format('The inserted string')

'Insert another string with curly brackets: The inserted string'

In [1]:
s = 'hello world'

## (Oldest) Formatting with placeholders
You can use <code>%s</code> to inject strings into your print statements. The modulo `%` is referred to as a "string formatting operator".

In [47]:
print("var = %s" % 10)

var = 10


In [50]:
print("var x = %s and y = %s " %('some','more'))

var x = some and y = more 


In [51]:
x, y = 'some', 10
print("var x = %s and y = %s " %(x,y))

var x = some and y = 10 


### Format conversion methods.
It should be noted that the two placeholders <code>%s</code> and <code>%r</code> convert any Python object to a string in two different ways: `%s` uses `str()` and `%r` uses `repr()`.

Note that `%r` and `repr()` deliver the *string representation* of the object, including quotation marks and any escape characters.

In [54]:
print('He said his name was %s.' %'Fred')
print('He said his name was %r.' %['Fred','Nitin'])

He said his name was Fred.
He said his name was ['Fred', 'Nitin'].


As another example, `\t` inserts a tab into a string.

In [5]:
print('I once caught a fish %s.' %'this \tbig')
print('I once caught a fish %r.' %'this \tbig')

I once caught a fish this 	big.
I once caught a fish 'this \tbig'.


The `%s` placeholder converts whatever it sees into a string, including integers and floats. The `%d` placeholder converts numbers to integers first, truncating the fractional part rather than rounding. Note the difference below:

In [6]:
print('I wrote %s programs today.' %3.75)
print('I wrote %d programs today.' %3.75)   

I wrote 3.75 programs today.
I wrote 3 programs today.
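
If rounding is wanted instead of truncation, one option is a float placeholder with zero decimal places. A minimal sketch (the names `value`, `truncated` and `rounded` are illustrative):

```python
value = 3.75

truncated = '%d' % value     # drops the fractional part
rounded = '%.0f' % value     # rounds to zero decimal places

print(truncated)  # 3
print(rounded)    # 4
```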


### Padding and Precision of Floating Point Numbers
Floating point numbers use a format such as <code>%5.2f</code>. Here, <code>5</code> is the minimum number of characters the result should contain, counting the decimal point; shorter numbers are padded on the left with whitespace. Next to this, <code>.2f</code> sets how many digits to show past the decimal point. Let's see some examples:

In [7]:
print('Floating point numbers: %5.2f' %(13.144))

Floating point numbers: 13.14


In [8]:
print('Floating point numbers: %1.0f' %(13.144))

Floating point numbers: 13


In [9]:
print('Floating point numbers: %1.5f' %(13.144))

Floating point numbers: 13.14400


In [10]:
print('Floating point numbers: %10.2f' %(13.144))

Floating point numbers:      13.14


In [11]:
print('Floating point numbers: %25.2f' %(13.144))

Floating point numbers:                     13.14


For more information on string formatting with placeholders visit https://docs.python.org/3/library/stdtypes.html#old-string-formatting

### Multiple Formatting
Nothing prohibits using more than one conversion tool in the same print statement:

In [12]:
print('First: %s, Second: %5.2f, Third: %r' %('hi!',3.1415,'bye!'))

First: hi!, Second:  3.14, Third: 'bye!'


## Formatting with the `.format()` method
A better way to format objects into your strings for print statements is with the string `.format()` method. The syntax is:

    'String here {} then also {}'.format('something1','something2')
    
For example:

In [13]:
print('This is a string with an {}'.format('insert'))

This is a string with an insert


### The .format() method has several advantages over the %s placeholder method:

#### 1. Inserted objects can be called by index position:

In [14]:
print('The {2} {1} {0}'.format('fox','brown','quick'))

The quick brown fox


#### 2. Inserted objects can be assigned keywords:

In [15]:
print('First Object: {a}, Second Object: {b}, Third Object: {c}'.format(a=1,b='Two',c=12.3))

First Object: 1, Second Object: Two, Third Object: 12.3


#### 3. Inserted objects can be reused, avoiding duplication:

In [16]:
print('A %s saved is a %s earned.' %('penny','penny'))
# vs.
print('A {p} saved is a {p} earned.'.format(p='penny'))

A penny saved is a penny earned.
A penny saved is a penny earned.


### Alignment, padding and precision with `.format()`
Within the curly braces you can assign field lengths, left/right alignments, rounding parameters and more

In [17]:
print('{0:8} | {1:9}'.format('Fruit', 'Quantity'))
print('{0:8} | {1:9}'.format('Apples', 3.))
print('{0:8} | {1:9}'.format('Oranges', 10))

Fruit    | Quantity 
Apples   |       3.0
Oranges  |        10


By default, `.format()` aligns text to the left, numbers to the right. You can pass an optional `<`,`^`, or `>` to set a left, center or right alignment:

In [18]:
print('{0:<8} | {1:^8} | {2:>8}'.format('Left','Center','Right'))
print('{0:<8} | {1:^8} | {2:>8}'.format(11,22,33))

Left     |  Center  |    Right
11       |    22    |       33


You can precede the alignment operator with a padding character:

In [19]:
print('{0:=<8} | {1:-^8} | {2:.>8}'.format('Left','Center','Right'))
print('{0:=<8} | {1:-^8} | {2:.>8}'.format(11,22,33))

Left==== | -Center- | ...Right
11====== | ---22--- | ......33


Field widths and float precision are handled in a way similar to placeholders. The following two print statements are equivalent:

In [20]:
print('This is my ten-character, two-decimal number:%10.2f' %13.579)
print('This is my ten-character, two-decimal number:{0:10.2f}'.format(13.579))

This is my ten-character, two-decimal number:     13.58
This is my ten-character, two-decimal number:     13.58


Note that there are 5 spaces following the colon, and 5 characters taken up by 13.58, for a total of ten characters.
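
The same format specification supports a few other common options. The sketch below shows a thousands separator, a percentage, and zero padding; the numbers and variable names are made up for illustration.

```python
# A comma inserts thousands separators
with_commas = '{:,}'.format(1234567)
print(with_commas)    # 1,234,567

# The % type multiplies by 100 and appends a percent sign
as_percent = '{:.1%}'.format(0.256)
print(as_percent)     # 25.6%

# A leading 0 before the width pads with zeros instead of spaces
zero_padded = '{:08.3f}'.format(3.14159)
print(zero_padded)    # 0003.142
```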

For more information on the string `.format()` method visit https://docs.python.org/3/library/string.html#formatstrings

## (Latest) Formatted String Literals (f-strings)

Introduced in Python 3.6, f-strings offer several benefits over the older `.format()` string method described above. For one, you can bring outside variables directly into the string rather than pass them as arguments through `.format(var)`.

In [55]:
name = 'Fred'

print(f"He said his name is {name}.")

He said his name is Fred.


Pass `!r` to get the string representation:

In [56]:
print(f"He said his name is {name!r}")

He said his name is 'Fred'


#### Float formatting follows `"result: {value:{width}.{precision}}"`

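Where with the `.format()` method you might see `{value:10.4f}`, an f-string can nest braces so that width and precision come from expressions, as in `{value:{10}.{6}}`. Note that this second form also drops the `f` presentation type, which changes what precision means.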


In [57]:
num = 23.45678
print("My 10 character, four decimal number is:{0:10.4f}".format(num))
print(f"My 10 character, four decimal number is:{num:{10}.{6}}")

My 10 character, four decimal number is:   23.4568
My 10 character, four decimal number is:   23.4568


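Note that without the `f` presentation type, *precision* refers to the total number of significant digits, not just those following the decimal point. This comes from the format specification rather than from f-strings themselves: `'{0:10.6}'.format(num)` produces the same result. This fits more closely with scientific notation and statistical analysis. In this general format, trailing zeros are not added to the right of the decimal, even if precision allows it: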

In [58]:
num = 23.45
print("My 10 character, four decimal number is:{0:10.4f}".format(num))
print(f"My 10 character, four decimal number is:{num:{10}.{6}}")

My 10 character, four decimal number is:   23.4500
My 10 character, four decimal number is:     23.45


If this becomes important, add the `f` presentation type inside the f-string, exactly as with `.format()`:

In [59]:
num = 23.45
print("My 10 character, four decimal number is:{0:10.4f}".format(num))
print(f"My 10 character, four decimal number is:{num:10.4f}")

My 10 character, four decimal number is:   23.4500
My 10 character, four decimal number is:   23.4500
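
Nested braces and the `f` type can be combined, so width and precision come from variables while precision still counts digits after the decimal point. A minimal sketch, assuming Python 3.6 or later (the names `width`, `precision`, `fixed` and `general` are illustrative):

```python
num = 23.45
width = 10
precision = 4

# With the trailing f, precision means digits after the decimal point
fixed = f"{num:{width}.{precision}f}"

# Without the f, the same precision counts significant digits
general = f"{num:{width}.{precision}}"

print(repr(fixed))    # '   23.4500'
print(repr(general))  # '     23.45'
```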


For more info on formatted string literals visit https://docs.python.org/3/reference/lexical_analysis.html#f-strings