# Strings and Stuff in (Monty) Python

&nbsp;

<p align="center"> 
<img src="./images/Spam.gif">
</p>

In [1]:
import numpy as np

## Strings are sequences of characters

In [2]:
my_string = 'spam'

my_string,len(my_string),my_string[0],my_string[0:2]

('spam', 4, 's', 'sp')

In [3]:
my_string[::-1]

'maps'

#### But unlike numerical arrays, you cannot reassign elements:

In [4]:
my_string[0] = "S"

TypeError: 'str' object does not support item assignment
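Since a string cannot be changed in place, the usual workaround is to build a new string. The sketch below shows three common ways to do that; the variable names `capitalized`, `replaced`, and `rebuilt` are made up for this illustration.

```python
my_string = "spam"

# Slice off the first character and concatenate a replacement in front.
capitalized = "S" + my_string[1:]

# str.replace returns a new string; the third argument limits the number of replacements.
replaced = my_string.replace("s", "S", 1)

# For many edits, convert to a (mutable) list, edit it, then join it back together.
chars = list(my_string)
chars[0] = "S"
rebuilt = "".join(chars)

print(capitalized, replaced, rebuilt)   # Spam Spam Spam
print(my_string)                        # spam (the original is unchanged)
```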

#### Or do math-like stuff ...

In [5]:
my_string.sum()

AttributeError: 'str' object has no attribute 'sum'

### "Arithmetic" with Strings

In [6]:
my_string = 'spam'
my_egg = "eggs"

my_string + my_egg

'spameggs'

In [7]:
my_string + " " + my_egg

'spam eggs'

In [8]:
4 * (my_string + " ") + my_egg

'spam spam spam spam eggs'

In [9]:
print(4 * (my_string + " ") + my_string + " and\n" + my_egg)     # use \n to get a newline with the print function

spam spam spam spam spam and
eggs


### String operators and comparisons

* Strings are compared character by character, from left to right.
* At the first position where the characters differ, their [Unicode](https://en.wikipedia.org/wiki/List_of_Unicode_characters#Basic_Latin) code points are compared.
* The string whose character has the lower [Unicode](https://en.wikipedia.org/wiki/List_of_Unicode_characters#Basic_Latin) code point is considered smaller.
* If one string ends before any difference is found, the shorter string is considered smaller.
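The comparison rules can be checked by hand with the built-in `ord()`, which returns the code point of a single character. This is only a sketch: the helper `first_difference` is a hypothetical function written for this illustration, not something Python provides.

```python
def first_difference(a, b):
    """Return the index of the first position where a and b differ, or None."""
    for i, (char_a, char_b) in enumerate(zip(a, b)):
        if char_a != char_b:
            return i
    return None

a, b = "spam", "spot"
i = first_difference(a, b)

print(i)                                   # 2
print(a[i], ord(a[i]), b[i], ord(b[i]))    # a 97 o 111
print(a < b)                               # True, because 97 < 111

# Uppercase letters have lower code points than lowercase letters.
print(ord("S"), ord("e"))                  # 83 101
print("Spam" < "eggs")                     # True
```

Because uppercase letters sort before lowercase ones, comparisons of mixed-case text can give surprising results.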

In [10]:
"spam" == "good"

False

In [11]:
"spam" != "good"

True

In [12]:
"spam" == "spam"

True

In [13]:
"sp" < "spam"

True

In [14]:
"spam" < "eggs"

False

In [15]:
"sp" in "spam"

True

In [16]:
"sp" not in "spam"

False

In [17]:
my_string.isalpha()

True

In [18]:
my_string.isdigit()

False

In [19]:
my_string.isspace()

False

## Python supports `Unicode` characters

You can enter Unicode characters directly from the keyboard (how depends on your operating system), or you can write them as escape sequences built from their Unicode code points.

[List of Unicode characters and their code points](https://en.wikipedia.org/wiki/List_of_Unicode_characters).

For example, the code point of the Greek capital omega is `U+03A9`, so you can create the character with the escape `\U000003A9`.

In [20]:
my_resistor = "Spam has an electrical resistance of greater than 100 M\U000003A9"

print(my_resistor)

Spam has an electrical resistance of greater than 100 MΩ
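The eight-digit `\U` escape is one of several equivalent spellings. As a small sketch, the same character can also be written with the four-digit `\u` escape, by its official Unicode name, or with `chr()`; the variable names below are made up for this illustration.

```python
import unicodedata

omega_long = "\U000003A9"                        # 8 hex digits
omega_short = "\u03A9"                           # 4 hex digits
omega_named = "\N{GREEK CAPITAL LETTER OMEGA}"   # official Unicode name
omega_chr = chr(0x03A9)                          # from the integer code point

print(omega_long == omega_short == omega_named == omega_chr)   # True
print(hex(ord(omega_chr)))                                     # 0x3a9
print(unicodedata.name(omega_chr))                             # GREEK CAPITAL LETTER OMEGA
```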


In [21]:
Ω = 100e6

Ω * np.pi

314159265.3589793

### [Emoji](https://en.wikipedia.org/wiki/Emoji) are Unicode characters, so you can use them as well (not all OSs will show all characters!)

In [22]:
radio_active = "\U00002622"
wink = "\U0001F609"

print((radio_active * 5) + " " + (wink * 3))

☢☢☢☢☢ 😉😉😉


### Emoji cannot be used as variable names (at least not yet ...)

In [23]:
☢ = 2.345

☢ ** 2

SyntaxError: invalid character in identifier (<ipython-input-23-ae684a469a13>, line 1)
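One way to check in advance whether a character is allowed in a variable name is the string method `isidentifier()`. The sketch below uses escape sequences so that it does not depend on how your editor displays the symbols; the variable names are made up for this illustration.

```python
omega = "\u03A9"         # GREEK CAPITAL LETTER OMEGA (a letter)
radioactive = "\u2622"   # RADIOACTIVE SIGN (a symbol, not a letter)

omega_ok = omega.isidentifier()
radioactive_ok = radioactive.isidentifier()

print(omega_ok)          # True
print(radioactive_ok)    # False
```

Letters from any alphabet qualify as identifiers, while symbols such as emoji do not.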

### Raw strings - `r" "`
 * Sometimes you do not want Python to interpret escape sequences in the string
 * You can do this by adding an `r` to the front of the string

In [24]:
my_resistor = r"Spam has an electrical resistance of greater than 100 M\U000003A9"

print(my_resistor)

Spam has an electrical resistance of greater than 100 M\U000003A9
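Raw strings are most often useful for Windows-style paths and regular expressions, where backslashes are meant literally. A minimal sketch follows; the path and the sample sentence are invented for this illustration.

```python
import re

interpreted = "C:\new\table"    # \n becomes a newline, \t becomes a tab
raw = r"C:\new\table"           # backslashes are kept as typed

print(len(interpreted), len(raw))   # 10 12

# In regular expressions, \d means "a digit"; a raw string keeps the backslash intact.
numbers = re.findall(r"\d+", "42 orders of spam at 100 megaohms")
print(numbers)                      # ['42', '100']
```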


### Watch out for variable types! 

In [25]:
n = 42

print("I would like " + n + " orders of spam")

TypeError: can only concatenate str (not "int") to str

In [26]:
print("I would like " + str(n) + " orders of spam")

I would like 42 orders of spam


----

## Python `f-string` formatting

In [27]:
my_a = 42
my_b = 1.23456
my_c = True
my_d = 'Spam'

In [28]:
type(my_a), type(my_b), type(my_c), type(my_d)

(int, float, bool, str)

In [29]:
f"I would like {my_a} orders of {my_d}"

'I would like 42 orders of Spam'

In [30]:
my_output = f"I would like {my_a} orders of {my_d}"

print(my_output)

I would like 42 orders of Spam


In [31]:
f"The float {my_b} can be printed with only two places after the decimal: {my_b:.2f}"

'The float 1.23456 can be printed with only two places after the decimal: 1.23'

In [32]:
f"The integer {my_a} can be printed in hex: {my_a:x}, octal: {my_a:o}, or binary: {my_a:b}"

'The integer 42 can be printed in hex: 2a, octal: 52, or binary: 101010'

In [33]:
f"The number {my_b} times 1000 in scientific notation: {my_b * 1000 :.2e}"

'The number 1.23456 times 1000 in scientific notation: 1.23e+03'

In [34]:
f"The value {my_c} as a float: {my_c:f}"

'The value True as a float: 1.000000'

In [35]:
f"The value {my_c} as an integer: {my_c:d}"

'The value True as an integer: 1'
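A few more format specifiers that often come in handy are sketched below. The values `big` and `ratio` are made up for this illustration, and the `=` form assumes Python 3.8 or newer.

```python
my_a = 42
big = 1234567.891     # illustrative value
ratio = 0.256         # illustrative value

with_commas = f"{big:,.2f}"    # thousands separator
as_percent = f"{ratio:.1%}"    # multiply by 100 and add a percent sign
zero_padded = f"{my_a:05d}"    # pad with leading zeros to width 5
with_sign = f"{my_a:+d}"       # always show the sign
with_prefix = f"{my_a:#x}"     # hex with the 0x prefix
debug = f"{my_a = }"           # Python 3.8+: prints the expression and its value

print(with_commas)    # 1,234,567.89
print(as_percent)     # 25.6%
print(zero_padded)    # 00042
print(with_sign)      # +42
print(with_prefix)    # 0x2a
print(debug)          # my_a = 42
```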

----
### Who are you who are so wise in the ways of science?

&nbsp;
<p align="center">
<img src="./images/Witch.gif">
</p>

&nbsp;
## Output from `DataFrames` - `.iterrows()`

In [36]:
import pandas as pd

In [37]:
witch_table = pd.read_csv('./Data/Witches.csv')

In [38]:
print(witch_table)

   Object  Density Does_It_Float
0    Wood     0.43           Yes
1   Bread     0.21           Yes
2   Apple     0.92           Yes
3  Cherry     0.53           Yes
4    Duck     0.87           Yes
5   Witch     0.98           Yes


In [39]:
for index, row in witch_table.iterrows():
    my_out_string = (f"The object: {row['Object']} has a density of {row['Density']:.1f} g/cc")
    print(my_out_string)

The object: Wood has a density of 0.4 g/cc
The object: Bread has a density of 0.2 g/cc
The object: Apple has a density of 0.9 g/cc
The object: Cherry has a density of 0.5 g/cc
The object: Duck has a density of 0.9 g/cc
The object: Witch has a density of 1.0 g/cc


#### Padding - `{Variable:N}`

* `{row['Object']:8}` - the variable `row['Object']` padded to a minimum width of 8 characters
* `{row['Density']:5.1f}` - the variable `row['Density']` padded to a minimum width of 5 characters, with 1 decimal place

In [40]:
for index, row in witch_table.iterrows():
    my_out_string = (f"The object: {row['Object']:8} has a density of {row['Density']:5.1f} g/cc")
    print(my_out_string)

The object: Wood     has a density of   0.4 g/cc
The object: Bread    has a density of   0.2 g/cc
The object: Apple    has a density of   0.9 g/cc
The object: Cherry   has a density of   0.5 g/cc
The object: Duck     has a density of   0.9 g/cc
The object: Witch    has a density of   1.0 g/cc


#### Justified Strings - `{Variable:>N}`

* By default, strings are justified to the left and numbers to the right.
* Use the `>` character to right-justify, and `<` to left-justify.
* `{row['Object']:>8}` - the variable `row['Object']` right-justified in 8 spaces
* `{row['Density']:<5.1f}` - the variable `row['Density']` left-justified in 5 spaces with 1 decimal place.

In [41]:
for index, row in witch_table.iterrows():
    my_out_string = (f"The object: {row['Object']:>8} has a density of {row['Density']:<5.1f} g/cc")
    print(my_out_string)

The object:     Wood has a density of 0.4   g/cc
The object:    Bread has a density of 0.2   g/cc
The object:    Apple has a density of 0.9   g/cc
The object:   Cherry has a density of 0.5   g/cc
The object:     Duck has a density of 0.9   g/cc
The object:    Witch has a density of 1.0   g/cc
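Beyond `<` and `>`, the format mini-language also supports `^` for centering, a fill character placed before the alignment symbol, and a width taken from a variable. The sketch below uses a plain list instead of the DataFrame so that it runs on its own; the rows are copied from the table above, and the names `rows`, `col`, and `lines` are made up for this illustration.

```python
# (name, density in g/cc) pairs copied from the table above
rows = [("Wood", 0.43), ("Cherry", 0.53), ("Duck", 0.87)]

# Column width = longest name plus one space of padding on each side.
col = max(len(name) for name, _ in rows) + 2

# {name:^{col}}       centers the name in a field whose width comes from a variable
# {density:*>7.2f}    right-justifies in 7 characters, filling with '*'
lines = [f"|{name:^{col}}|{density:*>7.2f}|" for name, density in rows]

for text in lines:
    print(text)

# |  Wood  |***0.43|
# | Cherry |***0.53|
# |  Duck  |***0.87|
```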


### Really long strings

* Adjacent string literals inside parentheses are joined into one string
* Add `\n` wherever you want a line break

In [42]:
long_string = (
    f"Well, there's egg and bacon; egg sausage and bacon; "
    f"egg and spam; egg bacon and spam; egg bacon sausage and spam; \n"
    f"spam bacon sausage and spam; spam egg spam spam bacon and spam: "
    f"spam sausage spam spam bacon spam tomato and spam; \n"
    f"spam spam spam egg and spam; spam spam spam spam spam spam baked beans spam spam spam \n"
    f"or Lobster Thermidor au Crevette with a Mornay sauce served in a Provencale manner with shallots \n"
    f"and aubergines garnished with truffle pate, brandy and with a fried egg on top and spam."
)

In [43]:
print(long_string)

Well, there's egg and bacon; egg sausage and bacon; egg and spam; egg bacon and spam; egg bacon sausage and spam; 
spam bacon sausage and spam; spam egg spam spam bacon and spam: spam sausage spam spam bacon spam tomato and spam; 
spam spam spam egg and spam; spam spam spam spam spam spam baked beans spam spam spam 
or Lobster Thermidor au Crevette with a Mornay sauce served in a Provencale manner with shallots 
and aubergines garnished with truffle pate, brandy and with a fried egg on top and spam.
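An alternative for long, multi-line text is a triple-quoted string, in which the line breaks you type are kept. This is a sketch of one possible approach, not a replacement for the cell above; `textwrap.dedent` from the standard library removes the common leading indentation, and the shortened menu text is only an illustration.

```python
import textwrap

# The backslash after the opening quotes suppresses the initial empty line.
menu = textwrap.dedent("""\
    egg and bacon
    egg sausage and bacon
    egg and spam
""")

print(menu)

menu_items = menu.splitlines()   # one list entry per line
print(len(menu_items))           # 3
```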


----

## Python has lots of built-in [String Methods](https://docs.python.org/3/library/stdtypes.html#string-methods).

&nbsp;

<p align="center"> 
<img src="./images/Hovercraft.gif">
</p>

&nbsp;

In [44]:
line = "My hovercraft is full of eels"

In [45]:
line

'My hovercraft is full of eels'

### Find and Replace

In [46]:
line.replace('is full of eels', 'has no wheels')

'My hovercraft has no wheels'
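For the "find" half, a short sketch of the related search methods is shown below. `find` returns the index of the first match, or `-1` if there is none; the variable names are made up for this illustration.

```python
line = "My hovercraft is full of eels"

position = line.find("full")              # index where the first match starts
missing = line.find("wheels")             # -1 means "not found"
how_many = line.count("e")                # number of non-overlapping matches
first_only = line.replace("e", "E", 1)    # replace only the first match

print(position)                   # 17
print(missing)                    # -1
print(how_many)                   # 3
print(first_only)                 # My hovErcraft is full of eels
print(line.startswith("My"))      # True
print(line.endswith("eels"))      # True
```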

### Justification and Cleaning

In [51]:
line.center(100)

'                                   My hovercraft is full of eels                                    '

In [52]:
line.ljust(100)

'My hovercraft is full of eels                                                                       '

In [53]:
line.rjust(100, "*")

'***********************************************************************My hovercraft is full of eels'

In [54]:
line2 = "            My hovercraft is full of eels      "

In [55]:
line2

'            My hovercraft is full of eels      '

In [56]:
line2.strip()

'My hovercraft is full of eels'

In [57]:
line3 = "*$*$*$*$*$*$*$*$My hovercraft is full of eels*$*$*$*$"

In [58]:
line3

'*$*$*$*$*$*$*$*$My hovercraft is full of eels*$*$*$*$'

In [59]:
line3.strip('*$')

'My hovercraft is full of eels'

In [60]:
line3.lstrip('*$')

'My hovercraft is full of eels*$*$*$*$'
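One detail worth keeping in mind: the argument to `strip`, `lstrip`, and `rstrip` is treated as a set of characters to remove, not as a prefix or suffix. The sketch below illustrates the difference; the file name is made up, and `removesuffix` assumes Python 3.9 or newer.

```python
line3 = "*$*$*$*$*$*$*$*$My hovercraft is full of eels*$*$*$*$"

right_only = line3.rstrip("*$")     # strip from the right end only
print(right_only)                   # *$*$*$*$*$*$*$*$My hovercraft is full of eels

# Any combination of the listed characters is removed from both ends.
print("www.example.com".strip("cmowz."))    # example

# Pitfall: rstrip removes characters, so it can eat part of the name.
too_much = "test.txt".rstrip(".txt")
print(too_much)                             # tes

# Python 3.9+: removesuffix removes the exact ending, once.
just_right = "test.txt".removesuffix(".txt")
print(just_right)                           # test
```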

### Splitting and Joining

In [61]:
line.split()

['My', 'hovercraft', 'is', 'full', 'of', 'eels']

In [62]:
'___'.join(line.split())

'My___hovercraft___is___full___of___eels'

In [63]:
' '.join(line.split()[::-1])

'eels of full is hovercraft My'
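`split` also accepts an explicit separator and a maximum number of splits. A minimal sketch, assuming a comma-separated record made up to resemble one row of the table above:

```python
record = "Wood,0.43,Yes"    # illustrative comma-separated row

name, density, floats = record.split(",")
density = float(density)    # split always returns strings, so convert numbers yourself
print(name, density, floats)            # Wood 0.43 Yes

# The second argument limits the number of splits.
first, rest = "My hovercraft is full of eels".split(" ", 1)
print(first)                            # My
print(rest)                             # hovercraft is full of eels

# With no argument, runs of whitespace count as one separator;
# with an explicit separator, empty strings are kept.
print("spam  eggs".split())             # ['spam', 'eggs']
print("spam  eggs".split(" "))          # ['spam', '', 'eggs']
```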

### Line Formatting

In [64]:
anotherline = "mY hoVErCRaft iS fUlL oF eEELS"

In [65]:
anotherline

'mY hoVErCRaft iS fUlL oF eEELS'

In [66]:
anotherline.upper()

'MY HOVERCRAFT IS FULL OF EEELS'

In [67]:
anotherline.lower()

'my hovercraft is full of eeels'

In [68]:
anotherline.title()

'My Hovercraft Is Full Of Eeels'

In [69]:
anotherline.capitalize()

'My hovercraft is full of eeels'

In [70]:
anotherline.swapcase()

'My HOveRcrAFT Is FuLl Of Eeels'
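A common use of these methods is comparing text without regard to case. The sketch below uses `casefold()`, which works like `lower()` but is more thorough for some non-English letters; the example strings are made up for this illustration.

```python
a = "mY hoVErCRaft"
b = "My Hovercraft"

print(a == b)                            # False
same = a.casefold() == b.casefold()
print(same)                              # True

# casefold handles cases that lower() leaves alone, such as the German sharp s.
street = "Stra\u00dfe"                   # "Strasse" written with a sharp s
print(street.casefold() == "strasse")    # True
print(street.lower() == "strasse")       # False

# Pitfall: title() capitalizes after every non-letter, including apostrophes.
titled = "well, there's egg and bacon".title()
print(titled)                            # Well, There'S Egg And Bacon
```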