# 1. Arithmetic, Variable Assignment and Strings

## Prerequisites
- Enthusiasm for and determination to learn Python for Data Science.
- Patience... lots of patience: this is essential for learning anything, but especially coding, where literally nothing will ever go right first time.

## Learning objectives
- Understand the difference between integers and floats.
- Understand the difference between None and int 0.
- Know how to use Python as a calculator.
- Know how to assign variables, and variable naming conventions.
- Understand strings and some basic string methods including .format().
- Know how to perform indexing and slicing.

# Introduction

## What is Python?
- A coding language: a system of rules that can be used to operate a computer.
- Dynamically typed (we will go into detail later).
- Can have many uses (networks, games, graphics, apps).
- We are focusing on data science applications of Python.
- Python can be used for the whole data pipeline, from data input to model validation.

## What is an IDE?
- Integrated Development Environment: a place to write programs in a language.
- Most famous for Python are PyCharm and Jupyter Notebook (also Spyder).
- We will use Jupyter Notebook (advantages/drawbacks will be clear over time).
- Jupyter Notebook works using cells.
- You input your code into each cell and then run the code per cell.
- Each cell acts as a 'mini-script' which can be executed independently of other cells.
- The file type of a Jupyter Notebook is a __.ipynb file__ (IPython NoteBook).
- .ipynb files are stored as JSON, so opening one in a plain text editor shows the raw file rather than a usable notebook; to work with it, you have to open Jupyter Notebook first and then open the file from inside it.
- The other common type of Python file is a .py script: this is a single file written usually in a text editor, and when it is executed, the entirety of the code runs in one go, rather than individual blocks.
- .py scripts can be opened using any text editor including TextEdit on Mac or the equivalent on Windows (also other text editors).

## Jupyter Notebook
- Has many keyboard shortcuts: you will get used to them.
- Press Esc to enter command mode, where the shortcuts work: use Esc + H to find a list.
- 2 main types of cell:
    - Code: this is self-explanatory.
    - Markdown: used to write explanatory text/titles etc in a notebook.

## Comments
- A comment is a piece of text added to code to explain it and make it easier to understand for anyone reading who isn't the author.
- \# at the start of a line indicates that a line is a comment in a code cell.
- There are 2 schools of thought on comments: some argue that comments should be used extensively, others argue they should be minimal as the code should be self-explanatory.
- Do not be afraid of adding quite a lot of comments at first; as you get better at coding you will begin to realise what requires a comment and what doesn't.
- Remember __readability and reproducibility are paramount__. Use this to guide your usage of comments.
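
As a small illustration of the points above, the sketch below uses a comment to explain why a line exists rather than restating what it does. The variable names and the temperature scenario are made up for this example:

```python
# Hypothetical example: prepare a temperature reading for a report.
temperature_c = 20.0

# The report template expects Fahrenheit, so convert before displaying.
temperature_f = temperature_c * 9 / 5 + 32

print(temperature_f)   # 68.0
```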

## Data Types in Python
Python can take in many data types. These include:
- Integers (int).
- Floating Point Numbers (float).
- Strings (str).
- Lists (list).
- Dictionaries (dict).
- Tuples (tuple).
- Sets (set).
- Booleans (bool).

## Numbers in Python
- Numbers in Python are of two types: 
    - Integers (int).
    - Floating-point numbers (float).
- Put simply: 
    - An integer is a whole number (no decimal point).
    - A floating-point number is any number with a decimal point.
- The only catch with floating-point numbers is that they are stored in binary with limited precision, so results can differ very slightly from what you expect; hence it is often better to round them.
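
A minimal sketch of the floating-point catch described above, using the classic 0.1 + 0.2 case. The variable names are illustrative only:

```python
# 0.1 and 0.2 cannot be stored exactly in binary floating point,
# so their sum is very slightly off from 0.3.
total = 0.1 + 0.2
print(total)                   # 0.30000000000000004
print(total == 0.3)            # False

# Rounding to a sensible number of decimal places tidies the result.
rounded_total = round(total, 2)
print(rounded_total)           # 0.3
print(rounded_total == 0.3)    # True
```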

## Arithmetic in Python
- Arithmetic is fairly simple: we use the standard operators: 
### / * + - <br>
- Python will follow Order of Operations (BIDMAS).
- We use double asterisk __\*\*__ for powers (roots are just fractional powers).
- Division always returns a floating-point number.
- There are a couple of special operations:
    - Modulo (x%y) gives the remainder of dividing x by y.
    - Floor division (x//y) gives the result of dividing x by y rounded down to the nearest integer.
- When using arithmetic operators, leave one space either side by convention, unless not leaving a space makes it clearer.

In [None]:
# Addition
2 + 1

3

In [None]:
# Subtraction
2 - 1

1

In [None]:
# Multiplication
2 * 2

4

In [None]:
# Division
10 / 1

10.0

In [None]:
# Floor Division
7 // 4

1

In [None]:
# Modulo
7 % 4

3

In [None]:
# If modulo 2 of a number is equal to zero, the number is even
6 % 2

0

In [None]:
# Powers
2 ** 3

8

In [None]:
# Can also do roots this way
4 ** 0.5

2.0

In [None]:
# Order of Operations followed in Python: brackets first, then multiplication, then addition
(2 + 10) * 10 + 3

123

In [None]:
# Can use parentheses to specify orders
(2 + 10) * (10 + 3)

156

In [None]:
# use round(expression, decimal_places) to give rounded answer
round(10/3, 4)

3.3333
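
Floor division and modulo are often used together. The sketch below, with made-up variable names, splits a number of minutes into whole hours and leftover minutes:

```python
# Hypothetical example: split 135 minutes into hours and minutes.
total_minutes = 135

hours = total_minutes // 60      # floor division: whole hours
minutes = total_minutes % 60     # modulo: minutes left over

print(hours)     # 2
print(minutes)   # 15

# Recombining the two parts gives back the original number.
print(hours * 60 + minutes)   # 135
```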

### NoneType
- We must distinguish between None and 0 here.
- None has data type 'NoneType' and represents the absence of a value (can use as placeholder before adding values).
- 0 is an integer, and therefore a value: this shows that we have a value, but the value is the integer 0.
- We can see this clearly when we check the types of each:

In [None]:
type(None)

NoneType

In [None]:
type(0)

int

In [None]:
# 0 + 1 works as they are both numbers
0 + 1

1

In [None]:
# None + 1 throws an error, as None means there is nothing there
None + 1

TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
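
The placeholder use of None can be sketched as follows. The variable names are hypothetical, and the `is None` check is the usual way to test for None, although it is not otherwise covered in this notebook:

```python
# Hypothetical example: no measurement has been recorded yet.
measurement = None
was_missing = measurement is None
print(was_missing)           # True

# Once a value arrives, it replaces the placeholder.
# Note that 0 is a real value, not a missing one.
measurement = 0
still_missing = measurement is None
print(still_missing)         # False
print(measurement + 1)       # 1
```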

## Variable Assignment
- Often we want to use the same object repeatedly, and hence rather than defining it repeatedly in code we can assign a variable which we use instead.
- This concept is called DRY coding (Don't Repeat Yourself): we will see many examples of this throughout.
- We assign variables by using the equals sign __=__ in Python.
- The variable name goes on the left of the = sign and the value on the right. Putting an expression on the left throws an error:

In [None]:
x + 1 = y

SyntaxError: cannot assign to operator (<ipython-input-2-ae2623f81281>, line 1)

## Variable Naming
1. Use __snake_case__: all lowercase, no spaces, underscores \_ instead.
2. Names cannot start with a number or use these symbols: 
__\: \' \" \, \< \> \/ \? \| \\ \( \) \! @ \# \$ \% \^ \& \* \~ \- \+__
3. Avoid using 'l' (lowercase l), 'O' (uppercase o), or 'I' (uppercase i) as single character names.
4. Do __NOT__ use Python keywords (see below for a list of keywords): Python will throw a SyntaxError. Also avoid the names of built-in functions such as print or str.

__If you reassign a built-in name by accident, use Kernel --> Restart to reset everything back to normal.__ 

See PEP8 for further details in this [link](https://www.python.org/dev/peps/pep-0008/)
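
As a rough check of rules 2 and 4, the sketch below tests a few made-up candidate names using the built-in string method .isidentifier() and keyword.iskeyword() from the standard library. It does not check the snake_case convention, which is a matter of style rather than syntax:

```python
import keyword

# Hypothetical candidate names, checked against the rules above.
print("shopping_bill".isidentifier())   # True: valid name
print("2nd_bill".isidentifier())        # False: starts with a number
print("bill-total".isidentifier())      # False: contains a hyphen

# Keywords pass isidentifier(), so they need a separate check.
print("class".isidentifier())           # True
print(keyword.iskeyword("class"))       # True: cannot be used as a name
```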

In [None]:
# List of keywords
import keyword

for i in keyword.kwlist:
    print(i)

False
None
True
and
as
assert
async
await
break
class
continue
def
del
elif
else
except
finally
for
from
global
if
import
in
is
lambda
nonlocal
not
or
pass
raise
return
try
while
with
yield


Here we create an object named x (a variable) and give it the integer value 3. <br>
The \= sign assigns the value on the right to the object on the left:

In [None]:
x = 3

This is called calling the variable:

In [None]:
x

3

Arithmetic done using variables uses the system of the underlying objects (differs between data types). <br>
We will see how this works as we discover more data types:

In [None]:
x + x

6

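Python uses __'dynamic typing'__: a variable can be reassigned at any time, even to a value of a different type (e.g. from an integer to a string). In a __'statically typed'__ language such as C, changing the type of a variable would throw an error.<br>
Here we reassign x to be 10 with no errors (this is called reassignment):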

In [None]:
x = 10

We can now see x has changed from 3 to 10:

In [None]:
x

10
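
To make dynamic typing concrete, the sketch below reassigns one name to values of three different types. It uses a separate, made-up variable called value so that x is left untouched for the cells that follow:

```python
# Illustrative variable name; not the x used elsewhere in this notebook.
value = 10
print(type(value))    # <class 'int'>

# The same name can be reassigned to a float, then to a string.
value = 10.5
print(type(value))    # <class 'float'>

value = "ten"
print(type(value))    # <class 'str'>
```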

#### A short note on the print() function
- So far we have just called the variable; now we see how to display it.
- The print() function __displays__ the output rather than just returning it: in Jupyter Notebook this often makes no practical difference, but when working in other IDEs, you may not see the output unless you print it.
- This becomes obvious when you want to see 2 things from a cell: Jupyter Notebook will __only show you the latest call__.
- print() statements can be used to display multiple outputs.
- We can specify the 'end' parameter inside the print() function to change how a print() statement ends.
- e.g. __end = "\n\n" ends the output with two new lines, leaving a blank line after it: this can be useful to make the output more readable__.
- A blank print() statement also gives a blank line.

In [None]:
# assigning a and b
a = 1
b = 2

# calling a and b
a
b

2

In [None]:
# printing a and b
print(a)
print(b)

1
2


In [None]:
# using end to give a new line
print(a, end="\n\n")
print(b)

1

2


In [None]:
# using a blank print statement to give a new line
print(a)
print()
print(b)

1

2
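
The 'end' parameter can also remove the new line. A minimal sketch, reusing a and b from above:

```python
a = 1
b = 2

# end=" " replaces the default new line with a space,
# so both values appear on one line: 1 2
print(a, end=" ")
print(b)

# The default is end="\n", so these two calls behave the same way.
print(a)
print(a, end="\n")
```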


## Back to Variable Assignment...
We can even redefine x using x itself. <br>
Here, x is being __called__ on the right of the = sign, and being *redefined* on the left. <br>
We can think of this as *new x* = __old x__ + __old x__ (*new x* = __10__ + __10__):

In [None]:
x = x + x

x is now equal to 20:

In [None]:
x

20

In [None]:
# clearly named variables are key when writing production-quality code
shopping_bill = 10.00

vat_rate = 1.2

bill_with_vat = shopping_bill * vat_rate

bill_with_vat

12.0

## Strings in Python
- Strings (str) are a way of representing textual information in Python.
- They are denoted by quotation marks: single ('') or double ("").
- It is usually best to use double quotation marks, as apostrophes in text can prematurely end a single-quoted string.
- Apostrophes or other special characters in strings can be escaped using the backslash \\ character.

## String Functions and Methods
- print() displays the output contained within it.
- For strings, print() interprets escape characters (tabs, new lines etc.) and displays the string without quotations.
- A method is a function associated with an object.
- Strings have many associated methods; we will look at a few here.
- Can find the rest at: 
<br> https://docs.python.org/3/library/stdtypes.html#string-methods

In [None]:
x = "Hello World"
print('hi')

hi


In [None]:
# Using single quotations can be problematic, here, the apostrophe ends the string early
print('What's the problem here?')

SyntaxError: invalid syntax (<ipython-input-19-f68951f46ab8>, line 1)

In [None]:
# we can use double quotations
print("What's the problem here?")

What's the problem here?


In [None]:
# or backslash if we have single and double quotes
print("What\'s the \"problem\" here?")

What's the "problem" here?
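
The backslash is also used for the tab and new line escape characters mentioned above. The sketch below, with a made-up string, shows how print() interprets them, while repr() shows the raw string much as a notebook does when the variable is called:

```python
# \t is a tab and \n is a new line; print() interprets both.
shopping_list = "Item:\tApples\nItem:\tPears"
print(shopping_list)

# repr() shows the string with its quotes and escape characters intact.
print(repr(shopping_list))

# Each escape sequence counts as a single character.
print(len("\t"))   # 1
```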


In [None]:
# .upper() method makes all caps (not inplace)
print(x.upper())
print(x)

# str() converts a value to a string, int() converts it back to an integer
y = 1
y = str(1)
y = int(y)
round(y)

HELLO WORLD
Hello World


1

In [None]:
# if we want to change x itself, we must reassign it
print(x)

x = x.upper()

print(x)

Hello World
HELLO WORLD


In [None]:
z = "how are you?"
z.capitalize()

'How are you?'

In [None]:
# .lower() method makes all lowercase (not inplace)
print(x.lower())

hello world


In [None]:
# .split() method splits on whitespace as default or desired separator
x = "Hello World"  # reset x, which was reassigned to upper case above
print(x.split())
print(x.split("o"))

['Hello', 'World']
['Hell', ' W', 'rld']
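
Because each of these methods returns a new object rather than changing the original, calls can be chained. A short sketch with made-up strings:

```python
# Illustrative string; the variable name is made up for this example.
sentence = "The Quick Brown Fox"

# Each method returns a new string (or list), so calls can be chained.
words = sentence.lower().split()
print(words)        # ['the', 'quick', 'brown', 'fox']

# The original string is unchanged because methods are not inplace.
print(sentence)     # The Quick Brown Fox

# split() with a separator removes the separator from the result.
date_parts = "2021-03-15".split("-")
print(date_parts)   # ['2021', '03', '15']
```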


## .format Method
- The .format method is a way of inserting something into a string.
- This can be other strings or a variable taken from elsewhere in your code.
- The syntax used is detailed below.
- When using .format to add in a float to a string, we can specify the width and precision of the decimal.

In [None]:
# default prints in order
print("The {} {} {}".format("fox", "brown", "quick"))

The fox brown quick


In [None]:
# can index
print("The {2} {1} {0}".format("fox", "brown", "quick"))

The quick brown fox


In [None]:
# can use variable keys for readability
print("The {q} {b} {f}".format(f="fox", b="brown", q="quick"))

The quick brown fox


In [None]:
# create long decimal
result = 100/777
print(result, end = "\n\n")

# use value:width.precisionf for formatting
# width is minimum length of string, padded with whitespace if necessary
# precision is decimal places
print("The result was {:1.3f}".format(result))
print("The result was {r:1.3f}".format(r=result))
print("The result was {r:1.7f}".format(r=result))
print("The result was {r:.3f}".format(r=result))

0.1287001287001287

The result was 0.129
The result was 0.129
The result was 0.1287001
The result was 0.129
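
The examples above all use a width of 1, which is too small to have a visible effect. The sketch below uses a larger width to show the padding. The square brackets are only there to make the whitespace visible:

```python
result = 100 / 777

# Width 10, precision 3: the number is right-aligned and padded with spaces.
padded = "{:10.3f}".format(result)
print("[" + padded + "]")     # [     0.129]
print(len(padded))            # 10

# A width smaller than the number has no effect, which is why
# {:1.3f} and {:.3f} give the same output.
print("{:1.3f}".format(result) == "{:.3f}".format(result))   # True
```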


## String Indexing and Slicing
- Strings are iterable, meaning they can return their elements one at a time.
- Strings are also immutable, meaning their elements cannot be changed once assigned.
- They must be __REASSIGNED__ in order to change them.
- Each character in the string is one element: this includes spaces and punctuation.
- We can make use of this to call back one element (indexing).
- Or a range of elements (slicing).

In Python:
- Indexing starts at 0 (zero).
- Slicing is inclusive at the lower bound (including).
- Slicing is exclusive at the upper bound (up to but not including).

In [None]:
my_first_string = "Hello World"

In [None]:
# Index 0 gives first element
my_first_string[0]

'H'

Use colon to indicate slice, 1:4 returns 2nd (index 1), 3rd (index 2), 4th (index 3) items __but not 5th (index 4)__:

In [None]:
my_first_string[1:4]

'ell'

With no upper bound, the slice starts at the index given and runs to the end of the string:

In [None]:
my_first_string[1:]

'ello World'

With no lower bound, the slice starts from index 0 and runs up to but not including the upper bound:

In [None]:
my_first_string[:3]

'Hel'

We cannot change elements of a string:

In [None]:
my_first_string[0] = 'l'

TypeError: 'str' object does not support item assignment
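
To get a changed string, we build a new one and assign it to a variable. The sketch below is one way to do this, assuming string concatenation with +, which is not otherwise covered in this notebook. The name new_string is made up for the example:

```python
my_first_string = "Hello World"

# Build a new string from a new first letter plus a slice of the old one.
new_string = "J" + my_first_string[1:]
print(new_string)        # Jello World

# The original string has not been changed.
print(my_first_string)   # Hello World
```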

## Summary
We now understand:
- The nature of ints and floats.
- The nature of None.
- The nature of variables.
- What strings and print formatting are.
<br><br>

We now know how to:
- Use Python as a calculator.
- Assign variables and use them.
- Use strings and string methods including .upper(), .lower(), .split() and .format().
- Add in objects to strings using .format().
- Format floats.
- Index strings.
- Slice strings.


Please refer back to this notebook to check if you are unsure of any commands or syntax. <br>
Please use the documentation below to find further string methods and conventions for style in Python.

## Further reading
- PEP8 Style Documentation: https://www.python.org/dev/peps/pep-0008/
- String Methods: https://docs.python.org/3/library/stdtypes.html#string-methods
- Python None: https://docs.python.org/3/c-api/none.html