# Introduction to Python

## What is Python?

Python is an extremely common programming language that is becoming more and more popular with all types of programmers. It is especially strong in data handling and is the language of choice for many AI and data science applications, including machine learning.

Guido van Rossum began developing Python in the late 1980s and first released it in 1991. It was specifically designed for programmers (not computers), which makes it one of the best programming languages to learn as a first language.

> So I set out to come up with a language that made programmers more productive, and if that meant that the programs would run a bit slower, well, that was an acceptable trade-off - Van Rossum




### The Zen of Python

Python has a philosophy built into the language, which `import this` prints:

In [2]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


## Running the Python interpreter

If you are using the recommended Jupyter Lab, then you can easily access a terminal in the same window as this notebook.

We can start an interactive Python **interpreter** session by typing the following 

```bash
#  the $ indicates the command is run in a shell
$ python
```

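We can see where this Python executable is located on our machine using the shell program `which`. On Windows `which` is not available, which is why the cell below reports an error; `where python` is the Windows equivalent: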

In [5]:
#  use ! to run bash/shell code in the notebook directly
!which python

'which' is not recognized as an internal or external command,
operable program or batch file.


And what version of Python we are using by running Python with a command line argument:

In [4]:
#  use ! to run bash/shell code in the notebook directly
!python --version

Python 3.8.13


# Python development environment

## IPython

IPython = Interactive Python 
- command shell for interactive computing
- IPython is what runs in Jupyter


## Python development environments (IDEs)

An IDE is where you develop your code. 
In Python, we run code in 2 different ways: 
- IPython: lets you run small snippets of code directly
- Python: script mode, runs entire programs

## Using Python
Python works with (mainly) 2 different things at the lowest level:
1. Data types
2. Functions

Data types store data. Functions act on or change those bits of data.
We name different <i> instances </i> of those data types and save them as variables to make our code easier to read and work with. 
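
As a small sketch of these ideas (the names `price`, `quantity` and `total` are made up for illustration), the cell below saves two values as variables and lets an operator and two built-in functions act on them:

```python
# Two pieces of data saved under readable variable names (hypothetical names)
price = 2.5      # a float
quantity = 4     # an integer

# An operator combines the data; built-in functions act on the results
total = price * quantity
print(type(price), type(quantity))
print(total)
print(round(total))
```

Giving values names like this is what makes later code easier to read than working with the raw numbers.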

## Data Types - Numbers
There are two main data types for holding numeric values in Python: integers (`int`) and floats (`float`).
Python can be used just like a calculator to work on these numbers. 

In [6]:
type(21)

int

In [7]:
type(21.21)

float

We can use the typical operators we know `(+, *, /, etc.) ` to perform math operations.

NOTE: `=` in Python is used to assign a value to a name. It is NOT necessary for doing a math operation.

In [8]:
7*3

21

Python will complete the operations based on correct order of operations.

In [10]:
7*3+2

23

We can do exponentiation:

In [11]:
2**3

8

In [12]:
2**2

4

The modulo (`%`) operator gives us the remainder after division:

In [14]:
23%5

3

This can be used to check if a number is even:

In [15]:
24%2

0
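
The even check can be turned into a true/false answer by comparing the remainder with 0. This is only a sketch; the names `number` and `is_even` are assumptions for illustration, and comparisons are covered in more detail further down:

```python
number = 24
is_even = number % 2 == 0    # a remainder of 0 means the number divides evenly by 2
print(is_even)

odd_check = 23 % 2 == 0      # 23 leaves a remainder of 1, so this is False
print(odd_check)
```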

## Data Types - Booleans

In [17]:
type(False)

bool

In [1]:
type(0)

int

Many expressions in Python have a "truth value" and evaluate to one of the two boolean values.
- values which are evaluated to `True` or `False`

In [5]:
bool(0)

False

In [4]:
bool([])

False

In [6]:
bool(-1)

True

In [7]:
bool(1)

True

In [8]:
bool(2)

True

In [9]:
bool([0])

True

In [10]:
bool([])

False
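
Truth values matter most inside conditionals (introduced later in this notebook). A minimal sketch, assuming a hypothetical list called `basket`:

```python
basket = []    # an empty list has the truth value False

if basket:
    status = 'basket has items'
else:
    status = 'basket is empty'

print(status)
```

Because `bool([])` is `False`, the `else` branch runs here; adding any item to `basket` would flip the result.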

## Data Types: Strings

String data types hold text data. If a number appears in a string, it isn't treated as a number (can't be added, multiplied, etc.).
They can be written in multiple ways:

Can use `"`, `'` or `str()`
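
As a brief sketch of `str()`, and of why digits inside a string are not treated as numbers (the variable names are made up for illustration):

```python
number_as_text = str(21)                     # converts the integer 21 to the string '21'
joined = number_as_text + '21'               # + joins strings together, it does not add them
back_to_number = int(number_as_text) + 21    # convert back with int() to do arithmetic

print(repr(number_as_text), repr(joined), back_to_number)
```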

In [11]:
'DSR is fun!'

'DSR is fun!'

In [12]:
"DSR is very interesting!"

'DSR is very interesting!'

Files are just big lists of characters

`\n` is the newline character.

In [16]:
lines = "DSR is fun! \nDSR is very interesting!"
print(lines)

DSR is fun! 
DSR is very interesting!


Multiple line strings:

In [121]:
# showing that the line break is read from the typed line break
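
One common way to write a multiple line string is with triple quotes. The sketch below is an illustration rather than the original cell's content; it shows that a line break typed inside the quotes is stored as `\n`:

```python
lines = """DSR is fun!
DSR is very interesting!"""

print(lines)          # prints on two lines
print(repr(lines))    # repr shows the stored \n character
```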


Strings next to each other are joined:

In [21]:
"py" "thon"

'python'

In [29]:
print("py" "thon")

python


In [19]:
print("py", "thon")

py thon


We can add strings together (called concatenation):

In [23]:
'yes' + ' ' + 'no'

'yes no'

In [18]:
"py"+"thon"

'python'

And multiply them:

In [25]:
'yes '*3

'yes yes yes '

What happens when we try to multiply this string?:

In [28]:
'2 '*3

'2 2 2 '

## String formatting

There are many formatting tricks in Python to control how strings appear. 
Most formatting commands involve the `{}` curly brackets.

In [30]:
name = 'anand'
print('{} works as a data scientist'.format(name))

anand works as a data scientist


In [32]:
day = 'Tuesday'
tomorrow = 'Wednesday'
print('Today is {}, tomorrow is {}'.format(day,tomorrow))

Today is Tuesday, tomorrow is Wednesday


In [35]:
day = 'Tuesday'
tomorrow = 'Wednesday'
print(f'Today is {day}, tomorrow is {tomorrow}')

Today is Tuesday, tomorrow is Wednesday


We can control the formatting of decimal places

In [34]:
x= 0.110567
f'{x:.4f}', f'{x:.2f}', f'{x:.1f}' 

('0.1106', '0.11', '0.1')

## String stripping

A common operation is removing whitespaces:

In [36]:
'   python is interesting to work   '.strip()

'python is interesting to work'
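
The string methods `strip`, `lstrip` and `rstrip` remove whitespace from both ends, the left end, or the right end of a string. A brief sketch, using a made-up `padded` string:

```python
padded = '   python is interesting   '

both_ends = padded.strip()     # removes leading and trailing whitespace
left_only = padded.lstrip()    # removes leading whitespace only
right_only = padded.rstrip()   # removes trailing whitespace only

# repr makes the remaining spaces visible
print(repr(both_ends))
print(repr(left_only))
print(repr(right_only))
```

Note that none of these touch the spaces between words.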

A related operation is removing characters from a string, which can be done by replacing them with the empty string `''`:

In [37]:
'python is interesting to work'.replace(' ','')

'pythonisinterestingtowork'

## `in`

We can check whether an object exists in an iterable using `in`. As strings are iterable, this syntax works with strings (note that the check is case-sensitive):

In [38]:
new_string = 'Python'

In [40]:
'p' in new_string

False

In [41]:
'P' in new_string

True

## Conditionals
Booleans can be used in conditional commands. 
Conditional commands work just like the "IF..THEN.." command you know from Microsoft Excel.


Try changing the value of `x` in the example below:

In [45]:
x = 12

In [47]:
if x <= 10:
    print('x is less than or equal to 10')
    y = x**2
    print(y)
else:
    print('x is greater than 10')

x is greater than 10


We can add as many extra branches as we need by adding `elif` clauses to the conditional block:

In [48]:
if x <= 10:
    print('x is less than or equal to 10')
    y = x**2
    print(y)
elif x < 20:
    print(f'{x} is less than 20')
else:
    print(f'{x} is greater than or equal to 20')

12 is less than 20


In [50]:
y = 20
x = 7
if x <= 10 or y <= 20:
    print('x is less than or equal to 10, or y is less than or equal to 20')
elif x < 20:
    print(f'{x} is less than 20')
else:
    print(f'{x} is greater than or equal to 20')

x is less than or equal to 10, or y is less than or equal to 20


## Comparisons
A double equals sign (`==`) tests whether two values are equal; a single `=` is assignment.

An exclamation mark followed by an equals sign (`!=`) means "is not equal to".
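
A short sketch of the comparison operators in action (the values of `a` and `b` are arbitrary and chosen only for illustration). Each comparison evaluates to a boolean:

```python
a = 7
b = 3

equal = a == b             # False: 7 is not equal to 3
not_equal = a != b         # True
greater = a > b            # True
less_or_equal = a <= b     # False

print(equal, not_equal, greater, less_or_equal)
```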

## Logical operators

`and`, `or`, `not`

In [51]:
True and True

True

In [52]:
True and False

False

In [53]:
True or True

True

In [54]:
True or False

True

In [56]:
True or True and False

True

In [57]:
False and True or True

True
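
The last two results depend on operator precedence: `not` binds most tightly, then `and`, then `or`. The sketch below (variable names are made up) uses parentheses to show how Python groups the expression, and how a different grouping changes the answer:

```python
without_parentheses = True or True and False    # grouped as True or (True and False)
explicit_grouping = True or (True and False)    # same grouping, written out
other_grouping = (True or True) and False       # forcing the `or` to happen first

print(without_parentheses, explicit_grouping, other_grouping)
```

When in doubt, adding parentheses makes the intended grouping explicit for the reader.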

## Datatypes: List
A list in Python is an ordered collection of items. It is created with brackets `[]` or using `list()`.
We can use **selectors** to select specific items within the list.

In [59]:
my_list = ['dog',1,2,3]
my_list

['dog', 1, 2, 3]

Selectors are called after the list using `[]`. Python starts counting at 0. To get the first element of the list then, we need to use 0 inside the selector. To get the second item, we would put a 1 inside the selector. 

We can also count from the back of the list: -1 selects the last element, and each further step toward the front subtracts one more. For example, -2 selects the 2nd-from-last element.

In [60]:
my_list[0]

'dog'

In [61]:
my_list[-1]

3

In [62]:
my_list[-2]

2

In [63]:
my_list[2]

2

Selecting a range of a list (slicing):

In [65]:
weekdays = [
    "Monday",
    "Tuesday",
    "Wednesday",
    "Thursday",
    "Friday",
    "Saturday",
    "Sunday"
]

Select Thursday and Friday from the list

In [68]:
weekdays[-4:-2]

['Thursday', 'Friday']

In [69]:
weekdays[4:2:-1]

['Friday', 'Thursday']

In [83]:
weekdays[0:7:6]

['Monday', 'Sunday']

In [86]:
weekdays[-6:-8]

[]
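
The selections above all follow the pattern `list[start:stop:step]`, where `start` is included, `stop` is excluded, and any of the three parts can be left out. A sketch with a few more cases (the result names are made up for illustration):

```python
weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

first_three = weekdays[:3]     # start omitted: begin at index 0
weekend = weekdays[5:]         # stop omitted: run to the end of the list
every_other = weekdays[::2]    # step of 2 takes every second item
backwards = weekdays[::-1]     # a negative step walks from right to left
empty = weekdays[-6:-8]        # start is to the right of stop, so nothing is selected

print(first_three)
print(weekend)
print(every_other)
print(backwards)
print(empty)
```

The last case explains the empty list: with the default step of 1, Python only moves left to right, so a start position that lies after the stop position selects nothing.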

## Variables & objects

In Python it helps to distinguish between **objects** and **variables**:
- object = the actual data in memory
- variable = a label that refers to an object

Objects have an identity, a type and a value. The identity and type never change; the value can change for mutable objects such as lists.

In Python, variables **refer** to objects.  They are labels for objects - not the object themselves.
- one object can have many labels
- one label = only one object

Below we create two objects

In [88]:
first = [6,7,8]

In [89]:
second = [1,2,3]

We can use two different operators to compare these variables.

The `==` operator checks if the two objects have the same values:

In [90]:
first == second

False

In [91]:
first == first

True

The `is` operator checks whether both variables refer to the same object:

In [92]:
second is first

False

In [99]:
third = [6,7,8]

third is first

False

In [94]:
third == first

True

In [100]:
# third = first
# third is first
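
Assigning one variable to another creates a second label for the same object rather than a copy. The sketch below uses the hypothetical names `original`, `alias` and `copied` so that it does not disturb the variables used elsewhere in this notebook:

```python
original = [6, 7, 8]
alias = original               # a second label for the same object
copied = list(original)        # a new object with the same values

same_object = alias is original       # True: both labels refer to one object
copy_is_same = copied is original     # False: the copy is a different object

alias.append(9)                # a change made through one label...
print(original)                # ...is visible through the other
print(copied)                  # the copy is unaffected
print(same_object, copy_is_same)
```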

Under the hood Python is comparing the object's `id` - a unique value for each object:

In [97]:
id(first)

2455061193600

In [101]:
id(second)

2455061433408

In [102]:
id(third)

2455061193344

# Loops
In Python (and programming in general), we'll find many times when we want to do the same calculation over multiple values. This is what loops are for. They allow us to iterate through lists, or rows in a table of data.

### `for`

We can iterate through any object that has multiple smaller objects within it, like a list. 
A list contains multiple smaller elements within it. 

In [103]:
for m in weekdays:
    print(m)

Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday


In [105]:
for m in weekdays:
    print(m)
    print(m.replace('day','yad'))
    print('--')

Monday
Monyad
--
Tuesday
Tuesyad
--
Wednesday
Wednesyad
--
Thursday
Thursyad
--
Friday
Friyad
--
Saturday
Saturyad
--
Sunday
Sunyad
--


In [106]:
p = [1,5,4,3,2]

for i in p:
    new_number = i+5
    print('old number:', i)
    print('new number:', new_number)

old number: 1
new number: 6
old number: 5
new number: 10
old number: 4
new number: 9
old number: 3
new number: 8
old number: 2
new number: 7


Question: What will happen if we call `new_number` now?

In [107]:
new_number

7

How could we see all of the new_numbers?

In [112]:
p = [1,5,4,3,2]

new_list = []  # create an empty list to collect the results
for i in p[1:5]:  # note: the slice p[1:5] skips the first element of p
    new_number = i+5
    print('old number:', i)
    print('new number:', new_number)
    new_list.append(new_number)

old number: 5
new number: 10
old number: 4
new number: 9
old number: 3
new number: 8
old number: 2
new number: 7


In [113]:
new_list

[10, 9, 8, 7]

Python has a useful tool for producing increasing 'counting-style' sequences: `range`. You give it the value at which the sequence starts, the value at which it stops (the stop value itself is excluded), and the 'step' between values. The step is 1 by default; a step of 2 skips every other number.

In [114]:
range?

Init signature: range(self, /, *args, **kwargs)
Docstring:     
range(stop) -> range object
range(start, stop[, step]) -> range object

Return an object that produces a sequence of integers from start (inclusive)
to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
These are exactly the valid indices for a list of 4 elements.
When step is given, it specifies the increment (or decrement).
Type:           type
Subclasses:     


`range` does not display its values directly; convert it to a list to see them. It is still iterable, so it can be used in a loop without converting.

In [115]:
range(10)

range(0, 10)

In [117]:
list(range(2,10,2))

[2, 4, 6, 8]

In [119]:
for i in list(range(2,10,2)):
    print(i**i)

4
256
46656
16777216


In [121]:
for i in range(2,10,2):
    print(i**i)

4
256
46656
16777216


Setting the starting point with `range`:

In [122]:
range(3,10)

range(3, 10)

Setting the step size with `range`:

In [124]:
list(range(3,10,4))

[3, 7]
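
The docstring above mentions that the step can also be a decrement. A small sketch of that case, alongside the single-argument form (the variable names are made up for illustration):

```python
first_four = list(range(4))            # start defaults to 0, stop is excluded
countdown = list(range(10, 0, -2))     # a negative step counts downwards
nothing = list(range(3, 10, -1))       # counting down from 3 never reaches 10, so it is empty

print(first_four)
print(countdown)
print(nothing)
```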

## Strings are iterable

In [125]:
for i in 'python':
    print(i)

p
y
t
h
o
n


We can use the builtin `len` to measure the number of characters in a string:

(can also be used to measure length of a list, or any iterable)

In [126]:
len('python')

6

We can see some of the other functionality available on the `str` object using `dir`:

In [127]:
dir(str)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']


## Exercises

Write a program to print out:

```
*****
  *
  *
  *
  *
  *
  *
```

It might be useful to know you can do

`'*' * 2` gives `'**'`

`'*' + ' '` gives `'* '`


`'\n'` makes a line break in python

In [131]:
'*****'

'*****'
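
One possible solution, offered as a sketch rather than the only answer, builds the shape from the string operations listed above (the names `top`, `stem` and `shape` are made up for illustration):

```python
top = '*' * 5               # the bar across the top
stem = ' ' * 2 + '*'        # two spaces, then a star in the middle column
shape = top + ('\n' + stem) * 6   # six stem rows, each starting on a new line

print(shape)
```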

Write a program to print:
```
1
22
333
4444
55555
666666
7777777
88888888
999999999
```
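
One possible approach, again only a sketch, combines `range`, `str()` and string multiplication (the names `rows` and `row` are assumptions for illustration):

```python
rows = []
for n in range(1, 10):      # 1 up to and including 9
    row = str(n) * n        # the digit repeated n times
    rows.append(row)
    print(row)
```

Collecting the rows in a list is not required for printing; it is done here only so the result can be inspected afterwards.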

## Datatypes: Dictionary

Dictionaries are a handy way to keep track of data in key-value pairs. 
`{key1: value1, key2: value2}`
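
Before building a list of dictionaries, here is a minimal sketch of creating a single dictionary and looking values up by key. It borrows the `ages` example from the exercises at the end of this notebook; the other names are made up:

```python
ages = {'Joe': 50, 'Gretchen': 31}

joe_age = ages['Joe']             # look up a value by its key
has_susan = 'Susan' in ages       # `in` checks the keys of a dictionary
susan_age = ages.get('Susan')     # .get returns None instead of raising an error for a missing key

print(joe_age, has_susan, susan_age)
```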

Let's make a list of dictionaries:

In [144]:
data = [
    {'author': 'F. SCOTT FITZGERALD', 'text': 'Action is character'},
    {'author': 'RALPH WALDO EMERSON', 'text': 'Every man is my superior in some way. In that, I learn of him'},
    {'author': 'RALPH WALDO EMERSON', 'text': 'The purpose of life is not to be happy. It is to be useful, to be honorable, to be compassionate, to have it make some difference that you have lived and lived well'},
    {'author': 'Ralph Waldo Emerson', 'text': 'Every man alone is sincere.  At the entrance of a second person, hypocrisy begins'},
    {'author': 'Majjha Nikaya', 'text': 'This is, because that is.  This is not, because that is not.  This is like this, because this is like that'}
]

We can select the author of our first quote dictionary:

In [146]:
data[0]['author']

'F. SCOTT FITZGERALD'

We can iterate over the keys:

In [148]:
dict_1 = data[0]
for key in dict_1.keys():
    print(key)

author
text


And the same for the values:

In [150]:
for value in dict_1.values():
    print(value)

F. SCOTT FITZGERALD
Action is character


And both at the same time:

In [152]:
for key,value in dict_1.items():
    print(key, value)

author F. SCOTT FITZGERALD
text Action is character


We can then use the dictionaries to make a DataFrame (more on this in the pandas class :) )

In [153]:
import pandas as pd

In [156]:
pd.DataFrame(data)

|   | author | text |
|---|---|---|
| 0 | F. SCOTT FITZGERALD | Action is character |
| 1 | RALPH WALDO EMERSON | Every man is my superior in some way. In that,... |
| 2 | RALPH WALDO EMERSON | The purpose of life is not to be happy. It is ... |
| 3 | Ralph Waldo Emerson | Every man alone is sincere. At the entrance o... |
| 4 | Majjha Nikaya | This is, because that is. This is not, becaus... |


## Exercises:

Write a Python program to display the first and last colors from the following list.

color_list = ["Red","Green","White" ,"Black"]

Should display "Red Black"

In [136]:
color_list = ["Red","Green","White" ,"Black"]
print(f"{color_list[0]} {color_list[-1]}")

Red Black


Sum all the items in the following list:
x = [1, 2, 3, 7, 12]

(hint: no using built-in functions!)

In [137]:
x = [1, 2, 3, 7, 12]
sum_x = 0
for i in x:
    sum_x += i
    
sum_x

25

Get the maximum number from a list: y = [-10, 6, 8, 14]

(hint: no using built-in functions!)

In [139]:
y = [-10, 6, 8, 14]
max_y = y[0]

for i in y:
    if max_y < i:
        max_y = i
        
max_y

14

Check if a list is empty or not.

For list p = [ ], should print "True".

For list q = [1, 2, 3, 4] should print "False". 

In [140]:
def isempty(items):
    # `items` avoids shadowing the built-in name `list`
    if len(items) == 0:
        print(True)
    else:
        print(False)

In [143]:
isempty([])

True


In [142]:
isempty([1])

False


Add the {key: value} pair `{"Susan": 43}` to the dictionary `ages = {"Joe": 50, "Gretchen": 31}`

In [159]:
ages = {"Joe": 50, "Gretchen": 31}

In [160]:
ages['Susan'] = 43

In [161]:
ages

{'Joe': 50, 'Gretchen': 31, 'Susan': 43}

In [164]:
ages.update({'Anand': 34})

In [165]:
ages

{'Joe': 50, 'Gretchen': 31, 'Susan': 43, 'Anand': 34}