# Week 1: Python Basics


This week's material is an introduction to working with data in Python. I will introduce enough of the basic concepts to allow us eventually to manipulate data and run some Machine Learning algorithms.
You do not need to be proficient in Python to be productive in data analysis. Due to time constraints, there are introductory topics that we will not cover in this course, such as 'classes' and 'object-oriented programming', which you may find useful. However, as we go I will highlight those topics and suggest study material if you wish to pursue more in-depth knowledge.


Python is an interpreted language. The Python interpreter reads and executes each statement one at a time. For example, we start with the following statement, which defines a variable __'x'__ and assigns the value 5 to it.

In [2]:
x=5

We retrieve the value of x by calling it:

In [3]:
x

5

We can also retrieve the value of x by calling the 'print' function as such:

In [4]:
print(x)

5


The print() function can also be used to display the text 'Hello World!'

In [5]:
print('Hello World!')

Hello World!


### Python Programming Concepts and Mechanics

#### Everything is an Object

Python consistently follows an ___object model___. Every number, string, data structure, function and so on could be referred to as a ___Python object___. An _object_ is characterised by its ___type___ (e.g. string, float or function), ___internal data, attributes and methods___.

![OOP](OOP.jpg)

In __Object-Oriented Programming__, the fundamental building blocks are objects.
It differs from __Procedural programming__, where sequential steps are executed.
An object is an entity that stores information.
A __class__ describes an object's type. 
It defines:
 - What data is stored in the object, known as __attributes__.
 - What actions the object can do, known as __methods__.

An attribute is a variable that belongs to an instance of a class.
A method is a function that belongs to an instance of a class.
Attributes and methods are accessed using dot notation. Attributes do not use parentheses, whereas methods do.

Object.method()
variable = Object.attribute

An instance is a specific case of a class. For instance, in the code x = 3, x is an instance of the type int.

A class definition is code that defines how a class behaves, including all methods and attributes.

Instance methods must take the object instance as their first parameter, which by convention is named self.
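The vocabulary above (class, instance, attribute, method, self) can be illustrated with a minimal sketch. The class `Counter` and its names are hypothetical and exist only for this illustration; they are not part of the course material.

```python
class Counter:
    """A tiny class used only for illustration (hypothetical, not course code)."""

    def __init__(self, start=0):
        # 'count' is an attribute: data stored on the instance
        self.count = start

    def increment(self):
        # 'increment' is a method: a function that acts on the instance (self)
        self.count += 1
        return self.count


c1 = Counter()            # c1 is an instance of the class Counter
c1.increment()            # method call: parentheses are required
c1.increment()
current = c1.count        # attribute access: no parentheses
print(current)            # prints 2
print(type(c1).__name__)  # prints Counter
```

Note that `self` is never passed explicitly in the call `c1.increment()`; Python supplies the instance automatically.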


For example, we define a _string_ s as "Hello World!". The type of this object is a string. We can retrieve that information by applying the function __type()__ to __s__, which returns 'str'.

In [2]:
s = "Hello World!"
type(s)

str

An example of a method that applies to __strings__ is ___lower()___, which returns a copy of the string with every letter in lower case.

In [12]:
s.lower()


'hello world!'
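One detail worth checking for yourself is that string methods such as lower() return a new string and leave the original untouched. A short sketch (the variable names `lowered` and `shouted` are just illustrative):

```python
s = "Hello World!"

lowered = s.lower()   # new string, all lower case
shouted = s.upper()   # new string, all upper case

print(lowered)  # hello world!
print(shouted)  # HELLO WORLD!
print(s)        # Hello World!  (the original string is unchanged)
```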

#### Comments

Any text that comes after the hash mark or pound sign __#__ is ignored by Python. This feature can be used to add comments to your code or to exclude some part of your code without deleting it.

In [8]:
#The author of this code is K. Smith
#a=3; b=5; c=10
print("Reached this line")#Simple status report

Reached this line


#### Functions vs. object method calls

A function is called using parentheses and by passing zero or more arguments (i.e. variables):
```
result = f(x, y, z)
g()
```

Almost every object in Python comes with associated functions, known as methods, that have access to the object's internal data. Methods are called using the following syntax:

```
obj.some_method(x, y, z)
```
This is the same syntax we used earlier when we called the method lower() on the string __s__.


Functions can take both _positional_ and _keyword_ arguments:
```python
result = f(a, b, c, d=5, e='foo')
```

More on this later.
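As a small preview, here is a sketch of a function that mixes positional and keyword arguments. The function `describe` and its parameters are hypothetical and chosen only for illustration.

```python
def describe(name, value, unit='kg', decimals=1):
    """Hypothetical helper: build a short description of a measurement."""
    return '{0}: {1} {2}'.format(name, round(value, decimals), unit)


# Positional arguments only; unit and decimals fall back to their defaults
r1 = describe('weight', 72.456)

# Keyword arguments override the defaults and may be given in any order
r2 = describe('height', 1.8349, unit='m', decimals=2)
r3 = describe('height', 1.8349, decimals=2, unit='m')

print(r1)  # weight: 72.5 kg
print(r2)  # height: 1.83 m
print(r2 == r3)  # True
```

Positional arguments must come first and are matched by order; keyword arguments are matched by name.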

#### Attributes and methods

Objects in Python typically have both attributes (other Python objects stored 'inside' the object) and methods (functions associated with an object that can have access to the object's internal data). Both of them are accessed via the syntax: object.attribute_name

```python
In [1]: a = 'foo'

In [2]: a.<Press Tab>
a.capitalize  a.format      a.isupper     a.rindex      a.strip
a.center      a.index       a.join        a.rjust       a.swapcase
a.count       a.isalnum     a.ljust       a.rpartition  a.title
a.decode      a.isalpha     a.lower       a.rsplit      a.translate
a.encode      a.isdigit     a.lstrip      a.rstrip      a.upper
a.endswith    a.islower     a.partition   a.split       a.zfill
a.expandtabs  a.isspace     a.replace     a.splitlines
a.find        a.istitle     a.rfind       a.startswith
```

#### Tab completion

While entering an expression in the shell, pressing the Tab key will search the namespace for any variable (objects, functions, etc.) matching the characters you have typed so far.

In [9]:
a = 'foo'


#### Binary operators and comparison

The standard binary math operations and comparisons behave as expected:

In [10]:
12-8

4

In [11]:
5<=12

True

In [12]:
y = x

In [13]:
x is y

True

In [14]:
x is not y

False

In [15]:
x==y

True

In [16]:
a = None

In [17]:
a is None

True
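The difference between `is` (same object) and `==` (equal value) is easier to see with a container than with small integers. The sketch below uses lists, which are introduced later in these notes; the variable names are illustrative.

```python
list_a = [1, 2, 3]
list_b = list_a        # a second name for the same object
list_c = [1, 2, 3]     # a separate object with equal contents

print(list_a is list_b)  # True:  both names refer to one object
print(list_a == list_c)  # True:  the contents are equal
print(list_a is list_c)  # False: they are two distinct objects
```

In practice `is` is mostly used for checks against `None`, as in the cell above.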

In [18]:
a = 10
b = 6

Add a to b

In [19]:
a + b

16

Subtract b from a

In [20]:
a - b

4

Multiply a by b

In [21]:
a*b

60

Divide a by b

In [22]:
a/b

1.6666666666666667

Floor-divide a by b dropping any fractional remainder

In [23]:
a//b

1

Raise a to the power b

In [24]:
a**b

1000000

In [108]:
a%b # % is the modulo operator

4

In [25]:
c = True
d = False

In [26]:
c & d # True if both are True, False otherwise. 

False

In [27]:
c | d # True if either c or d is True.

True

In [28]:
a != b # True if a is not equal to b

True

##### Scalar Type

Along with the standard library of functions, of which we have seen a couple (e.g. type(), print() and the operators), Python has a small set of built-in _types_ for handling numerical data, strings and Booleans. These single-value _types_ are sometimes called ___scalar types___.

For numbers, the main __types__ we use are integers and floating-point numbers, denoted in Python as _int_ and _float_.

In [29]:
iv = 178994 # example of an int

In [30]:
iv**2

32038852036

In [31]:
fv1 = 8.33
fv2 = 7.35e-5
# examples of float types

In [32]:
fv2

7.35e-05

In [33]:
print(type(fv2))
print(type(iv))

<class 'float'>
<class 'int'>


#### Strings 

Python is renowned for being flexible and powerful in processing text, known as the string type in programming. You can write string values using either single or double quotes.

In [34]:
a  = 'easy way to store a string'

In [35]:
b = "The type of the variable b is "

In [36]:
print(b, type(b))

The type of the variable b is  <class 'str'>


For multiline strings with line breaks, you can use triple quotes, either ''' or """

In [37]:
c = """
This is a long string
spanning over few
lines

"""

In [38]:
c

'\nThis is a long string\nspanning over few\nlines\n\n'

In [39]:
print(c)


This is a long string
spanning over few
lines




In [40]:
c.count('\n')

5

The backslash character within a string is an _escape character_, meaning that it is used to specify special characters like the newline '\n'. To write a string literal containing a backslash, you need to type the backslash twice:

In [41]:
stri_1 = '12\\44'

In [42]:
print (stri_1)

12\44
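As an aside, Python also offers raw string literals (prefix `r`), in which backslashes are not treated as escape characters. A short sketch comparing the two spellings:

```python
escaped = '12\\44'   # backslash typed twice
raw = r'12\44'       # raw string: the backslash is taken literally

print(escaped == raw)  # True
print(len(raw))        # 5  (the characters 1, 2, \, 4, 4)
```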


You can concatenate strings by adding them using the __+__ operator.

In [43]:
s_1 = "Hi there! "
s_2 = "How are you?"

s_3 = s_1 + s_2

In [44]:
print (s_3)

Hi there! How are you?


String templating and formatting is an important and useful topic. I will briefly introduce it here, but we will revisit it on a few different occasions. String objects have a useful method called ___.format()___ that can be used to create string templates. For example:

In [45]:
template = '{0:.2f} {1:s} are equivalent to US${2:d}'

 - {0:.2f} means to format the first argument as a floating-point number with two decimal places.
 - {1:s} means to format the second argument as a string.
 - {2:d} means to format the third and last argument as an exact integer.

In [46]:
output = template.format(108.56, "Japanese yen", 1)
print(output)

108.56 Japanese yen are equivalent to US$1
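Because the template is an ordinary string, it can be reused with different arguments. The sketch below also shows what happens when an argument does not match its format code. The second set of values is made up for illustration and is not a real exchange rate.

```python
template = '{0:.2f} {1:s} are equivalent to US${2:d}'

line_1 = template.format(108.56, 'Japanese yen', 1)
line_2 = template.format(0.9, 'euros', 1)   # illustrative values only

print(line_1)  # 108.56 Japanese yen are equivalent to US$1
print(line_2)  # 0.90 euros are equivalent to US$1

# The 'd' format code only accepts integers; a float raises a ValueError
try:
    template.format(108.56, 'Japanese yen', 1.0)
    accepted = True
except ValueError:
    accepted = False

print(accepted)  # False
```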


### Lists

Besides the scalar data types we have seen so far, one very important data type in Python is the ___list___. It is a collection data type: a list is ordered and changeable, and it allows duplicate members. It is a container of other objects and can hold different types within it. List literals are written within square brackets [ ]. For example:

In [47]:
l_1 = [2,3,4,0.5]

In [48]:
print(l_1)

[2, 3, 4, 0.5]


Lists allow for _indexing_ and _slicing_. 
#### Indexing



In [49]:
l_1[0] #indexing starts with zero

2

#### Slicing

In [50]:
l_1[1:3]

[3, 4]

In [52]:
l_1[1:]

[3, 4, 0.5]

In [53]:
l_1[:-1]

[2, 3, 4]

In [54]:
l_1[1:-2]


[3]

In [103]:
len(l_1) # returns the length of the list i.e the number of elements inside.

4
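The properties listed at the start of this section (ordered, changeable, duplicates allowed, mixed types) can be checked with a short sketch. The list `mixed` is a made-up example.

```python
mixed = [2, 'two', 2.0, 2]   # different types and a duplicate value

mixed[1] = 'deux'            # lists are changeable: replace an element in place
mixed.append(True)           # append adds an element at the end

print(mixed)       # [2, 'deux', 2.0, 2, True]
print(len(mixed))  # 5
print(mixed[-1])   # True  (negative indices count from the end)
```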

### Type casting 

In computer science, the act of changing the type of a value or variable is usually referred to as type casting. With the few scalar and list types we have seen so far, we can do the following:

In [55]:
s = '3.908'

In [56]:
fval = float(s)

In [57]:
type(fval)

float

In [58]:
int(fval)

3

In [59]:
bool(0)

False

In [63]:
bool(fval)

True

In [66]:
s_2 = str(4.25)
print(s_2)

4.25


In [67]:
list(s_2)

['4', '.', '2', '5']
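A few edge cases of these conversions are worth knowing. The following sketch shows that int() truncates toward zero, that any non-empty string is truthy, and that a string holding a decimal number cannot be passed to int() directly.

```python
print(int(3.908))    # 3
print(int(-3.908))   # -3  (truncation toward zero, not rounding down)

print(bool(''))      # False: the empty string is falsy
print(bool('0'))     # True:  any non-empty string is truthy

# int() cannot parse a string containing a decimal point
try:
    int('3.908')
    parsed = True
except ValueError:
    parsed = False
print(parsed)        # False

# Going through float first works
print(int(float('3.908')))  # 3
```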

## Control flow

Python has a few built-in keywords for conditional logic and loops.

##### if, elif and else

The if statement is one of the most well-known control flow statement types. It checks a condition and, if the condition is True, evaluates the code in the block that follows.

In [68]:
if x<0:
    print ("it's negative")

#### Indentation

Python uses whitespace (tabs or spaces) to structure code instead of braces as in many other languages. In the example above, the block that Python evaluates if the condition is true is indented by four spaces. In the following example we add another statement:

In [69]:
if x<0:
    print ("it's negative")
print("it's printing")

it's printing


In [70]:
if x<0:
    print ("it's negative")
    print("it's printing")

In [72]:
if x<0:
    print("It's negative")
elif x == 0:
    print('equal zero')
elif 0<x<=5:
    print('positive but smaller or equal to 5')
else:
    print('positive and larger than 5')
    

positive but smaller or equal to 5


Once a condition evaluates to __True__, no further elif or else block is reached. With a compound condition using __and__ or __or__, conditions are evaluated left to right and will short-circuit:

In [73]:
a = 5; b = 7

In [74]:
c = 8; d = 4

In [75]:
if a < b  or c > d:
    print('Yes')

Yes


In this example, the comparison c > d never gets evaluated because the first comparison is __True__.
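Short-circuiting can be made visible by recording which conditions actually get evaluated. This is a minimal sketch; the helper `check` and the list `calls` are hypothetical names used only for this demonstration.

```python
calls = []

def check(label, result):
    """Record that this condition was evaluated, then return its result."""
    calls.append(label)
    return result

# 'or' stops as soon as one operand is True
if check('first', True) or check('second', True):
    print('Yes')
or_calls = list(calls)
print(or_calls)   # ['first']

# 'and' stops as soon as one operand is False
calls.clear()
if check('first', False) and check('second', True):
    print('not reached')
and_calls = list(calls)
print(and_calls)  # ['first']
```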

#### For loops

For loops are for iterating over a collection (like a list) or an __iterator__. The standard syntax for a __for loop__ is:

```python 
   
for value in collection:
    do something with the value

```
Simple example:

In [None]:
for v in l_1:
    print(v)


You can advance a for loop to the next iteration, skipping the remainder of the block, using the __continue__ keyword. Consider this code, which sums up the integers in a list and skips __None__ values.

In [84]:
sequence = [1,2,3,None,4,None]

In [85]:
total = 0

In [88]:
for value in sequence:
    if value is None:
        continue
    total += value # this is equivalent to writing total = total + value

In [89]:
total

10

Alternatively, a for loop can be exited when a certain condition is met by using the keyword __break__:

In [90]:
sequence = [1,2,0,4,6,5,2,1]

In [91]:
total_until_5 = 0

In [94]:
for value in sequence:
    if value ==5:
        break
    total_until_5 += value
    

In [95]:
total_until_5

13

#### While loop

A while loop specifies a condition and a block of code that is executed until the condition evaluates to __False__ or the loop is explicitly ended with __break__.

In [97]:
x = 256
total = 0
while x >0:
    if total > 500:
        break
    total += x
    x = x//2
total

504

#### Range function

The __range__ function returns a range object, an iterable that yields a sequence of evenly spaced integers:

In [98]:
range(10)

range(0, 10)

In [99]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [100]:
type(range(10))

range

In [101]:
list(range(0,20,2))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [102]:
list(range(10,0,-2))

[10, 8, 6, 4, 2]
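A range object does not store all of its values in memory, yet it still supports several operations that lists do. A brief sketch (the name `evens` is illustrative):

```python
evens = range(0, 20, 2)

print(len(evens))   # 10
print(evens[3])     # 6
print(8 in evens)   # True
print(7 in evens)   # False

# A range can be iterated over more than once
first_pass = list(evens)
second_pass = list(evens)
print(first_pass == second_pass)  # True
```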

In [104]:
seq=[1,2,3,4]

In [105]:
for i in range(len(seq)):
    val = seq[i]

In [106]:
val

4

In [113]:
sum_1 = 0
for i in range(100000):
    if i % 3 == 0 or i % 5 ==0:
        sum_1 += i

In [114]:
sum_1

2333316668
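The same total can be computed more compactly with the built-in sum() function and a generator expression, a construct not covered in these notes so far. This is offered only as an alternative sketch; the small case below 10 is easy to verify by hand (0 + 3 + 5 + 6 + 9 = 23).

```python
# Sum of all integers below 10 that are multiples of 3 or 5
small = sum(i for i in range(10) if i % 3 == 0 or i % 5 == 0)
print(small)   # 23

# Same computation as the loop above, for integers below 100000
sum_2 = sum(i for i in range(100000) if i % 3 == 0 or i % 5 == 0)
print(sum_2)   # 2333316668
```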

### Tricks and Magic

#### Introspection
Using the question mark (?) before or after a variable or a function will display some general information about the object:

In [115]:
l_2 = [2,4,5,8]

In [116]:
l_2?

In [121]:
print?

##### The %run command

To illustrate how this command works, we will write a short script in __Spyder__. You can copy and paste the simple code below into an empty file in __Spyder__.

In [124]:
seq = [2,4,6,8]
for i in seq:
    print (i/2)

1.0
2.0
3.0
4.0


In [125]:
%pwd # this command returns the current working directory, where you can save the Spyder script

'/Users/ramzisaouma/ESSEC'

In [126]:
#%run xxx.py

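A magic command is any command prefixed by the percent symbol ___%___. If you want to time a statement, use the %timeit magic command. Note that %timeit times only the statement written on its own line: in the example below, only the list assignment is timed, and the loop then runs once as ordinary code. To time a whole cell, put %%timeit on the first line of the cell instead.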

In [128]:

%timeit seq = [2,4,6,8]
for i in seq:
    print (i/2)

79.8 ns ± 3.28 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
1.0
2.0
3.0
4.0
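Magic commands only work inside IPython or Jupyter. In a plain Python script, the standard library module `timeit` offers similar functionality. The sketch below is one possible way to time the whole loop; the function name `halve_all` and the choice of 10000 repetitions are assumptions made for illustration.

```python
import timeit

def halve_all():
    """Halve every element of a small list and return the results."""
    result = []
    for i in [2, 4, 6, 8]:
        result.append(i / 2)
    return result

# Total time in seconds for 10000 calls of halve_all
elapsed = timeit.timeit(halve_all, number=10000)

print(halve_all())  # [1.0, 2.0, 3.0, 4.0]
print('{0:.6f} seconds for 10000 calls'.format(elapsed))
```

The measured time depends on your machine, so no expected value is shown for it.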
