
# Python Intro Lecture

Most Python scripts start with their imports at the top. This tells readers of your code which dependencies (libraries) the script needs in order to run.

In [1]:
import pandas as pd
import math

# Python Basic Data Types

Let's start with defining an object:


In [2]:
number = 1

After running the code, we can call `number` throughout our code, and it will output the value we've stored in it:

In [3]:
number

1

We can also use a `print` statement to check what our object has:

In [4]:
print(number)

1


The most basic (let's call them atomic, for now) data types in Python are numbers: `int` (integers) and `float` (decimal numbers). We can conveniently check the data type of an object by calling `type` on it:

In [5]:
type(2)

int

In [6]:
type(2.5)

float

We can also call `type` on stored variables:

In [8]:
number_1 = 4.4
type(number_1)

float

Python follows the mathematical rules we already know, such as the order of operations:

In [9]:
(1/2)*(32+32)

32.0

In [10]:
1/2*32+32

48.0
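To make the precedence rules explicit, here is a small sketch comparing the same kind of expression with and without parentheses. The variable names are illustrative and not part of the lecture code:

```python
# Multiplication and division bind tighter than addition,
# and operators of equal precedence are evaluated left to right.
without_parentheses = 1 / 2 * 32 + 32      # (1 / 2) * 32 first, then + 32
with_parentheses = (1 / 2) * (32 + 32)     # both brackets are evaluated first

# Exponentiation (**) binds tighter than multiplication.
power_first = 2 * 3 ** 2                   # 2 * 9
product_first = (2 * 3) ** 2               # 6 ** 2

print(without_parentheses, with_parentheses, power_first, product_first)
```

When in doubt, adding parentheses costs nothing and makes the intended order obvious to the reader.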

We can also call some neat functions, using the `math` library:

In [11]:
math.sqrt(9)

3.0

... and as these functions return numbers, we can use them in other calculations!

In [12]:
math.sqrt(9)*20

60.0

Two other important data types are `str` (strings) and `bool`:

In [13]:
type('Hi class!')

str

In [14]:
type(True)

bool

Data types define the operations and the `methods` we can call on objects. For example, `replace` is a `str` method that we can use to replace any piece of text with another:

In [15]:
fruit = 'banana'
fruit.replace('a','u')

'bununu'

But calling `replace` on a number will not work!

In [16]:
number_fruits = 15
number_fruits.replace(1,2)

AttributeError: 'int' object has no attribute 'replace'

The `replace` method only works for objects of the `str` type. The same goes for other methods that only work with specific object types.

A method is called by writing a `.` after an object, followed by the method name and `()`. A `function` is called by writing the name of the function followed by `()`.
<br>
<br>
Although similar (both can take parameters and return values), functions are more general: they are not attached to one specific object type and can often be applied to several data types. For example, the `len` function can be applied to `str` and `list` (a data structure we will see in a minute):

In [17]:
len('banana')

6
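As a rough illustration of the difference, the sketch below calls the `len` function on two different data types and then calls a `str` method with the dot syntax. The `fruits` list and the use of `upper` are examples chosen here, not part of the original lecture:

```python
fruit = 'banana'
fruits = ['banana', 'apple', 'kiwi']

# len is a function: it accepts several different data types.
length_of_string = len(fruit)      # number of characters
length_of_list = len(fruits)       # number of elements

# upper is a method: it belongs to str objects and is called with a dot.
shouting_fruit = fruit.upper()

print(length_of_string, length_of_list, shouting_fruit)
```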

# Python Basic Data Structures

Some data structures in Python can hold other types of data. For example, a `list`, a very convenient object in Python, is able to hold `str`, `int`, `float`, `bool`, other lists, etc.

In [18]:
an_example_list = [1,2,'A','B']

We can access our lists by using indexes (calling `[]`):

In [19]:
an_example_list[0]

1

To make our life more difficult, `Python` indexes start on 0, while R starts on 1 😩

We can slice to retrieve multiple elements:

In [20]:
an_example_list[1:3]

[2, 'A']

Slices work as follows: `object[i:j]` means that we are indexing `object` from element `i` up to element `j-1` (the element at position `j` is not included).
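A few more slicing patterns, as a minimal sketch. The `letters` list is made up for illustration:

```python
letters = ['a', 'b', 'c', 'd', 'e']

middle = letters[1:3]      # elements at index 1 and 2 (index 3 is excluded)
first_two = letters[:2]    # an omitted start means "from the beginning"
from_third = letters[2:]   # an omitted end means "until the end"
last_one = letters[-1]     # negative indexes count from the end

print(middle, first_two, from_third, last_one)
```

Because the end index is excluded, `letters[:2]` and `letters[2:]` split the list into two parts with no overlap.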

`lists` are mutable, meaning that we can change them in-place - for example, I can change the first element by indexing it and assigning it to something new:

In [21]:
an_example_list[0] = 'New element!'

In [22]:
an_example_list

['New element!', 2, 'A', 'B']

On the other hand, strings (`str`) are not mutable:

In [23]:
my_text = 'Europe'
my_text[5] = 'a'

TypeError: 'str' object does not support item assignment
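Since a string cannot be changed in place, the usual workaround is to build a new string. A minimal sketch, where `new_text` and `replaced_text` are names introduced here for illustration:

```python
my_text = 'Europe'

# Slicing and concatenation create a new string; my_text itself is untouched.
new_text = my_text[:5] + 'a'

# replace also returns a new string instead of modifying the original.
replaced_text = my_text.replace('e', 'a')

print(my_text, new_text, replaced_text)
```

Note that `replace` is case sensitive, so the capital `E` at the start is left alone.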

Also, lists preserve the data types that our underlying objects have:

In [24]:
an_example_list

['New element!', 2, 'A', 'B']

In [25]:
type(an_example_list[1])

int

In [26]:
type(an_example_list[2])

str

We can also add an element to the end of a list with the `append` method:

In [31]:
an_example_list.append('C')
an_example_list

['New element!', 2, 'A', 'B', 'C']

Another important data structure is the dictionary (`dict`), which stores data as `key-value` pairs:

In [32]:
languages = {
    'SQL': 1,
    'Python': 2,
    'R': 3,
    'Java': 4,
    'Javascript': 5,
    'Julia': 6
}

Notice that I'm using this "vertical" format to add new key-value pairs. This is not mandatory but is generally considered a best practice if your line of code goes over 79 characters (the limit recommended by PEP 8, Python's style guide).
<br>
<br>
Google Colab even scolds us if we go over that mark, by putting a ruler on the editor:

In [33]:
languages = {'SQL': 1, 'Python': 2, 'R': 3, 'Java':4, 'Javascript': 5, 'Julia': 6}

We can access dictionaries by their key:

In [34]:
languages['SQL']

1

In [35]:
languages['Python']

2

Three important dictionary methods are `items`, `keys` and `values`:

In [36]:
languages.items()

dict_items([('SQL', 1), ('Python', 2), ('R', 3), ('Java', 4), ('Javascript', 5), ('Julia', 6)])

In [37]:
languages.keys()

dict_keys(['SQL', 'Python', 'R', 'Java', 'Javascript', 'Julia'])

In [38]:
languages.values()

dict_values([1, 2, 3, 4, 5, 6])
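The objects returned by these methods are views of the dictionary. A short sketch of how they are typically used follows; the shortened `languages` dictionary and the `'Go'` lookup are illustrative assumptions:

```python
languages = {'SQL': 1, 'Python': 2, 'R': 3}

# Views can be converted into plain lists when needed.
language_names = list(languages.keys())
language_ranks = list(languages.values())
pairs = list(languages.items())          # list of (key, value) tuples

# The "in" operator checks membership against the keys.
has_python = 'Python' in languages

# get returns a default value instead of raising KeyError for a missing key.
rank_of_go = languages.get('Go', 0)

print(language_names, language_ranks, has_python, rank_of_go)
```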

Another important data structure is the `set`, which holds only distinct values (duplicates are dropped):

In [39]:
set([1,1,1,1,2])

{1, 2}
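A typical use of a `set` is removing duplicates and testing membership. The sketch below uses a made-up `visited` list:

```python
visited = ['Lisbon', 'Paris', 'Lisbon', 'Rome', 'Paris']

unique_cities = set(visited)             # duplicates are dropped
number_of_unique = len(unique_cities)
has_rome = 'Rome' in unique_cities       # membership test

# Sets are unordered, so sort them when a stable order is needed.
sorted_cities = sorted(unique_cities)

print(number_of_unique, has_rome, sorted_cities)
```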

# Python Control Flow

Control flow will be important to understand some of the functions we will use throughout the course. We'll mostly use `for`, `while` and `if` statements.

`for` enables us to iterate through a specific object:

In [40]:
for letter in 'Europe':
  print(letter)

E
u
r
o
p
e


In [41]:
list_integers = [2, 4, 6, 10]

for number in list_integers:
  print(number**2)

4
16
36
100


`enumerate` is also cool because it enables us to iterate through indexes and elements:

In [42]:
for index, number in enumerate(list_integers):
  print(index, number)

0 2
1 4
2 6
3 10
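By default `enumerate` counts from 0, but it accepts an optional `start` argument. A minimal sketch, where the `pairs` list is introduced only to collect the results:

```python
list_integers = [2, 4, 6, 10]

# start=1 shifts the counter so that positions begin at 1 instead of 0.
pairs = []
for position, number in enumerate(list_integers, start=1):
    pairs.append((position, number))

print(pairs)
```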


`if` is a statement to create conditional situations in our code:

In [43]:
var = 12

if var == 12:
  print("It's true!")
else:
  print("It's not true!")

It's true!


In [44]:
var = 15

if var == 12:
  print("It's true!")
else:
  print("It's not true!")

It's not true!


We can also use `elif` to create other conditions:

In [45]:
var = 15

if var < 12:
  print("var is less than 12.")
elif var < 15:
  print("var is less than 15.")
else:
  print("var is greater than or equal to 15!")

var is greater than or equal to 15!


`while` loops keep running as long as their condition is true and stop once it becomes false. Watch out, as this makes them really prone to infinite loops!

In [47]:
n = 1
while n <= 10:
  print('Number is {}'.format(n))
  n = n+1

  # n = n+1 or n+=1 are exactly the same code!

Number is 1
Number is 2
Number is 3
Number is 4
Number is 5
Number is 6
Number is 7
Number is 8
Number is 9
Number is 10
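One possible way to protect a `while` loop against running forever is to add a safety counter and `break` out of the loop. This is only a sketch: `max_iterations` is a hypothetical limit chosen here, not something from the lecture code:

```python
n = 1
iterations = 0
max_iterations = 100   # hypothetical safety limit

while n <= 10:
    n += 1             # without this update, n <= 10 would stay True forever
    iterations += 1
    if iterations >= max_iterations:
        break          # leave the loop even if the condition never turns False

print(n, iterations)
```

Here the loop ends normally after 10 iterations, so the `break` is never reached. It only matters if the update of `n` is forgotten or wrong.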


In Python, indentation controls the code! An indented block belongs to the statement (`for`, `if`, `while`, ...) that sits one level above it.
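The sketch below shows how the indentation level decides which statement each line belongs to. The `results` list is an illustrative name:

```python
results = []

for number in [1, 2, 3, 4]:
    # Indented once: runs for every element of the list.
    if number % 2 == 0:
        # Indented twice: runs only when the if condition is True.
        results.append('even')
    else:
        results.append('odd')

# Not indented: runs once, after the loop has finished.
print(results)
```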

# Pandas

One of the most important libraries for machine learning and data analytics or science is `pandas`.

Let's create our first `pandas` DataFrame. *Note: Keep in mind that this is just one of many ways to create pandas DataFrames!*

In [50]:
pd.DataFrame([0,1,2], columns=['col'], index=['A','B','C'])

   col
A    0
B    1
C    2
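As an example of another way to build the same DataFrame, we can pass a dictionary that maps each column name to its values. This is a sketch; the variable names are chosen here for illustration:

```python
import pandas as pd

# Each dictionary key becomes a column name, each list becomes that column's values.
from_dict = pd.DataFrame({'col': [0, 1, 2]}, index=['A', 'B', 'C'])

# The same data built from a plain list, as in the cell above.
from_list = pd.DataFrame([0, 1, 2], columns=['col'], index=['A', 'B', 'C'])

# equals checks that both DataFrames hold the same labels and values.
print(from_dict.equals(from_list))
```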


We can pass more data by passing a list of lists:

In [51]:
pd.DataFrame([[0,'A',2],[1,'B',3]], columns=['col_1', 'col_2','col_3'], index=['A','B'])

   col_1 col_2  col_3
A      0     A      2
B      1     B      3


In [53]:
dataframe_example = pd.DataFrame(
    [[0,'A',2],[1,'B',3]],
    columns=['col_1', 'col_2','col_3'],
    index=['A','B']
)

Notice that the two previous code blocks are exactly the same, except for the assignment to `dataframe_example`. The latter `pd.DataFrame` call produces exactly the same result as the former, with the difference that we are stacking the arguments vertically to avoid going over 79 characters.

Here, we can play around with our DataFrame by using indexes. Here are 4 examples of indexing with the `loc` (label-based) and `iloc` (position-based) syntax:

In [54]:
dataframe_example.loc[:,'col_1']

A    0
B    1
Name: col_1, dtype: int64

In [55]:
dataframe_example.iloc[:,0]

A    0
B    1
Name: col_1, dtype: int64

In [56]:
dataframe_example.loc['A',:]

col_1    0
col_2    A
col_3    2
Name: A, dtype: object

In [57]:
dataframe_example.iloc[0,:]

col_1    0
col_2    A
col_3    2
Name: A, dtype: object
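The same syntax can also pick out a single cell or several columns at once. A minimal sketch that rebuilds `dataframe_example` so the block runs on its own (the other variable names are illustrative):

```python
import pandas as pd

dataframe_example = pd.DataFrame(
    [[0, 'A', 2], [1, 'B', 3]],
    columns=['col_1', 'col_2', 'col_3'],
    index=['A', 'B']
)

# loc selects by label, iloc selects by integer position.
value_by_label = dataframe_example.loc['B', 'col_3']
value_by_position = dataframe_example.iloc[1, 2]

# A list of labels returns a smaller DataFrame instead of a Series.
two_columns = dataframe_example.loc[:, ['col_1', 'col_3']]

print(value_by_label, value_by_position)
print(two_columns.shape)
```

Both lookups point at the same cell: row `B` is at position 1 and `col_3` is at position 2.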