## The Python programming language

Python is a popular general-purpose language. Its design emphasizes code readability through the use of significant indentation.

According to many online rankings, such as the [TIOBE index](https://www.tiobe.com/tiobe-index/), Python is one of the most widely used programming languages today.

It was created by Guido van Rossum and first released in 1991. Thanks to its popularity, there are [many resources](https://programminglanguages.info/language/python/) for learning it, as well as packages and tools for developers.

## Basics of programming in Python

This notebook will cover some basic elements of the Python programming language:

* Indentation
* Variables
* If/else statements
* For loops
* Lists
* Loading and parsing CSV files

### Use of indentation in Python

Indentation in Python is used to create a group of statements that are executed as a block. Many popular languages, such as C and Java, use braces ({ }) to define a block of code, whereas Python uses indentation.

Python code can be indented with any number of spaces, as long as we are consistent (use the same number throughout). However, __using 4 spaces for each level of indentation is the best practice__.

Some indentation rules in Python:

* The first line of Python code can’t have an indentation.
* Avoid mixing tabs and spaces for indentation.
* Spaces are preferred over the tab character.
* The best practice is to use 4 spaces for the first level of indentation and add 4 more spaces for each additional level.

In [1]:
sum_even = 0
sum_odd = 0
for i in range(1, 101):
    # 4 whitespaces of indentation for the code under the for loop
    if i % 2 == 0:
        # 4 more whitespaces for the code under the if/else
        sum_even = sum_even + i 
    else: 
        sum_odd = sum_odd + i
print("The sum of all even numbers between 1 and 100 is ", sum_even) 
print("The sum of all odd numbers between 1 and 100 is ", sum_odd) 

The sum of all even numbers between 1 and 100 is  2550
The sum of all odd numbers between 1 and 100 is  2500
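
To see why consistency matters, the following sketch checks a deliberately mis-indented snippet. The snippet is kept inside a string (the name `bad_code` is only illustrative) and passed to the built-in `compile()` function, so the error can be caught instead of stopping the notebook:

```python
# A deliberately mis-indented snippet, stored as text so this cell itself still runs.
bad_code = """
for i in range(3):
    print(i)
      print("too deep")
"""

try:
    # compile() only checks the syntax of the text; it does not run it.
    compile(bad_code, "<example>", "exec")
    result = "compiled without errors"
except IndentationError as error:
    # Python rejects the extra spaces in front of the second print.
    result = "IndentationError: " + error.msg

print(result)
```

Python reports the problem before running any of the code, which is why inconsistent indentation has to be fixed first.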


__Note:__ Comments in Python start with a hash symbol, #, and extend to the end of the line. Python has no dedicated syntax for multi-line comments, but a triple-quoted string that is not assigned to anything is commonly used for that purpose:

In [2]:
# Single line comment

'''
Multiple line comment
Second line of comment
And another
Below we close the multiple line comment
'''

# Below is a regular line of code
print('Hello')

Hello


### Variables in Python

Variables are containers for storing data values. 

Python has no command for declaring a variable; a new variable is created the moment you first assign a value to it.

Also, variables do not need to be declared with any particular type, and can even change type after they have been set.

In [3]:
# Demonstrating variables in Python

# We create a numeric variable called number:
number = 0
print(number)
number = number + 3
print(number)

# We create another numeric variable called another_number:
another_number = 10
number = number + another_number
print(number)

# We create two variables containing strings of text (This type of data is called string for short)
some_text = 'Hello'
more_text = "world!"

# Note how strings can be delimited with single quotes (') or double quotes ("). Just use the same kind of quote to open and close a string

print(some_text, more_text)

# We can change the type of a variable by assigning a new value:
number = some_text

# Now the number variable contains a string:
print(number)

0
3
13
Hello world!
Hello
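
As a small complement, the built-in `type()` function can be used to confirm that a variable's type follows whatever value it currently holds. The variable names below are illustrative and not part of the cell above:

```python
value = 10
first_type = type(value).__name__   # name of the type while value holds a number
value = "ten"
second_type = type(value).__name__  # name of the type after assigning a string

print(first_type)   # int
print(second_type)  # str
```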


### If/else statements

The if statement allows us to do something when a condition is true. We can optionally use else to handle the situation when the condition is false.

In [4]:
if another_number % 2 == 0:
    print(another_number, 'is even')
else:
    print(another_number, 'is odd')

if number == 'Hello':    
    print('Good bye!')
    # We don't need an else for every if: we can leave it out when nothing has to happen if the condition is false

10 is even
Good bye!
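
The condition after `if` is just an expression that evaluates to `True` or `False`. A minimal sketch, using a hypothetical variable `candidate`, stores the result of the condition first and then branches on it:

```python
candidate = 7

# The comparison produces a boolean value that we can store and print.
is_even = candidate % 2 == 0
print(is_even)  # False

if is_even:
    message = "even"
else:
    message = "odd"

print(candidate, "is", message)  # 7 is odd
```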


### For loops

The for loop allows us to repeat a block of instructions a (pre-specified) number of times.

In [5]:
# We can replace the following 5 print instructions...
print(1)
print(2)
print(3)
print(4)
print(5)

# ...with the following for loop:
for i in range(1, 6):
    print(i)

1
2
3
4
5
1
2
3
4
5
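
Note that `range(1, 6)` stops before 6. The sketch below makes this visible by turning the range into a list, and then uses the same loop to add the numbers up (the names `k` and `total` are illustrative):

```python
# range(start, stop) includes start but excludes stop.
print(list(range(1, 6)))  # [1, 2, 3, 4, 5]

total = 0
for k in range(1, 6):
    total = total + k

print(total)  # 15
```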


### Lists

Lists are used in Python to store multiple values in a single variable. We can access the individual values in a list by referring to an index number.

__Note:__ Indexes in Python lists start at 0. Be careful with this, since other languages, such as R, start counting list positions at 1.

In [6]:
# We can create a list by assigning all its values in one go.
numbers = [3, 7, 1, 50, 1, 12]

# We print the first value in the numbers list
print(numbers[0])

# We change the value of the 5th element in the list
numbers[4] = 3 

# We print the entire list:
print(numbers)

3
[3, 7, 1, 50, 3, 12]


### Slicing lists in Python

Slicing a list means cutting out (or slicing out) a part of the list. Getting used to this feature can be very convenient when we write more complex programs:

In [7]:
# We slice the list numbers from element 1 to 3 - 1 = 2 
print(numbers[1:3])

# If we don't set the last element, we take a slice from 1 to the end:
print(numbers[1:])

# We can set the "step" when creating a slice in a third optional parameter.
# Here, we take a slice from index 1 up to (but not including) index 5 with a step of 2, so the slice has 2 elements: those at indexes 1 and 3 of the original list
print(numbers[1:5:2])

# We can assign the result from a slicing operation, that would make a new list:
some_numbers = numbers[1:3]
print(some_numbers)

# numbers and some_numbers are now "independent". If we change something in one of them, it will not affect the other:
some_numbers[0] = 111

print(numbers)
print(some_numbers)

[7, 1]
[7, 1, 50, 3, 12]
[7, 50]
[7, 1]
[3, 7, 1, 50, 3, 12]
[111, 1]
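
The independence shown above comes from the slice, not from the assignment itself. As a hedged illustration with made-up list names, compare a plain assignment (which only gives the same list a second name) with a full slice (which builds a new list):

```python
original = [3, 7, 1]

alias = original               # a second name for the same list
copy_of_original = original[:]  # a full slice creates a new, separate list

alias[0] = 99              # also visible through original
copy_of_original[1] = -1   # only affects the copy

print(original)          # [99, 7, 1]
print(copy_of_original)  # [3, -1, 1]
```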


One cool trick is that we can use a negative step value to reverse a list:

In [8]:
flipped_numbers = numbers[::-1]
print(flipped_numbers)

[12, 3, 50, 1, 7, 3]


In [9]:
# A list can contain strings:
animals = ["squirrel", "cat", "mouse"]
print(animals)

# We can even have different types of data in the same list
mixed_data = [101, "hello", 31, 32, 'python rocks']
print(mixed_data)

['squirrel', 'cat', 'mouse']
[101, 'hello', 31, 32, 'python rocks']


Finally, if we want to know the number of elements in a list (its length), we can use the len() function:

In [10]:
print(len(some_numbers))
print(len(flipped_numbers))
print(len(numbers))

2
6
6
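
Because indexes start at 0, the last valid index of a list is always its length minus 1. A minimal sketch, with an illustrative list called `colors`, combines `len()` and `range()` to visit every position:

```python
colors = ["red", "green", "blue"]

# range(len(colors)) produces the indexes 0, 1 and 2.
for position in range(len(colors)):
    print(position, colors[position])

# The last element sits at index len(colors) - 1.
last_color = colors[len(colors) - 1]
print(last_color)  # blue
```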


### Loading and parsing CSV files

Tables are commonly stored in comma-separated values (CSV) files. We can read CSV files to load the data into our program as follows:

In [11]:
'''
We are using a well-known data file containing housing information 
(prices of houses, and their characteristics such as area, number of bedrooms, etc).
It has been used to predict house prices as a function of their characteristics.
The file can be accessed from this Kaggle page:
https://www.kaggle.com/datasets/yasserh/housing-prices-dataset
'''

f = open('Housing.csv')
lines = f.readlines()
# We slice the lines list to the first 11 elements since the entire file has more than 500 lines
for line in lines[:11]:
    print(line)
f.close()  # remember to close the file when you are done reading it!

price,area,bedrooms,bathrooms,stories,mainroad,guestroom,basement,hotwaterheating,airconditioning,parking,prefarea,furnishingstatus

13300000,7420,4,2,3,yes,no,no,no,yes,2,yes,furnished

12250000,8960,4,4,4,yes,no,no,no,yes,3,no,furnished

12250000,9960,3,2,2,yes,no,yes,no,no,2,yes,semi-furnished

12215000,7500,4,2,2,yes,no,yes,no,yes,3,yes,furnished

11410000,7420,4,1,2,yes,yes,yes,no,yes,2,no,furnished

10850000,7500,3,3,1,yes,no,yes,no,yes,2,yes,semi-furnished

10150000,8580,4,3,4,yes,no,no,no,yes,2,yes,semi-furnished

10150000,16200,5,3,2,yes,no,no,no,no,0,no,unfurnished

9870000,8100,4,1,2,yes,yes,yes,no,yes,2,yes,furnished

9800000,5750,3,2,4,yes,yes,no,no,yes,1,yes,unfurnished
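
The blank lines in the output above appear because every element of `lines` still ends with a newline character, and `print()` adds another one. One possible way to avoid this is sketched below. It writes a tiny sample file into a temporary folder (the file name and the shortened rows are assumptions made so the example runs without `Housing.csv`), reads it back inside a `with` block, which closes the file automatically, and strips the trailing newline from each line:

```python
import os
import tempfile

# A shortened, illustrative sample; the real file has more columns and rows.
sample_text = (
    "price,area,bedrooms\n"
    "13300000,7420,4\n"
    "12250000,8960,4\n"
)

with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "sample.csv")
    with open(sample_path, "w") as out_file:
        out_file.write(sample_text)

    # The file is closed automatically at the end of the with block.
    with open(sample_path) as in_file:
        sample_lines = in_file.readlines()

cleaned_lines = []
for raw_line in sample_lines:
    cleaned = raw_line.rstrip("\n")  # remove the newline at the end of the line
    cleaned_lines.append(cleaned)
    print(cleaned)
```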



__Note:__ Now that we have read the data into the lines list, we don't need to open the file again.

In [12]:
# We skip the first line since it contains the header of the columns
for line in lines[1:11]:
    line_contents = line.split(',')
    print("price =", line_contents[0])
    print("area =", line_contents[1])
    print("bedrooms =", line_contents[2])

price = 13300000
area = 7420
bedrooms = 4
price = 12250000
area = 8960
bedrooms = 4
price = 12250000
area = 9960
bedrooms = 3
price = 12215000
area = 7500
bedrooms = 4
price = 11410000
area = 7420
bedrooms = 4
price = 10850000
area = 7500
bedrooms = 3
price = 10150000
area = 8580
bedrooms = 4
price = 10150000
area = 16200
bedrooms = 5
price = 9870000
area = 8100
bedrooms = 4
price = 9800000
area = 5750
bedrooms = 3
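
Splitting on commas works for a simple file like this one. As an alternative sketch, Python's standard `csv` module can do the splitting for us, and `csv.DictReader` lets us refer to columns by the names in the header. To keep the example self-contained, it reads from a short in-memory text (an assumption; `csv_text` is not the real file) instead of `Housing.csv`:

```python
import csv
import io

# A shortened, illustrative sample with only three of the columns.
csv_text = "price,area,bedrooms\n13300000,7420,4\n12250000,8960,4\n"

# io.StringIO lets us treat a string as if it were an open file.
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

for row in rows:
    print("price =", row["price"], "area =", row["area"], "bedrooms =", row["bedrooms"])
```

The values are still strings at this point, so they need to be converted before doing any arithmetic with them.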


In [13]:
# We now convert the string values in the data into (integer) numbers so we can use them to make some calculations:
mean_price = 0
mean_area = 0
for line in lines[1:]:
    line_contents = line.split(',')
    mean_price = mean_price + int(line_contents[0])

    # The following notation is a short-hand for mean_area = mean_area + int(line_contents[1])
    mean_area += int(line_contents[1])

mean_price = mean_price / (len(lines) - 1)
print("Mean price =", mean_price)

mean_area /= len(lines) - 1
print("Mean area =", mean_area)

Mean price = 4766729.247706422
Mean area = 5150.54128440367
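
The same calculation can also be written by first collecting the converted numbers in lists and then using the built-in `sum()` and `len()` functions. The sketch below uses three shortened rows taken from the output shown earlier, so its results differ from the means of the full file:

```python
# Three illustrative rows (price, area, bedrooms) in the same text format as the file.
sample_rows = ["13300000,7420,4\n", "12250000,8960,4\n", "12250000,9960,3\n"]

prices = []
areas = []
for row in sample_rows:
    fields = row.split(",")
    prices.append(int(fields[0]))  # convert the text to an integer before storing it
    areas.append(int(fields[1]))

sample_mean_price = sum(prices) / len(prices)
sample_mean_area = sum(areas) / len(areas)

print("Mean price =", sample_mean_price)  # Mean price = 12600000.0
print("Mean area =", sample_mean_area)    # Mean area = 8780.0
```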


In [14]:
# We can format how numbers are printed (does not change the value, just the output):
print(f"Mean price = {mean_price:.2f}")
print(f'Mean area = {mean_area:.1f}')

Mean price = 4766729.25
Mean area = 5150.5
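
The format specification after the colon accepts other options as well. For instance, adding a comma inserts thousands separators, which makes large prices easier to read. This sketch reuses the mean price printed above as a literal value (the variable name `example_price` is illustrative):

```python
example_price = 4766729.247706422

with_separators = f"{example_price:,.2f}"  # thousands separators and 2 decimals
no_decimals = f"{example_price:.0f}"       # rounded to a whole number

print(with_separators)  # 4,766,729.25
print(no_decimals)      # 4766729
```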


## Some final tricks

Since we are using Jupyter Notebooks, we can complement our code with things like embedded images and videos:

In [15]:
%%HTML
<iframe width="560" height="315"
 src="https://youtube.com/embed/5pf0_bpNbkw">
</iframe>

![Python Basics Cheat Sheet](https://images.datacamp.com/image/upload/v1694526357/Python_Basics_Cheat_Sheet_27d91b08b7.png)