**Python Basics – Quick Start Guide**

Python was created by Guido van Rossum in the early 90s. It is a high-level, general-purpose programming language. It is known for its clear and concise syntax, which makes it easy to read and write. Python is also dynamically typed, which means that it is not necessary to declare variable types in advance. This makes Python code more flexible and easier to write. Today, it's one of the most popular languages around. We fell in love with Python for its syntactic clarity. It is essentially executable pseudocode. 

It is the most used programming language for [data analytics, data science, and machine learning algorithms, and it is also used in power grid analytics](https://storage.googleapis.com/kaggle-media/surveys/Kaggle%20State%20of%20Machine%20Learning%20and%20Data%20Science%20Report%202022.pdf). It is also a good replacement for MATLAB and suits people who do not want too much strain with coding syntax. According to the [Stack Overflow Survey (2023)](https://survey.stackoverflow.co/2023/#section-most-popular-technologies-other-tools), the language is amongst the most desired and used programming languages. This is certainly due to its simplicity of use and gentle learning path.

> For data analysis and interactive computing and data visualization, Python will inevitably draw comparisons with other open source and commercial programming languages and tools in wide use, such as R, MATLAB, SAS, Stata, and others. In recent years, Python’s improved open source libraries (such as pandas and scikit-learn) have made it a popular choice for data analysis tasks. Combined with Python’s overall strength for general-purpose software engineering, it is an excellent option as a primary language for building data applications.

— Wes McKinney, the creator of Python pandas project and author of [*Python for data analysis: Data wrangling with Pandas, NumPy, and IPython*](https://wesmckinney.com/book/#whats-new-in-the-3rd-edition).

In this part, we are going to tackle the most important concepts of Python.

# Primitive datatypes and operators

The first thing we'll look at is the different types of data in Python (integer, float, boolean, and string).

## Integer

An integer, often referred to as an `int`, is a data type used to represent whole numbers, both positive and negative, without a fractional or decimal part.
In Python, integers are used for performing arithmetic operations, counting, indexing, and more – for example:

In [10]:
3

3

In [11]:
x = 3
x

3

## Float

A floating-point number, often referred to as a `float`, is a data type used to represent numbers with decimal points or fractional parts. Floats are used for representing real numbers. 
They can include both integers and non-integers. They are commonly used for calculations involving measurements, scientific computations, and more – for example:

In [12]:
3.0

3.0

**Remark**: the `print()` function is not mandatory to display a result in a notebook, because the value of the last expression in a cell is shown automatically. To display several results from the same cell, you must use `print()`; otherwise, only the last result is shown.

## Boolean

A boolean is a data type that represents one of two values: `True` or `False`. Booleans are used for logical operations, comparisons, and decision-making in programming. 
They are essential for creating conditional statements and loops, and for controlling program execution based on certain conditions. We can use them as follows:

In [13]:
print(True)
print(False)

True
False


**Remark**: `True` and `False` are built-in constants, and capitalization is important (writing `true` raises a `NameError`).

## String

A string, often referred to as a `str`, is a data type used to represent a sequence of characters. Strings can contain letters, numbers, symbols, and even spaces. 
They are commonly used for working with textual data, such as words, sentences, and more complex text-based information. 
Strings are enclosed in either single quotes (`'`) or double quotes (`"`) – for example:

In [14]:
print('hello')
'1'

hello


'1'

**Remark**: Strings are versatile and can be manipulated, concatenated, and formatted in various ways. They play a crucial role in handling textual information within a Python program.
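A few of these manipulations can be sketched as follows. The variable names (`first`, `second`, `greeting`) are illustrative and not part of the guide:

```python
# Illustrative variable names (assumptions, not from the guide)
first = 'power'
second = 'grid'

# Concatenation with + and repetition with *
greeting = first + ' ' + second
print(greeting)            # power grid
print('-' * 10)            # ----------

# String methods return new strings
print(greeting.upper())    # POWER GRID
print(greeting.title())    # Power Grid

# f-strings embed expressions inside a string
print(f'{greeting} has {len(greeting)} characters')  # power grid has 10 characters
```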

## The `type()` function

To check the type of a value or variable, you can use the `type()` function:

In [15]:
print(type(3))
print(type(3.0))
print(type(True))
print(type('hello'))

<class 'int'>
<class 'float'>
<class 'bool'>
<class 'str'>


In a nutshell, Python has four basic data types:

- Integer: a whole number, positive or negative. For example, `1`, `2`, `3`, `-1`, `-2`, `-3` are integers.
- Float: a number with a decimal point. For example, `3.14`, `2.5`, `-1.23` are floats.
- Boolean: a value that can be either `True` or `False`.
- String: a sequence of characters enclosed in quotes. For example, `'hello'` and `"grids"` are strings.

Here is a complete example of how to declare variables of each data type in a Jupyter Notebook with Python:

In [16]:
integer_var = 10
float_var = 3.14
boolean_var = True
string_var = "grids"

As we have just seen above, we can also use the `type()` function to check the data type of a variable. For example:

In [17]:
print(type(integer_var))
print(type(float_var))
print(type(boolean_var))
print(type(string_var))

<class 'int'>
<class 'float'>
<class 'bool'>
<class 'str'>


## Basic operations

The maths is what you would expect, including integer division that rounds down for positive and negative numbers – for example:

In [18]:
# Integers
print(1+1) # => 2
print(8-1) # => 7
print(10*2) # => 20
print(35/5) # => 7.0

# Integer division rounds down
print(5//3)      # => 1
print(-5//3)      # => -2
print(5.0//3.0)   # => 1.0 # works on floats too
print(-5.0//3.0)  # => -2.0

2
7
20
7.0
1
-2
1.0
-2.0


The result of division with `/` is always a float, even when both operands are integers:

In [19]:
print(type(10.0/3))
print(type(9/3))

<class 'float'>
<class 'float'>


A modulo operation is also possible, as follows:

In [20]:
print(7%3) # => 1
# i % j has the same sign as j, unlike C
print(-7%3)  # => 2

1
2


The modulo operator is widely used in programming languages because it can be used to solve a variety of problems. For example, it can be used to:

- Check if a number is even or odd.
- Find the remainder of a number when it is divided by a certain number.
- Generate pseudorandom numbers.
- Encrypt and decrypt data.

Here is an example of how the modulo operator can be used in Python:
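A minimal sketch covering the first two uses listed above. The helper name `is_even` is hypothetical:

```python
def is_even(n):
    """Return True when n is divisible by 2 (hypothetical helper)."""
    return n % 2 == 0

print(is_even(10))   # True
print(is_even(7))    # False

# divmod() returns the quotient and the remainder in one call
quotient, remainder = divmod(17, 5)
print(quotient, remainder)   # 3 2

# Floor division and modulo are consistent with each other
print((17 // 5) * 5 + 17 % 5)   # 17
```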

**Remark**: the modulo operator, also called the remainder operator, is used to find the remainder when one integer is divided by another. For example, `5` mod `3` is `2`, because `5` divided by `3` has a quotient of `1` and a remainder of `2`.

Exponentiation (`x**y`, `x` to the $y^{th}$ power):

In [21]:
2**3 # => 8

8

Enforce precedence with parentheses:

In [22]:
print(1 + 3 * 2)    # => 7
print((1 + 3) * 2)  # => 8

7
8


Boolean logic operations are also possible, as follows:

In [23]:
print(True & False) # And
print(True | False) # Or

False
True
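`&` and `|` are Python's bitwise operators, which also work on booleans. The keywords `and`, `or`, and `not` are the more usual way to combine conditions. A short sketch, with an illustrative variable `x`:

```python
x = 5  # illustrative value

print(True and False)   # False
print(True or False)    # True
print(not True)         # False

# `and` / `or` short-circuit: the right-hand side is only evaluated when needed
print(x != 0 and 10 / x > 1)   # True

# Comparisons can be chained
print(0 < x < 10)              # True
```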


You can also use the `+` and `*` operators on booleans; they are then treated as the integers `1` and `0`:

In [24]:
print(True + False)
print(True * False)

1
0
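This works because `bool` is a subclass of `int` in Python, so `True` behaves like `1` and `False` like `0`. One practical consequence, sketched here with a made-up list `checks`, is that `sum()` counts the `True` values:

```python
print(isinstance(True, int))   # True
print(True == 1, False == 0)   # True True

checks = [True, False, True, True]  # illustrative data
print(sum(checks))                  # 3
```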


String concatenation:

In [25]:
print('hello' + ' world')

hello world


# Lists

Python knows a number of _compound_ data types, used to group together other values. One of the most versatile is the _list_, which can be written as a collection of comma-separated values (items) between square brackets. Lists might contain items of different types, but usually the items all have the same type.

In [26]:
squares = [1, 4, 9, 16, 25, 36, 49, 64]
print(squares)
print(type(squares))

[1, 4, 9, 16, 25, 36, 49, 64]
<class 'list'>


The items in the list do not have to be of the same type:

In [27]:
my_list = [1, 'A', 1.0, True]
print(my_list)

[1, 'A', 1.0, True]


The built-in function `len()` returns the number of elements of a list:

In [28]:
len(squares)

8

## Indexing and slicing

Like strings (and all other built-in [sequence](https://docs.python.org/3.10/glossary.html#term-sequence) types), lists can be indexed and sliced:

In [29]:
print(squares[0]) # indexing returns the item
print(squares[2])

1
9


**Remark**: pay attention that `0` is the first index. This is a design choice of the language (and differs from MATLAB). The main reasons are the following: direct mapping to memory locations, compatibility with C, a simpler relationship between indices and lengths, etc.

To get the n<sup>th</sup> last element of the list, use the index `-n`:

In [30]:
print(squares[-1])
print(squares[-2])

64
49


To get every element between two boundaries, use the `:` operator. The start index is included and the end index is excluded:

In [31]:
squares[2:5]

[9, 16, 25]

The first and last indices can be omitted:

In [32]:
print(squares[:3])
print(squares[-3:])

[1, 4, 9]
[36, 49, 64]


To keep only every n<sup>th</sup> element of a list, add a step `:n`, as follows:

In [33]:
print(squares[::2])
print(squares[1::2])
print(squares[1:-3:2])

[1, 9, 25, 49]
[4, 16, 36, 64]
[4, 16]
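The general slice form is `sequence[start:stop:step]`, where `stop` is excluded and any of the three parts may be omitted. A short sketch on a fresh list (named `numbers` here, so that `squares` is left untouched):

```python
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # illustrative list

print(numbers[2:8:3])    # [2, 5]
print(numbers[::-1])     # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(numbers[7:2:-2])   # [7, 5, 3]

# Out-of-range slice bounds are clipped instead of raising an error
print(numbers[5:100])    # [5, 6, 7, 8, 9]
```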


All slice operations return a new list containing the requested elements. This means that the following slice returns a [shallow copy](https://docs.python.org/3.10/library/copy.html#shallow-vs-deep-copy) of the list:

In [34]:
squares[:]

[1, 4, 9, 16, 25, 36, 49, 64]
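One way to check that a slice really is a new list is to compare identity with `is` and equality with `==`. A minimal sketch using illustrative names `original` and `duplicate`:

```python
original = [1, 4, 9]
duplicate = original[:]   # full slice: new list with the same items

print(duplicate == original)   # True: same content
print(duplicate is original)   # False: different list objects

duplicate.append(16)           # changing the copy...
print(original)                # [1, 4, 9]  ...leaves the original untouched
```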

## List concatenation and mutation

Lists are a [mutable](https://docs.python.org/3.10/glossary.html#term-mutable) type, i.e. it is possible to change their content. They also support concatenation with `+`, which returns a new list and leaves the original unchanged:

In [35]:
squares + [81, 100]

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

We can also add new items at the end of the list, by using the `append()` method:

In [36]:
squares.append(121) # add 11 squared
squares.append(12 ** 2)  # add 12 squared
squares

[1, 4, 9, 16, 25, 36, 49, 64, 121, 144]
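Note the difference between the two operations: `+` builds a new list, whereas `append()` (and `extend()`, which adds several items at once) modifies the list in place. A small sketch with an illustrative list `cubes`:

```python
cubes = [1, 8, 27]

longer = cubes + [64]      # new list; cubes is unchanged
print(cubes)               # [1, 8, 27]
print(longer)              # [1, 8, 27, 64]

cubes.append(64)           # in place, adds one item
cubes.extend([125, 216])   # in place, adds each item of the iterable
print(cubes)               # [1, 8, 27, 64, 125, 216]
```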

We can also modify an element in a list:

In [37]:
letters = ['a', 'b', 'c', 'x', 'e', 'f', 'g']
letters

['a', 'b', 'c', 'x', 'e', 'f', 'g']

In [38]:
letters[3] = 'd' 
letters

['a', 'b', 'c', 'd', 'e', 'f', 'g']

Assignment to slices is also possible, and this can even change the size of the list or clear it entirely:

In [39]:
??

Now, we can replace some values:

In [40]:
letters[2:5] = ['X']
letters

['a', 'b', 'X', 'f', 'g']

Then we can remove them:

In [41]:
letters[2:5] = []
letters

['a', 'b']

We can then clear the list by replacing all the elements with an empty list:

In [42]:
letters[:] = []
letters

[]
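Slice assignment can also insert items without replacing any, by assigning to an empty slice. A sketch on a fresh illustrative list `phases`:

```python
phases = ['a', 'b', 'e']

phases[2:2] = ['c', 'd']   # empty slice at index 2: pure insertion
print(phases)              # ['a', 'b', 'c', 'd', 'e']

phases[1:4] = ['X', 'Y']   # replace three items with two
print(phases)              # ['a', 'X', 'Y', 'e']
```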

## Nested list

It is possible to nest lists (create lists containing other lists) – for example:

In [43]:
a = ['a', 'b', 'c']
n = [1, 2, 3]
x = [a, n]
x

[['a', 'b', 'c'], [1, 2, 3]]

In [44]:
print(x[0])
print(x[0][1])

['a', 'b', 'c']
b
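A nested list stores references to its inner lists rather than copies of them, so a change made through `a` is also visible through `x`. A minimal sketch that rebuilds the same lists:

```python
a = ['a', 'b', 'c']
n = [1, 2, 3]
x = [a, n]

a.append('d')          # modify the inner list through its own name
print(x[0])            # ['a', 'b', 'c', 'd']
print(x[0] is a)       # True: x[0] and a are the same object

# Loop over the outer list, then inspect each inner list
for inner in x:
    print(len(inner), inner)
```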


To go further, see the other [built-in list methods](https://docs.python.org/3.10/tutorial/datastructures.html#tut-structures).

# Dictionary

Dictionaries are Python’s implementation of a data structure that is more generally known as an associative array. A dictionary consists of a collection of key-value pairs. Each key-value pair maps the key to its associated value.

In [45]:
my_dict = {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 5, 'F': 6, 'G': 7}
print(type(my_dict))
my_dict

<class 'dict'>


{'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 5, 'F': 6, 'G': 7}

**Remark**: pay attention that every key has to be unique.
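If the same key appears twice in a dictionary literal, Python does not raise an error: the last value silently wins. A short sketch with an illustrative dictionary `voltages`:

```python
voltages = {'bus_1': 110, 'bus_2': 220, 'bus_1': 400}  # 'bus_1' is repeated on purpose

print(voltages)        # {'bus_1': 400, 'bus_2': 220}
print(len(voltages))   # 2
```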

The built-in function `len()` also returns the number of key-value pairs in a dictionary:

In [46]:
len(my_dict)

7

## Dictionary data access

To access a value, index the dictionary with the corresponding key:

In [47]:
print(my_dict['A'])
print(my_dict['E'])

1
5


In [48]:
print(my_dict.keys())
print(my_dict.values())

dict_keys(['A', 'B', 'C', 'D', 'E', 'F', 'G'])
dict_values([1, 2, 3, 4, 5, 6, 7])


In [49]:
print(list(my_dict.keys()))
print(list(my_dict.values()))

['A', 'B', 'C', 'D', 'E', 'F', 'G']
[1, 2, 3, 4, 5, 6, 7]
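Accessing a key that does not exist raises a `KeyError`. The `in` operator and the `get()` method are two common ways to handle this. A sketch using a small illustrative dictionary `letters_rank`:

```python
letters_rank = {'A': 1, 'B': 2}  # illustrative data

print('A' in letters_rank)        # True
print('Z' in letters_rank)        # False

# get() returns None (or a chosen default) instead of raising KeyError
print(letters_rank.get('Z'))      # None
print(letters_rank.get('Z', 0))   # 0

try:
    letters_rank['Z']
except KeyError as error:
    print('Missing key:', error)  # Missing key: 'Z'
```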


## Dictionary concatenation and mutation

The `update()` method merges one dictionary into another:

In [50]:
my_dict.update({'H': 8, 'I': 9, 'J': 10})
my_dict

{'A': 1,
 'B': 2,
 'C': 3,
 'D': 4,
 'E': 5,
 'F': 6,
 'G': 7,
 'H': 8,
 'I': 9,
 'J': 10}

**Remark**: pay attention that this method returns nothing (`None`) and updates the dictionary in place.
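When both dictionaries share a key, `update()` overwrites the existing value. A minimal sketch with illustrative dictionaries `base` and `extra`:

```python
base = {'A': 1, 'B': 2}
extra = {'B': 20, 'C': 30}

result = base.update(extra)   # modifies base in place
print(result)                 # None
print(base)                   # {'A': 1, 'B': 20, 'C': 30}
print(extra)                  # {'B': 20, 'C': 30}
```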

We can also add a single key-value pair:

In [51]:
my_dict['K'] = 11
my_dict

{'A': 1,
 'B': 2,
 'C': 3,
 'D': 4,
 'E': 5,
 'F': 6,
 'G': 7,
 'H': 8,
 'I': 9,
 'J': 10,
 'K': 11}

To remove a key-value pair, use the `del` statement:

In [52]:
del my_dict['A']
my_dict

{'B': 2,
 'C': 3,
 'D': 4,
 'E': 5,
 'F': 6,
 'G': 7,
 'H': 8,
 'I': 9,
 'J': 10,
 'K': 11}
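An alternative is the `pop()` method, which removes the key and returns its value. A sketch with an illustrative dictionary `stock`:

```python
stock = {'cables': 12, 'fuses': 40, 'relays': 7}

removed = stock.pop('fuses')        # remove the key and get its value back
print(removed)                      # 40
print(stock)                        # {'cables': 12, 'relays': 7}

# A default avoids a KeyError when the key is absent
print(stock.pop('breakers', None))  # None

del stock['cables']                 # del raises KeyError if the key is missing
print(stock)                        # {'relays': 7}
```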

## Nested dictionary

It is also possible to nest dictionaries:

In [53]:
nested_dict = {
    1: {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 5, 'F': 6, 'G': 7},
    'other_dict': {'a': 'A', 'b': 'B', 1: 'A', 2: 'B'}
}
nested_dict

{1: {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 5, 'F': 6, 'G': 7},
 'other_dict': {'a': 'A', 'b': 'B', 1: 'A', 2: 'B'}}
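Values in a nested dictionary are reached by chaining keys, one level at a time. A sketch with a hypothetical `grid` dictionary (the keys and values are made up for illustration):

```python
grid = {
    'bus_1': {'voltage_kv': 110, 'loads': ['L1', 'L2']},
    'bus_2': {'voltage_kv': 20, 'loads': []},
}

print(grid['bus_1']['voltage_kv'])   # 110
print(grid['bus_1']['loads'][1])     # L2

# Add or change an inner value through the same chain of keys
grid['bus_2']['voltage_kv'] = 10
print(grid['bus_2'])                 # {'voltage_kv': 10, 'loads': []}
```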

# Python basic programming statements

## `if` statements

Perhaps the most well-known statement type is the [`if`](https://docs.python.org/3.10/reference/compound_stmts.html#if) statement – for example:

In [54]:
x = 42
if x < 0:
    x = 0
    print('Negative changed to zero')
elif x == 0:
    print('Zero')
elif x == 1:
    print('Single')
else:
    print('More')

More


**Remark**: pay attention to the syntax: there is no closing keyword for an `if` statement. Every line below an `if` that is indented (by a tab or, more commonly, four spaces) belongs to the `if` block.

There can be zero or more [`elif`](https://docs.python.org/3.10/reference/compound_stmts.html#elif) parts, and the [`else`](https://docs.python.org/3.10/reference/compound_stmts.html#else) part is optional. The keyword `elif` is short for 'else if', and is useful to avoid excessive indentation. An `if` ... `elif` ... `elif` ... sequence is a substitute for the `switch` or `case` statements found in other languages.


In [55]:
x = 42
if (x >= 0) & (x <= 1):
    x = 0
    print('x between 0 and 1')
elif (x >= 3) | (x <= -4):
    print('x greater than 3 or smaller than -4')
else:
    print('other')

x greater than 3 or smaller than -4


## `while` statements

With the while statement, we can execute a set of statements as long as a condition is true. For instance, we can write an initial sub-sequence of the [Fibonacci series](https://en.wikipedia.org/wiki/Fibonacci_number) as follows:

In [56]:
# The sum of two elements defines the next
a, b = 0, 1
while a < 10:
    print(a)
    a, b = b, a+b

0
1
1
2
3
5
8


This example introduces new features, let's go through them:

- The first line contains a *multiple assignment*: the variables a and b simultaneously get the new values 0 and 1. On the last line this is used again, demonstrating that the expressions on the right-hand side are all evaluated first before any of the assignments take place. The right-hand side expressions are evaluated from the left to the right.

- The [`while`](https://docs.python.org/3.10/reference/compound_stmts.html#while) loop executes as long as the condition (here: `a < 10`) remains true. The condition may also be a string or list value; anything with a non-zero length is true, empty sequences are false. The test used in the above example is a simple comparison. The standard comparison operators are written the same as in C: `<` (less than), `>` (greater than), `==` (equal to), `<=` (less than or equal to), `>=` (greater than or equal to) and `!=` (not equal to).

- The body of this loop is indented: indentation is Python's way of grouping statements, and each line within a basic block must be indented by the same amount. At the interactive prompt, you have to type a <kbd>Tab</kbd> or space(s) for each indented line. In practice you will prepare more complicated input for Python with a text editor or an IDE (VS Code, Jupyter, etc.); all decent text editors have an auto-indent facility. When a compound statement is entered interactively, it must be followed by a blank line to indicate completion (since the parser cannot guess when you have typed the last line).

- The `print()` function writes the value of the argument(s) it is given and differs from just writing the expression you want to write in the way it handles multiple arguments, floating point quantities, and strings. Strings are printed without quotes and a space is inserted between items so you can format things nicely.

The keyword argument _end_ can be used to avoid the newline after the output, or end the output with a different string:

In [57]:
a, b = 0, 1
while a < 1000:
    print(a, end=', ')
    a, b = b, a+b

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 
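Two points from the list above, multiple assignment and the truth value of sequences, can be sketched as follows. The names `left`, `right`, and `pending` are illustrative:

```python
# Multiple assignment evaluates the right-hand side first,
# so two variables can be swapped without a temporary one
left, right = 1, 2
left, right = right, left
print(left, right)   # 2 1

# A non-empty list is true, an empty list is false
pending = ['a', 'b', 'c']
while pending:
    item = pending.pop()   # removes and returns the last item
    print(item, end=' ')   # prints c, then b, then a on one line
print()
print(pending)             # []
```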

## `for` statements

The [`for`](https://docs.python.org/3.10/reference/compound_stmts.html#for) statement in Python iterates over the items of any sequence or other iterable (list, dictionary, string, ...), in the order that they appear. This differs from what you may be used to in C or Pascal: rather than always iterating over an arithmetic progression of numbers (as in Pascal), or giving the user the ability to define both the iteration step and halting condition (as in C), Python's `for` loops directly over the items.

In [58]:
# Measure some strings
words = ['cat', 'window', 'defenestrate']
for w in words:
    print(w, len(w))

cat 3
window 6
defenestrate 12


You can get the index of each element together with the element itself using `enumerate()`:

In [59]:
for i, w in enumerate(words):
    print(i, w)

0 cat
1 window
2 defenestrate
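`enumerate()` also accepts a `start` argument when the numbering should not begin at 0. A short sketch that rebuilds the `words` list so it runs on its own (the name `numbered` is illustrative):

```python
words = ['cat', 'window', 'defenestrate']

# Number the items from 1 instead of 0
for position, w in enumerate(words, start=1):
    print(position, w)

# enumerate() yields (index, item) tuples
numbered = list(enumerate(words, start=1))
print(numbered)   # [(1, 'cat'), (2, 'window'), (3, 'defenestrate')]
```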


In [60]:
# Create a sample collection
users = {'Hans': 'active', 'Éléonore': 'inactive', 'Charles': 'active'}

for user, status in users.items():
    print(user, ": ", status)

Hans :  active
Éléonore :  inactive
Charles :  active


Be careful with code that modifies a collection while iterating over that same collection: it can be tricky to get right. It is usually safer to create a new collection:

In [61]:
# Strategy:  Create a new collection
active_users = {}
for user, status in users.items():
    if status == 'active':
        active_users[user] = status

active_users

{'Hans': 'active', 'Charles': 'active'}
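The other common strategy, also shown in the official Python tutorial, is to iterate over a copy of the collection while modifying the original. A sketch that rebuilds the `users` dictionary so it runs on its own:

```python
users = {'Hans': 'active', 'Éléonore': 'inactive', 'Charles': 'active'}

# Strategy: iterate over a copy, delete from the original
for user, status in users.copy().items():
    if status == 'inactive':
        del users[user]

print(users)   # {'Hans': 'active', 'Charles': 'active'}
```

Deleting from `users` while looping over `users.items()` directly would raise a `RuntimeError`, because the dictionary changes size during iteration.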

To go further, you can find other [Looping Techniques](https://docs.python.org/3.10/tutorial/datastructures.html#tut-loopidioms).

## `range()` function

If you need to iterate over a sequence of numbers, the built-in function `range()` will come in handy. It generates arithmetic progressions:

In [62]:
range(10)

range(0, 10)

In [63]:
for i in range(5):
    print(i)

0
1
2
3
4


The given end point is never part of the generated sequence; `range(10)` generates 10 values, the legal indices for items of a sequence of length 10. It is possible to let the range start at another number, or to specify a different increment (even negative; sometimes this is called the 'step'):

In [64]:
print(list(range(5, 10)))
print(list(range(0, 10, 3)))
print(list(range(-10, -100, -30)))

[5, 6, 7, 8, 9]
[0, 3, 6, 9]
[-10, -40, -70]


To iterate over the indices of a sequence, you can combine [`range()`](https://docs.python.org/3.10/library/stdtypes.html#range) and [`len()`](https://docs.python.org/3.10/library/functions.html#len) as follows:

In [65]:
a = ['I', 'like', 'the', 'power', 'grid', 'studies']
for i in range(len(a)):
    print(i, a[i])

0 I
1 like
2 the
3 power
4 grid
5 studies


In many ways the object returned by `range()` behaves as if it is a list, but in fact it isn’t. It is an object which returns the successive items of the desired sequence when you iterate over it, but it doesn’t really make the list, thus saving space.
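A rough illustration of this laziness: a `range` object can describe a very long progression and still answer length, index, and membership queries without creating the items. The name `evens` is illustrative:

```python
evens = range(0, 1_000_000_000, 2)   # no list of 500 million items is built

print(len(evens))              # 500000000
print(evens[10])               # 20
print(999_999_998 in evens)    # True
print(7 in evens)              # False

# Convert to a list only when the items are really needed
print(list(evens[:5]))         # [0, 2, 4, 6, 8]
```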

We say that such an object is [iterable](https://docs.python.org/3.10/glossary.html#term-iterable), meaning that it is suitable as a target for functions and constructs that expect something from which they can obtain successive elements until the supply is exhausted. We've seen that the [`for`](https://docs.python.org/3.10/reference/compound_stmts.html#for) statement is such a construct, while an example of a function that takes an iterable object is [`sum()`](https://docs.python.org/3.10/library/functions.html#sum):

In [66]:
sum(range(4)) # 0 + 1 + 2 + 3

6

Later we will see more functions that return iterables and take iterables as arguments. The [Data Structures](https://docs.python.org/3.10/tutorial/datastructures.html#tut-structures) chapter of the official Python tutorial discusses [`list()`](https://docs.python.org/3.10/library/stdtypes.html#list) in more detail.


# Shallow copy and deep copy in Python

We will go through the different ways of copying a data structure correctly.

In [1]:
A = [1, 2, 3]
B = A
B.append(4)

print(A)
print(B)

[1, 2, 3, 4]
[1, 2, 3, 4]


In this case, `B` and `A` are both references to the same list, so when we modified the list through `B`, `A` changed as well. How, then, can we copy the list `A` to `B`? In other words, how can we modify `B` without changing the values of `A`?

## Shallow copy

In [3]:
import copy

A = [1, 2, 3]
B = A.copy()
#B = copy.copy(A)  # The same as `B = A.copy()`
B.append(4)

print(A)
print(B)

[1, 2, 3]
[1, 2, 3, 4]


As the above example shows, we can use `B = A.copy()` or `B = copy.copy(A)` to perform the copy. After that, `B` and `A` are independent of each other: we can modify `B` without affecting `A`.

In the previous case, we used a plain assignment. An assignment does not create a new list in memory; it only creates a new reference `B` to the existing list. In other words, `A` and `B` are two references to the same list, so any change made through `B` is also visible through `A`.

With the `copy()` method, a new list is actually created with the same values as the original list. Therefore, we can modify `A` or `B` independently.
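The difference between the two cases can be checked with the `is` operator, which tells whether two names refer to the same object. A minimal sketch (the names `alias` and `clone` are illustrative):

```python
A = [1, 2, 3]
alias = A          # assignment: a second name for the same list
clone = A.copy()   # shallow copy: a new list with the same items

print(alias is A)   # True
print(clone is A)   # False
print(clone == A)   # True: equal content

alias.append(4)
print(A)            # [1, 2, 3, 4]
print(clone)        # [1, 2, 3]
```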

## What does the 'shallow' mean?

It means that `B = A.copy()` and `B = copy.copy(A)` both make a shallow copy: they do not fully copy nested objects, but only copy the first level. For example:

In [None]:
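A = [[1, 2], ['a', 'b']]
B = copy.copy(A)  # The same as `B = A.copy()`
B.append(4)
print(A)
# [[1, 2], ['a', 'b']]
print(B)
# [[1, 2], ['a', 'b'], 4]

B[0].append(3)
print(A)
# [[1, 2, 3], ['a', 'b']]
print(B)
# [[1, 2, 3], ['a', 'b'], 4]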

Now the list has two levels. Running this cell shows that the first level was copied, but the second level is still shared by reference: by modifying `B[0]`, `A[0]` was changed as well. In a nutshell, the `copy.copy()` function only copies the first level of a nested object. The deeper levels are still shared by reference.

This 'lazy' behaviour makes copy operations more efficient, because creating new lists for every nested level costs time and memory, and it is consistent with Python's design philosophy.

However, what can we do if we really want a complete copy?

## Deep copy

We can use `copy.deepcopy()` to copy a nested object completely, no matter how many nested levels it has.

In [4]:
A = [[1, 2], ['a', 'b']]
B = copy.deepcopy(A)

B.append(4)
print(A)
# [[1, 2], ['a', 'b']]
print(B)
# [[1, 2], ['a', 'b'], 4]

B[0].append(3)
print(A)
# [[1, 2], ['a', 'b']]
print(B)
# [[1, 2, 3], ['a', 'b'], 4]

[[1, 2], ['a', 'b']]
[[1, 2], ['a', 'b'], 4]
[[1, 2], ['a', 'b']]
[[1, 2, 3], ['a', 'b'], 4]


As the above example shows, no matter how we modify `B`, `A` is never affected: `B` is a complete copy of `A`, and nothing is shared by reference after a `deepcopy()` operation.
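The `is` operator, which tells whether two names refer to the same object, makes the difference between the two kinds of copy visible on the inner lists. A sketch with illustrative names `shallow` and `deep`:

```python
import copy

A = [[1, 2], ['a', 'b']]
shallow = copy.copy(A)
deep = copy.deepcopy(A)

# The shallow copy shares its inner lists with A, the deep copy does not
print(shallow[0] is A[0])   # True
print(deep[0] is A[0])      # False

A[0].append(3)
print(shallow[0])           # [1, 2, 3]
print(deep[0])              # [1, 2]
```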

NB: Because a `deepcopy()` operation always copies all nested objects, it can take a long time when the objects are large.

For efficiency, the `=` operation in Python binds a new name to the existing object (a reference) rather than copying its value. If we really need to copy an object, we should use `copy.copy()` for a shallow copy or `copy.deepcopy()` for a deep copy. We must be careful when using `deepcopy()` on large objects, because it can be slow.

For more details, please check the following short tutorials:
- [Shallow Copy and Deep Copy in Python](https://medium.com/techtofreedom/shallow-copy-and-deep-copy-in-python-78c3e7cf2617)
- [Shallow and deep copy operations](https://docs.python.org/3.10/library/copy.html#shallow-vs-deep-copy)

# References for Python's basics

If you want to get more insights and learn the basics in Python, please visit the following two very good references:

- [Learn X in Y minutes](https://learnxinyminutes.com/docs/python/)
- [The Python Tutorial](https://docs.python.org/3/tutorial/)