**Introductory and intermediate computing for Data Science [Barcelona School of Economics]**

`Instructor:` Maxim Fedotov  
`Program:` M.Sc. in Data Science Methodology

# Class 1.

# Variables

A variable in Python has a *name* and points to an associated *memory cell* where some *value* is stored. To define a variable, we assign a value (more generally, the result of an expression) to a chosen name using `=`. That is, the construction is `name = expression`. The name of a variable is also called an *identifier*.

As an example, consider the problem of computing the area of a rectangle with sides 3 and 5.

In [None]:
length = 3
height = 5.0
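
With both sides stored in variables, the area is simply their product. A minimal sketch (the name `area` is introduced here for illustration, and the two values are repeated so that the snippet runs on its own):

```python
length = 3
height = 5.0

# the area of a rectangle is the product of its sides
area = length * height
print(area)  # 15.0 (a float, because `height` is a float)
```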

There are a couple of simple ways to display the value of a variable. For instance, you can use the function `print(...)` to print the value.

In [38]:
# try to print the value of the 'length' variable


In a Jupyter Python notebook you can simply type the name of a variable as the last line of a cell and run it. This displays the corresponding value below the cell.

In [39]:
# try to display the value of the 'height' variable


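We can find the numeric identifier of the object that a variable name points to with the function `id(...)`. In CPython, the standard Python implementation, this identifier is the object's address in memory. You can also get a hexadecimal representation of this address by applying the function `hex(...)`.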

In [40]:
print("A numeric ID of a memory cell to which the `length` variable points to:", id(length))
print("A HEX representation of this address is:", hex(id(length)))

A numeric ID of a memory cell to which the `length` variable points to: 4364599600
A HEX representation of this address is: 0x104268130


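Note that if you create another variable with the same value, it may have the same id. CPython keeps a single shared object for each small integer (from -5 to 256), so both names below point to the same object. This is an implementation detail and does not hold for arbitrary values: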

In [41]:
another_length = 3
print(f"For the `length` variable, the id is: {id(length)}", 
      f"For the `another_length`: {id(another_length)}", 
      sep="\n")

For the `length` variable, the id is: 4364599600
For the `another_length`: 4364599600
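
Sharing an id is not guaranteed for equal values in general. A small sketch with two lists (the names `first` and `second` are illustrative) shows two distinct objects that hold equal values:

```python
first = [1, 2, 3]
second = [1, 2, 3]

print(first == second)          # True: the values are equal
print(id(first) == id(second))  # False: two separate objects in memory
```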


Note that it is possible to *overwrite* a variable, i.e. to replace the value that is assigned to a specific name.

In [42]:
length = 6  # the new value could also be of a different type; Python is flexible in this sense.
print(length)

6


### Naming a variable

The conventional way to spell a variable name in Python is called *snake_case*: if a name combines several words, they are separated by `_`. Letters in a name should preferably be lower case, although this is not required. Note that names (or identifiers) in Python are case-sensitive. Also, a variable name cannot start with a digit.

Generally, any name that you introduce in a program should reflect the nature of the named entity, so try to convey the purpose of an entity in its name. For example, `car_speed = 50`. This example also shows that code can be hard to understand without additional comments. You can comment your code using the hash character `#`, like this:

`car_speed = 50  # note that everywhere in the program speed is measured in km/h`. 

However, if you name a constant that has a well-known conventional symbol, a short non-specific name can also work. For example,

`c = 299792458  # speed of light in m/s`.
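
These naming rules can be checked directly. The sketch below uses the string method `str.isidentifier()` to test whether a piece of text is a valid name; the example names are made up for illustration:

```python
print("car_speed".isidentifier())  # True: a snake_case name
print("2fast".isidentifier())      # False: a name cannot start with a digit

# names are case-sensitive, so these are two different variables
speed = 50
Speed = 90
print(speed, Speed)  # 50 90
```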

## Data Types

To understand what we can do with our variables, we must consider *data types* in Python, that is, what kinds of values our variables can hold.

First, let's consider some very basic *built-in* types that you will deal with most of the time.

* Boolean values: 
    * <span style="color:green"> bool </span> (boolean) - a logical value, either `True` or `False`.
* Numeric types: 
    * <span style="color:green"> int </span> (integer)
    * <span style="color:green"> float </span> (floating-point number) - a finite-precision approximation of a real number.
* Text sequence type: <span style="color:green"> str </span> (string) 
* The null object: <span style="color:green"> None </span> (its type is called `NoneType`)
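
One practical consequence of how `float` works: values are stored in binary with finite precision, so some decimal fractions are only approximated. A small sketch of the effect, using `math.isclose` from the standard library for a tolerant comparison (the name `total` is illustrative):

```python
import math

total = 0.1 + 0.2
print(total == 0.3)              # False: the stored values carry tiny rounding errors
print(math.isclose(total, 0.3))  # True: equal up to a small tolerance
```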

In [16]:
my_boolean = True

my_integer = 10
my_float = 5.

my_string = "hey there"  # You can also use single-quotes "'".

my_no_value_object = None

You can find out the type of a variable using the function `type(...)`.

In [15]:
type(length)

int
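
The same check works for every basic type listed above. A short sketch (literal values are used directly so that the snippet runs on its own):

```python
print(type(True))         # <class 'bool'>
print(type(10))           # <class 'int'>
print(type(5.0))          # <class 'float'>
print(type("hey there"))  # <class 'str'>
print(type(None))         # <class 'NoneType'>
```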

### Other types

There are many other data types in Python 3. We won't go through other built-in types in detail right now, but we will consider a significant part of them during the course. 

If you want to go deeper into the language, it is good practice to consult the documentation on the aspects you need. For example, have a look at the [built-in types](https://docs.python.org/3/library/stdtypes.html#) page in the online Python documentation, but do not be too ambitious about it for now. The documentation is quite dense and will become easier to follow as you gain experience with the language.

## Introduction to expressions

In this section we will learn what we can do with variables. An *expression* is a composite element of a program built from smaller lexical elements (identifiers, literals, operators and so on) that are combined according to the rules of the syntax. Simply put, we need expressions to "express a value". Do not think too much about the definition right now; it only serves to draw an analogy with the languages we speak.

### Operations

An *operation* is one kind of *expression* in Python. You have probably seen most of Python's *operators* before. The complete list of operators in Python is:

![image.png](attachment:image.png)
Source: [Python documentation](https://docs.python.org/3/reference/lexical_analysis.html#operators-1).

Today we are going to consider some of them.

Although the operators might seem familiar, they should be treated with care because:

(1) Some operators are not applicable to some types, e.g. the expression `1 + "add me"` is invalid. If you apply an operator to unsupported types, you will get `TypeError: unsupported operand type(s) ...`.

(2) The effect of some operators depends on the types of the operands, e.g. `2 * 2` gives `4`; try to see what `"double" * 2` returns.

In [21]:
# try here
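
Point (1) can be observed without stopping the program by catching the exception. A minimal sketch (`try`/`except` is used here only to keep the snippet running, and the name `message` is illustrative):

```python
try:
    1 + "add me"  # int + str is not supported
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```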


### Arithmetic operations

Writing arithmetic expressions in Python is quite intuitive, especially those with addition, subtraction, multiplication and division of two numeric values. The order of operations is the same as in mathematics. As usual, parentheses let you change it.

In [34]:
print(2 * 3 + (5 - 1) / 2)

8.0
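
The result is `8.0` rather than `8` because the division operator `/` always returns a `float`, even when both operands are integers. A quick sketch (the name `quotient` is illustrative):

```python
quotient = 4 / 2
print(quotient)        # 2.0
print(type(quotient))  # <class 'float'>

print(2 * 3 + 2)       # 8: no division involved, so the result stays an int
```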


Let's dwell on some operators that might be less familiar.

The *power operator* in Python is `**`. The *integer (floor) division* operation `left_operand // right_operand` returns the quotient of the left operand divided by the right operand, rounded down to the nearest whole number. The *modulo* operation `left_operand % right_operand` returns the remainder of the division of the left operand by the right operand.
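
A few small values make these three operators concrete. The numbers are arbitrary illustrations; the last line checks the identity that ties `//` and `%` together:

```python
print(2 ** 10)  # 1024
print(7 // 2)   # 3
print(7 % 2)    # 1

# with a negative operand, // rounds down (towards minus infinity)
print(-7 // 2)  # -4
print(-7 % 2)   # 1

a, b = 7, 2
print(a == (a // b) * b + a % b)  # True
```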

In [47]:
# print the result of raising 4 to the power of 60:


# find the integer part of the division of the previous quantity by 4 and print it:


# see what the remainder of the division of the previous value by 2 is:


# can you find the answer to the previous question by using a composite arithmetic expression?


Some operations are customizable, which means that they can be defined with the same syntax for objects other than numerical values. You have already seen an example: `"double" * 2`. So we now make a short digression to introduce what `+` and `*` do with sequence objects (strings are one example of such objects).

#### Specific behavior of summation and multiplication applied to some (text and other) sequence type objects

Everything is pretty simple. If the operands are sequence objects (or expressions that give sequence objects), then `+` represents *concatenation*. So far, strings are the only sequence objects we know. Let's see how concatenation works with them.

In [44]:
"I can't live without you --<" + ">-- We are meant to be together"

"I can't live without you --<>-- We are meant to be together"

As you can see, Python concatenates strings straightforwardly without adding any space character in between.

The multiplication operator `*` repeats a sequence: `sequence_object * integer` gives the original sequence object (e.g. a string) concatenated with itself so that it appears `integer` times in total. The operands can be written in either order, so `integer * sequence_object` gives the same result.

In [9]:
"double" * 2

'doubledouble'
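
Repetition is handy for building simple text patterns. A short sketch (the names `line` and `title` are illustrative); it also shows that repeating zero times gives an empty string:

```python
line = "-" * 20
title = "Report"

print(line)
print(title)
print(line)

print("ab" * 0 == "")  # True: zero repetitions give an empty string
```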

Later in the course we will also talk about other sequence type objects like lists and tuples that are also subject to these operations.

See the examples below:

In [20]:
[1, 2] * 2

[1, 2, 1, 2]

In [25]:
["Y", "O"] + ["L", "O"]

['Y', 'O', 'L', 'O']

In [22]:
(1, 2) * 2

(1, 2, 1, 2)

In [26]:
("Just", "the", "two") + ("of", "us")

('Just', 'the', 'two', 'of', 'us')

### Comparisons

Sometimes we need our program to perform differently depending on different conditions, <a id='taxation-example'> e.g. we could be solving a taxation problem where salaries below 1000 EUR are nontaxable and others are subject to 13% tax (applied only to the amount above 1000).</a> For the purpose of numeric comparison, we have the following operators: `>` (greater), `<` (less), `>=` (greater or equal), `<=` (less or equal), `==` (equal), `!=` (not equal).

Try to modify the following short piece of code to check whether the input salary is below 1000:

In [32]:
salary = int(input())  # input() function here represents a user input from stdin, but it reads it as a string;
                       # that is why we convert it to an integer by applying the function int().

print("The input salary is below 1000:", ) # fill in a logical expression after the comma

1000
The input salary is below 1000: False
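
Each comparison evaluates to a boolean value. A small sketch with an arbitrary number (the name `temperature` is illustrative); Python also allows chaining comparisons, as in mathematical notation:

```python
temperature = 21

print(temperature > 25)        # False
print(temperature != 25)       # True
print(temperature >= 21)       # True
print(18 <= temperature < 25)  # True: chained comparison
```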


There are also *identity comparisons*: `is` and `is not`. They will be quite helpful when you process real data. For example, you can check whether a variable has a value or not:

In [42]:
variable = 5

print("The variable has a value:", variable is not None)

The variable has a value: True
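
Checking against `None` with `is` matters because a legitimate value such as `0` is not the same thing as a missing value. A minimal sketch (both variable names are illustrative):

```python
measurement = 0       # a real observation that happens to be zero
missing_entry = None  # no observation at all

print(measurement is not None)    # True
print(missing_entry is not None)  # False
print(missing_entry is None)      # True
```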


Last but not least are *membership test operations*, which are also considered a type of comparison in Python. They are performed with the keywords `in` and `not in`. We will have a closer look at these operations when we consider *container objects*.
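
As a brief preview, here is how membership tests look with a string and a list. The values are arbitrary examples:

```python
print("at" in "data")     # True: "at" is a substring of "data"
print(3 in [1, 2, 3])     # True: 3 is an element of the list
print("z" not in "data")  # True: "data" contains no "z"
```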

### Boolean operations

The logical "and" test is done with the `and` keyword, and the logical "or" test with `or`. Logical negation is done with the `not` keyword. See the examples below:

In [39]:
print(True and False)

print(True or False)

print(not True)

False
True
False


In the [taxation example](#taxation-example), consider a new rule: if the tax paid this year on sources of income other than one's salary is above 300 EUR and the salary before tax is taxable, then a tax deduction of 3% is applied to the taxable part of the salary.

In [None]:
salary_before_tax = 1500

tax_not_salary = int(input())

print("Tax deduction should be applied:", )  # fill in a logical expression after the comma

Note that objects other than booleans can also be interpreted as boolean values. For example, the following values are interpreted as false: `False`, `None`, numeric zero of all types, and empty strings and containers (tuples, lists, dictionaries, sets and frozensets). This helps to write concise conditional expressions, which we will discuss later.
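
The built-in function `bool(...)` shows how a value is interpreted. A short sketch with a few typical cases:

```python
print(bool(0))     # False
print(bool(0.0))   # False
print(bool(""))    # False
print(bool([]))    # False
print(bool(None))  # False

# non-zero numbers and non-empty strings or containers are interpreted as true
print(bool(-1))    # True
print(bool("0"))   # True: the string is not empty
print(bool([0]))   # True: the list has one element
```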

## A note on type conversion in Python

You have already seen that we used the function `int(...)` to convert an input from `str` to `int`. There are also other functions that convert a value to another type. Of course, this works only if the provided value is convertible to the requested type. The functions used for explicit type conversion have the same names as the types.

In [59]:
initial_integer = 5
integer_to_float = float(initial_integer)
float_to_string = str(integer_to_float)
string_to_float = float(float_to_string)
float_to_integer = int(string_to_float)
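
Two details are worth knowing about these functions: `int(...)` applied to a float drops the fractional part, and a conversion that makes no sense raises `ValueError`. A sketch with illustrative values (`try`/`except` is used only to keep the snippet running):

```python
print(int(3.9))       # 3: the fractional part is truncated, not rounded
print(int(-3.9))      # -3
print(float("2.5"))   # 2.5
print(str(10) + "!")  # 10!

try:
    int("ten")  # the text does not represent a number
except ValueError as error:
    print("ValueError:", error)
```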

Will you be able to convert `float_to_string` directly to an integer?

In [None]:
# try here


There is also *implicit type conversion* in Python. Here are several numeric examples: 
* If at least one value in an expression is a complex number, then the result is converted to a complex number.
* If there are no complex numbers but there is a float, the result is converted to float.
* If there are no complex or float values but there is an integer, the result is an int (except for true division `/`, which always returns a float).
* Boolean values `True` and `False` in numerical expressions are implicitly converted to 1 and 0 respectively.

The mechanism behind all these examples is that numeric types in Python are ordered from narrower to wider: `bool`, `int`, `float`, `complex`. When values of different types occur together in one arithmetic expression, the interpreter converts the narrower type to the wider one.

In [117]:
print(type(4.1 + 5))
print(type(5 // 2.5))
print(type(5 // 2))
print(True * 5 + False * -1)

<class 'float'>
<class 'float'>
<class 'int'>
5
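
The complex case and a common use of boolean arithmetic can be sketched in the same way. The list `flags` is made up for illustration:

```python
print(type(2 + 1.5 + 3j))  # <class 'complex'>

# True counts as 1 and False as 0, so summing booleans counts the True values
flags = [True, False, True, True]
print(sum(flags))   # 3
print(True + True)  # 2
```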
