## Python, IPython, Jupyter
Python is an interpreted programming language. A Python script is usually run from beginning to end: you cannot pause the program, keep its state, modify something and then resume. IPython (Interactive Python) was created to solve this problem. It lets you work on code interactively and keeps objects in memory for as long as the session (kernel) is running, so you can always return to them. You can also run other code or interrupt it, and the objects created so far stay in memory. Jupyter, in turn, is a graphical interface for running IPython.

There are many environments for working with Python: simple text editors (Notepad, vim, nano, Notepad++, Sublime Text), Jupyter Notebook, and IDEs (PyCharm, Spyder etc.). Everyone is free to choose their preferred solution. For this course Jupyter Notebook is sufficient: it is convenient and fast, has a user-friendly interface and makes it easy to write readable, structured text comments.

## Using Notebook

Using a notebook is very easy. Some important pieces of information:
* You run a single cell using Shift+Enter (focus moves to the next cell) or Ctrl+Enter (focus stays in the same cell)
* To run all cells, or e.g. all cells below the active one, open the Cell menu and choose the appropriate Run option.

You may use the icons above:
* Rectangle: stop current action
* +: add new, empty cell below currently active cell
* Arrows up/down: move cells
* Scissors: cut cell; two pieces of paper: copy cell
* Dropdown menu: choose cell type (Markdown - text, Code - Python code)
* Kernel (notebook) is restarted using Kernel > Restart or a Refresh (circle) arrow. Restarting a notebook makes you lose all results.
* Once in a while, a notebook is saved automatically. However, at crucial moments it is good practice to save manually (Ctrl+S). You should also do this before running new code.

It is suggested that you get used to keyboard shortcuts. If you do anything repeatedly, you should search for a keyboard shortcut and start using it. Even if at the beginning it is slower, after a while work will become much faster.
* Ctrl+/ : comment a line
* Shift+Del or Ctrl+D : delete current line
* Tab : indent a line or multiple lines, Shift+Tab : remove indentation (both at the beginning of a line or if many lines have been chosen)
* Tab : autocomplete (at the end of a line)
* Shift+Tab : show documentation of an object (usually between parentheses: () )
* Help > Keyboard Shortcuts : show other keyboard shortcuts
* Some people may miss a shortcut to duplicate a line. Fortunately, there is an Open Source solution: [Line duplication](https://github.com/jupyter/notebook/issues/1816)
* If you don't like the classic Notebook look, you can change it: [Themes](https://github.com/dunovank/jupyter-themes)

Notebook also offers keyboard shortcuts for working on cells. You are either inside a cell (edit mode) or working on whole cells (command mode). Switch between the modes using Esc (command mode) and Enter (edit mode). In command mode additional shortcuts are available, e.g.:
* A : add cell above
* B : add cell below
* C : copy a cell, X : cut, V : paste
* Shift+Up : choose cells above, Shift+Down : choose cells below.


# Basics: code organization and data structures
## Code organization
Unlike many other programming languages, Python organizes code by indentation. Whitespace at the beginning of a line cannot be used freely (blank lines are fine). Code blocks are defined by indentation levels, not by curly brackets {}.
The code below is incorrect and will not run:

In [None]:
x = 1
    y = 2

In [None]:
# You need to remove unnecessary indentation to run the code.
x = 1

y = 2

At the beginning it may be frustrating, but you will get used to it soon. Thanks to this approach, the code is always readable. Indentation is always right, and the number of unnecessary characters (for example {}) and lines is limited. Semicolons (;) are not needed at the end of a line.
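
As a minimal sketch of how indentation defines blocks, the example below nests an `if`/`else` inside a `for` loop. The names `values` and `evenValues` are made up for this illustration.

```python
# Indentation decides which statements belong to the loop and to if/else.
values = [1, 2, 3, 4]
evenValues = []
for n in values:
    if n % 2 == 0:
        # Two levels deep: runs only for even values.
        evenValues.append(n)
    else:
        print(n, "is odd")
# Back at the left margin: runs once, after the loop has finished.
print(evenValues)
```

Moving the last `print` one level to the right would make it part of the loop, so it would run once per element instead of once at the end.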

**Tip: Tab and Shift+Tab indent and unindent one or multiple lines (for a single line the cursor must be at the beginning of the line, unless you use the alternative shortcuts Ctrl+] and Ctrl+[, which also work on multiple lines)**

## Print function and "hello world"
The "print" function displays text.

In [None]:
print("Hello world! :)")

# Since Python 3 print is a function, so using parentheses is required.
# In older versions of Python the following code was also right:
# print "Hello world! :)"
# Usually a print statement without parentheses is the easiest way to distinguish Python 2 from 3.

## Variables
Python is an interpreted language (it does not require compiling) with a dynamic type system: you are not required to declare the types of variables. The interpreter determines the type from the assigned value.

In [None]:
e = 2.72
pi = "3.14"
text = "Hello world!"
print("Type of variable e:", type(e),
      ", type of variable text: ", type(text),
      ", type of variable pi: ", type(pi))
# Inside the open parenthesis of print, the call may span multiple lines with any indentation.

The print function can take multiple arguments.

A variable's content can also be displayed in another way:

In [None]:
print("Type of variable e: %s, type of variable text: %s, type of variable pi: %s"
      % (type(e), type(text), type(pi)))

This way you have more control over formatting; for example, we avoid the unnecessary space before a comma.
You can read more here: https://pyformat.info/
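
Besides the `%` operator, Python offers `str.format` and, since Python 3.6, f-strings. The sketch below is only an illustration; the variables are declared again so that the block runs on its own, and `line1` and `line2` are names made up for this example.

```python
e = 2.72
text = "Hello world!"
# str.format fills the {} placeholders in order.
line1 = "e = {}, text = {}".format(e, text)
# f-strings embed expressions directly; ":.1f" rounds to one decimal place.
line2 = f"e = {e:.1f}, text = {text}"
print(line1)
print(line2)
```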

A dynamic type system has both advantages:
* faster code writing
* less code

and disadvantages:
* longer running time
* possibility of errors which are difficult to debug.
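
A small sketch of the kind of error meant above. The function `double` is hypothetical; because no types are declared, it accepts both a number and a string, and the second call runs without any error although its result is probably not what was intended.

```python
def double(value):
    # No declared type: works for any value that supports "*".
    return value * 2

resultNumber = double(3.14)    # numeric multiplication
resultText = double("3.14")    # string repetition, no error is raised
print(resultNumber, resultText)
```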

Python allows easy typecasting (changing a variable type):

In [None]:
# Concatenating two strings using operator "+":
print(str(e) + pi)
# Adding two numbers:
print(e + float(pi))

In [None]:
# Fortunately, this is not possible:
print(e + float(text))
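
The failed conversion above raises a `ValueError`. If a program should continue in such a case, the error can be caught. The helper `toFloat` below is a hypothetical name used only for this sketch.

```python
def toFloat(value):
    """Return value converted to float, or None if the text is not a number."""
    try:
        return float(value)
    except ValueError:
        # float() raises ValueError for text that is not a valid number.
        return None

print(toFloat("3.14"))
print(toFloat("Hello world!"))
```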

## Operators
Besides the obvious operators (+, -, /, \*), integer division (// - quotient, % - remainder) and exponentiation (\*\*) are available.

Comparison operators are usual: >=, >, <=, <, ==, !=.

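Logical operators: and, or, not. The symbols & (AND), | (OR), ^ (XOR) and ~ (NOT) are bitwise operators. &, | and ^ also work on boolean values, but ~ does not negate a boolean (~True is -2), so use not instead.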
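
A short sketch of these operators in action. It also uses the keyword operators `and`, `or` and `not`, which Python provides for boolean logic alongside the symbols. The variable names are made up for this illustration.

```python
print(7 // 2)    # quotient of integer division
print(7 % 2)     # remainder
print(2 ** 10)   # exponentiation
# divmod returns the quotient and the remainder at once.
quotient, remainder = divmod(7, 2)

flagA, flagB = True, False
print(flagA and flagB, flagA or flagB, not flagA)   # keyword operators
print(flagA & flagB, flagA | flagB, flagA ^ flagB)  # symbols applied to booleans
```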

## Data structures
### Objects
Before describing other data structures you should know something about objects.

Simplifying, the difference between objects and functions is as follows:
**A function is a set of instructions to run. It has no state and cannot "exist". Objects exist, there may be many objects simultaneously, and each one may be in a different state.**

To put it simply, an object is a complex element which may have multiple variables and methods (functions). Both the variables that belong to an object and its methods are accessed using a dot.

``` python
# Show the contents of "variable1", which is an element of object1
print(object1.variable1)
# A method is called in a similar way
result = object1.function1(pi)
```
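
The snippet above is schematic, because `object1` is not defined anywhere. A runnable sketch of the same idea follows. The class `Counter` is hypothetical and serves only to show that two objects of the same kind can be in different states.

```python
class Counter:
    """Hypothetical example class: an object that keeps a count as its state."""

    def __init__(self):
        self.count = 0

    def increase(self, step):
        # A method can read and change the state of its own object.
        self.count = self.count + step
        return self.count

counter1 = Counter()
counter2 = Counter()
counter1.increase(5)
# Variables and methods are accessed using a dot.
print(counter1.count, counter2.count)
```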

### Basic data structures
Four basic data structures in Python are:
* lists
* tuples
* dictionaries
* sets

An array type exists in Python, but it is rarely used in practice. NumPy is used for large tables and numerical matrices, and in other cases lists are easier to use. Full documentation is available here: https://docs.python.org/3/tutorial/datastructures.html

### Lists
Lists are a convenient and flexible method of storing data. Lists are dynamic and may change their state (are mutable). You can extend and modify them, which makes them very practical in everyday usage. This code shows their capabilities:

In [None]:
# Lists are created using square brackets
emptyList = []
# You can also create a new list object.
emptyList2 = list()
print(emptyList, emptyList2)
colors = ["red", "blue", "green", "orange"]
# Lists are indexed by a number
print(colors[0])
print(colors[:])
# Print elements [1,3)
print(colors[1:3])

In [None]:
# You can append an element to the end of the list in two ways
colors[len(colors):] = ["yellow"]
print(colors)
# or use an existing method of a list object.
colors.append("black")
print(colors)
# You can also insert an element at a given position inside the list
colors.insert(2, "black")
print(colors)

In [None]:
# Count how many times an element occurs:
print("Count 'black':", colors.count("black"))
# Does an element exist in the list?
print("Is 'black' in the colors list?:", "black" in colors)
print("Is 'black' not in the colors list?:", "black" not in colors)

In [None]:
# There are two ways of deleting elements
numbers = [4, 5, 6]
print(numbers)
# Delete the first element equal to a given value
numbers.remove(5)
print(numbers)

numbers = [4, 5, 6]
# Delete element using an index
numbers.pop(1)
print(numbers)

In [None]:
numbers = [4, 5, 6]
# You can also reverse a list
numbers.reverse()
print(numbers)

In [None]:
colors = ["red", "blue", "green"]
numbers = [4, 5, 6]
# Lists are very flexible. They allow you to have different data types in one list
mixedList = colors + numbers
print(mixedList)
# You can even create a list of lists and other combinations
mixedList1 = list(colors)
mixedList1.append(numbers)
print(mixedList1)
mixedList2 = []
mixedList2.append(colors)
mixedList2.append(numbers)
print(mixedList2)

In the cell above the function list() created a new list. Why did we have to use it to create mixedList1? What would happen if you wrote mixedList1 = colors instead of calling the function?

You can try this variant and consider why the result is different. We will return to this question soon.
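
For readers who want to experiment right away, here is a minimal sketch of the difference between assigning a list to a second name and copying it with `list()`. The names `original`, `alias` and `originalCopy` are made up so that the lists from the cells above are not modified.

```python
original = ["red", "blue", "green"]
alias = original               # a second name for the same list object
originalCopy = list(original)  # a new list with the same elements

alias.append("black")          # changes the one list shared by both names
originalCopy.append("white")   # changes only the copy

print(original)
print(originalCopy)
print(alias is original, originalCopy is original)
```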

Sorting lists is of course also possible. How do you sort a list in reverse order? Place the cursor on "sort" or inside the parentheses after it and press Shift+Tab.

In [None]:
# Sort:
colors.sort()
print(colors)
print("Length before clearing", len(colors))
# ... and clear:
colors.clear()
print(colors)

### Sets and lists
Sets are most closely related to lists. You can perform similar operations as in the case of lists. Changing one data structure to another is very simple.

In [None]:
colors1 = ['black', 'blue', 'green','yellow']
colors2 = ['black', 'green', 'orange', 'red']
# You can join two lists using addition
allColors = colors1 + colors2
print(allColors)

As you can see, "black" and "green" appear twice in the resulting list. What if we want to treat the lists as sets and avoid duplicates? We can convert the lists to sets and then join them.
To avoid mistakes and keep things logically consistent, the operator for joining sets (|) differs from the one for lists (+).

Note that sets use a different bracket type: {}. A new set is created with the class name set() or by putting elements in {}. NOTICE: typing "var={}" creates a dictionary, not a set; see more below.

In [None]:
# Just like in the case of lists, you can create a new object.
emptySet = set()
print(emptySet)
colors3 = {'brown', 'navy'}
print(colors3)
allColors = set(colors1) | set(colors2) | colors3
print(allColors)
# Add one more color to allColors: you "add" to a set (and "append" to a list)
allColors.add("violet")
print(allColors)
# discard "red" from set (equivalent in lists: remove)
allColors.discard("red")
print(allColors)
# Convert set to list
allColors = list(allColors)
print(allColors)

All standard set operations are available (shown here in their in-place form, for sets x, y):
* Union: x.update(y), using an operator: x |= y
* Intersection: x.intersection_update(y), using an operator: x &= y
* Difference: x.difference_update(y), using an operator: x -= y
* Symmetric difference: x.symmetric_difference_update(y), using an operator: x ^= y

For example:

In [None]:
commonColors = set(colors1) & set(colors2)
print(commonColors)
notCommonColors = set(colors1) ^ set(colors2)
print(notCommonColors)
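
The cell above uses `&` and `^`, which return a new set. The in-place forms from the list modify the set on the left instead. A minimal sketch of the difference, with sets `x` and `y` made up for this illustration:

```python
x = {"black", "blue", "green"}
y = {"black", "green", "orange"}

joined = x | y   # returns a new set, x stays unchanged
print(joined == {"black", "blue", "green", "orange"}, len(x))

x &= y           # in place: same effect as x.intersection_update(y)
print(x == {"black", "green"})
```

Sets are unordered, so the sketch compares them with `==` instead of relying on the order in which elements are printed.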

You can also check whether one set is contained in another:

In [None]:
# ">=" tests for a superset; ">" would additionally require the two sets to differ.
print("Is colors2 a subset of colors1?", set(colors1) >= set(colors2))
print("Is ['black', 'blue'] a subset of colors1?", set(colors1) >= set(['black', 'blue']))

### Dictionaries
Dictionaries are similar to sets. A set guarantees the uniqueness of its elements, so its elements can serve as keys (indexes). Every key has a value assigned to it (possibly None). Dictionaries are, in a way, an extension of sets: a set holds only unique values, while a dictionary holds unique keys together with their values. We can create an empty dict using {}.

In [None]:
emptyDict = {}
emptyDict2 = dict()
print(emptyDict, emptyDict2)
author = {'name': 'Maciej', 'surname':'Wilamowski', 'age': 31}
print(author)
# adding elements is simply defining value for a key
author["height"] = 192
print(author)
# key may also be a number
author[1] = "Python >> R"
print(author)
# you can delete a single element from a dictionary
del author["age"]
print(author)

In [None]:
# Just as in the case of sets you can join/update two dictionaries
authorsAge = {'age': 32}
author.update(authorsAge)
print(author)

The other set operations (intersection, difference, symmetric difference) are not defined for dictionaries, because two dictionaries may hold different values for the same key, so many operations would be ambiguous.
It is, however, possible to perform such operations on the keys alone.

In [None]:
print(set(author.keys()))
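
Converting to `set` is not strictly needed for this: the object returned by `keys()` supports set operators directly. A minimal sketch with two dictionaries invented for the example:

```python
person1 = {"name": "Ann", "age": 30}
person2 = {"name": "Bob", "height": 180}

# Key views support set operators; the results are ordinary sets.
commonKeys = person1.keys() & person2.keys()
onlyInFirst = person1.keys() - person2.keys()
print(commonKeys, onlyInFirst)
```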

Sometimes you may want to print key/value pairs from a dictionary:

In [None]:
print(author.items())
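
The pairs returned by `items()` are usually processed in a loop, one key and value at a time. A minimal sketch; the dictionary `person` and the list `lines` are made up for this illustration.

```python
person = {"name": "Ann", "age": 30}
lines = []
# Each element of items() is a (key, value) pair that can be unpacked directly.
for key, value in person.items():
    lines.append("%s: %s" % (key, value))
print("\n".join(lines))
```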

### Tuples
Tuples are the last of the four basic data structures. In practice, this data structure is not used directly very often in data analysis. Tuples are similar to lists in some ways: you can choose elements by index, and tuples may contain values of various types, be nested, etc.

Tuples are written with parentheses "(". As a reminder: lists use square brackets "[" and sets use curly brackets "{".

In [None]:
tuple1 = (5, 10, 15, "Hurray!")
print(tuple1)
print(tuple1[1])
print(tuple1[3])

There is, however, a substantial difference. Tuples are static (immutable) objects: they cannot be changed after being created, and it is not possible to add elements to them (see the code below).

This is completely different from lists, sets and dictionaries, which are dynamic and mutable. That makes them flexible and convenient for fast programming.

Why do tuples and static objects exist? Because of their efficiency. Since we cannot do "anything" with them, they are simple, so access to them can be fast and efficient. In practice the only case in which we will use tuples is returning multiple values from a function.

In [None]:
# This raises a TypeError, because tuples do not support item assignment.
tuple1[1] = 5
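
Returning multiple values from a function, mentioned above as the typical use of tuples, can be sketched as follows. The function `minAndMax` is a hypothetical example.

```python
def minAndMax(values):
    # The comma creates a tuple; the parentheses are optional here.
    return min(values), max(values)

result = minAndMax([4, 5, 6])
print(result, type(result))
# A returned tuple can be unpacked into separate variables.
lowest, highest = minAndMax([4, 5, 6])
print(lowest, highest)
```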

### Strings
Strings are another immutable data type. They will be covered in more depth later on. In Python, both single ' and double " quotes may be used to create strings, and triple quotes (''' or """) create strings that span multiple lines.

In [None]:
a = "hello"
b = 'world'
# You can print first four characters in order
print(a[0], a[1], a[2], a[3])
# Or at once
print(a[0:4])
# Adding two strings together creates a third object; a and b remain unchanged.
print(a + b)
c = '''This is a string
that contains
many
lines ... '''
print(c)
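
A minimal sketch of what immutability means for strings: assigning to a single character fails, so a modified text has to be built as a new string. The names `word`, `changed` and `newWord` are made up for this illustration.

```python
word = "hello"
try:
    word[0] = "J"        # strings do not support item assignment
    changed = True
except TypeError:
    changed = False

# Build a new string instead; the original stays as it was.
newWord = "J" + word[1:]
print(changed, word, newWord)
```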