# Introduction to Python

## 1. About Python

We start with the basics of the Python language: data types, data structures, and control flow. When you have finished this notebook, you should be familiar with the following:

* different data-types (string, integer, float, boolean, list, tuple, dictionary, set)
* slicing strings, string concatenation, lower() and upper()
* print()
* control flow tools (if-statements (if, elif, else), for-loop, range(), while-loop, write your own functions)
* use import

You can find a lot of good information about Python on the internet. A valuable source is the Python documentation: https://docs.python.org/3/

## 2. Basic Python, Data Types, Print

We start the course with some basic Python:

In [None]:
print("Hello world!")

This statement prints the text "Hello world!". `print()` is a function, and it prints what is inside the parentheses. We call this the argument. The part between the quotes is called a string and it consists of a number of characters, including a space and an exclamation mark. The function `print()` can have more arguments.

In [None]:
print("Hello world!", "How are you?")

You can add comments after #. It is strongly recommended to add comments to your code. Comments not only clarify the code for new readers, they are also helpful for yourself, especially if you haven't worked on your code for a while.

The string is not the only data type in Python. There are also other data types, such as the integer. In the following cell the value 5 is assigned to the variable a. 5 is an integer. Integers are whole numbers.

In [None]:
a = 5 # 5 is an integer.

Python stores the data type of an object in memory. You can find out the data type of an object with the function `type()`.

In [None]:
print(a)

In [None]:
print(type(a))

In [None]:
print(type("Hello world!"))

Numbers with a decimal point are a separate data type, called float (floating-point number). In the following cell numeric values are assigned to the variables b, c and d. Assignment takes place with the = sign.

In [None]:
b = 5. # this is a float, note the decimal '.'.
c = 2.3
d = float(5)

In [None]:
print(type(b))
print(type(c))
print(type(d))

You can use the variable name to do calculations.

In [None]:
print(b * 5) # multiplication
print(b ** 5) # power

It is important to know what the type of an object is. Let's have a look at the difference between a string and an integer.

What happens if you do `'5' + '5'` ?

In [None]:
'5' + '5' # This is called string concatenation. Strings are 'glued' together. You can do this with any kind of strings,
          # e.g. 'etc' + 'bc' + '_' + '40'

Multiplying a string by an integer repeats the string, so the following gives the same result.

In [None]:
'5' * 2

This differs from the addition of two integers.

In [None]:
5 + 5

If you try to add a string to an integer, Python raises a `TypeError`.

In [None]:
'5' + 5

You can convert the string '5' into an integer with `int()`.

In [None]:
num_str = '5'

int(num_str) + 5
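The conversion also works in the other direction. As a small sketch (the variable names `count` and `label` are only illustrative), the built-in function `str()` turns a number into a string, so that it can be concatenated with other strings:

```python
# str() converts a number into a string; int() converts it back.
count = 5                         # an integer
label = 'chapter ' + str(count)   # concatenation works, because both parts are strings
print(label)

# Converting back to an integer lets you calculate with the value again.
print(int(str(count)) + 5)
```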

Another important data type is the boolean. A boolean variable can have two values: `True` and `False`.

In [None]:
bool_var = True

type(bool_var)

## 3. Conditions, If Statements

Notice the difference between an assignment (=) and 'is equal to' (==). 

In [None]:
4 == 5 # this is evaluated to False

`!=` means 'is not equal to'.

In [None]:
4 != 5

There are also > (greater than), < (less than), >= (greater than or equal to), and <= (less than or equal to).

In [None]:
4 <= 5

In [None]:
4 >= 5

The `if` statement checks if a certain condition is evaluated to True.

In [None]:
a = 0

if a == 0:
    print('Hello,')

In [None]:
if a != 0: 
    print('Pythonistas!')

The `if` statement ends with a colon. In Python it is required that the line after the colon is indented (with a tab or 4 spaces). Later you will see the colon also after `for` and `while` statements and in a function header, which starts with `def`.

To the `if` statement you can add zero or more `elif`'s and an optional `else`.

In [None]:
if a == 0:
    print('a is 0') # what is printed here is one string

elif a < 10:
    print('a is greater than 0 and smaller than 10')

elif a < 15:
    print('a is greater than 9 and smaller than 15')

else:
    print('a is greater than 14')

## 4. While Loops

With `while` you loop until a specific condition is not evaluated to the value `True` anymore.

In [None]:
number = 10

while number > 0:
    print(number)
    number -= 1

In the example above you saw number -= 1. This is a short way of writing `number = number - 1`. Similarly you have +=, \*= and /= in Python.

In [None]:
number2 = 2

while number2 < 25:
    print(number2)
    number2 += 1
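The other shorthand operators mentioned above work in the same way. A minimal sketch, with `value` as an illustrative variable name:

```python
value = 10

value += 5   # same as value = value + 5
print(value)

value *= 2   # same as value = value * 2
print(value)

value /= 4   # same as value = value / 4; division with / always gives a float
print(value)
```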

## 5. Writing Functions

You use a function for operations that have to be done more than once in a program. The advantage of using functions is that you need to write the piece of code in the function only once, which keeps your code concise. Every time you want that piece of code to be executed, you call the function.

The structure of a function is as follows:


    def function_name(arg1, arg2, ...):
        """
        This is the docstring, here you explain what the function does. Use triple quotes.
        It can cover more than one line.
        """
        function body
        ...
        ...
        return certain_object

    
If a function name consists of several words, it is good Python style to write it in lowercase with the words separated by underscores, as in function_name. This is called snake case. There is a whole document about Python coding style. You can find it here: https://www.python.org/dev/peps/pep-0008/ You can refer to it as PEP 8. There are many things that you can do in various ways in Python. Often, one of those ways is considered the "Pythonic way". This Pythonic way gives you clean and readable code. Many of these conventions are described in PEP 8.

In [None]:
def cubic_calculator(num):
    """calculates cube of a number"""
    
    cub_num = num**3
    return cub_num

Now we call the function.

In [None]:
print(cubic_calculator(4))

print(cubic_calculator(6))

If you want to work further with the result of a function, you assign its value to a new variable.

In [None]:
new_var = cubic_calculator(10)

A function often has more than one argument. An argument can have a default value.

In [None]:
def addition(num_a, num_b = 5):
    '''adds two numbers together'''
    
    num_c = num_a + num_b
    return num_c

In the next cell the function is called. The value 10 in the function call corresponds with num_a in the function definition, and the value 12 in the function call corresponds with num_b.

In [None]:
print(addition(10, 12))

Now we give the function call only one argument. The second argument gets the default value 5 in this case.

In [None]:
print(addition(10))
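Arguments can also be passed by name, using the parameter names from the function definition. The sketch below repeats the definition of `addition` so that it runs on its own:

```python
def addition(num_a, num_b=5):
    """Add two numbers together; num_b defaults to 5."""
    return num_a + num_b

# Arguments passed by name (keyword arguments).
print(addition(num_a=10, num_b=12))

# With names, the order of the arguments does not matter.
print(addition(num_b=1, num_a=2))
```

Naming the arguments makes a function call easier to read, especially when a function has many parameters.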

## 6. Lists

If a script only prints values, it does not remember them, so it is impossible to use them later in the program. If you want Python to remember a collection of values, for instance the names of books, you can store them in a list.

A list is an ordered sequence of elements. You can recognize it by its square brackets. Here an empty list is initialized, which is called a_list.

In [None]:
a_list = [] # this is equivalent to a_list = list()

You can add elements to the list with the method `.append()`.

In [None]:
a_list.append(40)

print(a_list)

We add a string to a_list.

In [None]:
a_list.append('workshop_at')

print(a_list)

A list can also be populated manually.

In [None]:
unsorted_list = [3, 2, 1, 5]

print(unsorted_list)

You can sort a list with the built-in function `sorted()`.

In [None]:
print(sorted(unsorted_list))

And you can sort it in reverse (descending) order with the argument `reverse`.

In [None]:
print(sorted(unsorted_list, reverse = True))
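Note that `sorted()` returns a new list and leaves the original list unchanged. A short sketch (the name `sorted_copy` is illustrative):

```python
unsorted_list = [3, 2, 1, 5]

# sorted() builds a new, sorted list.
sorted_copy = sorted(unsorted_list)

print(sorted_copy)    # the new list is sorted
print(unsorted_list)  # the original list keeps its order
```

If you want to keep the sorted result, assign it to a variable, as is done here.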

## 7. For loops and lists

You can loop over the elements in a list and do something with them, by using the so-called for loop. 

The line with `for` ends with a colon. In Python it is required that the line following the colon is indented.

In [2]:
num_list = [1, 2, 3, 10]

for num in num_list:
    print(num)

1
2
3
10


In [3]:
for num in num_list:
    print(num**2)

1
4
9
100


In [5]:
another_num_list = [20, 30, 40]

for num in num_list:
    another_num_list.append(num)
    
another_num_list

[20, 30, 40, 1, 2, 3, 10]

In [7]:
another_num_list = [20, 30, 40]

for num in num_list:
    if num > 5:
        another_num_list.append(num)
    
another_num_list

[20, 30, 40, 10]

We can make a numeric sequence with `range()` and loop over it with a `for` loop.

In [None]:
for number in range(15):
    print(number)

Look carefully at what `range()` does in the following examples.

In [None]:
for number in range(4, 15): # with two arguments we have range(start, stop)
    print(number)

In [None]:
for number in range(4, 15, 2): # with three arguments we have range(start, stop, step)
    print(number)

Note that the arguments of `range()` are integers.

Instead of just printing the integers, you can store them in a list.

In [None]:
integer_list = []

for number in range(15):
    integer_list.append(number)
    
print(integer_list)
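The loop above can be shortened. As a sketch, the built-in function `list()` converts a `range()` directly into a list (the names `integer_list_short` and `even_numbers` are illustrative):

```python
# list() turns the numbers produced by range() into a list in one step.
integer_list_short = list(range(15))
print(integer_list_short)

# The same works with start, stop and step.
even_numbers = list(range(0, 15, 2))
print(even_numbers)
```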

In [None]:
print(len(integer_list))

We want to retrieve the first element in this list. You can do this by using an index with []. The first element in a list is retrieved with index 0, because Python is zero based.

In [None]:
print(integer_list[0])

Now we would like to find out what the first ten elements of the list are.

In [None]:
print(integer_list[0:10]) # the stop index 10 itself is not included

When you slice a list, you can also use the step:

In [None]:
integer_list[2:5:2]

Here, you use [start: stop: step]. If the step size should be bigger:

In [None]:
integer_list[::3]

If you do not use the start and stop, you simply slice the whole list with, in this case, step size 3.

The last element in a list is retrieved with index -1.

In [None]:
print(integer_list[-1])

And the last ten elements? There is nothing after the colon, which means that the slice runs from the tenth-to-last element up to and including the last element.

In [None]:
print(integer_list[-10:])

## 9. Some Details about Lists and Strings

A Python list can contain elements of different data types.

In [None]:
varied_list = [True, 5, 5.0, 'Hebrew']

print(type(varied_list[0]))
print(type(varied_list[1]))
print(type(varied_list[2]))
print(type(varied_list[3]))

The following does not work, because the list has only four elements, with the indices 0 to 3.

In [None]:
print(type(varied_list[4]))
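One way to avoid such an error is to compare the index with the length of the list first. A minimal sketch, with `position` as an illustrative variable name:

```python
varied_list = [True, 5, 5.0, 'Hebrew']
position = 4

# Valid indices run from 0 to len(varied_list) - 1.
if position < len(varied_list):
    print(type(varied_list[position]))
else:
    print('Index', position, 'is out of range; the last valid index is', len(varied_list) - 1)
```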

The elements of a list can also be lists.

In [None]:
list_of_lists = [[1, 2], [3, 4], [5, 6]] # this is a list of lists.
                                         # you can also say that three lists are nested in list_of_lists.

print(list_of_lists[0])

We use a double index to access the individual integers.

In [None]:
print(list_of_lists[0][1])

A list comprehension is a fast and clean way to create a list.

In [None]:
another_list = [number**2 for number in range(12,20)]

print(another_list)
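A list comprehension does the same as a `for` loop with `.append()`, in a single line. The sketch below builds the same list in both ways and compares the results (the names `squares_loop` and `squares_comprehension` are illustrative):

```python
# Version 1: a for loop with append.
squares_loop = []
for number in range(12, 20):
    squares_loop.append(number**2)

# Version 2: the equivalent list comprehension.
squares_comprehension = [number**2 for number in range(12, 20)]

# Both versions give the same list.
print(squares_loop == squares_comprehension)
```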

You can find the minimum and maximum values in a list with the functions `min()` and `max()`.

In [None]:
print(min(another_list))

print(max(another_list))

It can be very useful to retrieve the position of a certain value in a list. You do that with the method `.index()`.

In [None]:
highest_value = max(another_list)

pos_of_highest = another_list.index(highest_value)

print(pos_of_highest)

Here is an example of a list comprehension with strings. Look at what `lower()` and `upper()` do.

In [None]:
books_list = ['Genesis', 'Exodus', 'Leviticus']

In [None]:
book_list_lower = [book.lower() for book in books_list]

print(book_list_lower)

In [None]:
book_list_upper = [book.upper() for book in books_list]

print(book_list_upper)

Often you need to make slices of a string.

In [None]:
book_string = 'Genesis'

If you want to retrieve the first letter of a string, you use the index 0.

In [None]:
print(book_string[0])

And if you need the first three letters, you use the index 0:3.

In [None]:
print(book_string[0:3])

If you want to know the last letter, the index is -1.

In [None]:
print(book_string[-1])

And finally, if you want to know the last three letters, you use -3: . 

In [None]:
print(book_string[-3:])

## 10. Dictionaries and Counting Object Types

A dictionary is a structure which contains key-value pairs. You can recognize a dictionary by the curly brackets. A dictionary is initialized as follows.

In [None]:
geo_dict = {'Germany': 'Berlin', 'Belgium': 'Brussels', 'Italy': 'Rome'}

# the dict geo_dict is populated manually. An empty dict would be initialized with:
# geo_dict = {}
# or: geo_dict = dict()

The geo_dict contains three keys, 'Germany', 'Belgium', and 'Italy', and three values. Between key and value you see a colon, and the key:value pairs are separated by commas. How many elements does this dictionary contain?

In [None]:
geo_dict_len = len(geo_dict)

print(geo_dict_len)

We can retrieve the value of a specific key as follows.

In [None]:
print(geo_dict['Italy']) # returns the value of the key 'Italy'.

You can add new key:value pairs to a dictionary.

In [None]:
geo_dict['Denmark'] = 'Copenhagen'

print(geo_dict)

If you want to iterate over all the keys, you use `.keys()`.

In [None]:
for country in geo_dict.keys():
    print(geo_dict[country])
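If you need both the key and the value in the loop, you can use the dictionary method `.items()`, which gives the key:value pairs one by one. A small sketch that repeats the definition of `geo_dict` so that it runs on its own:

```python
geo_dict = {'Germany': 'Berlin', 'Belgium': 'Brussels', 'Italy': 'Rome'}

# .items() yields one (key, value) pair per iteration.
for country, capital in geo_dict.items():
    print(country, capital)
```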

A specific key can only occur once in a dictionary:

In [None]:
geo_dict['Belgium'] = 'Antwerpen'

In [None]:
geo_dict

You see the old value is overwritten.

## 11. Sets

The set is another basic data type in Python. In contrast to the list, it contains unique elements only, and it has no order. You can use a set if you want to know which unique elements there are in a large amount of data. First we look at a simple example.

In [None]:
integer_set = set() # an empty set is initialized

print(integer_set) # prints the empty set

With .add() you can add an element to a set. In this case we add the integer 5.

In [None]:
integer_set.add(5)

print(integer_set)

Now we add another integer.

In [None]:
integer_set.add(27)

print(integer_set)

Now we try to add 5 again.

In [None]:
integer_set.add(5)

print(integer_set)
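Because a set keeps every element only once, converting a list to a set is a quick way to find the unique elements in your data. A minimal sketch with made-up data (the list `book_mentions` is only an illustration):

```python
# A list in which some values occur more than once.
book_mentions = ['Genesis', 'Exodus', 'Genesis', 'Leviticus', 'Exodus']

# set() keeps each value only once.
unique_books = set(book_mentions)

print(len(book_mentions))    # number of elements in the list
print(len(unique_books))     # number of unique elements
print(sorted(unique_books))  # a set has no order, so we sort it for display
```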

## 12. Pandas Dataframe

A pandas dataframe is a 2-dimensional structure, which has rows and columns. Each column has its own data type. 

First, you need to load the pandas library. If it is not installed yet, you do `pip install pandas`.

In [None]:
import pandas as pd

You import pandas and give it a new, short name pd. This is not required, but it is a widely used convention, so other people who read your code can easily recognize that pd refers to pandas.

Now we read a csv file using the function read_csv() from the pandas package.

In [None]:
pd.read_csv('books.csv')

You see that pandas assumes that the first line of the csv file is a header.

You can assign the dataframe to a variable.

In [None]:
books_df = pd.read_csv('books.csv')

Let's explore the books_df a bit:

In [None]:
type(books_df)

In [None]:
books_df.shape

In [None]:
books_df.dtypes

You can select a column:

In [None]:
books_df['genre']

We count the values in a specific column:

In [None]:
books_df['genre'].value_counts()

Or make a selection of the rows, based on a specific value in a column.

In [None]:
books_df[books_df['genre'] == 'prose']
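To see how such a selection works without the csv file, here is a minimal sketch with a small dataframe that is typed in by hand. The data and the column name 'book' are assumptions for illustration; the real books.csv may look different.

```python
import pandas as pd

# Hypothetical data standing in for books.csv.
example_df = pd.DataFrame({
    'book': ['Genesis', 'Psalms', 'Ruth', 'Job'],
    'genre': ['prose', 'poetry', 'prose', 'poetry'],
})

# The comparison gives one boolean per row.
is_prose = example_df['genre'] == 'prose'
print(is_prose)

# Using the booleans as a selector keeps only the rows that are True.
prose_df = example_df[is_prose]
print(prose_df)
```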

NB. There is A LOT that you can do with the pandas package. Check the [documentation](https://pandas.pydata.org/).

## 13. Accessing Text Fabric from Python

You start Text-Fabric using the incantation:

In [None]:
from tf.app import use
A = use('etcbc/bhsa', hoist=globals())

There are three important objects that you will often use: T, F, and L.

With F, you access the node features. An important node feature is otype: object type.

If you use the function s(), you can loop over the objects of a specified type:

In [None]:
for bo in F.otype.s('book'):
    print(bo)

You see it prints the nodes. With F, you can also get the names of the books using the feature "book", and the function v():

In [None]:
for bo in F.otype.s('book'):
    print(F.book.v(bo))

You can use the other node features as well.

In [None]:
for wo in F.otype.s('word'):
    if F.lex.v(wo) == 'HLK[':
        print(F.vt.v(wo))

With T.sectionFromNode(), you can get the section values in a tuple. 

In [None]:
for wo in F.otype.s('word'):
    if F.lex.v(wo) == 'HLK[':
        print(T.sectionFromNode(wo))
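A tuple is similar to a list, but you cannot change it after it has been created. You can recognize it by its round brackets. The sketch below assumes a section tuple of the form (book, chapter, verse); the values are typed in by hand for illustration and are not retrieved from Text-Fabric.

```python
# Illustrative section tuple; assumed form: (book, chapter, verse).
section = ('Genesis', 1, 1)

# Indexing works in the same way as with a list.
print(section[0])

# The elements can be unpacked into separate variables in one step.
book, chapter, verse = section
print(book, chapter, verse)
```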

With T, you can also retrieve the text of a book:

In [None]:
for bo in F.otype.s('book'):
    if F.book.v(bo) == 'Jona':
        print(T.text(bo))

This also works for clauses, verses and chapters.

With L, you can jump to other node levels. With L.u you go up, e.g., from words to phrases, clauses or books, and with L.d you move downwards. E.g.:

In [None]:
L.u(1, 'clause') # 1 is the first word in the Hebrew Bible

You see that L.u returns a tuple. If you want the node itself, you need to use the index 0.

In [None]:
first_clause_node = L.u(1, 'clause')[0]
first_clause_node

From this first_clause_node, we move downwards to the word level:

In [None]:
L.d(first_clause_node, 'word')

For each of these word nodes we print the consonantal representation (feature g_cons), the lexeme (feature lex), and the grammatical number (feature nu):

In [None]:
words_in_first_clause = L.d(first_clause_node, 'word')
for wo in words_in_first_clause:
    print(F.g_cons.v(wo), F.lex.v(wo), F.nu.v(wo))