<img src="../NAWI_Graz_Logo.png" align="right" width=150>

# Notebook 1: Basics of programming (1)

*Developed by Johannes Haas and Raoul Collenteur*

*Parts based on [Exploratory computing with Python by Mark Bakker](http://mbakker7.github.io/exploratory_computing_with_python/)*



## This lecture will introduce some basic principles

- jupyter notebooks
- importing packages
- Variables
- data types
- basic operators
- plotting
- reading errors

### jupyter notebooks

If you can read this, you have followed the instructions on how to open a jupyter notebook.

Now double click (or click + enter) into this textbox. Can you figure out how to add another textbox below, for your comments on this lecture?



#### What is a jupyter notebook?

https://jupyter.org/:

*"The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text."*

- formerly known as ipython notebooks (see filename)
- used to be exclusively for python, now also available for many more languages
- JSON document

Potential revolution for scientific publishing, or just quite useful for teaching?



# Packages

You can do everything in pure Python, but many things take a lot of work. For example, try to calculate the square root of 2. The easiest way, which you may remember from school, is to guess and check:
$1 * 1 = 1; 2 * 2 = 4$
OK, it's neither of those two. Maybe $1.5 * 1.5$? Nope, that's 2.25. So it must be smaller than 1.5. Maybe 1.3? Nope...


In [1]:
1*1

1

In [None]:
2*2
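
The guess-and-check procedure above can be automated in pure Python. The following is a minimal sketch using bisection (repeatedly halving the interval that must contain the root). The function name `guess_sqrt` and the tolerance are assumptions made for this illustration, and the sketch only handles numbers of 1 or larger.

```python
def guess_sqrt(target, tolerance=1e-6):
    """Approximate the square root of target (>= 1) by repeated halving."""
    low, high = 1.0, float(target)  # the root of a number >= 1 lies in between
    while high - low > tolerance:
        guess = (low + high) / 2
        if guess * guess > target:
            high = guess  # guess too large, e.g. 1.5 * 1.5 = 2.25 > 2
        else:
            low = guess   # guess too small (or exact)
    return (low + high) / 2


approx = guess_sqrt(2)
print(approx)  # close to 1.41421
```

It works, but it is a lot of typing for something a calculator does with one key press.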

What you probably want is something like your calculator, where you simply press $\sqrt 2$ and are presented with 1.414213...
Luckily, such a thing already exists: `np.sqrt(2)`.

In [3]:
np.sqrt(2)

NameError: name 'np' is not defined

What happened here?
We are asking Python to run a function it does not know!
We need to import it first.

In [24]:
import numpy as np


You could also give it another name, e.g. `import numpy as calc` or just import that specific function, e.g. `from numpy import sqrt as wurzel`, but unless you know what you are doing and have a good reason for that, you shouldn't!
Most well-known packages have a standard way of being imported, e.g. `import pandas as pd`.
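
To see that an alias is nothing more than a second name, here is a small sketch. It uses the standard-library `math` module as a stand-in for NumPy (an assumption, so that the example runs without any extra packages), and the alias `m` is chosen only for this illustration.

```python
import math                        # plain import: use as math.sqrt
import math as m                   # alias: the same module under a shorter name
from math import sqrt as wurzel    # import one function under another name

# All three names point to the very same function object
print(math.sqrt is m.sqrt)     # True
print(m.sqrt is wurzel)        # True
print(wurzel(16))              # 4.0
```

Python is happy with any of these names; the standard aliases exist so that other people can read your code.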

## What's a package?

In the simplest terms, a package is a collection of functions.
These can be written in Python or in another language.
Generally, they tend to be packed together under a common theme or use case.
For example, [NumPy](https://docs.scipy.org/doc/numpy/about.html) is a package with many functions for numerical operations (implemented in C), and it is now the de facto standard for much of scientific computation.
NumPy is also part of the wider [SciPy](https://scipy.org/about.html) ecosystem; the SciPy library builds on it and adds more functions for scientific computation.
[Pandas](http://pandas.pydata.org/) is another widely used package which makes working with data frames much easier.
And [Matplotlib](http://matplotlib.org/) is generally used to plot the output of all of the above.

And then there are about 172 536 more packages available on https://pypi.org/ for just about any use case you can think of.
So most of the issues you might face are already solved.
But: **Think before you install!**

## Installing packages

Anaconda (what we are using here) comes with the most common packages installed already.
For things it does not yet have, there's a built-in package manager.
You can search or install with `conda search PACKAGENAME` or `conda install PACKAGENAME`.
If it's not on conda, pip will probably have it, with the same install syntax (`pip install PACKAGENAME`).
And if it's not on pip, the project will have instructions on how to install it from source.
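
Before installing anything, it can help to check whether a package is already available. The following is a minimal sketch using `importlib.util.find_spec` from the standard library; the helper name `is_installed` and the made-up package name are assumptions for illustration. Note that the name used for importing is not always the name used for installing.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found for import."""
    return importlib.util.find_spec(package_name) is not None


print(is_installed("math"))  # True, part of the standard library
print(is_installed("surely_not_a_real_package_xyz"))  # False
```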

## Importing packages



In [25]:
# See if you can import pandas and matplotlib in the standard way
import matplotlib.pyplot as plt 
import pandas as pd 

Python will throw an error if you mistype something (e.g. `imp0rt`) or try to import a module that does not exist, but it will not care if you use a weird name!

If you did the imports correctly, running the two cells below should give you some results:

In [70]:
plt.plot([0,1,2,3],[4,5,6,4])
plt.show()


In [10]:
dates = pd.date_range('20190101', periods=12)
myfirstdataframe = pd.DataFrame(np.random.randn(12, 4), index=dates, columns=list('ABCD'))
print(myfirstdataframe)

                   A         B         C         D
2019-01-01  1.404668 -0.932166  1.386888 -0.453975
2019-01-02 -0.999996  0.626157  2.102936 -1.121349
2019-01-03 -0.582509 -0.674263  0.044244  1.220669
2019-01-04  1.423813 -0.603711 -2.397415  0.988546
2019-01-05  0.413796  0.237925 -0.316548 -0.133671
2019-01-06 -1.820207 -1.580433  0.963832 -2.249333
2019-01-07 -0.439711  0.211776  0.842453  0.644036
2019-01-08  0.031406 -0.234306 -0.040396 -0.173052
2019-01-09 -0.541796  2.539715 -0.598174  1.399919
2019-01-10 -1.778739 -1.968846  0.200538  0.147897
2019-01-11 -0.736256 -0.869626  1.329750 -0.741845
2019-01-12 -0.218112  0.110309 -1.280337  0.619159


# Variables and datatypes



In [26]:
a = 1
b = 2.1
c = 'Hallo'
d = [1,2]
e = {'var1':0.111, 'var2':0.222, 'var3':0.333}


What can we do with that? What are these?

In [19]:
#play around with some standard math operators such as + and - and calculate the sum of all entries of e
e['var1']+e['var2']

0.333
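
One possible way to explore these variables is sketched below: `type()` tells you what each name refers to, and the built-in `sum()` adds up all entries of the dict at once. This is only one of several ways to do it.

```python
a = 1
b = 2.1
c = 'Hallo'
d = [1, 2]
e = {'var1': 0.111, 'var2': 0.222, 'var3': 0.333}

# type() reports what kind of object each name refers to
for value in (a, b, c, d, e):
    print(type(value).__name__, value)

# The same operator can mean different things for different types
print(a + b)   # numbers are added
print(c * 2)   # strings are repeated: HalloHallo
print(d + d)   # lists are joined: [1, 2, 1, 2]

# A dict keeps key/value pairs; .values() gives just the numbers
total = sum(e.values())
print(round(total, 3))  # 0.666
```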

## Most common types:

- integer
- float
- string
- list
- dict
- dataframe
- ndarray

What do these types mean?

The type defines what you can do with a variable, i.e. what you can store in it and which operations you can apply to it.
Floats and integers can easily be converted into each other, and you can just as easily turn a number into a string with `str()`:

In [40]:
f = 0.5  # example float; matches the value printed below
a_f = float(a)
f_i = int(f)

In [42]:
print(a_f, a)
print(f, f_i)

1.0 1
0.5 0


What's the problem with that?

In [22]:
# show round()

1


In [25]:
fl_test = 1.7  # assumed example value; its original definition is not shown
r_test = round(fl_test)
print(r_test)

2
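
The difference between `int()` and `round()` can be summarised in a short sketch; the test values below are chosen only for illustration.

```python
values = [0.5, 1.5, 2.5, 2.9, -2.9]

for x in values:
    # int() cuts off the decimals (towards zero); round() goes to the nearest
    # integer and resolves exact .5 ties towards the even neighbour
    print(x, int(x), round(x))

truncated = [int(x) for x in values]  # [0, 1, 2, 2, -2]
rounded = [round(x) for x in values]  # [0, 2, 2, 3, -3]
```

So `int()` silently throws information away, and `round()` in Python 3 may surprise you for values that end in exactly .5.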


# Calculations and printing of results

So far, you did some very easy operations and got the result immediately from running the cell.
Suppose you are not using a Jupyter notebook or IPython, but you want to write `mymegacalculator.py` and run it in a terminal/cmd.
Saving the lines `import numpy as np` and `np.sqrt(2)` as a `.py` file and running it will not result in any output.

How do we get our results?

In [29]:
print?


*Remember, `print` in Python 2 works a bit differently than in Python 3!*

Write a line that prints `The square root of 2 is 1.414`

In [30]:
print('the square root of 2 is', np.sqrt(2))
#todo: limit the decimals

the square root of 2 is 1.4142135623730951
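
Two common ways to limit the decimals are sketched below. To keep the example free of extra packages, `math.sqrt` from the standard library is used here as a stand-in for `np.sqrt` (an assumption; both give the same value for 2).

```python
import math

root = math.sqrt(2)  # stand-in for np.sqrt(2)

# Option 1: round the number before printing
print('The square root of 2 is', round(root, 3))

# Option 2: format the number inside an f-string (Python 3.6+)
message = f'The square root of 2 is {root:.3f}'
print(message)
```

Option 1 changes the number itself, while option 2 only changes how it is displayed.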


You can also use print for longer "sentences" that span several lines by putting the newline character `\n` wherever the line should break.
Write a statement that prints

    Earlier we have set a = 1,
    b as 2.1 and c  as Hallo.

In [37]:
print('Earlier we have set a =', a,'\n','and b as', b, 'and c as', c, '.')

Earlier we have set a = 1 
 and b as 2.1 and c as Hallo .
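
The output above has a few stray spaces, because `print()` puts a space between all of its arguments. One possible way around this, sketched here as an alternative and not as the only correct solution, is to build the whole text as a single f-string first.

```python
a = 1
b = 2.1
c = 'Hallo'

# An f-string gives full control over spaces and line breaks
text = f'Earlier we have set a = {a},\nb as {b} and c as {c}.'
print(text)
```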


## User input

Python can not only output results, it can also receive user input. If we really want to make our `mymegacalculator.py` app usable, it would be much better if it asked us for the number we want to work with, instead of only being able to output the result of `np.sqrt(2)`.

Similar to `print()`, we can use `input()` to ask for user input.

In [44]:
intest = input('write something!')

write something!4


Now use this to write a function that asks the user for a number, and prints out the square root of that number.
However, there is a pitfall with `input()`.
Play around with your test and what we discussed above for types to figure it out.

In [47]:
datain = float(input('what number do you want to root?'))
dataout = np.sqrt(datain)
print('the square root of', datain, 'is', dataout)

what number do you want to root?4
the square root of 4.0 is 2.0


As with most things, there is no single correct way. Your little script is likely different from my example here. With something this short, it does not really matter, but in bigger projects there are of course many *wrong* ways that will add up.
So if you are going to write something bigger, please remember the intro from the first lecture!

In [46]:
type(intest)

str
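
Since `input()` always returns a string, the conversion has to happen before any calculation. The sketch below wraps the example in actual functions; the names `root_from_text` and `ask_and_root` are hypothetical, and `math.sqrt` stands in for `np.sqrt` so that the example needs no extra packages.

```python
import math


def root_from_text(text):
    """Convert user text to a float and return its square root."""
    number = float(text)  # input() always returns a str, so convert first
    return math.sqrt(number)


def ask_and_root():
    """Ask the user for a number and print its square root."""
    answer = input('What number do you want to root? ')
    print('The square root of', answer, 'is', root_from_text(answer))


print(root_from_text('4'))  # 2.0
# ask_and_root()  # uncomment to try it interactively
```

Keeping the conversion in its own small function makes it easy to try out without typing input every time.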

# Reading errors

Unless you always write perfect code on the first try, you will get errors thrown at you.
Luckily, python tries to be helpful with them.
Let's analyse a few errors we have already received in our short Python career:

    np.sqrt(2)
    
    ---------------------------------------------------------------------------
    NameError                                 Traceback (most recent call last)
    <ipython-input-3-bbf78ff053fc> in <module>()
    ----> 1 np.sqrt(2)
    
    NameError: name 'np' is not defined



In [48]:
import numpi as np

ModuleNotFoundError: No module named 'numpi'

In [50]:
b = 2.1
c = 'Hallo'
d = c/b

TypeError: unsupported operand type(s) for /: 'str' and 'float'

In [51]:
d = float(c)

ValueError: could not convert string to float: 'Hallo'

In [54]:
np.sqrt(c)

TypeError: ufunc 'sqrt' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

In [58]:
j = 2
k = 3
print 'Hello'

SyntaxError: Missing parentheses in call to 'print'. Did you mean print('Hello')? (<ipython-input-58-fbd027aad335>, line 3)

There are two types of errors, exceptions and syntax errors, and they are always structured quite similarly.
You get the type, a traceback and a verbose description of the error.
With larger programs, this can easily seem quite overwhelming, but once you understand the structure, it becomes quite readable.
For the full, official explanation, please look [here](https://docs.python.org/3/tutorial/errors.html) and [here](https://docs.python.org/3/library/exceptions.html#bltin-exceptions).
In short, an exception "works" like this:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)

This part marks the beginning of the error message, gives you a first hint about the type of error, and tells you that a traceback follows.
The [TypeError](https://docs.python.org/3/library/exceptions.html#TypeError) is *\[an\] exception \[that\] may be raised by user code to indicate that an attempted operation on an object is not supported, and is not meant to be.*
So apparently, we tried to do something we are not supposed to do.
The traceback that follows tells us what exactly we did wrong:

    <ipython-input-50-90b9c40f605f> in <module>
           1 b = 2.1
           2 c = 'Hallo'
     ----> 3 d = c/b
     
For longer code, it shows the lines preceding the error, with line numbers in front and an arrow pointing to the offending line.
For this small test case, this seems very obvious, but with larger programs, it is very helpful to know what line you need to look at.
After this small code snippet, we get some additional information about the error:

    TypeError: unsupported operand type(s) for /: 'str' and 'float'
    
Again, it tells us that we are suffering from a type error (error messages can be quite long, and it almost always is the bottom that's the most important, so it makes sense to repeat it!) and gives us some more information about it.
Our TypeError in this case is quite easy. We tried to divide a string by a float, which obviously isn't going to work.
So now we have to figure out what to do with this error. 
Was it a small typo and we fat-fingered `c` instead of `x`, which we might have assigned to `x = 32141` at some point?
Or do we want to split `Hallo` in two parts, so that we would get `d = ['Hal','lo']`?

For the first, it's a really easy fix.
For the second, we will figure out how to do that in the **Working with text** section.
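
The two parts of the last line, the error type and its description, can also be looked at from within Python. The sketch below uses `try`/`except`, which has not been introduced in this notebook, purely to capture the error so that it can be printed piece by piece.

```python
b = 2.1
c = 'Hallo'

try:
    d = c / b
except TypeError as err:
    error_type = type(err).__name__   # the name on the left of the last line
    error_text = str(err)             # the description on the right
    print(error_type + ':', error_text)
```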

A syntax error works a bit differently:

      File "<ipython-input-58-fbd027aad335>", line 3
        print 'Hello'
                    ^
    SyntaxError: Missing parentheses in call to 'print'. Did you mean print('Hello')?

But again, we get all the information we need to fix it.
We get the line number of the error, and an arrow pointing at the offender.
However, you have to know that the arrow points at the place where Python could no longer make sense of the code, so the actual mistake is at or before that position.
In this example, `'Hello'` is fine; the problem lies in `print`.
Further information and the type of error are given in the last line of the error message, which in this special case (remember Python 2 vs 3) is completely self-explanatory.
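
The same pieces of information (line number, offending line, message) are stored on the error object itself. As a hedged sketch, the built-in `compile()` is used here to check the three lines from the example without running them; the file name `'<example>'` is a placeholder, and the exact wording of the message can differ between Python versions.

```python
source = "j = 2\nk = 3\nprint 'Hello'\n"

try:
    compile(source, '<example>', 'exec')  # only parses, runs nothing
except SyntaxError as err:
    line_number = err.lineno              # 3, the line Python complains about
    bad_line = (err.text or '').strip()   # the offending source line, if known
    message = err.msg                     # the description from the last line
    print('line', line_number, ':', bad_line)
    print(message)
```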


## Exercise

Fix the errors in the little script below.
You should be able to solve all the issues with what you have learned so far.
Remember, you can add a new code box below the following one to try out parts, or to get the `type()` info for some variables.

In [59]:
print'This is a little error exercise', \n 'Please correct the errors!')
err_count = input('How many errors did you have to fix?')
peop_count = input('How many people are in this course?')
err_total = err_count * peop_count
err_price = 1,2
err_sum = err_total * err_price
prinf('There's been a total of', err_total, 'errors fixed,')
print('if each error was worth', err_price, 'Euros, that would make', err_sum, 'Euros')

SyntaxError: invalid syntax (<ipython-input-59-c590a2e110ae>, line 1)

In [69]:
print('This is a little error exercise', '\n', 'Please correct the errors')
err_count = float(input('How many errors did you have to fix?'))
peop_count = float(input('How many people are in this course?'))
err_total = err_count * peop_count
err_price = 1.2
err_sum = err_total * err_price
print('there\'s been a total of', err_total, 'errors fixed')
print('if each error was worth', err_price, 'Euros, that would make', err_sum, 'Euros')

This is a little error exercise 
 Please correct the errors
How many errors did you have to fix?6
How many people are in this course?10
there's been a total of 60.0 errors fixed
if each error was worth 1.2 Euros, that would make 72.0 Euros


In [67]:
err_price

(1, 2)
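
The output `(1, 2)` shows why `err_price = 1,2` is a problem: Python reads the comma as "make a tuple", not as a decimal separator. A short sketch of the difference:

```python
err_price = 1,2        # a comma builds a tuple, not a decimal number
print(type(err_price).__name__, err_price)   # tuple (1, 2)

err_price = 1.2        # Python uses a point as the decimal separator
print(type(err_price).__name__, err_price)   # float 1.2

# Multiplying a tuple by an int repeats it instead of doing arithmetic
repeated = (1, 2) * 3
print(repeated)        # (1, 2, 1, 2, 1, 2)
```

This kind of mistake raises no error at the line where it is made, which is why checking `type()` is so useful.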