# 0: Introduction to Python

<img alt="xkcd Python" align="right" style="width:40%" src="https://imgs.xkcd.com/comics/python.png">

In this course we will be using the Python programming language to help us learn how to automate tasks in geoscience.
Python is a relatively friendly language, but it still has lots of **rules** that you need to follow to make code run.
In the first three notebooks we will introduce some of those rules and start you on your way to 
[zen](https://www.python.org/dev/peps/pep-0020/).

With that said, what follows is a very brief intro to Python for newcomers.  The focus of these notebooks is to introduce you
to the simple data types and logic in Python, and a couple of handy packages.  There are many other great tutorials out there 
for more in-depth ideas, e.g.
- [The Python tutorial](https://docs.python.org/3/tutorial/)
- [LearnPython](https://www.learnpython.org/)
- and many more.

Python itself is a useful language, but one of *the best* things about Python is all the packages written to extend
Python.  These packages are usually distributed via [PyPI](https://pypi.org) and/or [Anaconda](https://conda.io), and are 
easy to install.  This means that you often don't have to write (much of) your own code! Most of the time someone out
there knows better than you how to do something, so you get to use their code and focus on the important things.

Most Python packages have documentation.  If you find yourself stuck, or thinking *I wish I could do this*, it is worth having
a search online for what you want, or what you are stuck on.  With Python, installing other packages can be quite simple using
either [conda](https://conda.io/en/latest/) or [pip](https://pypi.org/project/pip/).

## Python

Python is an interpreted language (rather than a compiled language like Fortran or C). Because of this it is easy to
iterate and see your results. You can interact with your code in a step-by-step way, so it is simple to understand
the logic of your code. However, because of the interpreted nature of the language, Python is rarely the fastest
choice. To combat this, Python can be (and has been) extended by compiled sections of code, meaning that time-critical
sections of code can be sped up.  This has led to quite a few libraries that use *Python as glue* to hold together
faster sections of code written in C, Fortran, or other languages. We will introduce one of these fast packages, *numpy*,
later: *numpy* is at the heart of almost all scientific Python applications.

Python itself is open-source and runs almost anywhere, and is used for a whole range of purposes, from science
to web-pages, data analysis and more: Dropbox was written almost exclusively in 
[Python](https://blogs.dropbox.com/tech/2018/09/how-we-rolled-out-one-of-the-largest-python-3-migrations-ever/).

## Using Python on your computer

For our class we will be using these Jupyter notebooks (see below), but you can interact with Python in
other ways. You can run a Python shell: on macOS or Linux systems, open a terminal and type
`python3` to start a Python 3.x shell (on some older systems plain `python` starts Python 2.7, which
has reached end-of-life, so it is best to start with Python 3). On Windows, open the command line and
type `C:\python3\python.exe` (the exact path depends on where Python is installed); you might have to check your Python version.  To get a more interactive, nicely
coloured shell, I would recommend using the [IPython](https://ipython.org/) shell, which you can install from
either PyPI or Anaconda.

## Jupyter notebooks

This is a Jupyter notebook! [Jupyter notebooks](https://jupyter.org) provide inline interactive Python shells, alongside markdown capability to
allow you to write descriptive comments around the code. In fact, [some scientific papers have been written
in Jupyter notebooks](https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks#reproducible-academic-publications)
which enables people to test their work. They are a great way to *show your work* while explaining what you did
in more extensive prose. We are using them for teaching purposes because they let us play with the code and
explain the ideas behind the code.

## 0.0 Introduction to fundamental programming concepts

Computers are useful, but they need to be told what to do, even in this age of machine learning. Writing programs allows us to
control what computers do, but we need to be clear and logical in how we program them.  The way we will program, using Python,
is very abstracted from what a computer actually does. Python takes what we write and converts this into a set of simple
and exhaustive instructions that the computer actually undertakes. For Python and your computer to understand what you want to
do you need to write clear and logical code.

Computers understand binary (ones and zeros) values only. I don't know about you, but I struggle to think in binary, so one
of the first things we will think about is the abstraction on top of binary that allows us to work in real numbers, words and
more.  The only reason you need to know this at the moment is because there are certain things that you can and can't do with
these data types (e.g. you can't add a word to a number, but Python will let you add words together to make sentences). Later in
your programming career you may find that you have to understand the intricacies of these different data types (the difference between
a 64-bit floating-point number and a 32-bit number has resulted in some annoyingly wrong earthquake catalogues...).
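
As a small illustration of the two points above, here is a minimal sketch (the variable names are made up for this example). It shows that strings can be added together, that adding a string to a number fails, and that the same value stored as a 32-bit float is not identical to Python's own 64-bit float. The `try`/`except` and the `struct` module are not covered in this notebook; they are only here so the example runs from start to finish.

```python
import struct

# Illustrative names only; they are not used elsewhere in this notebook.
word = "rock"
number = 3

# Adding two strings joins them together.
phrase = word + " " + "sample"
print(phrase)

# Adding a string to a number is not allowed: Python raises a TypeError.
# The try/except just lets the example keep running after the error.
try:
    mixed = word + number
except TypeError as error:
    mixed = None
    print(f"Python refused: {error}")

# Round-trip a value through a 32-bit float. Python's own float is 64-bit,
# so the 32-bit copy comes back slightly different.
value = 0.1
value_as_32_bit = struct.unpack("f", struct.pack("f", value))[0]
print(value == value_as_32_bit)
```

The last line prints `False`: the two numbers differ in the later decimal places, which is the kind of small discrepancy that can creep into a catalogue.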

## 0.1 Getting started - hello world!

The first program written in most languages is a simple "Hello World!" program, that just outputs the phrase "Hello World!"
to the screen. In Python this is embarrassingly simple (run the code by clicking the arrow button up the top, or by hitting
*ctrl-enter*):

In [3]:
print("Hello World!")

Hello World!


What we did is call the `print` function with the *argument* `"Hello World!"`. Encapsulating *Hello World!* in
quotes tells Python that we want this to be a *string* type. Strings hold characters; other types hold other
kinds of data.

The `print` function takes whatever we gave it as an argument and prints that to screen (we see the output of our
code in Jupyter notebooks just beneath the *cell* that we ran the code in).

## 0.2 Some data types in Python

We said that `"Hello World!"` was a string (known as `str` in Python). There are a few other data types that you should know about 
(and others that we don't need to worry about yet):

- `int`: For storing integers, like 1, 4, 999, -2000
- `float`: For storing floating-point numbers, like 1.2, -37.473, 42.424242424242 - there is a limit to the precision!
- `list`: For storing lists of other objects, written using square brackets, e.g. `[1, 2, "alfred", 3.2]`. Note 
   that any other data type can be within a list, including another list. The order that you put things into
   a `list` is retained.
- `set`: Another data type for storing other objects, but this time only unique elements are stored, and order
  is not guaranteed. `set`s are written using curly brackets, e.g. `{1, 2, "alfred", 3.2}`. These can be really
  useful for getting a unique set of *things*.
- `dict`: A further data type for storing other objects.  In a `dict` (short for dictionary), *values* are stored
  associated with some *key* in key: value pairs.  This is really useful for keeping track of attributes, for example
  you might have a dictionary to store the attributes associated with an observation like:
  `rock = {"type": "andesite", "age": "10", "comment": "Tastes like fish"}`
  
`print` will take any of these types, convert it to a `str` and print it to screen:

In [4]:
print(1)

1


In [5]:
print(43.4242)

43.4242


In [6]:
print([1, 2, "alfred"])

[1, 2, 'alfred']


In [7]:
# NBVAL_IGNORE_OUTPUT
print({1, 2, "alfred"})

{1, 2, 'alfred'}


In [8]:
print({1, 2, 6, 2, 1})

{1, 2, 6}


Notice that the final print only printed the unique elements of the `set` - only those elements are stored.
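
One common use of this behaviour, sketched here with a made-up list of rock types: converting a `list` to a `set` drops the duplicates. Because a `set` has no guaranteed order, the sketch uses the built-in `sorted` function to get a predictable, alphabetical `list` back for printing.

```python
# Hypothetical observations, with some rock types repeated
rock_types = ["andesite", "basalt", "andesite", "rhyolite", "basalt"]

# Converting the list to a set keeps one copy of each element
unique_rock_types = set(rock_types)

# sorted() returns a new list in alphabetical order
print(sorted(unique_rock_types))
# len() counts the unique elements
print(len(unique_rock_types))
```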

In [51]:
rock = {"type": "andesite", "age": "10", "comment": "Tastes like fish"}
print(rock)

{'type': 'andesite', 'age': '10', 'comment': 'Tastes like fish'}


## 0.3 Variables

Programming languages keep track of values using variables. You (the programmer) assign some name to a value, and the 
computer keeps track of this value in memory.  You can then use this variable later in your code, e.g. you can
do some maths! The symbols for doing mathematical operations are:

- `+` addition
- `-` subtraction
- `*` multiplication
- `/` division
- `**` exponentiation

Note that division will always return a `float` in Python 3.x.
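
A quick sketch of each operator in use, with illustrative values and names that are not used elsewhere in the notebook. Note in particular that the `/` operator gives a `float` even when the numbers divide exactly.

```python
# Illustrative values, not real data
n_samples = 12
n_sites = 4

print(n_samples + n_sites)   # addition
print(n_samples - n_sites)   # subtraction
print(n_samples * n_sites)   # multiplication
print(n_samples / n_sites)   # division: the result is a float
print(n_samples ** 2)        # exponentiation

# type() reports the data type of a value
samples_per_site = n_samples / n_sites
print(type(samples_per_site))
```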

In [10]:
a = 5
b = 12
c = a + b
print(c)

17


You can also change the value of a variable *in-place* using the following symbols:

- `+=` addition in-place
- `-=` subtraction in-place
- `*=` multiplication in-place
- `/=` division in-place
- `**=` exponentiation in-place

In [24]:
d = 5
d += 3
print(d)

8


In [25]:
d /= 3
print(d)

2.6666666666666665


In [26]:
d **= 3
print(d)

18.96296296296296


Note that if you run the above cell again, without re-running from when we first defined d, you will get a different returned value. Watch out
when working in-place on data!

## 0.3.1 Variable naming

You should always give your variables useful names. Python doesn't demand this, but it does allow it. You should write
your code so that you (both now and in ten years' time, as well as other people) can understand it; as 
[Martin Fowler](https://en.wikiquote.org/wiki/Martin_Fowler) wrote:

> Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

Hopefully you will note that variable names `a`, `b`, `c`, ... are crap names for variables! Try to use simple
but expressive names, e.g. if you have a list of different volcano names, don't call that variable `vn`, nor
should you call it `volcano_names_used_in_this_code_with_effusive_eruptions`; just call it `volcano_names`.
*If* you need to get into more detail, maybe you should be using more complex data-types, like `dict`s. We will
play with some of these later.
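
To make this concrete, here is a small sketch of the same data under a poor name and a better one, followed by one possible way of carrying extra detail in `dict`s rather than in the variable name. The `"visited"` key and its values are placeholders made up for this example.

```python
# Poor name: the reader has to guess what "vn" holds.
vn = ["Ruapehu", "Tongariro", "Taranaki"]

# Better name: short, but it says what the list contains.
volcano_names = ["Ruapehu", "Tongariro", "Taranaki"]

# If you need more detail, a list of dicts keeps the variable name short
# and puts the detail in the data instead.
# The "visited" values are placeholders, not real records.
volcanoes = [
    {"name": "Ruapehu", "visited": True},
    {"name": "Tongariro", "visited": False},
]
print(volcanoes[0]["name"])
```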

## 0.4 Comments

Ideally you will write code with useful variable names and obvious logic so that you can read and understand your
programs as if they were prose. When you find that it is a little harder to understand, you should add comments.
Comments are text in your program that are not executed (run), and are just there to help the reader understand
what is going on. While you are learning they can be really handy, and people often start code as a series of
comments that they then convert into code.

Comments in Python start with a `#` character, e.g.:

In [30]:
volume_ejected_per_event = [1000, 1500, 750, 8700]  # Volume of material per event in cubic metres
# sum is a built-in Python function that adds up the items of an iterable
total_volume = sum(volume_ejected_per_event)
print(f"Total volume ejected: {total_volume} m^3")
# len tells you how many items an object contains
n_events = len(volume_ejected_per_event)
average_volume = total_volume / n_events
print(f"Average volume per-event: {average_volume} m^3")

Total volume ejected: 11950 m^3
Average volume per-event: 2987.5 m^3


## 0.5 String manipulation

The above example does some fun things with the strings printed. We can do all sorts with strings. We can add them together:

In [32]:
sentence_start = "Monty Python and the "
sentence_end = "Holy Grail"
sentence = sentence_start + sentence_end
print(sentence)

Monty Python and the Holy Grail


Or we can make a list of them and join them together with another string in between:

In [33]:
sentence = "Un-".join([sentence_start, sentence_end])
print(sentence)

Monty Python and the Un-Holy Grail


We can operate in-place on strings as well:

In [36]:
sentence = sentence_start
sentence += sentence_end
print(sentence)

Monty Python and the Holy Grail


We can make new strings using *format* strings (f-strings), like we did in the example above. Here we start a string with an `f`, then open
quotes to mark the start of a string. Within this, any plain text is interpreted to be a string. Anything within curly
brackets (`{}`) is evaluated as a Python expression (most simply a variable name) and converted to a string.

In [39]:
sentence = f"{sentence_start}{sentence_end} is a rather amusing film"
print(sentence)

Monty Python and the Holy Grail is a rather amusing film


There is lots more that you can do with format strings that we don't need to go into yet, but you can find out more
[in the Python docs](https://docs.python.org/3.6/reference/lexical_analysis.html#f-strings).
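
As a small taste of what else format strings can do, here is a sketch of three features: fixing the number of decimal places, padding a value to a fixed width, and putting a calculation inside the curly brackets. The numbers reuse the average from the comments example; the variable names for the resulting strings are made up here.

```python
average_volume = 2987.5  # same number as the average in the comments example
n_events = 4

# ".2f" after the colon asks for a float shown with two decimal places.
two_decimals = f"Average: {average_volume:.2f}"
print(two_decimals)

# ">12" right-aligns the value in a field 12 characters wide.
aligned = f"[{average_volume:>12}]"
print(aligned)

# Expressions are allowed inside the braces, not only variable names.
total = f"Total over {n_events} events: {average_volume * n_events}"
print(total)
```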

## 0.6 List manipulation

Individual elements of a list can be accessed by their index. In Python, the first element in a list is at
index `0` (hence this notebook being the zeroth notebook).  This may seem silly, but you get used to it.
The reason for this is to do with the byte-offset from the start of the memory-block containing the list, but
you don't need to know that.

Say we have a list of 10 elements, their indexes are as follows:

```python
some_list = [1, 9, 3, 26, 7, 9, 42, 99, 1000, -2]
# indexes:   0  1  2   3  4  5   6   7     8   9
```

You can also specify the index as the position from the end, in this case, the equivalent indexes are:

```python
some_list = [1, 9, 3, 26, 7, 9, 42, 99, 1000, -2]
# indexes: -10 -9 -8  -7 -6 -5  -4  -3    -2  -1
```
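
A short sketch of both indexing styles on the same list. Positive and negative indexes are just two ways of pointing at the same elements.

```python
some_list = [1, 9, 3, 26, 7, 9, 42, 99, 1000, -2]

first = some_list[0]          # first element, counting from the start
fourth = some_list[3]         # index 3 is the fourth element
last = some_list[-1]          # last element, counting from the end
also_first = some_list[-10]   # the same element as some_list[0]

print(first, fourth, last, also_first)

# Asking for an index outside the list, e.g. some_list[10],
# raises an IndexError.
```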

Let's have a go at manipulating some lists based on their indexes. We will use the `"Monty Python and the Holy Grail"` example again.

In [40]:
sentence = "Monty Python and the Holy Grail"

We can split a string based on any character to get a list of strings:

In [41]:
words = sentence.split(" ")
print(words)

['Monty', 'Python', 'and', 'the', 'Holy', 'Grail']


We can rearrange by making new lists from parts of the original list:

In [45]:
sense = [words[-2], words[0]]
print(sense)

['Holy', 'Monty']


We can add elements of a list together (here two strings, which are joined) to get a new thing:

In [46]:
nonsense = words[-1] + words[2]
print(nonsense)

Grailand


We can take chunks of the list by *slicing*:

In [47]:
first_three_words = words[0:3]
print(first_three_words)

['Monty', 'Python', 'and']


We can append values to a list:

In [48]:
first_three_words.append("the")
print(first_three_words)

['Monty', 'Python', 'and', 'the']


And we can add lists together (both in-place and not):

In [49]:
first_three_words += ["unruly", "albatross"]
print(first_three_words)

['Monty', 'Python', 'and', 'the', 'unruly', 'albatross']


## Exercise:

Take the following sentence and reverse the order of the words.  This will be easier with loops once we introduce them
in the next notebook...

In [50]:
sentence = "Once upon a time in the West"

In [None]:
# Your answer here:

## 0.7 Dictionary lookups

Finding attributes within a dictionary, if you know the key, is very fast.  Accessing attributes by key is similar to
indexing lists, but instead of providing an index in the square brackets, you provide the key:

In [52]:
rock = {"type": "andesite", "age": "10", "comment": "Tastes like fish"}

print(rock["type"])

andesite


In [53]:
print(rock["comment"])

Tastes like fish


We will use dictionaries a bit in this tutorial because they can be really helpful for keeping track of your data;
you will see *dictionary-style* lookups in the [pandas](5-Pandas-introduction.ipynb) notebook especially.
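
Beyond looking values up, a few other dictionary operations come in handy. The sketch below uses the same `rock` dictionary to add a new key, overwrite an existing value, check whether a key exists with `in`, and use `.get` to supply a fallback when a key is missing. The `"location"` key and the new values are made up for this example.

```python
rock = {"type": "andesite", "age": "10", "comment": "Tastes like fish"}

# Assigning to a new key adds a key: value pair (value made up here).
rock["location"] = "outcrop A"

# Assigning to an existing key overwrites the old value.
rock["age"] = "12"

# "in" checks whether a key is present.
has_type = "type" in rock
print(has_type)

# .get returns a fallback instead of raising a KeyError for a missing key.
colour = rock.get("colour", "unknown")
print(colour)

# The keys come back in the order they were added.
print(list(rock.keys()))
```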

So now you know that Python is a thing, let's see if we can make use of it.  Like any good scientist we
should take a logical approach and start by introducing [logic in python](1-Python-logic.ipynb).