# ADACS - Introduction to computing for astronomers

This lesson material was developed for ADACS face-to-face training. However, the material is fairly comprehensive, so you can also work through the notebooks on your own. Because the face-to-face training is a live-coding workshop, a solutions notebook is supplied to help anyone working through the notebooks at their own pace.

This notebook was put together by:
- Rebecca Lange | Curtin Institute for Computation
- Paul Hancock | Curtin Institute for Radio Astronomy Research

Inspiration and material for this notebook was taken from:
- Data Carpentry
- Towards Data Science blog
- ...

## Introduction to Jupyter notebooks 


Jupyter Notebook Cheat Sheet
![Jupyter Notebook Cheat Sheet -- courtesy of DataCamp.com](https://cdn-images-1.medium.com/max/800/1*_nFAOrPMxYwE7cBt-ryqZA.png)


## Introduction to Python, Jupyter notebooks and coding best practices

Python is a high-level, interpreted programming language. This means the code is easy for humans to read, there is no need to compile it, and in many cases we do not have to think much about the underlying system, e.g. memory usage.

As a consequence, we can use it in two ways:
- Using the interpreter as an "advanced calculator" in interactive mode:
- Executing programs/scripts saved as a text file, usually with a `.py` extension:

In [None]:
2+2

In [None]:
print("Hello")

In [None]:
%run my_script.py

# Types of Data

How information is stored in a DataFrame or a Python object affects what we can do with it and the outputs of calculations. There are two main types of data that we will explore in this lesson: numeric and character types.


## Numeric Data Types

Numeric data types include integers and floats. A **floating point** number
(known as a float) has a decimal point even if the decimal part is 0. For
example: 1.13, 2.0, 1234.345. If we have a column that contains both integers and
floating point numbers, Pandas will assign the entire column to the float data
type so the decimal points are not lost. In a vector or data frame (we learn about these different types later) the entire object or an entire column will be of the same type.

An **integer** never has a decimal point. Converting 1.13 to an integer gives 1,
and 1234.345 becomes 1234. You will often see the data type `int64`, for example
as the type of a Pandas column, which stands for 64-bit integer. The 64 refers to
the number of bits of memory allocated to each value, which sets the range of
numbers that can be stored in each "cell". Allocating space ahead of time allows
computers to optimize storage and processing efficiency.
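As a minimal sketch of the difference between the two numeric types, the cell below uses only Python built-ins. The variable names are illustrative and not part of the lesson data. Note that Python's built-in `int` is not limited to 64 bits; the fixed-width `int64` type belongs to NumPy and Pandas.

```python
# Illustrative values; these names are assumptions made for this example.
whole = 1234
decimal = 1234.345

print(type(whole))    # <class 'int'>
print(type(decimal))  # <class 'float'>

# Converting a float to an integer drops everything after the decimal point.
truncated = int(decimal)
print(truncated)      # 1234

# Mixing an integer and a float in a calculation gives a float.
mixed = whole + 2.0
print(type(mixed))    # <class 'float'>
```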



## Character Data Types

Strings are values made up of characters, which can include letters, digits and punctuation. 
For example, a string might be a word, a sentence, or several sentences. 
A string can also consist entirely of digits. For instance, '1234' could be stored as a
string, as could '10.23'. However, **strings that contain numbers are not treated as
numbers in mathematical operations**; they must first be converted to a numeric type!
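The short sketch below shows what happens when a number is stored as a string, and how converting it restores normal arithmetic. The variable names are illustrative assumptions.

```python
# Illustrative value; the name is an assumption made for this example.
number_as_text = "1234"

# With strings, "+" joins the pieces end to end instead of adding values.
joined = number_as_text + "1"
print(joined)   # 12341

# Convert to a numeric type first to do arithmetic.
total = int(number_as_text) + 1
print(total)    # 1235
```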





In [None]:
text = "Data Carpentry"
number = 42
pi_value = 3.1415

Here we've assigned data to variables, namely `text`, `number` and `pi_value`,
using the assignment operator `=`. The variable called `text` is a string which
means it can contain letters and numbers. We could reassign the variable `text`
to an integer too - but be careful reassigning variables as this can get 
confusing.

To print out the value stored in a variable we can simply type the name of the
variable into the interpreter:

In [None]:
text

However, in scripts we must use the `print` function:

In [None]:
# Comments start with #
# Next line will print out text
print(text)

In [None]:
# We also need the print function if we want to see more than one variable;
# without it, only the last line of the cell (number) is displayed
text
number

In [None]:
print(text, number, pi_value)

### Operators

We can perform mathematical calculations in Python using the basic operators
 `+, -, /, *, **, %`:

In [None]:
6*7
2**16
13 % 5

**In Python 2, if we divide one integer by another, we get an integer!**
In Python 3 the result is different: we get a float.
If you are working in Python 2, remember to convert your integers to floats when you want floating point precision for divisions!

In [None]:
10/3

In [None]:
# convert to integer
a = 6.6
int(a)

In [None]:
# convert to float
b=5
float(b)

In [None]:
10/float(3)
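To make the division behaviour described above concrete, here is a small sketch assuming Python 3. It also uses the floor-division operator `//`, which is not covered in the text above but is the standard way to ask for a whole-number result. The variable names are illustrative.

```python
# Assumes Python 3; the names are illustrative.
quotient = 10 / 3      # true division always returns a float
whole_part = 10 // 3   # floor division rounds down to a whole number
remainder = 10 % 3     # modulo gives what is left over

print(quotient)    # 3.3333333333333335
print(whole_part)  # 3
print(remainder)   # 1
```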

We can also use comparison and logical operators:
`<, >, ==, !=, <=, >=` for comparisons and
`and, or, not` for combining conditions. The data type returned by a
comparison is called a _boolean_.

In [None]:
3>4
True and False
True or False
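As a brief sketch, the result of a comparison can be stored in a variable and combined with other conditions. The names `value` and `in_range` are illustrative assumptions.

```python
is_greater = 3 > 4
print(is_greater)        # False
print(type(is_greater))  # <class 'bool'>

# Comparisons can be combined with and, or, not.
value = 5  # illustrative value
in_range = value > 0 and value < 10
print(in_range)          # True
print(not in_range)      # False
```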

## Sequential types: Lists and Tuples

### Lists

**Lists** are a common data structure to hold an ordered sequence of
elements. Each element can be accessed by an index.  Note that Python
indexes start with 0 instead of 1:

In [None]:
numbers = [1,2,3]
numbers[0]

To add elements to the end of a list, we can use the `append` method:

In [None]:
numbers.append(4)
print(numbers)

**Methods** are a way to interact with an object (a list, for example). We can invoke 
a method using the dot `.` followed by the method name and a list of arguments in parentheses. 
To find out what methods are available for an object, we can use the built-in `help` function:

In [None]:
help(numbers)

We can also access a list of methods using `dir`. Some method names are
surrounded by double underscores. Those methods are called "special", and
usually we access them in a different way. For example, the `__add__` method is
responsible for the `+` operator.

In [None]:
dir(numbers)
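The sketch below illustrates the link between the `+` operator and `__add__` for lists, and one possible way to hide the special methods when browsing the output of `dir`. The variable names are illustrative assumptions.

```python
# Illustrative lists; the names are assumptions made for this example.
first = [1, 2]
second = [3]

# The + operator and the special method give the same result.
with_operator = first + second
with_method = first.__add__(second)
print(with_operator)                 # [1, 2, 3]
print(with_operator == with_method)  # True

# Keep only the names that are not surrounded by double underscores.
public_methods = [name for name in dir(first) if not name.startswith("__")]
print(public_methods)
```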

### Tuples

A tuple is similar to a list in that it's an ordered sequence of elements. However,
tuples cannot be changed once created (they are "immutable"). Tuples are
created by placing comma-separated values inside parentheses `()`.

In [None]:
a_tuple = (1,2,3)
another_tuple = ("blue", "green", "red")

### Challenge
1. What happens when you type `a_tuple[2]=5` vs `a_list[1]=5`?
2. Type `type(a_tuple)` into Python - what is the object type?


In [None]:
a_list=[1,2,3]

In [None]:
a_list[1]=5

In [None]:
a_list
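For the tuple half of the challenge, a minimal sketch is given below. It wraps the assignment in `try`/`except` so the cell keeps running after the error; the flag `assignment_failed` is an illustrative name.

```python
a_tuple = (1, 2, 3)

assignment_failed = False
try:
    a_tuple[2] = 5  # tuples are immutable, so this raises a TypeError
except TypeError as error:
    assignment_failed = True
    print("TypeError:", error)

print(a_tuple)        # (1, 2, 3)
print(type(a_tuple))  # <class 'tuple'>
```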

## Dictionaries

A **dictionary** is a container that holds pairs of objects - keys and values.

Dictionaries work a lot like lists - except that you index them with *keys*. 
You can think of a key as a name or unique identifier for a value
in the dictionary. Keys can only have particular types - they have to be 
"hashable". Strings and numeric types are acceptable, but lists aren't.

In [None]:
translation = {"one":1, "two":2}

In [None]:
translation["one"]

In [None]:
rev = {1:"one", 2:["two", "birds"]}
rev

In [None]:
# This raises a TypeError because lists are not hashable
bad = {[1,2,3]:3}
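A small sketch of the hashable-key rule: the list key fails, while a tuple with the same contents works because tuples of hashable items are themselves hashable. The names `list_key_failed` and `good` are illustrative assumptions.

```python
list_key_failed = False
try:
    bad = {[1, 2, 3]: 3}  # a list cannot be used as a key
except TypeError as error:
    list_key_failed = True
    print("TypeError:", error)

# A tuple with the same contents is hashable and works as a key.
good = {(1, 2, 3): 3}
print(good[(1, 2, 3)])    # 3
```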

To add an item to the dictionary we assign a value to a new key:

In [None]:
rev[3]="three"

In [None]:
rev

### Challenge

Can you do reassignment in a dictionary? Give it a try. 

1. First check what `rev` is right now (remember `rev` is the name of our dictionary). 
    
2. Try to reassign the second value (in the *key value pair*) so that it no longer reads "two" but instead reads "apple-sauce". 

3. Now display `rev` again to see if it has changed. 

It is important to note that since Python 3.7 dictionaries remember the order in
which key:value pairs were added, and loops over a dictionary return items in
that insertion order. In older versions of Python dictionaries were "unordered",
so the order in which items were returned could appear random. Either way, items
are looked up by key, not by position.
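As a quick check of iteration order, the sketch below assumes Python 3.7 or later, where the language guarantees that dictionaries keep insertion order. The dictionary `order_demo` is an illustrative name.

```python
# Assumes Python 3.7 or later; order_demo is an illustrative name.
order_demo = {}
order_demo[3] = "three"
order_demo[1] = "one"
order_demo[2] = "two"

# Keys come back in the order they were added, not in sorted order.
key_order = list(order_demo)
print(key_order)  # [3, 1, 2]

# Reassigning a value does not move the key.
order_demo[1] = "uno"
print(list(order_demo.items()))  # [(3, 'three'), (1, 'uno'), (2, 'two')]
```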

In [None]:
rev

In [None]:
rev[2]="apple-sauce"

In [None]:
rev

In [None]:
# If you need to change your working directory
import os

In [None]:
os.chdir("../")  # make sure you enter the correct file path

In [None]:
os.listdir("./")
os.chdir("data/")
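When changing directories it can help to check where you are first. The sketch below is one possible way to do this with the standard `os` module; it only moves into `data` if that folder exists in the current location, and the variable names are illustrative assumptions.

```python
import os

# Show the current working directory before changing anything.
start_dir = os.getcwd()
print(start_dir)

# List what is in the current directory.
entries = os.listdir(".")
print(entries)

# Only change directory if the target folder exists here.
target = "data"
if os.path.isdir(target):
    os.chdir(target)
print(os.getcwd())
```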