# CIDS Carpentries Workshop - Day 1 - Part 1

This lesson is adapted from the Data Carpentry [Data Analysis and Visualization in Python for Ecologists](https://datacarpentry.org/python-ecology-lesson/index.html) lesson.

---

## How to use a Jupyter Notebook
Online Resources:
- https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/index.html
- https://code.visualstudio.com/docs/datascience/jupyter-notebooks 

Useful Tips:
- To save the notebook/file, press <kbd>Ctrl</kbd> + <kbd>s</kbd> or go to `File -> Save`.
- You run a cell with <kbd>Shift</kbd> + <kbd>Enter</kbd> or
    - **Jupyter Notebook, JupyterLab**: you can use the run button ▶ in the tool bar.
    - **VScode**: you can use the run button ▶ in front of the cell.
- If you run a cell with <kbd>Option (Alt)</kbd> + <kbd>Enter</kbd> it will also create a new cell below.
- If you opened this as a classic notebook, you can check *Help > Keyboard Shortcuts*; otherwise see the *Cheatsheet* for more info.
- If you are using VScode, see [Jupyter Notebooks in VS Code](https://code.visualstudio.com/docs/datascience/jupyter-notebooks) for more info.
- The notebook has different types of cells (Code and Markdown are the most commonly used):
    - **Code** cells expect code for the Kernel you have chosen. Syntax highlighting is available, and comments are marked with `#`: anything after `#` on a line will not be executed.
    - **Markdown** cells allow you to write report-style text, using Markdown to format it (e.g. headers, bold face etc.).
---


## ❓Questions and Objectives for this Notebook
What should you be able to answer by the end of this notebook?
### Questions

- How do I program in Python?
- How can I represent my data in Python?

### Objectives

- Describe the advantages of using programming vs. completing repetitive tasks by hand.
- Define the following data types in Python: strings, integers, and floats.
- Perform mathematical operations in Python using basic operators.
- Define the following as it relates to Python: lists, tuples, and dictionaries.
---

## Short Introduction to Programming in Python

### Interpreter
Python is a high-level, interpreted programming language. This means the code is easy for humans to read, there is no need for us to compile it, and in many cases we do not have to think much about the underlying system (e.g. memory usage).

As a consequence, we can use Python in two ways:
- Using the interpreter as an "advanced calculator" in interactive mode:

In [1]:
# Calculations
2+2

4

In [2]:
# Printing text to screen
print('Welcome to Curtin Institute for Data Science - Python workshop')

Welcome to Curtin Institute for Data Science - Python workshop


- Executing programs/scripts saved as a text file, usually with a `.py` extension:

In [3]:
# Running scripts (using Jupyter Notebook magics)
%run my_script.py

Hello World

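The contents of `my_script.py` are not shown in this lesson. As a rough sketch, a script that produces the output above could be as small as the following. The variable name `message` is an assumption made for illustration.

```python
# Hypothetical contents of my_script.py (an assumption: the real file is not shown here)
message = "Hello World"

# A script runs from top to bottom; this line writes the text to the screen
print(message)
```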

---

### Python Built-in Data Types
#### Strings, Integers and Floats

One of the most basic things we can do in Python is to assign values to variables. Every object in Python has a type, which affects what we can do with it and the results of calculations. There are three main types of data we'll explore in this lesson: strings, integers and floats.

Strings are values that contain numbers and/or characters. For example, a string might be a word, a sentence, or several sentences. A string can also contain or consist entirely of digits: for instance, '1234' could be stored as a string, as could '10.23'. However, **strings that contain numbers cannot be used for mathematical operations** until they are converted to a numeric type.

Integers are whole numbers, with no decimal point. Converting a number with a fractional part to an integer drops that part: 1.13 becomes 1, and 1234.345 becomes 1234.

Floats, or floating point numbers, in contrast have decimal points, for example 0.00, 1.13 and 2.0.

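The difference between a string of digits and a number can be sketched as follows. This is a minimal illustration using the built-in `int` and `float` conversion functions. The variable names are assumptions chosen for this example.

```python
# A string of digits is text, so + joins (concatenates) instead of adding
digits = '1234'
joined = digits + '1'        # '12341'

# Convert the string to an integer first to do arithmetic with it
as_int = int(digits)         # 1234
total = as_int + 1           # 1235

# int() drops the fractional part of a float; float() keeps the decimals
truncated = int(1234.345)    # 1234
as_float = float('10.23')    # 10.23

print(joined, total, truncated, as_float)
```

Note that `int()` truncates rather than rounds, so `int(1.99)` gives `1`.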

In [4]:
# Example of a string
'1234'

# Example of an integer
1234

# Example of a float
1234.5

1234.5

In Python, both single quotes (') and double quotes (") can be used to define string literals. The two forms are equivalent, so the choice is mostly a matter of personal preference. Using one kind of quote to delimit the string lets you include the other kind inside it without escaping. Here's an example of using both single and double quotes:

```python
'Hello, World!' # single_quoted
"Hello, World!" # double_quoted
'We said, "Hello, World!"' # mixed_quotes

# Escaping quotes
'I\'m a programmer.' # single_with_escape
"We said, \"Hello, World!\"" # double_with_escape
```

In the cells below, we assign data to the variables `text`, `number` and `pi_value`, using the assignment operator `=`.

To check the type of something, we can use the built-in function `type`:

In [5]:
text = "Data Carpentry"  # An example of assigning a value to a new text variable,
                         # also known as a string data type in Python
number = 42              # An example of assigning a numeric value, or an integer data type
pi_value = 3.1415        # An example of assigning a floating point value (the float data type)

In [6]:
text

'Data Carpentry'

In [7]:
type(text)  # try type(number) or type(pi_value) 

str

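As suggested in the comment above, the same check works for the other two variables. The sketch below redefines the variables so it can run on its own. It uses `print`, which shows the type as `<class 'str'>` rather than the shorter `str` that a notebook cell displays. The name `mixed` is an assumption for illustration.

```python
text = "Data Carpentry"
number = 42
pi_value = 3.1415

print(type(text))      # <class 'str'>
print(type(number))    # <class 'int'>
print(type(pi_value))  # <class 'float'>

# Mixing an integer and a float in a calculation gives a float
mixed = number + pi_value
print(type(mixed))     # <class 'float'>
```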
To print out the value stored in a variable, we can simply type in the name of the variable into the interpreter:

In [8]:
# Print out text
text  # try number or pi_value

'Data Carpentry'

Or we can call the built-in `print` function:

In [9]:
# Print out text
print(text)

Data Carpentry


A cell in a Jupyter Notebook, by default, will print to the screen the last thing it evaluates.

In [10]:
# Evaluate text, then pi_value: only the last one is displayed
text

pi_value

3.1415

To print out multiple variables in a cell, we can evaluate our variables separated by a comma or use multiple `print` statements.

In [11]:
# Print out text, number and pi_value
text, number, pi_value

('Data Carpentry', 42, 3.1415)

In [12]:
# Print out text, number and pi_value
print(text, number, pi_value)

Data Carpentry 42 3.1415


In [13]:
# Print out text, number and pi_value
print(text)
print(number) 
print(pi_value)


Data Carpentry
42
3.1415


#### Mathematical Operators
We can perform mathematical calculations in Python using the basic operators `+`, `-`, `*`, `/`, `**` (power) and `%` (modulo):

In [14]:
# Addition
2 + 2

4

In [15]:
# Multiplication
6 * 7

42

In [16]:
# Power
2 ** 16

65536

In [17]:
# Modulo
13 % 5

3

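Subtraction and division are listed above but not demonstrated. The following is a small sketch of both, with variable names chosen for illustration. In Python 3, `/` always returns a float, even when the result is a whole number.

```python
# Subtraction of two integers gives an integer
difference = 7 - 2       # 5

# Division with / always gives a float in Python 3
quotient = 7 / 2         # 3.5
even_quotient = 8 / 2    # 4.0, not 4

# Modulo gives the remainder left over after division
remainder = 7 % 2        # 1

print(difference, quotient, even_quotient, remainder)
```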
#### Logical Operators
We can also use comparison operators (`<`, `>`, `==`, `!=`, `<=`, `>=`) and logical operators (`and`, `or`, `not`). Comparisons return a value of the data type called *boolean*, which is either `True` or `False`.

In [18]:
3 > 4

False

In [19]:
True and True

True

In [20]:
True or False

True

In [21]:
True and False

False

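The operators `==`, `!=` and `not` are not shown in the cells above. Here is a brief sketch of how they behave, with illustrative variable names. It also shows a comparison combined with `and`.

```python
# == tests equality (a single = is assignment, not comparison)
is_equal = (2 + 2 == 4)          # True

# != tests inequality
is_different = (3 != 3)          # False

# not flips a boolean value
negated = not True               # False

# Comparisons can be combined with and / or
in_range = 0 <= 5 and 5 < 10     # True

print(is_equal, is_different, negated, in_range)
print(type(is_equal))            # <class 'bool'>
```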
#### Sequences: Lists and Tuples
#### Lists
*Lists* are a common data structure to hold an ordered sequence of elements. Each element can be accessed by an index. Note that Python indexes start with 0 instead of 1:



In [22]:
# Creating and indexing a list of numbers
numbers = [1, 2, 3]
numbers[0]

1

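Because indexing starts at 0, the last element of a three-element list sits at index 2, not 3. A short sketch, using the built-in `len` function to count the elements (the names `first`, `last` and `count` are illustrative):

```python
numbers = [1, 2, 3]

first = numbers[0]       # index 0 is the first element: 1
last = numbers[2]        # index 2 is the third (last) element: 3
count = len(numbers)     # number of elements: 3

# The largest valid index is always len(numbers) - 1
also_last = numbers[count - 1]

print(first, last, count, also_last)
# numbers[3] would raise an IndexError, because there is no fourth element
```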
A for loop can be used to access the elements in a list or other Python data structure one at a time:

In [23]:
# loop over list, print elements
for num in numbers:
    print(num)

1
2
3


To add elements to the end of a list, we can use the `append` method.

Methods are a way to interact with an object (a list, for example). We can invoke a method using the `.` followed by the method name and a list of arguments in parentheses. Let's look at an example using `append`:

In [24]:
# Adding an element to a list
numbers.append(4)
print(numbers)

[1, 2, 3, 4]


To find out what methods are available for an object, we can use the built-in `help` function.

In [25]:
# Viewing the help documentation for numbers list
help(numbers)

Help on list object:

class list(object)
 |  list(iterable=(), /)
 |  
 |  Built-in mutable sequence.
 |  
 |  If no argument is given, the constructor creates a new empty list.
 |  The argument must be an iterable if specified.
 |  
 |  Methods defined here:
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iadd__(self, value, /)
 |      Implement self+=value.
 |  
 |  __imul__(self, value, /)
 |      Implement self*=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate sign

#### Tuples
A tuple is similar to a list in that it's an ordered sequence of elements. However, tuples cannot be changed once created (they are "immutable"). Tuples are created by placing comma-separated values inside parentheses ().

In [26]:
# Creating tuples and a list

# Tuples use parentheses
a_tuple = (1, 2, 3)
another_tuple = ('blue', 'green', 'red')

# Note: lists use square brackets
a_list = [1, 2, 3]

##### ✏️ Challenge
1. What happens when you execute `a_list[1] = 5`?
2. What happens when you execute `a_tuple[1] = 5`? Why is this different to 1?
3. What does `type(a_tuple)` tell you about `a_tuple`?
4. What information does the built-in function len() provide? Does it provide the same information on both tuples and lists? Does the help() function confirm this? 

In [27]:
#1   
a_list[1] = 5

# The second value in a_list is replaced with 5.
a_list[1]

5

In [28]:
#2 
a_tuple[1] = 5

# TypeError: 'tuple' object does not support item assignment

TypeError: 'tuple' object does not support item assignment

In [None]:
#3
type(a_tuple)

# The function tells you that the variable a_tuple is an object of the class tuple.

tuple

In [None]:
#4
len(a_list)  

3

In [None]:
len(a_tuple)

3

In [None]:
help(len)

Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.

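Since a tuple cannot be modified in place, one way to get a "changed" version is to build a new tuple. The sketch below shows two possible approaches under that assumption. The names `new_tuple`, `as_list` and `changed` are made up for this example.

```python
a_tuple = (1, 2, 3)

# Option 1: build a new tuple from the pieces we want to keep
new_tuple = (a_tuple[0], 5, a_tuple[2])

# Option 2: convert to a list, change the element, convert back to a tuple
as_list = list(a_tuple)
as_list[1] = 5
changed = tuple(as_list)

# The original tuple is untouched in both cases
print(a_tuple, new_tuple, changed)
```

If you need to change elements often, a list is usually the simpler choice.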


#### Dictionaries

A dictionary is a container that holds pairs of objects - keys and values.

In [None]:
# Creating a dictionary
translation = {'one': 'first', 'two': 'second'}
translation['one']

'first'

Dictionaries work a lot like lists - except that you index them with *keys*. You can think of a key as a name or unique identifier for the value it corresponds to.

In [None]:
rev = {'first': 'one', 'second': 'two'}
rev['first']

'one'

To add an item to the dictionary, we assign a value to a new key:

In [None]:
rev['third'] = 'three'
rev

{'first': 'one', 'second': 'two', 'third': 'three'}

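Before looking up a key, it can be useful to check whether it exists, because indexing with a missing key raises a `KeyError`. This is a minimal sketch using the `in` operator and the dictionary `get` method. The key `'fourth'` and the variable names are assumptions for illustration.

```python
rev = {'first': 'one', 'second': 'two', 'third': 'three'}

# The in operator checks whether a key is present
has_third = 'third' in rev         # True
has_fourth = 'fourth' in rev       # False

# get() returns a fallback value instead of raising KeyError
fallback = rev.get('fourth', 'not found')

# len() counts the key-value pairs
size = len(rev)

print(has_third, has_fourth, fallback, size)
```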
Using for loops with dictionaries is a little more complicated. We can do this in two ways:

In [None]:
# loop over items
for key, value in rev.items():
    print(key, '->', value)

first -> one
second -> two
third -> three


or

In [None]:
# loop over keys
for key in rev.keys():
    print(key, '->', rev[key])

first -> one
second -> two
third -> three


##### ✏️ Challenge
1. First, print the value of the `rev` dictionary to the screen.
2. Reassign the value that corresponds to the key `second` so that it no longer reads "two" but instead `2`.
3. Print the value of `rev` to the screen again to see if the value has changed.

In [None]:
# 1
print(rev)

{'first': 'one', 'second': 'two', 'third': 'three'}


In [None]:
# 2
rev['second'] = 2
print(rev)

{'first': 'one', 'second': 2, 'third': 'three'}


In [None]:
# 3
print(rev)

{'first': 'one', 'second': 2, 'third': 'three'}


---

### Looping
Doing things one at a time can often be quite tedious. Python allows us to *iterate* what we do programmatically using `for` loops.

For example, a `for` loop can be used to access the elements in a list or other Python data structures one at a time.

In [None]:
# Iterating over a list - we defined numbers as a list previously
for num in numbers:
    print(num)

1
2
3
4


**Indentation** is very important in Python. Note that the second line is indented. This is Python's method of marking a block of code.

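The effect of indentation can be sketched with a running total. Indented lines belong to the loop and run once per element, while the unindented line after them runs once, when the loop has finished. The variable `total` is introduced here for illustration.

```python
numbers = [1, 2, 3, 4]
total = 0

for num in numbers:
    # Indented: part of the loop body, runs once for each element
    total = total + num
    print('running total:', total)

# Not indented: outside the loop, runs once after the loop ends
print('final total:', total)
```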
Using `for` loops with dictionaries is a little more complicated, but we can do it in two ways:

In [None]:
# Iterating over a dictionary - Method 1
for key, value in rev.items():
    print(key, '->', value)

first -> one
second -> 2
third -> three


In [None]:
# Iterating over a dictionary - Method 2
for key in rev.keys():
    print(key, '->', rev[key])

first -> one
second -> 2
third -> three


---

### Functions
Defining a section of code as a *function* in Python is done using the `def` keyword. For example, a function that takes two arguments and returns their sum can be defined as:

In [None]:
# Function to sum two numbers
def add_function(a, b):
    result = a + b
    return result

In [None]:
z = add_function(20, 22)
print(z)

42


In [None]:
# Make a function to add three numbers and take the average
def add_and_average(num1, num2, num3):
    total = num1 + num2 + num3
    average = total / 3
    return average

# Example usage:
result = add_and_average(10, 20, 30)
print(result)

20.0

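Functions and loops can be combined. As a sketch, the function below averages a list of any length instead of exactly three numbers. The name `average_of_list` and its parameter `values` are hypothetical and not part of the original lesson.

```python
def average_of_list(values):
    # Add up the elements one at a time with a for loop
    total = 0
    for value in values:
        total = total + value
    # Divide by the number of elements; / always returns a float
    return total / len(values)

# Example usage:
result = average_of_list([10, 20, 30])
print(result)
```

Passing an empty list would raise a `ZeroDivisionError`, since `len(values)` would be 0.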

---

# ❗Key Points

- Python is an interpreted language which can be used interactively (executing one command at a time) or in scripting mode (executing a series of commands saved in a file).
- One can assign a value to a variable in Python. Those variables can be of several types, such as string, integer, floating point and complex numbers.
- Lists and tuples are similar in that they are ordered sequences of elements; they differ in that a tuple is immutable (cannot be changed).
- Dictionaries are data structures that provide mappings between keys and values.