# The Basics
## Comments, the `print` function, and Python as a calculator

### Commenting code
When writing code, you are your own worst enemy. You might remember what something means at the time of writing, but when you return a day, a week, or even a month later, you *will* forget. Comment code by starting a line with `#` - anything following it on that line will be ignored by Python.

In [1]:
# Any line of code you preface with a '#' symbol is a comment. Python will ignore it - it's just for you to read. 
# ALWAYS COMMENT YOUR CODE! #

### The `print` function
When you write code, you often want to see the results of it. Python will happily run code all day, but unless you ask to see the output of certain pieces of code, it will go about its business silently. Anything you put inside the `print` function will be... printed.

### Python is capable of basic maths - it can be used as a calculator.

In [2]:
# Addition
print(10 + 5)

# Subtraction
print(10 - 5)

# Division
print(10 / 2)

# Modulus
print(10 % 3)

# Exponent
print(10 ** 2)

# Do the print outs below make sense?

15
5
5.0
1
100


More complex maths can be done, but follow the order of operations...

In [3]:
# Is this answer surprising?
print(4 * 5 / 2 + 7)

17.0


In [4]:
# One of Python's rules is 'be explicit.' Python will now work within brackets, inside-out
print( (4 * (5 / 2)) + 7 )

17.0


In [5]:
# Greater complexity
print( (10 * (4 + (10 / 2))) - 5 )

85.0


### Variables and data types
Almost always, we want to store information in some way to interact with it later. This is achieved through the use of *variables*. 

A variable is a kind of box that contains some data, which can be of a variety of *types*.

### Assigning a variable - 'putting something in a box'
Assigning a variable is very straightforward, and we'll be doing a lot of it. Variable assignment has rules!
* **Must** begin with a letter (a-z, A-Z) or an underscore, _
* Other characters may be letters, numbers or underscores
* Variables are **case sensitive**. `my_variable` is different to `My_variable`!
* Can be any length, but be sensible.
* Some words you cannot use as a variable name, because Python has reserved them
    * Examples include `and`, `or`, `for`, `return`, `del`, `def`, `in`, `yield`, `True`, `False`
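
The naming rules above can be checked from within Python itself. This is a minimal sketch using the built-in string method `.isidentifier()` and the standard library `keyword` module; the candidate names are made up for illustration.

```python
import keyword

# .isidentifier() checks the character rules (letters, digits, underscores, no leading digit)
valid_name = 'my_variable'.isidentifier()
leading_digit = '2fast'.isidentifier()

# keyword.iskeyword() checks whether Python has reserved the word
reserved_for = keyword.iskeyword('for')
reserved_weight = keyword.iskeyword('weight')

print(valid_name, leading_digit)       # True False
print(reserved_for, reserved_weight)   # True False
```

A name is safe to use as a variable when the first check is `True` and the second is `False`.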


In [6]:
# An example
weight = 90

print(weight)

90


In [7]:
# Another
height = 1.90

print(height)

1.9


It's now possible to work with these variables as if they are raw numbers, since they are just pointing to those numbers!

In [8]:
BMI = weight / (height ** 2)
print(BMI)

24.930747922437675


What if we made a mistake? Python is a forgiving language. We can simply overwrite the earlier variable by reassigning it, or altering it in some way!

In [9]:
# We've gained weight!
weight = 95

print(weight)

95


Since we can interact with variables as if they are numbers, the following approaches are also valid.

In [10]:
# We need to increase the height
height = height + 0.10

print(height)

# I can't seem to get this right...
height = height - 0.10
height = height + 0.05

print(height)

2.0
1.95


### 'Syntactic sugar'
It's pretty common to reassign a variable using the notation above. But that is a lot of typing. Python has some excellent shortcuts that can speed these operations up, which are known as 'syntactic sugar' - sweet code!

In [11]:
# How to alter a variable quickly
age = 30

# My birthday is this week, so...
age += 1

print(age)

31


It's possible to use `+=`, `-=`, `*=`, `/=`, `%=`, and `**=` to carry out the various operations on variables in place, **if** the variables refer to a single number.
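
As a quick illustration, here is each shortcut applied in turn to one variable. The variable name `score` and the numbers are made up for this sketch; the comments show the value after each line.

```python
score = 10

score += 5    # same as score = score + 5   -> 15
score -= 3    # same as score = score - 3   -> 12
score *= 2    # same as score = score * 2   -> 24
score /= 4    # same as score = score / 4   -> 6.0 (division gives a float)
score **= 2   # same as score = score ** 2  -> 36.0
score %= 5    # same as score = score % 5   -> 1.0

print(score)
```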

### Variable types
Not all variables and data are the same, and what you can do with a variable depends on its type. Each type comes with its own abilities and uses.


There are several common data types that you see during analysis. 

* Integers, known as `int`. These are whole numbers, e.g. `1`, `45`.
* Floating points, or `float`. These are numbers with a decimal point, e.g. `3.894`, `99.100000003`


* Booleans, or `bool`. These are special variables that are represented by `True` and `False`. These have a range of uses in **comparing** variables, or checking whether certain conditions are met.
* Strings, or `str`. These represent text data and have a range of uses in data analysis. 

How do you know what type your variable is? Use the `type` function!

In [12]:
# Make an integer
an_integer = 1
print(type(an_integer))

<class 'int'>


In [13]:
# Make a float
a_float = 5.78
print(type(a_float))

<class 'float'>


In [14]:
# Booleans already exist, but we can assign them to a variable
is_true = True
print(type(is_true))

<class 'bool'>


In [15]:
# Make a string
a_string = 'Hello world'

another_string = "Hello again"

formatted_string = """Any text tha
t star
ts with three quotes     
will be formatted just like you type!"""

print(type(a_string))

# Just to show
print(formatted_string)

<class 'str'>
Any text tha
t star
ts with three quotes     
will be formatted just like you type!


### Converting between types
Variables do not have to stay as the type they were assigned to. 

The variable types - `int`, `float`, `bool`, and `str` can be used as functions to convert between types, but only if the variables have certain values.
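
To make the 'certain values' condition concrete, here is a small sketch of conversions that work and one that does not. It uses `try` and `except`, which are not covered in this section, purely to catch the error so the rest of the code keeps running; the variable names are made up for illustration.

```python
# These conversions work because the values make sense for the target type
print(int('12'))      # 12
print(float('3.5'))   # 3.5
print(int(7.9))       # 7 - the decimal part is dropped, not rounded

# This one fails: the text 'twelve' is not a number-as-a-string
text = 'twelve'
try:
    number = int(text)
except ValueError as error:
    number = None
    print('Conversion failed:', error)
```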

In [16]:
# Convert a float to an integer - the decimal part is dropped, not rounded
to_int = int(a_float)

print(type(to_int), to_int)

<class 'int'> 5


In [17]:
# Convert an integer to a float
to_float = float(an_integer)

print(type(to_float), to_float)

<class 'float'> 1.0


In [18]:
# Convert an integer to a boolean - any non-zero number (including negatives) gets converted to True, zero is False
true_bool = bool(12)
false_bool = bool(0)

print(type(true_bool), true_bool)
print(type(false_bool), false_bool)

<class 'bool'> True
<class 'bool'> False


In [19]:
# Converting types with strings can be tricky. You can only convert numbers-as-strings to actual numbers.
a_number = '1'

print(type(a_number), a_number)

<class 'str'> 1


In [20]:
# Convert it to an integer
as_integer = int(a_number)

print(type(as_integer), as_integer)

# Notice how they look the same to us, but not to Python!

<class 'int'> 1


In [21]:
# Turn a number to a string
as_string = str(1342)

print(type(as_string), as_string)

<class 'str'> 1342


### Working with different variable types
Variables of different types can interact in certain ways, but not in others. They can also behave in unexpected ways which can catch you out, so it's worth learning this early.

In [22]:
# Adding an integer and a float works!
float_ = float(3.21)
int_ = int(2)

print(float_ + int_)

5.21


In [23]:
# Mixing an integer and a boolean may not seem to make sense, but Python treats True as 1 (and False as 0), so the arithmetic works
a_bool = True

print(int_ - a_bool)

1


In [24]:
# Adding strings together works just like stitching them together, but subtraction doesn't work
print('hello' + 'world')

# Remember your whitespace!
print('Hello' + ' ' + 'world')

# What happens if I multiply? Division doesn't work!
print('hello' * 10)

helloworld
Hello world
hellohellohellohellohellohellohellohellohellohello


In [25]:
# But working with strings and other types is as simple as a correct conversion - without, it will break
language = 'Python'
age = 30

sentence = 'The programming language ' + language + ' is ' + str(age) + ' years old'

print(sentence)

The programming language Python is 30 years old
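
For comparison, this sketch shows what 'it will break' looks like when the conversion is left out, and one alternative way of building the same sentence. The alternative uses an f-string, which is not covered in this section: a string prefixed with `f` converts anything inside `{}` to text for you. `try` and `except` are used only to catch the error.

```python
language = 'Python'
age = 30

# Without str(), Python refuses to add a string and an integer
try:
    sentence = 'The programming language ' + language + ' is ' + age + ' years old'
except TypeError as error:
    print('TypeError:', error)

# An f-string handles the conversion inside the curly braces
sentence = f'The programming language {language} is {age} years old'
print(sentence)
```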

### Logic - comparing data and variables

A lot of data handling and analysis involves comparison of some kind. How can we compare whether one score is higher than another? How do we remove participants from an analysis who are above a certain age, or have a score outside of a normal range?

Computers are excellent at this kind of `logic`, and there are specific rules about how to compare values.

The specific syntax for comparing variables:

* Are two variables the same, `==`
* Are two variables **not** the same, `!=`

* Is one variable **greater** than another, `>`
* Is one variable **less** than another, `<`

* Is one variable **greater than or equal** to another, `>=`
* Is one variable **less than or equal** to another, `<=`

These comparisons are known as **Boolean logic**, and the results of the comparisons will be either `True` or `False` - the special values that are used to manage logical outcomes.

In [None]:
# Variable comparison examples
# Define some simple variables - ';' can be used to have more than one statement on a line
a = 5; b = 3; c = 23; d = 23

In [None]:
# Is a greater than b?
print(a > b)

In [None]:
# Is b less than c?
print(b < c)

In [None]:
# Is a greater than d?
print(a > d)

In [None]:
# Are c and d the same?
print(c == d)

# Or is one equal to or greater than the other?
print(c >= d)

Variable comparison is straightforward, but sometimes you may have a series of variables to compare, and this is a little more involved, and uses some extra keywords.

Introducing `and`, `or`, `not`:

* `and` compares a pair of booleans, and will return `True` **only** if both of them are `True`. A related operator, `&`, appears in libraries such as NumPy and Pandas for element-wise comparisons.

* `or` compares a pair of booleans, and will return `True` if at least one of them is `True`. A related operator, `|`, is its element-wise counterpart.

* `not` will invert a boolean value - so if it's `True`, it will become `False`, and so on.

Let's look at some examples.

In [None]:
# Define some variables
x = 10; y = 12; z = 15
a = 9; b = 22; c = 30
age = 30; sex = 'female'

In [None]:
# Demonstrate 'and'
age > 20 and y > 10

In [None]:
# Demonstrate 'or' 
sex == 'female' or age < 20

In [None]:
# Demonstrate 'not'
not b < 20 and age >= 30
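
One thing worth knowing about the `not` example: without brackets, `not` only applies to the comparison directly after it, not to the whole line. The sketch below uses made-up variables (`score` and `years`, chosen so the two readings give different answers) to show the difference.

```python
# Hypothetical values for illustration
score = 10
years = 20

# Without brackets, 'not' only inverts the first comparison
no_brackets = not score < 20 and years >= 30
same_as = (not (score < 20)) and (years >= 30)

# With brackets, 'not' inverts the whole combined comparison
with_brackets = not (score < 20 and years >= 30)

print(no_brackets, same_as, with_brackets)   # False False True
```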

More complex comparisons can be carried out using `()` to group subsets of comparisons that make up larger comparisons.

In [None]:
# More complex comparisons
(sex == 'female' and age < 30) or (a < 9 or c == 30)

In [None]:
# More complex 
not (sex == 'male' and age < 30) and (b < 25 or z > 16)

### Data collections

So far, we've seen data stored in single variables. But we usually want to collect data together somehow - different variables for each participant, different experimental conditions, or different data files for each participant. 

How can a bunch of variables be stored together? Python has a range of `collections` - special structures that can store data together. The two we'll focus on are called `lists` and `dictionaries`.

## The list
The `list` is a workhorse of data manipulation and analysis. A list is a group of variables of various types, set in a particular order, that are stored together in a single variable.

Consider this example from an analysis perspective - data from an experimental participant on a personality test. It would be clumsy to refer to these all the time individually, so let's use a list.

In [None]:
# Define some data variables
pid = '001'; age = 29; sex = 'female'; extraversion = 5; neuroticism = 3

In [None]:
# Store these in a list!
data = [pid, age, sex, extraversion, neuroticism]
print(data)

The data is now stored together in one collection, with varying types. Each part of a list is known as an element, and has a position. It's worth noting that each element can be any Python data type or structure - even another list!

In [None]:
# Nested list
nest = [1, 4, 5, ['inside', 'the', 'list']]
print(nest)

### List indexing
We've got our data in a list. But what if we need to interact with the elements? How can we access different elements? Using an `index`, and different `methods`. 

Welcome to one of Python's quirkiest parts: list indexing.

#### Simple indexing
You can access any element of a list by passing its **index value** to the list variable, using square brackets, `[]`, in this format:
`list_name[number]`.
List indexing **starts from zero**. That is, the first element is in the '0th' position.

In [None]:
# Get the participant ID from the data list
retrieved_pid = data[0]
print(retrieved_pid)

In [None]:
# Get the extraversion score
print(data[3])

In [None]:
# Lists can also be accessed in reverse order, by using negative indexes!
# Grab the final element of a list
print(data[-1])

In [None]:
# Or grab the FIRST element of a list using reverse indexing
print(data[-5])

A graphic for remembering list indexing:
![List%20Index.png](attachment:List%20Index.png)

### List slicing
Getting things out of a list one by one would involve a lot of typing. Often, we might want to access a range of values all at once. This is achieved through specific notation using a colon, `:`, and specific index locations.

In [None]:
# The first few elements of `data` have demographics. Let's get those out
print(data[0:3])

In [None]:
# Or if you are starting from the very beginning, the 0 can be omitted
print(data[:3])

In [None]:
# Reverse slicing works in the same way
print(data[-3:])

In [None]:
# Nested lists can be indexed one after the other 
print(nest)
print(nest[3][-1])

Do you notice anything interesting about the slicing? To access the first three elements, you have to pass `[0:3]`. List slicing works on the basis that:

* The first value specifies the start and is ***included***.
* The value after the colon specifies the end and is ***not*** included.

A good way of remembering this is to think Python sees a slice as 'start here, and take ***up to, but not including*** this number'
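
A small sketch of the 'up to, but not including' rule, using a made-up list of letters so the positions are easy to follow:

```python
example = ['a', 'b', 'c', 'd', 'e']

# Start at position 0, stop before position 3
first_three = example[0:3]
print(first_three)          # ['a', 'b', 'c']

# The element AT position 3 is not part of the slice
print(example[3])           # d

# A handy consequence: slicing at the same point splits a list cleanly in two
front = example[:3]
back = example[3:]
print(front + back == example)   # True
```

Because the end is excluded, the length of a slice is simply `stop - start`.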

### Stepped indexing 
It's possible to use different steps to jump over lists in different ways, using a set of three values and colon operators, that specify where to start, where to stop, and how large each step is, like this:

`a_list[start:stop:step]`

In [None]:
# Define a longer list
long = ['a', 1, 'c', 45, 'd', 77, 'st', 4.321]

In [None]:
# Index with steps
print(long[1:6:2])

In [None]:
# Another variation
print(long[:5:3])

In [None]:
# In reverse!
print(long[5:1:-1])
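
If the results are hard to follow, it can help to repeat the same slices on a list where every element is just a letter. This is a sketch with a made-up list; the comments show which positions each slice visits.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# Start at 1, stop before 6, step by 2 -> positions 1, 3, 5
every_second = letters[1:6:2]
print(every_second)     # ['b', 'd', 'f']

# Start omitted (so 0), stop before 5, step by 3 -> positions 0, 3
every_third = letters[:5:3]
print(every_third)      # ['a', 'd']

# A negative step walks backwards: positions 5, 4, 3, 2
backwards = letters[5:1:-1]
print(backwards)        # ['f', 'e', 'd', 'c']

# Omitting start and stop with a step of -1 reverses the whole list
reversed_letters = letters[::-1]
print(reversed_letters)
```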

### List `.methods` and other tricks
Lists are pretty powerful, but there are additional things that make them even cooler. Lists (along with many other data structures in Python) have **methods**. A **method** is a function that 'lives' in a variable, and can do specific things to the data in that variable. Different data structures have different methods, and many of them are very useful. A method is accessed by the `.` operator.

Let's say we want to add data to a list. How can we do this? Using the `.append` method.

In [None]:
# Add some Agreeableness scores to the personality data
print(data)

In [None]:
# Define Agreeableness score
agreeableness = 1
# Append it!
data.append(agreeableness)
print(data)

Sometimes, you can't remember the location of a specific element in a list when you want to get to it. There is an `.index` method that takes a value that is in the list, and will return the index. 

In [None]:
# We need the position of the sex variable. We know they are female, so...
ind = data.index('female')
print(ind)

# Use that index
print(data[ind])

Lists can be interacted with using the `*` and `+` operators, which result in interesting manipulations.

In [None]:
# Make two lists
list_a = [3, 2, 1]; list_b = [4, 5, 6]

In [None]:
# Try addition...
print(list_a + list_b)

In [None]:
# What if I multiply a list?
print(list_a * 3)

In [None]:
# Multiple operators
print(list_a * 3 + list_b * 2)

We may sometimes want to alter lists in some way. That's easy enough to do using slicing or methods. Lists are known as a **mutable** data structure, because you can alter them once you have created them. 

In [None]:
# Update the data list with new participant data
print(data)

# Update the scores
data[3:] = [5, 5, 5]

print(data)

## The dictionary - look it up!
Lists are powerful ways to store variables and data of many different types. But it can be hard to keep track of what kind of variable is in what element. For example, in the `data` list, the first few elements were demographic data. Was the age the second element, or the third? 

That's not easy to remember. Fortunately, another data structure, the `dictionary`, addresses these shortcomings, by providing a set of *key:value* pairs, which are defined using curly braces notation `{}`. The keys must be an **immutable** data structure - a string usually works!

We can define a dictionary with some demographic data.

In [None]:
# Create a dictionary
demographics = {'PID':'001', 'Age':30, 'Sex':'Female'}

# Or use the `dict` function, assigning the values with keyword arguments
demographics_1 = dict(PID='001', Age=30, Sex='Female')

print(demographics)
print(demographics_1)

#### Looking things up in the dictionary
Getting a value from a dictionary is as easy as providing the **key**, following this format. 

`dictionary_name[key_name]`

In [None]:
# Retrieve the participant sex
print(demographics['Sex'])
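
What happens if the key is not in the dictionary? Square brackets raise a `KeyError`. The sketch below shows that, along with two gentler options: the `in` keyword and the dictionary `.get` method. The dictionary `person` and the key `'Height'` are made up for illustration, and `try` and `except` are used only to catch the error.

```python
person = {'PID': '001', 'Age': 30}

# Asking for a key that does not exist raises a KeyError
try:
    height = person['Height']
except KeyError as error:
    print('No such key:', error)

# 'in' checks whether a key exists
has_age = 'Age' in person
has_height = 'Height' in person
print(has_age, has_height)      # True False

# .get returns None (or a fallback of your choice) instead of raising an error
missing = person.get('Height')
fallback = person.get('Height', 'not recorded')
print(missing, fallback)        # None not recorded
```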

#### Adding data to the dictionary
Adding extra data is as simple as providing a new *key:value* pair. Entire extra dictionaries can be provided through the dictionary `.update` method.

In [None]:
# Add a score to the data
demographics['Extraversion'] = 4
print(demographics)

In [None]:
# Create a new dictionary of scores to be added
scores = dict(Agreeableness=4, Neuroticism=2, Openness=1, Conscientiousness=3)

# Call the update method on the demographics dictionary
demographics.update(scores)

print(demographics)

#### Important dictionary methods
Like lists, dictionaries have a range of methods. Some useful ones to know are the following:
* `.keys()` : returns a view of all the keys in the dictionary (wrap it in `list()` if you need a real list), useful if you aren't sure what's inside one.
* `.values()` : returns a view of all the values (i.e. data) in the dictionary. Allows a look directly at the data.
* `.items()` : returns a view where each element is a tuple (like a list, but immutable), where the first element is the key, and the second, the value.

In [None]:
# Look at the keys 
print(demographics.keys())

In [None]:
# Look at the values
print(demographics.values())

In [None]:
# Look at it all!
print(demographics.items())
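
In Python 3, the objects these methods return are not quite lists: they print similarly, but cannot be indexed with `[0]`. If you need list behaviour, pass them to the `list` function. A minimal sketch, using a made-up dictionary:

```python
example_dict = {'PID': '001', 'Age': 30}

# Convert each result to a real list
key_list = list(example_dict.keys())
value_list = list(example_dict.values())
item_list = list(example_dict.items())

print(key_list)      # ['PID', 'Age']
print(value_list)    # ['001', 30]
print(item_list)     # [('PID', '001'), ('Age', 30)]

# Now normal list indexing works
print(key_list[0])   # PID
```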

#### Advanced dictionaries
Like lists, dictionaries can also be nested, with a key referring to another dictionary. 

It follows that a value can be anything - a list, dictionary, string, or any of Python's many data types.

In [None]:
# Demonstrate a nested dictionary
nest_dict = {'PID': 'John', 'Age': 25, 'Personality':{'Extra': 4, 'Neuro': 1}}

In [None]:
# Accessing nested attributes
print(nest_dict['Personality'])
print(nest_dict['Personality']['Extra'])

It should be clear by now that lists and dictionaries can be used together, to build complex data structures capable of storing a range of variables and data. Extracting data from a complex data structure is relatively straightforward, armed with what we know.

In [None]:
# Create a somewhat complex data structure - notice we can split long commands across lines
participantA = {'PID': '001', 'age': 23, 
               'Anxiety_Score':{'Subscale1': 4.22, 'Subscale2': 2.34 },
              'RTs': [321, 230, 400, 593, 600, 102, 129]}

participantB = {'PID': '002', 'age': 25, 
               'Anxiety_Score':{'Subscale1': 1.20, 'Subscale2': 5.24 },
              'RTs': [876, 450, 291, 893, 864, 456, 983]}

full_data = {'ppt_A': participantA, 'ppt_B': participantB}

In [None]:
# Extracting buried items - extract the last reaction time they gave
print(full_data['ppt_A']['RTs'][-1])

In [None]:
# Or anxiety scores
print(full_data['ppt_B']['Anxiety_Score']['Subscale1'])

## Functions

Functions are nothing new to you - we have used them before!
* `print`
* `type`
* `dict`
* `list`

A function is a piece of reusable code that is suited for a particular task. Python has many functions built in, and many more across different packages we will come to in the future. 


#### Functions example
How do we use functions? With the following general approach:

`output = function_name(work_on_this, more_arguments)`

Two built in functions we can play with are `min` and `max`. 

In [None]:
# Example
scores = [2.39, 1.12, 5.21, 8.92, 1.24]

# Find the range by using min and max
lowest = min(scores)
highest = max(scores)

# Compute the span
print(highest - lowest)
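
Functions can also be combined. As a sketch, two other built-in functions, `sum` and `len`, are enough to compute a mean. The list name `example_scores` is made up here so the earlier variables are left alone.

```python
example_scores = [2.39, 1.12, 5.21, 8.92, 1.24]

# sum adds up every element, len counts how many there are
total = sum(example_scores)
count = len(example_scores)

# The mean is the total divided by the count
mean_score = total / count
print(count)
print(mean_score)
```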

`max` and `min` are really simple functions, with a single input. Sometimes functions have more than one input, or **argument**. More than one argument is entered by separating with a `,`.

A common data analysis function is `round`, which rounds numbers to a degree of precision set by you. Its first input is a value you want to round, and the second is the level of precision you want.

In [None]:
# Multi-argument function example
value = 3.193849291

# Use round 
rounded = round(value, 3)

print(rounded)

Functions often have what are known as default arguments. What happens if we just give `round` a single argument, the number we want to round?

In [None]:
# Testing round
print(round(value))

Python has made a guess about what you want - rounded to a whole number. That is, the second argument has a **default** value - `None`, which tells `round` to return the nearest whole number.
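
A quick way to see the default in action is to compare the *type* of what `round` gives back. In this sketch (the variable `number` is made up), leaving the second argument out returns an integer, while explicitly asking for zero decimal places returns a float. Arguments can also be passed by name; `help(round)` shows the second one is called `ndigits`.

```python
number = 3.193849291

# No second argument: the result is an integer
whole = round(number)
print(whole, type(whole))        # 3 <class 'int'>

# Explicitly asking for 0 decimal places: the result stays a float
to_zero = round(number, 0)
print(to_zero, type(to_zero))    # 3.0 <class 'float'>

# The second argument can be given by name
by_keyword = round(number, ndigits=2)
print(by_keyword)                # 3.19
```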

#### `help`!
If you ever get confused about what a function does, or what arguments it requires, you can ask for help. 
`help` is a function whose job is to give you a print out of how a function works. What does it say about `round`?

In [None]:
# Get help on round
help(round)

## Methods
We've seen methods before - the special functions that a `list` can use to make alterations to itself. 

But methods are everywhere, and depend on the variable in question. You can use `help` on a specific data type to see what methods it has. 

`str` data has a range of useful methods that make working with text data simple.

For example, we can use:
* `.upper()`
* `.lower()`
* `.count()`
* `.split()`
* `.join()`

for various exciting possibilities. 

In [None]:
# Define a string
basic = 'We are going to play with this!'

In [None]:
# How does .upper() work?
print(basic.upper())

In [None]:
# So lower should...
print(basic.lower())

In [None]:
# How many 'a's in this sentence?
print(basic.count('a'))

In [None]:
# Chop this up at each space
print(basic.split(' '))

In [None]:
# Join is cool!
split_list = basic.split(' ')
print('!'.join(split_list))
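
Methods can be chained one after another, which is handy for tidying messy text data. This sketch uses a made-up survey response and one method not listed above, `.strip()`, which removes whitespace from both ends of a string. It also shows that `.split` and `.join` undo each other.

```python
# A hypothetical response typed with stray spaces and capitals
raw_response = '  FEMALE '

# Each method works on the result of the one before it
clean_response = raw_response.strip().lower()
print(clean_response)          # female

# Splitting on a space and joining with a space gets the original back
sentence = 'We are going to play with this!'
words = sentence.split(' ')
rebuilt = ' '.join(words)
print(len(words))              # 7
print(rebuilt == sentence)     # True
```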

## Loops
Things are getting more advanced now. There is one programming practice that you should master to fully control your data. 

What we've seen so far is straightforward, defining collections like lists and dictionaries, and retrieving data from them through indexing. Sometimes, we want to do something with each element in a data collection - perhaps call a function on it, read a file, remove a piece of data, etc. How can we do this without manually accessing each element?

The answer is the `for` loop, a special piece of syntax that allows us to *iterate* over a sequence or collection and retrieve the values inside by temporarily passing them to a new variable of whatever name we choose. 

The format looks like this:

`for variable in sequence:`

    `do something with variable`
    
Note the indentation on the second line - as you'll see, this is part of the code. Python uses this to differentiate code that is in the for loop from code outside of it. 
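
To see what the indentation does, here is a minimal sketch with made-up numbers. The two indented lines run once per element; the final line is not indented, so it runs only once, after the loop has finished.

```python
total = 0

for number in [1, 2, 3]:
    # Indented: these lines belong to the loop and run three times
    total += number
    print('Running total:', total)

# Not indented: this line is outside the loop and runs once
print('Final total:', total)
```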

For loops can be confusing, so some examples are required!

In [None]:
# Make a list - these values all need to be rounded to two decimal places!
data_list = [12.394, 9.0948, 1.02, 39.4023, 2.392, 50.12948912]

# Write a for loop that prints each element in turn
for score in data_list:
    print(score)

At each iteration, the value of `score` changed to be the next element in the list. So it started with the element in the position `[0]`, then `[1]`, and then onwards until `data_list` was finished. 

Let's say we want to modify each element of a list, by rounding the numbers. This could be achieved as follows:

In [None]:
# First, make an empty list to store our new data - just square brackets with no variables
rounded_data = []

# Iterate over the original list - round each element - and append to our empty list
for value in data_list:
    round_value = round(value, 2)
    rounded_data.append(round_value)
    
print(rounded_data)
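
The 'empty list, loop, append' pattern is so common that Python offers a one-line shorthand for it, called a list comprehension. It is not covered further in this section; the sketch below (with a made-up, shorter list) simply shows that both approaches give the same result.

```python
example_values = [12.394, 9.0948, 1.02]

# The loop-and-append approach
looped = []
for value in example_values:
    looped.append(round(value, 2))

# The same thing as a list comprehension
comprehended = [round(value, 2) for value in example_values]

print(looped)
print(looped == comprehended)   # True
```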

#### Building complexity with iteration
You aren't limited to iterating just once. You can **nest** an iteration in the same way you can nest lists and dictionaries. This sometimes happens if you need to dig deeper into data structures to manipulate values.

In [None]:
# Define a nested list
nest_list = [ ['001', 23, 'female'], ['002', 31, 'male'], ['003', 40, 'female'] ]

# Iterate over the main list
for participant in nest_list:
    
    # Iterate over the sub-list
    for score in participant:
        print(score)

In [None]:
# Or access the sublists to do what we want!
for participant in nest_list:
    
    line = 'Participant ID = ' + participant[0] + ', age = ' + str(participant[1]) + ', sex = ' + participant[2]
    print(line)

#### `enumerate` 
It should be clear from the above examples that when iterating over a list using `for`, we get back a variable that changes at each iteration. This is usually fine, but sometimes, we want to know the **actual position** of the element and do something with that. 

The `enumerate` function is used for dealing with this. We call `enumerate` on a sequence when we set up a `for` loop, and it returns a pair of variables - a counter, that returns the index, and a value, that returns the element, just like a normal `for` loop. 

The syntax looks like this!

`for index, value in enumerate(sequence):`

    `do stuff!`

In [None]:
# Define a list
data_list = [123, 381, 'content', 'value']

for index, value in enumerate(data_list):
    line = 'Item in position ' + str(index) + ' is ' + str(value)
    print(line)

In [None]:
# To demonstrate the use of index to actually access elements!
for index, value in enumerate(data_list):
    print(data_list[index])

In [None]:
# Or to more efficiently replace elements - the original list is now lost
data_list = [12.394, 9.0948, 1.02, 39.4023, 2.392, 50.12948912]
print(data_list)

for index, value in enumerate(data_list):
    
    # Use the index to access and assign a new variable
    data_list[index] = round(value, 2)

print(data_list)
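
By default the counter from `enumerate` starts at zero, matching list indexing. When the counter is for humans to read rather than for indexing, `enumerate` accepts a `start` argument. A small sketch with a made-up list of conditions:

```python
conditions = ['control', 'low', 'high']

labels = []
# start=1 makes the counter begin at 1 instead of 0
for number, condition in enumerate(conditions, start=1):
    labels.append(str(number) + ': ' + condition)

print(labels)   # ['1: control', '2: low', '3: high']
```

Note that a counter started at 1 no longer matches the element's index, so it should not be used to index back into the list.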

### Iterating over dictionaries
Dictionaries are a somewhat more complex structure compared to lists. For example, in older versions of Python (before 3.7) they don't 'remember' the order things were put in them, so iterating over them in the same way as lists doesn't always make sense. 

If we remember the **methods** dictionaries had, we can iterate over those to access and play around with created dictionaries.

In [None]:
# Make a dictionary
data_dict = {'Score_A': 45.2912, 'Score_B': 68.2945, 'Score_C': 88.9873}

In [None]:
# Print out the keys
for key in data_dict.keys():
    print(key)
    


In [None]:
# Or use the keys to access the data!
for key in data_dict.keys():
    print(data_dict[key])

In [None]:
# Use this to grab the values - in Python versions before 3.7, beware they may not be in the order you think
for value in data_dict.values():
    print(value)

In [None]:
# Finally, use .items to get both key:value pairs out!
for key, value in data_dict.items():
    line = 'The key ' + key + ' returns the value: ' + str(value)
    print(line)

In [None]:
# Makes sense to use this to modify the original dictionary
# Raise each value to the power of 3 to two decimal places
for k, v in data_dict.items():
    data_dict[k] = round(v ** 3, 2)
    
print(data_dict)
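
Overwriting values for keys that already exist is fine while looping, but adding or removing keys during the loop raises a `RuntimeError`. If you want to keep the original data, one option is to build a new dictionary instead. A sketch, with made-up dictionary names:

```python
original_scores = {'Score_A': 45.2912, 'Score_B': 68.2945, 'Score_C': 88.9873}

# Start with an empty dictionary - just curly braces
rounded_scores = {}

# Fill the new dictionary, leaving the original untouched
for key, value in original_scores.items():
    rounded_scores[key] = round(value, 1)

print(original_scores)
print(rounded_scores)   # {'Score_A': 45.3, 'Score_B': 68.3, 'Score_C': 89.0}
```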

The basics are covered! 

That's a lot of information to take in, but these are the fundamentals of working with data in Python. Next time, we will cover the **NumPy** and **Pandas** libraries, which are built to process data efficiently and are the real workhorses of doing data analysis in Python.

If you're struggling or are stuck, do not worry. The learning curve is steep and it will *take time*. The only way to improve is to keep coding. Try the exercises, and use the internet to look for answers where you aren't sure.

In [None]:
# Embed a YouTube video in the notebook output
from IPython.display import YouTubeVideo, display
display(YouTubeVideo('HluANRwPyNo'))