# Python and Jupyter Basics
### Workshop 0 of DASIL's series on "Data Science with Python"
### Created by Martin Pollack

Welcome to the series!

Right now you are looking at a Jupyter Notebook. This is a popular filetype for doing Data Science in Python, and we will be using it often.

In "code" blocks you can write your actual Python code, and outputs coming from your code are printed out nicely at the end of each block. To run your code, just hit the arrow in the upper left corner of the code block.

Then there are also "markdown" blocks for writing normal text to describe or give context for your code, like the one you are reading right now! There are lots of things you can do here to customize your text, but we will only briefly discuss one of them: headers.

# If you put a single "#" before text, it will be a large header
## With two "#" symbols, you get a medium-sized header
### Three of them means you get a small header

Blocks can be deleted by double-clicking on them and then clicking on the trash can icon in the top-right corner of the block.

You can also change the order of blocks by clicking and dragging them past each other.

And that's basically all you need to know about Jupyter Notebooks. In time you'll learn to love them as much as I do.

# Now let's get to Python!

## Numbers

Python can do basically anything a basic calculator can do.

By typing numbers and the basic arithmetic operators (`+`, `-`, `*`, `/`), we can do things like:

In [4]:
1 + 1

2

And the result of our addition is displayed below our code block.

Some more special operations are `**` for powers, `//` for quotient division (divide and then round down to the nearest integer), and `%` for the remainder.

In [7]:
2 ** 3

2 // 3
4 // 3

10 % 3

1

Notice that in this last code block we typed four expressions, each on its own line.

However, Jupyter by default only displays the result of the last line of a code block. To make sure we see the results of all four expressions, we can use Python's `print()` function. See below.

In [8]:
print(2 ** 3)

print(2 // 3)
print(4 // 3)

print(10 % 3)

8
0
1
1
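
The "round down" part of `//` matters most for negative numbers. The short sketch below uses example numbers that are not from the workshop itself. It shows that the quotient is rounded toward the smaller integer, and that `//` and `%` fit together to rebuild the original number.

```python
# Quotient division rounds down (toward the smaller integer), not toward zero
print(7 // 2)     # 3
print(-7 // 2)    # -4, because -3.5 rounds down to -4

# The remainder is what is left over after quotient division
print(7 % 2)      # 1

# Quotient times divisor, plus remainder, gives back the original number
print((7 // 2) * 2 + (7 % 2))    # 7
```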


So far we have only used integers, or numbers without decimal portions.

But of course Python can also deal with numbers with decimals, and these numbers are called `floats`.

In [14]:
print(1.7 + 0.01)

1.71
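
Two float behaviors are worth seeing early. The sketch below uses example values chosen only for illustration and the built-in `round()` function, which this workshop has not introduced. It shows that `/` always produces a float and that some decimal sums are only approximate.

```python
# Regular division always produces a float, even when the result is whole
print(6 / 3)                      # 2.0

# Floats are stored in binary, so some decimal sums are only approximate
print(0.1 + 0.2 == 0.3)           # False

# round() trims a number to a chosen number of decimal places
print(round(0.1 + 0.2, 2))        # 0.3
```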


##### Exercise
Use the `print()` function to output the result of dividing 5 by 7.

## Strings

Another common data type in Python is the String, or a sequence of characters. Strings are surrounded by quotation marks, either double (`"abc"`) or single (`'abc'`).
Below are some examples.

In [19]:
print("123")
print("abc")

123
abc


You can create new strings by putting together two other strings. This is done with the `+` operator.

In [23]:
print("String1" + "&" + "String2")

String1&String2
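
The `+` operator only joins a String to another String. As a small sketch, the example below uses the built-in `str()` and `int()` conversion functions, which are not covered in this workshop, to move between Strings and numbers. The values are made up for illustration.

```python
# "Workshop " + 0 would raise a TypeError, because + cannot join a String and a number
# str() converts the number to a String first
print("Workshop " + str(0))     # Workshop 0

# int() goes the other way and turns a String of digits into a number
print(int("123") + 1)           # 124

# Without the conversion, + joins the Strings instead of adding
print("123" + "1")              # 1231
```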


Python also makes it easy to choose specific characters from a string.

This is done with square brackets `[]` right after a String.

You can select an individual character by enclosing the index of that character in the brackets.

NOTE: Python uses zero-indexing, meaning the first element has index 0, the second element has index 1, etc.

In [31]:
print("abcdefgh"[0])
print("abcdefgh"[3])

a
d


You can also select a range of characters. 

In the square brackets type the index of the first character you want to choose, then a colon `:`, then one more than the index of the last character you want.

This way, the second number you type minus the first number you type equals the number of characters returned.

In [33]:
print("abcdefgh"[0:2])
print("abcdefgh"[3:7])

ab
defg
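
To check the counting rule for ranges, the sketch below uses the built-in `len()` function, which is not introduced in this workshop, to count the characters in a range. It also shows what happens when the second number runs past the end of the String.

```python
# The range [3:7] starts at index 3 and stops just before index 7
print("abcdefgh"[3:7])          # defg

# len() counts characters, and 7 - 3 = 4 of them are returned
print(len("abcdefgh"[3:7]))     # 4

# A second number past the end of the String is simply cut off at the end
print("abcdefgh"[5:100])        # fgh
```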


If you want your range to start from the beginning of the string, leave the first number blank. Similarly, if you want your range to go until the end of the string, leave the second number blank.

In [37]:
print("abcdefgh"[5:])
print("abcdefgh"[:2])
print("abcdefgh"[:])

fgh
ab
abcdefgh


Also, you can start counting from the back of a String using a negative sign.

So, for example, an index of -1 refers to the last character, and -2 refers to the second to last character.

In [62]:
print("abcdefgh"[1:-1])
print("abcdefgh"[1:-6])
print("abcdefgh"[-3:-1])

bcdefg
b
fg


##### Exercise
Print out the result of joining the Strings "Hello" and " World" together and then selecting the first five characters of the new combined String.

## Booleans

Another important data type is the boolean, which is either the value `True` or the value `False`.
These are typed without quotes, differentiating them from Strings.

In [41]:
print(True)
print(False)

True
False


Booleans are usually seen as the result of some sort of test.

To test for equality of numbers and strings, make sure to use two equal signs `==`. To test whether things are NOT equal, use the operator `!=`.

We can also compare values using the following symbols: `<`, `<=`, `>`, `>=`. For numbers, the meaning of these comparators is straightforward. For strings, Python compares character by character using each character's numeric code (its Unicode code point). The comparison is case-sensitive: digits come before uppercase letters, and uppercase letters come before lowercase letters, so `'1' < 'A'` and `'Z' < 'a'` are both `True`.

In [69]:
print(1 == 2)
print(1 != 2)

print('a' < 'b')
print('A' < 'b')

print('1' <= 'a')

False
True
True
True
True
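
String comparisons follow each character's numeric code, which is why case matters. As a sketch, the built-in `ord()` function (not covered in this workshop) reveals those codes. The example Strings are made up for illustration.

```python
# ord() returns the numeric code Python uses to order a character
print(ord("1"))    # 49
print(ord("A"))    # 65
print(ord("a"))    # 97

# Uppercase letters have smaller codes than lowercase letters, so case matters
print("Z" < "a")      # True
print("a" == "A")     # False

# Strings are compared one character at a time, from left to right
print("apple" < "apricot")    # True, decided at the third character: "p" < "r"
```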


Sometimes it is also helpful to chain together multiple tests into one large test. For that we can use the logical operators `and` as well as `or`.

Then `(test1) and (test2)` is `True` if and only if both `test1` and `test2` are true. If at least one of `test1` or `test2` is false, then the overall test is `False`.

Next consider `(test1) or (test2)`. The overall test is `True` if and only if at least one of `test1` or `test2` is `True`. Only if both `test1` and `test2` are `False` is the overall test `False`.

In [80]:
print((1 == 1) and (2 != 1))
print((5 > 1) and (5 < 1))

print(("Test" == "Test") or ("Test" != "Test"))
print(("1" > "2") and ("1" >= "2"))

True
False
True
False
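
The rules for `and` and `or` can be checked by listing every combination of `True` and `False`. This is only a small sketch of the two rules described above.

```python
# All four combinations for "and": only True and True gives True
print(True and True)      # True
print(True and False)     # False
print(False and True)     # False
print(False and False)    # False

# All four combinations for "or": only False or False gives False
print(True or True)       # True
print(True or False)      # True
print(False or True)      # True
print(False or False)     # False
```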


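When comparing booleans, things are a little different.

Instead of using `==` to test for equality, we can use the keyword `is`. Also, instead of using `!=` for inequality, we can use the two keywords `is not`. These keywords check whether both sides are the very same object, which works here because `True` and `False` are each a single, unique object in Python. (The operators `==` and `!=` also work on booleans.)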

In [81]:
print(True is True)
print(True is not False)

True
True
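
As a brief sketch of how these operators behave on the result of a test, the lines below apply `==`, `is`, and `is not` to booleans. The tests `1 < 2` and `3 < 4` are made-up examples.

```python
# == compares values, and it works on booleans too
print(True == True)              # True
print((1 < 2) == (3 < 4))        # True

# "is" asks whether both sides are the very same object
print((1 < 2) is True)           # True
print((1 < 2) is not False)      # True
```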


## Variables

So far we have used each piece of data only once. But for most applications we will want to save values to use later.

This is what variables are for. They give a name to a piece of data so you can refer to or alter it later.

To create a variable, type the name you want to give it, a single equal sign `=` (not the `==` used for testing equality), and the piece of data you want the variable to reference.

In [57]:
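# create a variable, then update it using its own current value
var1 = 1
var1 = var1 + 1

var2 = 3

# the name "total" avoids hiding Python's built-in sum() function
total = var1 + var2
print(total)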

5


Python is a dynamically typed language, meaning you do not have to explicitly say what type of data a variable references. This also means that a variable is very flexible and can change its type whenever you want.

In [56]:
num = 1
print(num)
num = "one"
print(num)

1
one
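
To see the type change happen, the sketch below uses the built-in `type()` function, which this workshop has not introduced, to report the type of data a variable currently references.

```python
# The same variable can reference different types of data over time
num = 1
print(type(num))    # <class 'int'>

num = "one"
print(type(num))    # <class 'str'>

num = 1.5
print(type(num))    # <class 'float'>

num = True
print(type(num))    # <class 'bool'>
```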


## Commenting your code

Our code is gradually becoming more complex. This means that when we come back and look at it later, or if we share our code with someone else, we need to make sure it is properly understood.

This is where commenting is super important. Instead of just writing code we can describe what code does, and usually we do this right above a piece of code.

If you write a `#` symbol at the start of a line in a code block, everything after that symbol on that line will be interpreted by Python as a comment. This means that Python will not try to run the text as code. As you can see below, if you forget the `#` symbol, you will get an error message.

In [59]:
# This is a comment
This is not a comment

SyntaxError: invalid syntax (762433041.py, line 2)
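
Comments do not have to sit on their own line. In the sketch below, the variable names `price`, `quantity`, and `total` are made up for illustration. It shows a comment above some code, a comment after code on the same line, and a line of code switched off by commenting it out.

```python
# A comment on its own line describes the code that follows
price = 4
quantity = 3

total = price * quantity    # a comment can also follow code on the same line
print(total)                # 12

# print("this line is commented out, so it does not run")
```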

## Collections of data: Lists

So far we have only looked at single pieces of data at a time, be that a single number or a single sequence of characters in the form of a String.

But many times, especially for Data Science, we want to look at collections of multiple individual pieces of data.

One of the simplest forms of collections is a list. It is a one-dimensional collection with a finite number of values that are ordered. The same value can appear multiple times in a list.

Lists are created by typing square brackets `[]` with the individual values inside, separated by commas.

In [94]:
# make a list of just numbers
nums = [1, 2, 5, 7]
print(nums)

# make a list of values with different types
otherList = [1, "one", "1"]
print(otherList)

[1, 2, 5, 7]
[1, 'one', '1']


Accessing and changing elements of a list uses a fairly similar strategy to accessing individual characters in a String from before.

Directly after an actual list or a variable referencing a list, use square brackets and indices to select values from the list.

In [92]:
# get first element from list created above
print(nums[0])

# get second element from a new list
print(["apple", "banana", "cherry"][1])

1
banana
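
Changing an element uses the same square brackets, this time on the left side of an equal sign. The sketch below rebuilds the `nums` list from above so it can run on its own. The new value `10` is an arbitrary example.

```python
nums = [1, 2, 5, 7]

# Assigning to an index replaces that element of the list
nums[0] = 10
print(nums)         # [10, 2, 5, 7]

# Negative indices and ranges work just as they do for Strings
print(nums[-1])     # 7
print(nums[1:3])    # [2, 5]

# Unlike lists, Strings cannot be changed this way:
# "abc"[0] = "z" would raise a TypeError
```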
