# SRM 641 Python Week 1

# Introduction to Python and Jupyter Notebooks

## Learning Objectives

By the end of this week, you should be able to:

1. Install Python and create your first Jupyter Notebook.
2. Understand Python, its syntax, and its capabilities.


# Part 1 Launching Jupyter Notebook

To start working with Python, we need to launch a program that will interpret and execute our Python commands. We will use Jupyter Notebook for this course. You may also use Google Colab (https://colab.google/).

"A Jupyter notebook is a document that supports mixing executable code, equations, visualizations, and narrative text. Specifically, Jupyter notebooks allow the user to bring together data, code, and prose, to tell an interactive, computational story" (Barba et al., 2019).

The Jupyter Notebook file format (.ipynb) allows you to combine descriptive text, code blocks, and code output in a single file. When you run the code, the outputs, including plots and tables, appear within the notebook file. You can then export the notebook to a .ipynb, .pdf, or .html file that can be shared with anyone.

To open Jupyter Notebook, you can either use Anaconda Navigator or the Terminal:

## Anaconda Navigator

1. Launch Anaconda Navigator. It might ask you if you’d like to send anonymized usage information to Anaconda developers. 
<br>

2. Find the “Notebook” tab and click on the “Launch” button. Anaconda will open a new browser window or tab with a Notebook Dashboard showing you the contents of your Home (or User) folder.
<br>

<div>
<img src="attachment:image.png" width="800"/>
</div>

3. On your desktop or in your home folder, create a new folder and give it a name such as 'SRM641 Python'. Use this folder to save your files.
<br>


4. To open a new notebook within that folder, click on the “New” button and then select “Python 3”.

<div>
<img src="attachment:image-2.png" width="800"/>
</div>

<br>


## Command Line - Terminal

To access Jupyter Notebook via command line:

1. If you’re using a Unix shell application, such as the Terminal app in macOS, Console or Terminal in Linux, or Git Bash on Windows, execute the following command: ```jupyter notebook```. On Windows, use ```python -m notebook```.
<br>

2. Jupyter Notebook will open in a browser. Follow steps 3 and 4 above to create a folder and start a new notebook.

*For more help see the PDF Getting Started Jupyter Notebook file by Paige uploaded on Canvas.*


## Quick note about Jupyter cells

The first and most salient component of the notebook is the cell. Cells can take one of two forms: *text* or *code*

Code cells are composed of three areas: **the input**, **the display**, and the **output** area. The input area is identified by the In `[ ]:` prompt to the left of the cell.

Between the brackets of the In `[ ]` prompt can be one of three items: a number, an asterisk, or a blank. A number indicates that the cell has been executed, and its value indicates the order of execution. For example, after you execute the first cell in a newly opened notebook, its prompt will normally read In `[1]:`. An asterisk means the cell is currently running, and a blank means the cell has not been run yet.

While you are editing a cell in Jupyter Notebook, run it by pressing 'Shift + Enter' or 'Ctrl + Enter' so that the changes you made take effect.

Use 'Enter' to make new lines inside a cell you are editing.

Code cells
Re-running will execute any statements you have written. To edit an existing code cell, click on it.

Markdown cells
Re-running will render the markdown text. To edit an existing markdown cell, double-click on it.


## Common Jupyter operations

Near the top of the Jupyter notebook window, there is a row of menu options (File, Edit, View, Insert, ...) and a row of toolbar icons (disk, plus sign, scissors, 2 files, clipboard and file, up arrow, ...).

Inserting and removing cells

> Use the "plus sign" icon to insert a cell below the currently selected cell

> Use "Insert" -> "Insert Cell Above" from the menu to insert above

Clear the output of all cells
> Use "Kernel" -> "Restart" from the menu to restart the kernel

> click on "clear all outputs & restart" to have all the output cleared

Save your notebook file locally
> Use "File" -> "Download as" -> "IPython Notebook (.ipynb)" to download a notebook file 

![image-3.png](attachment:image-3.png)

<br>

*For a full list of keyboard shortcuts, click the help button, then the keyboard shortcuts button.*

References:
<br>
https://jupyter-notebook.readthedocs.io/en/latest/notebook.html 
<br>
https://daringfireball.net/projects/markdown/syntax
<br>
https://www.ibm.com/docs/en/watson-studio-local/1.2.3?topic=notebooks-markdown-jupyter-cheatsheet
<br>
https://www.earthdatascience.org/courses/intro-to-earth-data-science/open-reproducible-science/jupyter-python/
<br>


# Part 2 Python Fundamentals


### A note on installing packages
There are two main package installers for Python, **conda** and **pip**.
Pip is the Python Packaging Authority’s recommended tool for installing packages from the Python Package Index (PyPI).
Conda is a cross-platform package and environment manager that installs and manages conda packages from the Anaconda repository. Conda does not assume any specific configuration on your computer and can install Python itself along with Python packages, whereas pip assumes that a Python interpreter is already installed on your computer. Since most operating systems include Python, this is usually not a problem.

Conda is installed when you install Anaconda. To check whether conda is available on macOS or Linux, open a Terminal window and at the prompt type

```conda -V```

If you get a version number (e.g. conda 23.9.0), you are all set! If you get an error, you do not have Anaconda, and it would be a good idea to install it.

If you do have Anaconda, consider updating conda to the latest version:
```conda update conda```

# Basic Data Types

Python knows various types of data. Three common ones are:

-  integers, e.g. the values -2 and 30. The integer (or int) data type indicates values that are whole numbers.
-  floating-point numbers or floats, e.g. numbers with a decimal point such as 3.14 and 30.0.
-  strings or strs (pronounced stirs) are text values, e.g. 'Hello world!', 'apple', and 'SRM 641'.

# Numbers

Numbers in Python can be represented as integers (e.g. 5) or floats (e.g. 5.0). We can perform operations on them:

In [1]:
# Any text preceded by a hash mark (pound sign #) is ignored by the Python interpreter. It is part of a comment.
# Python ignores comments, and you can use them to write notes or remind yourself what the code is trying to do. 
# Just like in R, we use the # sign in Python for comments.

# Addition

4 + 5

9

In [2]:
# Division

2.5/3

0.8333333333333334
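
Addition and division are only two of the available operations. As a brief sketch (the numbers below are arbitrary and chosen only for illustration), the other common arithmetic operators look like this:

```python
# Common arithmetic operators; the operands are arbitrary example values
print(10 - 4)   # subtraction: 6
print(3 * 4)    # multiplication: 12
print(7 / 2)    # true division always returns a float: 3.5
print(7 // 2)   # floor division rounds down to a whole number: 3
print(7 % 2)    # modulo gives the remainder of the division: 1
print(2 ** 3)   # exponentiation (2 to the power of 3): 8
```

Note that `/` returns a float even when both numbers are integers, while `//` keeps the result an integer when both operands are integers.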

# Booleans
Boolean values are written as `True` and `False`. Comparisons and other conditional expressions evaluate to either `True` or `False`, and can be combined with the `and` and `or` keywords.

We can check for equality giving us a Boolean:

In [3]:
# Asking whether 5 is equal to 6; in Python, two equals signs (==) mean "is equal to"

5 == 6

False

In [4]:
# Less than

5 < 6

True

These statements can be combined with logical operators: `not`, `and`, `or`

In [5]:
5 < 6 and not 5 == 6

True

In [6]:
True and True

True

In [7]:
False or True

True

Note that the output is `True` in both orders: `or` evaluates to `True` when at least one of the two values is `True`.

In [8]:
True or False

True

In [9]:
False and False

False

When converted to numbers, `False` becomes 0 and `True` becomes 1

In [10]:
int(False)

0

In [11]:
int(True)

1
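
Because of this conversion, Booleans can take part in arithmetic. The short sketch below is only an illustration of that behavior; it also uses the built-in `bool()` function, which has not been introduced above, to convert in the opposite direction:

```python
# True counts as 1 and False counts as 0 in arithmetic
print(True + True)                # 2
print(int(5 < 6) + int(5 == 6))   # 1, because only the first comparison is True

# bool() converts a number back to a Boolean: 0 is False, other numbers are True
print(bool(0), bool(1))           # False True
```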

# Strings

Using strings or str (pronounced stirs), we can handle text in Python. These values must be surrounded by quotes: single ('...') is the standard, but double ("...") works as well:

In [12]:
'Hello, World!'

'Hello, World!'

We can also perform operations on strings. For example, we can see how long a string is with the built-in Python function `len()`:

In [13]:
len('Hello, World!')

13

We can select parts of the string by specifying the **index**. Note that in Python the 1st character is at index 0:

In [14]:
'Hello, World!'[0]

'H'

String Concatenation: The meaning of an operator may change based on the data types of the values next to it. For example, + is the addition operator when it operates on two integers or floating-point values. However, when + is used on two string values, it joins the strings as the string concatenation operator.

We can concatenate strings with +:


In [15]:
'Hello' + 'World'

'HelloWorld'

In [16]:
# to create a space, add a string containing a single space

'hello' + ' ' + 'world'

'hello world'

If you try to use the + operator on a string and an integer value, Python will not know how to handle this, and it will display an error message:

In [17]:
"Hello" + 1

TypeError: can only concatenate str (not "int") to str

As the TypeError message above shows, you can only concatenate a str with another str. Python thought you were trying to concatenate an integer to the string 'Hello'. You have to explicitly convert the integer to a string before combining them.
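
A minimal sketch of that explicit conversion, using the built-in `str()` and `int()` functions (the values are arbitrary examples):

```python
# str() turns the integer 1 into the string '1', so + can concatenate
print("Hello" + str(1))    # Hello1

# int() goes the other way: the string '41' becomes the number 41, so + adds
print(int("41") + 1)       # 42
```

Which conversion to use depends on the intended result: text joined together, or numbers added.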

The `*` operator multiplies two integer or floating-point values. But when the `*` operator is used on one string value and one integer value, it becomes the string replication operator. 

In [18]:
'Hello' * 5

'HelloHelloHelloHelloHello'

You can’t multiply two strings, and you can’t replicate a string a fractional number of times; both attempts raise a `TypeError`.
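
The sketch below illustrates these cases. It uses `try` and `except`, which are not covered in this section, only so that the cell keeps running after each error; treat it as an illustration rather than something you need to write yet.

```python
# Replication works with an integer, in either order
print('Hello' * 2)   # HelloHello
print(2 * 'Hello')   # HelloHello

# Multiplying two strings raises a TypeError
try:
    'Hello' * 'World'
except TypeError as error:
    print('TypeError:', error)

# Replicating a string by a float also raises a TypeError
try:
    'Hello' * 2.5
except TypeError as error:
    print('TypeError:', error)
```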

# Variables

A variable is like a box in the computer’s memory where you can store a single value. If you want to use the result of an evaluated expression later in your program, you can save it inside a variable.

In [19]:
hello

NameError: name 'hello' is not defined

Note that just typing text causes an error. Errors in Python attempt to give a clue on what went wrong with our code. In this case, we have a `NameError` line which tells us that 'hello' is not defined. This means that the Python interpreter looked for a variable named hello, but it didn't find one.

You store values in variables with an assignment statement. An assignment statement consists of a variable name, an equal sign (called the assignment operator), and the value to be stored. If you enter the assignment statement `weight_kg = 42`, then a variable named `weight_kg` will have the integer value `42` stored in it.

Variables give us a way to store values of any data type. We define a variable using the `variable_name = value` syntax:

In [20]:
weight_kg = 42

A note about naming your variables. A good variable name describes the data or information it contains. The best variable names are descriptive ones. You can name a variable anything as long as it obeys the following three rules:
- It can be only one word with no spaces.
- It can use only letters, numbers, and the underscore (_) character.
- It can’t begin with a number.

Some people prefer **camelcase** for variable names instead of **underscores**; that is, variables `lookLikeThis` instead of `looking_like_this`. 
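
One way to check a candidate name against the three rules above is the string method `isidentifier()`. The names below are made up for illustration:

```python
# isidentifier() reports whether a string follows Python's naming rules
print('weight_kg'.isidentifier())      # True
print('lookLikeThis'.isidentifier())   # True
print('2nd_weight'.isidentifier())     # False: begins with a number
print('course title'.isidentifier())   # False: contains a space
print('total-cost'.isidentifier())     # False: a hyphen is not allowed
```

Note that `isidentifier()` does not check for reserved words such as `for` or `if`, which also cannot be used as variable names.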

In [21]:
# Variables can contain integer or float values

x = 5
y = 10.5
x + y 

15.5

In [22]:
# Variables can also contain str values

course_title = "SRM 641 Python"

In [23]:
# You can use an existing variable to compute a new one

weight_lb = 2.2 * weight_kg
print('weight in kilograms:', weight_kg, 'and in pounds:', weight_lb)


weight in kilograms: 42 and in pounds: 92.4


Variables can be any data type. We can check which one it is with `type()`, which is a function (more on that later):

In [24]:
type(x)

int

In [25]:
type(weight_kg)

int

In [26]:
type(course_title)

str

If we need to see the value of a variable, we can print it using the `print()` function:

In [27]:
print(course_title)

SRM 641 Python


In [28]:
print(weight_kg)

42


### Exercise 1

What are the data types of the following variables?

- planet = 'Earth'
- apples = 100
- distance = 12.5

In [None]:
# Your code here:



# Collection of Items

Lists, tuples, and dictionaries are some of the most frequently used collection types.

## Lists

Lists are variable-length and their contents can be modified in place; that is, they are mutable. You can define them using square brackets `[]` or the `list()` function.

We can store a collection of items in a `list`:

In [30]:
['hello', ' ', 'world']

['hello', ' ', 'world']

The `list` can be stored in a variable. Note that the items in the list can be of different types:

In [31]:
# What if we want to store many values? We need a list!

my_list = ['hello', 3.8, True, 'Python']

In [32]:
# view the list

my_list

['hello', 3.8, True, 'Python']

In [33]:
# check the type

type(my_list)

list

We can see how many elements are in the list with the function `len()`:

In [34]:
len(my_list)

4

We can also use the `in` operator to check if a value is in the `list`:

In [35]:
'world' in my_list

False

In [66]:
# We can also make an empty list and add to it. append() adds a value to the end; + can also be used to join lists

colors = []

colors.append("Green")
colors.append("Blue")
colors.append("Red")

print(colors)

['Green', 'Blue', 'Red']
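
The difference between `append()` and `+` can be sketched as follows. The variable `more_colors` is a name introduced here only for illustration:

```python
colors = ['Green', 'Blue', 'Red']

# + builds a new list and leaves the original unchanged
more_colors = colors + ['Yellow']
print(colors)        # ['Green', 'Blue', 'Red']
print(more_colors)   # ['Green', 'Blue', 'Red', 'Yellow']

# append() changes the existing list in place
colors.append('Yellow')
print(colors)        # ['Green', 'Blue', 'Red', 'Yellow']
```

Note that `+` needs a list on both sides, so a single value must be wrapped in square brackets first.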


We can select items in the `list` just as we did with strings, by providing the index to select:

In [36]:
my_list[0]

'hello'

Python also allows us to use negative values, so we can easily select the last one:

In [37]:
my_list[-1]

'Python'

Another powerful feature of lists (and strings) is slicing. You can select sections of most sequence types by using `slice` notation, which in its basic form consists of `start:stop` passed to the indexing operator `[]`


In [38]:
# We can grab the middle 2 elements in the list:

my_list[1:3]

[3.8, True]

In [39]:
# or index every other one

my_list[::2]

['hello', True]

In [40]:
# or select everything up to (but not including) index 3

my_list[:3] 

['hello', 3.8, True]
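
The same notation works on strings, and the full form is `start:stop:step`. Here is a short sketch using the string from earlier; the variable name `text` is chosen only for this example:

```python
text = 'Hello, World!'

print(text[0:5])    # Hello   (index 0 up to, but not including, 5)
print(text[7:])     # World!  (index 7 to the end)
print(text[-6:])    # World!  (the last six characters)
print(text[::2])    # Hlo ol! (every other character)
print(text[::-1])   # !dlroW ,olleH (a step of -1 reverses the string)
```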

## Tuples

Tuples are similar to lists; however, they can't be modified after creation, i.e. they are immutable. Instead of square brackets, we use parentheses to create tuples:

In [41]:
my_tuple = ('a', 5)
type(my_tuple)

tuple

In [42]:
tup = (4, 5, 6)

In [43]:
tup

(4, 5, 6)

In [44]:
my_tuple[0]

'a'

In [45]:
# immutable tuples can't be modified, so we can't change the first value to 'b'

my_tuple[0] = 'b'

TypeError: 'tuple' object does not support item assignment

However, if an object inside a `tuple` is mutable, such as a `list`, you can modify it in place:

In [46]:
tup = tuple(['foo', [1, 2], True])
tup

('foo', [1, 2], True)

In [47]:
# You can modify the list [1, 2] inside a tuple. Let's say you want to add the value 3 to the list. 
# Elements can be appended to the end of the list with the append method

tup[1].append(3)

In [48]:
#view 

tup

('foo', [1, 2, 3], True)

Lists and tuples are semantically similar (though tuples cannot be modified) and can be used interchangeably in many functions.
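
As a small illustration of that interchangeability, the built-in functions `len()`, `sum()`, and `max()` accept either type, and `list()` and `tuple()` convert between them. The variable names here are hypothetical:

```python
scores_list = [4, 5, 6]
scores_tuple = (4, 5, 6)

# Many built-in functions accept a list or a tuple
print(len(scores_list), len(scores_tuple))   # 3 3
print(sum(scores_list), sum(scores_tuple))   # 15 15
print(max(scores_list), max(scores_tuple))   # 6 6

# To "change" a tuple, convert it to a list, modify it, and convert back
as_list = list(scores_tuple)
as_list.append(7)
updated_tuple = tuple(as_list)
print(updated_tuple)                         # (4, 5, 6, 7)
```

The original `scores_tuple` is untouched; `updated_tuple` is a new tuple.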

### Exercise 2

Create a tuple:
- The first element is an integer of your choice
- The second element is a float of your choice
- The third element is the sum of the first two elements
- The fourth element is the difference of the first two elements
- The fifth element is the first element divided by the second element

Display the tuple. What is the type of the variable that holds it?
What happens if you try to change an item in the tuple?

In [None]:
# Your code here:



## Dictionary

The dictionary or `dict` may be the most important built-in Python data structure. In other programming languages, dictionaries are sometimes called hash maps or associative arrays. A dictionary stores a collection of key-value pairs, where key and value are Python objects. Each key is associated with a value so that a value can be conveniently retrieved, inserted, modified, or deleted given a particular key. One approach for creating a dictionary is to use curly braces `{}` and colons to separate keys and values:

In [49]:
d1 = {"a": "some value", "b": [1, 2, 3, 4]}

In [50]:
d1

{'a': 'some value', 'b': [1, 2, 3, 4]}

In [51]:
# Dicts are useful for storing mappings of key-value pairs. For instance, here is a short dict
# representing a shopping list. The keys are food categories (veggies, fruits, meat),
# and the values are the items or amounts to buy in each category:

shopping_list = {
    'veggies': ['spinach', 'kale', 'beets'],
    'fruits': 'bananas', 
    'meat': 0    
}

# check the type
type(shopping_list)

dict

To access the values associated with a specific key, we use the square bracket notation again:

In [52]:
shopping_list['veggies']

['spinach', 'kale', 'beets']

We can extract all of the keys with the method `keys()`:

In [53]:
shopping_list.keys()

dict_keys(['veggies', 'fruits', 'meat'])

In [54]:
# We can extract all of the values with values()

shopping_list.values()

dict_values([['spinach', 'kale', 'beets'], 'bananas', 0])

In [55]:
# Finally, we can call items() to get back (key, value) pairs

shopping_list.items()

dict_items([('veggies', ['spinach', 'kale', 'beets']), ('fruits', 'bananas'), ('meat', 0)])
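
The examples above only retrieve values. Inserting, modifying, and deleting entries can be sketched as follows; the `'dairy'` key and the new values are made up for illustration:

```python
shopping_list = {
    'veggies': ['spinach', 'kale', 'beets'],
    'fruits': 'bananas',
    'meat': 0,
}

shopping_list['dairy'] = 'yogurt'   # insert a new key-value pair
shopping_list['meat'] = 2           # modify the value of an existing key
del shopping_list['fruits']         # delete a key and its value

print(shopping_list)
# {'veggies': ['spinach', 'kale', 'beets'], 'meat': 2, 'dairy': 'yogurt'}

# Check for a key with `in`, or use get() to supply a fallback value
print('fruits' in shopping_list)                        # False
print(shopping_list.get('fruits', 'not on the list'))   # not on the list
```

Asking for a missing key with square brackets raises a `KeyError`, which is why `in` or `get()` is handy when you are not sure the key exists.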


Note: mutable objects can be modified after creation and immutable objects cannot.

Containers are objects that can be used to group other objects together. The basic container types include:

- `list` (list: mutable; indexed by integers; items are stored in the order they were added)
`[3, 5, 6, 3, 'dog', 'cat', False]`
- `tuple` (tuple: immutable; indexed by integers; items are stored in the order they were added)
`(3, 5, 6, 3, 'dog', 'cat', False)`
- `set` (set: mutable; not indexed at all; items are NOT stored in the order they were added; can only contain immutable objects; does NOT contain duplicate objects)
`{3, 5, 6, 3, 'dog', 'cat', False}`
- `dict` (dictionary: mutable; key-value pairs are indexed by immutable keys; since Python 3.7, items are stored in the order they were added)
`{'name': 'Jane', 'age': 23, 'fav_foods': ['pizza', 'fruit', 'fish']}`

When defining lists, tuples, or sets, use commas (,) to separate the individual items. When defining dicts, use a colon (:) to separate keys from values and commas (,) to separate the key-value pairs.

Strings, lists, and tuples are all sequence types that can use the +, *, +=, and *= operators.
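
A brief sketch of these points, using made-up variable names: a set discards duplicates, and the sequence operators behave the same way on lists, tuples, and strings.

```python
# A set keeps only one copy of each value (the duplicate 3 is dropped)
items = [3, 5, 6, 3, 'dog', 'cat', False]
unique_items = set(items)
print(len(items), len(unique_items))   # 7 6

# += and *= on a list
letters = ['a', 'b']
letters += ['c']
print(letters)                         # ['a', 'b', 'c']
letters *= 2
print(letters)                         # ['a', 'b', 'c', 'a', 'b', 'c']

# += on a tuple builds a new tuple and assigns it to the same name
point = (1, 2)
point += (3,)
print(point)                           # (1, 2, 3)

# *= on a string
word = 'ab'
word *= 3
print(word)                            # ababab
```

The trailing comma in `(3,)` is what makes it a one-item tuple; `(3)` would just be the integer 3.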

## Key Points

- Basic data types in Python include integers, strings, and floating-point numbers.
- Use variable = value to assign a value to a variable in order to record it in memory.
- Variables are created on demand whenever a value is assigned to them.
- Use print(something) to display the value of something.
- Use # some kind of explanation to add comments to programs.
- Built-in functions such as `str()`, `type()`, `int()`, `print()`, `list()` are always available to use.


References:
For more about data types and structure, check out the following resources:
- Python for Data Analysis by Wes McKinney
- Hands-on Data Analysis with Pandas by Stefanie Molin
- Automate the Boring Stuff with Python by Al Sweigart