# Python Refresher - Part 1
----

## Why Python

[Python][1] is an open-source, high-level, general-purpose programming language that was originally developed by [Guido van Rossum][gv]. [What this means][2] is threefold. First, anyone can view (and optionally contribute to) the actual language implementation. Second, the language simplifies many tasks for the programmer, especially when compared to other programming languages like C or Java. Third, Python programs can easily be developed to tackle a wide range of problems, including network communication, scientific calculations, data processing and archiving tasks, and graphical tool development. In addition, the syntax and form of the Python programming language are designed to allow Python programs to be developed quickly, reducing development time and costs while also reducing maintenance costs, since a well-written Python program is easy to comprehend.

![Python logo](https://www.python.org/static/community_logos/python-logo-master-v3-TM.png)

A large number of Python modules, which extend the base programming language, have been developed and are now widely used. These modules, which can simplify the development of new Python programs, can be broadly classified into three types. The first type consists of commonly used modules that are part of the official Python distribution, known as the [standard library][sl], such as the [`math`][i], [`os.path`][ii], [`pickle`][ip], [`sqlite3`][sql], [`bz2`][iii], or [`csv`][iv] modules. The second type consists of modules that are also commonly used, but are not (yet) part of the official standard library. Relevant examples of this type are the [`numpy`][np], [`pandas`][pd], [`matplotlib`][mp], and [`scipy`][sp] modules. The last type consists of modules that are developed by communities for a special purpose and are currently less commonly used (but this may change in time). These include modules like [`seaborn`][sb], [`statsmodels`][sm], or [`nltk`][nl]. Of course, any developer can write a Python module, thus offering a wide array of possible add-on functionality. Using these modules, however, should be carefully balanced against the importance of minimizing software dependencies, which can reduce development and maintenance issues for software engineers.

-----

[gv]: https://www.python.org/~guido/
[i]: https://docs.python.org/3/library/math.html
[ii]: https://docs.python.org/3/library/os.path.html
[iii]: https://docs.python.org/3/library/bz2.html
[iv]: https://docs.python.org/3/library/csv.html
[ip]: https://docs.python.org/3/library/pickle.html
[sl]: https://docs.python.org/3/library/
[sql]: https://docs.python.org/3/library/sqlite3.html
[np]: http://www.numpy.org
[mp]: http://matplotlib.org
[pd]: http://pandas.pydata.org
[sp]: http://www.scipy.org
[sb]: http://web.stanford.edu/~mwaskom/software/seaborn/index.html
[sm]: http://statsmodels.sourceforge.net
[nl]: http://www.nltk.org
[1]: https://www.python.org
[2]: https://en.wikipedia.org/wiki/Python_(programming_language)

### Python History

Python is maintained by the Python Software Foundation and currently comes in [two versions][3]: Python 2 and Python 3. Sometime around the turn of the millennium, Python developers began to consider improvements to the Python language that might produce incompatibilities with the existing Python language. One such change, which was introduced with Python 3, is that the Python programming language is now consistently object-oriented, which means that everything in a Python program or script is now an *object*. These changes were considered necessary to enable the language to continue to grow and develop. The development of this new version, originally entitled Python 3000 or Python 3K, now shortened to just Python 3 (and even sometimes Python3), took a number of years as new ideas were carefully developed and tested, and to provide sufficient time for the existing community to participate in the progression of the Python language.

Now, more than 30 years after Python first appeared in 1991, Python 3 is an improved version of the original Python language and offers a number of [important advances][4]. Python 2, which reached its official end of life in January 2020, is primarily used to maintain backwards compatibility with legacy Python code that is too difficult or expensive to port to the newer version. **In this class, we will exclusively use Python 3** since it represents the present and the future, and all libraries we will use have already been successfully ported to the new version. As you browse different websites, IPython notebooks, or other resources, you should keep this language split in mind and focus primarily on Python 3 material to minimize confusion arising from these language differences.

-----
[3]: https://wiki.python.org/moin/Python2orPython3
[4]: https://docs.python.org/3.0/whatsnew/3.0.html

## Basic Concepts

While Python is a relatively easy language to learn, there are a few basic concepts that need to be reviewed before we begin to discuss the Python programming language. A fundamental concept to remember is that good Python code should be easy to read. To help programmers adhere to this guideline, Python has the following guidelines:

1. White space is important.
2. Names should be descriptive.
3. Code blocks are indented four spaces (not hard tabs) and follow a colon.
4. Lines of code should be limited to fewer than 80 characters.
5. Good code should be thoroughly documented both with comments and descriptive documentation strings.

If lines need to be longer than 80 characters, the recommended practice is to use parentheses to group operations and to use suitable indentation to maintain readability. In the event this is insufficient, a line continuation character (backslash), `\`, can be used to allow code to extend over as many lines as necessary. If that sounds confusing, don't worry, you will see examples where this is demonstrated repeatedly in this course.
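
As a minimal sketch of the two wrapping styles described above, the following example computes the same total twice: once with parentheses and once with backslashes. The variable names and values are hypothetical and chosen only for illustration.

```python
# Hypothetical values used only to illustrate line wrapping
price = 19.99
quantity = 3
tax_rate = 0.07
shipping = 4.50

# Recommended: group the expression in parentheses and indent the continuation
total = (price * quantity
         + price * quantity * tax_rate
         + shipping)

# Alternative: end each unfinished line with a backslash
total_backslash = price * quantity \
    + price * quantity * tax_rate \
    + shipping

# Both styles evaluate the same expression
print(total == total_backslash)
```

The parenthesized form is usually easier to edit, since a stray space after a backslash is a syntax error.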

Python provides a [REPL](https://en.wikipedia.org/wiki/Read–eval–print_loop), which is an abbreviation for Read-Eval-Print Loop, allowing a developer to write, run, and test code iteratively, which aids in quickly developing new programs. Python is also Unicode compliant, so character coding can be specified (for example, UTF-8) at the start of a program (or in string literals), allowing a wider range of characters to be used to write descriptive text.

### Fundamentals to Keep In Mind

1. Python executes code one line at a time.
2. Python is case-sensitive.
3. Python uses indentation to group statements (see 3. above).

----

In [None]:
# White space is important: the unexpected indentation below raises an IndentationError
x = 'some fun stuff'
    y = 45
    print(x, y)

In [None]:
# Spaces instead of tabs - EXCEPT Jupyter Notebook converts a TAB to SPACES for you
# Try it ... hit TAB, then type some stuff
# Go back with the arrow keys to see if there is a single tab or spaces


### Python Identifiers

A Python identifier is a name that is composed of a sequence of letters, numbers, and underscore characters that must adhere to the following rules:

1. The first character must be a letter or an underscore character.
2. Variable and Function names traditionally start with a lowercase letter.
3. Classes traditionally start with an uppercase letter.
4. The identifier cannot be one of the reserved Python keywords, listed in the code block below.

While not explicitly prevented, it is also recommended to avoid names of objects from common Python libraries to minimize name collisions and any resultant confusion. 
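
The rules above can be checked programmatically. The following sketch uses the string method `isidentifier` and the standard library `keyword` module; the candidate names are hypothetical examples.

```python
import keyword

# Hypothetical candidate names chosen for illustration
candidates = ['my_list', '_hidden', '2fast', 'class', 'myFileList']

results = {}
for name in candidates:
    # isidentifier() checks the character rules; iskeyword() checks the reserved words
    results[name] = name.isidentifier() and not keyword.iskeyword(name)
    print(f'{name:12} usable as an identifier: {results[name]}')
```

Here `2fast` fails because it starts with a digit, and `class` fails because it is a reserved keyword.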

In this notebook, we will discuss variables, demonstrate how to use them effectively in a Python program, and use several built-in functions. A variable is simply a name that refers to something else, like a number or a string of characters. In a future lesson, we will discuss functions in more depth. For now, simply consider a function as a chunk of code, wrapped in a name that does something. A classic example is a math function like the `sin` function or a more general function like `print`, which displays text to the screen.

In the following Code cell, we use the built-in `help` function to display the list of Python keywords. These names are reserved and cannot be used as the name of a variable or function.

----

In [None]:
# This is a comment
# Display Python3 keywords

help('keywords')

-----

A Python identifier can be used as the name of a variable, function, class, or module. Python identifiers are case sensitive, so `mylist` is different than `myList`. Writing descriptive identifiers can be beneficial for code readability and subsequent maintenance, thus we often write multi-word identifiers. When combining words, one can either use camel-case format, where each new word after the first is capitalized like `myFileList`. Alternatively, we also can separate words by using underscores like `my_filename_list`.  While both approaches are legal, it is best to be consistent as much as possible. Your variable naming convention depends on many personal preferences and/or biases. For example, I prefer camel-case because my first real programming jobs were done in Java where that is the de-facto standard.

The Python Enhancement Proposal (abbreviated as PEP), [PEP-8](http://legacy.python.org/dev/peps/pep-0008/#introduction), provides a complete discussion of recommended best practices when writing Python code.

### Documentation

The primary mechanism for documenting Python code is to use comments. Python supports two types of comment strings. The first type is a single-line comment, which begins with the hash or pound character `#` and continues until the end of the line. The `#` character can appear anywhere on the line. You can create large comment blocks by placing single-line comments adjacent to each other in a Python program. Here are two examples of single-line comments; the first comment consists of the entire line, and the second comment extends from the preceding command to the end of that line.

```python   
import math
a, b = 3, 4
# Calculate the hypotenuse of a triangle
c = math.sqrt(a**2 + b**2) # Assuming Euclidean Geometry
```

The second type of comment is a multi-line comment, which is technically a string literal that begins and ends with either three single quote characters, `''' comment text '''`, or three double quote characters in a row: `""" comment text """`. This comment can easily extend over multiple lines. When such a string is the first statement in a function, class, or module, Python stores it as that object's *docstring*, which is how documentation is provided for functions and classes. Here is an example of a multi-line comment string:

```python   

'''
This multi-line comment can provide useful information
for a function, class, or module.

This also allows whitespace to be used to help
write more clearly.
'''
```
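
To illustrate how a multi-line string becomes a docstring, here is a minimal sketch using a hypothetical function named `hypotenuse`. The string placed as the first statement in the function body is stored in the function's `__doc__` attribute, which is the text that `help` displays.

```python
def hypotenuse(a, b):
    """Return the length of the hypotenuse of a right triangle.

    a and b are the lengths of the two shorter sides.
    """
    return (a**2 + b**2) ** 0.5


# The docstring is stored on the function itself
print(hypotenuse.__doc__)
print(hypotenuse(3, 4))
```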

The built-in `help`  function can be used to view *docstring* comments for different functions, classes, or other Python features, as shown in the following code block. As the adjacent comment suggests, you should execute this function and change the argument to the `help` function to view documentation for other Python language components like `int`, `complex`, `math`, or `list`.

Another built-in function that you will frequently use is the `print` function, which, if necessary, converts its arguments to a string and displays the resulting string to *STDOUT*, which is generally the display. We will discuss functions, like the `print` function, in more detail later.

-----

In [None]:
# First, run the following line of code
help(print) # Then, try changing print to something different like int, math, or list

In [None]:
# Python is case sensitive
my_int = 42
my_Int = 13
print(f'my_int={my_int} and my_Int={my_Int}')

In [None]:
# Change my_int
my_int += 1
print(f'my_int={my_int} and my_Int={my_Int}')

-----

<font color='red' size = '5'> Student Exercise </font>

In the empty **Code** cell below, write a simple Python comment, write a multi-line comment, and use the `help` function to display the built-in documentation for the `dict` and `str` classes.

-----

-----

## Data Types

Python supports several standard data types including integer, floating-point, and character data, or a string of characters (this data type will be discussed in more detail later). Python is a dynamically-typed language, which means we do not need to **declare** the data type for a variable. Instead, the Python interpreter determines the data type for a variable by the type of data the variable holds. For example, if a variable holds an integer number like 1, 2, 3, or -10, the variable is of type `int`. Likewise, if the variable contains a real number like 3.14, the variable is of type `float`, which is shorthand for floating-point.

The Python language has two other special data types. The first is a Boolean type, which can take one of two special values: `True` or `False`. We will explore the Boolean data type in more detail later. The second is a special type to indicate a null value, which in Python is encoded as `None`.

In the following Code cells, we demonstrate how Python handles these basic data types. Remember to make changes and rerun these cells to better understand the Python type system.

-----

In [None]:
# An integer value
x = 1
print(f'The variable x is {x} and has type {type(x)}')

In [None]:
# A floating-point value
pi = 3.14159
print(f'The variable pi is {pi} and has type {type(pi)}')

In [None]:
# A string of characters
my_string = 'Hello!'
print(f'The variable my_string is {my_string} and has type {type(my_string)}')

In [None]:
# A string of characters can also be enclosed in double-quotes
my_other_string = "Howdy!"
print(f'The variable my_other_string is {my_other_string} and has type {type(my_other_string)}')

In [None]:
# A Boolean value
y = True
print(f'The variable y is {y} and has type {type(y)}')

In [None]:
# A None value
z = None
print(f'The variable z is {z} and has type {type(z)}')

In [None]:
# What happens if we call type() on a built-in function like print?
type(print)

# After running that line of code, come back and change print to
# something else, like math or "Howdy, y'all!"

----

### Python Operators

Python supports the [basic mathematical operators][1]. The following table presents the basic mathematical operators, in order of precedence from highest to lowest (operators in the same table cell have the same precedence):

| Operator                         | Description                              | Example                                  |
| -------------------------------- | ---------------------------------------- | ---------------------------------------- |
| `()`                             | Parentheses for grouping                 | `(2 + 3)`                                |
| `**`                             | Exponentiation                           | `2**3`                                   |
| `*`<br/> `/`<br/> `//` <br/> `%` | Multiplication<br/> Division<br/>Integer Division<br/> Remainder | `2*3.1` <br/> `3.2 / 1.2 ` <br/> `5//2` <br/> `5%2` |
| `+` <br/> `-`                    | Addition <br/> Subtraction               | `1.45 + 3.14` <br/> `5.3 - 2.125`        |


When computing a quantity, you will often want to assign the value to a variable. This is done by using the assignment operator `=`. On the other hand, if you want to test whether two values are the same, you use the equality operator, `==`. In addition, Python provides augmented assignment operators that combine a basic mathematical operator (`+`, `-`, `*`, `/`, `**`, `//`, or `%`) with the assignment operator: `+=`, `-=`, `*=`, `/=`, `**=`, `//=`, and `%=`; this can simplify and thus clarify some expressions. As a simple example of an augmented assignment operator, the following Python expressions are equivalent:

```python
a = a + 1
a += 1
```

In the preceding example code, we have introduced the use of variables to hold the result of a calculation. Python is a dynamically-typed language; thus we do not need to first declare the variable and its type before using it. If the variable is reused and assigned a different value, the variable takes on a new type. Python has a built-in `type` function that can always be used to ascertain the underlying data type of a variable or any other legal Python construct, as shown in the previous Code cells.
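
The ideas in this section can be sketched in a few lines. The example below, which uses hypothetical variable names, shows augmented assignment, the effect of operator precedence, and how a variable takes on a new type when it is assigned a different kind of value.

```python
# Augmented assignment: both forms add 1 to the current value
a = 10
a = a + 1
a += 1
print(a)            # 12

# Exponentiation is applied first, then multiplication, then addition
result = 2 + 3 * 2**2
print(result)       # 14

# Parentheses override the default precedence
grouped = (2 + 3) * 2**2
print(grouped)      # 20

# Reassigning a variable can change its type
value = 7
print(type(value))
value = 7 / 2
print(type(value))
value = 'seven'
print(type(value))
```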

-----
[1]: https://en.wikibooks.org/wiki/Python_Programming/Basic_Math

<font color='red' size = '5'> Student Exercise </font>

In the empty **Code** cell below, write a simple Python script to calculate the number of minutes in one week. Use a variable to hold the calculation and print the answer within this notebook.

-----

-----

Python also supports other operators that are used when working with [Boolean][1] data or to perform [bit-wise operations][2]. For conciseness, we do not discuss these operators in this notebook.

In the next set of Code cells, we present several examples that demonstrate how to use the basic Python mathematical operators to compute different expressions. **These blocks are meant to be executed, modified, and re-executed!**

-----

[1]: https://en.wikibooks.org/wiki/Python_Programming/Operators#Boolean
[2]: https://wiki.python.org/moin/BitwiseOperators

In [None]:
# Division - Make a note of the resulting value!
4 / 2

In [None]:
# Division again
5 / 2

In [None]:
# If we want to do integer division we use // instead
5 // 2

In [None]:
# And to get just the remainder
5 % 2

In [None]:
# Using PEMDAS, what will be the result?
2.5 * 4 / 3 * (2 + 3 + 5) - 3**2

----

# Data Structures in Python
----

We have already seen *built-in* numeric data types in Python: `int` and `float`. We have also seen `str` for strings, which is technically called a *text sequence type*. First, we will revisit the `str` data type. Then, we will explore a few more built-in data types that will prove very useful: `list`, `tuple`, `range`, `set`, and `dict`.

-----

## Sequence Types

### Strings - the `str` data type

The `str` data type is a special sequence type - a *text* sequence type. Recall that we create strings by providing a sequence of zero or more characters enclosed in either a pair of single quote characters, `'`, or a pair of double quote characters, `"`.

In [None]:
# Create a couple of strings, print them and their types out
string1 = 'My first string!'
print(f'string1 = {string1}')
print(f'   type = {type(string1)}')

string2 = '3000'
print(f'\nstring2 = {string2}')
print(f'   type = {type(string2)}')

-----
We will often need to manipulate string data. Luckily, there are several useful methods for strings. We look at several next.

-----

### Concatenate Strings

In [None]:
# Concatenate two strings together
# The + operator is "overloaded", so it works on strings in addition to numbers
string1 + string2

### Repeating Strings

In [None]:
# Repeating strings
# The * operator is also overloaded, allowing it to work on strings
string2 * 4

### Extracting Pieces of Strings

To extract characters from a string, you pass an *index* number inside of square brackets `[]`. Indexing starts at 0. So, to get the first character from the string `string1` you would issue the command `string1[0]`. 

In [None]:
# We may want to extract parts of a string
# Get the first 2 characters of string1
string1[0:2]

In [None]:
# Get the last character of string1
string1[-1]

In [None]:
# We can also check to see if a substring exists (or not) in the string
# We use the `in` operator
# Check to see if the word `first` is in string1
'first' in string1

In [None]:
# Check to see if `hello` is in string2
'hello' in string2

In [None]:
# Is hello *not in* string2?
'hello' not in string2

-----

### Other Useful String Functions and Methods

Let's explore a few of the most commonly used string manipulations that we will use.

### Length of Strings

In [None]:
# How long is a string?
# Use the len() function
print(f'length of string1 = {len(string1)}')
print(f'length of string2 = {len(string2)}')

### Changing the Case of Strings

In [None]:
# What if we want to convert everything to lower case or upper case
# Convert string1 to lower
print(string1.lower())

# Convert string1 to upper
print(string1.upper())

## NOTE: Neither of these methods modifies the original string1. Instead, they return a new string.

### Finding Substrings

In [None]:
# Create a string
string3 = 'I think text analysis is fun because text is where the hidden messages are.'

# Does string3 start with "I"?
print(string3.startswith('I'))

# What about case? Test with "i"
print(string3.startswith('i'))

# Does string3 end with "text"?
print(string3.endswith('text'))

In [None]:
# Where does the first occurrence of "text" occur in string3?
string3.find('text')

In [None]:
# Replace the word "text" with "TEXT" for all occurrences
print(string3.replace('text', 'TEXT'))

# Replace "text" with "TEXT" once - just the first occurrence
print(string3.replace('text', 'TEXT', 1))

----

You may also want to count the number of occurrences of a substring within a string. You can use the `count` method.

*Thought and Real Exercise:* Are there any other ways to accomplish the same task?

In [None]:
# How many times does "text" show up in string3?
# We are going to convert everything to lower case and then count
string3.lower().count('text')

-----

<font color='red' size = '5'> Student Exercise </font>

In the **Code** cell below, count the number of occurrences of the word 'text' in `string3` using an approach other than the one shown above.

In [None]:
# YOUR CODE HERE


-----

<font color='red' size = '5'> Student Exercise </font>

In the **Code** cell below count the number of occurrences of the word 'is' in `string3`.

In [None]:
# How many times 'is' shows up in the variable string3


----

### Raw and Formatted String Literals

We can include newline and tab characters within a string by using `\n` and `\t`, respectively. If you want to have the characters `\n` show up in the string instead of being replaced with a newline, then you have two options:

1. Escape it by adding an additional backslash: `\\n`.
2. Make the string a *raw string literal* by preceding the opening quotation mark with the letter `r`. 

In [None]:
# Create a string with a new line character in it
s1 = 'String over \n two lines?'
print(s1)

In [None]:
# Create a string with escaped new line character
s2 = 'String over \\n two lines?'
print(s2)

In [None]:
# Create a raw literal string
s3 = r'String over \n two lines?'
print(s3)

#### Formatted Literal Strings

To help with formatting when printing out strings, Python provides the concept of *formatted string literals*, also called f-strings. You can include the value of Python expressions inside a string by prefixing the string with an `f` or `F` and writing expressions as `{expression}`. 

In [None]:
# Create an f-string
myVar = 3*3
myFString = f'The value of myVar is {myVar}'
print(myFString)

myFString2 = f'The value of myVar with some formatting is {myVar:0.2f}'
print(myFString2)

-----

## Other Sequence Types

In addition to the text sequence type of `str`, there are three others that are built-in sequence types: `list`, `tuple`, and `range`.

### The `list` Sequence Type

A `list` is what you get back, for example, when you use `str.split()` to break a string up into substrings using whitespace as a delimiter. So, what exactly is a `list`? A `list` is an ordered, *mutable* collection of objects. *Mutable* means you can make changes to it: adding, deleting, or changing the objects in the collection. In Python, a `list` can contain different data types.

#### Creating Lists

You create a list by enclosing data inside square brackets, `[]`, and separating each item with a comma. Let's create a few different lists.

In [None]:
# Create a list that only contains integers
intList = [2, 4, 6]
print(intList)
print(type(intList))

In [None]:
# Create a list that contains integers and floats
numList = [2, 4.4, 6, 8.8]
print(numList)
print(type(numList))

In [None]:
# You can put strings in the list too
strAndNumList = ['one', 2, 3.0]
print(strAndNumList)
print(type(strAndNumList))

In [None]:
# You can even create a list of lists
twoDList = [[1,1], [2,2], [3,3]]
print(twoDList)
print(type(twoDList))

#### Retrieving Elements of Lists

We've already seen how to retrieve characters out of a string. The process for a `list` is the same: we access an element of the list by typing the name of the list followed by the *index* of the element you want inside square brackets. **Indexing starts at 0.** For example, to retrieve the first element of the list `strAndNumList`, which is "one", we would use `strAndNumList[0]`. Let's try it.

In [None]:
# Get the first element of strAndNumList
print(strAndNumList[0])
print(type(strAndNumList[0]))

In [None]:
# To get the last element you can use the index -1
# This implies that we can count from either the beginning (0) or the end (-1)
strAndNumList[-1]

#### Slicing Lists

If we want more than a single element of a list, that is also quite easily done. The syntax is `listName[start:end:step]`, where `start` is the index of the first element we want to retrieve (inclusive lower bound), `end` is one more than the index of the last element we want to retrieve (exclusive upper bound), and `step` is the gap between indices (default gap is 1).
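
As a brief sketch of the `start:end:step` syntax, the following example slices a hypothetical list of numbers. Omitting `start` or `end` means "from the beginning" or "to the end", respectively.

```python
# Hypothetical list used only to illustrate start:end:step
numbers = [10, 20, 30, 40, 50, 60]

print(numbers[1:4])    # indices 1, 2, and 3: [20, 30, 40]
print(numbers[:3])     # first three elements: [10, 20, 30]
print(numbers[3:])     # index 3 to the end: [40, 50, 60]
print(numbers[::2])    # every second element: [10, 30, 50]
print(numbers[::-1])   # a reversed copy: [60, 50, 40, 30, 20, 10]
```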

In [None]:
# Create a new list
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

In [None]:
# What will this print?
print(f'letters[0:2]   : {letters[0:2]}')

In [None]:
# What will this print?
print(f'letters[2:2]   : {letters[2:2]}')

In [None]:
# What will this print?
print(f'letters[:2]    : {letters[:2]}')

In [None]:
# What will this print?
print(f'letters[4:]    : {letters[4:]}')

In [None]:
# What will this print?
print(f'letters[0:8:2] : {letters[0:8:2]}')

In [None]:
# What will this print?
print(f'letters[8:0:-2]: {letters[8:0:-2]}')

#### Modifying List Elements

To modify a single element of a list, simply reference that index and assign a different value to it. For example, to change the letter "a" to "first" in the `letters` list from above, we would type `letters[0] = "first"`. To change multiple elements at once, you can assign the new values using a list slice.

In [None]:
# Change "a" to "first" in letters
letters[0] = 'first'
print(letters)

In [None]:
# Now, change the last two elements using a list slice
# We are counting backwards now: start at -1, end at -3 (exclusive, remember), use step = -1
# Verify it's the two we want
print(letters[-1:-3:-1])

In [None]:
# now change them
letters[-1:-3:-1] = ['last', 'second to last']
print(letters)

#### Copying Lists

If you try copying a list, say `myList`, to a second list called `yourList` with the command `yourList = myList`, then you have not created a copy at all. You have simply created a new variable (or symbol, if you will) called `yourList` that points to the exact same data as `myList` in the underlying memory space. Therefore, when you make changes to `myList`, those changes will show up in `yourList` and vice versa. If you want an independent list, then you need to create a copy by using the method `copy`. Note that `copy` makes a *shallow* copy: the new list is a separate object, but any nested mutable objects inside it (such as inner lists) are still shared. For a fully independent *deep* copy of nested data, use `copy.deepcopy` from the standard library `copy` module.
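
The following sketch, which uses a hypothetical nested list, contrasts plain assignment, the `copy` method, and `copy.deepcopy` from the standard library. The difference between the last two only becomes visible when the list contains other mutable objects.

```python
import copy

# Hypothetical nested list used for illustration
original = [[1, 2], [3, 4]]

alias = original                  # a second name for the same list object
shallow = original.copy()         # a new outer list that shares the inner lists
deep = copy.deepcopy(original)    # a new outer list with new inner lists

original.append([5, 6])   # changes the outer list
original[0][0] = 999      # changes one of the inner lists

print(f'alias  : {alias}')     # sees both changes
print(f'shallow: {shallow}')   # sees only the inner-list change
print(f'deep   : {deep}')      # sees neither change
```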

In [None]:
# Create myList, print it out
myList = [1, 2, 3, 4, 5]
print(f'myList  : {myList}')

# Create yourList and print it out
yourList = myList
print(f'yourList: {yourList}')

In [None]:
# Now change the first element of myList
print('Changing the first element of myList to 999 ...')
myList[0] = 999
print(f'myList  : {myList}')
print(f'yourList: {yourList}')

In [None]:
# Try the other direction
print('Changing the last element of yourList to 999 ...')
yourList[-1] = 999
print(f'myList  : {myList}')
print(f'yourList: {yourList}')

In [None]:
# Let's try an independent copy instead
myNewList = [2, 4, 6, 8]
print(f'myNewList  : {myNewList}')

# Create yourNewList with .copy()
yourNewList = myNewList.copy()
print(f'yourNewList: {yourNewList}')

In [None]:
# Changing the first element of myNewList
print('Changing the first element of myNewList to -999 ...')
myNewList[0] = -999
print(f'myNewList  : {myNewList}')
print(f'yourNewList: {yourNewList}')

In [None]:
# Try the other direction
print('Changing the last element of yourNewList to -999 ...')
yourNewList[-1] = -999
print(f'myNewList  : {myNewList}')
print(f'yourNewList: {yourNewList}')

#### Other List Operations

We've already seen how to find how many elements are in a `list` by using the function `len`.  We can join two lists together by using the `+` operator. Similarly, we can use `*` to make copies of a list and append them to the end, thus duplicating lists *n* times. Other various helpful methods include `append`, `insert`, `remove`, `sort`, and `reverse` among others.

In [None]:
# Concatenate two lists
bigList = myNewList + yourNewList
print(bigList)

In [None]:
# Duplicate list
myNewList3Times = myNewList * 3
print(myNewList3Times)

In [None]:
# Append a new element to bigList
bigList.append('New Element')
print(bigList)
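
The remaining methods named above can be sketched on a small, hypothetical list of scores. Each of these methods modifies the list in place and returns `None`.

```python
# Hypothetical list used for illustration
scores = [70, 95, 82]

scores.insert(1, 88)   # insert 88 at index 1: [70, 88, 95, 82]
scores.remove(95)      # remove the first occurrence of 95: [70, 88, 82]
scores.sort()          # sort in ascending order: [70, 82, 88]
scores.reverse()       # reverse the order: [88, 82, 70]
print(scores)
```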

----

### Tuples

A `tuple` is a collection that is ordered and *immutable*. Creating a `tuple` is very similar to creating a `list` except you use parentheses `()` instead of square brackets, `[]`. The process of accessing elements of a `tuple` is identical to that of a `list`. You need to be aware of `tuple`s because some functions either return them or require them in various packages/modules that you will encounter. Let's try it.

In [None]:
# Create a tuple
t = (1, 2, 3)
print(t)
print(type(t))

In [None]:
# Get the first element of the tuple t
print(t[0])

In [None]:
# Try to change the first element (this raises a TypeError because tuples are immutable)
t[0] = 999
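
As one example of a function that returns a `tuple`, the built-in `divmod` returns a quotient and a remainder together. The sketch below also shows that a tuple can be unpacked into separate names and that the immutability error can be caught rather than stopping the program.

```python
# divmod returns the quotient and remainder together as a tuple
result = divmod(17, 5)
print(result, type(result))

# A tuple can be unpacked into separate names
quotient, remainder = result
print(f'quotient = {quotient}, remainder = {remainder}')

# Assigning to an element of a tuple raises a TypeError
try:
    result[0] = 999
except TypeError as err:
    print(f'TypeError: {err}')
```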

----

### Range

A `range` represents an immutable sequence of numbers and is commonly used for looping a specific number of times in a `for` loop. You call `range(stop)` where `stop` represents the number of elements you want in the sequence. By default, `range` starts counting at 0. You can change this behavior using the other constructor call, `range(start, stop[, step])`. The optional `step` argument defaults to 1.

In [None]:
# Call a few different ones to see how it works
print(range(10))

In [None]:
# Okay, that didn't tell me much
# Let's wrap it in a list and then print it out
print(list(range(10)))

In [None]:
# Change start to 1 ... notice stop is EXCLUSIVE
print(list(range(1, 10)))

In [None]:
# Count by 2s starting at 2 and going up to and INCLUDING 10
print(list(range(2, 11, 2)))
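
Since a `range` is most often used to drive a `for` loop, here is a minimal sketch that sums the numbers 1 through 5. The variable names are hypothetical.

```python
# Loop over 1, 2, 3, 4, 5 (the stop value 6 is exclusive)
total = 0
for i in range(1, 6):
    total += i
    print(f'i = {i}, running total = {total}')

# A range knows its own length without building a list
print(len(range(2, 11, 2)))
```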

----
### Sets

Another useful data structure is a `set`, which represents a collection of *distinct* elements. You can define a `set` by listing its elements between curly braces:

```python
mySet = {2, 3, 5, 7}
```

Unfortunately, that does not work for empty sets, as `{}` means an empty dictionary (see below). If you want an empty `set`, then you need to use `set()`. For example:

```python
s = set()  # Create an empty set
s.add(1)   # Add the element 1 to the set
s.add(2)   # Add the element 2 to the set
s.add(2)   # s did not change!
```

You often use sets for two main reasons. The first is that the `in` operator is very fast when working with sets. If we have a large collection of items that we want to use for a membership test, a set is more appropriate than a list. The second reason is to find the **distinct** items in a collection (e.g., a list).

In [None]:
# Create mySet that contains prime numbers
mySet = {2, 3, 5, 7}
print(f'{mySet} has type {type(mySet)}')

In [None]:
# This will not create an empty set, but rather a dictionary
not_a_set = {}
print(f'{not_a_set} has type {type(not_a_set)}')

In [None]:
# This will create an empty set
s = set()
print(f's is {s} and has type {type(s)}')

In [None]:
# Add some elements to s
s.add(1)
print(f's is now {s}')
s.add(2)
print(f's is now {s}')

# Try to add 2 again
s.add(2)
print(f's is now {s}')
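
Both uses of a `set` mentioned above can be sketched with a hypothetical list that contains duplicates: converting the list to a set keeps only the distinct items, and the `in` operator then tests membership.

```python
# Hypothetical list with duplicate entries
tickers = ['AAPL', 'MSFT', 'AAPL', 'GOOG', 'MSFT']

# Converting to a set keeps only the distinct items
unique = set(tickers)
print(len(tickers), len(unique))

# Membership test using the in operator
print('GOOG' in unique)
print('IBM' in unique)

# A set has no guaranteed order, so sort it for a stable display
print(sorted(unique))
```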

-----

<font color='red' size = '5'> Student Exercise </font>

Complete the following tasks in the empty **Code** cell below.

1. Create a new `list` called `theList` that contains the odd numbers from 1 to 9.
    1. Challenge: Can you complete this task using `range`?    
2. Print out the first 2 elements of `theList`.
3. Print out the last 2 elements of `theList`.
4. Make a copy of `theList`, call it `reversedList`, reverse the elements of it, and print it out.
    1. Make sure you do **not** change the order of the original `theList`.
5. Print out the combined list of `theList` and `reversedList`.

-----

-----

## Dictionaries

One of the most useful built-in data types in Python is the dictionary or the `dict` type. A dictionary is a *mutable* collection whose elements are **key-value** pairs. (Since Python 3.7, a dictionary also preserves the order in which its keys were inserted.) Instead of being indexed by a range of numbers (like a `list` or `tuple`), a dictionary is indexed by *keys*, which can be any immutable type. For example, strings and numbers can always be keys. You **cannot** use `list`s as keys since they can be modified in place. The values can be any valid data type.

It is best to think of dictionaries as *key:value* pairs with the requirement that keys are unique. To create a dictionary, you place comma-separated key:value pairs inside of curly braces, `{}`.

In [None]:
# Create an income statement dictionary
incomeStmt = {'Revenue': 100,
              'COGS': 52,
              'Gross Margin': 45,
              'SG&A': 40,
              'Net Income': 5}

print(incomeStmt)

In [None]:
# Retrieve an element using the key
incomeStmt['COGS']

In [None]:
# To add a key-value to a dictionary, assign the value to a new key
# Add "Fiscal Year": 2018
incomeStmt['Fiscal Year'] = 2018
print(incomeStmt)

In [None]:
# To change a value, access it using the key and reassign
# Change the fiscal year to 1998
incomeStmt['Fiscal Year'] = 1998
print(incomeStmt)

In [None]:
# Get the keys of the dictionary as a list
list(incomeStmt)

In [None]:
# You can use the `in` operator to determine if the key exists
'COGS' in incomeStmt

In [None]:
'cogs' in incomeStmt

In [None]:
# To get all the items as an iterable object, you call .items()
incomeStmt.items()

In [None]:
# As a preview let's loop through the dictionary
for k, v in incomeStmt.items():
    print(f'key: {k:15}value: {v}')
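
The rules about dictionary keys can be checked with a short sketch. The dictionary below is hypothetical: it uses tuples as keys, shows that a `list` is rejected as a key, and shows that assigning to an existing key replaces the value rather than adding a duplicate.

```python
# Hypothetical dictionary keyed by (year, quarter) tuples
sales = {(2018, 'Q1'): 25, (2018, 'Q2'): 30}
print(sales[(2018, 'Q2')])

# A list cannot be used as a key because it is mutable
try:
    sales[[2018, 'Q3']] = 28
except TypeError as err:
    print(f'TypeError: {err}')

# Assigning to an existing key replaces its value; keys stay unique
sales[(2018, 'Q1')] = 27
print(len(sales))
print(sales)
```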

-----

## Ancillary information

The following links are to additional documentation that you might find helpful in learning this material. 

1. The official [Python3 Tutorial][1]
2. An official guide to Python for [Beginners][2]
3. The book [*Think Python*][3] for Python3 provides a comprehensive view of Python for data science. 
4. The official Python tutorial about [formatting strings][4].
5. A nice post about [f-strings][5].
6. The official Python tutorial about [data structures][6].


-----
[1]: https://docs.python.org/3/tutorial/index.html
[2]: https://www.python.org/about/gettingstarted/
[3]: http://greenteapress.com/wp/think-python-2e/
[4]: https://docs.python.org/3/tutorial/inputoutput.html
[5]: https://realpython.com/python-f-strings/
[6]: https://docs.python.org/3/tutorial/datastructures.html
-----

**&copy; 2021 - Present: Matthew D. Dean, Ph.D.   
Clinical Associate Professor of Business Analytics at William \& Mary.**