<a href="https://colab.research.google.com/github/romerocruzsa/python-basic-training/blob/colab-uploads/PythonBasics_Part2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Copyright 2020 Google LLC.

*Changes made subject to discretion of revision author, Sebastián A. Cruz Romero*

In [1]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Python Basics - Part 2

Python is the most common language used for machine learning. It is an approachable yet versatile language that's used for a variety of applications.

It can take years to learn all the intricacies of Python, but luckily you can learn enough Python to become proficient in machine learning in a much shorter period of time.

This Colab is a quick introduction to the core attributes of Python that you'll need to know to get started. This is only a brief peek into the parts of the language that you'll commonly encounter as a data scientist. As you progress through this course, we'll introduce you to the Python concepts you'll need along the way.

If you already know Python, this lesson should be a quick refresher. You might be able to simply skip to the exercises at the bottom of the document.

If you know another programming language, you will want to pay close attention because Python is markedly different from most popular languages in use today. If you are new to programming, welcome! Hopefully this lesson will give you the tools you need to get started with data science.

### **This notebook will cover the following topics:**
1. Basic Data Structures
  1. Lists
  2. Tuples
  3. Dictionaries

## Lists

So far, the data types we've seen can be thought of as singular entities. We've seen:

* strings
* integers
* floating-points
* booleans

Often, you'll find yourself needing to work with multiple data elements together. There are several options for organizing a collection of data into a data structure. One option is to use a list.

A list is an ordered sequence of values, written between square brackets and separated by commas.

In [2]:
[9, 8, 7, 6, 5]

[9, 8, 7, 6, 5]

The values in a list don't need to have the same data type.

In [3]:
[True, "Shark!", 3.4, False, 6]

[True, 'Shark!', 3.4, False, 6]

You can assign a list to a variable.

In [4]:
my_list = [True, "Shark!", 3.4, False, 6]
my_list

[True, 'Shark!', 3.4, False, 6]

You can also index a list and take slices from it, just like you can with a string. Conceptually, a string is a sequence of characters, much like a list is a sequence of values. Indexes start at 0, and a slice includes its start index but stops just before its end index.

In [5]:
print(my_list[3])
print(my_list[3:])

False
[False, 6]
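A few more indexing and slicing forms are worth seeing side by side. The sketch below uses a separate, illustrative variable named `sample_list` (not part of the original notebook) so that it does not disturb `my_list`.

```python
# `sample_list` is a hypothetical name used only for this illustration.
sample_list = [True, "Shark!", 3.4, False, 6]

last_item = sample_list[-1]      # negative indexes count from the end
first_two = sample_list[:2]      # from the start up to, but not including, index 2
middle = sample_list[1:4]        # items at indexes 1, 2, and 3
every_other = sample_list[::2]   # a step of 2 takes every second item

print(last_item)
print(first_two)
print(middle)
print(every_other)
```

Slicing always returns a new list, so `sample_list` itself is unchanged by any of these operations.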


Indexing can also be used to selectively replace items in a list.

In [6]:
print(my_list)
my_list[1] = "Wolf!"
print(my_list)

[True, 'Shark!', 3.4, False, 6]
[True, 'Wolf!', 3.4, False, 6]


Lists have other interesting features. For example, the built-in `sort()` method sorts a list in place, which means it reorders the existing list rather than creating a new one.

In [7]:
number_list = [4, 2, 7, 9, 3, 5, 3, 2, 9]
number_list.sort()
number_list

[2, 2, 3, 3, 4, 5, 7, 9, 9]
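A common stumbling block is the difference between the `sort()` method and the built-in `sorted()` function. The short sketch below contrasts the two; the variable names are illustrative only.

```python
# `scores` is a hypothetical list used only for this illustration.
scores = [4, 2, 7, 9, 3]

# sorted() returns a new list and leaves the original untouched.
ascending = sorted(scores)
descending = sorted(scores, reverse=True)
print(scores)       # still in the original order

# list.sort() reorders the list in place and returns None.
result = scores.sort()
print(scores)
print(result)
```

Because `sort()` returns `None`, writing `scores = scores.sort()` would replace the list with `None`. Use `sorted()` when you need to keep the original order around.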

You can append an item to a list using `append()`.

In [8]:
letter_list = ["a", "b", "c"]
letter_list.append("d")
letter_list

['a', 'b', 'c', 'd']

You can append multiple items to a list using `extend()`.

In [9]:
letter_list.extend(["e", "f", "g"])
letter_list

['a', 'b', 'c', 'd', 'e', 'f', 'g']
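The difference between `append()` and `extend()` shows up most clearly when the argument is itself a list. Here is a small sketch with made-up variable names.

```python
# Hypothetical lists used only for this illustration.
appended = ["a", "b"]
appended.append(["c", "d"])    # adds the whole list as ONE nested item

extended = ["a", "b"]
extended.extend(["c", "d"])    # adds each item of the argument separately

# The + operator builds a new list instead of modifying either operand.
combined = ["a", "b"] + ["c", "d"]

print(appended)
print(extended)
print(combined)
```

`appended` ends up with three items (the last one being a list), while `extended` and `combined` each hold four strings.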

You can even have lists within lists!

In [10]:
["List 1", ["List 2", 3, 4], False]

['List 1', ['List 2', 3, 4], False]

Lists-of-lists come in really handy, especially in data science since much of the data that you'll work with will be in a tabular format. In these cases, the internal lists are typically the same size. For example, you might have a list of data points about a customer, such as their age, income, and the amount they spent at your company last month.

In [11]:
customers = [
    ["C0", 42, 56000, 12.30],
    ["C1", 19, 15000, 43.21],
    ["C2", 35, 123000, 45.67],
]
customers

[['C0', 42, 56000, 12.3], ['C1', 19, 15000, 43.21], ['C2', 35, 123000, 45.67]]

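You can chain indexes to get data out of a nested list: the first index selects the inner list (the row), and the second selects an item within it (the column). In the example below, we pull out the income of our second customer, who is at index 1.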

In [12]:
customers[1][2]

15000
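Chained indexing works for reading and for writing. The sketch below repeats the customer data under a different, illustrative name (`customer_table`) so that the notebook's own `customers` variable is left alone.

```python
# `customer_table` is a hypothetical copy of the data shown above.
customer_table = [
    ["C0", 42, 56000, 12.30],
    ["C1", 19, 15000, 43.21],
    ["C2", 35, 123000, 45.67],
]

third_row = customer_table[2]         # the whole inner list for customer C2
third_age = customer_table[2][1]      # row 2, column 1
last_spend = customer_table[-1][-1]   # negative indexes work at every level

# Assigning through chained indexes replaces a single nested value.
customer_table[0][3] = 20.00

print(third_row)
print(third_age)
print(last_spend)
print(customer_table[0])
```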

We will explore lists more deeply and other data structures in future tutorials.

## Tuples

Tuples look and feel a whole lot like lists in Python. They can contain a sequence of data of any type, including lists and other tuples. The primary difference between lists and tuples is that you can't modify a tuple like you can a list.

Before we get too deep into immutability (whether you can change an object's value), let's take a look at a tuple.

In [13]:
my_tuple = (1, "dog", 3.987, False, ["a", "list", "inside", 1.0, "tuple"])
my_tuple

(1, 'dog', 3.987, False, ['a', 'list', 'inside', 1.0, 'tuple'])

The visible difference between a list and a tuple is that we create a tuple with parentheses instead of square brackets.

You can index a tuple and take a slice from a tuple just like you can from a list.

The key difference is that you can't replace, add, or remove the items of a tuple once it has been created, like you can with a list. This is useful because Python can perform some optimizations when it knows a data structure can't change, and it lets tuples be used in places where lists can't, such as dictionary keys. Tuples also come with some convenient syntax. We'll take a peek at one example now, and we'll also learn more later in this tutorial.

We will use tuples to swap the values of two variables. In many languages, swapping two variables requires a third, temporary variable. Here is an example.
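To make immutability concrete, here is a minimal sketch of what happens when you try to assign to an item of a tuple. The name `point` is illustrative, and the `try`/`except` block is only there so the error can be shown without stopping the program.

```python
# `point` is a hypothetical tuple used only for this illustration.
point = (3, 4, [5, 6])

try:
    point[0] = 10                 # tuples do not allow item assignment
except TypeError as error:
    message = str(error)

print(message)

# The tuple's own items are fixed, but a mutable object stored inside it
# (here, a list) can still be changed through its own methods.
point[2].append(7)
print(point)
```

In other words, a tuple always refers to the same objects, but it does not make those objects themselves unchangeable.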

In [14]:
var1 = "Python"
var2 = "Perl"

tmp = var1
var1 = var2
var2 = tmp

var1, var2

('Perl', 'Python')

We had to introduce the `tmp` variable to perform the swap, and needed three lines of code. With tuples, we can do this more cleanly.

**Note:** You might have noticed that when we put `var1, var2` at the bottom of the last code section, a tuple was printed out. In Python, values separated by commas form a tuple even when the parentheses are left out.

In [15]:
var1 = "Perl"
var2 = "Python"

(var1, var2) = (var2, var1)

var1, var2

('Python', 'Perl')

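As you can see, swapping variables using tuples is much easier to read and less error-prone than having to use a temporary variable. It works because Python first builds the tuple on the right-hand side from the current values, and then **unpacks** it into the variables on the left-hand side, one position at a time.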

Tuples come up everywhere when programming in Python. Sometimes you won't even realize that you are working with a tuple, since they are so integrated with the language.
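Two places where tuples tend to appear without much notice are unpacking a record into named variables and returning more than one value from a function. The sketch below illustrates both; `record` and `min_and_max` are hypothetical names, not part of the original notebook.

```python
# Unpacking: each variable on the left receives the item at the same position.
record = ("C1", 19, 15000)
customer_id, age, income = record


def min_and_max(values):
    """Return the smallest and largest value as a two-item tuple."""
    return min(values), max(values)   # the comma creates the tuple


lowest, highest = min_and_max([4, 2, 7, 9])

print(customer_id, age, income)
print(lowest, highest)
```

The number of variables on the left must match the number of items in the tuple, otherwise Python raises a `ValueError`.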

## Dictionaries

Dictionaries are another fundamental data structure in Python. If you have experience with other programming languages, you might have encountered a similar data structure with a different name such as a map, a hashmap, or a hashtable.

Dictionaries contain key/value pairs. This data structure is called a dictionary because you can "look up" a key and find the corresponding value, just like you can look up a word in the Oxford Dictionary and find its definitions.

Let's take a look at some code that creates a dictionary and accesses a value in the dictionary by key.

In [16]:
my_dictionary = {
    "pet": "cat",
    "car": "Tesla",
    "lodging": "apartment",
}

my_dictionary["pet"]

'cat'

Notice that we use the same *indexing* notation that should be familiar to you from strings, lists, and tuples. However, instead of a numeric index, the lookup is done by key.

A key must be a *hashable* value, which in practice means an immutable one. Keys can be numbers, strings, and even tuples, as long as the tuple itself contains only hashable items. You can't use a dictionary or list as a key, but you can use them as values.
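The restriction on keys is easy to see in practice. In the sketch below, a tuple works as a key while a list is rejected; `grid` is an illustrative name, and the `try`/`except` block is only there to capture the error.

```python
# `grid` is a hypothetical dictionary used only for this illustration.
grid = {}

grid[(0, 0)] = "origin"       # a tuple of numbers is a valid key
grid[(1, 2)] = "treasure"

try:
    grid[[3, 4]] = "a list as a key"   # lists are mutable, so this fails
except TypeError as error:
    message = str(error)

print(grid)
print(message)
```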

The data types of the keys do not need to be the same.

In [17]:
the_dictionary = {
    57: "the sneaky fox",
    "many things": [1, "little list", " of ", 5.0, "things"],
    (8, "ocho"): "Hi there",
    "KEY_ONE": {
        "a": "dictionary",
        "as a": "value"
    },
}

the_dictionary[(8, "ocho")]

'Hi there'

The dictionary above is much more unstructured than dictionaries that you'll typically encounter in practice. However, it illustrates the broad range of key types and value types that a dictionary can store.

You can also index many levels down in a dictionary. For example, in `the_dictionary` above, there is a sub-dictionary at the `KEY_ONE` key. Let's pull something out of this dictionary within a dictionary.

In [18]:
the_dictionary["KEY_ONE"]['as a']

'value'

We can also use indexing to access the values of the list that is the value in `the_dictionary` for the key `"many things"`.

In [19]:
the_dictionary["many things"][1]

'little list'

Dictionaries, lists, tuples, and other data structures can contain as much nesting as you need.
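As one illustration of nesting, tabular data is often represented as a list of dictionaries, so that each field is looked up by name rather than by position. The records and field names below are made up for this sketch.

```python
# Hypothetical records: a list of dictionaries, each holding a nested list.
customer_records = [
    {"id": "C0", "age": 42, "purchases": [12.30, 4.50]},
    {"id": "C1", "age": 19, "purchases": [43.21]},
]

second_id = customer_records[1]["id"]                    # list index, then key
first_purchase = customer_records[0]["purchases"][0]     # list, key, list
purchase_count = len(customer_records[0]["purchases"])

print(second_id)
print(first_purchase)
print(purchase_count)
```

Each pair of square brackets steps one level deeper, reading from left to right.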

Dictionaries store their values by key. Only one value can exist per key, so if you write a new value to a key, the old value goes away.

In [20]:
my_dictionary = {
    "k1": "name",
    "k2": "age"
}

my_dictionary["k1"] = "surname"

my_dictionary

{'k1': 'surname', 'k2': 'age'}

You can add an entry to a dictionary by assigning a value to a new key. To remove an entry, use the `del` statement with the key, as in the cell below.

In [21]:
del my_dictionary["k2"]

my_dictionary

{'k1': 'surname'}
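Adding, overwriting, and deleting entries all use the same square-bracket notation. The following sketch walks through the three operations on an illustrative dictionary named `fields`, which is not part of the original notebook.

```python
# `fields` is a hypothetical dictionary used only for this illustration.
fields = {"k1": "surname"}

fields["k3"] = "email"      # new key: a new entry is added
fields["k1"] = "name"       # existing key: the old value is replaced
print(fields)

del fields["k3"]            # removes the entry for "k3"
print(fields)
print(len(fields))
```

Note that `del` on a key that is not in the dictionary raises a `KeyError`, just like indexing a missing key does.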

To see if a key exists in a dictionary, you can use the `in` operator. Notice that it returns a boolean value.

In [22]:
"k2" in my_dictionary

False

It is advisable to check if a key exists in a dictionary before trying to index that key. If you try to access a key that doesn't exist using square brackets, Python raises a `KeyError`, which stops your program unless the error is handled.

There is also a safer **`get`** method on the dictionary object. You provide `get` with a key and, optionally, a default value to return if the key isn't present. If you leave out the default, `get` returns `None` for a missing key.
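The two usual ways of guarding a lookup can be sketched as follows. The dictionary `profile` and the other names are illustrative only.

```python
# `profile` is a hypothetical dictionary used only for this illustration.
profile = {"k1": "surname"}

# Option 1: check for the key first with the `in` operator.
if "k2" in profile:
    status = profile["k2"]
else:
    status = "no k2 entry"

# Option 2: attempt the lookup and handle the KeyError if it occurs.
try:
    value = profile["k2"]
except KeyError as error:
    value = None
    missing_key = error.args[0]   # the key that was not found

print(status)
print(missing_key)
```

Which option reads better depends on the situation: checking first is simple and explicit, while handling the error keeps the common case (the key exists) free of extra conditions.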

In [23]:
my_dictionary.get("k2", "There is no 'k2' key value")

"There is no 'k2' key value"

For more on dictionaries, check out the [official Python dictionary documentation](https://docs.python.org/3/tutorial/datastructures.html#dictionaries).

We've now learned about some of the most fundamental data types and data structures in Python:

* numbers
* booleans
* lists
* tuples
* dictionaries

We've learned how to store data in variables and how to change data in variables, dictionaries, and lists. Each of these data types has more functionality than we have gone over in this tutorial, so please take some time to get a broader idea of what can be done with these data types in Python.

There are also many data types that we did not cover. As we encounter the need for other types of data in our study of machine learning and data science, we will introduce and explain some of them.