# <h1 style="text-align: center;" class="list-group-item list-group-item-action active" data-toggle="list" role="tab" aria-controls="home">Introduction to Python</h1>

**Course Description**

Python is a general-purpose programming language that is becoming ever more popular for data science. Companies worldwide are using Python to harvest insights from their data and gain a competitive edge. Unlike other Python tutorials, this course focuses on Python specifically for data science. In our Introduction to Python course, you’ll learn about powerful ways to store and manipulate data, and helpful data science tools to begin conducting your own analyses. Start DataCamp’s online Python curriculum now.

<a id="toc"></a>

<h3 class="list-group-item list-group-item-action active" data-toggle="list" role="tab" aria-controls="home">Table of Contents</h3>
    
* [1. Python basics](#1)
    - Hello Python
    - Variables and Types

* [2. Python Lists](#2) 
    - Python Lists
    - List of Lists
    - Manipulating Lists
    
* [3. Functions and Packages](#3)
    - Functions
    - Methods
    - Packages
    
* [4. Numpy](#4)
    - Numpy
    - 2D Numpy Arrays
    - Centering and scaling
    - Numpy: Basic Statistics

**Explore Datasets**

Use the arrays imported in the first cell to explore the data and practice your skills!

- Print out the weight of the first ten baseball players.
- What is the median weight of all baseball players in the data?
- Print out the names of all players with a height greater than 80 (heights are in inches).
- Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches!
- The values in soccer_shooting are decimals. Convert them to whole numbers (e.g., 0.98 becomes 98).
- Do taller players get higher ratings? Calculate the correlation between soccer_ratings and soccer_heights to find out!
- What is the average rating for attacking players ('A')?

In [22]:
# Importing the course packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings('ignore')

# Importing the course datasets 
baseball = pd.read_csv('datasets/baseball.csv')
soccer = pd.read_csv('datasets/soccer.csv')

In [23]:
baseball.head()

Unnamed: 0,Name,Team,Position,Height,Weight,Age,PosCategory
0,Adam_Donachie,BAL,Catcher,74,180,22.99,Catcher
1,Paul_Bako,BAL,Catcher,74,215,34.69,Catcher
2,Ramon_Hernandez,BAL,Catcher,72,210,30.78,Catcher
3,Kevin_Millar,BAL,First_Baseman,72,210,35.43,Infielder
4,Chris_Gomez,BAL,First_Baseman,73,188,35.71,Infielder


In [24]:
soccer.head()

Unnamed: 0,id,name,rating,position,height,foot,rare,pace,shooting,passing,dribbling,defending,heading,diving,handling,kicking,reflexes,speed,positioning
0,1001,Gábor Király,69,GK,191,Right,0,,,,,,,70.0,66.0,63.0,74.0,35.0,66.0
1,100143,Frederik Boi,65,M,184,Right,0,61.0,0.65,63.0,59.0,62.0,62.0,,,,,,
2,100264,Tomasz Szewczuk,57,A,185,Right,0,65.0,0.54,43.0,53.0,55.0,74.0,,,,,,
3,100325,Steeve Joseph-Reinette,63,D,180,Left,0,68.0,0.38,51.0,46.0,64.0,71.0,,,,,,
4,100326,Kamel Chafni,72,M,181,Right,0,75.0,0.64,67.0,72.0,57.0,66.0,,,,,,


In [25]:
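# Note: this lists the ten heaviest players; for the first ten rows, use baseball['Weight'].head(10)
baseball.sort_values(by='Weight', ascending=False)[['Name','Weight']][:10]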

Unnamed: 0,Name,Weight
154,C.C._Sabathia,290
229,Chris_Britton,278
61,Bobby_Jenks,270
59,Andrew_Sisco,260
909,Jon_Rauch,260
815,Prince_Fielder,260
458,Boof_Bonser,260
890,Mike_Restovich,257
531,Carlos_Zambrano,255
567,Jose_Valverde,254


In [26]:
baseball.Weight.median()
#np.median(baseball.Weight)
#baseball.describe()['Weight']

200.0

In [27]:
baseball[baseball['Height']>80]['Name']

59         Andrew_Sisco
558       Randy_Johnson
764    Mark_Hendrickson
862         Chris_Young
909           Jon_Rauch
Name: Name, dtype: object

In [28]:
# 1 inch = 0.0254 m, so multiplying by 0.0254 * 100 converts inches to centimeters

print(baseball['Height'].mean()*0.0254*100)
print(soccer['height'].mean())

187.17172413793102
181.75042387249914
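
As a quick sanity check of the unit conversion, the sketch below repeats the comparison on only the five rows shown by the `head()` outputs above, using plain Python lists instead of the full DataFrames. The variable names are illustrative, and a five-row sample says nothing reliable about the full datasets.

```python
from statistics import mean

# Heights copied from the head() outputs above (first five rows only)
baseball_heights_in = [74, 74, 72, 72, 73]      # inches
soccer_heights_cm = [191, 184, 185, 180, 181]   # centimeters

CM_PER_INCH = 2.54  # 1 inch = 0.0254 m = 2.54 cm

# Convert the baseball mean to centimeters so both values share a unit
baseball_mean_cm = mean(baseball_heights_in) * CM_PER_INCH
soccer_mean_cm = mean(soccer_heights_cm)

print(round(baseball_mean_cm, 2))
print(round(soccer_mean_cm, 2))
```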


In [29]:
soccer.shooting.value_counts(dropna=False)

NaN     930
0.60    293
0.58    280
0.54    258
0.64    253
       ... 
0.86      2
0.12      1
0.13      1
0.90      1
0.14      1
Name: shooting, Length: 80, dtype: int64

In [34]:
# Multiply by 100 (vectorized); NaN values stay NaN, so the column remains float
soccer['shooting'] = soccer['shooting'] * 100
soccer.shooting.value_counts(dropna=False)

NaN     930
60.0    293
58.0    280
54.0    258
64.0    253
       ... 
86.0      2
12.0      1
13.0      1
90.0      1
14.0      1
Name: shooting, Length: 80, dtype: int64

In [37]:
soccer['rating'].corr(soccer['height'])

-0.006108577058543698

In [38]:
soccer[soccer['position']=='A']['rating'].mean()

67.26080691642652

## <a id="1"></a>
<font color="lightseagreen" size=+2.5><b>1. Python Basics</b></font>

<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Table of Contents</a>

An introduction to the basic concepts of Python. Learn how to use Python interactively and by using a script. Create your first variables and acquaint yourself with Python's basic data types.

### 1. Hello Python!

Hi, my name is Hugo and I'll be your host for Introduction to Python for Data Science. I'm a data scientist and educator at DataCamp and host of the DataFramed podcast, which you must check out.

**2. How you will learn**

![image.png](attachment:image.png)

In this course, you will learn Python for Data Science through video lessons, like this one, and interactive exercises. You get your own Python session where you can experiment and try to come up with the correct code to solve the instructions. You're learning by doing, while receiving customized and instant feedback on your work.

**3. Python**

![image-2.png](attachment:image-2.png)

For the latest version, see https://www.python.org/downloads/

Python was conceived by Guido van Rossum. Here, you can see a photo of me with Guido. What started as a hobby project soon became a general-purpose programming language: nowadays, you can use Python to build practically any piece of software. But how did this happen? Well, first of all, Python is open source. It's free to use. Second, it's very easy to build packages in Python, which are pieces of code that you can share with other people to solve specific problems. Over time, more and more of these packages specifically built for data science have been developed. Suppose you want to make some fancy visualizations of your company's sales. There's a package for that. Or what about connecting to a database to analyze sensor measurements? There's also a package for that. People often refer to Python as the Swiss Army knife of programming languages, as you can do almost anything with it. In this course, we'll start to build up your data science coding skills bit by bit, so make sure to stick around to see how powerful the language can be. Our courses focus on Python 3. To install Python 3 on your own system, follow the steps at this URL.

**4. IPython Shell**

![image-3.png](attachment:image-3.png)

Now that you're all eyes and ears for Python, let's start experimenting. I'll start with the

**5. IPython Shell**

![image-4.png](attachment:image-4.png)

Python shell, a place where you can type Python code and immediately see the results. In DataCamp's exercise interface, this shell is embedded here. Let's start off simple and use Python as a calculator.

**6. IPython Shell**

![image-5.png](attachment:image-5.png)

Let me type 4 + 5, and hit Enter. Python interprets what you typed and prints the result of your calculation, 9. The Python shell that's used here is actually not the original one; we're using IPython, short for Interactive Python, which is a kind of juiced-up version of regular Python that'll be useful later on. IPython was created by Fernando Pérez and is part of the broader Jupyter ecosystem. Apart from interactively working with Python, you can also have Python run so-called

**7. Python Script**

![image-6.png](attachment:image-6.png)

Python scripts. These Python scripts are simply text files with the extension .py. A script is basically a list of Python commands that are executed, almost as if you were typing the commands in the shell yourself, line by line.

**8. Python Script**

![image-7.png](attachment:image-7.png)

Let's put the command from before in a script now, which can be found here in DataCamp's interface. The next step is executing the script, by clicking 'Submit Answer'. If you execute this script in the DataCamp interface, there's nothing in the output pane. That's because you have to explicitly use print inside scripts if you want to generate output during execution.

**9. Python Script**

![image-8.png](attachment:image-8.png)

Let's wrap our previous calculation in a print call, and rerun the script. This time, the same output as before is generated, great! Putting your code in Python scripts instead of manually retyping every step interactively will help you to keep structure and avoid retyping everything over and over again if you want to make a change; you simply make the change in the script, and rerun the entire thing.
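
A minimal sketch of the difference, assuming the lines below are saved in a script file and run as a whole (the variable `result` is there only for illustration):

```python
# A bare expression is evaluated, but a script does not display its value
4 + 5

# Wrapping the same calculation in print() writes the result to the output
print(4 + 5)

# Storing the result in a variable also works; print it when you need to see it
result = 4 + 5
print(result)
```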

**10. DataCamp Interface**

![image-9.png](attachment:image-9.png)

Now that you've got an idea about different ways of working with Python, I suggest you head over to the exercises. Use the IPython Shell for experimentation, and use the Python script editor to code the actual answer. If you click Submit Answer, your script will be executed and checked for correctness.

**11. Let's practice!**

Get coding and don't forget to have fun!

![image.png](attachment:image.png)
Correct! Python is an extremely flexible language.

### Exercise

**The Python Interface**

Hit Run Code to run your first Python code with DataCamp and see the output!

Notice the script.py window; this is where you can type Python code to solve exercises. You can hit Run Code and Submit Answer as often as you want. If you're stuck, you can click Get Hint, and ultimately Get Solution.

You can also use the IPython Shell interactively by typing commands and hitting Enter. Here, your code will not be checked for correctness so it is a great way to experiment.

**Instructions**

- Experiment in the IPython Shell; type 5 / 8, for example.
- Add another line of code to script.py, print(7 + 10), to be checked for correctness.
- Hit Submit Answer to execute the Python script and receive feedback.

In [39]:
# Example, do not modify!
print(5 / 8)

# Print the sum of 7 and 10
print(7+10)

0.625
17


### Exercise

**Any comments?**

You can also add comments to your Python scripts. Comments help you and others understand what your code is about, and they are not run as Python code.

Comments start with the # symbol. See the comment in the editor, # Division; now it's your turn to add a comment!

**Instructions**

- Above the print(7 + 10), add the comment # Addition

In [40]:
# Division
print(5 / 8)

# Addition
print(7 + 10)

0.625
17


### Exercise

**Python as a calculator**

Python is perfectly suited to do basic calculations. It can do addition, subtraction, multiplication and division.

The code in the script gives some examples.

Now it's your turn to practice!

**Instructions**

- Print the sum of 4 + 5.
- Print the result of subtracting 5 from 5.
- Print the result of multiplying 3 by 5.
- Print the result of dividing 10 by 2.

In [41]:
# Addition
print(4+5)

# Subtraction
print(5-5)

# Multiplication
print(3*5)

# Division
print(10/2)

9
0
15
5.0


### 1. Variables and Types

Well done and welcome back! It's clear that Python is a great calculator. If you want to do more complex calculations though, you will want to "save" values while you're coding along.

**2. Variable**

![image.png](attachment:image.png)

You can do this by defining a variable, with a specific, case-sensitive name. Once you create (or declare) such a variable, you can later call up its value by typing the variable name. Suppose you measure your height and weight, in metric units: you are 1-point-79 meters tall, and weigh 68-point-7 kilograms. You can assign these values to two variables, named height and weight, with an equals sign. If you now type the name of the variable, height, Python looks for the variable name, retrieves its value, and prints it out.

**3. Calculate BMI**

![image-2.png](attachment:image-2.png)

Let's now calculate the Body Mass Index, or BMI, which is calculated as follows, with weight in kilograms and height in meters. You can do this with the actual values, but you can just as well use the variables height and weight, like here. Every time you type the variable's name, you are asking Python to replace it with the actual value of the variable. weight corresponds to 68-point-7, and height to 1-point-79. Finally, this version has Python store the result in a new variable, bmi. bmi now contains the same value as the one you calculated earlier. In Python, variables are used all the time. They help to make your code reproducible.
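
The calculation described above can be sketched as follows. The formula itself appears only on the slide, so this assumes the standard definition of BMI: weight in kilograms divided by the square of height in meters.

```python
# Measurements from the example above, in metric units
height = 1.79   # meters
weight = 68.7   # kilograms

# BMI = weight / height squared (** is Python's exponentiation operator)
bmi = weight / height ** 2

print(bmi)
```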

**4. Reproducibility**

![image-3.png](attachment:image-3.png)

Suppose the code to create the height, weight and bmi variables is in a script, like this. If you now want to recalculate the bmi for another weight,

**5. Reproducibility**

![image-4.png](attachment:image-4.png)

you can simply change the declaration of the weight variable, and rerun the script. The bmi changes accordingly, because the value of the variable weight has changed as well. So far, we've only worked with numerical values, such as height and weight.
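
A small sketch of why this matters, assuming the standard BMI formula (weight divided by height squared) and a made-up new weight of 74.2 kg. Only the assignment changes, and rerunning the script updates everything that depends on it.

```python
height = 1.79
weight = 74.2   # hypothetical new value; the earlier example used 68.7

# This line is unchanged; it picks up the new weight when the script is rerun
bmi = weight / height ** 2

print(bmi)
```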

**6. Python Types**

![image-5.png](attachment:image-5.png)

In Python, these numbers all have a specific type. You can check out the type of a value with the type function. To see the type of our bmi value, simply write type and then bmi inside parentheses. You can see that it's a float, which is Python's way of representing a real number, so a number which can have both an integer part and a fractional part. Python also has a type for integers: int, like this example. To do data science, you'll need more than ints and floats, though.
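
As a rough illustration of the two numeric types (the name `day_of_week` and its value are hypothetical, since the slide's own integer example is not reproduced here):

```python
bmi = 68.7 / 1.79 ** 2   # division produces a float
day_of_week = 5          # a whole number literal is an int

print(type(bmi))          # <class 'float'>
print(type(day_of_week))  # <class 'int'>
```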

**7. Python Types (2)**

![image-6.png](attachment:image-6.png)

Python features tons of other data types. The most common ones are strings and booleans. A string is Python's way to represent text. You can use both double and single quotes to build a string, as you can see from these examples. If you print the type of the last variable here, you see that it's str, short for string. The Boolean is a type that can either be True or False. You can think of it as 'Yes' and 'No' in everyday language. Booleans will be very useful in the future, to perform filtering operations on your data for example. There's something special about Python data types.
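
A short sketch of strings and booleans. The variable names and text values are placeholders chosen for illustration, not necessarily the ones on the slide.

```python
x = "body mass index"   # a string built with double quotes
y = 'this works too'    # a string built with single quotes
z = True                # a boolean; note the capital T

print(type(y))  # <class 'str'>
print(type(z))  # <class 'bool'>
```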

**8. Python Types (3)**

![image-7.png](attachment:image-7.png)

Have a look at this line of code, that sums two integers, and then this line of code, that sums two strings. For the integers, the values were summed, while for the strings, the strings were pasted together. The plus operator behaved differently for different data types. This is a general principle: how the code behaves depends on the types you're working with. In the exercises that follow, you'll create your first variables and experiment with some of Python's data types. I'll see you in the next video to explain all about lists.
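
The behavior described above can be reproduced with a couple of lines. The specific values are illustrative only.

```python
int_sum = 2 + 3          # for integers, + adds the numbers
str_sum = "ab" + "cd"    # for strings, + pastes the text together

print(int_sum)   # 5
print(str_sum)   # abcd
```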

**9. Let's practice!**

Let's get you coding and I can't wait to see you in the next chapter where you'll build even more awesome python charts.

### Exercise

**Variable Assignment**

In Python, a variable allows you to refer to a value with a name. To create a variable x with a value of 5, you use =, like this example:

x = 5

You can now use the name of this variable, x, instead of the actual value, 5.

Remember, = in Python means assignment; it doesn't test equality!
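
To make the distinction concrete, here is a short sketch. The comparison operator `==` is not introduced in this part of the course; it appears here only for contrast with `=`.

```python
x = 5            # assignment: the name x now refers to the value 5

print(x == 5)    # comparison: True
print(x == 6)    # comparison: False
```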

**Instructions**

- Create a variable savings with the value of 100.
- Check out this variable by typing print(savings) in the script.

In [42]:
# Create a variable savings
savings = 100

# Print out savings
print(savings)

100


### Exercise

**Calculations with variables**

You've now created a savings variable, so let's start saving!

Instead of calculating with the actual values, you can use variables instead. The savings variable you created in the previous exercise with a value of 100 is available to you.

How much money would you have saved four months from now, if you saved $10 each month?

**Instructions**

- Create a variable monthly_savings, equal to 10 and num_months, equal to 4.
- Multiply monthly_savings by num_months and save it to new_savings.
- Add new_savings to savings, saving the sum as total_savings.
- Print the value of total_savings.

In [43]:
savings = 100

# Create the variables monthly_savings and num_months
monthly_savings = 10
num_months = 4

# Multiply monthly_savings and num_months
new_savings = monthly_savings * num_months

# Add new_savings to your savings
total_savings = savings + new_savings

# Print total_savings
print(total_savings)

140


### Exercise

**Other variable types**

In the previous exercise, you worked with the integer Python data type:

- int, or integer: a number without a fractional part. savings, with the value 100, is an example of an integer.

Besides int, there are three other very common data types:

- float, or floating point: a number that has both an integer and fractional part, separated by a point. 1.1 is an example of a float.
- str, or string: a type to represent text. You can use single or double quotes to build a string.
- bool, or boolean: a type to represent logical values. It can only be True or False (the capitalization is important!).

**Instructions**

- Create a new float, half, with the value 0.5.
- Create a new string, intro, with the value "Hello! How are you?".
- Create a new boolean, is_good, with the value True.

In [44]:
# Create a variable half
half = 0.5

# Create a variable intro
intro = 'Hello! How are you?'

# Create a variable is_good
is_good = True

In [45]:
a = 0.5
b = 'Hello'
c = False

### Exercise

**Guess the type**

To find out the type of a value or a variable that refers to that value, you can use the type() function. Suppose you've defined a variable a, but you forgot the type of this variable. To determine the type of a, simply execute:

type(a)

We already went ahead and created three variables: a, b and c. You can use the IPython shell to discover their type. Which of the following options is correct?

**Instructions**

![image.png](attachment:image.png)

In [47]:
print(type(a))
print(type(b))
print(type(c))

<class 'float'>
<class 'str'>
<class 'bool'>


### Exercise

**Operations with other types**

Hugo mentioned that different types behave differently in Python.

When you sum two strings, for example, you'll get different behavior than when you sum two integers or two booleans.

In the script some variables with different types have already been created. It's up to you to use them.

**Instructions**

- Calculate the product of monthly_savings and num_months. Store the result in year_savings.
- What do you think the resulting type will be? Find out by printing out the type of year_savings.
- Calculate the sum of intro and intro and store the result in a new variable doubleintro.
- Print out doubleintro. Did you expect this?

In [48]:
monthly_savings = 10
num_months = 12
intro = "Hello! How are you?"

# Calculate year_savings using monthly_savings and num_months
year_savings = monthly_savings * num_months

# Print the type of year_savings
print(type(year_savings))

# Assign sum of intro and intro to doubleintro
doubleintro = intro + intro

# Print out doubleintro
print(doubleintro)

<class 'int'>
Hello! How are you?Hello! How are you?


### Exercise

**Type conversion**

Using the + operator to paste together two strings can be very useful in building custom messages.

Suppose, for example, that you've calculated your savings and want to summarize the results in a string.

To do this, you'll need to explicitly convert the types of your variables. More specifically, you'll need str(), to convert a value into a string. str(savings), for example, will convert the integer savings to a string.

Similar functions such as int(), float() and bool() will help you convert Python values into any type.
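
A minimal sketch of why the explicit conversion is needed, and of what the other conversion functions return. The message text and sample values are made up for illustration.

```python
savings = 100

# Pasting a str and an int together with + raises a TypeError
try:
    message = "I have $" + savings
except TypeError as error:
    print("TypeError:", error)

# Converting the int to a str first makes the concatenation work
message = "I have $" + str(savings)
print(message)              # I have $100

# The other conversion functions work the same way
print(int("42") + 1)        # 43
print(float("3.5") * 2)     # 7.0
print(bool(0), bool(1))     # False True
```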

**Instructions**

- Hit Run Code to run the code. Try to understand the error message.
- Fix the code such that the printout runs without errors; use the function str() to convert the variables savings and total_savings to strings.
- Convert the variable pi_string to a float and store this float as a new variable, pi_float.

In [51]:
# Definition of savings and total_savings
savings = 100
total_savings = 150

# Fix the printout
#print("I started with $" + savings + " and now have $" + total_savings + ". Awesome!")
print("I started with $" + str(savings) + " and now have $" + str(total_savings) + ". Awesome!")

# Definition of pi_string
pi_string = "3.1415926"

# Convert pi_string into float: pi_float
pi_float = float(pi_string)

I started with $100 and now have $150. Awesome!


### Exercise

**Can Python handle everything?**

Now that you know something more about combining different sources of information, have a look at the four Python expressions below. Which one of these will throw an error? You can always copy and paste this code in the IPython Shell to find out!

**Instructions**

![image.png](attachment:image.png)

## <a id="2"></a>
<font color="lightseagreen" size=+2.5><b>2. Python Lists</b></font>

<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Table of Contents</a>

Learn to store, access, and manipulate data in Python lists. Create lists, nest lists inside other lists, and select elements with indexing and slicing.

### 1. Python Lists

Welcome back aspiring Pythonista. By now, you've played around with different data types, and I hope you've had as much fun as I have.

**2. Python Data Types**

![image.png](attachment:image.png)

On the numbers side, there's the float, to represent a real number, and the int, to represent an integer. Next, we also have str, short for string, to represent text in Python, and bool, which can be either True or False. You can save these values as a variable, like these examples show. Each variable then represents a single value. As a data scientist,

**3. Problem**

![image-2.png](attachment:image-2.png)

you'll often want to work with many data points. If you for example want to measure the height of everybody in your family, and store this information in Python, it would be inconvenient to create a new Python variable for each point you collected, right? What you can do instead is store all this information in a Python list.

**4. Python List**

![image-3.png](attachment:image-3.png)

You can build such a list with square brackets. Suppose you asked your two sisters and parents for their height, in meters. You can build the list as follows. Of course, this data structure can also be referenced with a variable. Simply put the variable name and the equals sign in front, like here. A list is a way to give a single name to a collection of values. These values, or elements, can have any type; they can be floats, integers, booleans, strings, but also more advanced Python types, even lists. It's perfectly possible for a list to contain different types as well.
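
A sketch of what building such a list might look like. The heights 1.68 and 1.71 are mentioned later in this chapter; 1.73 and 1.89 are placeholder values, since the slide itself is not reproduced here. The `mixed` list is purely illustrative.

```python
# Heights in meters: two sisters, then mom and dad
fam = [1.73, 1.68, 1.71, 1.89]
print(fam)

# Elements may have different types, including other lists
mixed = [1.73, 2, True, "liz", [1.68, 1.71]]
print(len(mixed))   # 5
```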

**5. Python List**

![image-4.png](attachment:image-4.png)

Suppose, for example, that you want to add the names of your sisters and parents to the list, so that you know which height belongs to who. You can throw in some strings without issues. But that's not all. I just told you that lists can also contain lists themselves. Instead of putting the strings in between the numbers, you can create little sublists for each member of the family. One for liz, one for emma and so on. Now, you can tell Python that these sublists are the elements of another list, that I named fam2: the little lists are wrapped in square brackets and separated with commas. If you now print out fam2, you see that we have a list of lists. The main list contains 4 sub-lists. We're dealing with a new Python type here, next to the strings, booleans, integers and floats you already know about:

**6. List type**

![image-5.png](attachment:image-5.png)

the list. These calls show that both fam and fam2 are lists. Remember that I told you that each type has specific functionality and behavior associated? Well, for lists, this is also true. Python lists host a bunch of tools to subset and adapt them. But let's take this step by step,

**7. Let's practice!**

and have you experiment with list creation first!
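
Before moving on, here is a sketch of the flat list and the list of lists described in the video. The heights for liz and dad are assumed values; only emma's and mom's heights are given in the transcript.

```python
# Flat list: names and heights alternate
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# List of lists: one small [name, height] list per family member
fam2 = [["liz", 1.73],
        ["emma", 1.68],
        ["mom", 1.71],
        ["dad", 1.89]]

print(fam2)
print(len(fam2))    # 4
print(type(fam))    # <class 'list'>
print(type(fam2))   # <class 'list'>
```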

### Exercise

**Create a list**

As opposed to int, bool etc., a list is a compound data type; you can group values together:

- a = "is"
- b = "nice"
- my_list = ["my", "list", a, b]

After measuring the height of your family, you decide to collect some information on the house you're living in. The areas of the different parts of your house are stored in separate variables for now, as shown in the script.

**Instructions**

- Create a list, areas, that contains the area of the hallway (hall), kitchen (kit), living room (liv), bedroom (bed) and bathroom (bath), in this order. Use the predefined variables.
- Print areas with the print() function.

In [52]:
# area variables (in square meters)
hall = 11.25
kit = 18.0
liv = 20.0
bed = 10.75
bath = 9.50

# Create list areas
areas = [hall, kit, liv, bed, bath]

# Print areas
print(areas)

[11.25, 18.0, 20.0, 10.75, 9.5]


### Exercise

**Create list with different types**

A list can contain any Python type. Although it's not really common, a list can also contain a mix of Python types including strings, floats, booleans, etc.

The printout of the previous exercise wasn't really satisfying. It's just a list of numbers representing the areas, but you can't tell which area corresponds to which part of your house.

The code in the editor is the start of a solution. For some of the areas, the name of the corresponding room is already placed in front. Pay attention here! "bathroom" is a string, while bath is a variable that represents the float 9.50 you specified earlier.

**Instructions**

- Finish the code that creates the areas list. Build the list so that the list first contains the name of each room as a string and then its area. In other words, add the strings "hallway", "kitchen" and "bedroom" at the appropriate locations.
- Print areas again; is the printout more informative this time?

![image.png](attachment:image.png)

In [53]:
# area variables (in square meters)
hall = 11.25
kit = 18.0
liv = 20.0
bed = 10.75
bath = 9.50

# Adapt list areas
areas = ["hallway", hall, "kitchen", kit, "living room", liv, "bedroom", bed, "bathroom", bath]

# Print areas
print(areas)

['hallway', 11.25, 'kitchen', 18.0, 'living room', 20.0, 'bedroom', 10.75, 'bathroom', 9.5]


### Exercise

**Select the valid list**

A list can contain any Python type. But a list itself is also a Python type. That means that a list can also contain a list! Python is getting funkier by the minute, but fear not, just remember the list syntax:

- my_list = [el1, el2, el3]

Can you tell which ones of the following lines of Python code are valid ways to build a list?

- A. [1, 3, 4, 2]
- B. [[1, 2, 3], [4, 5, 7]]
- C. [1 + 2, "a" * 5, 3]

![image.png](attachment:image.png)

### Exercise

**List of lists**

As a data scientist, you'll often be dealing with a lot of data, and it will make sense to group some of this data.

Instead of creating a flat list containing strings and floats, representing the names and areas of the rooms in your house, you can create a list of lists. The script in the editor can already give you an idea.

Don't get confused here: "hallway" is a string, while hall is a variable that represents the float 11.25 you specified earlier.

**Instructions**

- Finish the list of lists so that it also contains the bedroom and bathroom data. Make sure you enter these in order!
- Print out house; does this way of structuring your data make more sense?
- Print out the type of house. Are you still dealing with a list?

![image.png](attachment:image.png)

In [54]:
# area variables (in square meters)
hall = 11.25
kit = 18.0
liv = 20.0
bed = 10.75
bath = 9.50

# house information as list of lists
house = [["hallway", hall],
         ["kitchen", kit],
         ["living room", liv],
         ["bedroom", bed],
         ["bathroom", bath]]

# Print out house
print(house)

# Print out the type of house
print(type(house))

[['hallway', 11.25], ['kitchen', 18.0], ['living room', 20.0], ['bedroom', 10.75], ['bathroom', 9.5]]
<class 'list'>


### 1. Subsetting Lists

After you've created your very own Python list, you'll need to know how you can access information in the list.

**2. Subsetting lists**

![image.png](attachment:image.png)

Python uses the index to do this. Have a look at the fam list again here. The first element in the list has index 0, the second element has index 1, and so on. Suppose that you want to select the height of emma, the float 1-point-68. It's the fourth element, so it has index 3. To select it, you use 3 inside square brackets. Similarly, to select the string "dad" from the list,

**3. Subsetting lists**

![image-2.png](attachment:image-2.png)

which is the seventh element in the list, you'll need to put the index 6 inside square brackets. You can also count backwards, using negative indexes. This is useful if you want to get some elements at the end of your list. To get your dad's height, for example, you'll need the index -1. These are the negative indexes for all list elements.
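
The indexing rules above can be tried out with a sketch like this one. The list follows the description in the transcript; the heights for liz and dad are assumed values.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

print(fam[3])    # 1.68, emma's height (fourth element, index 3)
print(fam[6])    # dad (seventh element, index 6)
print(fam[-1])   # 1.89, dad's height (last element)

# A positive and a negative index can point at the same element
print(fam[7] == fam[-1])   # True
```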

**4. Subsetting lists**

![image-3.png](attachment:image-3.png)

This means that both these lines return the exact same result. Apart from indexing, there's also something called slicing,

**5. List slicing**

![image-4.png](attachment:image-4.png)

which allows you to select multiple elements from a list, thus creating a new list. You can do this by specifying a range, using a colon. Let's first have another look at the list, and then try this piece of code. Can you guess what it'll return? A list with the float 1-point-68, the string "mom", and the float 1-point-71, corresponding to the 4th, 5th and 6th elements in the list maybe? Let's see what the output is. Apparently, only the elements with index 3 and 4 get returned. The element with index 5 is not included. In general, this is the syntax: the index you specify before the colon, so where the slice starts, is included, while the index you specify after the colon, where the slice ends, is not. With this in mind, can you tell what this call will return? You probably guessed correctly that this call gives you a list with three elements, corresponding to the elements with index 1, 2 and 3 of the fam list. You can also choose to just leave out the index before or after the colon.
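
A sketch of the two slices discussed above, assuming the same fam list (the heights for liz and dad are placeholder values):

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# Start index 3 is included, end index 5 is not
print(fam[3:5])   # [1.68, 'mom']

# Elements with index 1, 2 and 3
print(fam[1:4])   # [1.73, 'emma', 1.68]
```

A handy consequence of the "end not included" rule is that the length of a slice equals end minus start, as long as both indexes fall inside the list.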

**6. List slicing**

![image-5.png](attachment:image-5.png)

If you leave out the index where the slice should begin, you're telling Python to start the slice from index 0, like this example. If you leave out the index where the slice should end, you include all elements up to and including the last element in the list, like here. Now it's time to head over to the exercises,

**7. Let's practice!**

where you will continue to work on the list you've created yourself before. You'll use different subsetting methods to get exactly the piece of information you need!
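
Leaving out the start or end index can be sketched as follows. The cut-off positions 4 and 5 are chosen for illustration and may differ from the slide; the heights for liz and dad are assumed values.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# No start index: the slice begins at index 0
print(fam[:4])   # ['liz', 1.73, 'emma', 1.68]

# No end index: the slice runs through the last element
print(fam[5:])   # [1.71, 'dad', 1.89]
```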

### Exercise

**Subset and conquer**

Subsetting Python lists is a piece of cake. Take the code sample below, which creates a list x and then selects "b" from it. Remember that this is the second element, so it has index 1. You can also use negative indexing.

- x = ["a", "b", "c", "d"]
- x[1]
- x[-3] # same result!

Remember the areas list from before, containing both strings and floats? Its definition is already in the script. Can you add the correct code to do some Python subsetting?

**Instructions**

- Print out the second element from the areas list (it has the value 11.25).
- Subset and print out the last element of areas, being 9.50. Using a negative index makes sense here!
- Select the number representing the area of the living room (20.0) and print it out.

In [55]:
# Create the areas list
areas = ["hallway", 11.25, "kitchen", 18.0, "living room", 20.0, "bedroom", 10.75, "bathroom", 9.50]

# Print out second element from areas
print(areas[1])

# Print out last element from areas
print(areas[-1])

# Print out the area of the living room
print(areas[5])

11.25
9.5
20.0


### Exercise

**Subset and calculate**

After you've extracted values from a list, you can use them to perform additional calculations. Take this example, where the second and fourth elements of a list x are extracted. The strings that result are pasted together using the + operator:

- x = ["a", "b", "c", "d"]
- print(x[1] + x[3])

**Instructions**

- Using a combination of list subsetting and variable assignment, create a new variable, eat_sleep_area, that contains the sum of the area of the kitchen and the area of the bedroom.
- Print the new variable eat_sleep_area.

In [56]:
# Create the areas list
areas = ["hallway", 11.25, "kitchen", 18.0, "living room", 20.0, "bedroom", 10.75, "bathroom", 9.50]

# Sum of kitchen and bedroom area: eat_sleep_area
eat_sleep_area = areas[3] + areas[7]

# Print the variable eat_sleep_area
print(eat_sleep_area)

28.75


### Exercise

**Slicing and dicing**

Selecting single values from a list is just one part of the story. It's also possible to slice your list, which means selecting multiple elements from your list. Use the following syntax:

- my_list[start:end]

The start index will be included, while the end index is not.

The code sample below shows an example. A list with "b" and "c", corresponding to indexes 1 and 2, is selected from a list x:

- x = ["a", "b", "c", "d"]
- x[1:3]

The elements with index 1 and 2 are included, while the element with index 3 is not.
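One consequence of including the start and excluding the end is sketched below: when both indexes fall inside the list, the slice has exactly end - start elements. The variable names start, end and selected are only for illustration.

```python
x = ["a", "b", "c", "d"]
start, end = 1, 3

selected = x[start:end]
print(selected)                      # ['b', 'c']

# With both indexes inside the list, the slice length is end - start
print(len(selected) == end - start)  # True

# Slicing builds a new list; x itself is unchanged
print(x)                             # ['a', 'b', 'c', 'd']
```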

**Instructions**

- Use slicing to create a list, downstairs, that contains the first 6 elements of areas.
- Do a similar thing to create a new variable, upstairs, that contains the last 4 elements of areas.
- Print both downstairs and upstairs using print().

In [60]:
# Create the areas list
areas = ["hallway", 11.25, "kitchen", 18.0, "living room", 20.0, "bedroom", 10.75, "bathroom", 9.50]

# Use slicing to create downstairs
downstairs = areas[0:6]

# Use slicing to create upstairs
upstairs = areas[6:10]

# Print out downstairs and upstairs
print(downstairs, upstairs)

['hallway', 11.25, 'kitchen', 18.0, 'living room', 20.0] ['bedroom', 10.75, 'bathroom', 9.5]


### Exercise

**Slicing and dicing (2)**

In the video, Hugo first discussed the syntax where you specify both where to begin and end the slice of your list:

- my_list[begin:end]

However, it's also possible not to specify these indexes. If you don't specify the begin index, Python figures out that you want to start your slice at the beginning of your list. If you don't specify the end index, the slice will go all the way to the last element of your list. To experiment with this, try the following commands in the IPython Shell:

- x = ["a", "b", "c", "d"]
- x[:2]
- x[2:]
- x[:]

**Instructions**

- Create downstairs again, as the first 6 elements of areas. This time, simplify the slicing by omitting the begin index.
- Create upstairs again, as the last 4 elements of areas. This time, simplify the slicing by omitting the end index.

In [59]:
# Create the areas list
areas = ["hallway", 11.25, "kitchen", 18.0, "living room", 20.0, "bedroom", 10.75, "bathroom", 9.50]

# Alternative slicing to create downstairs
downstairs = areas[:6]

# Alternative slicing to create upstairs
upstairs = areas[-4:]

print(downstairs, upstairs)

['hallway', 11.25, 'kitchen', 18.0, 'living room', 20.0] ['bedroom', 10.75, 'bathroom', 9.5]


### Exercise

**Subsetting lists of lists**

You saw before that a Python list can contain practically anything, even other lists! To subset lists of lists, you can use the same technique as before: square brackets. Try out the commands in the following code sample in the IPython Shell:

- x = [["a", "b", "c"],["d", "e", "f"],["g", "h", "i"]]
- x[2][0]
- x[2][:2]

x[2] results in a list that you can subset again by adding additional square brackets.
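The chained subsetting can be read as two separate steps, as in this sketch. The intermediate variable last_row is introduced here only to make the first step visible.

```python
x = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]

# Step 1: the first pair of brackets selects an inner list
last_row = x[2]
print(last_row)     # ['g', 'h', 'i']

# Step 2: the second pair of brackets subsets that inner list
print(last_row[0])  # g
print(x[2][0])      # g, the same thing in one expression
print(x[2][:2])     # ['g', 'h']
```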

What will house[-1][1] return? house, the list of lists that you created before, is already defined for you in the workspace. You can experiment with it in the IPython Shell.

**Instructions**

![image.png](attachment:image.png)

In [61]:
house

[['hallway', 11.25],
 ['kitchen', 18.0],
 ['living room', 20.0],
 ['bedroom', 10.75],
 ['bathroom', 9.5]]

In [62]:
house[-1][1]

9.5

### 1. Manipulating Lists

Wow, you're doing super well. So now, after creation and subsetting, the final piece of the Python lists puzzle is

**2. List Manipulation**

manipulation, so ways to change elements in your list, or to add elements to and remove elements from your list.

**3. Changing list elements**

Changing list elements is pretty straightforward. You use the same square brackets that we've used to subset lists, and then assign new elements to it using the equals sign. Suppose that after another look at fam, you realize that your dad's height is not up to date anymore, as he's shrinking with age. Instead of 1.89 meters, it should be 1.86 meters. To change this list element, which is at index 7, you can use this line of code. If you now check out fam, you'll see that the value is updated. You can even change an entire list slice at once. To change the elements "liz" and 1.73, you access the first two elements with 0:2, and then assign a new list to it. Do you still remember how the plus operator was different for strings and integers?
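The two updates described above might look like this. The fam list is reconstructed from the transcript, and the replacement values for the first two elements are hypothetical, since the transcript does not state them.

```python
# fam reconstructed from the values named in the video (assumed)
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# Change a single element: dad's height at index 7
fam[7] = 1.86
print(fam)  # ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.86]

# Change a whole slice at once; "lisa" and 1.74 are placeholder values
fam[0:2] = ["lisa", 1.74]
print(fam)  # ['lisa', 1.74, 'emma', 1.68, 'mom', 1.71, 'dad', 1.86]
```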

**4. Adding and removing elements**

Well, it's again different for lists. If you use the plus sign with two lists, Python simply pastes together their contents in a single list. Suppose you want to add your own name and height to the fam height list. This will do the trick. Of course, you can also store this new list in a variable, fam_ext for example. Finally, deleting elements from a list is also pretty straightforward: you use del here. Take this line, for example, that deletes the element with index 2, so "emma", from the list. If you check out fam now, you'll see that the "emma" string is gone. Because you've removed an element, all elements that came after "emma" scooted over by one index. If you run the same line again, you're again removing the element at index 2, which is now emma's height, 1.68 meters. Understanding how Python lists actually work

**5. Behind the scenes (1)**

behind the scenes becomes pretty important now. What actually happens when you create a new list, x, like this? Well, in a simplified sense, you're storing a list in your computer memory, and storing the 'address' of that list, so

**6. Behind the scenes (1)**

where the list is in your computer memory, in x. This means that x does not actually contain all the list elements; rather, it contains a reference to the list. For basic operations, the difference is not that important, but it becomes more so when you start copying lists. Let me clarify this with an example. Let's store the list x as a new variable y, by simply using the equals sign. Let's now change the element with index one in the list y, like this. The funky thing is that if you now check out x again, the second element was changed there too. That's because when you copied x to y with the equals sign,

**7. Behind the scenes (1)**

you copied the reference to the list, not the actual values themselves.
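A minimal sketch of this behavior follows. The list contents and the new value "z" are assumptions chosen for illustration, as the slide is not reproduced here.

```python
x = ["a", "b", "c"]

# The equals sign copies the reference, not the list
y = x

# Change the element with index 1 through y
y[1] = "z"

print(y)       # ['a', 'z', 'c']
print(x)       # ['a', 'z', 'c'], x sees the change too
print(x is y)  # True: both names refer to the same list object
```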

**8. Behind the scenes (1)**

When you're updating an element of the list, it's one and the same list in the computer memory you're changing. Both x and y point to this list, so the update is visible from both variables. If you want to create a list y that points to a new list in the memory with the same values,

**9. Behind the scenes (2)**

you'll need to use something other than the equals sign. You can use the list function,

**10. Behind the scenes (2)**

like this, or use slicing to select all list elements explicitly. If you now

**11. Behind the scenes (2)**

make a change to the list y points to, x is not affected. If this was a bit too much to take in, don't worry.
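Both ways of making an explicit copy can be sketched as follows. The list contents are illustrative, and z is an extra name added here to show the slicing variant next to list().

```python
x = ["a", "b", "c"]

# Two ways to create a new list with the same values
y = list(x)
z = x[:]

# Changing y leaves x (and z) untouched
y[1] = "z"

print(x)  # ['a', 'b', 'c']
print(y)  # ['a', 'z', 'c']
print(z)  # ['a', 'b', 'c']
print(x is y, x is z)  # False False
```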

**12. Let's practice!**

The exercises will help you understand list manipulation and the subtle inner workings of lists. I'm sure you'll do great!
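Looking back at the adding and removing slide from this video, a minimal sketch of those steps is shown below. It starts again from the fam list as reconstructed from the transcript, so the earlier height update is not applied, and the appended name and height are placeholders rather than values from the course.

```python
# fam reconstructed from the values named in the video (assumed)
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# + pastes two lists together into a new list; fam itself is unchanged
fam_ext = fam + ["me", 1.79]  # placeholder name and height
print(fam_ext)
# ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'me', 1.79]

# del removes the element at index 2 ("emma"); later elements shift left
del fam[2]
print(fam)  # ['liz', 1.73, 1.68, 'mom', 1.71, 'dad', 1.89]

# Running the same line again removes the new index 2, emma's height
del fam[2]
print(fam)  # ['liz', 1.73, 'mom', 1.71, 'dad', 1.89]
```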