# Programming Essentials in Python - Arguments, Lists and Loops

## Lists

In the last session we looked at single variables: we assigned numeric, string and boolean values to them and then did some basic calculations. In analytics and machine learning, however, we rarely work with single values; we usually work with many values stored in one object. For example, we might store a year's worth of daily sales in a variable and then work with it. To store multiple values in a variable, we use lists in Python. The syntax to create a list is as follows:

In [1]:
# Suppose we look at daily sales over two weeks for a store

sales_w1 = [2500, 3000, 2750, 3100, 2600, 2860, 2650]
sales_w2 = [3000, 3200, 2800, 2900, 3150, 2950, 3000]

# Let's look at how they are stored in the Python environment
%whos

Variable   Type    Data/Info
----------------------------
sales_w1   list    n=7
sales_w2   list    n=7


As you can see, both variables are stored as lists, and the n=7 in the output tells us how many items each list holds.

In [2]:
# To check the type of a stored variable, we can use the type() function
print(type(sales_w1))

# To find out how many items there are in a list
print(len(sales_w1))

<class 'list'>
7


One very important concept in coding is indexing. An index tells us where in a list a specific value is located. The following code cell gives us an idea of how indexing works in Python:

In [3]:
# Trying to get the sales of the first day
print(sales_w1[1])

# But this is not the first entry. The first entry is actually returned by the following code
print(sales_w1[0])

3000
2500


What's happening here? Python uses zero-based indexing. This means that the first entry in a list (which can hold row or column data) has position 0, the second entry has position 1, and so on. So in a list of 100 items (say, 100 days of sales data), the 100th entry has position 99. Let's see some more examples:

In [4]:
# Second day of week 2 sales
print(sales_w2[1])

# Third day of week 1 sales
print(sales_w1[2])

# First day week 2 sales
print(sales_w2[0])

# Last day of week 1 sales
print(sales_w1[6])

3200
2750
3000
2650


An interesting thing about coding is that you can automate as much as possible. In the previous examples we retrieved the first day or the last day by giving the exact position. But what if we want code that gives us the last entry of a list without knowing its exact position? We can try the following code:

In [5]:
# The sales of the last day of week 2
print(sales_w2[len(sales_w2) - 1])

3000


Let's take some time to understand what happened here. len(sales_w2) returns 7, and 7 - 1 is 6. So in effect we are running print(sales_w2[6]), the same as entering the number manually. But this way, I don't need to know in advance how long the list is. If sales_w2 grows to 10 numbers, the code will output the 10th number (index 9). If it becomes a 15-number list, it will output the 15th number (index 14).
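
To see this in action, the sketch below applies the same expression to lists of different lengths. The lists ten_days and fifteen_days are made-up values for illustration only, not data from this lesson.

```python
# Hypothetical lists of different lengths (illustrative values only)
ten_days = list(range(100, 1100, 100))      # 10 values: 100, 200, ..., 1000
fifteen_days = list(range(100, 1600, 100))  # 15 values: 100, 200, ..., 1500

# The same expression picks the last entry, whatever the length of the list
last_of_ten = ten_days[len(ten_days) - 1]
last_of_fifteen = fifteen_days[len(fifteen_days) - 1]

print(last_of_ten)      # 1000, the 10th value (index 9)
print(last_of_fifteen)  # 1500, the 15th value (index 14)
```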

In coding we can often do a single thing in multiple ways. What we did in the line of code above can be achieved in a shorter way, as follows:

In [6]:
# The last day's sales of week 1
print(sales_w1[-1])

# The second-to-last day's sales of week 2
print(sales_w2[-2])

2650
2950


The important thing here is to know that there are often many ways to work out the logic of your code, and there is no single best method.
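
As a quick check that the two approaches agree, the sketch below compares a negative index with the equivalent len()-based index. It redefines sales_w1 so that it runs on its own; the other variable names are chosen here for illustration.

```python
sales_w1 = [2500, 3000, 2750, 3100, 2600, 2860, 2650]
n = len(sales_w1)

# Index -1 refers to the same item as index n - 1
last_by_len = sales_w1[n - 1]
last_by_negative = sales_w1[-1]

# Index -2 refers to the same item as index n - 2
second_last_by_len = sales_w1[n - 2]
second_last_by_negative = sales_w1[-2]

print(last_by_len == last_by_negative)                # True
print(second_last_by_len == second_last_by_negative)  # True
```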

So far we have only looked at numeric lists. But what happens when we have non-numeric lists? Let's look at a list of planets:

In [7]:
planets = ["Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus", "Neptune"]

# to check how they are stored
%whos

Variable   Type    Data/Info
----------------------------
planets    list    n=8
sales_w1   list    n=7
sales_w2   list    n=7


In [8]:
# First Planet
print(planets[0])

# Third Planet
print(planets[2])

# Second to Last Planet
print(planets[-2])

Mercury
Earth
Uranus


So we can see that indexing works the same way for a list of planet names as for a list of numbers. There are some differences between numeric and non-numeric lists, and you will see some of them soon in this lesson.

## Slicing Lists

Slicing means getting a group of items from a list instead of a single item. Let's look at some examples below:

In [9]:
# First 3 days' sales in week 1
print(sales_w1[:3])

# Sales from the 4th to the last day of week 2
print(sales_w2[3:])

# 3rd planet to 6th planet
print(planets[2:6])

# All planets except the first and last
print(planets[1:-1])

# Last 3 planets
print(planets[-3:])

[2500, 3000, 2750]
[2900, 3150, 2950, 3000]
['Earth', 'Mars', 'Jupiter', 'Saturn']
['Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus']
['Saturn', 'Uranus', 'Neptune']


One very important thing to note here is that when a slice starts at one position of the list and ends at another, we give the position index of the starting point, and then the position index just after the ending point. An example will make it easier. For the 3rd to 6th planets, we used the code planets[2:6]. We started with 2, because that is the position index of the third planet. But notice that we ended at 6, the position index of the 7th planet, which is not displayed. In Python, planets[2:6] means: give me the planets starting at index 2 and stopping before index 6. Another example is the first 3 days' sales of week 1, sales_w1[0:3], which is the same as sales_w1[:3].
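
The rule that the stop index is excluded can be checked with a short sketch. It redefines planets so that it runs on its own; the name third_to_sixth is chosen here for illustration.

```python
planets = ["Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus", "Neptune"]

third_to_sixth = planets[2:6]
print(third_to_sixth)       # ['Earth', 'Mars', 'Jupiter', 'Saturn']
print(len(third_to_sixth))  # 4, which is 6 - 2
print(planets[6])           # Uranus, the item at the stop index, is left out

# Because the stop index is excluded, two slices that share a boundary
# split the list with no overlap and no gap
print(planets[:3] + planets[3:] == planets)  # True
```

A handy consequence is that the length of a slice is simply the stop index minus the start index.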

Suppose I want to change the name of the 4th entry of the list

In [10]:
planets[3] = "Malacandra"

#looking at planets again
planets

['Mercury',
 'Venus',
 'Earth',
 'Malacandra',
 'Jupiter',
 'Saturn',
 'Uranus',
 'Neptune']

In [11]:
# What if I want to change the names of the first 3 planets?
planets[0:3] = ['Mer', 'Ven', 'Ear']

#checking the planets list again
planets

['Mer', 'Ven', 'Ear', 'Malacandra', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

In [12]:
#but now let us fix this list and get back to normal
planets[0:4] = ['Mercury', 'Venus', 'Earth', 'Mars']

#checking if we are getting the correct planets
planets

['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

Do note that in Python, when you overwrite or change a value in a list, there is no going back and no undo button, so be careful! If you are changing values in a list and think you will also need the original values, create a duplicate first. A plain assignment such as planets_2 = planets would only give the same list a second name, so we use the copy() method instead:

In [13]:
# Duplicating the planets list (copy() creates a new, independent list)
planets_2 = planets.copy()

# Checking if planets_2 looks the same as planets
planets_2

# Checking if both of the planet lists are stored in the system
%whos

Variable    Type    Data/Info
-----------------------------
planets     list    n=8
planets_2   list    n=8
sales_w1    list    n=7
sales_w2    list    n=7
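
One point worth checking when duplicating a list: assigning a list to a new name with = does not create a second list, while the copy() method does. The sketch below illustrates the difference on a shortened planets list; the names alias and duplicate are chosen here for illustration.

```python
# Shortened list, for illustration only
planets = ["Mercury", "Venus", "Earth", "Mars"]

alias = planets             # a second name for the SAME list
duplicate = planets.copy()  # a new, independent list with the same items

# Change the original list
planets[3] = "Malacandra"

print(alias[3])      # Malacandra: the alias sees the change
print(duplicate[3])  # Mars: the copy keeps the original value

# 'is' checks whether two names refer to the very same object
print(alias is planets, duplicate is planets)  # True False
```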


## List Functions

In Python we will use many functions to get our data analysis done, and many of them work on lists. So let's look at a few. Don't worry, these are just basic list functions to get you familiar with coding. Over time we will learn more functions related to statistics and machine learning.

Some example functions

In [14]:
# total
print(sum(sales_w1))
print(sum(sales_w2))

# max sales of week 1
print(max(sales_w1))

# min sales of week 2
print(min(sales_w2))

# sorting planets alphabetically
print(sorted(planets))

# sorting planets in reverse alphabetical order
print(sorted(planets, reverse=True))

19460
21000
3100
2800
['Earth', 'Jupiter', 'Mars', 'Mercury', 'Neptune', 'Saturn', 'Uranus', 'Venus']
['Venus', 'Uranus', 'Saturn', 'Neptune', 'Mercury', 'Mars', 'Jupiter', 'Earth']
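
These built-in functions can also be combined. As a small sketch, the average daily sales for a week can be computed by dividing sum() by len(), and the spread by subtracting min() from max(). The variable names average_w1, average_w2 and spread_w1 are chosen here for illustration.

```python
sales_w1 = [2500, 3000, 2750, 3100, 2600, 2860, 2650]
sales_w2 = [3000, 3200, 2800, 2900, 3150, 2950, 3000]

# Average = total divided by the number of entries
average_w1 = sum(sales_w1) / len(sales_w1)
average_w2 = sum(sales_w2) / len(sales_w2)
print(average_w1)  # 2780.0
print(average_w2)  # 3000.0

# Spread between the best and the worst day of week 1
spread_w1 = max(sales_w1) - min(sales_w1)
print(spread_w1)   # 600
```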


## List Methods

One of the reasons lists are used a lot in Python is their 'methods': special operations that you can perform on a list.

Suppose you want to add another planet to the planets list without needing to know its total length. This is easily arranged:

In [15]:
# Adding Pluto to the planets list
planets.append("Pluto")

# So now planets list looks like
planets

['Mercury',
 'Venus',
 'Earth',
 'Mars',
 'Jupiter',
 'Saturn',
 'Uranus',
 'Neptune',
 'Pluto']

What if we want to remove Pluto?

In [16]:
# pop() removes the last item of the list
planets.pop()

# To see what happened
planets

['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

We can also check if a certain planet is on the list

In [17]:
# Checking if Earth is on the list
"Earth" in planets

True

In [18]:
# Checking if Pluto is on the list
"Pluto" in planets

False
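
Lists have a few more built-in methods in the same spirit as append() and pop(). The sketch below shows index(), insert() and remove() on a shortened planets list; the shortened list and the "Sun" entry are made up for illustration and are not part of this lesson's data.

```python
# Shortened list, for illustration only
planets = ["Mercury", "Venus", "Earth", "Mars"]

# index() returns the position of the first matching item
earth_position = planets.index("Earth")
print(earth_position)  # 2

# insert() adds an item at a given position and shifts later items to the right
planets.insert(0, "Sun")
print(planets)  # ['Sun', 'Mercury', 'Venus', 'Earth', 'Mars']

# remove() deletes the first matching item by value
planets.remove("Sun")
print(planets)  # ['Mercury', 'Venus', 'Earth', 'Mars']

# pop() also hands back the item it removes
removed = planets.pop()
print(removed)  # Mars
```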

This concludes our introduction to lists. We will now take a detour to understand how booleans in a list can help us get some very interesting insights, using conditional statements.

## Conditionals and Loops

In Python, conditionals are comparisons and checks for conditions. When we ask whether the value of a variable is equal to 7, or whether the daily sales figure is greater than 3000, we are using conditionals. First, let's check some easy comparisons:

In [19]:
# Let's create some simple variables
A = 10
B = 13

# Now let's start asking comparison questions. Starting with, is A greater than 5?
print(A > 5)

# Next, is B less than 10?
print(B < 10)

# Next, is A equal to 10?
print(A == 10)

# Next, is B greater than or equal to 12?
print(B >= 12)

# Next, is A not equal to 10?
print(A != 10)

True
False
True
True
False


As we can see, the statements are very simple, and the outcomes are Boolean values: True or False.
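
Comparisons can also be combined into a single question with the keywords and, or and not. The sketch below reuses A and B from the cell above; the result names are chosen here for illustration.

```python
A = 10
B = 13

# 'and' is True only when both comparisons are True
both_true = A > 5 and B < 10
print(both_true)    # False, because B < 10 is False

# 'or' is True when at least one comparison is True
either_true = A > 5 or B < 10
print(either_true)  # True, because A > 5 is True

# 'not' flips a Boolean value
negated = not A == 10
print(negated)      # False, because A == 10 is True

# Python also allows chained comparisons
in_range = 5 < A < 12
print(in_range)     # True
```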

But again, individual variables are not much fun to work with; we can do much more with lists. For that, we need to learn one last piece of the puzzle: loops. Loops are one of the most powerful programming concepts. They allow you to repeat a task quickly and with very little code. Let's look at an example to see how loops work.

In [23]:
# Suppose we want to create a new variable from the sales of week 1, where we add 10% to each day's sales.
# Let's call our new variable inflated_sales

# The long way to do this starts with creating the new variable as an empty list
inflated_sales = []

# Next for sales of day 1 week 1
inflated_sales.append(sales_w1[0] * 1.1)
print(inflated_sales)

# Then for day 2 week 1
inflated_sales.append(sales_w1[1] * 1.1)
print(inflated_sales)

# And so on ....

[2750.0]
[2750.0, 3300.0000000000005]


In [24]:
# When using loops, we again start with the first step - creating inflated_sales
inflated_sales = []

# Next, we write our loop
for sales in sales_w1:
    inflated_sales.append(sales * 1.1)

print(inflated_sales)

[2750.0, 3300.0000000000005, 3025.0000000000005, 3410.0000000000005, 2860.0000000000005, 3146.0000000000005, 2915.0000000000005]


Now let's try to understand what is happening. The loop, known as a for loop, starts with the keyword 'for'. The line 'for sales in sales_w1:' tells Python to take one item at a time from the sales_w1 list and store it in the variable sales, a temporary variable created for this loop. The loop starts with the first value of sales_w1, stores it in sales, and then executes the indented line 'inflated_sales.append(sales * 1.1)'. This command tells Python to take the value of sales (which is now the first value of sales_w1), multiply it by 1.1, and store the result as a new value in the list 'inflated_sales'. Then we go back to the top of the for loop. This time Python takes the second value from sales_w1, stores it in sales, and adds 'sales * 1.1' as another value in 'inflated_sales', and so on until every value in the list has been used.
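
One way to watch the loop at work is to print the temporary variable on every pass. The sketch below does that, and it also rounds each result with round(), which is an addition not used in the cell above: values such as 3300.0000000000005 are a side effect of floating-point arithmetic, and rounding to 2 decimal places tidies them up. The name increased is chosen here for illustration.

```python
sales_w1 = [2500, 3000, 2750, 3100, 2600, 2860, 2650]
inflated_sales = []

for sales in sales_w1:
    # Add 10% and round to 2 decimal places to drop floating-point noise
    increased = round(sales * 1.1, 2)
    # Show which value the loop is currently working on
    print(sales, "->", increased)
    inflated_sales.append(increased)

print(inflated_sales)
# [2750.0, 3300.0, 3025.0, 3410.0, 2860.0, 3146.0, 2915.0]
```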

Let's try our hand at something more complicated. Let's answer three questions:
1. On which days in week 2 were sales higher than 3000?
2. On how many days in total were sales higher than 3000?
3. On what percentage of days were sales higher than 3000?

In [28]:
# First let's write a loop that checks, for each value of sales_w2, whether it is greater than 3000

days_sales_greater_than_3k = []

for daily_sales in sales_w2:
    days_sales_greater_than_3k.append(daily_sales > 3000)
    
print(days_sales_greater_than_3k)

# Next is how many days in total the sales were greater than 3000 in week 2. Remember that a True counts as 1, and means
# sales were greater than 3000. A False counts as 0. So the sum() function will do

print(sum(days_sales_greater_than_3k))

# Finally we compute the share of such days based on the same logic. The sum gives us the total of all the 1s. The len() function
# gives us the total number of entries in the list. So dividing the sum by the len gives us the proportion
# (multiply by 100 to get a percentage)

print(sum(days_sales_greater_than_3k)/len(days_sales_greater_than_3k))

[False, True, False, False, True, False, False]
2
0.2857142857142857
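
The list of True and False values answers the first question only indirectly. As a possible alternative, the sketch below collects the day numbers themselves. It relies on enumerate(), which hands back the position together with each value, and on an if test, which is covered later in this lesson; both are assumptions of this sketch rather than code from the cell above, and the names days_above_3k and percentage_above_3k are illustrative.

```python
sales_w2 = [3000, 3200, 2800, 2900, 3150, 2950, 3000]

days_above_3k = []

# enumerate() gives the index (starting at 0) along with each value
for position, daily_sales in enumerate(sales_w2):
    if daily_sales > 3000:
        # Add 1 so that days are numbered 1 to 7 instead of 0 to 6
        days_above_3k.append(position + 1)

print(days_above_3k)  # [2, 5]

# Share of such days, expressed as a percentage and rounded to 1 decimal place
percentage_above_3k = round(100 * len(days_above_3k) / len(sales_w2), 1)
print(percentage_above_3k)  # 28.6
```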


Now we are seeing code of the kind we will use in our data analytics sections, and we are starting to answer more complex questions. How about we try to answer the next question:

On what proportion of days were the sales of week 2 higher than the sales of week 1? Which days were these?

To answer this question, we will learn another way to work with a for loop, using number indexes:

In [32]:
# The for loop first

sales_comparison = []

for i in range(7):
    sales_comparison.append(sales_w2[i] > sales_w1[i])
    
# The days when sales of week 2 were higher than the sales of week 1
print(sales_comparison)

# The proportion of days when the sales of week 2 were higher than the sales of week 1
print(sum(sales_comparison)/len(sales_comparison))

[True, True, True, False, True, True, True]
0.8571428571428571


On about 86% of the days (6 out of 7), the sales of week 2 were greater than the sales of week 1. Let's look at the for loop again. Here we are using something called a loop counter, and 'i' is that counter. The code 'for i in range(7)' means that on the first pass through the loop, i takes the value 0. The next line of code then takes sales_w2[0] and sales_w1[0] and checks whether sales_w2[0] is greater than sales_w1[0]. The result is either True or False. After the first pass, i becomes 1, and we compare sales_w2[1] with sales_w1[1]. We then do the same for i = 2 up to i = 6, where the loop stops, because range(7) produces the numbers 0 to 6. Many lines of code compressed into 2 lines!
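
The loop above hard-codes the number 7. Two possible variations are sketched below: range(len(...)) adapts to the length of the list, and the built-in zip() pairs up the two lists so that no counter is needed at all. Neither variation appears in the cell above, and the names comparison_by_index and comparison_by_zip are chosen here for illustration.

```python
sales_w1 = [2500, 3000, 2750, 3100, 2600, 2860, 2650]
sales_w2 = [3000, 3200, 2800, 2900, 3150, 2950, 3000]

# Variation 1: let len() decide how many times the loop runs
comparison_by_index = []
for i in range(len(sales_w1)):
    comparison_by_index.append(sales_w2[i] > sales_w1[i])

# Variation 2: zip() hands back one value from each list on every pass
comparison_by_zip = []
for day_w1, day_w2 in zip(sales_w1, sales_w2):
    comparison_by_zip.append(day_w2 > day_w1)

print(comparison_by_zip)  # [True, True, True, False, True, True, True]
print(comparison_by_index == comparison_by_zip)  # True
```

Note that zip() stops at the end of the shorter list, so both variations assume the two weeks have the same number of days.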

In [33]:
# one last example - getting the total sales for each day across week 1 and week 2

total_daily_sales = []

for i in range(7):
    total_daily_sales.append(sales_w1[i] + sales_w2[i])
    
print(total_daily_sales)

[5500, 6200, 5550, 6000, 5750, 5810, 5650]


Finally we look at one last feature of conditionals: if-then-else! Let's say we have a list of numbers from 0 to 99, and we want to divide them into three groups: 0-31 as low, 32-61 as medium and the rest as high. To do this, we will use if, elif and else statements:

In [39]:
# First we create a list of numbers from 0 to 99
numbers = list(range(100))

print(numbers)


# Now we run an if-then-else block and populate our empty list, hi_med_low
hi_med_low = []

for num in numbers:
    if num < 32:
        hi_med_low.append("low")
    elif num >= 32 and num < 62:
        hi_med_low.append("medium")
    else:
        hi_med_low.append("high")
        
print(hi_med_low)
        

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
['low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'medium', 'high', 'high', 'high', 'high', 'high', 'high', 'high', 'high', 'high', 'high', 'hig