An algorithm is a well-defined series of steps for performing a task, such as making calculations or processing data. An algorithm usually has an input and an output. In reality, any code we write performs an algorithm, whether it be simple or complicated.

In real life, we perform algorithms daily. Following a cookie recipe is an example of a series of steps that takes an input (the ingredients) and produces an output (the cookies).

Let's start with a simple algorithm that searches for a value in a list. We could use a linear search algorithm to do this. Remember that an algorithm is a particular method for performing a task, and linear search is only one of several algorithms that can solve this problem.

Linear search checks a list of items for a particular value by reviewing each item in the list until it finds the one it's looking for. If it doesn't find a matching item, we can conclude that there's no matching item in the list.

In [1]:
from csv import reader

In [2]:
with open("nba_2013.csv") as f:
    nba = list(reader(f))

In [3]:
# When the algorithm finds Kobe in the data set, store his position in kobe_position
kobe_position = ""

# Find Kobe in the data set
for row in nba:
    if row[0] == "Kobe Bryant":
        kobe_position = row[1]

In [4]:
kobe_position

'SG'
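
The loop above keeps scanning even after it finds a match. As a minimal sketch, a more general linear search might return as soon as it finds the target. The names `linear_search` and `sample_rows` are illustrative assumptions, and the rows are made-up stand-ins laid out like the CSV (name, position, age):

```python
def linear_search(rows, target, column=0):
    """Return the index of the first row whose `column` equals `target`, or -1."""
    for index, row in enumerate(rows):
        if row[column] == target:
            return index  # stop as soon as we find a match
    return -1  # we checked every row without finding the target

# Hypothetical rows in the same layout as the CSV: name, position, age
sample_rows = [
    ["Player A", "PG", "25"],
    ["Player B", "SG", "31"],
    ["Player C", "C", "28"],
]

found_at = linear_search(sample_rows, "Player B")
missing = linear_search(sample_rows, "Player Z")
print(found_at, missing)
```

Returning -1 for a missing value is only one convention; raising an exception or returning `None` would work as well.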

As algorithms become more complex, it's important to make sure the code remains modular.

**Modular** code consists of smaller chunks that we can reuse for other things. The most common way to make code modular is to use functions.

**Abstraction** is the idea that someone can use our code to perform an operation without having to worry about how we wrote or implemented it.

The sum() function exhibits both modularity and abstraction. We don't know exactly how the function is implemented, and we don't need to; we only need to know what it does. That makes it abstract. It also saves us the work of having to manually compute sums in many parts of our code. That makes it modular.
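
To illustrate, here is a rough sketch that compares a hand-written loop with the built-in `sum()`. The function `manual_sum` and the numbers in `points` are made up for this example:

```python
def manual_sum(values):
    """Add up the values one at a time, the way we would without sum()."""
    total = 0
    for value in values:
        total = total + value
    return total

points = [12, 7, 30]  # made-up numbers for illustration

manual_total = manual_sum(points)
builtin_total = sum(points)  # same result, and we never see the implementation
print(manual_total, builtin_total)
```

Both calls produce the same total, but with `sum()` we only need to know what it does, not how it does it.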

Now let's try writing a modular search function that can find the age of any player in our data set without having to repeat code.

In [8]:
# player_age returns the age of a player in our NBA data set
def player_age(name):
    for row in nba:
        if row[0] == name:
            return row[2]
    return -1 # If the function doesn't find the player, it should return -1

In [9]:
allen_age = player_age("Ray Allen")
durant_age = player_age("Kevin Durant")
shaq_age = player_age("Shaquille O'Neal")

In [7]:
print(allen_age)

38


So far, we've been working with linear search, which is a fairly basic algorithm. When we need to perform more complicated tasks, algorithms can become very involved, especially considering that many different ones can achieve the same result.

With multiple algorithms to choose from, a programmer has to make trade-offs and decide which algorithm best suits their needs. The most common factor to consider is time complexity.

Time complexity is a measurement of how much time an algorithm takes with respect to its input size. Algorithms with smaller time complexities generally take less time and are more desirable.

A constant algorithm takes the same amount of time to complete, regardless of the input size.

For example, let's consider an algorithm that returns the first element of a list:

def first(ls):
    return ls[0]
    
Regardless of list size, the algorithm returns the first element in constant time. It only takes one operation to retrieve this element, no matter how large the list.

We tend to think of algorithms in terms of steps. We consider any basic operation like setting a variable or performing arithmetic a step. Algorithms that take a constant number of steps are always constant time, even if that constant number is not 1.
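
As a small sketch of this idea, the hypothetical function below always takes three basic steps, whether the list has three elements or a million:

```python
def first_plus_last(ls):
    first = ls[0]         # step 1: read the first element
    last = ls[-1]         # step 2: read the last element
    total = first + last  # step 3: add them
    return total

small = first_plus_last([1, 2, 3])
large = first_plus_last(list(range(1_000_000)))
print(small, large)
```

Building the large list takes time, but the call to `first_plus_last` itself performs the same three steps in both cases.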

Most complicated algorithms are not constant time. However, many operations within larger algorithms are constant time. Since we don't particularly care about what the constant is, we don't need to tediously count steps, as long as we're certain we'll get a constant.

An example of an operation that's not constant time is a loop that touches every element in an input list. Since a larger input would necessitate more steps, we can't treat this operation as a constant.
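
One way to see this is to count loop iterations directly. The sketch below uses a hypothetical helper, `total_with_steps`, that adds up a list and reports how many times the loop body ran:

```python
def total_with_steps(ls):
    """Return the sum of ls along with the number of loop iterations."""
    total = 0
    steps = 0
    for value in ls:
        total += value
        steps += 1  # one iteration per element
    return total, steps

_, steps_small = total_with_steps([1] * 10)
_, steps_large = total_with_steps([1] * 1000)
print(steps_small, steps_large)
```

The step count matches the length of the input, so it can't be bounded by a fixed constant.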

It's important to be able to recognize the time complexity of an algorithm. Of the three implementations below, one is not constant time.

In [11]:
# Implementation A: Convert degrees Celsius to degrees Fahrenheit
def celsius_to_fahrenheit(degrees):
    step_1 = degrees * 1.8
    step_2 = step_1 + 32
    return step_2

# Implementation B: Reverse a list
def reverse(ls):
    length = len(ls)
    new_list = []
    for i in range(length):
        new_list.append(ls[length - i - 1])
    return new_list

# Implementation C: Print a blastoff message after a countdown
def blastoff(message):
    count = 10
    for i in range(count):
        print(count - i)
    print(message)


In [13]:
# Of the three functions above, this is the one that is not constant time
not_constant = "B"

In the worst-case scenario for a list of size n, an algorithm like linear search has to check all n elements. We refer to this time complexity as **linear time** because the runtime grows at a constant rate with respect to the size of the input.

Algorithms that take constant multiples of n steps (where n is the input size) are still linear time. For instance, an algorithm that takes 5n steps, or even 0.5n steps, is linear time. If we have an algorithm that prints the first half of a list (and we know the length of the list ahead of time), the algorithm will take 0.5n time. Even though it takes less than n time, we still consider it linear.
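
Here is a rough sketch of the half-list example. The hypothetical function `first_half` collects the elements instead of printing them so that we can also return a step count:

```python
def first_half(ls):
    """Return the first half of ls and the number of loop iterations used."""
    n = len(ls)
    result = []
    steps = 0
    for i in range(n // 2):  # visit only the first half
        result.append(ls[i])
        steps += 1
    return result, steps

half_10, steps_10 = first_half(list(range(10)))
half_20, steps_20 = first_half(list(range(20)))
print(steps_10, steps_20)
```

Doubling the input size doubles the number of steps, which is the behavior that makes an algorithm linear.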

It's also worth noting that we only care about performance at a large scale. At a small scale, most algorithms will run pretty quickly, and it's only when n becomes large that we worry about time complexity.

Consequently, we only consider the highest order of n for time complexity. That means that an algorithm that runs in 9n + 20 time is linear, because the constant component is negligible for large values of n.
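
To see why the constant becomes negligible, we can compute 9n + 20 for a few values of n and divide by n. The function name `steps` is only an illustration:

```python
def steps(n):
    """Number of steps taken by a hypothetical algorithm that runs in 9n + 20 time."""
    return 9 * n + 20

for n in (10, 1_000, 1_000_000):
    # The ratio steps(n) / n gets closer to 9 as n grows
    print(n, steps(n), steps(n) / n)
```

For n = 10 the extra 20 steps are a noticeable share of the total, but for n = 1,000,000 the ratio is almost exactly 9.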

So far, we've only seen linear time and constant time algorithms. While there are infinitely many categories of algorithms and time complexities, these two cover a large variety of possibilities.

In [14]:
# Find the length of a list
def length(ls):
    count = 0
    for elem in ls:
        count = count + 1
    return count
        
length_time_complexity = "linear"

In [15]:
# Check whether a list is empty -- Implementation 1
def is_empty_1(ls):
    if length(ls) == 0:
        return True
    else:
        return False

is_empty_1_complexity = "linear"

In [16]:
# Check whether a list is empty -- Implementation 2

def is_empty_2(ls):
    for element in ls:
        return False
    return True

is_empty_2_complexity = "constant"

When discussing time complexity, we should use the proper notation. Most commonly, we use **Big-O Notation**.

To denote constant time, we would write O(1), because 1 is the simplest constant.

To denote linear time, we would write O(n), because n is the simplest example of linearity.

Big-O Notation follows a similar pattern for other time complexities. For example, O(n^2) (quadratic), O(2^n) (exponential), and O(log(n)) (logarithmic) are all valid notation. Algorithms with these complexities tend to be more involved than the ones we've seen so far.
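
As a rough sketch of two of these complexities, the hypothetical functions below count steps for a nested loop, which is O(n^2), and for repeated halving, which is O(log(n)):

```python
def count_pairs(n):
    """Nested loops over n items: the inner body runs n * n times."""
    steps = 0
    for i in range(n):
        for j in range(n):
            steps += 1
    return steps

def count_halvings(n):
    """Halve n until it reaches 1: the loop runs about log2(n) times."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

print(count_pairs(10))       # quadratic growth
print(count_halvings(1024))  # logarithmic growth
```

With an input of 10, the nested loop already takes 100 steps, while halving an input of 1024 takes only 10.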

Time complexity is an important consideration when we're analyzing real-world data. An inefficient algorithm will perform very slowly on a large data set.

Algorithms with lower-order time complexities are more efficient. Constant time algorithms, which we denote with O(1), are more efficient than linear time algorithms, which we denote with O(n). Similarly, an algorithm with complexity O(n^2) is more efficient than one with complexity O(n^3).

When considering algorithms, we generally want to choose the one with the lowest time complexity, especially when we expect large inputs. It may not always be the easiest one to implement, but the extra effort is usually worth the resulting efficiency.