# Using Python Language Features and the Standard Library

- Before moving on to data structures and algorithms, we should review some Python basics

- We are going to focus on Pythonic approaches

## 1.1 Built-in functions

In [1]:
import random
from timeit import timeit

N = 100000  # Number of elements in the list

# Ensure every list is the same
random.seed(12)
my_data = [random.random() for i in range(N)]

- Let's compare summing the values in a list in a C style and in a Python style

- In C, you would loop over the indices of the array. Written in Python, that style looks like this:

In [2]:
def manualSumC():
    n = 0
    for i in range(len(my_data)):
        n += my_data[i]
    return n

- In Python you can loop directly over the list of elements instead

In [3]:
def manualSumPy(): 
    n = 0
    for evt_count in my_data:
        n += evt_count
    return n


- There is also a built-in sum function in Python

In [4]:
def builtinSum(): 
    return sum(my_data)

- If we compare all three, the fastest is the built-in sum, then the Python-style loop, and the C-style loop is the slowest (the built-in is roughly 5x faster than the C-style loop in the run below)

- By leveraging Python's built-ins we can write code that is both faster and cleaner

In [5]:
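repeats = 1000
# timeit returns the total time in seconds for all `repeats` calls;
# with repeats = 1000 that figure equals the average time per call in ms.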
print(f"manualSumC: {timeit(manualSumC, globals=globals(), number=repeats):.3f}ms")
print(f"manualSumPy: {timeit(manualSumPy, globals=globals(), number=repeats):.3f}ms")
print(f"builtinSum: {timeit(builtinSum, globals=globals(), number=repeats):.3f}ms")

manualSumC: 1.414ms
manualSumPy: 0.771ms
builtinSum: 0.278ms
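
- A single timing run can be noisy. As a minimal sketch (the names data, builtin_sum and best_ms_per_call are illustrative and not part of the notebook above), timeit.repeat can take several runs and report the best one, with the conversion to milliseconds per call written out explicitly:

```python
from timeit import repeat

# Small stand-in list; this is NOT the my_data list used above.
data = [0.5] * 10_000

def builtin_sum():
    return sum(data)

number = 200  # calls per timing run
# repeat() returns one total (in seconds) per run.
runs = repeat(builtin_sum, number=number, repeat=5)

# Take the fastest run and convert total seconds to ms per call.
best_ms_per_call = min(runs) / number * 1000
print(f"builtin_sum: {best_ms_per_call:.4f} ms per call (best of {len(runs)} runs)")
```

- Taking the minimum is a common choice because slower runs are usually caused by other activity on the machine rather than by the code being measured.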


- This is because built-in functions such as sum() are implemented in C inside CPython, so the loop over the elements runs in compiled code instead of being executed step by step by the Python interpreter

- What is CPython?

Python is a programming language, and CPython is the main program that actually runs Python code. It’s the default version of Python that most people use. It’s written in C, which is why it’s called CPython. You write Python code, and CPython compiles it to bytecode and then interprets that bytecode one instruction at a time.
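
- One way to see the difference is to look at the bytecode CPython produces. The sketch below uses the standard dis module. The function names manual_sum and builtin_sum are illustrative, and the exact instruction names and counts vary between Python versions:

```python
import dis

def manual_sum(values):
    total = 0
    for v in values:
        total += v
    return total

def builtin_sum(values):
    return sum(values)

# Collect the names of the bytecode instructions for each function.
manual_ops = [ins.opname for ins in dis.get_instructions(manual_sum)]
builtin_ops = [ins.opname for ins in dis.get_instructions(builtin_sum)]

# The manual version contains a FOR_ITER loop that the interpreter
# steps through once per element; the built-in version has no loop
# in its bytecode because the iteration happens inside sum().
print("manual_sum instructions: ", len(manual_ops))
print("builtin_sum instructions:", len(builtin_ops))
print("FOR_ITER in manual_sum: ", "FOR_ITER" in manual_ops)
print("FOR_ITER in builtin_sum:", "FOR_ITER" in builtin_ops)
```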

- In particular, built-in functions that take an iterable (e.g. a list) are likely to provide the greatest performance benefit. The Python documentation provides equivalent Python code for many of these functions:

  - all(): Boolean AND of all items

  - any(): Boolean OR of all items

  - max(): Return the maximum item

  - min(): Return the minimum item

  - sum(): Return the sum of all items

- It’s usually better to tell Python what you want done (at a high level), rather than writing out all the steps. Built-ins and libraries will often do the work in optimised C code for you, and then just hand back a Python object.
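
- As a small illustration of stating what you want rather than how to compute it, the sketch below contrasts a hand-written loop with the built-ins listed above. The list values and the helper has_negative_loop are made up for this example:

```python
values = [3, 8, -1, 5]

# Manual version: spell out every step of the search.
def has_negative_loop(items):
    for x in items:
        if x < 0:
            return True
    return False

# Built-in versions: describe the result you want.
has_negative = any(x < 0 for x in values)     # is at least one item negative?
all_small = all(abs(x) < 10 for x in values)  # is every item below 10 in magnitude?
largest = max(values)
smallest = min(values)
total = sum(values)

print(has_negative_loop(values), has_negative, all_small, largest, smallest, total)
```

- Like the manual loop, any() and all() stop as soon as the answer is known, so they do not always visit every item.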

## 1.2 Searching an element in a list

- As a second example, we are going to compare a manual Python search with the Pythonic way of searching for elements in a list

- Let's first generate inputs

In [6]:
import random

N = 2500  # Number of elements in list
M = 2  # Elements are drawn from the range [0, N*M]

def generateInputs():
    random.seed(12)  # Ensure every list is the same
    return [random.randint(0, int(N*M)) for i in range(N)]

- The manual search is a linear search that iterates through the list element by element

In [7]:
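def manualSearch():
    ls = generateInputs()
    ct = 0  # Number of target values found in the list
    # Look for every M-th value in the range [0, N*M)
    for i in range(0, int(N*M), M):
        # Linear scan by index, stopping at the first match
        for j in range(len(ls)):
            if ls[j] == i:
                ct += 1
                break
    return ct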

- operatorSearch() uses the in operator to perform each search. This is still a linear scan of the list, but the inner loop now runs inside CPython's C code instead of in interpreted Python

In [8]:
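def operatorSearch():
    ls = generateInputs()
    ct = 0  # Number of target values found in the list
    for i in range(0, int(N*M), M):
        # The in operator performs the scan of the list in C
        if i in ls:
            ct += 1
    return ct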

- In the run below, the manual search is about 2.4x slower than the Pythonic implementation

In [9]:
repeats = 1000
# Time the input generation alone so it can be subtracted from each result
gen_time = timeit(generateInputs, number=repeats)
print(f"manualSearch: {timeit(manualSearch, number=repeats)-gen_time:.2f}ms")
print(f"operatorSearch: {timeit(operatorSearch, number=repeats)-gen_time:.2f}ms")

manualSearch: 57.24ms
operatorSearch: 24.28ms
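
- When swapping a manual loop for an operator or built-in, it is worth checking that both versions give the same answer. The sketch below assumes the same kind of input as above, but the helpers count_manual and count_operator are illustrative variants that take their inputs as arguments and return the count:

```python
import random

random.seed(12)
ls = [random.randint(0, 5000) for _ in range(2500)]
targets = range(0, 5000, 2)  # every second value in [0, 5000)

def count_manual(items, wanted):
    ct = 0
    for t in wanted:
        # Index-based linear scan, stopping at the first match
        for j in range(len(items)):
            if items[j] == t:
                ct += 1
                break
    return ct

def count_operator(items, wanted):
    # Same search, with the scan delegated to the in operator
    return sum(1 for t in wanted if t in items)

manual_count = count_manual(ls, targets)
operator_count = count_operator(ls, targets)
print("Counts agree:", manual_count == operator_count)
```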
