## Agenda
- To review the ideas of computer science, programming, and problem-solving
- To understand abstraction and the role it plays in the problem-solving process
- To understand and implement the notion of an abstract data type

***

Textbook: Problem Solving With Algorithms and Data Structures Using Python by Brad Miller and David Ranum, Luther College.
It can be reached at: https://runestone.academy/runestone/books/published/pythonds/index.html


Python cheat sheet: https://www.pythoncheatsheet.org/

## A First Python Example

In [1]:
print('this is my first python code')

this is my first python code


In [2]:
48//23

2

## Programming
- Take a computing problem and create an executable program
    1. Analyze and understand the problem
    2. Design a data model and an algorithm
        - Data model: way to represent the information
        - Algorithm: formal computation steps
    3. Implement the algorithm in a computer language


- Programming != solving a problem
- Programming is teaching somebody else how to solve a problem
    - “Somebody else” is a computer
    - A computer is a device that can be programmed to carry out a finite set of operations on binary numbers
    - Computers can’t do anything that they are not programmed to do

## Why Data Structures and Algorithms?
- The goal is to write efficient and correct programs
- Write computer programs in a language, e.g. Python, Java, C++
- Develop mathematical maturity to understand analytical proofs
- Implement good abstractions
***

### Attributes of Algorithms
- Correctness
    - Give a correct solution to the problem!
- Efficiency
    - Time:   How long does it take to solve the problem?
    - Space: How much memory is needed?
    - Benchmarking vs. Analysis
- Ease of understanding
    - Program maintenance
- Elegance

### A Choice of Algorithms
- Possible to come up with several different algorithms to solve the same problem.
    - Which one is the "best"?
        - Most efficient
            - Time vs. Space?
        - Easiest to maintain?
- How do we measure time efficiency?
    - Running time? Machine dependent!
    - Number of steps?

### The Data Cleanup Problem
**Problem:** Remove 0 entries from a list of numbers.

<img src="images/week-01/data_cleanup_problem.png" width="500">

- We look at three algorithms for the same problem, and compare their **time** and **space** efficiency.

### The Shuffle-Left Algorithm
- We scan the list from left to right, and whenever we encounter a 0 element we copy ("shuffle") the rest of the list one position left.

In [10]:
import numpy as np
L=np.zeros(10, dtype = int)
print(L)
L[0]

[0 0 0 0 0 0 0 0 0 0]


0

In [5]:
L[1]=12
L[2]=32
L[3]=71
L[4]=34
L[6]=36
L[7]=92
L[9]=13

In [6]:
print(L)

[ 0 12 32 71 34  0 36 92  0 13]


In [7]:
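def shuffle_left(L):
    legit = len(L)  # number of entries still considered part of the list
    i = 0
    while i < legit:
        if L[i] == 0:
            # Shuffle everything after position i one place to the left
            for j in range(i, legit - 1):
                L[j] = L[j+1]
            legit = legit - 1  # do not advance i: the shifted-in value may also be 0
        else:
            i = i + 1
    return L[0:legit]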

In [8]:
print(shuffle_left(L))

[12 32 71 34 36 92 13]


### The Copy-Over Algorithm
- We scan the list from left to right, and whenever we encounter a nonzero element we copy it over to a new list.
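
The copy-over idea can be sketched as below. This is a minimal illustration rather than a reference implementation; the name `copy_over` and the sample data (taken from the list built earlier) are assumptions made for the example.

```python
def copy_over(values):
    """Return a new list holding the nonzero entries of values, in order."""
    result = []              # the extra list is the space cost of this algorithm
    for v in values:         # single left-to-right scan
        if v != 0:
            result.append(v)
    return result


data = [0, 12, 32, 71, 34, 0, 36, 92, 0, 13]
print(copy_over(data))  # [12, 32, 71, 34, 36, 92, 13]
```

The input list is left unchanged, which is why this variant needs additional memory proportional to the number of nonzero entries.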

### The Converging-Pointers Algorithm
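We scan the list from both the left (pointer L) and the right (pointer R). Whenever L encounters a 0 element, the element at location R is copied to location L and R is moved one position to the left; otherwise L moves one position to the right. The scan stops when the two pointers meet.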
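
One possible sketch of the converging-pointers idea follows. The names `converging_pointers`, `left`, `right`, and `legit` are assumptions chosen for this example, and the function works on a plain Python list. Note that the order of the surviving elements is not preserved.

```python
def converging_pointers(values):
    """Remove zeros in place (order not preserved) and return the cleaned prefix."""
    if len(values) == 0:
        return values
    legit = len(values)                 # number of entries still considered valid
    left, right = 0, len(values) - 1
    while left < right:
        if values[left] != 0:
            left += 1                   # nonzero entry: keep it and move on
        else:
            values[left] = values[right]  # overwrite the 0 with the rightmost entry
            right -= 1
            legit -= 1
    # The pointers have met; the entry they point at has not been checked yet
    if values[left] == 0:
        legit -= 1
    # The slice makes a copy for convenience; returning legit alone would avoid it
    return values[:legit]


data = [0, 12, 32, 71, 34, 0, 36, 92, 0, 13]
print(converging_pointers(data))  # [13, 12, 32, 71, 34, 92, 36]
```

Each position is visited at most once by one of the two pointers, so the list is traversed only once.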

### Data-Cleanup Algorithm Comparison
- Which one is the most space efficient?
    - Shuffle-left: no additional space
    - Copy-over: needs a new list
    - Converging-pointers: no additional space
- Which one is the most time efficient?
    - Shuffle-left: many comparisons
    - Copy-over: goes through list only once
    - Converging-pointers: goes through list only once
- How do we measure time efficiency?

## Measuring Efficiency
- Need a metric to measure time efficiency of algorithms:
    - How long does it take to solve the problem?
        - Depends on machine speed
    - How many steps does the algorithm execute?
        - Better metric, but a lot of work to count all steps
    - How many "fundamental steps" does the algorithm execute?
- The step count depends on the size and type of input, so we are interested in knowing:
    - Best-case, Worst-case, Average-case behavior
- Need to analyze the algorithm!

### Sequential Search

1. Get values for Name, N1,…, Nn, T1,…, Tn
2. Set the value of i to 1 and set the value of Found to NO
3. Repeat steps 4 through 7 until Found = YES or i > n
4.      If Name = Ni then
5.            Print the value of Ti
6.            Set the value of Found to YES
         Else
7.            Add 1 to the value of i
8. If Found = NO then print "Sorry, name not in directory"
9. Stop
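
The pseudocode above could be translated to Python roughly as follows. This is a sketch under assumptions: the function name `sequential_search`, the extra comparison counter, and the sample directory entries are all made up for illustration.

```python
def sequential_search(names, numbers, target):
    """Return (number, comparisons) for target, or (None, comparisons) if absent."""
    comparisons = 0
    for name, number in zip(names, numbers):
        comparisons += 1              # one "Name = Ni" test per entry visited
        if name == target:
            return number, comparisons
    return None, comparisons          # corresponds to "name not in directory"


# Hypothetical directory data, for illustration only
names = ["Ada", "Grace", "Alan", "Edsger"]
numbers = ["555-0101", "555-0102", "555-0103", "555-0104"]

print(sequential_search(names, numbers, "Ada"))     # best case: ('555-0101', 1)
print(sequential_search(names, numbers, "Edsger"))  # worst case, found: ('555-0104', 4)
print(sequential_search(names, numbers, "Linus"))   # not found: (None, 4)
```

Counting comparisons makes the best-case (1 comparison) and worst-case (n comparisons) behavior visible without timing anything.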

- The list of operations you want the files to support is analogous to an __Abstract Data Type (ADT)__
- The physical organization is analogous to a **Data Structure (DS)**
***
- It’s very important to distinguish between these two concepts

### Abstract Data Type (ADT)
- An ADT represents a particular set of behaviors
- You can formally define, perhaps using mathematical logic, what an ADT is and does
- For example, a Stack is a list that implements a LIFO (last in, first out) policy on additions and deletions
### Data Structure (DS)
- A DS is more concrete
- Typically, it is a technique or strategy for implementing an ADT
- For example, a linked list or an array can be used to implement a stack class
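
As a small illustration of a data structure implementing an ADT, here is a sketch of a stack backed by a Python list (a dynamic array). The class name `ArrayStack` and its method names are assumptions made for this example, not a prescribed interface.

```python
class ArrayStack:
    """Stack ADT (LIFO) backed by a Python list."""

    def __init__(self):
        self._items = []               # the end of the list is the top of the stack

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()       # removes and returns the most recent item

    def is_empty(self):
        return len(self._items) == 0


s = ArrayStack()
for x in (1, 2, 3):
    s.push(x)
print(s.pop(), s.pop(), s.pop())  # 3 2 1
```

Code that uses only `push`, `pop`, and `is_empty` does not need to know that a list sits underneath, which is the point of separating the ADT from the data structure.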

### Some common ADTs that most programmers know about:
- stack
- queue
- priority queue
- dictionary
- sequence
- set

### Some common DSs used to implement those ADTs:
- Array
- Linked list
- Hash table (open, closed, circular)
- Trees (binary search trees, heaps, AVL trees, 2-3 trees, red/black trees, B-trees)

### Topics that will be covered throughout this course
- Data Structures
    - Lists, Stacks, Queues, Trees, Heaps, Hashes, Graphs, Sets
- Algorithms (Data Structures and Algorithms 2)
    - Sorting, Greedy, Backtracking, Randomized, Dynamic Programming

# Motivation - Recursive Algorithms

## Example - Factorial

Today, we motivate why this course is worth taking.

We start off with *factorial*. 

$$n! = n \cdot (n-1) \cdot \ldots \cdot 1\;,$$

can be defined recursively by

$$n! = n \cdot (n-1)!\;.$$

A more precise definition is

$$n! = \begin{cases} 1 &\mbox{if } n = 1 \\
                   n \cdot (n-1)! & \mbox{if } n >1\end{cases}$$

In [None]:
def myFactorial(n):
    if (n==1): return 1
    else: return n* myFactorial(n-1)

In [None]:
myFactorial(5)

In [None]:
import time

In [None]:
import sys
sys.setrecursionlimit(5000)  # 2500 nested calls exceed CPython's default limit of 1000
s = time.process_time_ns()
myFactorial(2500)
e = time.process_time_ns()

In [None]:
print(e - s)

## Example - Fibonacci Numbers

Fibonacci numbers are:
$$F_{1}=1, F_{2}=1,\quad \textrm{and}\quad F_{n}=F_{n-1}+F_{n-2}\quad \textrm{when}\quad n>2.$$

In [None]:
import numpy as np

In [None]:
def myFibonacciRecursive(n):
    if (n==1 or n==2): return 1
    else: return myFibonacciRecursive(n-1)+myFibonacciRecursive(n-2)

In [None]:
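def myFibonacci(n):
    # Iterative (bottom-up) version; always returns (array holding F_1..F_n, F_n)
    if (n==1 or n==2): 
        return np.ones(n), 1.0
    else: 
        F=np.zeros(n)  # float array, so the values are stored as floats
        F[0]=1
        F[1]=1
        for i in range(2,n):
            F[i]=F[i-1]+F[i-2]
    return F,F[n-1]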

In [None]:
A,b=myFibonacci(50)

In [None]:
print(A)

In [None]:
myFibonacciRecursive(5)

In [None]:
myFibonacciRecursive(30)

In [None]:
myFibonacciRecursive(35)

In [None]:
# Warning: the naive recursion makes on the order of 10**10 calls here; expect a very long wait
myFibonacciRecursive(50)

In [None]:
print(int(myFibonacci(50)[1]))  # myFibonacci returns (array, F_n); take F_n

In [None]:
s = time.process_time()
myFibonacci(40)
e = time.process_time()
print("{:.12f}".format(e-s))

In [None]:
s = time.process_time()
myFibonacciRecursive(40)
e = time.process_time()
print(e - s)

In [None]:
s = time.process_time_ns()
myFibonacci(40)
e = time.process_time_ns()
print(e - s)  # elapsed CPU time in nanoseconds (an integer)

In [None]:
s = time.process_time()
for i in range(100000):
    pass
e = time.process_time()
print(e - s)

In [None]:
def myFastFibonacciRecursive(n):
    if (n==1 or n==2): 
        return 1

    if temp[n] != -1: # Check whether it is already calculated
        return temp[n]
    
    temp[n] = myFastFibonacciRecursive(n-1) + myFastFibonacciRecursive(n-2)
    return temp[n]

In [None]:
n=40
temp = np.full(n+1,-1) # Array of n+1 entries, all -1 (-1 means "not computed yet")

In [None]:
print(temp)

In [None]:
myFastFibonacciRecursive(n)

In [None]:
print(temp)

In [None]:
n=40
temp = np.full(n+1,-1)
s = time.process_time()
print(myFastFibonacciRecursive(n))
e = time.process_time()
print(e - s)
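
The same caching idea can also be expressed without a global array, for example with `functools.lru_cache` from the standard library. This is an alternative sketch, not the approach used in the cells above; the name `fib_memo` is an assumption made for the example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)   # remember every result computed so far
def fib_memo(n):
    if n == 1 or n == 2:
        return 1
    return fib_memo(n - 1) + fib_memo(n - 2)


print(fib_memo(40))  # 102334155
```

Because each value is computed only once, the number of calls grows linearly with `n` instead of exponentially.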

## Data structures
- A __data structure__ is a way to organize information to enable efficient computation over that information

- A data structure supports certain operations, each with a:
    - Meaning: what does the operation do/return
    - Performance: how efficient is the operation

Examples of data structures:
- List with operations insert and delete
- Stack with operations push (insert an element at the top) and pop (delete the element at the top)

Data structures are important in coding efficient programs.

What is an “Efficient” Algorithm?
- Efficiency measures
    - Small amount of time
    - Low memory usage
    - Low power consumption

But there are unavoidable trade-offs:
- Time vs. space
- Making one operation more efficient can make another less efficient
- Generality vs. simplicity vs. performance

## Measuring Running Time
- Experimentally
    - Implement algorithm
    - Run program with inputs of varying size
    - Measure running times
    - Plot results (**Exercise: plot problem size vs running time, then see the difference between myFibonacci and myFibonacci2**)
    
- Running Time
    - Grows with the input size
        - Focus on large inputs
    - Varies with input
        - Consider the **worst-case** input
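
The experimental procedure above might look like the following sketch. The functions `fib_iterative`, `fib_recursive`, and `time_call` are hypothetical stand-ins written for this example, and the input sizes are kept small so that the run finishes quickly; measured times will differ from machine to machine.

```python
import time


def fib_iterative(n):
    """Bottom-up Fibonacci with F_1 = F_2 = 1."""
    a, b = 1, 1
    for _ in range(n - 2):
        a, b = b, a + b
    return b


def fib_recursive(n):
    """Naive recursive Fibonacci (exponential number of calls)."""
    if n == 1 or n == 2:
        return 1
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def time_call(func, n):
    """Return the wall-clock seconds taken by one call func(n)."""
    start = time.perf_counter()
    func(n)
    return time.perf_counter() - start


sizes = [5, 10, 15, 20]
for n in sizes:
    t_iter = time_call(fib_iterative, n)
    t_rec = time_call(fib_recursive, n)
    print(f"n={n:2d}  iterative={t_iter:.6f}s  recursive={t_rec:.6f}s")
```

Collecting the two timing columns into lists and plotting them against `sizes` gives the problem-size versus running-time picture that the exercise asks for.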

In [None]:
def myArrayElement(a):
    return a[0] ## Two operations: index 0 and return
alist = ['red','green','blue']
myArrayElement(alist)

In [None]:
A=np.random.rand(10) 
A

In [None]:
nsize=10
B=np.random.rand(nsize) 

In [None]:
def myargmax(array):
    # Input: an array
    # Output: the index of the maximum value
    index = 0 # assignment, 1 op
    mylength = array.shape[0] # assignment, 2 op
    for i in range(1,mylength): # 1 op per loop
        if array[i] > array[index]: # 3 ops per loop
            index = i # 1 op per loop, but not always
    return index # 1 op

In [None]:
myargmax(B)

In [None]:
B

- How many operations if the list has 10 elements? 100,000 elements?
    - The count grows in proportion to the size of the input list: roughly 5*nsize + 4 in the worst case
    - Note that the **for** loop runs longer and longer as the input list grows
    - Thus, if we plot nsize against running time, the running time increases linearly
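
To see the linear growth concretely, the operation counts from the comments in `myargmax` can be tallied while the function runs. The sketch below is an assumption-laden illustration: `argmax_with_count` is a hypothetical variant that works on plain Python lists and charges operations the same way the comments do. Under this charging scheme the worst case comes out as 5*n - 1, which differs from the 5*nsize + 4 estimate only by a constant, so the growth is linear either way.

```python
def argmax_with_count(values):
    """Return (index of the maximum, number of operations charged)."""
    ops = 1                      # index = 0
    index = 0
    ops += 2                     # read the length and assign it
    n = len(values)
    for i in range(1, n):
        ops += 1                 # loop bookkeeping
        ops += 3                 # two indexings and one comparison
        if values[i] > values[index]:
            ops += 1             # index = i (only when a new maximum is found)
            index = i
    ops += 1                     # return
    return index, ops


for n in (10, 100, 1000):
    increasing = list(range(n))  # worst case: every comparison finds a new maximum
    _, ops = argmax_with_count(increasing)
    print(n, ops)                # 10 49, then 100 499, then 1000 4999
```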

## Terminology
- Abstract Data Type (ADT)
    - Mathematical description of a “thing” with a set of operations
    - Not concerned with implementation details

- Algorithm
    - A high level, language-independent description of a step-by-step process

- Data structure
    - A specific organization of data and family of algorithms for implementing an ADT

- Implementation of a data structure
    - A specific implementation in a specific language
    

## For example : Stacks
- The Stack ADT supports operations:
    - isEmpty: returns true if there have been the same number of pops as pushes
    - push: takes an item
    - pop: raises an error if empty, else returns most-recently pushed item
    - top: returns the top element of the stack
    
A Stack data structure could use a _linked-list_, an _array_, or something else, together with the associated algorithms for the operations.
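
As one possible illustration, the four operations listed above can be implemented on top of a singly linked list. The class names `LinkedStack` and `_Node` are assumptions made for this sketch; other designs are equally valid.

```python
class _Node:
    """One cell of a singly linked list."""

    def __init__(self, item, next_node):
        self.item = item
        self.next = next_node


class LinkedStack:
    """Stack ADT implemented with a singly linked list; the head is the top."""

    def __init__(self):
        self._head = None

    def isEmpty(self):
        return self._head is None

    def push(self, item):
        self._head = _Node(item, self._head)   # new node becomes the top

    def pop(self):
        if self._head is None:
            raise IndexError("pop from empty stack")
        item = self._head.item
        self._head = self._head.next
        return item

    def top(self):
        if self._head is None:
            raise IndexError("top of empty stack")
        return self._head.item


stack = LinkedStack()
stack.push("a")
stack.push("b")
print(stack.top())      # b
print(stack.pop())      # b
print(stack.pop())      # a
print(stack.isEmpty())  # True
```

Every operation here touches only the head node, so each runs in constant time regardless of how many items the stack holds.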