# Array Sequences

What to Expect in this Section

In [6]:
# # # Introduction to Arrays
# # # Low Level Arrays
# # # Dynamic Arrays and Amortization
# # # Array based "Mini Project"
# # # Several Array Interview Problems

# In this lecture we will briefly set the background for arrays in Python.
# # The term "array sequence" is general
# # Python has 3 main sequence classes:
# - List: [1,2,3]
# - Tuple: (1,2,3)
# - String: '123'
# # All support indexing (e.g. t[0] returns the first element)
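
As a quick illustration of the shared indexing behaviour, here is a minimal sketch. The variable names are made up for this example and are not part of the lecture.

```python
# Three built-in sequence types holding "the same" three items
my_list = [1, 2, 3]
my_tuple = (1, 2, 3)
my_string = '123'

# All three support zero-based indexing
first_items = (my_list[0], my_tuple[0], my_string[0])
print(first_items)

# Only the list is mutable; tuples (and strings) reject item assignment
my_list[0] = 99
try:
    my_tuple[0] = 99
except TypeError as err:
    print('tuple:', err)
```

Note that indexing the string gives back a one-character string, not an integer.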

# ## Rest of this Section.
# # Rest of this section will focus on various aspects of arrays.
# # How arrays are constructed.
# # Focus will be heavy on theory.
# # Interview problems heavy on practical usage.



# Low Level Arrays

How Computers Store Information

In [7]:
# Focus on computer memory
# Memory address
# Units of memory (bits and bytes)
# Memory retrieval
# Note: Text heavy slides!
    

Low-level computer architecture

In [1]:
# # Memory of a computer stored in bits
# # Typical unit is byte, which is 8 bits
# # Computers typically use a memory address.
# # Each byte associated with unique address.
# - Byte #2144 versus Byte #2147


# Arrays

In [2]:
# # Representation of computer memory

# # Individual bytes with consecutive addresses

# # Computer hardware is designed, in theory, so that any byte of the 
# main memory can be efficiently accessed

# # Computer's main memory performs as random access memory (RAM).

# # Just as easy to retrieve byte #8675309 as it is to retrieve byte #309.

# # Individual byte of memory can be stored or retrieved in O(1) time.

# # Programming language keeps track of the association between an 
# identifier and the memory address.

# # May want a video game to keep track of the top ten scores for that game.

# # Prefer to use a single name for the group 

# # Use index numbers to refer to the scores in that group.

# # A group of related variables can be stored one after another in a 
# contiguous portion of the computer's memory.

# # We will denote such a representation as an array.

# # A text string is stored as an ordered sequence of individual characters.

# # For this discussion, assume each Unicode character is represented with
# 16 bits (i.e., 2 bytes). (Modern CPython actually uses 1, 2, or 4 bytes
# per character, depending on the string's contents.)

# # Six-character string, such as 'SAMPLE' would be stored in 12 consecutive
# bytes of memory.

# # This is an array of six characters.

# # We refer to each location within an array as a cell

# # An integer index describes its location

# # Each cell of an array uses the same number of bytes.

# # Allows any cell to be accessed in constant time

# # Appropriate memory address can be computed using the calculation,
# start + cellsize * index

# # Higher level abstraction

# # Basic abstraction for real-world discussion
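
The address calculation start + cellsize * index from the notes above can be sketched as follows. The function name `cell_address` and the starting address 2146 are assumptions chosen only for illustration, and the 2-bytes-per-character figure follows the simplified model used in these notes.

```python
def cell_address(start, cell_size, index):
    """Address of the cell at `index` in an array that begins at `start`."""
    return start + cell_size * index

# 'SAMPLE' stored with 2 bytes per character, starting at a made-up address
start = 2146
for index, char in enumerate('SAMPLE'):
    print(char, cell_address(start, 2, index))
```

Because every cell has the same size, the computation is a single multiply and add, which is why access by index takes constant time.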



# Referential Arrays

Referential Arrays examples

In [4]:
# ## Final Topic of this lecture

# # Imagine 100 student names with ID numbers 

# # Each cell of the array needs to have the same number of bytes

# # Names have different lengths, so how can we store them in equal-sized cells?

# # We can use an array of object references

# # Each element is a reference to the object.

# # A single list instance may include multiple references to the same 
# object as elements of the list.

# # Single object can be an element of two or more lists.

# # When computing the slice of a list, the result is a new list instance.

# # New list has references to the same elements that are in the original
# list.

# # temp = primes[3:6]

# # temp[2] = 15
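
A minimal sketch of the slice behaviour described above. The contents of `primes` are an assumption, since the notes do not list them.

```python
# Hypothetical contents for the primes list
primes = [2, 3, 5, 7, 11, 13, 17, 19]

# The slice is a new list whose cells reference the same int objects
temp = primes[3:6]
print(temp)                    # [7, 11, 13]
print(temp[0] is primes[3])    # True: same object, two references

# Assigning into temp rebinds one cell of temp only
temp[2] = 15
print(temp)      # [7, 11, 15]
print(primes)    # [2, 3, 5, 7, 11, 13, 17, 19], unchanged
```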


Copying Arrays

In [5]:
# # backup = list(primes)

# # This produces a new list that is a shallow copy in that it references
# the same elements as in the first list.

# # If the contents of the list were of a mutable type, a deep copy, meaning 
# a new list with new elements, can be produced by using the deepcopy 
# function from the copy module.

# # counters = [0]*8

# # All eight cells reference the same object!

# # counters[2] += 1 

# # Does not technically change the value of the existing integer instance.
# This computes a new integer and stores a reference to it in cell 2.

# # primes.extend(extras)
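
To make the copy semantics above concrete, here is a minimal sketch. The `nested` list and the contents of `primes` and `extras` are made up for illustration; only the operations themselves come from the notes.

```python
import copy

# Hypothetical data: a list whose elements are themselves mutable lists
nested = [[1, 2], [3, 4]]

shallow = list(nested)          # new list, same inner list objects
deep = copy.deepcopy(nested)    # new list, new inner list objects

nested[0].append(99)
print(shallow[0])   # [1, 2, 99]: the shallow copy sees the change
print(deep[0])      # [1, 2]: the deep copy does not

# All eight cells start out referencing the same int object 0
counters = [0] * 8
counters[2] += 1    # rebinds cell 2 to a new int; other cells are untouched
print(counters)     # [0, 0, 1, 0, 0, 0, 0, 0]

# extend() appends references to the elements of extras
primes = [2, 3, 5]
extras = [7, 11]
primes.extend(extras)
print(primes)       # [2, 3, 5, 7, 11]
```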


# Review

In [6]:
# Basic Computer Architecture

# Low-level array representation

# Referential Arrays

# Dynamic Arrays

In [1]:
# Don't need to specify how large an array is beforehand.

# A list instance often has greater capacity than current length.

# If elements keep getting appended, eventually this extra space runs out.

# Let's show an example of this extra "room" with a live code demonstration!


In [2]:
import sys    # provides sys.getsizeof(), which reports an object's
              # actual size in bytes


# Set n
n = 10

data = []

for i in range(n):
    
    # Number of elements
    a = len(data)
    
    # Actual Size in Bytes
    b = sys.getsizeof(data)
    
    print('Length: {0:3d}; Size in bytes: {1:4d}'.format(a,b))
    
    # increase Length by one
    data.append(n)

Length:   0; Size in bytes:   56
Length:   1; Size in bytes:   88
Length:   2; Size in bytes:   88
Length:   3; Size in bytes:   88
Length:   4; Size in bytes:   88
Length:   5; Size in bytes:  120
Length:   6; Size in bytes:  120
Length:   7; Size in bytes:  120
Length:   8; Size in bytes:  120
Length:   9; Size in bytes:  184



Dynamic Array Implementation

In [4]:
# # The key is to provide a means to grow the array "A" that stores the elements 
# of a list.

# # We can't actually grow that array, its capacity is fixed.

# # If an element is appended to a list at a time when the underlying 
# array is full, we'll need to perform the following steps....

# # Allocate a new array "B" with larger capacity.

# # Set B[i] = A[i], for i = 0, ..., n-1, where n denotes the current number of
# items.

# # Set A = B, that is, we henceforth use B as the array supporting the list.

# # Insert the new element in the new array.

# # (a) Create new array B; (b) store elements of A in B; (c) reassign 
# reference A to the new array.

# # How large of a new array to create?

# # A commonly used rule is for the new array to have twice the capacity
# of the existing array that has been filled.

# # We'll see the mathematical reasoning behind this later.

# # What a dynamic array is.

# # How to theoretically implement a dynamic array.

# # Up next: Code an example Dynamic Array!


# Dynamic Array Exercise

Dynamic Array Implementation

In [5]:
# # A quick note on public vs private methods: we can use an underscore before
# the method name to keep it non-public. For example:
    

In [6]:
class M(object):
    
    def public(self):
        print('Use Tab to see me!')
        
    def _private(self):               # underscore is used for nonpublic method
        print("You won't be able to Tab to see me!")

In [7]:
m = M()

In [8]:
m.public()

Use Tab to see me!


In [9]:
m._private()

You won't be able to Tab to see me!


Dynamic Array Implementation

In [13]:
import ctypes

class DynamicArray(object):
    '''
    DYNAMIC ARRAY CLASS (Similar to Python List)
    '''
    
    def __init__(self):
        self.n = 0 # Count actual elements (Default is 0)
        self.capacity = 1 # Default Capacity
        self.A = self.make_array(self.capacity)
        
    def __len__(self):
        """
        Return number of elements stored in array 
        """
        return self.n
    
    def __getitem__(self,k):
        """
        Return element at index k
        """
        if not 0 <= k < self.n:
            raise IndexError('K is out of bounds!') # check if index k is in bounds of the array!
        
        return self.A[k] # Retrieve from array at index k
    
    def append(self,ele):
        """
        Add element to end of the array
        """
        if self.n == self.capacity:
            self._resize(2*self.capacity) # Double capacity if not enough room
            
        self.A[self.n] = ele # Set self.n index to element
        self.n += 1
        
    def _resize(self,new_cap):
        """
        Resize internal array to capacity new_cap
        """
        
        B = self.make_array(new_cap) # New bigger array
        
        for k in range(self.n): # Reference all existing values
            B[k] = self.A[k]
            
        self.A = B # call A the new bigger array 
        self.capacity = new_cap # Reset the capacity
        
    def make_array(self,new_cap):
        """
        Returns a new array with new_cap capacity
        """
        return (new_cap * ctypes.py_object)()
    
    

In [14]:
# Instantiate
arr = DynamicArray()


In [15]:
# Append new element
arr.append(1)

In [16]:
# check length
len(arr)

1

In [17]:
# Append new element
arr.append(2)

In [18]:
# check length
len(arr)

2

In [19]:
# Index
arr[0]

1

In [20]:
arr[1]

2

In [21]:
# Awesome, we made our own dynamic array! Play around with it and see how it auto-resizes. Try 
# using the same sys.getsizeof() function we worked with previously!
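
One way to watch the resizing is sketched below. Since `sys.getsizeof()` on the wrapper object would not follow the reference to the raw array, this sketch reports the `capacity` attribute and `ctypes.sizeof()` of the underlying array instead. `TinyDynamicArray` is a hypothetical, compact stand-in that repeats the same doubling logic as the class above so the block runs on its own.

```python
import ctypes

class TinyDynamicArray:
    """Compact stand-in for the DynamicArray class (hypothetical name)."""

    def __init__(self):
        self.n = 0
        self.capacity = 1
        self.A = (self.capacity * ctypes.py_object)()

    def append(self, ele):
        if self.n == self.capacity:
            # Double the capacity and copy the existing references across
            new_cap = 2 * self.capacity
            B = (new_cap * ctypes.py_object)()
            for k in range(self.n):
                B[k] = self.A[k]
            self.A = B
            self.capacity = new_cap
        self.A[self.n] = ele
        self.n += 1

arr = TinyDynamicArray()
capacities = []
for value in range(9):
    arr.append(value)
    capacities.append(arr.capacity)
    print('Length: {0:3d}; Capacity: {1:3d}; Bytes in raw array: {2:4d}'.format(
        arr.n, arr.capacity, ctypes.sizeof(arr.A)))
```

The capacity should jump only when an append finds the array full, and stay flat in between.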

# AMORTIZATION

Amortized Analysis

In [1]:
# # The strategy of replacing an array with a new, larger array might at
# first seem slow.

# # A single append operation may require O(n) time to perform.

# # Our new array allows us to add n new elements before the array must
# be replaced again.

# # Using an algorithmic design pattern called amortization, we can show
# that performing a sequence of such append operations on a dynamic array
# is actually quite efficient.

# # 1) Allocate memory for a larger array of size, typically twice the 
# old array.

# # 2) Copy the contents of old array to new array.

# # 3) Free the old array.
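
To make the amortization claim a little more concrete, the sketch below counts how many element copies a doubling strategy performs over a run of appends. It is a simulation under the assumptions used in this section (initial capacity 1, capacity doubles when full); the function name `count_copies` is made up for this example.

```python
def count_copies(n_appends):
    """Count element copies made by n_appends appends with capacity doubling."""
    capacity = 1
    size = 0
    copies = 0
    for _ in range(n_appends):
        if size == capacity:
            copies += size      # every existing element moves to the new array
            capacity *= 2
        size += 1
    return copies

for n in (10, 100, 1000, 10000):
    copies = count_copies(n)
    print('n = {0:5d}; copies = {1:5d}; copies per append = {2:.2f}'.format(
        n, copies, copies / n))
```

The total number of copies stays below 2n, so although a single append can cost O(n), the average cost per append over the whole sequence is constant.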

# # Watch the lecture video then.....

# # Check out resource section of lecture for PDF of a more detailed explanation!

# # Up next, a lecture overviewing the interview problems in this section
# of the course.


# Array Interview Problems

In [1]:
# # Technical sections are essentially split into 2 parts
# - Theory and Learning Exercises
# - Interview Problems and Solutions

# # Sometimes the theory lines up well with common interview questions.

# # Unfortunately, this isn't the case with arrays!

# # You will most likely find the interview problems (for this section)
# not having much in common with the theory and exercise lectures!

# #### Don't use Python "tricks" when answering the interview problems.

# # For example, if asked to reverse a string, don't use:
# - my_string[::-1]

# ------------------------------------------------------------------------------
# # These interview problems will seem very challenging!

# # If you find yourself completely stuck try writing out what you would
# do manually to solve the problem (brute-force)

# # It's highly suggested you try solving these with pen/paper or a
# whiteboard first

# # Using a whiteboard or pen and paper for a problem makes it much harder!

# # If you are completely stuck on the problem and have tried brute forcing it 
# - Give it 1-2 days and try it again.
# - Still stuck? Look at the solution and code it out
# - Wait 1-2 days and try the problem again.

# # Lectures alternate between problems and their solutions.


# Anagram Check Interview Problem

Problem-1

In [2]:


# Given two strings, check to see if they are anagrams. An anagram is 
# when the two strings can be written using the exact same letters (so
# you can just rearrange the letters to get a different phrase or word).

# For example:
    
#     "public relations" is an anagram of "crap built on lies."
    
#     "clint eastwood " is an anagram of "old west action"
    
# Note: Ignore spaces and capitalization. So "d go" is an anagram of "God" 
#     and "dog" and "o d g".
    

Solution

In [11]:
def anagram2(s1,s2):
    
    s1 = s1.replace(' ','').lower()
    s2 = s2.replace(' ','').lower()
    
    # Edge Case Check
    if len(s1) != len(s2):
        return False
    
    count = {}
    
    for letter in s1:
        if letter in count:
            count[letter] += 1
        else:
            count[letter] = 1
            
    for letter in s2:
        if letter in count:
            count[letter] -= 1
        else:
            count[letter] = 1
            
    for k in count:
        if count[k] != 0:
            return False
        
    # Every count balanced out, so the strings are anagrams
    return True
            

In [12]:
anagram2('clint eastwood','olds west action')

False
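
For checking your own answers while testing, a cross-check using `collections.Counter` can be handy. This is a sketch for verification only, not an interview answer, since it leans on exactly the kind of built-in shortcut the notes advise against. The name `anagram_counter` is made up for this example.

```python
from collections import Counter

def anagram_counter(s1, s2):
    """Compare letter counts after dropping spaces and lowercasing."""
    s1 = s1.replace(' ', '').lower()
    s2 = s2.replace(' ', '').lower()
    return Counter(s1) == Counter(s2)

print(anagram_counter('clint eastwood', 'old west action'))
print(anagram_counter('clint eastwood', 'olds west action'))
print(anagram_counter('d go', 'God'))
```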

# Array Pair Sum Interview Problem

Problem-2

In [13]:
# # Given an integer array, output all the unique pairs that sum up to a specific
# value "k". So the input:
#     pair_sum([1,3,2,2],4)
    
    
# Would return 2 pairs:

#       (1,3)
#       (2,2)
        
# NOTE: FOR TESTING PURPOSE CHANGE YOUR FUNCTION SO IT OUTPUTS THE NUMBER
#     OF PAIRS
    
    


Solution

In [None]:
def pair_sum(arr,k):
    
    if len(arr)<2:
        return
    
    # Sets for tracking
    seen = set()
    output = set()
    
    for num in arr:
        
        target = k-num
        
        if target not in seen:
            seen.add(num)
            
        else:
            output.add(((min(num,target)), max(num,target)))
            
    #return len(output)
    print('\n'.join(map(str,list(output))))    

In [10]:
pair_sum([1,3,2,2],4)

(1, 3)
(2, 2)
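
The problem note asks for a version that outputs the number of pairs for testing. A minimal sketch of that variant follows; it keeps the same set-based logic and only changes what is returned. The name `pair_sum_count` and the choice to return 0 for arrays shorter than two elements are assumptions.

```python
def pair_sum_count(arr, k):
    """Return the number of unique pairs in arr that sum to k."""
    if len(arr) < 2:
        return 0

    seen = set()      # numbers visited so far that are still waiting for a partner
    output = set()    # unique (smaller, larger) pairs found

    for num in arr:
        target = k - num
        if target not in seen:
            seen.add(num)
        else:
            output.add((min(num, target), max(num, target)))

    return len(output)

print(pair_sum_count([1, 3, 2, 2], 4))
```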


# Find the Missing Element Interview Problem

Problem 3

In [1]:
# # Consider an array of non-negative integers. A second array is 
# formed by shuffling the elements of the first array and deleting a 
# random element. Given these two arrays, find which element is missing
# in the second array.

# # Input:
     
#     finder([1,2,3,4,5,6,7],[3,7,2,1,4,6])
    
# # Output:

#      5 is the missing number
    
    

Solution

In [2]:
def finder(arr1,arr2):
    
    # Sort the arrays
    arr1.sort()
    arr2.sort()
    
    # Compare elements in the sorted arrays
    for num1,num2 in zip(arr1,arr2):
        if num1 != num2:
            return num1
        
    # Otherwise return last element
    return arr1[-1]
 


In [3]:
arr1 = [1,2,3,4,5,6,7]
arr2 = [3,7,2,1,4,6]
finder(arr1,arr2)

5

In [4]:
# In most interviews, you would be expected to come up with a linear
# time solution. We can use a hashtable and store the number of times
# each element appears in the second array. Then for each element in 
# the first array we decrement its counter. Once we hit an element with 
# zero count, that’s the missing element. Here is this solution:

In [5]:
import collections

def finder2(arr1,arr2):
    
    # Using default dict to avoid key errors
    d = collections.defaultdict(int)
    
    # Add a count for every instance in Array 2
    for num in arr2:
        d[num] += 1
        
    # Check if num not in dictionary
    for num in arr1:
        if d[num] == 0:
            return num 
        
        # Otherwise, subtract a count
        else: d[num] -= 1

In [6]:
arr1 = [5,5,7,7]
arr2 = [5,7,7]

finder2(arr1,arr2)

5

In [7]:
# One possible solution is computing the sum of all the numbers in arr1
# and arr2, and subtracting arr2’s sum from arr1’s sum. The difference
# is the missing number in arr2. In languages with fixed-width integers,
# this approach could be problematic if the arrays are too long or the
# numbers are very large, because the sum can overflow. (Python integers
# have arbitrary precision, so overflow is not a concern here.)
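
The sum-based idea is not coded in these notes, so here is a minimal sketch of it. The name `finder_sum` is made up for this example.

```python
def finder_sum(arr1, arr2):
    """Missing element = sum of the full array minus sum of the shorter one."""
    return sum(arr1) - sum(arr2)

print(finder_sum([1, 2, 3, 4, 5, 6, 7], [3, 7, 2, 1, 4, 6]))
print(finder_sum([5, 5, 7, 7], [5, 7, 7]))
```

Like the XOR approach, this runs in linear time and needs no extra storage beyond the running totals.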

# By performing a very clever trick, we can achieve linear time and 
# constant space complexity without any problems. Here it is: initialize
# a variable to 0, then XOR every element in the first and second
# arrays with that variable. Every value that appears in both arrays
# cancels itself out (x ^ x == 0), so in the end the value of the
# variable is the missing element in arr2.

In [8]:
def finder3(arr1, arr2):
    result = 0
    
    # Perform an XOR between the numbers in the arrays
    for num in arr1+arr2:
        result^=num
        print(result)
        
    return result    

In [9]:
finder3(arr1,arr2)

5
0
7
0
5
2
5


5

# Largest Continuous Sum Interview Problem

Problem 4

In [10]:
# # Given an array of integers(positive and negative) find the largest 
# continuous sum.

Solution

In [11]:
def large_cont_sum(arr):
    
    # Check to see if array is length 0
    if len(arr)==0:
        return 0
    
    # start the max and current sum at the first element
    max_sum = current_sum = arr[0]
    
    # For every element in array
    for num in arr[1:]:
        
        # Set current sum as the higher of the two
        current_sum = max(current_sum + num, num)
        
        # Set max as the higher between the current sum and the current max
        max_sum = max(current_sum, max_sum)
        
    return max_sum

In [12]:
large_cont_sum([1,2,-1,3,4,10,10,-10,-1])

29
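
If you get stuck, the brute-force approach suggested earlier applies here too: try every contiguous run and keep the best total. The sketch below is a slower O(n^2) cross-check, not a replacement for the linear solution; the name `large_cont_sum_brute` is made up for this example.

```python
def large_cont_sum_brute(arr):
    """Try every contiguous run and keep the best sum (O(n^2) time)."""
    if len(arr) == 0:
        return 0

    best = arr[0]
    for start in range(len(arr)):
        running = 0
        for end in range(start, len(arr)):
            running += arr[end]          # sum of arr[start..end]
            best = max(best, running)
    return best

print(large_cont_sum_brute([1, 2, -1, 3, 4, 10, 10, -10, -1]))
print(large_cont_sum_brute([-3, -1, -2]))
```

The second call shows the all-negative case, where the best run is the single largest element.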

# Sentence Reversal Interview Problem

Problem 5

In [13]:
# Given a string of words, reverse all the words. For example:

# Given:

# 'This is the best'

# Return:

# 'best the is This'

# As part of this exercise you should remove all leading and trailing whitespace. So that inputs such as:

# '  space here'  and 'space here      '

# both become:

# 'here space'

Solution

In [14]:
def rev_word1(s):
    return " ".join(reversed(s.split()))

# Or

def rev_word2(s):
    return " ".join(s.split()[::-1])

In [15]:
rev_word1('Hi John, are you ready to go?')

'go? to ready you are John, Hi'

In [16]:
rev_word2('Hi John, are you ready to go?')

'go? to ready you are John, Hi'

While these are valid solutions, in an interview setting you'll
have to work out the basic algorithm that is used. In this case what 
we want to do is loop over the text and extract words from the string 
ourselves. Then we can push the words to a "stack" and in the end pop
them all to reverse. Let's see what this actually looks like:

In [1]:
def rev_word3(s):
    """
    Manually doing the splits on the spaces.
    """
    
    words = []
    length = len(s)
    spaces = [' ']
    
    # Index Tracker
    i = 0
    
    # While index is less than length of string
    while i < length:
        
        # If element isn't a space
        if s[i] not in spaces:
            
            # The word starts at this index 
            word_start = i
            
            while i < length and s[i] not in spaces:
                
                # Get index where word ends
                i += 1
            # Append that word to the list
            words.append(s[word_start:i])
        # Add to index
        i += 1
        
    # Join the reversed words
    return " ".join(reversed(words))

In [2]:
rev_word3('   Hello John    how are you   ')

'you are how John Hello'

In [3]:
rev_word3('    space before')

'before space'

If you want, you can further develop this solution so it's 
all manual: you can create your own reversal function.
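
One possible way to do that is sketched below. The helper `reverse_list` and the name `rev_word4` are made up for this example; the word-splitting loop follows the same idea as the manual solution above.

```python
def reverse_list(items):
    """Return a new list in reverse order without reversed() or slicing."""
    result = []
    i = len(items) - 1
    while i >= 0:
        result.append(items[i])
        i -= 1
    return result

def rev_word4(s):
    """Split on spaces by hand, then reverse the words by hand."""
    words = []
    i = 0
    while i < len(s):
        if s[i] != ' ':
            word_start = i
            # Advance to the end of the current word
            while i < len(s) and s[i] != ' ':
                i += 1
            words.append(s[word_start:i])
        i += 1
    return " ".join(reverse_list(words))

print(rev_word4('   Hello John    how are you   '))
```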

# String Compression Interview Problem

Problem 6

In [4]:
# Given a string in the form 'AAAABBBBCCCCCDDEEEE' compress it to 
# become 'A4B4C5D2E4'. For this problem, you can falsely "compress" 
# strings of single or double letters. For instance, it is okay for
# 'AAB' to return 'A2B1' even though this technically takes more space.

# The function should also be case sensitive, so that a string 'AAAaaa'
# returns 'A3a3'.

Solution

In [5]:
# Since Python strings are immutable, we'll need to work off of a list
# of characters, and at the end convert that list back into a string 
# with a join statement.

# The solution below should yield us with a Time and Space complexity 
# of O(n). Let's take a look with careful attention to the explanatory
# comments:

In [47]:
def compress(s):
    """
    This solution compresses without checking whether the result is shorter. Known as run-length encoding.
    """
    
    # Begin Run as empty string
    r = ""
    l = len(s)
    
    # Check for length 0
    if l == 0:
        return ""
    # Check for length 1
    if l == 1:
        return s + "1"
    
    # Initialize values
    last = s[0]
    cnt = 1
    i = 1
    
    while i < l:
        
        # Check to see if it is the same letter
        if s[i] == s[i - 1]:
            # Add a count if same as previous
            cnt += 1
        else:
            # Otherwise store the previous data
            r = r + s[i - 1] + str(cnt)
            cnt = 1
            
        # Add to index count to terminate while loop
        i += 1
        
    # Put everything back into run
    r = r + s[i - 1] + str(cnt)
        
    return r
            

In [48]:
compress('AAAAABBBBCCCC')

'A5B4C4'
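
The note before the solution mentions working off a list and joining at the end, while the solution shown builds the result by string concatenation. A sketch of the list-based variant follows; the name `compress_list` is made up for this example.

```python
def compress_list(s):
    """Run-length encode s, collecting pieces in a list and joining once."""
    if len(s) == 0:
        return ""

    parts = []
    cnt = 1

    for i in range(1, len(s)):
        if s[i] == s[i - 1]:
            # Same letter as before: extend the current run
            cnt += 1
        else:
            # Run ended: record the previous letter and its count
            parts.append(s[i - 1] + str(cnt))
            cnt = 1

    # Record the final run
    parts.append(s[-1] + str(cnt))
    return "".join(parts)

print(compress_list('AAAAABBBBCCCC'))
```

Appending to a list and joining once avoids building a new string on every step, which keeps the work linear in the length of the input.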

# Unique Characters in String

In [1]:
# # Given a string, determine if it is comprised of all unique characters.
# For example, the string 'abcde' has all unique characters and should 
# return True. The string 'aabcde' contains duplicate characters and should
# return False.


# Solution

We'll show two possible solutions, one using a built-in data structure
and a built in function, and another using a built-in data structure 
but using a look-up method to check if the characters are unique.

In [2]:
def uni_char(s):
    return len(set(s)) == len(s)

In [3]:
def uni_char2(s):
    chars = set()
    for let in s:
        # Check if in set
        if let in chars:
            return False
        else:
            #Add it to the set
            chars.add(let)
    return True        

In [10]:
s = 'abcde'
uni_char2(s)

True