# Array sequences in Python
Array-based sequences in Python include lists, tuples, and strings. All three support indexing.

Each byte of memory has a unique address, and consecutive bytes have consecutive addresses. Given its address, any byte can be read or written in O(1), i.e. in constant time.

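A group of related variables can be stored one after another in a contiguous portion of memory. This is an **array**.
In the simplified model used here, each Unicode character takes 2 bytes, so the string "SAMPLE" takes 12 bytes. (CPython 3.3 and later actually use 1, 2 or 4 bytes per character, depending on the widest character in the string.)
The memory location allotted to one element is called a **cell**. All cells of an array must be the same size, though not necessarily 2 bytes, so that the address of any cell can be computed directly from its index.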
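
As a rough check of the cell-size idea, the sketch below estimates how many bytes are spent per character by comparing two strings whose lengths differ by one. The helper name `bytes_per_char` is an assumption made for illustration, and the behavior is specific to CPython 3.3 and later.

```python
import sys

def bytes_per_char(ch, n=100):
    """Estimate the per-character cell size of a string made of `ch` (CPython-specific)."""
    # The two strings differ by exactly one character of the same kind,
    # so the difference in size is the size of one cell.
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)

print(bytes_per_char("a"))            # an ASCII character
print(bytes_per_char("\u20ac"))       # the euro sign, outside Latin-1
print(bytes_per_char("\U0001F600"))   # a character outside the Basic Multilingual Plane
```

On CPython this typically prints 1, 2 and 4: within one string every cell has the same size, but that size is chosen per string.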

## Referential Arrays
Consider an array of strings. The strings generally have different lengths, so they cannot be stored directly in equal-sized consecutive cells (as discussed earlier). Instead, the array stores a reference to each string. All references take up equal space, i.e. each cell holds a reference to a string stored at another location.
In Python, lists and tuples are referential in nature.
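
The sketch below is one way to observe the referential behavior, assuming CPython's `sys.getsizeof`, which reports the size of the list object itself and not of the objects it refers to. The variable names are illustrative.

```python
import sys

short = "a"
long_text = "a" * 10_000

# Each list has a single cell, and a cell stores only a reference, so the
# list's own size does not depend on the size of the string it refers to.
size_short = sys.getsizeof([short])
size_long = sys.getsizeof([long_text])
print(size_short == size_long)

# The cell refers to the very same string object; no copy is made.
items = [long_text]
print(items[0] is long_text)
```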

Slicing a list creates a new list, but the new list holds references to the same objects as the original list. Assigning to an index of the slice does not change the original list: that cell of the new list now points to a different object, while the original list still points to the old one. Mutating a shared mutable element in place, however, is visible through both lists.
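
A small sketch of this behavior, using nested lists so that the shared objects are mutable (the names `original` and `part` are illustrative):

```python
original = [[1, 2], [3, 4], [5, 6]]
part = original[0:2]            # a new list whose cells refer to the same objects

print(part is original)         # False: the slice is a separate list
print(part[0] is original[0])   # True: both cells refer to the same inner list

part[1] = "replaced"            # rebinds one cell of the slice only
print(original[1])              # [3, 4]

part[0].append(99)              # mutates the shared inner list in place
print(original[0])              # [1, 2, 99]
```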

## Copying Arrays
```backup = list(array)``` creates a shallow copy of the list: each element of the new list references the same object as the corresponding element of the original list. To create a new list whose elements point to new objects, use the **deepcopy** function from the **copy** module.
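
The difference between the two kinds of copy only shows up when the elements are mutable. A minimal sketch, with illustrative variable names:

```python
import copy

array = [[1, 2], [3, 4]]
backup = list(array)            # shallow copy: new outer list, shared inner lists
clone = copy.deepcopy(array)    # deep copy: the inner lists are copied as well

array[0].append(99)             # mutate an inner list through the original

print(backup[0])                # [1, 2, 99], shared with array
print(clone[0])                 # [1, 2], independent of array
```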

```counter = [0] * 8``` creates a list in which all eight cells refer to the same object holding the value 0. ```counter[1] = 2``` does not modify that object; it stores a reference to a different object, 2, at index 1.
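
The following sketch illustrates the shared references. It also shows, as an aside that goes slightly beyond the text, why the same idiom is risky when the repeated element is mutable; `rows` and `safe_rows` are illustrative names.

```python
counter = [0] * 8
print(all(cell is counter[0] for cell in counter))   # True: one shared object

counter[1] = 2                  # rebinds the cell at index 1 only
print(counter)                  # [0, 2, 0, 0, 0, 0, 0, 0]

# With a mutable element, every cell refers to the same inner list.
rows = [[]] * 3
rows[0].append("x")
print(rows)                     # [['x'], ['x'], ['x']]

# A comprehension creates a separate inner list for each cell.
safe_rows = [[] for _ in range(3)]
safe_rows[0].append("x")
print(safe_rows)                # [['x'], [], []]
```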

```counter.extend(lst)``` adds cells to the end of *counter*; each new cell refers to the same object as the corresponding element of the list *lst*.
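
A short sketch of this, using mutable elements in *lst* so the sharing is observable (the values are illustrative):

```python
counter = [0] * 3
lst = [[1], [2]]

counter.extend(lst)             # appends references to the objects in lst
print(len(counter))             # 5
print(counter[3] is lst[0])     # True: no copy of the element was made

lst[0].append("changed")        # mutate the shared object through lst
print(counter[3])               # [1, 'changed']
```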

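## Time Complexities of functions called on Python lists
In the table below, n is the length of the list and k is the length of the slice, the length of the second list, or the repetition factor.
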
**Operation**|**Big-O Efficiency**
:-----:|:-----:
index [] |O(1)
index assignment |O(1)
append |O(1) amortized
pop() |O(1)
pop(i) |O(n)
insert(i,item) |O(n)
del operator |O(n)
iteration |O(n)
contains (in) |O(n)
get slice [x:y] |O(k)
del slice |O(n)
set slice |O(n+k)
reverse |O(n)
concatenate |O(n+k)
sort |O(n log n)
multiply |O(nk)
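
To get a feel for the difference between an O(1) and an O(n) operation from the table, the sketch below times `pop()` against `pop(0)` with the standard `timeit` module. The helper `time_pops` and its default sizes are assumptions chosen for illustration, and the absolute numbers will vary from machine to machine.

```python
import timeit

def time_pops(n=20_000, repeats=5_000):
    """Return (seconds for pop(), seconds for pop(0)) on a list of length n."""
    # repeats is kept below n so the list never becomes empty.
    setup = f"data = list(range({n}))"
    end = timeit.timeit("data.pop()", setup=setup, number=repeats)
    front = timeit.timeit("data.pop(0)", setup=setup, number=repeats)
    return end, front

end, front = time_pops()
print(f"pop():  {end:.6f} s")
print(f"pop(0): {front:.6f} s")
```

Removing from the front has to shift every remaining element one cell to the left, so `pop(0)` is expected to be noticeably slower on large lists.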

A list instance often has greater capacity than its current length requires. The following example illustrates that the size of the underlying array is increased in chunks.

In [1]:
import sys

n = 10
data = []
for i in range(n):
    a = len(data)              # number of elements currently stored
    b = sys.getsizeof(data)    # size of the list object in bytes
    print("length", a, "size in bytes", b)
    data.append(n)

length 0 size in bytes 64
length 1 size in bytes 96
length 2 size in bytes 96
length 3 size in bytes 96
length 4 size in bytes 96
length 5 size in bytes 128
length 6 size in bytes 128
length 7 size in bytes 128
length 8 size in bytes 128
length 9 size in bytes 192


Change the value of n to 50 and run the program again to see how the capacity increases. The exact byte counts depend on the Python version and platform. An array that grows this way is called a dynamic array; Python's list class is implemented as a dynamic array.
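
Instead of reading the whole printout, one can record only the lengths at which the reported size changes. The sketch below does this; `growth_points` is a hypothetical helper name, and the reported sizes are specific to the CPython build it runs on.

```python
import sys

def growth_points(n):
    """Return (length, size) pairs at which the list's reported size changes."""
    data = []
    points = []
    last_size = sys.getsizeof(data)
    for _ in range(n):
        data.append(None)
        size = sys.getsizeof(data)
        if size != last_size:       # the underlying array was reallocated
            points.append((len(data), size))
            last_size = size
    return points

for length, size in growth_points(50):
    print("resized at length", length, "-> size in bytes", size)
```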

In the following example, a custom dynamic array is written.

In [2]:
import ctypes

class DynamicArray(object):
    
    def __init__(self):
        self.n = 0
        self.capacity = 1
        self.A = self.make_array(self.capacity)
        
    def __len__(self):
        return self.n
    
    def __getitem__(self, k):
        if not 0 <= k < self.n:
            # raise, not return: callers must not receive the exception as a value
            raise IndexError('K is out of bound!')
        return self.A[k]
    
    def append(self, ele):
        if self.n == self.capacity:
            self._resize(2*self.capacity) #2x if capacity isn't enough
        self.A[self.n] = ele
        self.n += 1
        
    def _resize(self, new_cap):
        B = self.make_array(new_cap)
        for k in range(self.n):
            B[k] = self.A[k]
        self.A = B
        self.capacity = new_cap
        
    def make_array(self, new_cap):
        return (new_cap * ctypes.py_object)()

In [8]:
arr = DynamicArray()
arr.append(1)
print("length:", len(arr))
arr.append(2)
print("length:", len(arr))
print("second element", arr[1])

length: 1
length: 2
second element 2


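## Amortized Analysis of Array
Amortized analysis studies the cost of a whole sequence of operations instead of a single one. With capacity doubling, an individual append is occasionally O(n) because of the resize, but n appends to an empty dynamic array copy fewer than 2n elements in total, so append costs O(1) amortized.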
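
To make the doubling strategy concrete, the sketch below counts how many element copies the resizes perform over a run of appends. It simulates only the bookkeeping (capacity starts at 1 and doubles when full, as in the `DynamicArray` class above); `count_copies` is a hypothetical helper name.

```python
def count_copies(n_appends):
    """Count element copies made by resizes while appending to a doubling array."""
    capacity, size, copies = 1, 0, 0
    for _ in range(n_appends):
        if size == capacity:        # array is full: a resize copies every element
            copies += size
            capacity *= 2
        size += 1
    return copies

for n in (10, 100, 1000):
    total = count_copies(n)
    print(n, "appends ->", total, "copies,", total / n, "per append")
```

The number of copies per append stays below 2 however large n gets, which is what an amortized O(1) append means.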

## Interview Questions

In [31]:
"""
Anagram Check: given two strings, check whether they are anagrams
Ignore white spaces and capitalization
"""

def anagram_check(a, b): # preferred solution
    a = a.lower().replace(' ','')
    b = b.lower().replace(' ','')
    return sorted(a) == sorted(b)

print("Results of first implementation")
print(anagram_check("God", "Dog"))
print(anagram_check("clint eastwood", "olsd west action"))

def anagram_check2(a, b):
    
    a = a.lower().replace(' ','')
    b = b.lower().replace(' ','')
    if len(a) != len(b):
        return False
    count = {}
    for letter in a:
        if letter in count:
            count[letter] += 1
        else:
            count[letter] = 1
    for letter in b:
        if letter in count:
            count[letter] -= 1
        else:
            return False
    
    for k in count:
        if count[k] != 0:
            return False
    return True

print("Results of second implementation")
print(anagram_check2("God", "Dog"))
print(anagram_check2("clint eastwood", "olsd west action"))
print(anagram_check2("clint eastwood", "old west action"))

Results of first implementation
True
False
Results of second implementation
True
False
True
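
A third variant, sketched here as an alternative to the manual counting dictionary, lets `collections.Counter` do the tallying. The name `anagram_check3` is introduced only for this example.

```python
from collections import Counter

def anagram_check3(a, b):
    """Anagram check by comparing character counts, ignoring spaces and case."""
    a = a.lower().replace(' ', '')
    b = b.lower().replace(' ', '')
    return Counter(a) == Counter(b)   # equal counts for every character

print(anagram_check3("God", "Dog"))                          # True
print(anagram_check3("clint eastwood", "olsd west action"))  # False
print(anagram_check3("clint eastwood", "old west action"))   # True
```

Like the dictionary version this runs in O(n), while the sorting version is O(n log n).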


In [36]:
"""
Array Pair Sum:  given an integer array, output all unique pairs that sum up to a specific value k
"""

def array_pair_sum(arr,k):
    if len(arr) < 2: # no pairs can be formed
        print("Length of array less than 2")
        return
    seen = set()
    output = set()
    for num in arr:
        target = k-num
        if target not in seen:
            seen.add(num)
        else:
            output.add(((min(num, target)), max(num, target)))
    print('\n'.join(map(str, list(output))))
array_pair_sum([1,3,2,2], 4)

(1, 3)
(2, 2)


In [51]:
"""
Find missing element: given two arrays of non-negative integers,
where the second array is the first array shuffled with one element removed,
find the element that is missing from the second array
"""

def find_missing_element(arr1, arr2): # O(nlogn)
    arr1.sort()
    arr2.sort()
    for num1, num2 in zip(arr1, arr2):
        if num1!= num2:
            return num1
    return arr1[-1]

print("First Implementation")
print(find_missing_element([1,2,3,4,5], [1,2,3,4]))
print(find_missing_element([1,2,3,4,5], [1,2,5,4]))

import collections

# defaultdict returns a default value (0 for int) for a key that is not yet in the dictionary
def find_missing_element2(arr1, arr2): # O(n)
    d = collections.defaultdict(int)
    for num in arr2:
        d[num] += 1
    for num in arr1:
        if d[num] == 0:
            return num
        else:
            d[num] -= 1 # use up one occurrence so duplicates are handled

print("Second Implementation")
print(find_missing_element2([1,2,3,4,5], [1,2,3,4]))
print(find_missing_element2([1,2,3,4,5], [1,5,3,4]))

def find_missing_element3(arr1, arr2): # clever trick O(n)
    # XOR-ing a number with itself gives 0, so paired values cancel out
    result = 0
    for num in arr1+arr2:
        result ^= num
    return result

print("Third Implementation")
print(find_missing_element3([1,2,3,4,5], [1,2,3,4]))
print(find_missing_element3([1,2,3,4,5], [1,5,3,4]))

First Implementation
5
3
Second Implementation
5
2
Third Implementation
5
2


Another approach is to sum all elements of the first array and subtract the sum of the second array's elements. This approach is less reliable when the numbers are floating-point values, because rounding can lose precision. In languages with fixed-width integers, very large numbers can also cause overflow; Python integers have arbitrary precision, so this does not apply to Python ints.
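
A minimal sketch of the summation approach, assuming integer inputs (the function name `find_missing_element_sum` is introduced here for illustration):

```python
def find_missing_element_sum(arr1, arr2):
    """Return the element of arr1 missing from arr2, assuming integer elements."""
    # Every shared element contributes to both sums and cancels out.
    return sum(arr1) - sum(arr2)

print(find_missing_element_sum([1, 2, 3, 4, 5], [1, 5, 3, 4]))   # 2

# Python ints do not overflow, so very large values still work.
print(find_missing_element_sum([10**30, 7], [10**30]))           # 7
```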

In [53]:
"""
Largest Continuous Sum: given an array of integers, find the largest sum of a contiguous subarray
"""

def largest_sum(arr):
    if len(arr) == 0:
        return 0
    max_sum = current_sum = arr[0]
    for num in arr[1:]:
        current_sum = max(current_sum+num, num)
        max_sum = max(current_sum, max_sum)
        
    return max_sum

largest_sum([1,2,-1,3,4,10,10,-10,-1])

29
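
A common follow-up is to report where the best subarray lies. The sketch below extends the same running-sum idea to track indices; `largest_sum_with_indices` and its return format (sum, start, inclusive end) are assumptions made for this example.

```python
def largest_sum_with_indices(arr):
    """Return (max_sum, start, end) of the best contiguous run; end is inclusive."""
    if len(arr) == 0:
        return 0, None, None
    max_sum = current_sum = arr[0]
    start = end = current_start = 0
    for i in range(1, len(arr)):
        if current_sum + arr[i] < arr[i]:
            # Starting fresh at i beats extending the current run.
            current_sum = arr[i]
            current_start = i
        else:
            current_sum += arr[i]
        if current_sum > max_sum:
            max_sum = current_sum
            start, end = current_start, i
    return max_sum, start, end

print(largest_sum_with_indices([1, 2, -1, 3, 4, 10, 10, -10, -1]))   # (29, 0, 6)
```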

In [59]:
"""
Sentence Reversal: given a string, reverse the order of the words in it
"""

def reverse_string1(s):
    return " ".join(reversed(s.split()))

def reverse_string2(s):
    return " ".join(s.split()[::-1])

# The above solutions rely on Python built-in functions, which interviewers often do not accept.
# Hence a more algorithmic approach is required.

def reverse_string3(s):
    words = []
    length = len(s)
    space = [' ']
    
    i=0
    
    while i<length:
        if s[i] not in space:
            word_start = i
            while i < length and s[i] not in space:
                i+=1
            words.append(s[word_start:i])
        i+=1
        
    return " ".join(reversed(words))

In [63]:
print(reverse_string3('     Hello   John    how are   you   '))
print(reverse_string3('     space before   '))

you are how John Hello
before space


In [66]:
"""
String compression: 'AAAABBBBCCCCCCCCDDEEE' -> 'A4B4C8D2E3'
Function should be case sensitive 'AAAaaa' -> 'A3a3'
"""

def string_compression(s): #O(n)
    r = ""
    if len(s) == 0:
        return ""
    count = 1 # length of the current run of equal characters
    i = 1
    
    while i < len(s):
        if s[i] == s[i-1]:
            count += 1
        else:
            r += s[i-1] + str(count)
            count = 1
        i+=1
    r += s[i-1]+str(count)
    return r

In [69]:
print(string_compression('AAAABBBBCCCCCCCCDDEEE'))
print(string_compression('AAAaaa'))

A4B4C8D2E3
A3a3
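
For comparison, the same run-length idea can be sketched with `itertools.groupby`, which groups consecutive equal characters. The name `string_compression2` is introduced only for this example; in an interview the explicit loop is usually what is being asked for.

```python
from itertools import groupby

def string_compression2(s):
    """Run-length encode s, e.g. 'AAAaaa' -> 'A3a3' (case sensitive)."""
    # groupby yields (character, iterator over that run) for each run.
    return "".join(char + str(len(list(run))) for char, run in groupby(s))

print(string_compression2('AAAABBBBCCCCCCCCDDEEE'))   # A4B4C8D2E3
print(string_compression2('AAAaaa'))                  # A3a3
```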


In [70]:
"""
Unique characters in string: given a string, check if it has all unique characters
"""
def unique_char_check(s):
    return len(set(s)) == len(s)

print("First implementation")
print(unique_char_check("abcde"))
print(unique_char_check("abcdde"))

# above solution is okay, but another approach should also be mentioned

def unique_char_check2(s):
    chars = set() # characters seen so far
    for l in s:
        if l in chars:
            return False # stop at the first repeated character
        else:
            chars.add(l)
    return True

print("Second implementation")
print(unique_char_check2("abcde"))
print(unique_char_check2("abcdde"))

First implementation
True
False
Second implementation
True
False
