### 1. Compute Deviation

Write a function that takes a list of dictionaries, each with a key and a list of integers, and returns a dictionary mapping each key to the standard deviation of its list.

Note that this should be done without using NumPy's built-in functions.

Example:

input = [
    {
        'key': 'list1',
        'values': [4,5,2,3,4,5,2,3],
    },
    {
        'key': 'list2',
        'values': [1,1,34,12,40,3,9,7],
    }
]

output -> {'list1': 1.12, 'list2': 14.19}


In [97]:
import math

def deviation(l):
    if not l:
        return None
    final = {}
    for item in l:
        values = item['values']
        # Population standard deviation: square root of the mean squared distance from the mean
        mean = sum(values) / len(values)
        var = sum((x - mean) ** 2 for x in values) / len(values)
        final[item['key']] = round(math.sqrt(var), 2)

    return final

In [98]:
print(deviation([ { 'key': 'list1', 'values': [4,5,2,3,4,5,2,3], }, { 'key': 'list2', 'values': [1,1,34,12,40,3,9,7], } ]
    ))

{'list1': 1.12, 'list2': 14.19}
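
As a sanity check, the result can be compared against the standard library's `statistics.pstdev`, which computes the population standard deviation (dividing by N rather than N - 1). This is only a verification sketch; the exercise itself still asks for a hand-written computation, and the name `check` is just for illustration.

```python
import statistics

data = [
    {'key': 'list1', 'values': [4, 5, 2, 3, 4, 5, 2, 3]},
    {'key': 'list2', 'values': [1, 1, 34, 12, 40, 3, 9, 7]},
]

# pstdev divides by N (population), which is what the expected output above uses
check = {d['key']: round(statistics.pstdev(d['values']), 2) for d in data}
print(check)  # {'list1': 1.12, 'list2': 14.19}
```

Using `statistics.stdev` instead would divide by N - 1 and give slightly larger values.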


### 2. Multimodal Sample
Write a function for sampling from a multimodal distribution.

Inputs are keys (e.g. green, red, blue), weights (e.g. 2, 3, 5.5), and the number of samples to draw from the distribution. The function should return the keys of the samples.

Example Input:

keys = ['green', 'red', 'blue']
weights = [1, 10, 2]
n = 5
sample_multimodal(keys, weights, n)
 

Output

['blue', 'red', 'red', 'green', 'red']

In [99]:
import random
import numpy as np

def sample_multimodal(keys, weights, n):
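    # Normalize the weights to probabilities, then take the running total to get
    # the cumulative distribution, e.g. [1, 10, 2] -> [0.077, 0.846, 1.0]
    weight_prob = np.cumsum(np.array(weights) / np.sum(weights))
    # Guard against floating-point rounding leaving the last entry just below 1
    weight_prob[-1] = 1.0
    # A uniform draw r in [0, 1) selects the first key whose cumulative probability is >= r;
    # counting the entries below r gives that key's index
    return [keys[np.sum(weight_prob < random.random())] for _ in range(n)]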

In [100]:
keys = ['green', 'red', 'blue'] 
weights = [1, 10, 2] 
n = 5
print(sample_multimodal(keys, weights, n))

['red', 'red', 'red', 'red', 'red']
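
For comparison, the standard library offers the same kind of weighted sampling through `random.choices`. The sketch below assumes sampling with replacement is intended, as in the cumulative-probability version; the function name `sample_multimodal_stdlib` is made up for illustration.

```python
import random

def sample_multimodal_stdlib(keys, weights, n):
    # random.choices draws with replacement; weights are relative and need not sum to 1
    return random.choices(keys, weights=weights, k=n)

samples = sample_multimodal_stdlib(['green', 'red', 'blue'], [1, 10, 2], 5)
print(samples)  # five keys, mostly 'red' since it carries 10 of the 13 weight units
```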


### 3. Weekly Aggregation
Question
Given a list of timestamps in sequential order, return a list of lists grouped by week (7 days) using the first timestamp as the starting point.

Example:

ts = [
    '2019-01-01', 
    '2019-01-02',
    '2019-01-08', 
    '2019-02-01', 
    '2019-02-02',
    '2019-02-05',
]

output = [
    ['2019-01-01', '2019-01-02'], 
    ['2019-01-08'], 
    ['2019-02-01', '2019-02-02'],
    ['2019-02-05'],
]


In [101]:
import datetime
import collections

def read_date(date):
    return datetime.datetime.strptime(date, "%Y-%m-%d")

def weeks_from_date(starting_point, date):
    # Whole weeks elapsed from the starting point to the given date
    delta = read_date(date) - read_date(starting_point)
    return delta.days // 7

def group_by_weeks(ts):
    if not ts:
        return []
    starting_date = ts[0]
    grouped = collections.defaultdict(list)
    for date in ts:
        grouped[weeks_from_date(starting_date, date)].append(date)

    # Dicts preserve insertion order, so the groups come out chronologically
    return list(grouped.values())

In [102]:
print(group_by_weeks([ '2019-01-01', '2019-01-02', '2019-01-08', '2019-02-01', '2019-02-02', '2019-02-05']))

[['2019-01-01', '2019-01-02'], ['2019-01-08'], ['2019-02-01', '2019-02-02'], ['2019-02-05']]


### 4. String Subsequence
Question
Given two strings, string1 and string2, find out if string1 is a subsequence of string2.

A subsequence is a sequence that can be derived from another sequence by deleting some elements without changing the order of the remaining elements.

Example:

string1 = 'abc'
string2 = 'asbsc'
string3 = 'acedb'

isSubSequence(string1, string2) -> True
isSubSequence(string1, string3) -> False

In [103]:
def isSubSequence(s1, s2):
    i, j = 0, 0
    while i < len(s1) and j < len(s2):
        # Advance in s1 only on a match; always advance in s2
        if s1[i] == s2[j]:
            i += 1
        j += 1

    # s1 is a subsequence exactly when every one of its characters was matched
    return i == len(s1)

In [104]:
print(isSubSequence('abc','asbsc'))
print(isSubSequence('abc','acedb'))

True
False
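
The two-pointer scan can also be written with a single iterator over the longer string. This is a sketch of the same idea rather than a different algorithm; the name `is_subsequence_iter` is hypothetical.

```python
def is_subsequence_iter(s1, s2):
    it = iter(s2)
    # `c in it` consumes the iterator until it finds c, so later characters
    # of s1 can only match later positions of s2
    return all(c in it for c in s1)

print(is_subsequence_iter('abc', 'asbsc'))  # True
print(is_subsequence_iter('abc', 'acedb'))  # False
```

Both versions visit each character of the second string at most once, so the running time is linear in its length.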


### 5. Find Bigrams
Write a function that can take a string and return a list of bigrams.

Example:

sentence = """
Have free hours and love children? 
Drive kids to school, soccer practice 
and other activities.
"""

output = [('have', 'free'),
 ('free', 'hours'),
 ('hours', 'and'),
 ('and', 'love'),
 ('love', 'children?'),
 ('children?', 'drive'),
 ('drive', 'kids'),
 ('kids', 'to'),
 ('to', 'school,'),
 ('school,', 'soccer'),
 ('soccer', 'practice'),
 ('practice', 'and'),
 ('and', 'other'),
 ('other', 'activities.')]

In [105]:
def bigrams(s):
    if not s:
        return []
    result = []
    # Lowercase so the output matches the expected bigrams in the example
    word = s.lower().split()
    i, j = 0, 1
    while j < len(word):
        result.append((word[i], word[j]))
        j += 1
        i += 1

    return result

In [106]:
print(bigrams(" Have free hours and love children? Drive kids to school, soccer practice and other activities. "))

[('have', 'free'), ('free', 'hours'), ('hours', 'and'), ('and', 'love'), ('love', 'children?'), ('children?', 'drive'), ('drive', 'kids'), ('kids', 'to'), ('to', 'school,'), ('school,', 'soccer'), ('soccer', 'practice'), ('practice', 'and'), ('and', 'other'), ('other', 'activities.')]
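
A more compact way to pair each word with its successor is `zip`. The sketch below assumes, as the expected output in the example does, that words are lowercased and punctuation stays attached; `bigrams_zip` is a made-up name.

```python
def bigrams_zip(s):
    words = s.lower().split()
    # zip stops at the shorter input, so pairing words with words[1:] yields adjacent pairs
    return list(zip(words, words[1:]))

print(bigrams_zip("Have free hours and love children?"))
# [('have', 'free'), ('free', 'hours'), ('hours', 'and'), ('and', 'love'), ('love', 'children?')]
```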


### 6. Buy and Sell
1. Given a list of stock prices in ascending order by datetime, write a function that outputs the max profit by buying and selling at a specific interval.

Example:

stock_prices = [10,5,20,32,25,12]

get_max_profit(stock_prices) -> 27
 

2. Making it harder, given a list of stock prices and date times in ascending order by datetime, write a function that outputs the profit and start and end dates to buy and sell for max profit.

stock_prices = [10,5,20,32,25,12]
dts = [
    '2019-01-01', 
    '2019-01-02',
    '2019-01-03',
    '2019-01-04',
    '2019-01-05',
    '2019-01-06',
]

get_profit_dates(stock_prices, dts) -> (27, '2019-01-02', '2019-01-04')

In [133]:
def get_max_profit(prices):
    min_price = float('inf')
    profit = 0

    for price in prices:
        min_price = min(min_price, price)
        profit = max(profit, price - min_price)

    return profit

def get_profit_dates(prices, dts):
    min_price = float('inf')
    profit = 0
    min_date = None      # date of the lowest price seen so far (candidate buy date)
    start = end = None   # buy and sell dates of the best trade found so far

    for price, dt in zip(prices, dts):
        if price < min_price:
            min_price = price
            min_date = dt

        # Commit the buy date only when it actually produces a better trade
        if price - min_price > profit:
            profit = price - min_price
            start, end = min_date, dt
    return (profit, start, end)

In [135]:
# Subtask 1
print(get_max_profit([10,5,20,32,25,12]))

# Subtask 2
stock_prices = [10,5,20,32,25,12] 
dts = [ '2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04', '2019-01-05', '2019-01-06' ]
print(get_profit_dates(stock_prices, dts))

27
(27, '2019-01-02', '2019-01-04')


### 7. Merge Sorted Lists
Question:

Given two sorted lists, write a function to merge them into one sorted list.

What's the time complexity?



In [142]:
def merge(s1, s2):
    if not s1:
        return s2
    if not s2:
        return s1
    
    i, j, result = 0, 0, []
    while i < len(s1) and j < len(s2):
        if s1[i] < s2[j]:
            result.append(s1[i])
            i += 1
        else:
            result.append(s2[j])
            j += 1
            
    while i < len(s1):
        result.append(s1[i])
        i += 1
        
    while j < len(s2):
        result.append(s2[j])
        j += 1 
        
    return result

# Time Complexity: O(n + m), where n and m are the lengths of the two lists

In [143]:
s1 = [1,3,5]
s2 = [2,4,6,8]
print(merge(s1, s2))

[1, 2, 3, 4, 5, 6, 8]


### 8. Move Zeros Back


Question
Write a function that can move all the zeros in an array of integers to the back of the array in place.

Example:

arr1 = [0,5,4,2,0,3]
move_zeros(arr1) -> [5,4,2,3,0,0]

In [151]:
def move_zeros(arr):
    if not arr:
        return arr
    i = 0  # next position to receive a non-zero value
    for j in range(len(arr)):
        if arr[j] != 0:
            arr[i] = arr[j]
            i += 1
    # Everything from i onward is now stale; overwrite it with zeros
    while i < len(arr):
        arr[i] = 0
        i += 1

    return arr


In [152]:
print(move_zeros([0,5,4,2,0,3]))

[5, 4, 2, 3, 0, 0]


### 9. Linear Regression Parameters
Question
Given a matrix of X and Y values, write a function to generate a transposed matrix and estimate the parameters for linear regression.

Example:

Input:

A = [[1, 5], [4,8], [5,9]]

Output:

A_T = [[1, 4, 5], [5, 8, 9]]

α = 4

β = 1

ŷ = 1X + 4

In [162]:
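import numpy as np

def transpose(arr):
    # Turn rows of [x, y] pairs into one row of x values and one row of y values
    return np.transpose(np.array(arr))

def linear_fit(arr_t):
    # Ordinary least squares for y = beta * x + alpha,
    # where arr_t[0] holds the x values and arr_t[1] the y values
    x, y = arr_t[0].astype(float), arr_t[1].astype(float)
    dx, dy = x - x.mean(), y - y.mean()
    beta = np.sum(dx * dy) / np.sum(dx * dx)
    alpha = y.mean() - beta * x.mean()
    return alpha, beta

In [164]:
print(transpose([[1, 5], [4,8], [5,9]]))
a = transpose([[1, 5], [4,8], [5,9]])
alpha, beta = linear_fit(a)
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")

[[1 4 5]
 [5 8 9]]
alpha = 4.00, beta = 1.00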
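
The transposed matrix matters because the least-squares estimate can be written with the normal equations, (XᵀX)⁻¹Xᵀy, where X is the design matrix with a column of ones for the intercept. The following is a minimal pure-Python sketch of that formula for a single predictor; the helper names `matmul` and `fit_normal_equations` are assumptions made for illustration, not part of the original exercise.

```python
def transpose(matrix):
    # Rows become columns: [[1, 5], [4, 8]] -> [[1, 4], [5, 8]]
    return [list(col) for col in zip(*matrix)]

def matmul(a, b):
    # Plain nested-list matrix product
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def fit_normal_equations(points):
    # Design matrix: a column of ones (intercept) next to the x values
    X = [[1.0, float(x)] for x, _ in points]
    y = [[float(v)] for _, v in points]
    Xt = transpose(X)
    (a, b), (c, d) = matmul(Xt, X)  # the 2x2 matrix X^T X
    det = a * d - b * c
    if det == 0:
        raise ValueError("x values must not all be equal")
    inverse = [[d / det, -b / det], [-c / det, a / det]]
    alpha, beta = (row[0] for row in matmul(inverse, matmul(Xt, y)))
    return alpha, beta

alpha, beta = fit_normal_equations([[1, 5], [4, 8], [5, 9]])
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")  # alpha = 4.00, beta = 1.00
```

Inverting the 2x2 matrix by hand keeps the sketch dependency-free; with more predictors a linear-algebra library would be the more practical choice.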


### 10. Merge N Sorted Lists

Given n sorted lists, create a combined list while maintaining sorted order.

Example:

A = [1,2,3,4,5,6]
B = [2,5,7,8]
C = [3,9,10,12]
D = [0,1,2,8]

output -> [0,1,1,2,2,2,3,3,4,5,5,6,7,8,8,9,10,12]

In [186]:
def merge_n_list(arr):
    d, result = {}, []
    for sublist in arr:
        for num in sublist:
            d[num] = d.get(num, 0) + 1

    for k, v in sorted(d.items()):
        for _ in range(v):
            result.append(k)

    return result


def merge_n_list_v2(arr):
    d= []
    for sublist in arr:
        for num in sublist:
            d.append(num)

    return sorted(d)

In [187]:
print(merge_n_list([[1,2,3,4,5,6],[2,5,7,8],[3,9,10,12],[0,1,2,8]]))
print(merge_n_list_v2([[1,2,3,4,5,6],[2,5,7,8],[3,9,10,12],[0,1,2,8]]))

[0, 1, 1, 2, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10, 12]
[0, 1, 1, 2, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10, 12]
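
Both versions above re-sort all the values and so do not use the fact that each input list is already sorted. A k-way merge with a heap does; the sketch below relies on the standard library's `heapq.merge`, and the wrapper name `merge_n_sorted` is hypothetical.

```python
import heapq

def merge_n_sorted(lists):
    # heapq.merge lazily yields the smallest remaining head among the sorted inputs
    return list(heapq.merge(*lists))

print(merge_n_sorted([[1,2,3,4,5,6],[2,5,7,8],[3,9,10,12],[0,1,2,8]]))
# [0, 1, 1, 2, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10, 12]
```

For N total elements spread over k lists this takes O(N log k) time, compared with O(N log N) for sorting the concatenation.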


### 11. Biggest Tip

Given a list of user_ids and tips, find the user who tipped the most:

user_ids = [103, 105, 105, 107, 106, 103, 102, 108, 107, 103, 102]

tips = [2, 5, 1, 0, 2, 1, 1, 0, 0, 2, 2]


In [188]:
def who_tip_max(user_ids, tips):
    if not user_ids or not tips:
        return None

    # Track the largest single tip seen so far and who gave it
    max_tip = -float('inf')
    target = None
    for uid, tip in zip(user_ids, tips):
        if tip > max_tip:
            target = uid
            max_tip = tip
    return target

In [190]:
user_ids = [103, 105, 105, 107, 106, 103, 102, 108, 107, 103, 102]

tips = [2, 5, 1, 0, 2, 1, 1, 0, 0, 2, 2]

print('User ID that tips the most: ', who_tip_max(user_ids,tips))

User ID that tips the most:  105
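
The function above picks the user behind the largest single tip. If "tipped the most" is instead read as the largest total across all of a user's tips, which is an assumption about the intended question, the tips need to be summed per user first. A sketch under that reading, with the hypothetical name `who_tip_most_total`:

```python
from collections import defaultdict

def who_tip_most_total(user_ids, tips):
    totals = defaultdict(int)
    for uid, tip in zip(user_ids, tips):
        totals[uid] += tip
    # Rank the user ids by their summed tips; None when there is no data
    return max(totals, key=totals.get) if totals else None

user_ids = [103, 105, 105, 107, 106, 103, 102, 108, 107, 103, 102]
tips = [2, 5, 1, 0, 2, 1, 1, 0, 0, 2, 2]
print(who_tip_most_total(user_ids, tips))  # 105 (total of 6)
```

On this data both readings give user 105, but they can differ in general.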


### 12. New Resumes
Question

existing_ids = [15234, 20485, 34536, 95342, 94857]

names = ['Calvin', 'Jason', 'Cindy', 'Kevin']

urls = [
    'domain.com/resume/15234', 
    'domain.com/resume/23645', 
    'domain.com/resume/64337', 
    'domain.com/resume/34536',
]

We have a list of existing ids that we have already scraped. We also have two lists of equal length: one of names, and one of urls in which the last path segment is the id of the corresponding name.

Write code in Python to return the names and ids that we haven't scraped yet.

output = [('Jason', 23645), ('Cindy', 64337)]

In [222]:
def get_unscraped(existing_ids, names, urls):
    seen = set(existing_ids)  # set membership is O(1) on average
    output = []
    for name, url in zip(names, urls):
        # The id is the last path segment of the url
        resume_id = int(url.split('/')[-1])
        if resume_id not in seen:
            output.append((name, resume_id))
    return output

In [223]:
existing_ids = [15234, 20485, 34536, 95342, 94857]
names = ['Calvin', 'Jason', 'Cindy', 'Kevin']
urls = [ 'domain.com/resume/15234', 'domain.com/resume/23645', 'domain.com/resume/64337', 'domain.com/resume/34536']
print(get_unscraped(existing_ids, names, urls))

[('Jason', 23645), ('Cindy', 64337)]
