# Find the Median

## Problem Statement

In [1]:
'''
The median of a list of numbers is essentially its middle element after sorting. The same number of elements occur after it as before. Given a list of numbers with an odd number of elements, find the median.

Example
arr = [5,3,1,2,4]
The sorted array arr'= [1,2,3,4,5]. The middle element and the median is 3.

Function Description
Complete the findMedian function in the editor below.

findMedian has the following parameter(s):
int arr[n]: an unsorted array of integers

Returns
int: the median of the array

Input Format
The first line contains the integer n, the size of arr.
The second line contains n space-separated integers arr[i]

Constraints
1 <= n <= 1000001
n is odd
-10000 <= arr[i] <= 10000
'''

## Given Test Cases

In [2]:
'''
Sample Input 0
7
0 1 2 4 6 5 3

Sample Output 0
3
'''

### Data Setup

In [3]:
arr = [0,1,2,4,6,5,3]

## Strategy and Solution

### Brute Force (O(n^2))

In [16]:
'''
In a nutshell, this problem tests how fast you can sort.

Bubble sort, insertion sort, etc. are common naive solutions that sort with two nested for loops, which gives O(n^2) time.
'''
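
To make the brute-force idea concrete, here is a minimal sketch that sorts with insertion sort and then reads off the middle element. The names `insertion_sort` and `find_median_brute_force` are hypothetical helpers introduced for illustration only; they are not part of the original problem stub.

```python
def insertion_sort(values):
    """Return a sorted copy of values using insertion sort (O(n^2) worst case)."""
    result = list(values)  # copy so the caller's list is left untouched
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements one slot to the right to make room for current.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


def find_median_brute_force(values):
    """Sort naively, then return the middle element (assumes odd length)."""
    ordered = insertion_sort(values)
    return ordered[len(ordered) // 2]


print(find_median_brute_force([0, 1, 2, 4, 6, 5, 3]))  # 3
```

With n up to 1000001, the two nested loops are far too slow in practice, which is why the faster approaches below matter.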

### Traditional "Optimal" Solutions (O(n log n), or O(n) on average)

In [4]:
'''
Following suit, a good comparison sort such as quicksort runs in O(n log n) on average (quicksort itself can degrade to O(n^2) in the worst case).

Since we don't actually need to sort the whole list, we can reduce this to O(n) on average with the quickselect algorithm, which only continues into the side of the partition that contains the median.
'''

In [None]:
def quickselect():
    return
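
The stub above is left unfilled in the notebook. As one possible way to complete the idea, here is a minimal quickselect sketch. It assumes a random pivot and a three-way partition built with list comprehensions; the names `quickselect_kth` and `find_median_quickselect` are hypothetical and chosen only for this example.

```python
import random


def quickselect_kth(values, k):
    """Return the k-th smallest element (0-indexed) of values.

    Runs in O(n) on average; a run of unlucky pivots can still cost O(n^2).
    """
    if not 0 <= k < len(values):
        raise IndexError("k is out of range")
    candidates = list(values)
    while True:
        pivot = random.choice(candidates)  # assumption: random pivot choice
        lows = [x for x in candidates if x < pivot]
        highs = [x for x in candidates if x > pivot]
        num_equal = len(candidates) - len(lows) - len(highs)
        if k < len(lows):
            candidates = lows  # the answer is among the smaller elements
        elif k < len(lows) + num_equal:
            return pivot  # the answer equals the pivot
        else:
            k -= len(lows) + num_equal  # skip everything <= pivot
            candidates = highs


def find_median_quickselect(values):
    """Median of an odd-length list via quickselect."""
    return quickselect_kth(values, len(values) // 2)


print(find_median_quickselect([0, 1, 2, 4, 6, 5, 3]))  # 3
```

Only one side of each partition is kept, so the expected work shrinks geometrically instead of sorting both halves.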

In [5]:
# We can use Python's built-in sorted() (Timsort, O(n log n)) to condense the code even more.
def findMedian(arr):
    sorted_arr = sorted(arr)
    # n is odd, so the middle element sits at index n // 2.
    return sorted_arr[len(sorted_arr) // 2]

In [None]:
'''
The issue with quickselect is that it depends heavily on good pivot selection.

For example, in a basic quickselect it is possible to pick a pivot that removes only 1 element from the search on every step, degrading the runtime to O(n^2).

We can improve on this by adding a routine that picks a provably good pivot every time, thereby guaranteeing that quickselect runs in O(n) in ALL CASES.

This method is called median-of-medians.
'''

### Most Optimal Deterministic (O(n))

In [None]:
'''
As mentioned above, quickselect runs in O(n) on average, but it relies heavily on avoiding bad pivots. In the worst case, each pivot removes only 1 item from the search, which means you end up with an O(n^2) runtime.

Thus, we can improve on quickselect by choosing pivots with the median-of-medians approach: split the list into groups of 5, take the median of each group, and use the median of those medians as the pivot. That pivot is guaranteed to discard a constant fraction of the list on every step, so the worst case is O(n).
'''
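
The notebook does not include code for this approach, so the following is only a sketch of how median-of-medians selection could look, assuming groups of 5 and a three-way partition. The names `median_of_medians_select`, `_choose_pivot`, and `find_median_deterministic` are hypothetical.

```python
def _choose_pivot(values):
    """Pick a pivot as the median of the medians of groups of (up to) 5."""
    groups = [values[i:i + 5] for i in range(0, len(values), 5)]
    medians = [sorted(group)[len(group) // 2] for group in groups]
    # Recursively select the median of the group medians.
    return median_of_medians_select(medians, len(medians) // 2)


def median_of_medians_select(values, k):
    """Return the k-th smallest element (0-indexed) without random choices."""
    if not 0 <= k < len(values):
        raise IndexError("k is out of range")
    candidates = list(values)
    while True:
        if len(candidates) <= 5:
            return sorted(candidates)[k]  # small enough to sort directly
        pivot = _choose_pivot(candidates)
        lows = [x for x in candidates if x < pivot]
        highs = [x for x in candidates if x > pivot]
        num_equal = len(candidates) - len(lows) - len(highs)
        if k < len(lows):
            candidates = lows
        elif k < len(lows) + num_equal:
            return pivot
        else:
            k -= len(lows) + num_equal
            candidates = highs


def find_median_deterministic(values):
    """Median of an odd-length list via median-of-medians selection."""
    return median_of_medians_select(values, len(values) // 2)


print(find_median_deterministic([0, 1, 2, 4, 6, 5, 3]))  # 3
```

The extra pivot-selection pass adds a noticeable constant factor, so in practice the randomized quickselect or the built-in sort is often faster despite the weaker worst-case bound.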

## Testing

In [20]:
'''
Sample Input 0
7
0 1 2 4 6 5 3

Sample Output 0
3
'''

In [21]:
print(findMedian(arr))

3


In [22]:
'''
Example from the problem statement
arr = [5,3,1,2,4]

Expected Output
3
'''

In [23]:
arr2 = [5,3,1,2,4]
print(findMedian(arr2))

3


In [24]:
'''
Passed all test cases
'''
