## Sample offline data

A company needs to select a random subset of size $k$ from its customers to roll out a new feature.

Implement an algorithm that takes as input an array of distinct elements and a size, and returns a subset of the given size of the array elements. All subsets should be equally likely.

Return the result in the input array itself.

Hint: How would you construct a random subset of size $k + 1$ given a random subset of size $k$?


### Approach

How do you randomly select 2 cards from a deck of cards? You choose one randomly, and then choose another one randomly from the rest. We can apply the same principle here. Intuitively, if all subsets of size $k$ are equally likely, then this construction process ensures that the subsets of size $k + 1$ are also equally likely. A formal proof uses mathematical induction.
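The claim can also be checked with a short direct count, sketched here as a complement to the induction argument. The symbol $n$ for the array length is an assumption introduced for this sketch. Step $i$ (counting from 0) picks uniformly among the $n - i$ elements not yet chosen, so any specific ordered sequence of $k$ distinct elements is produced with probability

$$\frac{1}{n} \cdot \frac{1}{n-1} \cdots \frac{1}{n-k+1} = \frac{(n-k)!}{n!}.$$

Each subset $S$ of size $k$ corresponds to exactly $k!$ such ordered sequences, so

$$P(S) = k! \cdot \frac{(n-k)!}{n!} = \frac{1}{\binom{n}{k}},$$

which is the same for every subset.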

As a concrete example, let the input be A = [3, 7, 5, 11] and the size be 3.
* In the first iteration, use a random number generator to pick a random integer in the range [0, 3], e.g. 2.
* Swap A[0] with A[2]; the array is now [5, 7, 3, 11]. Next, pick another random integer in the range [1, 3], e.g. 3.
* Swap A[1] with A[3]; the array is now [5, 11, 3, 7].
* Continue until k picks have been made; the first k elements of the array are then the target subset.
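The steps above can be replayed deterministically. The sketch below is only an illustration: `apply_picks` is a hypothetical helper that takes the "random" indices as a fixed list, and the third pick (2) is an assumed value, since the example leaves it open.

```python
from typing import List


def apply_picks(A: List[int], picks: List[int]) -> List[int]:
    """Replay the swap steps using fixed picks instead of random ones (hypothetical helper)."""
    for i, r in enumerate(picks):
        # A real run would draw r uniformly from [i, len(A) - 1].
        assert i <= r < len(A)
        A[i], A[r] = A[r], A[i]
    return A


A = [3, 7, 5, 11]
apply_picks(A, [2, 3, 2])  # picks 2 and 3 come from the example; the last 2 is assumed
print(A[:3])  # [5, 11, 3]
```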

**Time complexity**: $O(k)$  
**Space complexity**: $O(1)$  

In [4]:
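import random
from typing import List

def random_sampling(k: int, A: List[int]) -> None:
    """Move a uniformly random size-k subset of A into A[:k], in place."""
    for i in range(k):
        # Generate a random index in [i, len(A) - 1] (randint includes both ends).
        r = random.randint(i, len(A) - 1)
        # Swap the chosen element into position i; A[:i + 1] is now fixed.
        A[i], A[r] = A[r], A[i]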
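
One way to gain some confidence that all subsets are equally likely is an empirical count. The following is a minimal sketch, not a proof: it repeats the same swap loop with an explicit seeded `random.Random` instance (the extra `rng` parameter, the helper name `subset_counts`, the seed, and the trial count are all assumptions made for reproducibility) and tallies which size-2 subsets of a 4-element array come out.

```python
import random
from collections import Counter
from typing import List


def random_sampling(k: int, A: List[int], rng: random.Random) -> None:
    # Same swap loop as above, but drawing from an explicit generator.
    for i in range(k):
        r = rng.randint(i, len(A) - 1)
        A[i], A[r] = A[r], A[i]


def subset_counts(trials: int, seed: int = 0) -> Counter:
    """Count how often each size-2 subset of [3, 7, 5, 11] is produced."""
    rng = random.Random(seed)
    counts: Counter = Counter()
    for _ in range(trials):
        A = [3, 7, 5, 11]
        random_sampling(2, A, rng)
        # Use a frozenset so that element order within the subset is ignored.
        counts[frozenset(A[:2])] += 1
    return counts


counts = subset_counts(60_000)
for subset, count in sorted(counts.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(subset), count)
```

With 4 elements and k = 2 there are 6 possible subsets, so each count should land near 10,000.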


**Variant**

Write a program that takes as input a positive integer n and a size k <= n, and returns a size-k subset of [0, 1, 2, ..., n-1]. The subset should be represented as an array. All subsets should be equally likely. In addition, all permutations of the elements of the array should be equally likely.


In [None]:
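def random_subset(n: int, k: int) -> List[int]:
    # Start from the array [0, 1, ..., n-1]; building it costs O(n) time and space.
    A = list(range(n))
    for i in range(k):
        # Pick a random index in [i, n - 1] and swap its element into position i.
        r = random.randint(i, n - 1)
        A[i], A[r] = A[r], A[i]
    # The first k positions hold the sampled elements, in random order.
    return A[:k]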
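
The version above materializes all n elements even when k is much smaller than n. A possible refinement, sketched here under assumptions and not taken from the text above, is to store only the positions that a swap has touched in a dictionary, treating every other position i as still holding the value i. The name `random_subset_sparse` and the optional `rng` parameter are hypothetical.

```python
import random
from typing import Dict, List, Optional


def random_subset_sparse(n: int, k: int, rng: Optional[random.Random] = None) -> List[int]:
    """Sample k of range(n) using O(k) extra space (illustrative sketch)."""
    gen = rng if rng is not None else random.Random()
    # changed[i] is the value currently at position i of the virtual array
    # [0, 1, ..., n-1]; positions missing from the dict still hold their index.
    changed: Dict[int, int] = {}
    for i in range(k):
        r = gen.randint(i, n - 1)
        value_at_i = changed.get(i, i)
        value_at_r = changed.get(r, r)
        # Swap the two virtual positions.
        changed[i] = value_at_r
        changed[r] = value_at_i
    return [changed[i] for i in range(k)]


print(random_subset_sparse(1_000_000, 5))
```

Because it performs the same sequence of swaps on a virtual array, it should return the same result as the array-based version when both draw the same random indices.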


In [5]:
# Another way is to use the standard library's random.sample(population, k).
# It returns a k-length list of unique elements chosen from the population sequence
# (sets are no longer accepted as of Python 3.11). It is used for random sampling without replacement.

def random_subset(n: int, k: int) -> List[int]:
    """
    Use the library random.sample(population, k) method
    """
    return random.sample(range(n), k)