## Sampling and Shuffling – Practice Problems with Solutions

This notebook contains a set of intermediate-to-advanced practice problems on Python's `random` module, focusing on:

- Shuffling sequences
- Sampling with and without replacement
- Weighted random choices
- Performance comparisons
- Good API and RNG practices (dependency injection, no hidden mutation, etc.)

---

### Problem 1 – Safe Shuffling (Avoiding Surprises)

You are given a list of items.

Write a function:

```python
def shuffled_copy(items: list, *, rng: random.Random | None = None) -> list:
    ...
```

that:

1. Returns a **new shuffled list** (does **not** mutate the input).
2. Uses a provided `random.Random` instance `rng` if given, otherwise uses the module-level `random` functions.
3. Works for any list (not just numbers).

**Requirements:**

- Don’t mutate `items`.
- Don’t use `random.seed` inside the function.
- Use `random.shuffle` internally.


In [1]:
import random
from typing import List, TypeVar

T = TypeVar('T')

def shuffled_copy(items: List[T], *, rng: random.Random | None = None) -> List[T]:
    '''
    Return a shuffled copy of the given list.

    Parameters
    ----------
    items : list[T]
        The list to shuffle. It is not modified.
    rng : random.Random | None, optional
        Optional RNG instance. If None, uses the module-level random.

    Returns
    -------
    list[T]
        A new list containing the same elements as `items`, in random order.
    '''
    # Use dependency injection instead of seeding inside
    r = rng if rng is not None else random

    copy = list(items)          # avoid mutating caller's list
    r.shuffle(copy)             # in-place shuffle of the copy only
    return copy

if __name__ == '__main__':
    base = [1, 2, 3, 4, 5]
    rng = random.Random(0)
    print('original:', base)
    print('shuffled:', shuffled_copy(base, rng=rng))
    print('still original:', base)


original: [1, 2, 3, 4, 5]
shuffled: [3, 2, 1, 5, 4]
still original: [1, 2, 3, 4, 5]
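
The point of injecting `rng` is reproducibility without touching global state. A minimal sketch of how that could be checked is shown here; the function body is repeated so the block runs on its own, and the seed `2024` and the names `first`, `second`, `same_order` are illustrative assumptions, not part of the problem statement.

```python
import random

def shuffled_copy(items, *, rng=None):
    """Same behavior as the solution above, repeated so this block runs alone."""
    r = rng if rng is not None else random
    copy = list(items)   # shuffle a copy, never the caller's list
    r.shuffle(copy)
    return copy

base = ["a", "b", "c", "d", "e"]

# Two independent generators built from the same (arbitrary) seed
# should yield the same order, with no call to random.seed anywhere.
first = shuffled_copy(base, rng=random.Random(2024))
second = shuffled_copy(base, rng=random.Random(2024))

same_order = first == second
input_untouched = base == ["a", "b", "c", "d", "e"]
same_elements = sorted(first) == sorted(base)
print(same_order, input_untouched, same_elements)
```

Because each call receives its own `random.Random`, the check does not depend on, or disturb, the module-level generator.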


---

### Problem 2 – Deal Cards Without Replacement

Model a **52-card deck** as tuples `(rank, suit)` where:

- `rank` is one of `A, 2, ..., 10, J, Q, K`,
- `suit` is one of `♠, ♥, ♦, ♣`.

Write a function:

```python
def deal_hands(num_players: int, cards_per_player: int,
               rng: random.Random | None = None) -> list[list[tuple[str, str]]]:
    ...
```

that:

1. Builds a fresh deck internally.
2. Randomly deals `cards_per_player` cards to each of `num_players`, **without replacement**.
3. Raises `ValueError` if there are not enough cards.
4. Does not expose the deck globally.

Use `random.sample` or `shuffle`, whichever you prefer.


In [2]:
from typing import Tuple

Rank = str
Suit = str
Card = Tuple[Rank, Suit]

RANKS: tuple[Rank, ...] = (
    'A', '2', '3', '4', '5', '6', '7',
    '8', '9', '10', 'J', 'Q', 'K'
)
SUITS: tuple[Suit, ...] = ('♠', '♥', '♦', '♣')

def build_deck() -> List[Card]:
    '''Return a standard 52-card deck as a list of (rank, suit).'''
    return [(rank, suit) for suit in SUITS for rank in RANKS]

def deal_hands(num_players: int, cards_per_player: int,
               rng: random.Random | None = None) -> List[List[Card]]:
    '''
    Deal cards_per_player cards to num_players, without replacement.

    Raises ValueError if not enough cards.
    '''
    if num_players <= 0 or cards_per_player <= 0:
        raise ValueError('num_players and cards_per_player must be positive')

    deck = build_deck()  # fresh deck
    total_needed = num_players * cards_per_player

    if total_needed > len(deck):
        raise ValueError('Not enough cards to deal')

    r = rng if rng is not None else random
    # Efficient way: shuffle once, then slice
    r.shuffle(deck)

    hands: List[List[Card]] = []
    index = 0
    for _ in range(num_players):
        hand = deck[index:index + cards_per_player]
        hands.append(hand)
        index += cards_per_player

    return hands

if __name__ == '__main__':
    rng = random.Random(42)
    for i, hand in enumerate(deal_hands(4, 5, rng=rng), start=1):
        print(f'Player {i}:', hand)


Player 1: [('10', '♠'), ('J', '♥'), ('K', '♥'), ('4', '♠'), ('9', '♥')]
Player 2: [('K', '♦'), ('4', '♥'), ('A', '♣'), ('7', '♥'), ('Q', '♠')]
Player 3: [('8', '♣'), ('Q', '♥'), ('8', '♦'), ('4', '♦'), ('6', '♦')]
Player 4: [('5', '♣'), ('5', '♠'), ('3', '♦'), ('J', '♠'), ('A', '♦')]
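
The problem allows either `random.sample` or `shuffle`. As a hedged alternative to the shuffle-and-slice solution, the sketch below draws only the cards that are needed with `sample` and then splits them into hands. The name `deal_hands_with_sample` is hypothetical, and the deck constants are repeated so the block is self-contained.

```python
import random

RANKS = ("A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K")
SUITS = ("♠", "♥", "♦", "♣")

def deal_hands_with_sample(num_players, cards_per_player, rng=None):
    """Hypothetical variant: draw all needed cards at once with sample()."""
    if num_players <= 0 or cards_per_player <= 0:
        raise ValueError("num_players and cards_per_player must be positive")

    deck = [(rank, suit) for suit in SUITS for rank in RANKS]  # fresh deck
    total_needed = num_players * cards_per_player
    if total_needed > len(deck):
        raise ValueError("Not enough cards to deal")

    r = rng if rng is not None else random
    drawn = r.sample(deck, total_needed)  # distinct cards, in random order

    # Split the drawn cards into consecutive chunks, one per player.
    return [drawn[i:i + cards_per_player]
            for i in range(0, total_needed, cards_per_player)]

hands = deal_hands_with_sample(4, 5, rng=random.Random(1))
for number, hand in enumerate(hands, start=1):
    print(f"Player {number}:", hand)
```

Both approaches deal without replacement. The `sample` variant leaves the deck list itself unshuffled, which may matter if the remaining cards are inspected afterwards.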


---

### Problem 3 – Simulate a Biased Die With `choices`

You have a 6-sided die with these relative weights:

- 1: weight 1  
- 2: weight 1  
- 3: weight 1  
- 4: weight 1  
- 5: weight 2  
- 6: weight 4  

1. Use `random.choices` with a `weights` argument to simulate **100_000** rolls.
2. Compute the **relative frequency (in %) for each face**.
3. Print them sorted by face.
4. Check that 6 appears the most often and that 1–4 are roughly similar and less frequent than 5 and 6.


In [3]:
from collections import Counter

def simulate_biased_die(num_rolls: int = 100_000,
                        rng: random.Random | None = None) -> dict[int, float]:
    '''
    Simulate rolling a biased 6-sided die and return relative frequencies (%).
    '''
    faces = [1, 2, 3, 4, 5, 6]
    weights = [1, 1, 1, 1, 2, 4]  # relative weights
    r = rng if rng is not None else random

    rolls = r.choices(faces, weights=weights, k=num_rolls)
    counts = Counter(rolls)

    return {face: counts[face] / num_rolls * 100 for face in faces}

if __name__ == '__main__':
    rng = random.Random(0)
    freqs = simulate_biased_die(rng=rng)
    for face in sorted(freqs):
        print(f'{face}: {freqs[face]:6.2f}%')


1:  10.04%
2:  10.09%
3:  10.02%
4:  10.09%
5:  19.84%
6:  39.93%
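
To make step 4 concrete, the expected percentages follow directly from the weights: each face's weight divided by the total weight of 10. The short sketch below computes them and compares against the frequencies printed above; the `observed` dict simply copies that output, and `largest_gap` is an illustrative name.

```python
faces = [1, 2, 3, 4, 5, 6]
weights = [1, 1, 1, 1, 2, 4]

# Expected share of each face is its weight over the sum of all weights.
total_weight = sum(weights)
expected = {face: w / total_weight * 100 for face, w in zip(faces, weights)}

# Frequencies copied from the simulation output above (seed 0, 100_000 rolls).
observed = {1: 10.04, 2: 10.09, 3: 10.02, 4: 10.09, 5: 19.84, 6: 39.93}

largest_gap = max(abs(observed[face] - expected[face]) for face in faces)

for face in faces:
    print(f"{face}: expected {expected[face]:5.1f}%, observed {observed[face]:5.2f}%")
print(f"largest gap: {largest_gap:.2f} percentage points")
```

With these numbers the largest gap is at face 5 (19.84% observed against 20% expected), which is consistent with ordinary sampling noise at this number of rolls.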


---

### Problem 4 – Re-implement `sample` Without Using `random.sample`

Implement a function:

```python
def sample_without_replacement(population: list[T], k: int,
                               rng: random.Random | None = None) -> list[T]:
    ...
```

that mimics `random.sample(population, k)` **without using** `random.sample` or `random.shuffle`.

Hints:

- You can use `random.randrange`.
- You must **not pick the same element twice**.
- Time complexity should be **O(k)** on average if `k` is small relative to `len(population)`.


In [4]:
def sample_without_replacement(population: List[T], k: int,
                               rng: random.Random | None = None) -> List[T]:
    '''
    Sample k distinct elements from population without using random.sample or shuffle.

    Complexity is O(k) expected, assuming k << len(population).
    '''
    n = len(population)
    if k < 0 or k > n:
        raise ValueError('Sample larger than population or is negative.')

    r = rng if rng is not None else random

    chosen_indices: set[int] = set()
    result: List[T] = []

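    # Rejection sampling: keep drawing indices until k distinct ones are found.
    # Retries are rare when k << n but become frequent as k approaches n.
    while len(chosen_indices) < k: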
        idx = r.randrange(n)
        if idx not in chosen_indices:
            chosen_indices.add(idx)
            result.append(population[idx])

    return result

if __name__ == '__main__':
    rng = random.Random(123)
    population = list(range(10))
    print(sample_without_replacement(population, 3, rng=rng))


[0, 4, 1]
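
The rejection loop above meets the O(k) target only while `k` is small relative to the population. One possible alternative, sketched here as an assumption rather than as how `random.sample` is implemented, is a partial Fisher-Yates shuffle over a virtual index array whose swaps are stored in a dict. It uses only `randrange` and performs exactly `k` draws for any valid `k`. The names `sample_sparse_fisher_yates` and `swapped` are hypothetical.

```python
import random

def sample_sparse_fisher_yates(population, k, rng=None):
    """Hypothetical sketch: partial Fisher-Yates over a virtual index array."""
    n = len(population)
    if not 0 <= k <= n:
        raise ValueError("Sample larger than population or is negative.")

    r = rng if rng is not None else random

    # Virtual array a[i] == i; only positions changed by a swap are stored.
    swapped = {}
    result = []
    for i in range(k):
        j = r.randrange(i, n)           # pick a position from the unused tail
        pick = swapped.get(j, j)        # index currently held at position j
        swapped[j] = swapped.get(i, i)  # move position i's index into slot j
        result.append(population[pick])
    return result

population = list(range(10))
small = sample_sparse_fisher_yates(population, 3, rng=random.Random(123))
full = sample_sparse_fisher_yates(population, 10, rng=random.Random(123))
print(small, full)
```

Position `i` is never read again after step `i`, so the sketch skips writing the picked index back to it. Extra memory stays proportional to `k`, not to the population size.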


---

### Problem 5 – Range Sampling vs `randrange`: Performance Wrapper

Wrap the performance test into a reusable function:

```python
def compare_sampling_speed(N: int, k: int, trials: int) -> None:
    ...
```

This should:

1. Time `random.sample(range(N), k)` repeated `trials` times.
2. Time `[random.randrange(N) for _ in range(k)]` repeated `trials` times.
3. Print the two timings and which one was faster.

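Use `time.perf_counter`. Keep in mind that the two approaches are not equivalent: `random.sample` draws **without** replacement, while the `randrange` list draws **with** replacement and may contain duplicates, so the timings compare different guarantees.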


In [5]:
from time import perf_counter

def compare_sampling_speed(N: int, k: int, trials: int = 100) -> None:
    '''
    Compare speed of random.sample(range(N), k) vs list of randrange calls.
    '''
    # Time random.sample
    start = perf_counter()
    for _ in range(trials):
        _ = random.sample(range(N), k)
    sample_time = perf_counter() - start

    # Time randrange list comprehension
    start = perf_counter()
    for _ in range(trials):
        _ = [random.randrange(N) for _ in range(k)]
    randrange_time = perf_counter() - start

    print(f'random.sample:   {sample_time:.6f} s')
    print(f'randrange list:  {randrange_time:.6f} s')
    if sample_time < randrange_time:
        print('random.sample is faster in this setup.')
    else:
        print('randrange list is faster in this setup.')

if __name__ == '__main__':
    compare_sampling_speed(N=1_000_000, k=10_000, trials=50)


random.sample:   0.474983 s
randrange list:  0.344070 s
randrange list is faster in this setup.
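
When reading these timings, it helps to see that the two expressions give different guarantees. The small sketch below (all variable names are illustrative) draws one more value than the range contains: the `randrange` list must then repeat a value, while `sample` rejects the request.

```python
import random

rng = random.Random(0)
N = 5

# With replacement: N + 1 draws from N possible values must repeat
# at least one value (pigeonhole principle).
with_replacement = [rng.randrange(N) for _ in range(N + 1)]
has_duplicates = len(set(with_replacement)) < len(with_replacement)

# Without replacement: sampling all N values yields each exactly once.
without_replacement = rng.sample(range(N), N)
all_distinct = len(set(without_replacement)) == N

# Asking sample() for more items than exist raises ValueError.
try:
    rng.sample(range(N), N + 1)
    oversample_error = False
except ValueError:
    oversample_error = True

print(has_duplicates, all_distinct, oversample_error)
```

So a faster `randrange` list does not make it a drop-in replacement for `random.sample`; it is only appropriate when duplicates are acceptable.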


---

### Problem 6 – Weighted Choices for A/B/n Testing

You run a website with three possible layouts: `'A'`, `'B'`, `'C'`.

You want:

- 70% of visitors to see `'A'`, 20% to see `'B'`, and 10% to see `'C'`.

1. Implement `choose_layout`:

```python
def choose_layout(rng: random.Random | None = None) -> str:
    ...
```

2. Run it 50_000 times and show empirical percentages.
3. Don’t use `random.uniform` or custom threshold logic; use `random.choices`.


In [6]:
def choose_layout(rng: random.Random | None = None) -> str:
    '''
    Choose a layout with target probabilities:
    A: 70%, B: 20%, C: 10%.
    '''
    r = rng if rng is not None else random
    layouts = ['A', 'B', 'C']
    weights = [70, 20, 10]  # relative probabilities
    return r.choices(layouts, weights=weights, k=1)[0]

if __name__ == '__main__':
    rng = random.Random(123)
    num_visitors = 50_000
    chosen = [choose_layout(rng=rng) for _ in range(num_visitors)]
    counts = Counter(chosen)

    for layout in ['A', 'B', 'C']:
        pct = counts[layout] / num_visitors * 100
        print(f'{layout}: {pct:6.2f}%')


A:  69.77%
B:  20.06%
C:  10.17%
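
Calling `choices` once per visitor is fine for live traffic, but for a simulation all assignments can be drawn in a single call. The sketch below assumes a hypothetical helper `choose_layouts_batch` and passes the weights in cumulative form through the `cum_weights` parameter of `random.choices`.

```python
import random
from collections import Counter

def choose_layouts_batch(num_visitors, rng=None):
    """Hypothetical helper: assign layouts to many visitors in one call."""
    r = rng if rng is not None else random
    layouts = ["A", "B", "C"]
    cum_weights = [70, 90, 100]  # running totals of the weights 70, 20, 10
    return r.choices(layouts, cum_weights=cum_weights, k=num_visitors)

assignments = choose_layouts_batch(50_000, rng=random.Random(123))
counts = Counter(assignments)
shares = {layout: counts[layout] / len(assignments) * 100 for layout in "ABC"}

for layout, pct in shares.items():
    print(f"{layout}: {pct:6.2f}%")
```

Supplying `cum_weights` directly spares `choices` from accumulating the weights itself on each call, which is mostly relevant when the same weights are reused many times.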


---

### Problem 7 – Frequency Analysis Helper (Mini `analyze_choices`)

Recreate a simplified version of an `analyze_choices` utility:

```python
def analyze_uniform_choices(population: str,
                            num_choices: int,
                            choice_size: int,
                            rng: random.Random | None = None) -> dict[str, float]:
    ...
```

This should:

1. Use `random.choices` to generate `num_choices` lists, each of size `choice_size`.
2. Compute **relative frequencies (in %) for each element** in the population.
3. Return the frequencies as a dict (you can sort keys when printing).


In [7]:
def analyze_uniform_choices(population: str,
                            num_choices: int,
                            choice_size: int,
                            rng: random.Random | None = None) -> dict[str, float]:
    '''
    Run random.choices on `population` num_choices times (each of size choice_size)
    and return a dict of relative frequencies (%).
    '''
    if not population:
        raise ValueError('Population must not be empty.')
    if num_choices <= 0 or choice_size <= 0:
        raise ValueError('num_choices and choice_size must be positive.')

    r = rng if rng is not None else random
    all_picks: list[str] = []

    for _ in range(num_choices):
        picks = r.choices(population, k=choice_size)
        all_picks.extend(picks)

    counts = Counter(all_picks)
    total = num_choices * choice_size
    freqs = {char: counts[char] / total * 100 for char in population}
    return freqs

if __name__ == '__main__':
    rng = random.Random(0)
    population = 'abcdefghij'
    freqs = analyze_uniform_choices(population, num_choices=10_000, choice_size=5, rng=rng)
    for k in sorted(freqs):
        print(f'{k}: {freqs[k]:6.2f}%')


a:  10.06%
b:  10.03%
c:   9.93%
d:  10.25%
e:   9.76%
f:  10.26%
g:   9.83%
h:   9.94%
i:   9.97%
j:   9.97%
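
A quick way to judge output like this is to measure how far each frequency sits from the uniform expectation of `100 / len(population)`. The helper below is a hypothetical addition (`max_deviation_from_uniform` is not part of the original utility), and the `freqs` dict copies the percentages printed above.

```python
def max_deviation_from_uniform(freqs):
    """Return (element, gap) for the largest distance from a uniform share."""
    expected = 100 / len(freqs)  # uniform share in percent
    worst = max(freqs, key=lambda key: abs(freqs[key] - expected))
    return worst, abs(freqs[worst] - expected)

# Percentages copied from the run above (seed 0, 10_000 draws of size 5).
freqs = {
    "a": 10.06, "b": 10.03, "c": 9.93, "d": 10.25, "e": 9.76,
    "f": 10.26, "g": 9.83, "h": 9.94, "i": 9.97, "j": 9.97,
}

element, gap = max_deviation_from_uniform(freqs)
print(f"largest deviation: {element!r} at {gap:.2f} percentage points")
```

For this run the largest deviation is about a quarter of a percentage point, which fits a uniform draw of 50,000 picks.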


---

### Problem 8 – Correct Handling of Deprecated Set Sampling

Sampling from a `set` with `random.sample` was deprecated in Python 3.9 (it emits a **DeprecationWarning** in 3.9 and 3.10) and raises `TypeError` from Python 3.11 onward.

Write a function:

```python
def safe_sample(population, k: int,
                rng: random.Random | None = None) -> list:
    ...
```

that:

1. Accepts any iterable (`list`, `tuple`, `set`, `range`, etc.).
2. Converts **non-sequence** iterables (like `set`) to a list internally.
3. Then uses `random.sample` (or `Random.sample`) to return a sample of size `k`.
4. Raises `ValueError` if `k` is invalid (delegate to `random.sample`).


In [8]:
from collections.abc import Sequence, Iterable

def safe_sample(population: Iterable[T], k: int,
                rng: random.Random | None = None) -> List[T]:
    '''
    Safely sample k elements from any iterable.

    If population is not a Sequence (e.g., set, generator), it is first converted
    to a list to avoid deprecation warnings and to work with random.sample.
    '''
    # A Sequence (supports indexing and len) can be sampled directly.
    if isinstance(population, Sequence):
        seq = population
    else:
        # Snapshot the iterable into a list. For a set, the resulting order is
        # arbitrary (and for str elements it varies with hash randomization),
        # so a seeded rng alone does not make the sample reproducible across runs.
        seq = list(population)

    # Both random.Random instances and the random module expose sample(),
    # so no isinstance check on the generator is needed.
    r = rng if rng is not None else random
    return r.sample(seq, k)

if __name__ == '__main__':
    rng = random.Random(0)
    s = {'a', 'b', 'c', 'd', 'e', 'f'}
    print(safe_sample(s, 3, rng=rng))


['f', 'c', 'a']
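
Because a set of strings has no stable iteration order across interpreter runs, the sample printed above can change from run to run even with a fixed seed. One possible remedy, sketched here under the assumption that the elements are mutually orderable, is to sort non-sequence inputs before sampling. The name `reproducible_sample` is hypothetical.

```python
import random
from collections.abc import Sequence

def reproducible_sample(population, k, rng=None):
    """Hypothetical variant of safe_sample that fixes the order of sets."""
    r = rng if rng is not None else random
    if isinstance(population, Sequence):
        seq = population
    else:
        # Assumption: elements can be compared with "<". Sorting removes
        # any dependence on the set's internal ordering.
        seq = sorted(population)
    return r.sample(seq, k)

letters = {"a", "b", "c", "d", "e", "f"}

# The same elements, built in a different order, with equal seeds.
first = reproducible_sample(letters, 3, rng=random.Random(0))
second = reproducible_sample(set("fedcba"), 3, rng=random.Random(0))
print(first == second)
```

Sorting costs O(n log n) and fails for elements that cannot be compared, so this is a trade-off rather than a general replacement for `safe_sample`.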
