# INFO 6205 - Program Structure and Algorithms
# Worked Assignment 5 Solutions
### Student Name: Yuxuan Zhang
### Professor: Nik Bear Brown
### Date: 12/01/2023

## Q1 (10 Points)
In a collectible card game, players receive cards randomly for completing challenges. There are n distinct types of cards. After completing each challenge, a player receives one card, chosen randomly and with equal probability from the n types. What is the expected number of challenges a player must complete to collect at least one of each type of card?

## Reflection

1. **The Nature of Randomness:** The problem highlights how randomness doesn't equate to uniform distribution of outcomes over a short period. A player might receive duplicate cards before completing the set, illustrating the unpredictable nature of random events.

2. **Non-Linear Progression:** The expected number of challenges needed increases non-linearly as the player collects more types. It's easier to find a new card at the beginning when many types are uncollected, but it becomes progressively harder as the collection grows, demonstrating diminishing returns.

3. **Harmonic Numbers:** The solution introduces the concept of Harmonic numbers, which are significant in various areas of mathematics and computer science. This problem provides a practical application for these numbers, showing their relevance in calculating expectations in random distributions.

4. **Applications Beyond Games:** While framed in the context of a game, the principles apply to many real-world scenarios. For instance, it could model situations in ecology (like expecting different species in samples), marketing (like collecting a variety of customer feedback), or even in computer science (like hashing algorithms and collision avoidance).

5. **Educational Value:** This problem is a classic example used in teaching probability and statistics. It helps students understand complex concepts like expected value and probability distributions through a relatable and engaging example.


## Solution
To find the expected number of challenges a player must complete to collect at least one of each type of card in a set of $n$ distinct types, we can use the concept of expected value in probability.

The problem is akin to the "Coupon Collector's Problem." Here's how it works:

1. **First Card:** The first card a player collects is always a new type, so it only takes 1 challenge to get the first new card.

2. **Second Card:** For the second card, there are $n-1$ new types out of $n$ total types. The probability of getting a new type in each challenge is $\frac{n-1}{n}$. The number of challenges until a new type appears is geometrically distributed, so its expected value is the reciprocal of this probability, which is $\frac{n}{n-1}$.

3. **Third Card:** Similarly, for the third card, the probability of getting a new type is $\frac{n-2}{n}$, and the expected number of challenges is $\frac{n}{n-2}$.

4. **Continuing this Pattern:** This pattern continues until the player collects all $n$ types. For the $k$-th card, where $k$ ranges from 1 to $n$, the expected number of challenges to get a new card is $\frac{n}{n-(k-1)}$.

The total expected number of challenges is the sum of the expected values for each of these stages:

$
\text{Expected number of challenges} = \sum_{k=1}^{n} \frac{n}{n-(k-1)} = n \left( \frac{1}{n} + \frac{1}{n-1} + \frac{1}{n-2} + \ldots + \frac{1}{1} \right)
$

The sum in parentheses is the $n$-th harmonic number, denoted $H_n$. Therefore, the formula becomes:

$
\text{Expected number of challenges} = n \times H_n
$

where $H_n = 1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n}$.

This formula gives us the expected number of challenges a player must complete to collect at least one of each type of card.

In [1]:
import random

def coupon_collector(n):
    """
    Simulate the coupon collector's problem for n different types of coupons (or cards).
    
    Args:
    n (int): The total number of distinct coupon types.

    Returns:
    int: The total number of coupons collected to complete the set.
    """
    collected_types = [False] * n
    num_collected = 0
    num_challenges = 0

    while num_collected < n:
        num_challenges += 1
        card_type = random.randint(0, n-1)  # Simulate obtaining a random card type
        if not collected_types[card_type]:
            collected_types[card_type] = True
            num_collected += 1

    return num_challenges

# Example usage
n = 10  # Let's say there are 10 different types of cards
total_challenges = coupon_collector(n)
total_challenges

19
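
A single run like the one above is only one random sample, so it can land far from the expectation. As a rough check of the $n \times H_n$ formula, the sketch below computes the exact value with fractions and compares it with the average of many simulated runs. The helper names `expected_challenges` and `simulate_once`, the seed, and the trial count are illustrative choices, not part of the original solution.

```python
from fractions import Fraction
import random

def expected_challenges(n):
    """Exact value of n * H_n, returned as a Fraction."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))

def simulate_once(n, rng):
    """Draw uniformly random card types until all n types have been seen."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws

n = 10
exact = expected_challenges(n)
rng = random.Random(0)   # fixed seed (an assumption) so the run is repeatable
trials = 20000
average = sum(simulate_once(n, rng) for _ in range(trials)) / trials
print(f"exact n*H_n = {float(exact):.4f}, simulated average = {average:.4f}")
```

For $n = 10$ the exact value is $7381/252 \approx 29.29$, so a single run of 19 challenges is well within the normal spread.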

## Q2 (10 Points) 
PushPush is a 2-D pushing-blocks game with the following rules:
#### Initial Setup:
A rectangular grid is set up with several single-cell tiles placed at various positions. A robot is also positioned on a designated cell within this grid.

#### Robot Movement:
The robot has the capability to move to any adjacent cell, provided the cell is either vacant or contains a movable single-cell tile.

#### Tile Sliding:
In this version of the game, when the robot pushes an adjacent tile, the tile moves exactly one cell in the direction of the push. This differs from the traditional PushPush mechanic where the tile moves to the farthest possible extent.

#### Tile Merging:
When a tile is pushed into another tile, they merge to form a larger tile that occupies two cells. These merged tiles become immovable and cannot be traversed.

#### Goal:
The aim of the game is to maneuver the robot to a specific target cell, which is initially inaccessible due to the placement of the tiles.

#### Solution Specification:

- MoveRobot(x, y): Command to move the robot from its current position to the coordinates (x, y).
- SlideTile(x, y): Command to slide a tile one cell towards the specified coordinates (x, y).
- CheckGoal(robot): Function to determine if the robot has reached its target position.

#### Example Problem Statement:
"Given a particular arrangement of tiles and a robot on a rectangular grid, devise a sequence of moves that will allow the robot to reach a designated goal cell."

## Reflection

1. **Strategic Thinking and Planning:** The game requires players to think several steps ahead. Since tiles merge into immovable objects when pushed together, each move can significantly alter the playing field. Players must plan their moves carefully to avoid creating obstacles that could block the path to the goal.

2. **Spatial Reasoning:** The game is a great exercise in spatial reasoning. Players need to visualize the effects of their moves on the grid, considering how sliding tiles and merging them will change the layout.

3. **Algorithmic Thinking:** From a computational perspective, this game presents an interesting problem in pathfinding and state-space search. Finding the most efficient sequence of moves to reach the goal is akin to solving a puzzle, where each move changes the state of the game board. It's a practical illustration of concepts like search algorithms, heuristics, and optimization.

4. **Complexity from Simple Rules:** This game is a classic example of how a set of simple rules can create a complex and challenging puzzle. The mechanics of moving the robot and sliding tiles are straightforward, but the emergent gameplay from these simple interactions can be deeply engaging and complex.

5. **Educational Value:** For educational purposes, this game can be used to teach problem-solving, logical reasoning, and even basic programming concepts. It can be an excellent tool for engaging students in computational thinking.

6. **Adaptability and Variability:** The game's rules are simple yet flexible, allowing for a wide range of puzzles and difficulties. This adaptability makes it suitable for a variety of skill levels, from beginners to advanced players.

## Solution
To provide a solution for the modified PushPush game, we would need a specific grid layout including the starting positions of the robot and the tiles, as well as the target position for the robot. The solution would involve a sequence of `MoveRobot(x, y)` and `SlideTile(x, y)` commands to navigate the robot to the goal while managing the positions and mergers of the tiles.

Since the game is a puzzle, the solution can vary widely depending on the initial setup, and there is often more than one way to solve it. In more complex setups, finding an optimal solution may require search algorithms such as depth-first search, breadth-first search, or A*. Consider the following small setup, with coordinates written as (x, y):

1. **Grid Size:** 4x4.
2. **Robot's Starting Position:** (0, 0) - top left corner.
3. **Tiles' Positions:** 
   - Tile 1 at (1, 0)
   - Tile 2 at (2, 0)
4. **Goal Position for the Robot:** (3, 0) - far right on the top row.

The objective is to move the robot to (3, 0). The direct route along the top row is blocked by the two tiles, and they sit next to each other: pushing Tile 1 to the right would drive it into Tile 2, and the two would merge into an immovable block. The robot therefore leaves both tiles in place and walks around them through the row below.

#### Solution Steps:

1. `MoveRobot(0, 1)`: Step down out of the top row.
2. `MoveRobot(1, 1)`: Step right, passing below Tile 1.
3. `MoveRobot(2, 1)`: Step right, passing below Tile 2.
4. `MoveRobot(3, 1)`: Step right to the cell directly below the goal.
5. `MoveRobot(3, 0)`: Step up into the goal position.

This sequence of moves allows the robot to reach the goal without merging the tiles, thereby solving the puzzle for this particular setup. No `SlideTile` command is needed here; layouts in which every route is blocked require pushes as well. Keep in mind that solutions can vary greatly based on the initial configuration of the grid, the robot, and the tiles.

In [2]:
def move_robot(robot_pos, new_pos, grid):
    """
    Simulate moving the robot one step to an adjacent, empty cell.
    Positions are (x, y); the grid is indexed as grid[y][x].
    """
    x, y = new_pos
    # Reject moves that are not a single step or that enter a tile's cell
    assert abs(x - robot_pos[0]) + abs(y - robot_pos[1]) == 1, "not adjacent"
    assert grid[y][x] == 0, "cell is occupied"
    print(f"MoveRobot from {robot_pos} to {new_pos}")
    return new_pos

def slide_tile(tile_pos, new_pos, grid):
    """
    Simulate sliding a tile one cell into an empty cell (not needed in this layout).
    """
    (x, y), (nx, ny) = tile_pos, new_pos
    print(f"SlideTile from {tile_pos} to {new_pos}")
    grid[ny][nx] = grid[y][x]
    grid[y][x] = 0

def solve_pushpush(grid, robot_pos, goal_pos):
    """
    Solve the example puzzle by walking around the tiles through row y = 1.
    """
    for cell in [(0, 1), (1, 1), (2, 1), (3, 1), goal_pos]:
        robot_pos = move_robot(robot_pos, cell, grid)

    return grid, robot_pos

# Initial grid setup; grid[y][x], so grid[0] is the top row
grid = [[0 for _ in range(4)] for _ in range(4)]
grid[0][1] = 1  # Tile 1 at (1, 0)
grid[0][2] = 2  # Tile 2 at (2, 0)

# Robot's starting position and goal position, both as (x, y)
robot_pos = (0, 0)
goal_pos = (3, 0)

# Solve the puzzle
final_grid, final_robot_pos = solve_pushpush(grid, robot_pos, goal_pos)
final_grid, final_robot_pos

MoveRobot from (0, 0) to (0, 1)
MoveRobot from (0, 1) to (1, 1)
MoveRobot from (1, 1) to (2, 1)
MoveRobot from (2, 1) to (3, 1)
MoveRobot from (3, 1) to (3, 0)


([[0, 1, 2, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], (3, 0))

## Q3 (10 Points)
What is a 'steady state' in a Hopfield Network?

## Reflection

1. **Understanding Neural Network Dynamics:** The 'steady state' in a Hopfield Network exemplifies how neural networks can reach equilibrium. This state is achieved when the network's neuron activations stop changing, indicating that the network has settled into a stable pattern. This concept is crucial for understanding how some neural networks process and stabilize information.

2. **Association with Memory and Pattern Recognition:** Hopfield Networks are often used to model associative memory. The steady state is significant because it usually corresponds to a pattern or memory the network has stored. This property illustrates the network's ability to recall specific patterns from incomplete or noisy inputs, making it a powerful tool for pattern recognition tasks.

3. **Convergence and Stability Analysis:** The study of how and when Hopfield Networks reach a steady state involves convergence and stability analysis, which are key topics in the study of dynamic systems. Understanding these aspects is crucial for designing and applying these networks effectively.

4. **Implications for Learning and Memory in Biological Systems:** The concept of a steady state in neural networks like Hopfield's also provides insights into how learning and memory might work in biological systems. It offers a simplified model for how the brain might store and recall information.

5. **Challenges in Network Design:** Designing a Hopfield Network to ensure it reaches a desired steady state (especially in the presence of multiple stable states) poses interesting challenges. It requires careful consideration of network structure, initial conditions, and the learning algorithm used to set the weights.

6. **Applications in Computing:** Beyond theoretical interest, the steady state phenomenon in Hopfield Networks has practical applications in solving optimization problems, error correction in data transmission, and even in developing algorithms for content-addressable memory systems.

## Solution

1. **Setting up a Hopfield Network:** Define the network with a set of neurons and initialize their states.
2. **Defining the Connection Weights:** Establish the connection weights between neurons, which could be based on a set of patterns that the network is meant to learn and recall.
3. **Simulating the Network Dynamics:** Run the network dynamics by updating the states of neurons iteratively based on the input and the network's weight matrix.
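4. **Observing the Steady State:** After enough iterations, the network settles into a steady state: a configuration in which every neuron's state already equals the sign of its weighted input, so further updates do not change the neurons' states. With symmetric weights and no self-connections, asynchronous updates never increase the network's energy, so such a fixed point (a local energy minimum) is always reached.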
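
The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `train_hebbian` and `run_to_steady_state`, the Hebbian weight rule, the six-neuron pattern, and the tie-breaking rule (a zero input maps to +1) are all choices made for this example.

```python
def train_hebbian(patterns):
    """Build a symmetric weight matrix with zero diagonal from +1/-1 patterns."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / n
    return weights

def run_to_steady_state(weights, state, max_sweeps=100):
    """Update neurons one at a time until a full sweep changes nothing."""
    state = list(state)
    n = len(state)
    for sweep in range(1, max_sweeps + 1):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            # Steady state: every neuron already agrees with its input
            return state, sweep
    return state, max_sweeps

stored = [1, -1, 1, -1, 1, -1]
weights = train_hebbian([stored])
noisy = [1, -1, -1, -1, 1, -1]   # stored pattern with one bit flipped
steady, sweeps = run_to_steady_state(weights, noisy)
print(steady, sweeps)
```

Starting from the noisy input, the first sweep repairs the flipped bit and the second sweep changes nothing, which is exactly the steady-state condition; here the steady state coincides with the stored pattern.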

## Q4 (15 Points) 
Does the payoff matrix for the 'Prisoner's Dilemma' game contain any Nash equilibria?

## Reflection

1. **Understanding Nash Equilibrium:** The concept of Nash equilibrium, a fundamental principle in game theory, illustrates how in certain strategic scenarios, players can reach a state where no one benefits from changing their strategy unilaterally. Understanding whether a game has a Nash equilibrium is crucial for predicting the outcomes and behaviors of rational players.

2. **Insights into Human Behavior:** These types of questions, especially in the context of the "Prisoner's Dilemma," provide deep insights into human cooperation, trust, and conflict. They demonstrate how individual rationality can lead to collective irrationality, where players might not achieve the best collective or individual outcome.

3. **Application Across Disciplines:** The search for Nash equilibria in various games is not just limited to theoretical exercises. It has practical applications in economics, politics, sociology, and evolutionary biology, helping to model and understand competitive and cooperative interactions in these fields.

4. **Strategic Thinking and Real-World Implications:** Analyzing games like the "Prisoner's Dilemma" encourages strategic thinking, showcasing how the choices of others impact an individual's decision-making process. It also reflects real-world scenarios where individuals or groups must choose between cooperative and selfish behaviors.

5. **Complexity of Decision-Making:** These games underline the complexity of decision-making in scenarios where outcomes depend not only on one's actions but also on the actions of others. It emphasizes the importance of anticipating others' decisions in strategic planning.

6. **Educational Value in Game Theory:** Exploring Nash equilibria in game theory problems is a valuable educational tool, helping students and learners grasp the intricate nature of strategic interactions and the mathematical underpinnings of decision-making processes.

## Solution
To determine whether a game's payoff matrix has any Nash equilibria, we check each strategy pair and ask whether either player could gain by deviating unilaterally. The classic "Prisoner's Dilemma" illustrates the procedure.

In the "Prisoner's Dilemma," two players (prisoners) must independently decide whether to cooperate with each other or to betray. The payoff matrix usually looks something like this:

|            | Cooperate | Betray |
|------------|-----------|--------|
| Cooperate  | R, R      | S, T   |
| Betray     | T, S      | P, P   |

Where:
- T is the temptation payoff (received if a player betrays the other while the other cooperates)
- R is the reward for mutual cooperation
- P is the punishment for mutual betrayal
- S is the sucker's payoff (received if a player cooperates while the other betrays)

Typically, the values are set so that T > R > P > S.

A Nash equilibrium occurs when each player's strategy is optimal given the other player's strategy. In the "Prisoner's Dilemma," betraying is always the best individual strategy, regardless of what the other player does. This is because:
- If the other player cooperates, betraying gives the higher temptation payoff T (compared to the lower reward R for mutual cooperation).
- If the other player betrays, betraying avoids the sucker's payoff S and leads to the punishment payoff P, which is better than being the sucker.

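Therefore, mutual betrayal is a Nash equilibrium of the "Prisoner's Dilemma": neither player can gain by switching to Cooperate alone. Since Betray strictly dominates Cooperate when T > R and P > S, it is the only one, even though both players would be better off under mutual cooperation (R > P).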

In [3]:
def find_nash_equilibria(payoff_matrix):
    """
    Find Nash equilibria in a 2x2 payoff matrix for a two-player game.

    Args:
    payoff_matrix (list of lists): The payoff matrix of the game.
                                   Format: [[(R, R), (S, T)], [(T, S), (P, P)]]
                                   where R, S, T, P are the payoffs for Cooperate/Betray choices.

    Returns:
    list of tuples: List of strategy pairs that are Nash equilibria.
    """
    nash_equilibria = []

    # Check each player's strategy against the other's
    for i in range(2):
        for j in range(2):
            # Player 1 compares rows within column j; Player 2 compares columns within row i
            player1_strategy = payoff_matrix[i][j][0] >= payoff_matrix[1-i][j][0]
            player2_strategy = payoff_matrix[i][j][1] >= payoff_matrix[i][1-j][1]
            if player1_strategy and player2_strategy:
                nash_equilibria.append(("Cooperate" if i == 0 else "Betray", 
                                       "Cooperate" if j == 0 else "Betray"))

    return nash_equilibria

# Example: Prisoner's Dilemma Payoff Matrix
# T > R > P > S, for example, T=5, R=3, P=1, S=0
payoff_matrix = [[(3, 3), (0, 5)], [(5, 0), (1, 1)]]

# Find Nash Equilibria
nash_equilibria = find_nash_equilibria(payoff_matrix)
nash_equilibria

[('Betray', 'Betray')]

## Q5 (15 Points)  
Consider a fair coin (with a heads and a tails side, each having an equal probability of landing). How many independent flips X are needed until the first heads is flipped? Express the expectation as a function of the probability of flipping heads.

## Reflection

1. **Understanding Geometric Distribution:** This problem is a classic example of geometric distribution, which describes the number of Bernoulli trials needed for a success to occur. It's a fundamental concept in probability theory, illustrating how distributions can model real-world random processes.

2. **Simplicity of Expected Value Calculation:** The problem demonstrates how the expected value can be easily calculated in certain probability distributions. In this case, with a fair coin (probability $p = \frac{1}{2}$ for heads), the expected number of flips is simply the reciprocal of the probability of success ($E[X] = \frac{1}{p}$).

3. **Intuitive Understanding of Averages:** The result, that on average it takes two flips to get a heads, aligns well with our intuitive understanding of probability. This provides an intuitive check that our mathematical understanding of probability corresponds to what we might expect in real life.

4. **Application in Decision Making:** This type of problem, though simple, is analogous to many real-world scenarios where decisions or predictions are made based on the probability of certain outcomes. It highlights the importance of understanding the underlying probability distributions for effective decision-making.

5. **Exploring Variability in Random Processes:** While the expected number of flips is 2, the actual number in any given trial could be more or less. This underscores the concept of variability in random processes and the distinction between expected and actual outcomes.

6. **Educational Value:** Problems like this are widely used in teaching basic probability and statistics, as they provide a clear and simple example of how probability theory can be applied to calculate real-world quantities.

## Solution
To solve this problem, we need to calculate the expected number of coin flips $ X $ until the first heads is flipped on a fair coin. This is a classic example of a geometric distribution, where we are finding the expected number of trials until the first success.

In a fair coin, the probability of flipping heads (success) is $ p = \frac{1}{2} $.

The expectation (or expected value) $ E[X] $ of a geometrically distributed random variable, where each trial is independent, is given by:

$ E[X] = \frac{1}{p} $

Substituting the probability of flipping heads $ p = \frac{1}{2} $, we get:

$ E[X] = \frac{1}{\frac{1}{2}} = 2 $

So, the expected number of coin flips until the first heads is flipped is 2. This means, on average, you would expect to flip the coin twice to get the first heads.
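
The formula $E[X] = \frac{1}{p}$ used above can be derived directly from the definition of expectation. The derivation below is a standard sketch; the symbol $q = 1 - p$ (the probability of tails) is introduced here only for brevity.

$$
E[X] = \sum_{k=1}^{\infty} k \, q^{k-1} p = p \sum_{k=1}^{\infty} k \, q^{k-1} = p \cdot \frac{1}{(1-q)^2} = \frac{p}{p^2} = \frac{1}{p}
$$

The same result follows from a first-step argument: one flip is always spent, and with probability $q$ the process starts over, so $E[X] = 1 + q \, E[X]$, which again gives $E[X] = \frac{1}{1-q} = \frac{1}{p}$.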

In [9]:
import random

def flip_until_heads():
    """
    Simulate flipping a fair coin until heads is flipped.
    Count the number of flips needed.
    """
    count = 0
    while True:
        count += 1
        # Simulate a coin flip: 0 for tails, 1 for heads
        if random.randint(0, 1) == 1:
            break
    return count

# Simulate the process multiple times to get an average
num_simulations = 10000
total_flips = sum(flip_until_heads() for _ in range(num_simulations))

# Calculate the average number of flips
average_flips = total_flips / num_simulations
average_flips

1.9772

## Q6 (15 Points) 
Consider a Las Vegas algorithm that searches for a specific element in an unsorted array by randomly picking elements to check. Propose a Monte Carlo version of this algorithm.

## Reflection

1. **Understanding Las Vegas vs. Monte Carlo Algorithms:** The core difference between these two types of algorithms lies in their approach to randomness. Las Vegas algorithms always produce a correct or optimal result, but their running time is variable. Monte Carlo algorithms, on the other hand, have a fixed running time but only offer a probabilistic guarantee of correctness. This distinction is crucial in applications where the balance between accuracy and speed is critical.

2. **Trade-offs in Algorithm Design:** The process of converting a Las Vegas algorithm to a Monte Carlo one involves deliberate trade-offs. While Monte Carlo algorithms can significantly improve computational efficiency, this comes at the cost of reduced accuracy or certainty in the results. This trade-off is a common theme in many areas of computer science and engineering.

3. **Applications in Complex Problems:** In many real-world scenarios, especially those involving large datasets or complex computations, the unbounded running time of a Las Vegas algorithm becomes impractical. Monte Carlo methods can provide a viable alternative, offering good enough results within a reasonable timeframe.

4. **Probabilistic Thinking:** The Monte Carlo approach requires a shift from deterministic to probabilistic thinking. It involves understanding and quantifying uncertainty and risk, which is a fundamental aspect of statistical and probabilistic analysis.

5. **Algorithmic Efficiency vs. Result Accuracy:** This transformation highlights the balance between algorithmic efficiency and the accuracy of results. In many practical applications, such as real-time processing or large-scale data analysis, a faster, approximate answer may be more valuable than a slow, exact one.

6. **Educational Value:** This transformation is a valuable educational exercise in algorithm design and analysis, teaching important concepts about probabilistic algorithms and their applications in solving complex problems where exact solutions are computationally infeasible.


## Solution
The question concerns random probing of an unsorted array. The conversion is illustrated here on an analogous Las Vegas algorithm, one that searches for a prime number within a range by performing primality tests on random numbers; in both cases the Monte Carlo version replaces the open-ended loop with a fixed budget of random probes, which makes the result probabilistic rather than guaranteed.

The Las Vegas version of this algorithm picks random numbers within the specified range and performs a primality test on each until it finds a prime number. Provided the range contains at least one prime, it is guaranteed to find one eventually, but the time it takes to do so is uncertain.

A Monte Carlo version, on the other hand, would set a fixed number of attempts to find a prime and then stop, whether or not it has found one. This version trades off the certainty of finding a prime for a predictable running time. 

Here's a basic outline for the Monte Carlo algorithm:

1. **Input**: A range $[a, b]$ within which to search for a prime number, and a maximum number of attempts $N$.

2. **Procedure**:
   a. For each attempt $i$ from 1 to $N$:
      i. Select a random number $x$ within the range $[a, b]$.
      ii. Perform a primality test on $x$.
      iii. If $x$ is prime, return $x$ as the result and stop.
   
   b. If no prime number is found after $N$ attempts, either return a failure indication or the best candidate found (which may not be prime).

3. **Output**: The first prime number found within the range, or an indication that no prime was found within the specified number of attempts.

This Monte Carlo algorithm will complete in a predictable amount of time, making at most $N$ primality tests, but it may not always find a prime number, even if one exists within the range. The probability of success depends on the density of primes within the range and the number of attempts $N$.

In [14]:
import random
import sympy

def monte_carlo_prime_search(a, b, max_attempts):
    """
    Monte Carlo algorithm to find a prime number within a range [a, b].

    Args:
    a (int): Lower bound of the range.
    b (int): Upper bound of the range.
    max_attempts (int): Maximum number of attempts to find a prime.

    Returns:
    int or None: A prime number found within the range, or None if no prime is found.
    """
    for _ in range(max_attempts):
        # Generate a random number within the range
        candidate = random.randint(a, b)

        # Check if the number is prime
        if sympy.isprime(candidate):
            return candidate

    # No prime number found within the given number of attempts
    return None

# Example usage: Search for a prime number between 100 and 200 with a maximum of 100 attempts
a = 100
b = 200
max_attempts = 100
found_prime = monte_carlo_prime_search(a, b, max_attempts)
found_prime

127
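
The question itself is phrased in terms of searching an unsorted array, and the same fixed-budget idea can be sketched for that setting. The function name `monte_carlo_search`, its parameters, and the sample array are assumptions made for this illustration. The error is one-sided: a returned index is always correct, but `None` may be returned even when the target is present.

```python
import random

def monte_carlo_search(arr, target, max_attempts, rng=random):
    """Probe up to max_attempts random positions; return an index of target or None."""
    if not arr:
        return None
    for _ in range(max_attempts):
        i = rng.randrange(len(arr))
        if arr[i] == target:
            return i
    # Budget exhausted: report "not found", which may be a false negative
    return None

arr = [7, 3, 9, 3, 5]
idx = monte_carlo_search(arr, 9, 50, random.Random(1))
print(idx)
```

If the target occupies $k$ of the $n$ positions, each probe misses with probability $1 - k/n$, so the chance of a false "not found" after $N$ probes is $(1 - k/n)^N$.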

## Q7 (10 Points) 
Consider a graph G = (V, E), where each node can either be 'active' or 'inactive', and an edge represents a connection between pairs of nodes. A clique in this context is a subset of nodes such that each node in the subset is 'active' and every pair of nodes in the subset is connected by an edge.

Suppose every node in the graph has exactly m neighbors. We are interested in finding the largest possible clique using a random algorithm. Each node P_j decides independently to be 'active' with probability r or 'inactive' with probability 1 - r. For a node to be considered part of the clique, it must be 'active', and all of its m neighbors must also be 'active'.

Provide a formula for the expected size of the clique K when r is set to a specific value, for instance, r = 1/(m+1).

## Solution 

### Problem Parameters
- **Graph G = (V, E)**: Each node can be 'active' or 'inactive'.
- **Each node has m neighbors**.
- **Probability r = 1/(m+1)**: Each node independently becomes 'active' with this probability.

### Goal
- **Find the expected size of the largest clique K**.

### Solution Approach
The expected size of the largest clique involves finding the expected number of nodes that are 'active' and have all their m neighbors also 'active'.

1. **Probability of a Node Being Part of a Clique**:
   - A node is part of a clique if it is 'active' and all its m neighbors are 'active'.
   - The probability of a node being 'active' is r.
   - The probability of each of its m neighbors being 'active' is also r.
   - Because the nodes decide independently, the probability of a node and its m neighbors all being 'active' is $r \cdot r^{m} = r^{m+1}$.

2. **Using r = 1/(m+1)**:
   - Replace r with 1/(m+1), so the probability becomes $(1/(m+1))^{m+1}$.

3. **Expected Size of the Clique**:
   - Let N be the total number of nodes.
   - By linearity of expectation, the expected number of nodes that form the clique is N times the probability that any given node is part of it. This holds even though the events for neighboring nodes are not independent of each other.
   - So, the expected size of the clique is $E[K] = N \times (1/(m+1))^{m+1}$.

### Formula
$ E[K] = N \times \left(\frac{1}{m+1}\right)^{m+1} $

In [15]:
def expected_clique_size(N, m):
    """
    Calculate the expected size of the largest clique in a graph.

    Parameters:
    N (int): Total number of nodes in the graph.
    m (int): Each node has exactly m neighbors.

    Returns:
    float: Expected size of the largest clique.
    """
    r = 1 / (m + 1)
    return N * (r ** (m + 1))

# Example usage
N = 100  # total number of nodes in the graph
m = 5    # each node has exactly 5 neighbors

expected_size = expected_clique_size(N, m)
expected_size

0.0021433470507544574
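
As a sanity check on the formula $N \times r^{m+1}$, the sketch below estimates the same quantity by simulation on one concrete graph. The ring graph (every node adjacent to its two neighbors, so $m = 2$), the function name `simulate_expected_k`, the seed, and the trial count are all assumptions chosen for illustration.

```python
import random

def simulate_expected_k(num_nodes, trials, rng):
    """Ring graph: node i is adjacent to i-1 and i+1, so m = 2 and r = 1/3."""
    m = 2
    r = 1 / (m + 1)
    total = 0
    for _ in range(trials):
        active = [rng.random() < r for _ in range(num_nodes)]
        # Count nodes that are active together with both ring neighbors
        total += sum(
            active[i] and active[i - 1] and active[(i + 1) % num_nodes]
            for i in range(num_nodes)
        )
    return total / trials

rng = random.Random(42)   # fixed seed (an assumption) for repeatability
num_nodes = 30
estimate = simulate_expected_k(num_nodes, 20000, rng)
formula = num_nodes * (1 / 3) ** 3
print(f"simulated: {estimate:.3f}, formula: {formula:.3f}")
```

The two values should agree closely even though neighboring nodes' events overlap, because linearity of expectation does not require independence.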

## Q8 (15 Points)
Consider an optimized algorithm for sorting an array of integers using a variation of the QuickSort method. The task is to derive a recurrence relation for this algorithm's best, average, and worst-case scenarios, and then analyze its time complexity using the Master Theorem.

## Reflection
The task involves examining a variant of the QuickSort algorithm, which is a classic example in the field of computer science for understanding divide-and-conquer strategies and their complexities. QuickSort is known for its efficiency in average cases but can degrade in performance in worst-case scenarios. The challenge here is to understand how the optimizations in the given variant affect its performance across different scenarios (best, average, worst-case).

Establishing the recurrence relations for these scenarios will likely involve understanding the partition strategy of this variant, as the choice of pivot and partitioning approach in QuickSort significantly influences the complexity. The recurrence relations will express how the sorting problem is broken down into smaller sub-problems and how these sub-problems contribute to the overall complexity.

Applying the Master Theorem in this context is an exercise in applying theoretical knowledge to practical algorithm analysis. The Master Theorem provides a direct way to get the time complexity from the recurrence relations, which helps in understanding the efficiency of the algorithm without delving into more complex mathematical proofs. This exercise underscores the importance of theoretical computer science concepts in analyzing and understanding practical algorithms.

## Solution

### Step 1: Establishing Recurrence Relations

1. **Best Case:** This occurs when the pivot divides the array into two equal halves. The recurrence relation in this scenario is often:
   $ T(n) = 2T(n/2) + \Theta(n) $
   Here, $ \Theta(n) $ represents the time taken for partitioning the array.

2. **Average Case:** For average case analysis, we assume that the partition happens at some constant ratio (not necessarily in half). The recurrence relation could be:
   $ T(n) = T(k) + T(n - k) + \Theta(n) $
   Here, $ k $ is some fraction of $ n $, say $ n/4 $, $ n/5 $, etc. 

3. **Worst Case:** This happens when the pivot is the smallest or the largest element, leading to very uneven divisions. The recurrence relation is:
   $ T(n) = T(n-1) + \Theta(n) $
   This essentially means each step only reduces the problem size by 1.

### Step 2: Applying the Master Theorem

The Master Theorem is used to determine the time complexity of recurrence relations of the form:
$ T(n) = aT(n/b) + f(n) $
where:
- $ a $ = number of subproblems in the recursion
- $ n/b $ = size of each subproblem
- $ f(n) $ = cost of the work done outside the recursive calls

1. **Best Case:** 
   - The relation is $ T(n) = 2T(n/2) + \Theta(n) $.
   - Here, $ a = 2 $, $ b = 2 $, and $ f(n) = \Theta(n) $.
   - By the Master Theorem, this falls under Case 2, since $ f(n) = \Theta(n^{\log_b a}) = \Theta(n) $, giving a complexity of $ \Theta(n\log n) $.

2. **Average Case:** 
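   - Because the two subproblems have different sizes, the relation $ T(n) = T(k) + T(n - k) + \Theta(n) $ is not of the form $ aT(n/b) + f(n) $, so the Master Theorem does not apply directly. A recursion-tree argument covers it: for any constant-ratio split the tree has $ \Theta(\log n) $ levels with at most $ \Theta(n) $ work per level, giving $ O(n\log n) $.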

3. **Worst Case:** 
   - The relation is $ T(n) = T(n-1) + \Theta(n) $.
   - This does not fit the form for the Master Theorem directly. Unrolling it gives $ T(n) = \Theta(n) + \Theta(n-1) + \ldots + \Theta(1) = \Theta(n^2) $, so the complexity is $ O(n^2) $.

In [16]:
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[0]
        less = [x for x in arr[1:] if x <= pivot]
        greater = [x for x in arr[1:] if x > pivot]
        return quicksort(less) + [pivot] + quicksort(greater)

# Example usage
arr = [3, 6, 8, 10, 1, 2, 1]
sorted_arr = quicksort(arr)
print(sorted_arr)


[1, 1, 2, 3, 6, 8, 10]
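
The gap between the worst case and the typical case can be observed by counting partition comparisons. The sketch below reuses the first-element pivot from the `quicksort` cell above; the function name `count_comparisons`, the input size, and the seed are assumptions for this illustration, and each non-pivot element is counted once per partition step.

```python
import random

def count_comparisons(arr):
    """Count pivot comparisons made by first-element-pivot QuickSort."""
    if len(arr) <= 1:
        return 0
    pivot = arr[0]
    less = [x for x in arr[1:] if x <= pivot]
    greater = [x for x in arr[1:] if x > pivot]
    # One comparison against the pivot for each of the other elements
    return (len(arr) - 1) + count_comparisons(less) + count_comparisons(greater)

n = 200
sorted_input = list(range(n))
shuffled_input = sorted_input[:]
random.Random(0).shuffle(shuffled_input)

worst = count_comparisons(sorted_input)      # already sorted: T(n) = T(n-1) + (n-1)
typical = count_comparisons(shuffled_input)  # random order: roughly n log n
print(worst, typical)
```

On already-sorted input every partition is maximally uneven, so the count is exactly $n(n-1)/2$ (19900 for $n = 200$), matching the $T(n) = T(n-1) + \Theta(n)$ recurrence; the shuffled input needs far fewer comparisons.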
