# Emerging Technologies

This notebook contains solutions for the assessment problems on classical and quantum algorithms using the Deutsch–Jozsa algorithm.

## Required Imports

In [1]:
import random  # problem 1: uniform random choices and shuffling
import numpy as np  # problem 1: numeric checks and array operations
import itertools as it  # problem 1: generating all binary tuples


## Introduction

### Classical vs Quantum Information

**[Classical Information](https://en.wikipedia.org/wiki/Classical_information_theory)** - Think about how a CD or DVD stores music and videos. The disc surface contains tiny physical pits separated by flat areas called "lands"; each transition between a pit and a land is read as a 1, and the stretches with no transition are read as 0s. When you play the disc, a laser scans this track, [decoding the binary data sequentially.](https://www.sciencedirect.com/topics/engineering/compact-disc) This is how classical computers work: they process information as definite 0s and 1s, checking each value individually. If you wanted to find out whether a mystery function always gives the same answer (constant) or gives 0 for exactly half of its inputs and 1 for the other half (balanced), you would have to test inputs one at a time, potentially needing up to 9 queries for a 4-input function, since $2^{4-1} + 1 = 9$.

**[Quantum Information](https://en.wikipedia.org/wiki/Quantum_information)** - Now imagine you need to decide whether to bring a raincoat before going outside. Classically, you would check the weather forecast first, then decide. But what if you could somehow be prepared for all weather possibilities at once? This is the essence of [quantum superposition](https://en.wikipedia.org/wiki/Quantum_superposition) — a **[quantum bit (qubit)](https://en.wikipedia.org/wiki/Qubit)** can exist as both 0 AND 1 simultaneously, like being in a state of "maybe rain, maybe sunshine" until you observe it. The [Deutsch–Jozsa algorithm](https://arxiv.org/abs/quant-ph/9708016) exploits this property: by putting qubits into superposition, we can query a function with all possible inputs at once, and through **[quantum interference](https://en.wikipedia.org/wiki/Wave_interference#Quantum_interference)**, the answer (constant or balanced) reveals itself in just a single query. This is the **[quantum advantage](https://en.wikipedia.org/wiki/Quantum_supremacy)** we will explore in this notebook.


## Problem 1: Generating Random Boolean Functions

**Instructions:**  
Write a Python function `random_constant_balanced` that returns a randomly chosen function from the set of constant or balanced functions taking four **[Boolean](https://en.wikipedia.org/wiki/Boolean_data_type)** inputs.

### 1. Classical Systems and Their State Sets

A classical system is defined by the set of states it can be in.

Examples:

- If $X$ is a **bit**, then  
  $\Sigma = \{0,1\}$

- If $X$ is a **six-sided die**, then  
  $\Sigma = \{1,2,3,4,5,6\}$

- If $X$ is a **fan switch**, then  
  $\Sigma = \{\text{high}, \text{medium}, \text{low}, \text{off}\}$

The physical representation does not matter — only the distinct states matter.

“A bit is just a system that can be in two different states.”

### 2. When We Know the State Exactly

Sometimes we know the state with certainty.

Example:  
If the fan switch is set to *high*, we know the exact classical state.

But in real computation (e.g., networking), we often **don’t** know the state.

### 3. When We Don’t Know the State: Probabilistic States

“If you do any network programming, you’re going to take bits from the network, and you don’t know what they are.”

Real-world information is often uncertain. You cannot predict incoming bits. If you could predict them, the channel would carry less information (because information = unpredictability).

Suppose we believe:

The probability of $X=0$ is $\Pr(X=0) = \frac{3}{4}$, and the probability of $X=1$ is $\Pr(X=1) = \frac{1}{4}$.

This is exactly the IBM Quantum textbook example.

### 4. Representing Uncertainty with Probability Vectors

We represent the probabilistic state as a **[column vector](https://en.wikipedia.org/wiki/Row_and_column_vectors)**:

$$
\begin{bmatrix}
0.75 \\
0.25
\end{bmatrix}
$$

- Top entry = probability of 0  
- Bottom entry = probability of 1  
- They must sum to 1  

A **[probability vector](https://en.wikipedia.org/wiki/Probability_vector)** must satisfy:

1. All entries ≥ 0  
2. Entries sum to 1  

This is the classical analogue of **[quantum state vectors](https://en.wikipedia.org/wiki/Quantum_state#Pure_states_as_rays_in_a_Hilbert_space)**.

### 5. Special Probability Vectors: The Basis States

Two special vectors represent definite classical states:

$$
|0\rangle = \begin{bmatrix}1 \\ 0\end{bmatrix}, \qquad
|1\rangle = \begin{bmatrix}0 \\ 1\end{bmatrix}.
$$

These are the **basis vectors** of classical information. Any probability vector can be written as:

$$
p(0)\,|0\rangle + p(1)\,|1\rangle.
$$

### 6. Demonstrating Probability Vectors in Python




In [2]:
# Example probability vector
p = np.array([[0.75],
              [0.25]])
# Check that the probabilities sum to 1
p.sum()


np.float64(1.0)
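
The two conditions from Section 4 (no negative entries, entries summing to 1) can also be checked mechanically. The sketch below is one possible helper; the name `is_probability_vector` and the tolerance `tol` are illustrative choices, not part of the original notebook.

```python
import numpy as np

def is_probability_vector(v, tol=1e-12):
    """Return True if v has no negative entries and its entries sum to 1 (illustrative helper)."""
    v = np.asarray(v, dtype=float)
    return bool(np.all(v >= 0) and abs(v.sum() - 1.0) <= tol)

p = np.array([[0.75], [0.25]])          # the example vector from the cell above
too_big = np.array([[0.5], [0.6]])      # entries sum to 1.1
negative = np.array([[1.5], [-0.5]])    # sums to 1 but has a negative entry

print(is_probability_vector(p))         # True
print(is_probability_vector(too_big))   # False
print(is_probability_vector(negative))  # False
```

The tolerance is there because floating-point sums are rarely exact; a stricter or looser value may suit other uses.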

### 7. Classical Functions on a Single Bit

A function that takes one bit in and outputs one bit can only behave in **four** possible ways:

| Name | Description | Mapping |
|------|-------------|---------|
| F1 | Constant 0 | 0→0, 1→0 |
| F2 | Identity | 0→0, 1→1 |
| F3 | NOT | 0→1, 1→0 |
| F4 | Constant 1 | 0→1, 1→1 |

“No matter how complicated the code is, at the end it’s one of these four.”

### 8. Implementing the Four Functions in Python

In [3]:
def f1(a):
    return 0

def f2(a):
    return a

def f3(a):
    return 1 - a

def f4(a):
    return 1

functions = [f1, f2, f3, f4]

### 9. Choosing a Random Function

In [4]:
f = random.choice(functions) 
f

<function __main__.f4(a)>

### 10. Identifying the Random Function

“The only thing we can do is ask the function what happens if we put in 0 and what happens if we put in 1.”

In [5]:
f(0), f(1)

(1, 1)

### 11. Determining Which Function It Is

We compare the outputs to the four possibilities.

In [6]:
def identify_function(f):
    out0 = f(0)
    out1 = f(1)
    
    if out0 == 0 and out1 == 0:
        return "F1 (constant 0)"
    if out0 == 0 and out1 == 1:
        return "F2 (identity)"
    if out0 == 1 and out1 == 0:
        return "F3 (NOT)"
    if out0 == 1 and out1 == 1:
        return "F4 (constant 1)"

identify_function(f)

'F4 (constant 1)'

### 12. Why This Matters for Quantum Computing

Classically, we need **two queries** to fully identify the function.

Quantum computing asks:

> Can we learn something about the function with **one** query?

This leads directly to **[Deutsch's algorithm](https://en.wikipedia.org/wiki/Deutsch%E2%80%93Jozsa_algorithm#Deutsch's_algorithm)**, the first example of **[quantum speedup](https://en.wikipedia.org/wiki/Quantum_computing#Potential)**.
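
The question Deutsch's algorithm answers is narrower than full identification: is the function constant (F1, F4) or balanced (F2, F3)? Classically this still takes both queries, because the answer is the XOR of the two outputs and a single output says nothing about it. A minimal sketch, with `is_constant_one_bit` as an illustrative name:

```python
def is_constant_one_bit(f):
    """Classical test: query both inputs and compare the outputs (illustrative helper)."""
    return (f(0) ^ f(1)) == 0  # XOR is 0 exactly when the two outputs agree

# The four one-bit functions, written as lambdas so the block runs on its own.
one_bit_functions = {
    "F1 (constant 0)": lambda a: 0,
    "F2 (identity)": lambda a: a,
    "F3 (NOT)": lambda a: 1 - a,
    "F4 (constant 1)": lambda a: 1,
}

for name, fn in one_bit_functions.items():
    kind = "constant" if is_constant_one_bit(fn) else "balanced"
    print(name, "->", kind)
```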

---

### Random Functions (classical toy problem)

One of the simplest toy problems for introducing quantum speedups is to work with raw bits.  
This notebook sets up functions $f:\{0,1\}^n \to \{0,1\}$ that are guaranteed to be either **constant** or **balanced**.  
The $n=3$ example, the random-tuple representation, the combinatorics for balanced functions, and Python code to generate and test random functions are included.

#### Purpose

The problem is reduced to its barest form: functions on bits. No semantic meaning is attached to inputs or outputs — they are raw bits.  
The task: given **[oracle](https://en.wikipedia.org/wiki/Oracle_machine)** access to a function $f$ that is either constant or balanced, determine which type it is. A deterministic classical procedure needs up to $2^{n-1}+1$ queries in the worst case; this sets up the Deutsch / Deutsch–Jozsa style question.

#### Definitions

- **[Constant function](https://en.wikipedia.org/wiki/Constant_function):** the output is identical for every input. For a length-$2^n$ tuple the output is either all zeros or all ones.  
- **[Balanced function](https://en.wikipedia.org/wiki/Deutsch%E2%80%93Jozsa_algorithm#Problem_statement):** exactly half of the $2^n$ outputs are $0$ and half are $1$ (so there are $2^{n-1}$ zeros and $2^{n-1}$ ones).

#### Example: $n = 3$ (table representation)

All input triples are listed and one possible function output column is shown. The function column is a length-$2^n$ tuple.

| **a** | **b** | **c** | **f(a,b,c)** |
|-----:|-----:|-----:|:------------:|
| 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 |
| 0 | 1 | 0 | 0 |
| 0 | 1 | 1 | 0 |
| 1 | 0 | 0 | 1 |
| 1 | 0 | 1 | 0 |
| 1 | 1 | 0 | 1 |
| 1 | 1 | 1 | 1 |

- **Constant examples:** $(0,0,0,0,0,0,0,0)$ and $(1,1,1,1,1,1,1,1)$.  
- **Balanced examples:** any tuple with exactly four $0$'s and four $1$'s, e.g. $(0,0,0,0,1,1,1,1)$ or $(1,0,0,1,1,0,0,1)$.

#### Combinatorics for balanced functions ($n=3$)

A balanced function for $n=3$ must have exactly half the outputs $0$ and half $1$, so four $0$'s and four $1$'s in the length-$8$ tuple.  
The number of distinct balanced tuples equals the number of ways to choose which $4$ of the $8$ positions are ones:

$$
\binom{8}{4}
$$

(See **[binomial coefficient](https://en.wikipedia.org/wiki/Binomial_coefficient)**)

This can be computed by counting ordered selections then removing order:

- Ordered selections of 4 distinct positions: $8 \times 7 \times 6 \times 5 = 1680$.
- Divide by the number of orders of those 4 positions, $4! = 24$.

Thus

$$
\binom{8}{4} = \frac{8\cdot7\cdot6\cdot5}{4!} = \frac{1680}{24} = 70.
$$

Therefore there are **70** distinct balanced functions when $n=3$.
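
The count can be cross-checked in two ways: with the closed form via `math.comb`, and by brute-force enumeration of all $2^8 = 256$ output tuples. This is only a sanity check under the definitions above; the variable names are illustrative.

```python
import itertools
import math

n = 3
length = 2 ** n  # 8 output positions

# Closed form: choose which half of the positions hold a 1.
closed_form = math.comb(length, length // 2)

# Brute force: enumerate every output tuple and keep those with exactly four ones.
balanced = [t for t in itertools.product((0, 1), repeat=length)
            if sum(t) == length // 2]

print(closed_form, len(balanced))  # 70 70
```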

#### Random-function model and probabilities

The random model used here is:

1. Choose uniformly between the two types $\{\text{constant},\text{balanced}\}$ (50/50).  
2. If constant: choose uniformly between the two constant tuples (all-0 or all-1).  
3. If balanced: choose uniformly among all $\binom{2^n}{2^{n-1}}$ balanced tuples.

For $n=3$:

- Probability of selecting a particular constant tuple (e.g., all zeros):
  $$
  P(\text{specific constant}) = \tfrac{1}{2}\times\tfrac{1}{2} = \tfrac{1}{4}.
  $$
- Probability of selecting a particular balanced tuple:
  $$
  P(\text{specific balanced}) = \tfrac{1}{2}\times\tfrac{1}{70} = \tfrac{1}{140}.
  $$
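
The same two-stage calculation works for any $n$. The sketch below uses exact fractions so that the values match the hand calculation; `selection_probabilities` is an illustrative name.

```python
from fractions import Fraction
import math

def selection_probabilities(n):
    """Probability of drawing one specific constant / balanced tuple under the two-stage model."""
    n_balanced = math.comb(2 ** n, 2 ** (n - 1))            # number of balanced tuples
    p_constant = Fraction(1, 2) * Fraction(1, 2)            # type choice, then all-0 or all-1
    p_balanced = Fraction(1, 2) * Fraction(1, n_balanced)   # type choice, then one balanced tuple
    return p_constant, p_balanced

for n in (1, 2, 3, 4):
    p_const, p_bal = selection_probabilities(n)
    print(n, p_const, p_bal)
```

For $n=3$ this reproduces $\tfrac{1}{4}$ and $\tfrac{1}{140}$. Note that the model is uniform over the two types, not over all promise functions, so a specific constant tuple is far more likely than a specific balanced one.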

---
#### One-bit examples

**Reference:** These represent all possible **[Boolean functions](https://en.wikipedia.org/wiki/Boolean_function)** of one variable. The complete **[truth table](https://en.wikipedia.org/wiki/Truth_table)** enumeration is a fundamental concept in digital logic and the basis for understanding Deutsch's algorithm (see [IBM Quantum Learning: Deutsch-Jozsa Algorithm](https://learning.quantum.ibm.com/course/fundamentals-of-quantum-algorithms/quantum-query-algorithms#the-deutsch-jozsa-algorithm)).

1. Function definitions: four minimal one-bit functions are defined to illustrate constant and non-constant behaviours: f1 and f4 are constant, f2 is identity, f3 is logical NOT.

In [7]:
# One-bit example functions (illustration)
def f1(a):
    return 0 # constant function that always returns 0

def f2(a):
    return a # identity function: returns the input bit unchanged

def f3(a):
    return 0 if a else 1 # NOT function: returns 1 when input is 0, and 0 when input is 1

def f4(a):
    return 1 # constant function that always returns 1


2. random_one_bit_function: demonstrates that functions are **[first-class objects](https://en.wikipedia.org/wiki/First-class_function)** and can be stored in lists; **[random.choice](https://docs.python.org/3/library/random.html#random.choice)** returns one function object using **[uniform random selection](https://en.wikipedia.org/wiki/Discrete_uniform_distribution)**.
3. Test: the chosen function is evaluated on both inputs 0 and 1 to show its behaviour.

In [8]:
# Example: random one-bit function
def random_one_bit_function():
    fns = [f1, f2, f3, f4]  # list of candidate one-bit functions
    return random.choice(fns)  # select and return one function uniformly at random

# Quick test
f = random_one_bit_function()  # obtain a random one-bit function
print("f(0), f(1) =", f(0), f(1))  # evaluate the chosen function on both possible inputs

f(0), f(1) = 1 1


---
#### random_tuple generator
**Purpose**  
Produce a length-$2^n$ tuple representing either a **constant** function (all entries identical) or a **balanced** function (exactly half zeros and half ones).

**Reference:** This implements a **[uniform sampling](https://en.wikipedia.org/wiki/Discrete_uniform_distribution)** strategy over the space of promise functions used in the Deutsch-Jozsa problem. The **[Fisher-Yates shuffle](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle)** (implemented by `random.shuffle`) ensures unbiased random permutation of balanced functions.

* Type selection: `ftype` is chosen uniformly between `'constant'` and `'balanced'`.
* Constant branch: `zero_or_one` picks 0 or 1; `(zero_or_one,) * (2**n)` constructs the repeated tuple efficiently.
* Balanced branch: a list is built with the required counts of zeros and ones; `random.shuffle` randomizes the order; conversion to a tuple makes the result hashable and immutable, consistent with the constant case.
* Notes: using a list for shuffling is necessary because `random.shuffle` operates in place on mutable sequences.



In [9]:
def random_tuple(n):
    """Return a random constant or balanced tuple of length 2**n."""
    ftype = random.choice(['constant', 'balanced'])  # choose function type uniformly

    if ftype == 'constant':
        zero_or_one = random.choice([0, 1]) # choose whether the constant is 0 or 1
        t = (zero_or_one,) * (2**n) # create a tuple repeating that bit 2**n times

    else:
        # balanced: create a list with half zeros and half ones, then shuffle
        t = [0] * (2**(n-1)) + [1] * (2**(n-1)) # list with 2^(n-1) zeros followed by 2^(n-1) ones
        random.shuffle(t) # shuffle in-place to randomize positions
        t = tuple(t) # convert list to tuple for immutability/consistency

    return t # return the resulting tuple
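
As a sanity check, the sampler can be run many times and the outcomes tallied. The block below restates the sampler under a different name (`sample_promise_tuple`, an illustrative stand-in for `random_tuple`) so that it runs on its own; roughly half of the draws should be constant, and every draw should satisfy the promise.

```python
import random
from collections import Counter

def sample_promise_tuple(n):
    """Stand-in for random_tuple: the same two-stage sampling, restated for self-containment."""
    if random.choice(['constant', 'balanced']) == 'constant':
        return (random.choice([0, 1]),) * (2 ** n)
    t = [0] * (2 ** (n - 1)) + [1] * (2 ** (n - 1))
    random.shuffle(t)
    return tuple(t)

def classify(t):
    """Label a tuple by counting its ones."""
    ones = sum(t)
    if ones in (0, len(t)):
        return 'constant'
    return 'balanced' if ones == len(t) // 2 else 'neither'

counts = Counter(classify(sample_promise_tuple(3)) for _ in range(10_000))
print(counts)  # 'neither' never appears; the constant/balanced split is close to 50/50
```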


---
#### Variadic Functions in Python

**What are Variadic Functions?**

A **[variadic function](https://en.wikipedia.org/wiki/Variadic_function)** is a function that accepts an arbitrary (variable) number of arguments. In languages like **[C](https://en.wikipedia.org/wiki/C_(programming_language))**, you typically must declare the exact number and type of arguments in the function signature (e.g., `int i, char c, char* s`). However, some functions—such as C's famous `printf`—can accept any number of arguments.

**Reference:** The concept of variadic functions dates back to early programming languages and is essential for flexible APIs. C's `printf` function is perhaps the most well-known example, using **[variable argument lists](https://en.wikipedia.org/wiki/Stdarg.h)** (`<stdarg.h>`) to handle formatting strings with any number of values.

**Python's `*args` Syntax**

Python provides syntax for creating variadic functions using the **asterisk operator (`*`)** before an argument name. The most common convention is to name this parameter `*args`, though any valid name can be used.

**How it works:**
- The `*` prefix tells Python to collect all positional arguments into a **[tuple](https://en.wikipedia.org/wiki/Tuple)**
- Inside the function, `args` becomes a tuple containing all the passed arguments
- You can pass any number of arguments (including zero)

**Simple demonstration:**

```python
def function(*args):
    print(args)

function(1, 2, 3, 4, 5)  # Output: (1, 2, 3, 4, 5)
```

**Reference:** The star syntax also appears on the calling side as **[argument unpacking](https://docs.python.org/3/tutorial/controlflow.html#unpacking-argument-lists)**, and **[PEP 3132](https://www.python.org/dev/peps/pep-3132/)** extends it to assignment targets (`first, *rest = seq`). Resources such as **[Real Python's guide to `*args` and `**kwargs`](https://realpython.com/python-kwargs-and-args/)** cover these uses in detail.

---

#### The Explosion (Unpacking) Operator

The asterisk `*` serves **two roles** in Python:

1. **Packing (in function definitions):** `def f(*args)` — collects multiple arguments into one tuple
2. **Unpacking (in function calls):** `f(*my_list)` — expands a list/tuple into separate arguments

This second use is sometimes called the **"explosion operator"** because it "explodes" a collection into individual elements.

**Example:**
```python
def add_three(a, b, c):
    return a + b + c

numbers = [1, 2, 3]
result = add_three(*numbers)  # Same as add_three(1, 2, 3)
```

**Important:** The `*` here is NOT a dereference operator like in C/C++. In Python, it's specifically for argument packing/unpacking.

---

#### Helpers bin_args_to_int and closure fclosure

**Purpose**  
- `bin_args_to_int` converts a binary argument list into an integer index for tuple lookup.  
- `fclosure` creates a **[closure](https://en.wikipedia.org/wiki/Closure_(computer_programming))** that implements a fixed random oracle (constant or balanced) on $n$ bits.

**Reference:** The encoding technique converts **[binary numbers](https://en.wikipedia.org/wiki/Binary_number)** to **[decimal indices](https://en.wikipedia.org/wiki/Positional_notation)** for array lookup—a standard indexing strategy. The closure pattern captures the random tuple in its **[lexical scope](https://en.wikipedia.org/wiki/Scope_(computer_science)#Lexical_scope)**, modeling the "black box" oracle paradigm central to **[query complexity theory](https://en.wikipedia.org/wiki/Query_complexity)** and quantum algorithms (see [Quantum Algorithm Zoo](https://quantumalgorithmzoo.org/)).

**Why `bin_args_to_int` uses `*args`:**

The `bin_args_to_int` function demonstrates variadic functions perfectly:
- It doesn't know in advance how many binary inputs it will receive
- It accepts any number of boolean/truthy values
- The function works for 1 bit, 3 bits, 4 bits, or any $n$ bits
- This flexibility is essential for the quantum oracle abstraction

**How it works:**

1. **Accepts variadic input:** `*args` collects all binary arguments into a tuple
2. **Converts to binary string:** Each argument is evaluated as truthy (→ '1') or falsy (→ '0')
3. **Parses as integer:** The binary string is converted to a decimal index using `int(bits, 2)`
4. **Returns the index:** This index is used to look up the oracle's output in the pre-generated tuple

**Example:**
```python
bin_args_to_int(1, 0, 1)  # Args: (1, 0, 1)
# → bits = '101'
# → int('101', 2) = 5
# → returns 5
```

**Technical notes:**
- Closure pattern: `f_a` is fixed when `fclosure` is called; the returned `f` uses that fixed tuple, modeling an oracle.
- Indexing convention: the first argument corresponds to the most-significant bit in the binary string; maintain this ordering consistently.
- The variadic design allows the same `bin_args_to_int` function to work with any $n$-bit oracle without modification.

In [10]:
def bin_args_to_int(*args):
    """Convert binary args (truthy -> '1', falsy -> '0') to integer index."""
    bits = ''.join('1' if i else '0' for i in args) # build a binary string from positional args
    return int(bits, 2) # parse the binary string as base-2 integer

def fclosure(n):
    """Return a closure f(*args) that implements a random constant or balanced function on n bits."""
    f_a = random_tuple(n)                            # generate the underlying length-2^n tuple once

    def f(*args):
        if len(args) != n:
            # guard: require exactly n input bits; fail loudly rather than return None
            raise ValueError(f"expected {n} input bits, got {len(args)}")
        return f_a[bin_args_to_int(*args)]  # map the input bits to an index and return the tuple element

    return f # return the inner function (closure)
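
The most-significant-bit-first convention means that the inputs produced by `itertools.product((0, 1), repeat=n)` map to the indices $0, 1, \dots, 2^n - 1$ in order. A quick check, with the helper repeated so the block is self-contained:

```python
import itertools

def bin_args_to_int(*args):
    """Same helper as above: truthy -> '1', falsy -> '0', parsed as a base-2 integer."""
    return int(''.join('1' if a else '0' for a in args), 2)

n = 3
indices = [bin_args_to_int(*t) for t in itertools.product((0, 1), repeat=n)]
print(indices)  # [0, 1, 2, 3, 4, 5, 6, 7]

# Truthy and falsy values work as well as literal 0 and 1.
print(bin_args_to_int(True, False, True))  # 5
```

Because the two orderings agree, the $k$-th input generated by `itertools.product` reads entry $k$ of the underlying tuple.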


---
#### try_all and is_constant_or_balanced

**Purpose**  
- `try_all` exhaustively evaluates the oracle on all $2^n$ inputs (classical brute-force).  
- `is_constant_or_balanced` classifies the oracle as `'constant'`, `'balanced'`, or `'neither'` using the exhaustive outputs.

**Reference:** This implements **[brute-force search](https://en.wikipedia.org/wiki/Brute-force_search)** over the input space—the classical baseline requiring $O(2^n)$ queries in the worst case. The **[itertools.product](https://docs.python.org/3/library/itertools.html#itertools.product)** generates the **[Cartesian product](https://en.wikipedia.org/wiki/Cartesian_product)** $(\{0,1\}^n)$ of all binary inputs. This exponential cost is what the Deutsch-Jozsa algorithm reduces to a single quantum query (see [Nielsen & Chuang, "Quantum Computation and Quantum Information", Section 1.4.3](https://en.wikipedia.org/wiki/Quantum_Computation_and_Quantum_Information)).

In [11]:
def try_all(f, n):
    """Try all 2**n binary inputs on f and return the list of outputs."""
    bin_tuples_n = it.product((0,1), repeat=n)  # generator for all binary tuples of length n
    vals = [f(*t) for t in bin_tuples_n] # evaluate f on each tuple and collect outputs
    return vals

def is_constant_or_balanced(f, n):
    """Return 'constant', 'balanced', or 'neither' for function f on n bits (naive test)."""
    results = np.array(try_all(f, n)) # get outputs and convert to numpy array for numeric checks

    # Check if all outputs are 0 or all outputs are 1
    if np.all(results == 0) or np.all(results == 1):
        return 'constant'

    # Check if the number of ones equals 2^(n-1) (balanced)
    elif results.sum() == 2**(n-1):
        return 'balanced'

    # Otherwise the function is neither constant nor balanced (should not occur under the promise model)
    else:
        return 'neither'

# Quick demonstration
f_test = fclosure(3) # create a random function on 3 bits
vals = try_all(f_test, 3) # enumerate outputs for all 8 inputs
print("outputs:", vals) # print the full output tuple
print("classification:", is_constant_or_balanced(f_test, 3))  # print the classification result


outputs: [0, 1, 0, 0, 0, 1, 1, 1]
classification: balanced
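
Under the promise that the function is constant or balanced, the exhaustive scan can stop early: as soon as two outputs differ the function must be balanced, and once $2^{n-1}+1$ identical outputs have been seen it must be constant. The sketch below counts queries; `classify_with_promise` and the three test functions are illustrative names, and the result is only meaningful when the promise holds.

```python
import itertools

def classify_with_promise(f, n):
    """Return (label, queries_used) for f on n bits, assuming f is constant or balanced."""
    limit = 2 ** (n - 1) + 1             # worst-case number of classical queries
    first = None
    queries = 0
    for bits in itertools.product((0, 1), repeat=n):
        value = f(*bits)
        queries += 1
        if first is None:
            first = value                # remember the first output
        elif value != first:
            return 'balanced', queries   # two different outputs rule out constant
        if queries == limit:
            return 'constant', queries   # more than half agree, so balanced is impossible
    return 'constant', queries           # not reached for n >= 1; kept for completeness

constant_one = lambda *bits: 1           # constant function on any number of bits
last_bit = lambda *bits: bits[-1]        # balanced: output equals the last input bit
first_bit = lambda *bits: bits[0]        # balanced: output equals the first input bit

print(classify_with_promise(constant_one, 4))  # ('constant', 9)
print(classify_with_promise(last_bit, 4))      # ('balanced', 2)
print(classify_with_promise(first_bit, 4))     # ('balanced', 9)
```

The `first_bit` case is the classical worst case for this query order: the first eight inputs all return 0, so the ninth query is needed to tell it apart from the constant function.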


### Measuring probabilistic states

**Reference:** This section follows the pedagogical approach from **[IBM Quantum Learning: Measuring Probabilistic States](https://quantum.cloud.ibm.com/learning/en/courses/basics-of-quantum-information/single-systems/classical-information#measuring-probabilistic-states)**, which introduces **[measurement](https://en.wikipedia.org/wiki/Measurement_in_quantum_mechanics)** concepts for classical and quantum systems.

| **Bit** | **Common Labels** | **Truthy/Falsy** | **Ket Notation** | **Column Vector** |
|---------|-------------------|------------------|------------------|-------------------|
| 0       | 0, false, False   | falsy            | $\|0\rangle$     | $\begin{pmatrix}1 \\\\ 0\end{pmatrix}$ |
| 1       | 1, true, True     | truthy           | $\|1\rangle$     | $\begin{pmatrix}0 \\\\ 1\end{pmatrix}$ |

---

One reason to discuss **[binary functions](https://en.wikipedia.org/wiki/Boolean_function)** is that the values **0** and **1** can be represented in many different ways. People often say computers work with zeros and ones, but in practice programming languages expose **[Boolean types](https://en.wikipedia.org/wiki/Boolean_data_type)** with names such as `true` and `false`. The terminology varies, but the underlying idea of two distinct logical values is the same.

Programming languages also differ in how they treat other values as **[truthy or falsy](https://en.wikipedia.org/wiki/Truthiness)**. For example, an empty list is commonly treated as falsy while a non-empty list is truthy. There is a body of programming-language theory about whether values such as `null` or `None` should be considered true or false. Treating `None` as falsy can hide bugs: a missing reference that should have raised an error may instead behave like `false` and only cause trouble later. For that reason some language designers and programmers prefer to treat `None` (or `null`) as a distinct exceptional case rather than as a simple falsy boolean. In Python, for instance, **[identity checks](https://docs.python.org/3/reference/expressions.html#is)** like `is None` are recommended when the distinction matters.

There are many theoretical and physical ways to represent zeros and ones. A standard mathematical representation uses two-dimensional **[column vectors](https://en.wikipedia.org/wiki/Row_and_column_vectors)** (one column, two rows) to represent the **[basis states](https://en.wikipedia.org/wiki/Basis_(linear_algebra))**. These are commonly written in **[bra–ket notation](https://en.wikipedia.org/wiki/Bra%E2%80%93ket_notation)** as `|0⟩` and `|1⟩`. **[Matrices](https://en.wikipedia.org/wiki/Matrix_(mathematics))** are used because they represent operations: multiplying a matrix by one of these basis vectors gives the vector that results from applying the operation. Matrix and vector representations are extremely useful because **[linear-algebra](https://en.wikipedia.org/wiki/Linear_algebra)** operations implement the transformations that are needed when modelling information and information-processing devices.
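
As a concrete instance, the NOT operation on a bit can be written as a 2×2 matrix that swaps the two basis vectors. The sketch below uses `numpy`; the variable names are illustrative.

```python
import numpy as np

ket0 = np.array([[1], [0]])   # |0> as a column vector
ket1 = np.array([[0], [1]])   # |1> as a column vector

# NOT swaps the two basis states.
NOT = np.array([[0, 1],
                [1, 0]])

print((NOT @ ket0).ravel())   # [0 1], i.e. |1>
print((NOT @ ket1).ravel())   # [1 0], i.e. |0>

# The same matrix acts on a probabilistic state by swapping the two probabilities.
p = np.array([[0.75], [0.25]])
print((NOT @ p).ravel())      # [0.25 0.75]
```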

A practical example comes from **[cryptography](https://en.wikipedia.org/wiki/Cryptography)** and **[hashing](https://en.wikipedia.org/wiki/Cryptographic_hash_function)**. In **[Bitcoin mining](https://en.wikipedia.org/wiki/Bitcoin#Mining)** and in secure-hash design (for example **[SHA-256](https://en.wikipedia.org/wiki/SHA-2)**), it is desirable that each output bit behaves like an **[unbiased random bit](https://en.wikipedia.org/wiki/Fair_coin)**, with a roughly 50/50 distribution for each bit. The unpredictability of each bit prevents easy guessing of hash outputs; a uniform 50/50 distribution across bits is a design goal for cryptographic hash functions.

Researchers such as **[John Watrous](https://jhwatrous.github.io/)** use the column-vector / matrix representation extensively. The shorthand ket notation (for example `|0⟩`, `|1⟩`) is convenient: it is easier to write the ket than the corresponding 2×1 column matrix every time. When using this notation it must be clear which matrix is meant by each ket; the ket is a compact label for the underlying vector and for the matrix operations that act on it.

---

#### Measuring a system in a probabilistic state — explanation and examples

Next, consider what happens when a system is **[measured](https://en.wikipedia.org/wiki/Measurement_in_quantum_mechanics)** while it is in a **[probabilistic state](https://en.wikipedia.org/wiki/Probability_distribution)**. Here, *measuring* means looking at the system and recognizing the **[classical state](https://en.wikipedia.org/wiki/Classical_information_theory)** it is in without ambiguity. Intuitively, a probabilistic state cannot be "seen" directly; when the system is **[observed](https://en.wikipedia.org/wiki/Observer_(quantum_physics))**, one sees one of the possible classical states.

When a measurement is made, the observer's knowledge about the system changes. If the measurement reveals that the system is in the classical state `a ∈ Σ`, then the **[probability vector](https://en.wikipedia.org/wiki/Probability_vector)** describing the observer's knowledge becomes the vector with a `1` in the entry for `a` and `0` in all other entries. That vector indicates the system is in the classical state `a` with certainty; it is denoted `|a⟩` (read "ket a"). Vectors of this form are called **[standard basis vectors](https://en.wikipedia.org/wiki/Standard_basis)**.

**Example (bit basis):** for a single bit the standard basis vectors are `|0⟩` and `|1⟩`, represented by the column vectors (1, 0)ᵀ and (0, 1)ᵀ respectively. Any two-dimensional column vector can be written as a **[linear combination](https://en.wikipedia.org/wiki/Linear_combination)** of these two basis vectors; this generalizes to larger classical state sets.

**Everyday analogy (coin):** suppose a **[fair coin](https://en.wikipedia.org/wiki/Fair_coin)** is flipped and then covered. The probabilistic state of the coin (from the observer's perspective) is "heads with probability 1/2, tails with probability 1/2." If the coin is uncovered and the observer sees tails, the observer updates the description of the coin to the standard basis vector `|tails⟩`. If the coin is covered again and later uncovered by the same observer, the classical state remains tails and the description `|tails⟩` remains correct for that observer.
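
The update rule can be sketched in a few lines: sampling an outcome according to the probability vector plays the role of uncovering the coin, and the observer's description is then replaced by the matching standard basis vector. The function name `measure` and the state labels are illustrative.

```python
import numpy as np

def measure(prob_vector, rng):
    """Sample a classical state index from prob_vector and return (index, updated vector)."""
    probs = np.asarray(prob_vector, dtype=float).ravel()
    outcome = rng.choice(len(probs), p=probs)    # which classical state is observed
    updated = np.zeros((len(probs), 1))
    updated[outcome, 0] = 1.0                    # certainty about the observed state
    return int(outcome), updated

rng = np.random.default_rng()
labels = ['heads', 'tails']
coin = np.array([[0.5], [0.5]])                  # covered fair coin

outcome, coin_after = measure(coin, rng)
print("observed:", labels[outcome])
print(coin_after.ravel())

# Looking again changes nothing: the updated state yields the same outcome every time.
print(all(measure(coin_after, rng)[0] == outcome for _ in range(100)))  # True
```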

This update-of-knowledge perspective is important: probabilistic states describe knowledge or belief, not necessarily an **[ontological](https://en.wikipedia.org/wiki/Quantum_mechanics#Philosophical_implications)** superposition of actual classical states. Measuring changes the observer's knowledge. For example, after the coin flip but before looking, the coin is actually either heads or tails — the observer just does not know which. When the observer looks and sees tails, the observer's probability vector becomes `|tails⟩`. To someone who did not see the coin when it was uncovered, the probabilistic state would remain the 50/50 description. Different observers can legitimately have different probability vectors for the same physical system because they have different information. This interpretation is related to **[Bayesian probability](https://en.wikipedia.org/wiki/Bayesian_probability)** and the **[Bayesian interpretation of quantum mechanics](https://en.wikipedia.org/wiki/Quantum_Bayesianism)**.

Although this behaviour is straightforward for classical probabilistic systems, quantum systems behave analogously in the sense that measurement yields a definite classical outcome and updates the observer's description. The difference is that **[quantum amplitudes](https://en.wikipedia.org/wiki/Probability_amplitude)** and **[quantum interference](https://en.wikipedia.org/wiki/Wave_interference#Quantum_interference)** make the underlying mathematics and the allowed transformations richer; establishing the classical analogue first helps make the quantum case less surprising.

---

#### Standard basis and column-vector decomposition

**Reference:** This discussion follows the **[linear algebra](https://en.wikipedia.org/wiki/Linear_algebra)** framework used in quantum information theory, particularly as presented by **[John Watrous](https://jhwatrous.github.io/)** in his work on quantum information (see [The Theory of Quantum Information](https://cs.uwaterloo.ca/~watrous/TQI/)).

The two column vectors $|0\rangle$ and $|1\rangle$ (two rows, one column each) play a special role. They are called the **[standard basis](https://en.wikipedia.org/wiki/Standard_basis)**. Any two‑by‑one column vector can be written as a **[linear combination](https://en.wikipedia.org/wiki/Linear_combination)** of these basis vectors. Concretely, for a column vector with entries `a` and `b`:

$$
\begin{bmatrix} a \\ b \end{bmatrix}
= a \begin{bmatrix} 1 \\ 0 \end{bmatrix}
+ b \begin{bmatrix} 0 \\ 1 \end{bmatrix}.
$$

When a **[scalar](https://en.wikipedia.org/wiki/Scalar_(mathematics))** is taken outside a matrix, it multiplies the whole matrix; this is standard **[scalar multiplication](https://en.wikipedia.org/wiki/Scalar_multiplication)** of matrices. That property is used in the decomposition above.

These two basis vectors are special, but they are not the only useful vectors. Other vectors can be used as **[basis elements](https://en.wikipedia.org/wiki/Basis_(linear_algebra))** in different contexts, but the standard basis is convenient and widely used.

One practical point to watch for (as emphasised by **[Watrous](https://cs.uwaterloo.ca/~watrous/)** and others) is the **[normalization](https://en.wikipedia.org/wiki/Unit_vector#Normalized_vector)** convention. The entries of a **[probability vector](https://en.wikipedia.org/wiki/Probability_vector)** must be non-negative and sum to one (quantum state vectors use a different condition: the squared magnitudes of the entries sum to one). When representing a classical probabilistic bit as a column vector, the two entries are the probabilities of the two classical states (probability of 0 and probability of 1). That probability vector can be written in the standard basis as a **[weighted sum](https://en.wikipedia.org/wiki/Weighted_sum)** of the basis vectors. For example, if the probability of 0 is $p$ and the probability of 1 is $1-p$, then

$$
\begin{bmatrix} p \\ 1-p \end{bmatrix}
= p \begin{bmatrix} 1 \\ 0 \end{bmatrix}
+ (1-p) \begin{bmatrix} 0 \\ 1 \end{bmatrix}.
$$

This expresses the same **[probabilistic model](https://en.wikipedia.org/wiki/Probability_theory)** in the standard-basis vector form.

- The standard-basis vectors make decomposition and bookkeeping explicit: the first **[coordinate](https://en.wikipedia.org/wiki/Coordinate_vector)** multiplies the $|0\rangle$ basis vector and the second coordinate multiplies the $|1\rangle$ basis vector.  
- When a scalar multiplies a basis vector, the scalar scales every entry of that vector; multiplying by zero yields the **[zero-vector](https://en.wikipedia.org/wiki/Zero_element#Linear_algebra)** contribution.  
- The decomposition above is a purely **[algebraic identity](https://en.wikipedia.org/wiki/Identity_(mathematics))** and is valid for any two-component column vector.  
- In probabilistic contexts the entries are **[nonnegative](https://en.wikipedia.org/wiki/Sign_(mathematics)#Real_numbers)** and sum to one; in general linear-algebra contexts they are arbitrary **[scalars](https://en.wikipedia.org/wiki/Scalar_(mathematics))**.
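
The decomposition into standard basis vectors can be checked numerically. The sketch below is only an illustration: the names `e0`, `e1`, `a`, `b` and the sample values are assumptions chosen for the example, not part of the notebook.

```python
import numpy as np

# Standard basis vectors stored as 2x1 column vectors.
e0 = np.array([[1.0], [0.0]])
e1 = np.array([[0.0], [1.0]])

# Arbitrary example entries (illustrative values only).
a, b = 3.0, -2.0
v = np.array([[a], [b]])

# Scalar multiplication scales every entry of each basis vector,
# and the two scaled vectors add up to the original vector.
rebuilt = a * e0 + b * e1

print(np.array_equal(v, rebuilt))  # True
```

The same identity holds for a probability vector, where `a` and `b` would be nonnegative and sum to one.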

---



### Probability Column Vectors

Quantum states are described with column vectors, so we need a way to represent them in code. We'll use NumPy (imported above as `np`).

> Suppose $X$ is a bit. Based on what we know or expect about what has happened to $X$ in the past, we might perhaps believe that $X$ is in the classical state $0$ with probability $\frac{3}{4}$ and $1$ with probability $\frac{1}{4}$. We may represent these beliefs by... a column vector.
> 
> $ \begin{bmatrix} 0.75 \\ 0.25 \end{bmatrix} $
> 
> *From [Classical states and probability vectors](https://quantum.cloud.ibm.com/learning/en/courses/basics-of-quantum-information/single-systems/classical-information#classical-states-and-probability-vectors) on IBM Quantum Learning.*

A vector like this is called a **probability column vector**.

In [12]:
# The column vector in numpy.
x = np.array([[0.75], [0.25]])

To represent such a vector in NumPy, we use an array.

Be careful with one-dimensional arrays: they have no notion of row versus column. **A column vector should be stored as a two-dimensional array with shape `(2, 1)`.**

For example, to represent "Always 0":

In [13]:
# Always 0.
zero = np.array([[1], [0]])

print(zero)

[[1]
 [0]]


In [14]:
# Show.
x

array([[0.75],
       [0.25]])

In [15]:
# Cleaner.
print(x)

[[0.75]
 [0.25]]
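
The difference between a one-dimensional array and a true column vector shows up in the `shape` attribute. The following is a small sketch of that point; the names `flat` and `col` are illustrative and do not appear elsewhere in the notebook.

```python
import numpy as np

flat = np.array([0.75, 0.25])      # one-dimensional array
col = np.array([[0.75], [0.25]])   # two-dimensional column vector

print(flat.shape)  # (2,)
print(col.shape)   # (2, 1)

# Transposing a 1-D array leaves it unchanged, so it cannot
# tell a row from a column. The 2-D version can.
print(flat.T.shape)  # (2,)
print(col.T.shape)   # (1, 2)

# reshape(-1, 1) converts a 1-D array into a column vector.
print(np.array_equal(flat.reshape(-1, 1), col))  # True
```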


---

> We can represent any probabilistic state through a column vector satisfying two properties:
> 
> 1. All entries of the vector are nonnegative real numbers.
> 2. The sum of the entries is equal to 1.
> 
> *From [Classical states and probability vectors](https://quantum.cloud.ibm.com/learning/en/courses/basics-of-quantum-information/single-systems/classical-information#classical-states-and-probability-vectors) on IBM Quantum Learning.*

Any probabilistic state can be represented by a column vector whose entries are nonnegative real numbers that sum to 1. A negative entry would make no sense: there is no such thing as a probability of minus 0.5.

In [16]:
# Type.
x.dtype

dtype('float64')

In [17]:
# All nonnegative.
# NumPy overloads the comparison operators, so `x >= 0.0` compares every
# entry of x with 0 and returns a two-by-one boolean array.
# np.all() then reduces that boolean array to a single value:
# True if every entry is True, False if any entry is False.
np.all(x >= 0.0)

np.True_

In [18]:
# The sum of all probabilities should be 1.
# Called with no axis argument, sum() adds up every entry, whether the
# array is one-, two- or three-dimensional.
x.sum()

np.float64(1.0)
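
The two checks above can be combined into one helper. This is a minimal sketch, not part of the original notebook: the function name `is_probability_vector` and the tolerance parameter `tol` are assumptions. It compares the sum with `np.isclose` rather than `==`, because floating-point sums are not always exactly 1.

```python
import numpy as np

def is_probability_vector(v, tol=1e-9):
    """Return True if all entries of v are nonnegative and sum to 1 (within tol)."""
    v = np.asarray(v, dtype=float)
    nonnegative = np.all(v >= 0.0)
    sums_to_one = np.isclose(v.sum(), 1.0, atol=tol)
    return bool(nonnegative and sums_to_one)

print(is_probability_vector(np.array([[0.75], [0.25]])))  # True
print(is_probability_vector(np.array([[1.5], [-0.5]])))   # False
```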

#### Zero and One

> For example, assuming that the system we have in mind is a bit, the standard basis vectors are given by
> 
> $\ket{0} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad \ket{1} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} $
> 
> *From [Measuring probabilistic states](https://quantum.cloud.ibm.com/learning/en/courses/basics-of-quantum-information/single-systems/classical-information#measuring-probabilistic-states) on IBM Quantum Learning.*

In [19]:
# Always 0.
zero = np.array([[1], [0]])

print(zero)

[[1]
 [0]]


In [20]:
# Always 1.
one = np.array([[0],[1]])

print(one)

[[0]
 [1]]


In [21]:
# Sometimes call these ket_0 and ket_1.
ket_0 = np.array([[1], [0]])
ket_1 = np.array([[0], [1]])

In [22]:
# ket_0 and ket_1 work well together to represent other states.
ket_x = 0.75 * ket_0 + 0.25 * ket_1

print(ket_x)

[[0.75]
 [0.25]]


#### Matrices

Matrices are two dimensional arrays.

In [23]:
# Matrices as two dimensional arrays.
A = [
  [5, 8, 7],
  [9, 2, 1],
  [4, 6, 3],
]

B = [
  [1, 4, 8],
  [2, 9, 6],
  [3, 5, 7],
]

In [24]:
# Number of rows in A.
len(A)

3

In [25]:
# Number of columns in B.
len(B[0])

3

#### Why Matrix Multiplication is Needed

**[Probability vectors](https://en.wikipedia.org/wiki/Probability_vector)** are combined with matrices to represent operations applied to bits. The matrix represents an operation that transforms the bit's state.

**The workflow:**
- **Start with:** A bit in some state (represented as a probability vector)
- **Apply operation:** Multiply by a matrix (the matrix goes in front)
- **Result:** Get a new state (a new probability vector)

**Real example - the NOT gate:**
- Start: A bit that is 0 with probability 0.75 and 1 with probability 0.25
- Apply a **[NOT operation](https://en.wikipedia.org/wiki/Inverter_(logic_gate))** to flip it, even without knowing its exact value
- Result: The bit is now 0 with probability 0.25 and 1 with probability 0.75
- The probabilities swap!

**Why this matters:**
- Operations can be applied to bits even when their exact values are unknown
- In quantum computing, qubits are often in **[superposition](https://en.wikipedia.org/wiki/Quantum_superposition)**, a state that is also described by a vector rather than by a single definite value
- A mathematical way is needed to describe what happens when operations are applied to these uncertain states
- **[Matrix multiplication](https://en.wikipedia.org/wiki/Matrix_multiplication)** provides that tool
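
The NOT example above can be sketched in NumPy as follows. The variable names `NOT` and `flipped` are illustrative choices; the matrix itself is the standard 2×2 matrix that swaps the two entries of a vector.

```python
import numpy as np

# NOT as a matrix: it swaps the two entries of a 2x1 vector.
NOT = np.array([[0, 1],
                [1, 0]])

# A bit that is 0 with probability 0.75 and 1 with probability 0.25.
x = np.array([[0.75], [0.25]])

# The matrix goes in front of the vector.
flipped = NOT @ x

print(flipped)
# [[0.25]
#  [0.75]]
```

The result is still a valid probability vector: the entries are nonnegative and sum to one.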

---

#### Matrix Multiplication - A Concrete Example

Here's exactly how matrix multiplication works, using two 3×3 matrices (3 rows by 3 columns):

**Matrix A:**
```
[ 5  8  7 ]
[ 9  2  1 ]
[ 4  6  3 ]
```

**Matrix B:**
```
[ 1  4  8 ]
[ 2  9  6 ]
[ 3  5  7 ]
```

**The core concept: Dot Product**

The key operation is the **[dot product](https://en.wikipedia.org/wiki/Dot_product)**:
- Take a row from the first matrix (a list of numbers)
- Take a column from the second matrix (another list of numbers)
- Multiply corresponding entries together
- Add up all those products
- That sum becomes one entry in the result

---

**Step-by-step calculation for position (Row 2, Column 1):**

Calculate one specific entry to show the process:

1. **Identify the row and column:**
   - Target: row 2, column 1 of the result
   - Take **row 2 from matrix A**: [9, 2, 1]
   - Take **column 1 from matrix B**: [1, 2, 3]

2. **Multiply corresponding entries:**
   - First entries: 9 × 1 = 9
   - Second entries: 2 × 2 = 4
   - Third entries: 1 × 3 = 3

3. **Add them all up:**
   - Sum: 9 + 4 + 3 = **16**

4. **Place the result:**
   - The answer 16 goes into **row 2, column 1** of the result matrix
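
The four steps above can be reproduced in a few lines of plain Python. This is a sketch for illustration; the names `row`, `col`, `products` and `entry` are assumptions introduced here.

```python
A = [[5, 8, 7], [9, 2, 1], [4, 6, 3]]
B = [[1, 4, 8], [2, 9, 6], [3, 5, 7]]

# Step 1: row 2 of A and column 1 of B (Python indexes from 0).
row = A[1]
col = [r[0] for r in B]

# Step 2: multiply corresponding entries.
products = [x * y for x, y in zip(row, col)]

# Step 3: add them all up.
entry = sum(products)

print(row, col, products, entry)  # [9, 2, 1] [1, 2, 3] [9, 4, 3] 16
```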

---

**The complete result matrix (A × B):**
```
[ 42  127  137 ]
[ 16   59   91 ]
[ 25   85   89 ]
```

**How all entries are calculated:**
- For **each position** (i, j) in the result matrix:
  - Take **row i** from matrix A
  - Take **column j** from matrix B
  - Do the dot product
  - Put the answer in position (i, j)
- Repeat this 9 times (because 3 rows × 3 columns = 9 entries total)

---

**Important rules:**

- **Dimension compatibility:** The number of columns in the first matrix MUST equal the number of rows in the second matrix
  - Why? Because when doing the dot product, the same number of values is needed in each list
  - Example: A is 3×3, B is 3×3
  - A has 3 columns, B has 3 rows → They match! 
  - The result will be 3×3 (rows from A × columns from B)

- **Order matters:** A × B is usually different from B × A
  - Matrix multiplication is NOT **[commutative](https://en.wikipedia.org/wiki/Commutative_property)**
  - Always be careful about which matrix goes first

- **The process is systematic:**
  - Row from first matrix, column from second matrix
  - Always in that order
  - The position (row, column) in the result corresponds to which row and column were picked
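
The rules above can be checked with NumPy. The following sketch uses the same `A` and `B` as the worked example; the helper arrays `P` and `Q` and the flag `mismatch_raised` are illustrative names that do not appear in the notebook.

```python
import numpy as np

A = np.array([[5, 8, 7], [9, 2, 1], [4, 6, 3]])
B = np.array([[1, 4, 8], [2, 9, 6], [3, 5, 7]])

# Order matters: A @ B and B @ A are different matrices here.
print(np.array_equal(A @ B, B @ A))  # False

# Dimension compatibility: (2x3) @ (3x4) gives a 2x4 result.
P = np.ones((2, 3))
Q = np.ones((3, 4))
print((P @ Q).shape)  # (2, 4)

# Reversing the order gives (3x4) @ (2x3): 4 columns against 2 rows.
# NumPy rejects this with a ValueError.
try:
    Q @ P
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```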

---

**Why this works for quantum computing:**

- When an operation mixes several entries of a state at once, matrix multiplication keeps track of every contribution systematically
- It's a universal mathematical tool that captures how **[quantum states](https://en.wikipedia.org/wiki/Quantum_state)** transform
- The matrices represent **[quantum gates](https://en.wikipedia.org/wiki/Quantum_logic_gate)** (operations)
- The vectors represent quantum states
- Multiplying them together shows what happens after the operation
- This framework is why **[linear algebra](https://en.wikipedia.org/wiki/Linear_algebra)** is so central to quantum computing!
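
As a hedged illustration of a gate acting on a state, the sketch below applies the Hadamard matrix to $|0\rangle$. The Hadamard gate is not introduced in this section, so treat it as an assumed example; the names `H`, `state` and `probs` are illustrative.

```python
import numpy as np

# The state |0> as a column vector.
ket_0 = np.array([[1.0], [0.0]])

# Hadamard gate (assumed example, not defined earlier in this section).
H = np.array([[1, 1],
              [1, -1]]) / np.sqrt(2)

# Applying the gate is a matrix-vector product.
state = H @ ket_0

# Measurement probabilities are the squared magnitudes of the entries.
probs = np.abs(state) ** 2

print(state.ravel())   # both entries are 1/sqrt(2)
print(probs.ravel())   # approximately [0.5 0.5]
```

Unlike a probability vector, the entries of a quantum state vector need not sum to one; it is their squared magnitudes that do.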

In [26]:
# Multiply.
def matrix_multiply(X, Y):
  """
  Multiply matrices X and Y. Note we should check but don't, for each matrix:
    1. Each row has same number of elements.
    2. Each column has same number of elements.
  Also that:
    3. Number of columns in X equals number of rows in Y.
  """
  return [
    [
        sum(X[i][k] * Y[k][j] for k in range(len(Y)))  # k runs over the rows of Y (= columns of X)
        for j in range(len(Y[0]))
    ]
    for i in range(len(X))
  ]
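
The docstring above lists three checks that the function does not perform. One possible way to add them is sketched below. The name `checked_matrix_multiply` and the error messages are assumptions, not part of the original notebook.

```python
def checked_matrix_multiply(X, Y):
    """Multiply list-of-lists matrices X and Y after validating their shapes."""
    if not X or not Y or not X[0] or not Y[0]:
        raise ValueError("matrices must be non-empty")
    # Every row of each matrix must have the same length.
    if any(len(row) != len(X[0]) for row in X):
        raise ValueError("rows of X have different lengths")
    if any(len(row) != len(Y[0]) for row in Y):
        raise ValueError("rows of Y have different lengths")
    # Number of columns in X must equal number of rows in Y.
    if len(X[0]) != len(Y):
        raise ValueError("columns of X must equal rows of Y")
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]

# A 1x3 matrix times a 3x1 matrix gives a 1x1 matrix.
print(checked_matrix_multiply([[1, 2, 3]], [[4], [5], [6]]))  # [[32]]
```

Once every row is known to have the same length, the columns automatically have equal lengths too, so no separate column check is needed.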

In [27]:
# AB.
matrix_multiply(A, B)

[[42, 127, 137], [16, 59, 91], [25, 85, 89]]

In [28]:
# Display it a bit nicer.
for r in matrix_multiply(A, B):
    print('  '.join(f'{x:3}' for x in r))

 42  127  137
 16   59   91
 25   85   89


### Using NumPy

> Unlike in many matrix languages, the product operator `*` operates elementwise in NumPy arrays. The matrix product can be performed using the `@` operator...
> 
> *From [Basic operations](https://numpy.org/doc/stable/user/quickstart.html#basic-operations) in the [NumPy quickstart](https://numpy.org/doc/stable/user/quickstart.html)*

In [29]:
# Matrix multiplication.
np.array(A) @ np.array(B)

array([[ 42, 127, 137],
       [ 16,  59,  91],
       [ 25,  85,  89]])
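
To make the difference between `*` and `@` concrete, the sketch below applies both operators to the same two matrices. The names `elementwise` and `product` are illustrative.

```python
import numpy as np

A = np.array([[5, 8, 7], [9, 2, 1], [4, 6, 3]])
B = np.array([[1, 4, 8], [2, 9, 6], [3, 5, 7]])

# `*` multiplies entries in matching positions.
elementwise = A * B
print(elementwise)
# [[ 5 32 56]
#  [18 18  6]
#  [12 30 21]]

# `@` is the matrix product (rows of A dotted with columns of B).
product = A @ B
print(product)
# [[ 42 127 137]
#  [ 16  59  91]
#  [ 25  85  89]]
```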

In [30]:
# Problem 1 solution: generate a random constant or balanced function.
def random_constant_balanced():
    """
    Returns a randomly chosen constant or balanced function
    that takes 4 Boolean inputs.
    """
    # Step 1: Randomly decide constant or balanced
    is_constant = random.choice([True, False])
    
    if is_constant:
        # Step 2a: Constant function - all 16 outputs are the same
        value = random.choice([0, 1])
        outputs = [value] * 16
    else:
        # Step 2b: Balanced function - 8 zeros and 8 ones, shuffled
        outputs = [0] * 8 + [1] * 8
        random.shuffle(outputs)
    
    # Step 3: Return a function that uses this lookup table
    def f(a, b, c, d):
        # Convert 4 bits to index (0-15)
        index = 8*a + 4*b + 2*c + d
        return outputs[index]
    
    return f

In [31]:
# Test the function
f = random_constant_balanced()

# Print all outputs to see the pattern
print("Input -> Output")
for i in range(16):
    a = (i >> 3) & 1
    b = (i >> 2) & 1
    c = (i >> 1) & 1
    d = i & 1
    print(f"({a},{b},{c},{d}) -> {f(a,b,c,d)}")

# Count outputs to verify constant/balanced
outputs = [f((i>>3)&1, (i>>2)&1, (i>>1)&1, i&1) for i in range(16)]
zeros = outputs.count(0)
ones = outputs.count(1)
print(f"\nZeros: {zeros}, Ones: {ones}")
print(f"Type: {'Constant' if zeros == 0 or zeros == 16 else 'Balanced'}")

Input -> Output
(0,0,0,0) -> 0
(0,0,0,1) -> 1
(0,0,1,0) -> 1
(0,0,1,1) -> 1
(0,1,0,0) -> 1
(0,1,0,1) -> 0
(0,1,1,0) -> 1
(0,1,1,1) -> 0
(1,0,0,0) -> 0
(1,0,0,1) -> 1
(1,0,1,0) -> 1
(1,0,1,1) -> 0
(1,1,0,0) -> 0
(1,1,0,1) -> 0
(1,1,1,0) -> 0
(1,1,1,1) -> 1

Zeros: 8, Ones: 8
Type: Balanced
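
The bit-shifting loop above enumerates the 16 inputs in the same order as the lookup-table index `8*a + 4*b + 2*c + d`. A short sketch using `itertools.product` shows the same enumeration; the names `inputs` and `indices` are illustrative.

```python
import itertools as it

# All 4-bit inputs, with the last bit varying fastest.
inputs = list(it.product([0, 1], repeat=4))

# The index each input maps to in a 16-entry lookup table.
indices = [8 * a + 4 * b + 2 * c + d for a, b, c, d in inputs]

print(len(inputs))                  # 16
print(inputs[0], inputs[-1])        # (0, 0, 0, 0) (1, 1, 1, 1)
print(indices == list(range(16)))   # True
```

Every input therefore maps to a distinct table entry, so the lookup table fully determines the function.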


## Problem 2: Classical Testing for Function Type

**Instructions:**  
Write a Python function `determine_constant_balanced(f)` that analyzes a function f and returns "constant" or "balanced". Include a brief note on efficiency.

In [32]:
# Problem 2 solution goes here

## Problem 3: Quantum Oracles

**Instructions:**  
Using **[Qiskit](https://qiskit.org/)**, create **[quantum oracles](https://en.wikipedia.org/wiki/Deutsch%E2%80%93Jozsa_algorithm#Algorithm)** for each of the possible single-Boolean-input functions in Deutsch's algorithm. Explain how each oracle works.

In [33]:
# Problem 3 solution goes here

## Problem 4: Deutsch's Algorithm with Qiskit

**Instructions:**  
Design a **[quantum circuit](https://en.wikipedia.org/wiki/Quantum_circuit)** that solves Deutsch's problem for single-input functions. Demonstrate its use with each oracle and explain the interference pattern.

In [34]:
# Problem 4 solution goes here

## Problem 5: Scaling to the Deutsch–Jozsa Algorithm

**Instructions:**  
Use Qiskit to create a quantum circuit for four-bit functions generated in Problem 1. Demonstrate it on constant and balanced functions and explain the results.

In [35]:
# Problem 5 solution goes here