# Exercise Sheet 4
## Instructions
The instructions for this exercise sheet are the same as for the previous one, so this section is just a summary to serve as a reminder. Refer to the material on Engage and at the start of Exercise Sheet 1 for more detail, and if you are unsure, ask on the Q&A forum.

This exercise sheet counts towards your overall grade for the unit.

Complete each question in the code cell provided, which will usually contain some skeleton code to help you start the question. Unless specified otherwise you may change the skeleton code as you wish, but your code must pass the formatting tests in the following code cell. If that cell runs without errors, your code is eligible for submission. However, these tests only check the basics of the question. Your code will be subject to additional hidden tests which will determine your grade, so make sure you test your own code thoroughly before submission.

## Questions
### Questions 1 & 2
*(worth 25% each)*

In the cell below, implement 
1. the *bubble sort* and 
2. the *merge sort* algorithms.

Unlike the examples in the unit material, please make your functions *return new sorted versions* of the input lists, *without* modifying the input lists. In other words, do not worry about making your algorithms in-place.

If you find it easier to think about modifying the list in-place (more likely with bubble sort than merge sort), you can always write `new_list = in_list.copy()` on the first line, sort `new_list` in place, then return `new_list`.

#### Bubble Sort
Bubble sort works by iterating over the list, comparing adjacent items, and swapping them if they are out of order. 

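Notice that after the first pass of a bubble sort, the biggest item is always moved to the end of the list. This means the next pass does not need to check the final position of the list. More generally, after `k` passes the last `k` items are already in their final positions, so each pass can stop one position earlier than the one before.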

In addition, you can keep track of how many items were swapped on each iteration. If this count is zero on any iteration, then the list must be fully sorted, in which case the algorithm can stop early.
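As a rough illustration of these two observations, here is a minimal sketch of a single pass. The helper name `bubble_pass` and its `unsorted_length` parameter are hypothetical and not part of the required interface; the sketch only shows that one pass moves the largest item to the end and that counting swaps is cheap.

```python
def bubble_pass(items, unsorted_length):
    """Run one bubble sort pass over the first `unsorted_length` items.

    Returns a new list and the number of swaps made; `items` is not modified.
    (Hypothetical helper, for illustration only.)
    """
    result = items.copy()
    swaps = 0
    for j in range(unsorted_length - 1):
        # Compare adjacent items and swap them if they are out of order
        if result[j] > result[j + 1]:
            result[j], result[j + 1] = result[j + 1], result[j]
            swaps += 1
    return result, swaps


data = [37, 42, 9, 19, 35, 4, 53, 22]
after_first, swaps = bubble_pass(data, len(data))
print(after_first)  # [37, 9, 19, 35, 4, 42, 22, 53] (53 is now last)
print(swaps)        # 5
```

A full bubble sort would repeat this with `unsorted_length` shrinking by one each time, stopping as soon as a pass reports zero swaps.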

#### Merge Sort
To understand merge sort, let's first understand the *merge* operation. The merge operation takes two *sorted* lists and combines them into another sorted list. Imagine we are merging `list1` and `list2`, both of which are sorted. Create two variables, `ptr1` and `ptr2`, each holding an index into one of the lists and both starting at zero (the first element).

Then, go through each index of the merged list (e.g. `i in range(0, len(list1)+len(list2))`). Set position `i` of the merged list to the *smaller* value of `list1[ptr1]` and `list2[ptr2]`, then advance the corresponding pointer by one.

So initially you will compare the first element of each list. If the value in `list1` is smaller, copy it into the merged list, and then compare the second item of `list1` with the first item of `list2`.

Keep going until one pointer goes beyond its list – at this point, copy the rest of the other list into the merged list.

Have a look at an example of the merge operation below:

<br /><video controls loop autoplay width=600 src="./resources/merge.mp4">
</video>
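The merge steps described above can be sketched as follows. This is one possible reading of the description, not the only valid one: the function name `merge` is an assumption, and ties are resolved in favour of `list1`.

```python
def merge(list1, list2):
    """Merge two already-sorted lists into a new sorted list (sketch)."""
    total = len(list1) + len(list2)
    merged = [None] * total
    ptr1 = 0  # index of the next unused item in list1
    ptr2 = 0  # index of the next unused item in list2

    for i in range(total):
        # Take from list1 if list2 is used up, or if both lists still have
        # items and list1's current item is not larger than list2's
        take_from_list1 = ptr2 >= len(list2) or (
            ptr1 < len(list1) and list1[ptr1] <= list2[ptr2]
        )
        if take_from_list1:
            merged[i] = list1[ptr1]
            ptr1 += 1
        else:
            merged[i] = list2[ptr2]
            ptr2 += 1
    return merged


print(merge([1, 4, 9], [2, 3, 10, 11]))  # [1, 2, 3, 4, 9, 10, 11]
```

Once one pointer has gone past the end of its list, every remaining position is filled from the other list, which has the same effect as copying the rest of that list in one go.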

Now, to understand *merge sort*, realise that any list of size 1 (or an empty list) must be considered sorted. So, here is the process for the input list:
* If `in_list` has length 1 or less, return `in_list`
* Otherwise, split `in_list` into two halves: `left` and `right`
* Merge sort `left`
* Merge sort `right`
* Merge `left` and `right` into `merged`
* Return `merged`

For any size input list, this will be recursively split into two until both halves are length 1, and then the final sorted list will be rebuilt by merging sorted lists.
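To see the recursive splitting and rebuilding in action, the following sketch prints each call, indented by recursion depth. The name `traced_merge_sort` and the `depth` parameter are hypothetical, and the merge step is delegated to the standard library's `heapq.merge` purely to keep the trace short; it is not a substitute for writing the merge yourself.

```python
from heapq import merge as heap_merge


def traced_merge_sort(in_list, depth=0):
    """Merge sort that prints its splits and merges (illustrative sketch)."""
    indent = "  " * depth
    print(f"{indent}sort  {in_list}")
    if len(in_list) <= 1:
        # Base case: nothing to split, return a copy
        return in_list.copy()

    mid = len(in_list) // 2
    left = traced_merge_sort(in_list[:mid], depth + 1)
    right = traced_merge_sort(in_list[mid:], depth + 1)

    # heapq.merge lazily merges sorted inputs; list() collects the result
    merged = list(heap_merge(left, right))
    print(f"{indent}merge {left} + {right} -> {merged}")
    return merged


traced_merge_sort([37, 42, 9, 19])
```

The deepest `sort` lines show lists of length 1, and the `merge` lines show the sorted list being rebuilt on the way back up.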

In [None]:
def bubble_sort(in_list):
    # Make a copy so the original list is not modified
    new_list = in_list.copy()

    # Length of the list
    n = len(new_list)

    # An empty list can be returned as it is
    if n == 0:
        return new_list

    # Outer loop: one iteration per pass over the list
    for i in range(n):
        # Records whether any swap happened during this pass
        swapped = False

        # Inner loop: compare adjacent items
        # The last i items are already sorted, so they can be skipped
        for j in range(n - 1 - i):
            # Swap if the left item is larger than the right item
            if new_list[j] > new_list[j + 1]:
                new_list[j], new_list[j + 1] = new_list[j + 1], new_list[j]
                swapped = True

        # If no swap happened in this pass, the list is fully sorted
        if not swapped:
            break

    return new_list


def merge_sort(in_list):
    # A list of length 0 or 1 is already sorted; return a copy so that
    # the caller always receives a new list, never the input object
    if len(in_list) <= 1:
        return in_list.copy()

    # Split the list into two halves
    mid = len(in_list) // 2
    left = in_list[:mid]
    right = in_list[mid:]

    # Sort the left half (recursive call)
    sorted_left = merge_sort(left)
    # Sort the right half (recursive call)
    sorted_right = merge_sort(right)

    # Merge the two sorted lists
    merged = []
    left_index = 0
    right_index = 0

    # While both lists still have items remaining
    while left_index < len(sorted_left) and right_index < len(sorted_right):
        # If the left item is smaller or equal, take from the left
        if sorted_left[left_index] <= sorted_right[right_index]:
            merged.append(sorted_left[left_index])
            left_index += 1
        # Otherwise the right item is smaller, so take from the right
        else:
            merged.append(sorted_right[right_index])
            right_index += 1

    # Append anything left over on the left
    while left_index < len(sorted_left):
        merged.append(sorted_left[left_index])
        left_index += 1

    # Append anything left over on the right
    while right_index < len(sorted_right):
        merged.append(sorted_right[right_index])
        right_index += 1

    return merged


In [None]:
# Note that your code will be checked to ensure you are implementing the correct algorithms!
assert(bubble_sort([37, 42, 9, 19, 35, 4, 53, 22]) == [4, 9, 19, 22, 35, 37, 42, 53])
assert(bubble_sort([5, 4, 3, 2, 1]) == [1, 2, 3, 4, 5])
assert(bubble_sort([]) == [])

assert(merge_sort([37, 42, 9, 19, 35, 4, 53, 22]) == [4, 9, 19, 22, 35, 37, 42, 53])
assert(merge_sort([5, 4, 3, 2, 1]) == [1, 2, 3, 4, 5])
assert(merge_sort([]) == [])

### Question 3
*(worth 25%)*

In the markdown cell below, in your own words, explain the time complexity, space complexity, and stability of bubble sort and merge sort – both as algorithms, and your specific implementations.

Of course, these are well-known algorithms, so we are aware you can look up the answers. The marks in this section are awarded for explaining *why* these properties hold, in your own words. I encourage you to try to work them out yourself before you look up the answers, as this might make explaining them easier.

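## Question 3: Time Complexity, Space Complexity, and Stability of Bubble Sort and Merge Sort

### Bubble Sort

#### Time Complexity
- **Worst case: O(n²)**
  - Reason: when the list is in reverse order, no pass can finish early. The first pass makes n - 1 comparisons, the next n - 2, and so on, giving (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2 comparisons in total, which grows as n². A completely reversed list such as [5,4,3,2,1] takes the longest.

- **Best case: O(n)**
  - Reason: when the list is already sorted, a single pass (n - 1 comparisons) finds that no swap occurred, and the algorithm stops. My implementation uses the `swapped` flag to exit early in this way.

- **Average case: O(n²)**
  - Reason: for random data, about half of all pairs of elements are out of order on average, and each swap of adjacent items fixes only one such pair. Roughly n(n - 1)/4 swaps are therefore needed, which is still proportional to n².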
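The best-case and worst-case figures can be checked empirically. The sketch below is a copy of the bubble sort logic with a comparison counter added; `count_bubble_comparisons` is a hypothetical helper written only for this check.

```python
def count_bubble_comparisons(in_list):
    """Bubble sort with early exit; returns the number of comparisons made."""
    items = in_list.copy()
    n = len(items)
    comparisons = 0
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            comparisons += 1
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:
            break
    return comparisons


n = 100
print(count_bubble_comparisons(list(range(n))))         # 99, i.e. n - 1
print(count_bubble_comparisons(list(range(n, 0, -1))))  # 4950, i.e. n(n-1)/2
```

Doubling `n` roughly doubles the first number but roughly quadruples the second, which is the difference between O(n) and O(n²).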

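#### Space Complexity
- **O(n)**
  - Reason: my implementation creates a new list with `in_list.copy()`, so it needs memory the same size as the original list. An in-place implementation that modifies the original list directly would need only O(1) extra space, but the instructions asked for a new list to be returned, so this version is O(n).

#### Stability
- **Stable**
  - Reason: elements with equal values keep their relative order. My code uses the condition `if new_list[j] > new_list[j + 1]:`, with `>` rather than `>=`, so equal values are never swapped and their original order is preserved.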
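Stability only becomes visible when items that compare equal can still be told apart. The following sketch assumes a key-based variant, `bubble_sort_by`, which is hypothetical and not part of the exercise; it sorts `(name, age)` pairs by age only.

```python
def bubble_sort_by(in_list, key):
    """Bubble sort that compares items by key(item); returns a new list."""
    items = in_list.copy()
    n = len(items)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            # Strict '>' means items with equal keys are never swapped
            if key(items[j]) > key(items[j + 1]):
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:
            break
    return items


people = [("Alice", 30), ("Bob", 25), ("Carol", 30), ("Dave", 25)]
by_age = bubble_sort_by(people, key=lambda person: person[1])
print(by_age)
# [('Bob', 25), ('Dave', 25), ('Alice', 30), ('Carol', 30)]
```

Bob stays ahead of Dave and Alice stays ahead of Carol, matching their order in the input. Changing `>` to `>=` would swap equal keys and break this property.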

---

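### Merge Sort

#### Time Complexity
- **Worst, best, and average case: O(n log n)**
  - Reason: the list is split in half each time, so the depth of splitting is log n (halving repeatedly reaches 1 after about log₂ n steps). At each depth, n elements in total have to be merged, so the total is about n × log n operations.
  - Regardless of how the data is arranged, the same splits and merges are always performed, so it is O(n log n) in every case. This is its advantage over bubble sort.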
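As a rough empirical check of the n log n bound, this sketch counts comparisons during a merge sort on random data. The function `count_merge_comparisons` is a hypothetical instrumented copy, and the sizes are chosen as powers of two so that log₂ n is a whole number.

```python
import math
import random


def count_merge_comparisons(in_list):
    """Merge sort returning (sorted_list, number_of_comparisons)."""
    if len(in_list) <= 1:
        return in_list.copy(), 0

    mid = len(in_list) // 2
    left, left_count = count_merge_comparisons(in_list[:mid])
    right, right_count = count_merge_comparisons(in_list[mid:])

    merged = []
    comparisons = left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; the rest of the other needs no comparisons
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


for n in (8, 64, 1024):
    data = random.sample(range(10 * n), n)
    result, comparisons = count_merge_comparisons(data)
    print(n, comparisons, round(n * math.log2(n)))
```

The comparison count varies a little with the input order but stays below n log₂ n for these sizes, whereas the number of splits and the total number of items copied are the same for every input of a given length.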

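#### Space Complexity
- **O(n)**
  - Reason: every merge builds a new list `merged`, so in the end memory about the same size as the original list is needed. The recursion stack also holds about log n calls, but this is small compared with n, so the total is O(n).

#### Stability
- **Stable**
  - Reason: when the left and right values are equal, the merge takes from the left side first (the `<=` in `if sorted_left[left_index] <= sorted_right[right_index]:`). Elements that were originally on the left therefore come first, so the order of equal values is preserved.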

---

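### Summary

Bubble sort is simple to implement and easy to understand, but it slows down sharply as the amount of data grows. Merge sort is a little more complex, but it stays consistently fast even on large amounts of data. Both are stable sorts, so, for example, if you sort by name and then sort by age, people of the same age remain in name order.

In practice, I think bubble sort is a reasonable choice when the data is small or almost sorted, and merge sort is the better choice for large amounts of data.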


### Question 4
*(worth 25%)*

When we write simple arithmetic expressions, we use *infix notation*: the operator goes in the middle of the two arguments, such as `1 + 2`, or `5 - 3`.

An alternative is to use *prefix notation*; here the same expressions would be written `+ 1 2` and `- 5 3`. This is also called *Polish notation*.

The reason this is useful is that it removes the need for parentheses in nested expressions. Suppose we want to multiply the two expressions. 

Using infix, we must use parentheses to ensure the operators are evaluated in the correct order: `(1 + 2) * (5 - 3)`.

Using prefix, we can write: `* + 1 2 - 5 3`. So long as all the operators are binary (two arguments), no parentheses are needed, no matter how complex the expression.

You can also build a *binary tree* to represent a nested expression unambiguously.

<img src="./resources/mathstree.png" width=400 />

Notice that if you traverse the tree in-order (LNR) you get infix notation (without parentheses). If you traverse the tree pre-order (NLR) you get prefix notation. If you traverse the tree post-order (LRN) you get *postfix notation*, also called *Reverse Polish notation*, which is unambiguous under the same conditions.
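The three traversals can be tried out on a hand-built tree. The sketch below assumes the tree for `(1 + 2) * (5 - 3)`; the `Node` class and the function names are illustrative and may differ from those in the unit material.

```python
class Node:
    """A binary tree node holding an operator or a number as a string."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def in_order(node):
    # Left, Node, Right
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def pre_order(node):
    # Node, Left, Right
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)


def post_order(node):
    # Left, Right, Node
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


tree = Node("*",
            Node("+", Node("1"), Node("2")),
            Node("-", Node("5"), Node("3")))

print(" ".join(in_order(tree)))    # 1 + 2 * 5 - 3
print(" ".join(pre_order(tree)))   # * + 1 2 - 5 3
print(" ".join(post_order(tree)))  # 1 2 + 5 3 - *
```

Note that the in-order output has lost the parentheses and so no longer means the same thing under the usual precedence rules, while the prefix and postfix outputs remain unambiguous.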

In the cell below, write a function which takes a string containing a prefix notation mathematical expression, builds a binary tree, and then traverses it to produce an equivalent postfix notation expression.

The input expression will be valid. It will only contain positive integers and the operators `+`, `-`, `/`, and `*`. Terms will always be separated by a single space.

You are welcome to reuse the code from the unit material.

Hint: start simple, use the tests below to help you structure your thinking.
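One way to approach the tree-building step is sketched below, under the assumption stated above that the input is always valid. The names `Node`, `build_tree`, `post_order`, and `prefix_to_postfix_tree` are hypothetical, and this is only one of several reasonable designs.

```python
OPERATORS = {"+", "-", "*", "/"}


class Node:
    """A binary tree node holding one token of the expression."""

    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def build_tree(tokens):
    """Build a tree from an iterator of prefix-notation tokens."""
    token = next(tokens)
    node = Node(token)
    if token in OPERATORS:
        # In prefix notation an operator is followed by its left
        # sub-expression and then its right sub-expression
        node.left = build_tree(tokens)
        node.right = build_tree(tokens)
    return node


def post_order(node):
    """Return the tokens of the tree in post-order (LRN)."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


def prefix_to_postfix_tree(expression):
    root = build_tree(iter(expression.split()))
    return " ".join(post_order(root))


print(prefix_to_postfix_tree("* + 1 2 - 5 3"))  # 1 2 + 5 3 - *
```

Passing an iterator to `build_tree` lets each recursive call consume exactly the tokens that belong to its own sub-expression, so no index bookkeeping is needed.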

In [1]:
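def prefix_to_postfix(expression):
    # Stack-based conversion; no explicit binary tree is built here
    tokens = expression.split()
    stack = []

    def is_operator(token):
        return token in ['+', '-', '*', '/']

    # Scan the prefix expression from right to left
    for token in reversed(tokens):
        if is_operator(token):
            # Pop two operands; because of the reversed scan, the first pop
            # is the left operand and the second is the right operand
            operand1 = stack.pop()
            operand2 = stack.pop()

            # Postfix form: "operand1 operand2 operator"
            postfix = operand1 + " " + operand2 + " " + token
            stack.append(postfix)
        else:
            # Numbers are pushed onto the stack as they are
            stack.append(token)

    return stack[-1]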


# Tests
assert(prefix_to_postfix("5") == "5")
assert(prefix_to_postfix("+ 10 20") == "10 20 +")
assert(prefix_to_postfix("* + 1 2 - 5 3") == "1 2 + 5 3 - *")

print("✓ All tests passed!")


✓ All tests passed!


In [2]:
assert(prefix_to_postfix("5") == "5")
assert(prefix_to_postfix("+ 10 20") == "10 20 +")
assert(prefix_to_postfix("* + 1 2 - 5 3") == "1 2 + 5 3 - *")