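## Complete Binary Tree
A complete binary tree is filled level by level from top to bottom, and each level is filled from left to right. In other words, for any node, a right child can exist only if a left child already exists.
#### Tips: the number of levels increases only when N (the number of nodes) reaches a power of 2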
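
As a quick check of the tip, a complete binary tree with N nodes has floor(log2 N) + 1 levels. The helper name `levels` below is made up for illustration; it is only a sketch of that formula.

```python
def levels(n):
    """Number of levels in a complete binary tree with n >= 1 nodes."""
    # For a positive integer, int.bit_length() equals floor(log2(n)) + 1.
    return n.bit_length()

# The level count changes only at n = 1, 2, 4, 8 (the powers of 2).
for n in range(1, 9):
    print(n, levels(n))
```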

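## Properties of a Heap
1. The root is always the minimum (min heap) or the maximum (max heap).
2. In a max heap, a parent's key is never smaller than its children's keys; in a min heap, a parent's key is never larger.
3. A heap is implemented with a dynamic array.
4. Elements stored in a heap must be comparable. To store instances of a class, implement its `__lt__` and `__eq__` methods.
5. If a child is at index i, its parent is at index (i-1)//2.
6. If a parent is at index i, its left child is at index 2i+1 and its right child is at index 2i+2.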
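
The index formulas in the list above can be checked against a heap built by Python's `heapq`. This is a small sketch; the helper names `parent`, `left`, `right`, and `is_min_heap` are made up here for illustration.

```python
import heapq

def parent(i):
    return (i - 1) // 2

def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

def is_min_heap(a):
    """True if every non-root element is >= the element at its parent index."""
    return all(a[parent(i)] <= a[i] for i in range(1, len(a)))

data = [5, 11, 3, 6, 12, 9, 8]
heapq.heapify(data)            # rearranges the list into a min heap in place
print(is_min_heap(data))       # the heap property holds for every parent/child pair
print(data[0] == min(data))    # the root holds the minimum
```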

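## The examples below use a min heap
### Insert: add the element at the tail of the heap (append it to the array), then run upheap on it.
- Because the height of the tree is log n, insert takes O(log n) time.
- upheap: if the parent is larger than the child, swap them. Repeat until the heap order is restored.
### Delete: only the root can be deleted. Swap the root with the tail, pop the tail, then run downheap on the new root.
- For the same reason, delete takes O(log n) time.
- downheap: compare the node with its left and right children (if they exist) and move the smallest of the three into the node's position. Repeat until the heap order is restored.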
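
The insert and delete steps above can be sketched as a small array-backed class. This is a minimal illustration, not the notebook's own implementation; the names `MinHeap`, `insert`, `delete_min`, `_upheap`, and `_downheap` are chosen here.

```python
class MinHeap:
    def __init__(self):
        self.a = []  # dynamic array storing the tree level by level

    def insert(self, x):
        self.a.append(x)                 # place the new element at the tail
        self._upheap(len(self.a) - 1)    # then restore the heap order upward

    def delete_min(self):
        a = self.a
        if not a:
            raise IndexError("delete_min from an empty heap")
        a[0], a[-1] = a[-1], a[0]        # swap root and tail
        smallest = a.pop()               # remove the old root from the tail
        if a:
            self._downheap(0)            # restore the heap order downward
        return smallest

    def _upheap(self, i):
        a = self.a
        while i > 0:
            p = (i - 1) // 2
            if a[p] <= a[i]:             # parent is not larger: order is fine
                break
            a[p], a[i] = a[i], a[p]
            i = p

    def _downheap(self, i):
        a = self.a
        n = len(a)
        while True:
            smallest = i
            for c in (2 * i + 1, 2 * i + 2):   # left and right child, if present
                if c < n and a[c] < a[smallest]:
                    smallest = c
            if smallest == i:
                break
            a[i], a[smallest] = a[smallest], a[i]
            i = smallest

h = MinHeap()
for x in [5, 11, 3, 6, 12]:
    h.insert(x)
print([h.delete_min() for _ in range(5)])  # [3, 5, 6, 11, 12]
```

Both `_upheap` and `_downheap` move along a single root-to-leaf path, which is where the O(log n) bound comes from.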

### Time complexity of building a heap: O(n)
https://blog.csdn.net/qq_34228570/article/details/80024306
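
One way to see the O(n) bound is the bottom-up construction: run downheap on every internal node, starting from the last one. Most nodes sit near the bottom and can sink only a few levels, so the total work adds up to O(n) rather than O(n log n). The function name `build_min_heap` is made up for this sketch; Python's built-in `heapq.heapify` does the same job.

```python
def build_min_heap(a):
    """Rearrange the list a into a min heap in place, bottom-up."""
    n = len(a)
    # Indices n//2 .. n-1 are leaves, so start at the last internal node.
    for start in range(n // 2 - 1, -1, -1):
        i = start
        while True:                       # downheap from position i
            smallest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and a[c] < a[smallest]:
                    smallest = c
            if smallest == i:
                break
            a[i], a[smallest] = a[smallest], a[i]
            i = smallest
    return a

data = [5, 11, 3, 6, 12, 9, 8, 10, 14, 1]
build_min_heap(data)
print(data[0])  # 1, the minimum is now at the root
```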

# Solution:
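To find the k largest elements, maintain a min heap of size k. Push each element onto the heap, and whenever the heap holds more than k elements, pop once so its size stays at k. When the traversal ends, the heap holds the k largest elements, because every smaller element has already been popped. The root of the heap is then the k-th largest element.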

In [2]:
import heapq

def findKthLargest(nums, k):
    heap = []
    for num in nums:
        heapq.heappush(heap, num)
        if len(heap) > k:
            heapq.heappop(heap)
    
    return heapq.heappop(heap)

nums = [5,11,3,6,12,9,8,10,14,1,4,2,7,15]
k = 5
findKthLargest(nums, k)

10

### Ex.2 Top K Frequent Words

Given a non-empty list of words, return the k most frequent elements.

Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

# Solution:
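The approach is the same as above. The one thing to note is that a dict cannot be ordered by its values directly, so the counts have to be extracted and pushed onto the heap for ordering.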

In [36]:
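import collections
import heapq

class WordCount:
    # Heap entry ordered so that the "worst" candidate is the smallest:
    # lower count first; among equal counts, the alphabetically later word first.
    def __init__(self, count, word):
        self.count = count
        self.word = word

    def __lt__(self, other):
        if self.count != other.count:
            return self.count < other.count
        return self.word > other.word

def topKFrequent(words, k):
    heap = []
    counts = collections.Counter(words)
    for word, count in counts.items():
        heapq.heappush(heap, WordCount(count, word))
        if len(heap) > k:
            heapq.heappop(heap)  # evict the worst candidate

    res = []
    while heap:
        res.append(heapq.heappop(heap).word)
    return res[::-1]  # pops come out worst-first, so reverse the result

words = ["i", "love", "you", "i", "love", "coding","i","like","sports"]
k = 2
topKFrequent(words, k)

['i', 'love']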

### Ex.4 Ugly Number II

Write a program to find the n-th ugly number.

Ugly numbers are positive numbers whose prime factors only include 2, 3, 5. For example, 1, 2, 3, 4, 5, 6, 8, 9, 10, 12 is the sequence of the first 10 ugly numbers.

Note that 1 is typically treated as an ugly number.

# Solution
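Every ugly number has only 2, 3, and 5 as prime factors, so new ugly numbers can be generated by multiplying existing ones by 2, 3, and 5, and then multiplying those products by 2, 3, and 5 again. The approach here: each time an ugly number is popped (in ascending order), multiply it by 2, 3, and 5. Each product is larger than the number it came from, so the heap can never place a product ahead of the current ugly number. The products can repeat, however, for example 2×3 and 3×2, so duplicates must be skipped.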

In [8]:
def nthUglyNumber(n):
    q2, q3, q5 = [2], [3], [5]
    ugly = 1
    for u in heapq.merge(q2, q3, q5):
        if n == 1:
            return ugly
        if u > ugly: # skip duplicates so they are not counted as new ugly numbers
            ugly = u
            n -= 1
            q2 += [2 * u]
            q3 += [3 * u]
            q5 += [5 * u]

nthUglyNumber(10)

12
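
The solution text describes popping each ugly number and pushing its multiples; the cell above gets there through `heapq.merge`. For comparison, here is a sketch of the same idea with an explicit heap and a set for duplicate removal. The name `nth_ugly_number` and the `seen` set are choices made for this sketch, not part of the original cell.

```python
import heapq

def nth_ugly_number(n):
    heap = [1]      # min heap of ugly numbers generated so far
    seen = {1}      # every value ever pushed, to reject duplicates such as 2*3 and 3*2
    for _ in range(n - 1):
        ugly = heapq.heappop(heap)        # next ugly number in ascending order
        for factor in (2, 3, 5):
            nxt = ugly * factor
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(heap, nxt)
    return heap[0]  # after n - 1 pops, the root is the n-th ugly number

print(nth_ugly_number(10))  # 12
```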

### Ex.5 Find K Pairs with Smallest Sums

You are given two integer arrays nums1 and nums2 sorted in ascending order and an integer k.

Define a pair (u,v) which consists of one element from the first array and one element from the second array.

Find the k pairs (u1,v1),(u2,v2) ...(uk,vk) with the smallest sums.

<img src="../images/ch15/heap4.png" width="460"/>

# Solution:
The simplest approach is brute force: compute every possible pair, then pick the k pairs with the smallest sums.
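
A sketch of the brute-force idea, assuming both inputs fit comfortably in memory. The function name `k_smallest_pairs_brute` is made up here. It looks at all len(nums1) × len(nums2) pairs, so it is only practical for small inputs.

```python
import heapq
from itertools import product

def k_smallest_pairs_brute(nums1, nums2, k):
    # Enumerate every (u, v) pair and keep the k with the smallest u + v.
    pairs = product(nums1, nums2)
    return [list(p) for p in heapq.nsmallest(k, pairs, key=sum)]

print(k_smallest_pairs_brute([1, 7, 11], [2, 4, 6], 5))
# [[1, 2], [1, 4], [1, 6], [7, 2], [7, 4]]
```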

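A faster approach:

Use a heap together with BFS, treating the indices of nums1 and nums2 as the two axes of a 2-D matrix. When (nums1[i], nums2[j]) is extracted as the current smallest pair, the only new candidates are its right and lower neighbors in the matrix; candidates pushed earlier also remain in the heap. For example, if (1, 0) is the current smallest pair, then (2, 0) and (1, 1) become candidates, because every cell beyond them is at least as large as one of the two.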
<img src="../images/heap+bfs.png" width="460"/>

In [16]:
def kSmallestPairs(nums1, nums2, k):
    candidates = []  # min heap of [sum, i, j]
    if nums1 and nums2:
        heapq.heappush(candidates, [nums1[0] + nums2[0], 0, 0])
    visited = {(0, 0)}  # a set gives O(1) membership tests
    res = []
    while candidates and len(res) < k:
        _, i, j = heapq.heappop(candidates)
        res.append([nums1[i], nums2[j]])
        # Push the two neighbors (i+1, j) and (i, j+1) if they exist and are new
        if len(nums1) > i+1 and (i+1, j) not in visited:
            heapq.heappush(candidates, [nums1[i+1]+nums2[j], i+1, j])
            visited.add((i+1, j))
        if len(nums2) > j+1 and (i, j+1) not in visited:
            heapq.heappush(candidates, [nums1[i]+nums2[j+1], i, j+1])
            visited.add((i, j+1))
    return res


nums1 = [1,7,11]
nums2 = [2,4,6]
k = 5
kSmallestPairs(nums1, nums2, k)

[[1, 2], [1, 4], [1, 6], [7, 2], [7, 4]]

# Heap Practice II #

### Ex.1 Merge K Sorted List   

Merge k sorted linked lists and return it as one sorted list. Analyze and describe its complexity.

# Solution:
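The input holds the heads of the k sorted lists.
1. Push (head.value, head) for each of the k lists onto a min heap. Each heap entry is (value, LLNode).
2. Keep a pointer cur. Pop the entry with the smallest value and set cur.next = node.
3. Advance with cur = cur.next so that cur sits on the node just popped.
4. Check cur.next to see whether that list is exhausted; if it is not, push cur.next onto the heap.
5. Repeat steps 2-4 until the heap is empty.

### Tips:
The heap stores (value, LLNode) because relinking through cur changes the structure of the lists, so the node that follows the current node has to be saved first. When two entries have equal values, the heap falls back to comparing the LLNode objects themselves, so the entry needs a tie-breaker (such as the list index) unless the node class defines `__lt__`.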

In [40]:
from LinkedList import LinkedList
from LinkedList import Node

def mergeKLists(lists):
    heap = []
    for i, node in enumerate(lists):
        if node is not None:
            # Entries are (value, list index, node). The index breaks ties between
            # equal values; without it heapq falls back to comparing the nodes and raises
            # TypeError: '<' not supported between instances of 'Node' and 'Node'
            heapq.heappush(heap, (node.value, i, node))
    dummy = Node()
    cur = dummy
    while heap:
        _, i, node = heapq.heappop(heap)
        cur.next = node
        cur = cur.next
        
        if cur.next:
            # Push the rest of this list before cur.next gets overwritten
            heapq.heappush(heap, (cur.next.value, i, cur.next))
    
    return dummy.next # skip the dummy head

lst1 = LinkedList()
lst1.add_last(1)
lst1.add_last(4)
lst1.add_last(5)

lst2 = LinkedList()
lst2.add_last(2)
lst2.add_last(3)
lst2.add_last(6)

lst3 = LinkedList()
lst3.add_last(3)
lst3.add_last(7)

lists = [lst1.head.next, lst2.head.next, lst3.head.next]
node = mergeKLists(lists)
result = LinkedList()

result.head.next = node
result.printlist() # the merged values are in the order 1, 2, 3, 3, 4, 5, 6, 7

### Ex.2 Find Median from Data Stream 

Median is the middle value in an ordered integer list. If the size of the list is even, there is no single middle value, so the median is the mean of the two middle values.

Examples: 

[2,3,4] , the median is 3

[2,3], the median is (2 + 3) / 2 = 2.5

Design a data structure that supports the following two operations:

void addNum(int num) - Add an integer from the data stream to the data structure.

double findMedian() - Return the median of all elements so far.

# Solution:
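Use two heaps, one min heap and one max heap. Keep their sizes equal or differing by one (either max is always the same size as min or one larger, or the other way around), and keep every value in min greater than or equal to every value in max. This splits the stream into two halves: the smaller half lives in max and the larger half in min. When the stream has an even number of elements, the median is (top(max) + top(min)) / 2. When it has an odd number, the median is top(max) if max is the heap allowed to be one larger, or top(min) otherwise.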

In [44]:
def findMedian(nums):
    maxHeap = []
    minHeap = []
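    for num in nums: # passing every new number through both heaps keeps the smaller half in maxHeap and the larger half in minHeap
        tmp = -heapq.heappushpop(minHeap, num) # maxHeap stores negated values
        heapq.heappush(maxHeap, tmp)
        if len(maxHeap) > len(minHeap): # keep minHeap the same size as maxHeap or one larger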
            heapq.heappush(minHeap, -heapq.heappop(maxHeap))
    if len(maxHeap) < len(minHeap):
        return heapq.heappop(minHeap)
    else:
        return (-heapq.heappop(maxHeap) + heapq.heappop(minHeap)) / 2.0


nums = [2,2,3,3]
findMedian(nums)
# Fixed one mistake: one heapq.heappop had been written as heappop

2.5
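
The problem statement asks for a data structure with `addNum` and `findMedian`, while the cell above processes a complete list in one call. The same two-heap logic can be wrapped in a class as sketched below. The class name `MedianFinder` is an assumption made here, and `findMedian` reads the heap tops without popping so it can be called repeatedly; it assumes at least one number has been added.

```python
import heapq

class MedianFinder:
    def __init__(self):
        self.maxHeap = []  # smaller half, stored as negated values
        self.minHeap = []  # larger half

    def addNum(self, num):
        # Pass num through minHeap, then move minHeap's smallest value to maxHeap.
        heapq.heappush(self.maxHeap, -heapq.heappushpop(self.minHeap, num))
        # Keep minHeap the same size as maxHeap or one larger.
        if len(self.maxHeap) > len(self.minHeap):
            heapq.heappush(self.minHeap, -heapq.heappop(self.maxHeap))

    def findMedian(self):
        if len(self.minHeap) > len(self.maxHeap):
            return float(self.minHeap[0])
        return (self.minHeap[0] - self.maxHeap[0]) / 2.0

mf = MedianFinder()
mf.addNum(2)
mf.addNum(3)
print(mf.findMedian())  # 2.5
mf.addNum(4)
print(mf.findMedian())  # 3.0
```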

### Ex.3 Manage Your Project (IPO)

You are given several projects. For each project i, it has a pure profit Pi and a minimum capital of Ci is needed to start the corresponding project. Initially, you have W capital. When you finish a project, you will obtain its pure profit and the profit will be added to your total capital.

To sum up, pick a list of at most k distinct projects from given projects to maximize your final capital, and output your final maximized capital.

Input: k=2, W=0, Profits=[1,2,3], Capital=[0,1,1]. 

Output: 4 

Explanation: Since your initial capital is 0, you can only start the project indexed 0. After finishing it you will obtain profit 1 and your capital becomes 1. With capital 1, you can either start the project indexed 1 or the project indexed 2. Since you can choose at most 2 projects, you need to finish the project indexed 2 to get the maximum capital. Therefore, output the final maximized capital, which is 0 + 1 + 3 = 4. 

# Solution:
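1. Store all projects as zip(Capital, Profits) pairs in a list named future.
2. Loop at most k times.
3. Move every project in future whose required capital is at most W into the max heap current, storing only its profit. Any project that is affordable now stays affordable later, because W only grows.
4. Pop the largest profit from current and add it to W.
5. Return W.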

In [48]:
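def findMaximizedCapital(k, W, Profits, Capital):
    future = sorted(zip(Capital, Profits))[::-1] # descending by capital, so the cheapest project sits at the end and is easy to pop
    current = []
    for i in range(k):
        while future and future[-1][0] <= W:
            # Only the profit is needed: assuming no project has a negative profit,
            # W never decreases, so an affordable project stays affordable.
            # heapq is a min heap, so negate the profit to get max-heap behavior.
            heapq.heappush(current, -future.pop()[1])
        if not current: # no affordable project is left
            break
        W -= heapq.heappop(current) # subtracting the negated profit adds it to W
    return W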

k=2
W=0
Profits=[1,2,3]
Capital=[0,1,1]

findMaximizedCapital(k, W, Profits, Capital)
# First mistake: W -= heapq.heappop(current) was placed inside the while loop

4