# Python Data Structure Basics


## List 
### Definition 
* Stores data elements based on a sequential, most commonly 0-based, index.
* Based on tuples from set theory (a tuple plus mutability gives a mutable tuple).
* They are one of the oldest, most commonly used data structures. 

### Key:
* Advantages: indexing
* Disadvantages: inserting (except `append`) and deleting (except `pop`).
* Linear data structures are the most basic; the linear array is the most fundamental structure.
    * Stack
    * Queue: First-In-First-Out (FIFO) principle
* Non-linear data structures
    * Graphs, e.g. `[[1,2], [3, 4]]`
    * Trees (in-order traversal), e.g. `[1,[2,4,5],3]`

### Time complexity (linear structure) (add, delete, update, search, sort)
* Append: O(1)
* Insertion: O(N) (`l[a:b] = ...`)
* Pop: O(1) (from the end)
* Delete: depends on index i; O(N) in the worst case
* Indexing: O(1)
* Search: traverse the list, O(N)
* Sort: O(NlogN)
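
The costs listed for a list can be seen on a plain Python `list`. This is a minimal sketch; the variable names and sample values are only illustrative.

```python
nums = [5, 3, 8]

nums.append(1)        # O(1) amortized: add at the end -> [5, 3, 8, 1]
nums.insert(0, 9)     # O(N): every later element shifts right -> [9, 5, 3, 8, 1]
last = nums.pop()     # O(1): remove from the end, returns 1
first = nums.pop(0)   # O(N): every later element shifts left, returns 9
del nums[1]           # O(N) in the worst case -> [5, 8]
second = nums[1]      # O(1): direct indexing, gives 8
found = 8 in nums     # O(N): linear search, gives True
nums.sort()           # O(N log N) -> [5, 8]
```

Operations at the end of the list are cheap because nothing has to move; operations near the front pay for shifting the remaining elements.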



## Linked list
### Definition:
* Stores data with nodes that point to other nodes.
    * A node, at its most basic, has one datum and one reference (to another node).
    * A linked list chains nodes together by pointing one node's reference to another node.

### Key: 
* Advantages: insertion and deletion
* Disadvantages: indexing and searching
* A doubly linked list has nodes that also reference the previous node, so each node holds two references: one to the previous node and one to the next.
* A circularly linked list is a simple linked list whose tail (the last node) references the head (the first node).
* Can be used to implement stacks and queues.

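### Time Complexity: (add, delete, update, search, sort)
* Add node: O(1) (at the head, or next to a node already in hand)
* Deletion: O(1) (under the same condition)
* Indexing: O(N)
* Search: traverse the linked list, O(N)
* Sort: O(NlogN)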
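
A singly linked list with these costs might be sketched as follows. The class and method names (`Node`, `SinglyLinkedList`, `add_first`, `remove_first`, `get`, `find`) are hypothetical, chosen only for illustration.

```python
class Node:
    """One datum plus a reference to the next node."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


class SinglyLinkedList:
    def __init__(self):
        self.head = None

    def add_first(self, data):
        # O(1): the new node simply points at the old head.
        self.head = Node(data, self.head)

    def remove_first(self):
        # O(1): move the head reference one node forward.
        if self.head is None:
            raise IndexError("remove from empty list")
        node = self.head
        self.head = node.next
        return node.data

    def get(self, index):
        # O(N): walk node by node until the index is reached.
        node = self.head
        for _ in range(index):
            if node is None:
                break
            node = node.next
        if node is None or index < 0:
            raise IndexError("index out of range")
        return node.data

    def find(self, target):
        # O(N): traverse the list; return the position or -1.
        node, position = self.head, 0
        while node is not None:
            if node.data == target:
                return position
            node, position = node.next, position + 1
        return -1


items = SinglyLinkedList()
for value in (3, 2, 1):
    items.add_first(value)   # list is now 1 -> 2 -> 3
```

Here `items.get(1)` has to step through the chain to reach the second node, while `items.add_first(...)` and `items.remove_first()` only touch the head reference.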

## Stack and Queue
* Stack
    * Last-In-First-Out (LIFO) concept
    * `append()` / `pop()`: add to / remove from the end
    * can be made from a **list**
    * can be made from a **linked list** by having the head be the only place for insertion and removal. Insert: create a new node and link it in front of the head node; remove: remove the head node, and the next node becomes the head.
        * See <链表实现stack和Queue> (linked-list implementation of stack and queue)
* Queue: First-In-First-Out (FIFO) principle
    * Lists are not efficient for implementing a queue
        * so use the container deque (double-ended queue, pronounced "deck") instead
        * `from collections import deque`, then `d = deque(iterable)`. If iterable is not specified, the new deque is empty.
        * `append(x)` / `appendleft(x)` / `pop()` / `popleft()`: O(1) at both ends; `extend(iterable)` / `extendleft(iterable)`: O(k) for k added items
    * can be made from a **linked list** that only removes from the head and adds to the tail
        * See <链表实现stack和Queue> (linked-list implementation of stack and queue)
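
As a minimal sketch of the two access patterns, a list can serve as the stack and a `collections.deque` as the queue. The variable names and sample values are illustrative.

```python
from collections import deque

# Stack (LIFO): a plain list, adding and removing at the end.
stack = []
stack.append("a")
stack.append("b")
top = stack.pop()          # "b": the last item in is the first out

# Queue (FIFO): a deque, adding at the right and removing from the left.
queue = deque()
queue.append("a")
queue.append("b")
front = queue.popleft()    # "a": the first item in is the first out
```

A list would also work as a queue through `pop(0)`, but that call shifts every remaining element and costs O(N), which is why the deque is preferred here.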
        

## Hash Table/Map (dictionary)
### Definition
* Stores data as key-value pairs.
* A hash function accepts a key and returns an output that is ideally unique to that specific key.
    * This is known as hashing: each input is mapped to an output, ideally in one-to-one correspondence, so the output can be used to locate the information.
    * The output of the hash function determines the address in memory where that data is stored.

### Key 
* Advantages: insertion, deletion, and searching
* Hash collisions occur when a hash function returns the **same output** for **two distinct inputs**.
    * Every hash function has this problem.
    * This is often mitigated by **making the hash table very large**.
* Hashes are important for **dictionaries** and **database indexing**.
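
The effect of table size on collisions can be illustrated with a toy example. Both `toy_hash` and `count_collisions` are hypothetical helpers, and "key modulo table size" is an assumed stand-in for a hash function, not how Python's `dict` hashes keys.

```python
def toy_hash(key, table_size):
    """Hypothetical hash function for integer keys: key modulo table size."""
    return key % table_size


def count_collisions(keys, table_size):
    """Count keys that land in a bucket already used by an earlier key."""
    used_buckets = set()
    collisions = 0
    for key in keys:
        bucket = toy_hash(key, table_size)
        if bucket in used_buckets:
            collisions += 1
        else:
            used_buckets.add(bucket)
    return collisions


keys = [3, 11, 19, 27]
small_table = count_collisions(keys, 8)    # every key maps to bucket 3
large_table = count_collisions(keys, 32)   # every key gets its own bucket
```

With 8 buckets, three of the four keys collide; with 32 buckets, none do. A larger table lowers the chance of collisions but does not rule them out for arbitrary inputs.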

### Time complexity (average case; add, delete, search)
* Store: O(1) `d[k] = v`
* Pop: O(1) `d.pop(k)` removes the pair for the key and returns its value
* Pop item: O(1) `d.popitem()` removes and returns the last pair
* Delete: O(1) `del d[k]`
* Search: O(1) `d[k]`
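
The dictionary operations listed here look like this in practice. This is a small sketch with made-up keys and values; the `popitem()` behavior shown assumes Python 3.7 or later, where the most recently inserted pair is removed.

```python
ages = {}
ages["ann"] = 31              # store
ages["bob"] = 25
ages["cy"] = 40

bob_age = ages["bob"]         # search by key, gives 25
removed = ages.pop("ann")     # removes the pair and returns its value, 31
last_pair = ages.popitem()    # removes the most recently inserted pair, ("cy", 40)
del ages["bob"]               # deletes the pair and returns nothing
has_bob = "bob" in ages       # membership test, now False
```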


# Efficient Sorting Basics


## Merge sort
### Definition
A comparison-based sorting algorithm  
  * Divides the entire dataset into groups of at most two.
  * Compares the numbers in each group one at a time, moving the smaller number to the left of the pair.
  * Once all pairs are sorted, compares the leftmost elements of the two leftmost pairs to create a sorted group of four, with the smallest numbers on the left and the largest on the right.
  * This process is repeated until there is only one set.
  
### Key
* This is one of the most basic sorting algorithms.
* Know that it divides all the data into the smallest possible sets, then compares and merges them.

### Time complexity
* Best Case Sort: O(NlogN) (O(N) only for the natural merge sort variant on already sorted input)
* Average Case Sort: O(NlogN)
* Worst Case Sort: O(NlogN)


In [None]:
# Basic implementation
# edge cases: list is None / empty / has only one element
class SortList:
    # Merge sort: sorts arr in place and returns it
    def mergeSort(self, arr):
        # Edge cases: nothing to sort, so return the input unchanged
        if arr is None or len(arr) <= 1:
            return arr

        # Divide: split the array at its midpoint
        mid = len(arr) // 2
        L = arr[:mid]  # copy of the first half
        R = arr[mid:]  # copy of the second half

        # Recursive divide: sort each half in place
        self.mergeSort(L)
        self.mergeSort(R)

        # Merge: copy the smaller front element of L or R back into arr
        i = j = k = 0
        while i < len(L) and j < len(R):
            if L[i] <= R[j]:  # <= keeps equal elements in order (stable)
                arr[k] = L[i]
                i += 1
            else:
                arr[k] = R[j]
                j += 1
            k += 1

        while i < len(L):  # copy whatever is left in L
            arr[k] = L[i]
            k += 1
            i += 1

        while j < len(R):  # copy whatever is left in R
            arr[k] = R[j]
            k += 1
            j += 1

        return arr


x = SortList()
x.mergeSort([2, 1, 0])  # returns [0, 1, 2]

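### Understanding the algorithm
See the figure Merge-Sort.png  
1. Divide: split arr into two parts
2. Recursive divide: keep splitting the parts down to the smallest unit
3. Recursive merge: sort and merge the two arrays

Core (main characteristics):
1. Sorting (the result is ordered)
2. Merging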


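### Problem types
* Merge sort problems  
What differs: the form of the input changes, and an ordered merge is required  
Approach:  
* Think about the input data structure and derive the new edge cases
* Think about how to divide and how to merge
* Think about the form of the output

Problem list: 23. Merge k Sorted Lists.py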
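
As a rough sketch of how the divide-and-merge idea carries over to a "merge k sorted lists" problem, the version below works on plain Python lists. That is an assumption made for brevity: the referenced problem file may use linked lists instead, and `merge_two` and `merge_k` are hypothetical names, not taken from that file.

```python
def merge_two(left, right):
    """Merge two sorted lists into one new sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # at most one of these two
    merged.extend(right[j:])   # still has elements left
    return merged


def merge_k(lists):
    """Merge k sorted lists by halving the group of lists and merging the halves."""
    if not lists:              # new edge case: no lists at all
        return []
    if len(lists) == 1:        # smallest unit: a single sorted list
        return list(lists[0])
    mid = len(lists) // 2
    return merge_two(merge_k(lists[:mid]), merge_k(lists[mid:]))


result = merge_k([[1, 4, 5], [1, 3, 4], [2, 6]])
```

The division step now splits the collection of lists rather than a single array, while the merge step is the same two-pointer merge used in merge sort.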
 


## Quick sort
### Definition
### Key
### Time complexity


## Bubble sort
### Definition
### Key
### Time complexity


