# Solution Notebook

## Problem: Implement bucket sort.

* [Constraints](#Constraints)
* [Test Cases](#Test-Cases)
* [Algorithm](#Algorithm)
* [Code](#Code)
* [Pythonic-Code](#Pythonic-Code)
* [Unit Test](#Unit-Test)

## Constraints

* Is a naive solution sufficient (i.e., not in-place)?
    * Yes
* Are duplicates allowed?
    * Yes
* Can we assume the input is valid?
    * No
* Can we assume this fits in memory?
    * Yes

## Test Cases

* None -> Exception
* Empty input -> []
* One element -> [element]
* Two or more elements

## Algorithm

Animation (from WeChat):
![Bucket sort animation](https://mmbiz.qpic.cn/mmbiz_gif/D67peceibeIRxSzm8QgeCjyuoZkKQTwmH2JMcBXRPyZGWFJnslhoGNFyyhS7q0wP23CfTlGY6vwkvjG4GoklERw/640?wx_fmt=gif&wxfrom=5&wx_lazy=1)


Complexity:
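* Time: O(n + k) average, O(n + k) best, O(n^2) worst, where n is the number of elements and k is the number of buckets. The worst case occurs when most elements land in the same bucket.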
* Space: O(n + k)
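
To make the gap between the average and worst case concrete, here is a minimal sketch that counts the comparisons made by the per-bucket insertion step. The helper name `count_bucket_comparisons` is hypothetical and not part of this notebook; it assumes integer input and uses a left-to-right scan like the one in the Code section.

```python
def count_bucket_comparisons(data, bucket_size):
    """Bucket-sort integers and report how much work the insertion step did.

    Returns (sorted_list, comparisons, bucket_count).
    """
    low = min(data)
    bucket_count = (max(data) - low) // bucket_size + 1
    buckets = [[] for _ in range(bucket_count)]
    comparisons = 0
    for value in data:
        bucket = buckets[(value - low) // bucket_size]
        pos = 0
        # Left-to-right scan for the first element >= value
        while pos < len(bucket):
            comparisons += 1
            if value <= bucket[pos]:
                break
            pos += 1
        bucket.insert(pos, value)
    merged = [v for bucket in buckets for v in bucket]
    return merged, comparisons, bucket_count


data = list(range(1, 101))  # 100 ascending integers
for size in (1, 10, 1000):
    merged, comparisons, bucket_count = count_bucket_comparisons(data, size)
    print(size, bucket_count, comparisons)
# 1 100 0
# 10 10 450
# 1000 1 4950
```

With a bucket size of 1000 everything falls into one bucket and the insertion scan does n(n-1)/2 = 4950 comparisons, which is the O(n^2) behaviour. With a bucket size of 1 there are no comparisons at all, but k grows to 100 buckets, which is where the `k` term in both the time and space bounds comes from.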


See [Bucket sort on Wikipedia (Chinese)](https://zh.wikipedia.org/wiki/%E6%A1%B6%E6%8E%92%E5%BA%8F).


See: [A deeper look at bucket sort](https://mp.weixin.qq.com/s?__biz=MzUyNjQxNjYyMg==&mid=2247484058&idx=1&sn=a7aa08a7decbba40d8af7b3c6a62cf5a&chksm=fa0e6d1bcd79e40d6ec21c64efb2115b3ffa51b3128814e0702f35682b7c85e4c5edb03e3ad1&scene=21#wechat_redirect)
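
The distribution step that the animation illustrates can be sketched as follows. This is only an illustration under stated assumptions: the function name `scatter` is hypothetical, the input is assumed to be a non-empty list of integers, and the built-in `sorted` stands in for the per-bucket insertion used in the Code section.

```python
def scatter(data, bucket_size=5):
    """Distribute integers into buckets without sorting inside them."""
    low = min(data)
    # One bucket per bucket_size-wide slice of the range [min, max]
    bucket_count = (max(data) - low) // bucket_size + 1
    buckets = [[] for _ in range(bucket_count)]
    for value in data:
        buckets[(value - low) // bucket_size].append(value)
    return buckets


buckets = scatter([5, 1, 7, 2, 6, -3, 5, 7, -1])
print(buckets)  # [[1, -3, -1], [5, 2, 6, 5], [7, 7]]

# Every value in bucket i is <= every value in bucket i + 1, so sorting
# each bucket and concatenating them in order yields the sorted list.
result = [v for bucket in buckets for v in sorted(bucket)]
print(result)  # [-3, -1, 1, 2, 5, 5, 6, 7, 7]
```

Subtracting the minimum before dividing is what lets negative values map to valid, non-negative bucket indices.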

## Code

In [88]:
from __future__ import division
import math 

DEFAULT_BUCKET_SIZE = 5

class BucketSort(object):

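    def sort(self, data, bucketSize=DEFAULT_BUCKET_SIZE):

        if data is None:
            raise TypeError('data cannot be None')
        # A zero or negative bucket size would break the index arithmetic below
        if bucketSize <= 0:
            raise ValueError('bucketSize must be positive')
        if len(data) <= 1:
            return data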
        
        len_data = len(data)
        max_digit = max(data)
        min_digit = min(data)
        
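        # Split the value range into bucket_number buckets, each covering
        # bucketSize consecutive values. Floor division yields an int on
        # both Python 2 and 3 (math.floor returns a float on Python 2,
        # which range() rejects).
        bucket_number = (max_digit - min_digit) // bucketSize + 1
        buckets = [[] for _ in range(bucket_number)]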

        
        for data_value in data:
            # Find the index of the bucket that holds this value
            index = (data_value - min_digit) // bucketSize
            # Insert the value into that bucket, keeping the bucket sorted
            self._bucket_sort(buckets[index], data_value)
            
        # print(buckets)
        # Concatenate the buckets in order to build the sorted result
        new_data = []
        for buck in buckets:
            new_data += buck
                
        return new_data
            
    def _bucket_sort(self, bucket, data_value):
        # Insert data_value before the first element that is >= it,
        # so the bucket stays sorted.
        for i in range(len(bucket)):
            if data_value <= bucket[i]:
                bucket.insert(i, data_value)
                return
        # The bucket is empty or data_value is larger than every element
        bucket.append(data_value)
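
## Pythonic-Code

One possible, more compact variant is sketched below. It is not taken from the original notebook: the function name `bucket_sort` and the parameter name `bucket_size` are assumptions made for illustration, and it relies on the standard-library `bisect.insort` for the per-bucket insertion and `itertools.chain.from_iterable` for the merge. Like the class above, it assumes integer input.

```python
import bisect
from itertools import chain


def bucket_sort(data, bucket_size=5):
    """Return a new sorted list of the integers in `data`."""
    if data is None:
        raise TypeError('data cannot be None')
    if bucket_size <= 0:
        raise ValueError('bucket_size must be positive')
    if len(data) <= 1:
        return list(data)

    low = min(data)
    bucket_count = (max(data) - low) // bucket_size + 1
    buckets = [[] for _ in range(bucket_count)]
    for value in data:
        # bisect.insort keeps each bucket sorted as values arrive
        bisect.insort(buckets[(value - low) // bucket_size], value)
    # Buckets are already ordered by value range, so chaining them sorts the data
    return list(chain.from_iterable(buckets))


print(bucket_sort([5, 1, 7, 2, 6, -3, 5, 7, -1]))
# [-3, -1, 1, 2, 5, 5, 6, 7, 7]
```

`bisect.insort` finds the insertion point with a binary search, although the list insertion itself is still linear in the bucket length, so the worst-case bound is unchanged.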

        
                
   

## Unit Test



In [89]:
# %%writefile test_bucket_sort.py
from nose.tools import assert_equal, assert_raises


class TestBucketSort(object):

    def test_bucket_sort(self):
        bucket_sort = BucketSort()
        
        print('None input')
        assert_raises(TypeError, bucket_sort.sort, None)

        print('Empty input')
        assert_equal(bucket_sort.sort([]), [])

        print('One element')
        assert_equal(bucket_sort.sort([5]), [5])

        print('Two or more elements， have negative')
        data = [5, 1, 7, 2, 6, -3, 5, 7, -1]
        assert_equal(bucket_sort.sort(data), sorted(data))
        print('Success negative data: test_bucket_sort\n')
        
        print('Two or more elements')
        data1 = [5, 1, 7, 2, 6, 3, 5, 7, 10]
        assert_equal(bucket_sort.sort(data1), sorted(data1))
        print('Success: test_bucket_sort\n')


def main():
    test = TestBucketSort()
    test.test_bucket_sort()


if __name__ == '__main__':
    main()

None input
Empty input
One element
Two or more elements， have negative
[[-3, -1, 1], [2, 5, 5, 6], [7, 7]]
Success negative data: test_bucket_sort

Two or more elements
[[1, 2, 3, 5, 5], [6, 7, 7, 10]]
Success: test_bucket_sort



In [47]:
%run -i test_bucket_sort.py

None input
Empty input
One element
Two or more elements， have negative
Success negative data: test_count_sort

Two or more elements
Success: test_count_sort
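
Beyond the fixed cases above, a randomized comparison against the built-in `sorted` can catch edge cases such as wide ranges or many duplicates. The helper below is a hypothetical sketch, not part of the original test file; it accepts any sorting callable, and the built-in `sorted` is passed here only so the block runs on its own. In this notebook one would presumably pass `BucketSort().sort` instead.

```python
import random


def check_sort_function(sort_fn, trials=200, seed=0):
    """Compare sort_fn with sorted() on random integer lists.

    Returns the number of trials run; raises AssertionError on a mismatch.
    """
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(trials):
        size = rng.randint(0, 30)
        data = [rng.randint(-50, 50) for _ in range(size)]
        expected = sorted(data)
        # Pass a copy so an in-place sort cannot alter the reference input
        actual = sort_fn(list(data))
        assert actual == expected, 'mismatch for input %r' % (data,)
    return trials


print(check_sort_function(sorted))  # 200
```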

