## **Optimized, Production-Quality Rust Implementation of Bucket Sort**

### **🔹 Overview**

Bucket Sort is a **distribution-based sorting algorithm** that:

1. **Distributes** elements into multiple buckets based on value range.
2. **Sorts** each bucket individually (typically using Insertion Sort).
3. **Concatenates** sorted buckets into the final sorted array.

✅ **Best suited for floating-point data that is roughly uniformly distributed over a known range.**  
✅ **Runs in expected O(N) time when the input is uniform and the number of buckets grows with N.**

---

## **📌 Optimized Rust Code**

```rust
/// Performs Bucket Sort on a slice and returns a sorted vector.
/// Assumes every value lies in the half-open range [0.0, 1.0); the bucket index is `floor(value * n)`.
/// May panic if the input contains NaN, because `partial_cmp` then returns `None`.
pub fn bucket_sort(arr: &[f64]) -> Vec<f64> {
    if arr.is_empty() {
        return vec![];
    }

    let num_buckets = arr.len();
    let mut buckets: Vec<Vec<f64>> = vec![vec![]; num_buckets];

    // Step 1: Distribute elements into buckets
    for &num in arr {
        let index = (num * num_buckets as f64) as usize;
        buckets[index.min(num_buckets - 1)].push(num);
    }

    // Step 2: Sort individual buckets
    for bucket in &mut buckets {
        // Standard-library stable sort (not a hand-written Insertion Sort); the unwrap panics on NaN.
        bucket.sort_by(|a, b| a.partial_cmp(b).unwrap());
    }

    // Step 3: Concatenate sorted buckets
    buckets.into_iter().flatten().collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_bucket_sort() {
        let arr = vec![0.42, 0.32, 0.23, 0.52, 0.25, 0.47, 0.51];
        assert_eq!(bucket_sort(&arr), vec![0.23, 0.25, 0.32, 0.42, 0.47, 0.51, 0.52]);

        let arr = vec![0.9, 0.1, 0.5, 0.7, 0.3];
        assert_eq!(bucket_sort(&arr), vec![0.1, 0.3, 0.5, 0.7, 0.9]);

        let arr = vec![0.1, 0.1, 0.1, 0.1];
        assert_eq!(bucket_sort(&arr), vec![0.1, 0.1, 0.1, 0.1]);

        let arr = vec![];
        assert_eq!(bucket_sort(&arr), vec![]);

        let arr = vec![0.8];
        assert_eq!(bucket_sort(&arr), vec![0.8]);
    }
}
```

---

## **📊 Time & Space Complexity Analysis**

### **Time Complexity**

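| Case             | Complexity            | Explanation                                                                                                                                                       |
| ---------------- | --------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Best Case**    | **O(N)**              | Uniform distribution: placing the elements is O(N), and each of the N buckets holds O(1) elements, so sorting all buckets is O(N) in total.                       |
| **Average Case** | **O(N + N²/K + K)**   | With K buckets and uniform input, each bucket holds about N/K elements, so Insertion Sort costs N²/K overall. With K = N, as in the code above, this is O(N).     |
| **Worst Case**   | **O(N log N)**        | If all elements land in one bucket, that bucket's sort dominates: O(N log N) with the standard-library sort used above, or O(N²) if Insertion Sort is used.       |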

✅ **Ideal when input data is uniformly distributed.**  
❌ **Degrades to O(N log N) for skewed distributions.**
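
To make the effect of skew concrete, the following minimal sketch counts how many elements land in each bucket. It assumes the same `floor(x * n)` index as `bucket_sort`; the helper name `bucket_occupancy` is hypothetical and not part of the implementation above.

```rust
/// Hypothetical helper: counts the elements per bucket, using
/// `arr.len()` buckets and the index `floor(x * n)` clamped to `n - 1`.
pub fn bucket_occupancy(arr: &[f64]) -> Vec<usize> {
    let n = arr.len();
    let mut counts = vec![0usize; n];
    for &x in arr {
        // The float-to-integer cast saturates, so negative values map to bucket 0.
        let index = ((x * n as f64) as usize).min(n.saturating_sub(1));
        counts[index] += 1;
    }
    counts
}
```

An evenly spread input such as `[0.1, 0.35, 0.6, 0.85]` puts one element in each bucket, while `[0.1, 0.1, 0.1, 0.9]` puts three of its four elements in bucket 0, so most of the work falls on a single per-bucket sort.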

### **Space Complexity**

| Storage             | Complexity | Reason                                           |
| ------------------- | ---------- | ------------------------------------------------ |
| **Extra Buckets**   | **O(N)**   | Requires separate storage for each bucket.       |
| **Recursive Calls** | **O(1)**   | Uses iterative approach (no recursion overhead). |

✅ **Requires extra space for buckets but avoids deep recursion overhead.**  
❌ **More memory-intensive than Quick Sort, which sorts in place and needs only O(log N) stack space.**

---

## **📌 Algorithm Explanation**

### **Core Idea**

1. **Bucket Allocation:**

   - Divide the input range into **N buckets**.
   - Assign each element to its respective bucket using an **indexing function** such as `floor(value × N)` for values in `[0, 1)`.

2. **Sorting Each Bucket:**

   - Sort each bucket individually. **Insertion Sort** is the textbook choice because buckets are small; the code above calls the standard library's `sort_by` instead.

3. **Concatenation:**
   - Combine all sorted buckets into a single sorted array.
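
The indexing function in step 1 only works as written for values in `[0, 1)`. A minimal sketch of how it might be generalized to an arbitrary range follows; the function name `bucket_index` and the `min`/`max` parameters are assumptions made for illustration, not part of the implementation above.

```rust
/// Hypothetical generalization of the indexing step: maps `value` from the
/// closed range [min, max] to a bucket index in `0..num_buckets`.
pub fn bucket_index(value: f64, min: f64, max: f64, num_buckets: usize) -> usize {
    assert!(num_buckets > 0, "at least one bucket is required");
    // A degenerate range (all values equal) needs only the first bucket.
    if !(max > min) {
        return 0;
    }
    // Scale into [0, 1], then into [0, num_buckets].
    let scaled = (value - min) / (max - min);
    // `value == max` would yield `num_buckets`, so clamp to the last bucket.
    ((scaled * num_buckets as f64) as usize).min(num_buckets - 1)
}
```

With `min = 10.0`, `max = 20.0` and five buckets, `15.0` maps to bucket 2 and `20.0` is clamped into bucket 4. Finding `min` and `max` costs one extra O(N) pass over the input.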

### **Why Use Bucket Sort?**

- **Best for floating-point numbers or uniformly distributed data.**
- **Beats O(N log N) comparison sorts when the input is close to uniform.**
- **With an O(N log N) sort inside each bucket, the worst case stays at O(N log N) instead of the O(N²) of naive Quick Sort.**

---

## **📌 Example Walkthrough**

#### **Input:** `[0.42, 0.32, 0.23, 0.52, 0.25, 0.47, 0.51]`

##### **Step 1: Bucket Allocation**

With 7 elements the code creates 7 buckets and places each value at index `floor(value × 7)`:

```
Bucket 0: []
Bucket 1: [0.23, 0.25]
Bucket 2: [0.42, 0.32]
Bucket 3: [0.52, 0.47, 0.51]
Bucket 4: []
Bucket 5: []
Bucket 6: []
```

##### **Step 2: Sort Each Bucket**

```
Bucket 1: [0.23, 0.25] (Sorted)
Bucket 2: [0.32, 0.42] (Sorted)
Bucket 3: [0.47, 0.51, 0.52] (Sorted)
Buckets 0, 4, 5, 6: [] (Empty)
```

##### **Step 3: Concatenation**

```
Final Sorted Output: [0.23, 0.25, 0.32, 0.42, 0.47, 0.51, 0.52]
```

---

## **🛠 Edge Cases Considered**

✅ **Empty array (`[]`)** → Returns `[]`.  
✅ **Single-element array (`[0.8]`)** → Returns `[0.8]`.  
✅ **Sorted input (`[0.1, 0.2, 0.3, 0.4]`)** → Efficiently handled in O(N).  
✅ **Reverse sorted input (`[0.9, 0.8, 0.7, 0.6]`)** → Correctly sorted.  
✅ **Contains duplicates (`[0.5, 0.5, 0.5]`)** → Preserves duplicates.  
✅ **Uniform distribution (`[0.1, 0.9, 0.2, 0.8]`)** → Works optimally.  
❌ **Highly skewed distribution (`[0.1, 0.1, 0.1, 0.9]`)** → Can degrade performance.
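
Two further edge cases are worth checking before sorting: NaN values and values outside `[0, 1)`. Because Rust's float-to-integer cast saturates and the index is clamped, out-of-range values simply pile up in the first or last bucket, while NaN can make the `partial_cmp(...).unwrap()` call panic. The sketch below shows one possible pre-check; `InputError` and `validate_unit_interval` are hypothetical names introduced only for illustration.

```rust
/// Hypothetical error type describing why an input was rejected.
#[derive(Debug, PartialEq)]
pub enum InputError {
    NotANumber { position: usize },
    OutOfRange { position: usize, value: f64 },
}

/// Hypothetical pre-check: accepts the slice only if every value is a
/// non-NaN number in the half-open range [0.0, 1.0).
pub fn validate_unit_interval(arr: &[f64]) -> Result<(), InputError> {
    for (position, &value) in arr.iter().enumerate() {
        if value.is_nan() {
            return Err(InputError::NotANumber { position });
        }
        if !(0.0..1.0).contains(&value) {
            return Err(InputError::OutOfRange { position, value });
        }
    }
    Ok(())
}
```

A caller could run this check first and fall back to a general-purpose sort when it fails, at the cost of one extra O(N) pass.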

---

## **🔹 DSA Tags**

- **Sorting**
- **Bucket Sort**
- **Distribution-based Sorting**
- **Divide and Conquer**

---

## **📈 Constraints & Scalability**

✅ **Handles large datasets efficiently (O(N) complexity in best cases).**  
✅ **Scalable for large floating-point datasets.**  
❌ **Struggles with non-uniform distributions.**  
✅ **Good for parallel execution due to independent bucket processing.**

---

## **🚀 Follow-up Enhancements**

### **1️⃣ Dynamic Bucket Count**

- Use **adaptive bucket sizing** to handle non-uniform distributions.
- Estimate bucket count using **statistical analysis**.
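
One possible reading of adaptive bucket sizing is to pick bucket boundaries from a sorted sample of the data, as sample sort does, so that dense regions get narrower buckets. The sketch below is an assumption-laden illustration rather than a tuned implementation: the name `sample_bucket_sort`, the `num_buckets` parameter, and the choice of four samples per bucket are all hypothetical.

```rust
/// Hypothetical sketch: bucket boundaries ("splitters") come from a sorted
/// sample of the input instead of fixed-width ranges.
pub fn sample_bucket_sort(arr: &[f64], num_buckets: usize) -> Vec<f64> {
    if arr.len() < 2 || num_buckets < 2 {
        let mut sorted = arr.to_vec();
        sorted.sort_by(f64::total_cmp);
        return sorted;
    }

    // Deterministic sample: every `step`-th element (assumed: ~4 samples per bucket).
    let sample_size = (num_buckets * 4).min(arr.len());
    let step = arr.len() / sample_size; // always >= 1
    let mut sample: Vec<f64> = arr.iter().step_by(step).copied().collect();
    sample.sort_by(f64::total_cmp);

    // num_buckets - 1 splitters taken at evenly spaced positions of the sample.
    let splitters: Vec<f64> = (1..num_buckets)
        .map(|i| sample[i * sample.len() / num_buckets])
        .collect();

    let mut buckets: Vec<Vec<f64>> = vec![Vec::new(); num_buckets];
    for &x in arr {
        // Bucket index = number of splitters that are <= x (binary search).
        let index = splitters.partition_point(|s| s.total_cmp(&x).is_le());
        buckets[index].push(x);
    }

    for bucket in &mut buckets {
        bucket.sort_by(f64::total_cmp);
    }
    buckets.into_iter().flatten().collect()
}
```

The binary search makes distribution O(N log K) instead of O(N), which is the price paid for boundaries that follow the data. Many identical values still end up in the same bucket, so heavy duplication remains a weak spot.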

### **2️⃣ Parallel Bucket Sorting**

- Implement **multi-threading** to sort buckets in parallel.
- Use **Rayon** (for example, `par_iter_mut()` over the buckets) to improve performance.
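
Since buckets are independent, the per-bucket sort can be split across threads. To stay free of external crates, the sketch below swaps in the standard library's `std::thread::scope` in place of Rayon; the function name `sort_buckets_parallel` and the `num_threads` parameter are hypothetical, and the input is assumed to be NaN-free.

```rust
use std::thread;

/// Hypothetical sketch: sorts every bucket, giving each thread a
/// contiguous group of buckets. Uses scoped threads from std, not Rayon.
pub fn sort_buckets_parallel(buckets: &mut [Vec<f64>], num_threads: usize) {
    if buckets.is_empty() {
        return;
    }
    let threads = num_threads.max(1);
    // Ceiling division so that every bucket belongs to exactly one group.
    let group_size = (buckets.len() + threads - 1) / threads;

    thread::scope(|scope| {
        for group in buckets.chunks_mut(group_size) {
            // Each thread gets exclusive access to its own group of buckets.
            scope.spawn(move || {
                for bucket in group {
                    bucket.sort_by(f64::total_cmp);
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
}
```

Spawning threads has a fixed cost, so this is only likely to pay off when the buckets are large. With skewed input a single oversized bucket still limits the speedup.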

### **3️⃣ In-Place Bucket Sort**

- Modify algorithm to sort elements **in-place** to reduce memory usage.
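
One way to approach this is to permute the elements into their bucket regions inside the original slice, in the style of American flag sort, and then sort each region. The sketch below is a hedged illustration: `bucket_sort_in_place` is a hypothetical name, values are assumed to be NaN-free and in `[0, 1)`, and the routine still allocates O(K) integers for bucket offsets, so it removes the per-bucket vectors but is not strictly O(1) in extra space.

```rust
/// Hypothetical sketch: bucket sort that rearranges `arr` itself instead of
/// copying elements into per-bucket vectors.
pub fn bucket_sort_in_place(arr: &mut [f64]) {
    let n = arr.len();
    if n < 2 {
        return;
    }
    let index_of = |x: f64| ((x * n as f64) as usize).min(n - 1);

    // Count elements per bucket, then turn counts into region boundaries.
    let mut bounds = vec![0usize; n + 1];
    for &x in arr.iter() {
        bounds[index_of(x) + 1] += 1;
    }
    for b in 0..n {
        bounds[b + 1] += bounds[b];
    }
    // Bucket b owns arr[bounds[b]..bounds[b + 1]].
    let ends: Vec<usize> = bounds[1..].to_vec();
    let mut heads: Vec<usize> = bounds[..n].to_vec();

    // Move every element into its bucket's region by swapping.
    for b in 0..n {
        while heads[b] < ends[b] {
            let target = index_of(arr[heads[b]]);
            if target == b {
                heads[b] += 1; // already in the right region
            } else {
                // Send the element to the next free slot of its own bucket.
                arr.swap(heads[b], heads[target]);
                heads[target] += 1;
            }
        }
    }

    // Sort each bucket's region in place.
    let mut start = 0;
    for &end in &ends {
        arr[start..end].sort_by(f64::total_cmp);
        start = end;
    }
}
```

Every loop iteration of the permutation phase advances one bucket head, so that phase performs at most N steps.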

---

## **🎯 Real-World Applications**

✅ **Sorting floating-point numbers (e.g., GPA scores, probabilities).**  
✅ **Histogram-based data processing (e.g., image processing, density estimation).**  
✅ **Used in external sorting when dealing with large-scale datasets (e.g., distributed databases).**  
✅ **Employed in networking (e.g., packet scheduling based on priority buckets).**

---

## **✅ Final Verdict**

✅ **Best for:** **Floating-point numbers, uniform distributions, parallel sorting.**  
❌ **Not ideal for:** **Highly skewed distributions, integer sorting (Radix Sort is better).**  
💡 **Use Counting Sort for small integer ranges, Quick Sort for general-purpose sorting.** 🚀


