IndexSort

Description:

Reference implementation for IndexSort algorithm and compare the runtime with quick sort and the linear time counting sort and bin sort.

QuickSort: https://en.wikipedia.org/wiki/Quicksort
CountingSort: https://opendatastructures.org/ods-java/11_2_Counting_Sort_Radix_So.html
BinSort: https://opendsa-server.cs.vt.edu/ODSA/Books/CS3/html/BinSort.html

Index sort is an sorting algorithm what works on set of distinct natural numbers and can deliver better runtime performance than QuickSort, CountingSort and BinSort in many cases. This is not a generic sorting algorithm but works very effectively with positive unique integer values what is used in many business scenario. It is based on the bin (bucket) sorting algorithm but utilize the unique numbers in the sortable array.

How to run reference implementation:

Download the /bin/IndexSorter.jar
Call: java -jar IndexSorter.jar param1 param2 param3 param4

Parameters:
param1: number of samples to be generated (like 5)
param2: size of the array what will be sorted (like 100)
param3: range of the natural numeric numbers (like 1000)
param4: print statistic only and skip the array printouts. (like true)

Algorithm with pseudocode:

array a; // sortable array
array b; // helper array
n = length(a) // length of the sortable array
max = 0 // store the maximum value in the sortable array
min = infinite // store the minimum value in the sortable array
Loop i = 1 .. n { if a[i] > max then max = a[i]; if a[i] < min then min = a[i]; } // look for maximum and minimum
b = new array[max] // helper array creation
Loop i = 1 .. n { b[a[i]] = true } // set the true by value as index
pos = 0
Loop i = min .. max {if (b[i]) then a[pos++] = i} // rewrite the sortable array with position pointer

Algorithm execution by a sample

Sample to sort [8, 5, 3, 7]
Create helper boolean array where length of the array is the maximum number in the sample (8):
[false, false, false, false, false, false, false, false]
Use the number from input array as index and set true by this index in the helper array.
[false, false, true, false, true, false, true, true]
Loop over the helper array and write back to the index value to the original array left to right if the helper array contains true in the index position.
[3, 5, 7, 8]

Formal analysis:

Sort an array of n distinct elements, quicksort takes O( n log n ) time in best case and O ( n2 ) in worst case.
Index sorting needs three loops over the the array.

Selecting the maximum and minimum element: n step.
Setting the helper array value: n step.
Writing back the index values to original: (max - min) step, where max and min are the maximum and minimum values in the array.

The time is always linear and O ( 2n + max ) in every case.
n = 1000 then QuickSort needs 3000 steps in best case. IndexSort needs 2000 + (maximum - minimum value) steps in every case.
CountingSort has O (n + max) what is comparable with IndexSort but IndexSort on random datasets shows better performance.

Statistics:
Index sort shows the best execution time in case of big array sizes. There is no big exection time difference in the small arrays. Index sort delivers always better results than BinSort in any sample size

Number of Sample	Size of array	Number intervall 1..n	QuickSort Wins	IndexSort Wins	CountingSort Wins
100	10	100	81	16	3
100	10	100	92	6	2
100	10	100	89	8	3
100	100	100	38	55	7
100	100	100	50	45	5
100	100	100	46	50	4
100	100	1000	51	43	6
100	100	1000	46	44	10
100	100	1000	48	46	6
100	1000	1000	3	65	32
100	1000	1000	3	64	33
100	1000	1000	2	76	22
100	1000	10000	7	81	12
100	1000	10000	8	77	15
100	1000	10000	6	81	13
100	10000	10000	0	96	4
100	10000	10000	0	92	8
100	10000	10000	0	99	1
100	10000	100000	0	100	0
100	10000	100000	0	98	2
100	10000	100000	0	100	0
10	100000	1000000	0	10	0

Disadvantages and improvement oportunities

It works only with distinct values in the array.
Runtime depends on the distance between maximum and minimum value in the array.
It can work with negative numbers with transforming the values to positive by adding the minimum value to each element.
It can work with real numbers with transforming the values to integer whole number with multiplication by 10s.
Better representation of the helper array to avoid big size array (memory usage and loop size). Current represenation is good for natural business numbers becasue the values in the helper array are 1 bits in memory only.

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
.github/workflows		.github/workflows
.settings		.settings
bin		bin
src/com/holczmann		src/com/holczmann
.classpath		.classpath
.gitattributes		.gitattributes
.project		.project
IndexSorter.pom		IndexSorter.pom
LICENSE		LICENSE
README.md		README.md
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

IndexSort

About

Releases

Packages

Languages

License

holczmannb/IndexSorter

Folders and files

Latest commit

History

Repository files navigation

IndexSort

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages