# Determine Max S Value

In this Jupyter Notebook, we explain in detail how we determine the maximum S value, so that we can set the range of S values for plotting the graph of `Average Key Comparisons vs. S Values`.

First, we set the minimum value of S to 2, since an array with one element or fewer does not need sorting. For the maximum, we start with a deliberately large value of 1000 and narrow it down step by step: insertion sort on sub-arrays of up to 1000 elements is expected to be slow, so the best S should lie well below this bound.
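
The `hybrid_sort_key_cmp` module itself is not shown in this notebook. The following is a minimal sketch of what a hybrid merge/insertion sort that counts key comparisons could look like, assuming S is the sub-array size at or below which insertion sort takes over. The function names (`insertion_sort`, `merge`, `hybrid_sort`) are hypothetical and are not the API of `HybridSortKeyCmp`.

```python
def insertion_sort(arr, lo, hi):
    """Sort arr[lo..hi] in place and return the number of key comparisons."""
    cmps = 0
    for i in range(lo + 1, hi + 1):
        key = arr[i]
        j = i - 1
        while j >= lo:
            cmps += 1  # one comparison between key and arr[j]
            if arr[j] > key:
                arr[j + 1] = arr[j]  # shift the larger element right
                j -= 1
            else:
                break
        arr[j + 1] = key
    return cmps


def merge(arr, lo, mid, hi):
    """Merge sorted arr[lo..mid] and arr[mid+1..hi]; return key comparisons."""
    left = arr[lo:mid + 1]
    right = arr[mid + 1:hi + 1]
    i = j = 0
    k = lo
    cmps = 0
    while i < len(left) and j < len(right):
        cmps += 1
        if left[i] <= right[j]:
            arr[k] = left[i]
            i += 1
        else:
            arr[k] = right[j]
            j += 1
        k += 1
    # At most one of the two halves still has elements; copy them over.
    remaining = left[i:] + right[j:]
    arr[k:k + len(remaining)] = remaining
    return cmps


def hybrid_sort(arr, lo, hi, s):
    """Assumed scheme: insertion sort for sizes <= s, merge sort otherwise."""
    size = hi - lo + 1
    if size <= 1:
        return 0
    if size <= s:
        return insertion_sort(arr, lo, hi)
    mid = (lo + hi) // 2
    cmps = hybrid_sort(arr, lo, mid, s)
    cmps += hybrid_sort(arr, mid + 1, hi, s)
    cmps += merge(arr, lo, mid, hi)
    return cmps
```

Under this assumed scheme, calling `hybrid_sort(data, 0, len(data) - 1, s)` sorts `data` in place and returns the comparison count, which is the quantity averaged over random inputs for each S.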

In [None]:
from hybrid_sort_key_cmp import HybridSortKeyCmp
import matplotlib.pyplot as plt

MIN_S = 2
MAX_S = 1000
INPUT_SIZE = 1000
RANDOM_TESTS = 10

In [None]:
s_values, average_key_cmps = HybridSortKeyCmp.average_key_cmps_with_n_fixed(MIN_S, MAX_S, INPUT_SIZE, RANDOM_TESTS)

plt.plot(s_values, average_key_cmps, linestyle='-')

plt.xlabel('S Values')
plt.ylabel('Average Key Comparisons')
plt.title('Average Key Comparisons vs. S Values')

plt.show()

### General Upwards Trend

As the graph shows, there is a general upward trend: the higher the value of S, the greater the average number of key comparisons.
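
A rough estimate helps explain why the curve rises. It assumes the hybrid sort runs insertion sort on sub-arrays of at most $S$ elements and merges the results; the implementation inside `HybridSortKeyCmp` is not shown here, so this is an approximation rather than its exact cost. Insertion sort needs about $S^2/4$ key comparisons on average for $S$ random elements, there are about $n/S$ such sub-arrays, and merging them takes about $\log_2(n/S)$ passes of at most $n$ comparisons each:

$$C(S) \approx \frac{nS}{4} + n\log_2\frac{n}{S}, \qquad \frac{dC}{dS} \approx \frac{n}{4} - \frac{n}{S\ln 2}$$

The linear term $nS/4$ dominates for large $S$, which is consistent with an upward trend. The derivative vanishes near $S = 4/\ln 2 \approx 5.8$, so under this approximation the minimum would be expected at a small value of $S$.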

However, we are interested in finding the optimal S value for the best performance of the hybrid sort algorithm, and this trend is not very informative because the range is too wide.

We need to narrow the range of S values further, omitting those with a high average number of key comparisons (poor performance), since they will not help in finding the optimal S value.

On closer inspection, the lowest numbers of key comparisons fall within the 2-100 range, so we reduce MAX_S to 100 to zoom in on the region where the S values perform better.
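
The promising range can also be checked numerically instead of only by eye. The following is a minimal sketch, assuming `s_values` and `average_key_cmps` are equal-length sequences like the ones plotted above. The helper name `near_optimal_s_range` and the 5% default tolerance are hypothetical choices for illustration.

```python
def near_optimal_s_range(s_values, average_key_cmps, tolerance=0.05):
    """Return (best_s, lowest_s, highest_s).

    best_s is the S with the fewest average key comparisons; lowest_s and
    highest_s bound every S whose average is within `tolerance` (as a
    fraction) of that minimum.
    """
    if not s_values or len(s_values) != len(average_key_cmps):
        raise ValueError("s_values and average_key_cmps must be non-empty "
                         "and of equal length")

    # Index of the smallest average (first one wins on ties).
    best_idx = min(range(len(s_values)), key=lambda i: average_key_cmps[i])
    limit = average_key_cmps[best_idx] * (1 + tolerance)

    # Every S whose average is close enough to the minimum.
    close = [s for s, c in zip(s_values, average_key_cmps) if c <= limit]
    return s_values[best_idx], min(close), max(close)
```

Calling `near_optimal_s_range(s_values, average_key_cmps)` on the measured data gives a candidate upper bound for the next, narrower plot. With only a few random tests per S value the averages are noisy, so the result is best treated as a guide rather than a precise cutoff.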

In [None]:
s_values, average_key_cmps = HybridSortKeyCmp.average_key_cmps_with_n_fixed(MIN_S, 100, INPUT_SIZE, RANDOM_TESTS)

plt.plot(s_values, average_key_cmps, linestyle='-')

plt.xlabel('S Values')
plt.ylabel('Average Key Comparisons')
plt.title('Average Key Comparisons vs. S Values')

plt.show()

### Similar Upwards Trend

Again, we see an upward trend similar to the one for the 2-1000 range, which means we can narrow the range further to filter out the S values with poor performance.

From the graph, the average number of key comparisons appears lowest in the 2-20 range, which means the S values in this range perform best. We therefore set MAX_S to 20 and narrow the range once more for a more informative and conclusive graph.

In [None]:
s_values, average_key_cmps = HybridSortKeyCmp.average_key_cmps_with_n_fixed(MIN_S, 20, INPUT_SIZE, RANDOM_TESTS)

plt.plot(s_values, average_key_cmps, linestyle='-')

plt.xlabel('S Values')
plt.ylabel('Average Key Comparisons')
plt.title('Average Key Comparisons vs. S Values')

plt.show()