# Count Unique Values

Problem: Given an array of integers, count the number of unique values.
For example, if we have the following array:

[1, 1, 1, 1, 1, 1, 3, 5]

the count of unique values is 3: the values 1, 3, and 5.



* Create a new empty list
* Loop through the original list
    * If the number is not in the new list, append it there
* Return the length of the new list

In [31]:
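def count_unique_values(arr):
    """Count distinct values by collecting them in a list."""
    unique_values = []

    for i in arr:
        # `not in` scans unique_values from the start on every iteration
        if i not in unique_values:
            unique_values.append(i)

    return len(unique_values)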

In [57]:
arr = [1, 1, 1, 1, 1, 1, 3, 5]
arr2 = [3,3,3,4,4,4,5,5,5,5,5,6]
arr3 = [2,2,2,2,2,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,7,7,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,10,10,10,10,10,10]

print(count_unique_values(arr))
print(count_unique_values(arr2))
print(count_unique_values(arr3))

3
4
7


The function is straightforward and easy to code. However, this brute-force approach runs in O(n^2) time in the worst case: we traverse the original list once, and for each element the `not in` check scans the new list, which is itself O(n) when most values are distinct. We can do better.
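One possible intermediate step, sketched here as an assumption rather than taken from the original notebook, keeps the explicit loop but tracks the values seen so far in a set instead of a list. Membership checks on a set take O(1) time on average, so the whole loop runs in O(n) on average. The name `count_unique_values_seen` is hypothetical.

```python
def count_unique_values_seen(arr):
    # Hypothetical helper (not part of the original notebook):
    # same loop as the list version, but membership is checked against a set.
    seen = set()

    for value in arr:
        # Average O(1) lookup, instead of an O(n) scan of a list
        if value not in seen:
            seen.add(value)

    return len(seen)


print(count_unique_values_seen([1, 1, 1, 1, 1, 1, 3, 5]))
```

For the example array this prints 3, matching the list-based version.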

We can even go as far as dropping the for loop altogether by using a set, a Python data structure that holds only unique values. Building the set still takes O(n) time, since every element of the list has to be hashed and inserted, but each insertion is O(1) on average, so we avoid the repeated O(n) list lookups. Memory use is comparable to the list version: both store one entry per unique value.

In [48]:
def count_unique_values_set(arr):
    return len(set(arr))

In [58]:
print(count_unique_values_set(arr))
print(count_unique_values_set(arr2))
print(count_unique_values_set(arr3))

3
4
7


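Let's do some speed tests. By default, timeit.timeit runs each statement 1,000,000 times and returns the total time in seconds: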

In [61]:
import timeit

if __name__ == '__main__':
    
    print('Using lists')
    print(timeit.timeit("count_unique_values(arr)", setup="from __main__ import count_unique_values, arr"))
    print(timeit.timeit("count_unique_values(arr2)", setup="from __main__ import count_unique_values, arr2"))
    print(timeit.timeit("count_unique_values(arr3)", setup="from __main__ import count_unique_values, arr3"))
    print("------------------")
    print('Using sets')
    print(timeit.timeit("count_unique_values_set(arr)", setup="from __main__ import count_unique_values_set, arr"))
    print(timeit.timeit("count_unique_values_set(arr2)", setup="from __main__ import count_unique_values_set, arr2"))
    print(timeit.timeit("count_unique_values_set(arr3)", setup="from __main__ import count_unique_values_set, arr3"))
    print("------------------")

Using lists
0.5691923818698115
0.8425182184540745
3.5604711546785097
------------------
Using sets
0.8189961768220542
1.0709439708947457
3.6913683317463324
------------------
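The arrays timed above contain only a handful of distinct values, so the helper list in the first function stays very short and its lookups are cheap. That is why the two versions end up so close here. The quadratic behavior should only become visible when many values are distinct. A minimal sketch of such a test follows. The input size of 2,000 and the repeat count of 5 are arbitrary assumptions, and the timings will vary by machine, so none are shown.

```python
import timeit


def count_unique_values(arr):
    unique_values = []
    for i in arr:
        if i not in unique_values:
            unique_values.append(i)
    return len(unique_values)


def count_unique_values_set(arr):
    return len(set(arr))


# Assumed test input: 2,000 values, all distinct,
# which is the worst case for the list-based version.
distinct = list(range(2000))

# Passing a callable avoids the "from __main__ import ..." setup string.
list_time = timeit.timeit(lambda: count_unique_values(distinct), number=5)
set_time = timeit.timeit(lambda: count_unique_values_set(distinct), number=5)

print('Using lists:', list_time)
print('Using sets: ', set_time)
```

With every value distinct, the list version performs about n^2 / 2 comparisons per call, while the set version does a single pass with hash-based insertions.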
