Here is another installment in our series on HackerRank problems.  This time, we are looking at the 'medium' rated [Count Triplets](https://www.hackerrank.com/challenges/count-triplets-1/problem?h_l=interview&playlist_slugs%5B%5D=interview-preparation-kit&playlist_slugs%5B%5D=dictionaries-hashmaps).  This one took me some time to see through.

__Problem.__

You are given an array and you need to find the number of triplets of indices $(i, j, k)$ with $i<j<k$ such that the elements at those indices are in geometric progression for a given common ratio $r\ge 1$.

__Solution.__

First off:  what exactly is a geometric progression? Well, integers $a_1, a_2$ and $a_3$ are a geometric progression if 

$$(a_1, a_2, a_3) = \left(c\cdot r^i, c\cdot r^{i+1}, c\cdot r^{i+2}\right)$$ 

for some nonnegative integer $i$ and constant $c$.  Here are a few examples:

\begin{align}
2, 4, 8 &= 2^1, 2^2, 2^3 \\
2, 10, 50 &= 2\cdot 5^0, 2\cdot 5^1, 2\cdot 5^2 \\
2, 2, 2 &= 2\cdot 1^0, 2\cdot 1^1, 2\cdot 1^2 \\
\end{align}
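
To make the definition concrete, here is a minimal sketch of a check for a single triple against a given ratio.  The name `is_geometric_triple` is my own invention for illustration; it uses only multiplication, so it stays in exact integer arithmetic.

```python
def is_geometric_triple(a1, a2, a3, r):
    """Return True if (a1, a2, a3) equals (c, c*r, c*r**2) for c = a1."""
    return a2 == a1 * r and a3 == a2 * r


print(is_geometric_triple(2, 4, 8, 2))    # True
print(is_geometric_triple(2, 2, 2, 1))    # True
print(is_geometric_triple(1, 9, 81, 3))   # False: the ratio here is 9, not 3
```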

Let's start with the imports that we will need.  The `testing` module is my (rudimentary) custom testing suite.  

In [62]:
import shelve
import requests

import itertools as it

from collections import Counter
from functools import reduce

import testing

import matplotlib.pyplot as plt
%matplotlib inline

__Attempt 1__

My first attempt involved two functions:  one to determine whether a (sorted) array is a geometric progression by checking that the ratios of its successive elements are all the same, and one to count how many such subsequences of length 3 there are in the given array.  The latter function works by running every combination in `itertools.combinations(array, 3)` through the first function.  I was not surprised to find that, while it worked for some smaller test cases, the majority failed due to timeouts.  That is a recurring theme in my experience of HackerRank.

In [63]:
def is_geometric_progression(array):
    """Returns if array (sorted) is a geometric progression."""
    return all([array[i+1]/array[i] == array[1]/array[0] for i in range(len(array)-1)])


def count_geo_progs(array, length):
    """Counts number of geometric progressions of given length in array."""
    count = 0
    for combo in it.combinations(array, length):
        count += is_geometric_progression(combo)
    return count


Before optimizing, however, I noted that one of the small sized test cases was also failing, so some debugging was in order.  

__Testing 1__

In [64]:
# test cases

arrays = [
    [1,4,16,64],
    [1,3,9,9,27,81],
    [1,2,2,4],
    [1,5,5,25,125]
         ]

args = [(array, 3) for array in arrays]

expected = [2, 6, 2, 4]


In [66]:
import testing
testing.test(count_geo_progs, args, expected)

   TESTING count_geo_progs    
Test 0:  count_geo_progs(([1, 4, 16, 64], 3)...)
  Computed: 2
  Expected: 2
------------------------------
Test 1:  count_geo_progs(([1, 3, 9, 9, 27, 81], 3)...)
  Computed: 8
  Expected: 6
8 6
        TEST 1 FAILED         
------------------------------
Test 2:  count_geo_progs(([1, 2, 2, 4], 3)...)
  Computed: 2
  Expected: 2
------------------------------
Test 3:  count_geo_progs(([1, 5, 5, 25, 125], 3)...)
  Computed: 4
  Expected: 4
------------------------------
The following 1 test is failing:
             [1]              


__Debugging 1__

The offending array was `[1, 3, 9, 9, 27, 81]`, but after a bit of thought I realized that the offender was me.  I had not read the instructions carefully enough, and was counting _all_ geometric progressions in the array.  This included two instances of `(1, 9, 81)`, which is only a progression with $r=9$, and so should not be counted when $r=3$.  This was a straightforward fix.

In [78]:
# debugging 1

def is_geometric_progression(array, r=None):
    """Returns if array (sorted) is a geometric progression."""
    if not r:
        r = array[1]/array[0]
    return all([array[i+1]/array[i] == r for i in range(len(array)-1)])

def count_geo_progs(array, r, length):
    """Counts number of geometrical progressions with common ratio r
    of given length in array."""
    count = 0
    for combo in it.combinations(array, length):
        count += is_geometric_progression(combo, r)
    return count

In [79]:
# updating test cases

arrays = [
    [1,4,16,64],
    [1,3,9,9,27,81],
    [1,2,2,4],
    [1,5,5,25,125]
         ]

ratios = [4, 3, 2, 5]

args = [(array, ratio, 3) for array, ratio in zip(arrays, ratios)]

expected = [2, 6, 2, 4]

testing.test(count_geo_progs, args, expected)

   TESTING count_geo_progs    
Test 0:  count_geo_progs(([1, 4, 16, 64], 4, 3)...)
  Computed: 2
  Expected: 2
------------------------------
Test 1:  count_geo_progs(([1, 3, 9, 9, 27, 81], 3,...)
  Computed: 6
  Expected: 6
------------------------------
Test 2:  count_geo_progs(([1, 2, 2, 4], 2, 3)...)
  Computed: 2
  Expected: 2
------------------------------
Test 3:  count_geo_progs(([1, 5, 5, 25, 125], 5, 3...)
  Computed: 4
  Expected: 4
------------------------------
The following 0 tests are failing:
              []              


So now it is working for the limited number of small test cases that I am working with here.  But as I said, most tests on HackerRank fail due to timeout.  Tomorrow I will begin to work towards optimizing my functions.

__Day 2__


This is a continuation of [my previous article](https://kvnloughead.github.io/HR-countTriplets-1.html#HR-countTriplets-1) on HackerRank's `countTriplets` problem.  It occurred to me last night after going to bed that perhaps we can decomplexify the algorithm in the following way:

1.  Make a `Counter` object of numbers in the array.
2.  For each geometrically progressive triplet with common ratio $r$
    * get a count of the number of triplets by way of combinatorics.

For the last part, we are dealing with combinations:  since everything should stay in sorted order, we've no need to count different orderings separately.  Obviously, there must be at least one of each of the objects to make a valid triple.  If there is one of each object, there is one triple.  If the respective counts (up to reordering) are $(1, 1, 2)$, there are $1\cdot 1\cdot 2 = 2 = 1+1$ combinations.  If the counts are $(1, 2, 2)$ we have $1\cdot 2\cdot 2 = 4 = 2+2$; if they are $(1, 1, 3)$ we have $1\cdot 1\cdot 3 = 3 = 2 + 1$.  In short, every time we add one to one of the counters, we must increase our number of triples by the product of the other two counters, which is equivalent to the number of triples equalling the product of all three counts.


In [18]:
# proof of concept

from collections import Counter
from functools import reduce

def is_geometric_progression(array, r=None):
    """Returns whether array (sorted) is a geometric progression of the form

            a*r**i, a*r**(i+1), ...

        for nonnegative integers r and i.  The function simply checks
        that the ratio of each pair of successive elements equals r.
    """
    if not r:
        r = array[1]/array[0]
    return all([array[i+1]/array[i] == r for i in range(len(array)-1)])


def countTriples(array, r, length):
    total = 0
    num_counts = Counter(array)
    for combo in it.combinations(num_counts.keys(), length):
        if is_geometric_progression(combo, r):
            total += reduce((lambda x, y: x*y), [num_counts[c] for c in combo])
    return total
                
    

In [19]:
testing.test(countTriples, args, expected)

     TESTING countTriples     
Test 0:  countTriples(([1, 4, 16, 64], 4, 3)...)
  Computed: 2
  Expected: 2
------------------------------
Test 1:  countTriples(([1, 3, 9, 9, 27, 81], 3,...)
  Computed: 6
  Expected: 6
------------------------------
Test 2:  countTriples(([1, 2, 2, 4], 2, 3)...)
  Computed: 2
  Expected: 2
------------------------------
Test 3:  countTriples(([1, 5, 5, 25, 125], 5, 3...)
  Computed: 4
  Expected: 4
------------------------------
The following 0 tests are failing:
              []              
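
One caveat about these passing tests: every array in my local suite happens to be sorted.  Here is a quick sketch comparing a brute-force count with the product-of-counts idea on an unsorted array.  The helper names `brute_force` and `product_of_counts` are made up for this check, and `product_of_counts` is a compact restatement of the idea for $r>1$ rather than the exact function above.

```python
from collections import Counter
from itertools import combinations


def brute_force(array, r):
    """Count triples by testing every index combination i < j < k."""
    return sum(1 for x, y, z in combinations(array, 3)
               if y == x * r and z == y * r)


def product_of_counts(array, r):
    """Multiply the occurrence counts of x, x*r and x*r*r (assumes r > 1)."""
    counts = Counter(array)  # a Counter returns 0 for missing keys
    return sum(counts[x] * counts[x * r] * counts[x * r * r] for x in counts)


print(brute_force([1, 2, 2, 4], 2), product_of_counts([1, 2, 2, 4], 2))        # 2 2
print(brute_force([1, 2, 1, 2, 4], 2), product_of_counts([1, 2, 1, 2, 4], 2))  # 3 4
```

The product of counts cannot see that one of the 1s comes after one of the 2s, so it over-counts whenever the order of the indices matters.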


Still lots of timeouts, but also lots of wrong answers.  My combinatoric reasoning is faulty.  But here are some hints from the forum:


    Can be done in O(n) -> single pass through data
    No division necessary and single multiplications by R are all that's needed
    Using map(C++) or dict(Java, Python) is a must -> can be unordered map (saves O(logN))
    Try to think forward when reading a value -> will this value form part of a triplet later?
    No need to consider (R == 1) as a corner case

Single pass through of the data!  The key phrase here is 'try to think forward...will this value form part of a triplet later?'  

In [20]:
t = {}
t['a'] = {'count':1}
t

{'a': {'count': 1}}

In [21]:
# not working for [1, 2, 1, 2, 4]

def countTriplets(array, r, debug=False):
    counts, triplets = 0, {}
    for i, a in enumerate(array):
        
        # 1st in triplet
        try:
            triplets[a]['count'] += 1
        except KeyError:
            triplets[a] = {'count':1}
            
        # 2nd in triplet
        if a/r in triplets:
            triplets[a/r][a] = triplets[a/r].get(a, 0) + 1
        if a/r**2 in triplets and a/r in triplets[a/r**2]:
            counts += triplets[a/r**2][a/r] * triplets[a/r**2]['count']
            
        if debug == True:
            print("a=", a)
            print("current state:", triplets)
            print("computed count:", counts)
            expected_counts = [0, 0, 0, 0, 3]
            print("expected count:", expected_counts[i])
            print("========================")
        
    return counts
        
countTriplets([1, 2, 1, 2, 4], 2, debug=True)
    

a= 1
current state: {1: {'count': 1}}
computed count: 0
expected count: 0
a= 2
current state: {1: {'count': 1, 2: 1}, 2: {'count': 1}}
computed count: 0
expected count: 0
a= 1
current state: {1: {'count': 2, 2: 1}, 2: {'count': 1}}
computed count: 0
expected count: 0
a= 2
current state: {1: {'count': 2, 2: 2}, 2: {'count': 2}}
computed count: 0
expected count: 0
a= 4
current state: {1: {'count': 2, 2: 2}, 2: {'count': 2, 4: 1}, 4: {'count': 1}}
computed count: 4
expected count: 3


4

In [22]:
# Now working for all but 4 test cases, and no timeouts!

def countTriplets(array, r, debug=False):
    counts, triplets = 0, {}
    for i, a in enumerate(array):
        
        triplets[a] = triplets.get(a, 0) + 1
        
        if a/r in triplets:
            triplets[(int(a/r), a)] = triplets.get((int(a/r), a), 0) + triplets[a/r]
            #print("Here", triplets[(int(a/r), a)])
            
        if (int(a/r**2), int(a/r)) in triplets:
            counts += triplets[(int(a/r**2), (a/r))]
        
            
        if debug == True:
            print("a=", a)
            print("current state:", triplets)
            print("computed count:", counts)
            expected_counts = [0, 0, 1, 2, 4, 6]
            print("expected count:", expected_counts[i])
            print("========================")
        
    return counts
        
#countTriplets([1, 2, 2, 4], 2, debug=True)
countTriplets([1, 3, 9, 9, 27, 81], 3, debug=True)
    

a= 1
current state: {1: 1}
computed count: 0
expected count: 0
a= 3
current state: {1: 1, 3: 1, (1, 3): 1}
computed count: 0
expected count: 0
a= 9
current state: {1: 1, 3: 1, (1, 3): 1, 9: 1, (3, 9): 1}
computed count: 1
expected count: 1
a= 9
current state: {1: 1, 3: 1, (1, 3): 1, 9: 2, (3, 9): 2}
computed count: 2
expected count: 2
a= 27
current state: {1: 1, 3: 1, (1, 3): 1, 9: 2, (3, 9): 2, 27: 1, (9, 27): 2}
computed count: 4
expected count: 4
a= 81
current state: {1: 1, 3: 1, (1, 3): 1, 9: 2, (3, 9): 2, 27: 1, (9, 27): 2, 81: 1, (27, 81): 1}
computed count: 6
expected count: 6


6

In [23]:
# updating test cases

arrays = [
    [1,4,16,64],
    [1,3,9,9,27,81],
    [1,2,2,4],
    [1,5,5,25,125],
    [1,2,1,2,4]
         ]

ratios = [4, 3, 2, 5, 2]

args = [(array, ratio) for array, ratio in zip(arrays, ratios)]

expected = [2, 6, 2, 4, 3]

testing.test(countTriplets, args, expected)

    TESTING countTriplets     
Test 0:  countTriplets(([1, 4, 16, 64], 4)...)
  Computed: 2
  Expected: 2
------------------------------
Test 1:  countTriplets(([1, 3, 9, 9, 27, 81], 3)...)
  Computed: 6
  Expected: 6
------------------------------
Test 2:  countTriplets(([1, 2, 2, 4], 2)...)
  Computed: 2
  Expected: 2
------------------------------
Test 3:  countTriplets(([1, 5, 5, 25, 125], 5)...)
  Computed: 4
  Expected: 4
------------------------------
Test 4:  countTriplets(([1, 2, 1, 2, 4], 2)...)
  Computed: 3
  Expected: 3
------------------------------
The following 0 tests are failing:
              []              


In [24]:
# debugging test6 runtime error
#   traceback points to `counts += triplets[(int(a/r**2), a/r)]`
#   with KeyError (2, 6.66667).
#
# runtime error resolved by removing all conversions to integers.  They
# aren't necessary anyway.

def countTriplets(array, r, debug=False):
    counts, triplets = 0, {}
    for i, a in enumerate(array):
        
        triplets[a] = triplets.get(a, 0) + 1
        
        if a/r in triplets:
            triplets[(a/r, a)] = triplets.get((a/r, a), 0) + triplets[a/r]
            
        if (a/r**2, a/r) in triplets:
            counts += triplets[(a/r**2, a/r)]
        
            
        if debug == True:
            print("a=", a)
            print("current state:", triplets)
            print("computed count:", counts)
            expected_counts = [0, 0, 1, 2, 4, 6]
            print("expected count:", expected_counts[i])
            print("========================")
        
    return counts
        
#countTriplets([1, 2, 2, 4], 2, debug=True)
#countTriplets([1, 3, 9, 9, 27, 81], 3, debug=True)
testing.test(countTriplets, args, expected)

    TESTING countTriplets     
Test 0:  countTriplets(([1, 4, 16, 64], 4)...)
  Computed: 2
  Expected: 2
------------------------------
Test 1:  countTriplets(([1, 3, 9, 9, 27, 81], 3)...)
  Computed: 6
  Expected: 6
------------------------------
Test 2:  countTriplets(([1, 2, 2, 4], 2)...)
  Computed: 2
  Expected: 2
------------------------------
Test 3:  countTriplets(([1, 5, 5, 25, 125], 5)...)
  Computed: 4
  Expected: 4
------------------------------
Test 4:  countTriplets(([1, 2, 1, 2, 4], 2)...)
  Computed: 3
  Expected: 3
------------------------------
The following 0 tests are failing:
              []              
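
Why is it safe to drop the `int(...)` calls?  As far as I can tell it comes down to how Python compares dictionary keys: an `int` and a `float` with the same value are equal and hash alike, so `3.0` finds the key `3`, while a non-integer quotient finds nothing.  Truncating with `int` is what creates trouble, because it can turn a non-integer quotient into a key that does exist.  The values below are made up for illustration.

```python
# Keys are stored as ints (and tuples of ints), the way the array elements arrive.
seen = {3: 1, 6: 1, (1, 3): 1}

print(3.0 in seen)          # True: 3.0 == 3 and they hash alike
print((1.0, 3) in seen)     # True: tuples compare element by element
print(20 / 3 in seen)       # False: 6.666... is not a key
print(int(20 / 3) in seen)  # True: truncation maps 6.666... onto the key 6
```

One caution: with very large integers, float division can lose precision, which is one reason the forum hint recommends multiplication instead.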


In [32]:
# edge case:  r=1
#   I guess I should be able to do this without specifically handling the edge case
#   but I am looking for the easy way out at this point

import math

def choose(n, k):
    return (math.factorial(n)
            /(math.factorial(k)*math.factorial(n-k)))

def countTriplets(array, r, debug=False):
    counts, triplets = 0, {}
  

 # Integer division too large for float
#     if r == 1:
#         num_counts = Counter(array)
        
#         return sum([choose(num_counts[num], 3) for num in num_counts if 
#                       num_counts[num] >= 3])
    
    for i, a in enumerate(array):
        
        triplets[a] = triplets.get(a, 0) + 1
        
        if (a/r**2, a/r) in triplets:
            counts += triplets[(a/r**2, a/r)]
        
        if a/r in triplets:
            triplets[(a/r, a)] = triplets.get((a/r, a), 0) + triplets[a/r]
            
        
        
            
        if debug == True:
            print("a=", a)
            print("current state:", triplets)
            print("computed count:", counts)
            #expected_counts = [0, 0, 0, 0, 0, 1]
            #print("expected count:", expected_counts[i])
            print("========================")
        
    return counts
        
#countTriplets([1, 2, 2, 4], 2, debug=True)
#countTriplets([1, 3, 9, 9, 27, 81], 3, debug=True)
#countTriplets([1, 10, 100, 1000, 1, 1], 1, debug=True)

In [39]:
# parsing large test cases

# import shelve
# import requests

# urls = [
#         'https://hr-testcases-us-east-1.s3.amazonaws.com/70878/input02.txt?AWSAccessKeyId=AKIAJ4WZFDFQTZRGO3QA&Expires=1563291040&Signature=z2g0QtJ0cq0lj%2FhSwBr6OWBwSms%3D&response-content-type=text%2Fplain',
#         'https://hr-testcases-us-east-1.s3.amazonaws.com/70878/input03.txt?AWSAccessKeyId=AKIAJ4WZFDFQTZRGO3QA&Expires=1563290917&Signature=LlwGAopC1sKOc4GbqKXExtEv3mo%3D&response-content-type=text%2Fplain',
#         'https://hr-testcases-us-east-1.s3.amazonaws.com/70878/input06.txt?AWSAccessKeyId=AKIAJ4WZFDFQTZRGO3QA&Expires=1563290974&Signature=nGZxKYX6s4fp6Igu9%2BSZU5m7j7k%3D&response-content-type=text%2Fplain',
#         'https://hr-testcases-us-east-1.s3.amazonaws.com/70878/input11.txt?AWSAccessKeyId=AKIAJ4WZFDFQTZRGO3QA&Expires=1563291004&Signature=HpQ2PIMicIVDkA%2BQ3HP%2BFa75VFg%3D&response-content-type=text%2Fplain'
# ]
# names = ['test2', 'test3', 'test6', 'test11']


shelf = shelve.open('shelf')  # reopen the shelf that the commented-out download code wrote

# data = {}

# for i, url in enumerate(urls):

#     raw_test = requests.get(url)

#     with raw_test as f:
#         test = [int(x) for x in f.text.replace('1\n', '').replace('3\n', '').split(' ')[1:] if x]
#         data[names[i]] = test 

        
# shelf['countTriplets'] = data

test2, test3, test6, test11 = (shelf['countTriplets']['test2'], shelf['countTriplets']['test3'], 
                               shelf['countTriplets']['test6'], shelf['countTriplets']['test11'])

# a hack to fix the length of this test case:  somehow a '1' got cut off
test2.append(1)
shelf['countTriplets']['test2'] = test2

In [40]:
# updating test cases

arrays = [
    [1,4,16,64],
    [1,3,9,9,27,81],
    [1,2,2,4],
    [1,5,5,25,125],
    [1,2,1,2,4],
    test2,
    test3,
    test6,
    test11
         ]

ratios = [4, 3, 2, 5, 2, 1, 1, 3, 1]

args = [(array, ratio) for array, ratio in zip(arrays, ratios)]

expected = [2, 6, 2, 4, 3, 161700, 166661666700000, 2325652489, 1667018988625]

testing.test(countTriplets, args, expected)

    TESTING countTriplets     
Test 0:  countTriplets(([1, 4, 16, 64], 4)...)
  Computed: 2
  Expected: 2
------------------------------
Test 1:  countTriplets(([1, 3, 9, 9, 27, 81], 3)...)
  Computed: 6
  Expected: 6
------------------------------
Test 2:  countTriplets(([1, 2, 2, 4], 2)...)
  Computed: 2
  Expected: 2
------------------------------
Test 3:  countTriplets(([1, 5, 5, 25, 125], 5)...)
  Computed: 4
  Expected: 4
------------------------------
Test 4:  countTriplets(([1, 2, 1, 2, 4], 2)...)
  Computed: 3
  Expected: 3
------------------------------
Test 5:  countTriplets(([1, 1, 1, 1, 1, 1, 1, 1,...)
  Computed: 166650
  Expected: 161700
166650 161700
        TEST 5 FAILED         
------------------------------
Test 6:  countTriplets(([1237, 1237, 1237, 1237,...)
  Computed: 166666666650000
  Expected: 166661666700000
166666666650000 166661666700000
        TEST 6 FAILED         
------------------------------
Test 7:  countTriplets(([1, 17, 80, 68, 5, 5, 58...)
  Compu

Removing a single '1' from `test2` gives us the expected value 161,700.  This is a clue.  I realized this through a fortuitous mistake.  When parsing through the large test cases, my code accidentally removed one of the `1`s.  So, I was getting a correct answer when running it locally (with my mistaken input), but getting it marked wrong by HackerRank.  The lengths of the other $r=1$ test cases are correct.
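
The expected values for the $r=1$ tests can be sanity-checked by hand: if an array is $n$ copies of the same number, every choice of three indices is a valid triplet, so the answer should be $\binom{n}{3}$.  A small sketch, assuming Python 3.8+ for `math.comb` (which stays in exact integer arithmetic, unlike my `choose`); the name `count_equal_triplets` is made up for illustration.

```python
import math


def count_equal_triplets(n):
    """Number of index triples i < j < k in an array of n identical values."""
    return math.comb(n, 3)


print(count_equal_triplets(100))  # 161700
print(count_equal_triplets(101))  # 166650
```

Those are the expected and computed values for Test 5, which fits the observation that one `1` too many was being counted.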

In [52]:
for i in range(3, 11):
    print(choose(i,2))

3.0
6.0
10.0
15.0
21.0
28.0
36.0
45.0


In [61]:
# edge case:  r=1
#   I guess I should be able to do this without specifically handling the edge case
#   but I am looking for the easy way out at this point

import math

def choose(n, k):
    return (math.factorial(n)
            /(math.factorial(k)*math.factorial(n-k)))

def countTriplets(array, r, debug=False):
    counts, triplets = 0, {}
  

 # Integer division too large for float
#     if r == 1:
#         num_counts = Counter(array)
        
#         return sum([choose(num_counts[num], 3) for num in num_counts if 
#                       num_counts[num] >= 3])
    
    for i, a in enumerate(array):
        
        if (a/r**2, a/r) in triplets:
            counts += triplets[(a/r**2, a/r)]
            
        if a/r in triplets:
            triplets[(a/r, a)] = triplets.get((a/r, a), 0) + triplets[a/r]
        
        triplets[a] = triplets.get(a, 0) + 1
        
        if debug == True:
            print("i=", i)
            print("a=", a)
            print("current state:", triplets)
            print("computed count:", counts)
            #expected_counts = [0, 0, 0, 0, 0, 1]
            #print("expected count:", expected_counts[i])
            print("========================")
        
    return counts
        
#countTriplets([1, 2, 2, 4], 2, debug=True)
#countTriplets([1, 3, 9, 9, 27, 81], 3, debug=True)
#countTriplets([1, 10, 100, 1000, 1, 1], 1, debug=True)
countTriplets(test2, 1, debug=True)

#[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
#[0, 0, 1, 4, ]

i= 0
a= 1
current state: {1: 1}
computed count: 0
i= 1
a= 1
current state: {1: 2, (1.0, 1): 1}
computed count: 0
i= 2
a= 1
current state: {1: 3, (1.0, 1): 3}
computed count: 1
i= 3
a= 1
current state: {1: 4, (1.0, 1): 6}
computed count: 4
i= 4
a= 1
current state: {1: 5, (1.0, 1): 10}
computed count: 10
i= 5
a= 1
current state: {1: 6, (1.0, 1): 15}
computed count: 20
i= 6
a= 1
current state: {1: 7, (1.0, 1): 21}
computed count: 35
i= 7
a= 1
current state: {1: 8, (1.0, 1): 28}
computed count: 56
i= 8
a= 1
current state: {1: 9, (1.0, 1): 36}
computed count: 84
i= 9
a= 1
current state: {1: 10, (1.0, 1): 45}
computed count: 120
i= 10
a= 1
current state: {1: 11, (1.0, 1): 55}
computed count: 165
i= 11
a= 1
current state: {1: 12, (1.0, 1): 66}
computed count: 220
i= 12
a= 1
current state: {1: 13, (1.0, 1): 78}
computed count: 286
i= 13
a= 1
current state: {1: 14, (1.0, 1): 91}
computed count: 364
i= 14
a= 1
current state: {1: 15, (1.0, 1): 105}
computed count: 455
i= 15
a= 1
current state: {1

161700

In [54]:
arr = [0, 1, 4, 10, 20, 35, 56, 84, 120, 165]
diffs = [arr[i] - arr[i-1] for i in range(len(arr)-1, 0, -1)]
diffs

[45, 36, 28, 21, 15, 10, 6, 3, 1]
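
The differences are the triangular numbers, which looks like Pascal's rule at work: $\binom{n+1}{3}-\binom{n}{3}=\binom{n}{2}$.  Here is a quick check of that reading against the running counts from the debug output, assuming `math.comb` is available (Python 3.8+).  The variable names are my own.

```python
import math

# Running triplet counts from the debug output: C(n, 3) for n = 2..11.
running_counts = [math.comb(n, 3) for n in range(2, 12)]
print(running_counts)  # [0, 1, 4, 10, 20, 35, 56, 84, 120, 165]

# Each step adds the number of pairs seen so far: C(n, 2) for n = 2..10.
steps = [b - a for a, b in zip(running_counts, running_counts[1:])]
print(steps)           # [1, 3, 6, 10, 15, 21, 28, 36, 45]
print(steps == [math.comb(n, 2) for n in range(2, 11)])  # True
```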

In [None]:
def countTriplets(array, r, debug=False):
    counts, triplets = 0, {}    
    for i, a in enumerate(array):
        
        if (a/r**2, a/r) in triplets:
            counts += triplets[(a/r**2, a/r)]
            
        if a/r in triplets:
            triplets[(a/r, a)] = triplets.get((a/r, a), 0) + triplets[a/r]
        
        triplets[a] = triplets.get(a, 0) + 1
        
    return counts
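
For comparison, here is a sketch of how the forum hints could be followed to the letter, using only multiplication by `r`.  This is my reading of the hints, not code from the forum; the function name `count_triplets_forward` and the two dictionary names are made up for illustration.

```python
def count_triplets_forward(array, r):
    """Count index triples i < j < k whose values are (x, x*r, x*r*r)."""
    singles_waiting_for = {}  # value -> number of earlier elements that need it as 2nd
    pairs_waiting_for = {}    # value -> number of earlier pairs that need it as 3rd
    total = 0
    for a in array:
        # a completes every pair that was waiting for it.
        total += pairs_waiting_for.get(a, 0)
        nxt = a * r
        # a extends every single that was waiting for it into a pair needing nxt.
        pairs_waiting_for[nxt] = (pairs_waiting_for.get(nxt, 0)
                                  + singles_waiting_for.get(a, 0))
        # a itself may start a new triplet, so it now waits for nxt.
        singles_waiting_for[nxt] = singles_waiting_for.get(nxt, 0) + 1
    return total


print(count_triplets_forward([1, 2, 1, 2, 4], 2))         # 3
print(count_triplets_forward([1, 3, 9, 9, 27, 81], 3))    # 6
print(count_triplets_forward([1] * 100, 1))               # 161700
```

As in the final version above, the order of the three updates is what lets $r=1$ work without a special case: each element is counted as a third and as a second before it is registered as a first, so it never pairs with itself.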