## *Defining INPUT-STREAM*

In [1]:
ip = [3, 1, 2, 1, 27, 8, 7, 8, 2, 3, 1, 3, 6, 2, 1, 1, 2, 3, 4, 7, 8, 1, 2, 3, 7, 8, 1, 2, 3, 4]

## **Hash Functions**

In [2]:
h1 = lambda x: (3*x + 1) % 5
h2 = lambda x: (6*x + 1) % 16
h3 = lambda x: (2*x + 1) % 32

## **Flajolet-Martin Algorithm**

In [3]:
from tabulate import tabulate

def flajolet_martin(ip, hash_fn, hash_f_v):
    """Print the Flajolet-Martin table and distinct-count estimate for one hash function."""
    main_array = []  # one table row per stream element
    r = []           # trailing-zero count of every hashed value

    for i in ip:
        a = hash_fn(i)
        # Reverse the binary string so that trailing zeros come first
        b = bin(a)[2:][::-1]
        # Position of the first '1' = number of trailing zeros;
        # a hash value of 0 contains no '1' and is counted as 0 here
        ra = b.index('1') if '1' in b else 0
        r.append(ra)
        main_array.append([i, a, b[::-1], ra])

    R = max(r)         # largest trailing-zero count seen in the stream
    estimate = 2 ** R  # Flajolet-Martin estimate of the distinct count

    separator = "--------------------------------------------------------------------------------------------------"
    headings = ["Input", "h(x) = " + str(hash_f_v), "Binary", "Trailing zeros r(a)"]
    print(tabulate(main_array, headers=headings))
    print(separator)
    print("We know R = max(r(a))")
    print("Hence here, R ==", R)
    print("Now the estimate is found using 2^R")
    print(f"Hence, the estimated number of distinct values is >>> {estimate}")
    print(separator)
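
The trailing-zero step can also be written with integer bit operations instead of string reversal. The following is only a minimal sketch: the helper names `trailing_zeros` and `fm_estimate` are hypothetical and not part of the notebook, and a hash value of 0 is assumed to count as zero trailing zeros, matching the convention used in the function above.

```python
def trailing_zeros(a):
    """Number of trailing zero bits of a non-negative integer a.

    Assumption: 0 is treated as having no trailing zeros,
    the same convention the notebook's function uses.
    """
    if a == 0:
        return 0
    # a & -a isolates the lowest set bit; its bit length minus 1 is its position
    return (a & -a).bit_length() - 1


def fm_estimate(stream, hash_fn):
    """Return 2**R, where R is the largest trailing-zero count over the stream."""
    R = max(trailing_zeros(hash_fn(x)) for x in stream)
    return 2 ** R
```

Calling `fm_estimate(ip, h1)` returns only the estimate, without printing the table, which is convenient when several hash functions are compared.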

## **Using the hash function that is 3x+1 % 5**

In [4]:
flajolet_martin(ip,h1,"3x+1 % 5")

  Input    h(x) = 3x+1 % 5    Binary    Trailing zeros r(a)
-------  -----------------  --------  ---------------------
      3                  0         0                      0
      1                  4       100                      2
      2                  2        10                      1
      1                  4       100                      2
     27                  2        10                      1
      8                  0         0                      0
      7                  2        10                      1
      8                  0         0                      0
      2                  2        10                      1
      3                  0         0                      0
      1                  4       100                      2
      3                  0         0                      0
      6                  4       100                      2
      2                  2        10                      1
      1                  4       100    

## **Using the hash function  that is 6x+1 % 16**

In [5]:
flajolet_martin(ip,h2,"6x+1 % 16")

  Input    h(x) = 6x+1 % 16    Binary    Trailing zeros r(a)
-------  ------------------  --------  ---------------------
      3                   3        11                      0
      1                   7       111                      0
      2                  13      1101                      0
      1                   7       111                      0
     27                   3        11                      0
      8                   1         1                      0
      7                  11      1011                      0
      8                   1         1                      0
      2                  13      1101                      0
      3                   3        11                      0
      1                   7       111                      0
      3                   3        11                      0
      6                   5       101                      0
      2                  13      1101                      0
      1                 

## **Using the hash function  that is 2x+1 % 32**

In [6]:
flajolet_martin(ip,h3,"2x+1 % 32")

  Input    h(x) = 2x+1 % 32    Binary    Trailing zeros r(a)
-------  ------------------  --------  ---------------------
      3                   7       111                      0
      1                   3        11                      0
      2                   5       101                      0
      1                   3        11                      0
     27                  23     10111                      0
      8                  17     10001                      0
      7                  15      1111                      0
      8                  17     10001                      0
      2                   5       101                      0
      3                   7       111                      0
      1                   3        11                      0
      3                   7       111                      0
      6                  13      1101                      0
      2                   5       101                      0
      1                 

## **Using the hash function  that is 4x+1 % 3**

In [7]:
h4 = lambda x: (4*x + 1) % 3
flajolet_martin(ip,h4,"4x+1 % 3")

  Input    h(x) = 4x+1 % 3    Binary    Trailing zeros r(a)
-------  -----------------  --------  ---------------------
      3                  1         1                      0
      1                  2        10                      1
      2                  0         0                      0
      1                  2        10                      1
     27                  1         1                      0
      8                  0         0                      0
      7                  2        10                      1
      8                  0         0                      0
      2                  0         0                      0
      3                  1         1                      0
      1                  2        10                      1
      3                  1         1                      0
      6                  1         1                      0
      2                  0         0                      0
      1                  2        10    

# **Analysis and Conclusion**

### We used 4 hash functions
 1. h(x) = [ (3x+1) % 5 ], which estimated the count of distinct elements in the stream as 4
 2. h(x) = [ (6x+1) % 16 ], which estimated the count of distinct elements in the stream as 1
 3. h(x) = [ (2x+1) % 32 ], which estimated the count of distinct elements in the stream as 1
 4. h(x) = [ (4x+1) % 3 ], which estimated the count of distinct elements in the stream as 2
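
As a cross-check, the four estimates can be recomputed next to the exact distinct count of the stream. This is a sketch only: the hash functions are taken as defined in the code cells (so the first one is `(3*x + 1) % 5`), and the names `hash_fns`, `tz`, `estimates`, and `true_count` are introduced here for illustration.

```python
ip = [3, 1, 2, 1, 27, 8, 7, 8, 2, 3, 1, 3, 6, 2, 1, 1, 2, 3, 4, 7, 8, 1, 2, 3, 7, 8, 1, 2, 3, 4]

# Hash functions as defined in the notebook's code cells
hash_fns = {
    "(3x+1) % 5": lambda x: (3 * x + 1) % 5,
    "(6x+1) % 16": lambda x: (6 * x + 1) % 16,
    "(2x+1) % 32": lambda x: (2 * x + 1) % 32,
    "(4x+1) % 3": lambda x: (4 * x + 1) % 3,
}


def tz(a):
    """Trailing zero bits of a; a == 0 counts as 0 (the notebook's convention)."""
    return (a & -a).bit_length() - 1 if a else 0


# Flajolet-Martin estimate 2**R for each hash function
estimates = {name: 2 ** max(tz(h(x)) for x in ip) for name, h in hash_fns.items()}

# Exact number of distinct elements, for comparison
true_count = len(set(ip))

for name, est in estimates.items():
    print(f"h(x) = {name}: estimate = {est}")
print("exact distinct count =", true_count)
```

The exact count for this stream is 8, so a single hash function with such a small range gives only a rough estimate here.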


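For the 1st and 4th hash functions, the hashed values were a mix of odd and even numbers, so some of them had trailing zeros and led to an estimate greater than 1. For the 2nd and 3rd hash functions, no hashed value had any trailing zeros.

The reason is that both have the form (ax + b) mod 2^k with a even and b odd. In that case ax + b is always odd, and reducing modulo a power of two preserves parity, so every hashed value ends in a 1 bit. R is then always 0 and the estimate is always 2^0 = 1, whatever the stream contains. Hash functions of this form are therefore not appropriate for the Flajolet-Martin algorithm.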
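
The observation that a hash of the form (ax + b) mod 2^k can yield only odd values can be checked directly. The sketch below is illustrative: `always_odd` is a hypothetical helper, and it tests only a finite range of inputs rather than proving the property.

```python
def always_odd(a, b, k, xs=range(1000)):
    """True if (a*x + b) mod 2**k is odd for every x in xs (a finite check, not a proof)."""
    return all(((a * x + b) % (2 ** k)) % 2 == 1 for x in xs)


print(always_odd(6, 1, 4))  # 2nd hash function: (6x+1) % 16
print(always_odd(2, 1, 5))  # 3rd hash function: (2x+1) % 32
print(always_odd(3, 1, 4))  # odd multiplier for contrast: (3x+1) % 16
```

The first two calls report that every tested value is odd, while the odd multiplier in the third call also produces even values (for example x = 1 gives 4), so trailing zeros can occur.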