<a href="https://colab.research.google.com/github/gt-cse-6040/comparing-base-objects/blob/main/comparing_base_objects.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Comparing Basic Python Objects

We can check whether two basic Python objects are equivalent by verifying that they are the same `type` and verifying that they have the same `value`. We will define some objects and demonstrate comparing them.

In [35]:
int_a = 2004
int_b = 2012
int_c = 2004

float_a = 2004.0
float_b = 2012.0
float_aa = 2004 + 10**(-12)

bool_a = True
bool_b = True
bool_c = False

str_a = "2004"
str_b = "2012"
str_c = "2004"


Now we can put them in an object to make it easier to compare them later.

In [36]:
test_vars = {'int_a':int_a, 'int_b':int_b, 'int_c':int_c, 
             'float_a':float_a, 'float_b':float_b, 'float_aa':float_aa, 
             'bool_a':bool_a, 'bool_a':bool_a, 'bool_c': bool_c, 
             'str_a':str_a, 'str_b':str_b, 'str_c':str_c}


Now let's define a function to verify that both are the same type. We will assume that the first variable (`x`) is known to be the desired type and value and (`y`) is the variable we're checking against it. Here's the plan:
- Check if `y` is an instance of whatever class `x` is an instance of.
 - If there is a mismatch, return `False`.
- Check to see if `x` is a `bool`, `str`, or `int` type. 
  - If it is one of those types, return `False` if the values are not the same.
- Check to see if `x` is a `float`.
  - Since `float` variables are approximations, we should specify some tolerance. If the absolute difference between `x` and `y` is less than or equal to the tolerance we will consider the values equal.
  - If the absolute difference between `x` and `y` is above the tolerance we will return `False`.
- If none of the previous steps returned `False`, return `True`

We are going to need additional variables `tol` to specify the tolerance for floats and `verbose` to toggle some debugging messages.

In [44]:
def compare_basic(x: any, y: any, verbose: bool=False, tol: float=0.0)->bool:

  if not isinstance(y, type(x)):
    if verbose: print(f'{type(x)} x is not the same type as {type(y)} y.')
    return False

  if isinstance(x, (int, bool, str)):
    if x != y:
      if verbose: print(f'{x} != {y}')
      return 
      
  if isinstance(x, float):
    if abs(x-y) > tol:
      if verbose: print(f'{x} != {y}')
      return False

  if verbose: print(f'{x} == {y}')
  return True

Now let's go through the combinations and compare.

In [45]:
from itertools import combinations
for k1, k2 in combinations(test_vars, 2):
  print()
  print(f'Comparing {k1} and {k2}')
  compare_basic(test_vars[k1], test_vars[k2], verbose=True)


Comparing int_a and int_b
2004 != 2012

Comparing int_a and int_c
2004 == 2004

Comparing int_a and float_a
<class 'int'> x is not the same type as <class 'float'> y.

Comparing int_a and float_b
<class 'int'> x is not the same type as <class 'float'> y.

Comparing int_a and float_aa
<class 'int'> x is not the same type as <class 'float'> y.

Comparing int_a and bool_a
2004 != True

Comparing int_a and bool_c
2004 != False

Comparing int_a and str_a
<class 'int'> x is not the same type as <class 'str'> y.

Comparing int_a and str_b
<class 'int'> x is not the same type as <class 'str'> y.

Comparing int_a and str_c
<class 'int'> x is not the same type as <class 'str'> y.

Comparing int_b and int_c
2012 != 2004

Comparing int_b and float_a
<class 'int'> x is not the same type as <class 'float'> y.

Comparing int_b and float_b
<class 'int'> x is not the same type as <class 'float'> y.

Comparing int_b and float_aa
<class 'int'> x is not the same type as <class 'float'> y.

Comparing int_

That's a lot to go through, but with the verbose output it is clear that we are getting what we expect. We should probably see if changing the tolerance has an effect on comparing `float_a` and `float_aa`

In [46]:
compare_basic(float_a, float_aa, verbose=True, tol=0.00001)

2004.0 == 2004.000000000001


True

Changing the threshold had the desired effect of ignoring the small difference between floating point numbers.

# Comparing Nested Objects

In [None]:
def compare_nested(x: any, y: any, verbose: bool=False, tol: float=0.0)->bool:
  if not isinstance(y, type(x)):
    if verbose: print(f'{type(x)} x is not the same type as {type(y)} y.')
    return False
  if isinstance(x, (int, float, bool, str)):
    return compare_basic(x, y, verbose=verbose, tol=tol)
  if isinstance(x, set):
    if len(x) != len(y):
      if verbose: print(f'x ({len(x)}) and y ({len(y)}) have different sizes.')
      return False
    if len(x - y) > 0:
      if verbose: print('The two sets contain different items.')
      return False
  if isinstance(x, (list, tuple)):
    if len(x) != len(y):
      if verbose: print(f'Lists x ({len(x)}) and y ({len(y)}) have different lengths.')
      return False
    for x_i, y_i in zip(x, y):
      if not compare_nested(x_i, y_i, verbose=verbose, tol=tol):
        if verbose: print('Child objects in lists not equal.')
        return False
  if isinstance(x, dict):
    if x.keys() != y.keys():
      if verbose: print('Dict keys do not match.')
      return False
    for k in x:
      if not compare_nested(x[k], y[k], verbose=verbose, tol=tol):
        if verbose: print('Child objects in dicts not equal.')
        return False
  if verbose: print('Nested Structures Equivalent')
  return True