### Project 1

In this project our goal is to validate one dictionary structure against a template dictionary.

A typical example is handling JSON input in an API: you want to validate the received JSON against a template to make sure it conforms to that template (i.e. the keys and nesting are identical and each value has the expected data type; the values themselves do not matter).
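As a quick illustration of why type checks make sense on parsed JSON, here is a minimal sketch using the standard library's `json.loads`. The payload is made up for this example; a real API would receive it in a request body.

```python
import json

# Hypothetical payload, as it might arrive in an API request body
payload = '{"user_id": 100, "name": {"first": "John", "last": "Cleese"}}'

data = json.loads(payload)

# JSON objects become dicts, strings become str, whole numbers become int
print(type(data))                   # <class 'dict'>
print(type(data['user_id']))        # <class 'int'>
print(type(data['name']))           # <class 'dict'>
print(type(data['name']['first']))  # <class 'str'>
```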

To keep things simple we'll assume that values can be either single values (like an integer, string, etc.) or a dictionary, itself containing only single values or other dictionaries, recursively. In other words, we're not going to deal with lists as possible values. We'll also assume that all keys are **required**, and that no extra keys are permitted.
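The recursive structure described above (single values or dictionaries, all the way down) can be walked with a short recursive generator. This is only a sketch to show the shape of the data; the name `walk` and the sample dictionary are made up for illustration.

```python
def walk(d, prefix=''):
    # Yield (dotted_path, value) for every single (non-dict) value
    for key, value in d.items():
        key_path = f'{prefix}{key}'
        if isinstance(value, dict):
            # A nested dictionary: recurse, extending the path prefix
            yield from walk(value, f'{key_path}.')
        else:
            yield key_path, value


sample = {'a': 1, 'b': {'c': 'x', 'd': {'e': 2.5}}}
print(list(walk(sample)))
# [('a', 1), ('b.c', 'x'), ('b.d.e', 2.5)]
```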

In practice we would not make these simplifying assumptions, and although we could certainly write this ourselves, many third-party libraries already do this (such as `jsonschema`, `marshmallow`, and many more), some of which I'll cover lightly in later videos.

For example you might have this template:

In [1]:
template = {
    'user_id': int,
    'name': {
        'first': str,
        'last': str
    },
    'bio': {
        'dob': {
            'year': int,
            'month': int,
            'day': int
        },
        'birthplace': {
            'country': str,
            'city': str
        }
    }
}

So, a JSON document such as this would match the template:

In [2]:
john = {
    'user_id': 100,
    'name': {
        'first': 'John',
        'last': 'Cleese'
    },
    'bio': {
        'dob': {
            'year': 1939,
            'month': 11,
            'day': 27
        },
        'birthplace': {
            'country': 'United Kingdom',
            'city': 'Weston-super-Mare'
        }
    }
}

But this one would **not** match the template (missing key):

In [3]:
eric = {
    'user_id': 101,
    'name': {
        'first': 'Eric',
        'last': 'Idle'
    },
    'bio': {
        'dob': {
            'year': 1943,
            'month': 3,
            'day': 29
        },
        'birthplace': {
            'country': 'United Kingdom'
        }
    }
}

And neither would this one (wrong data type):

In [4]:
michael = {
    'user_id': 102,
    'name': {
        'first': 'Michael',
        'last': 'Palin'
    },
    'bio': {
        'dob': {
            'year': 1943,
            'month': 'May',
            'day': 5
        },
        'birthplace': {
            'country': 'United Kingdom',
            'city': 'Sheffield'
        }
    }
}

Write a function such as this:

In [24]:
def flatten_dict(d, prefix=''):
    # Map every dotted key path to a type: dict for a nested dictionary,
    # otherwise the value's type (template values are already types).
    res = {}
    for k, v in d.items():
        key_path = f'{prefix}{k}'
        if isinstance(v, dict):
            res[key_path] = dict
            res.update(flatten_dict(v, f'{key_path}.'))
        elif isinstance(v, type):
            res[key_path] = v
        else:
            res[key_path] = type(v)
    return res


def validate(d, template):
    dt = flatten_dict(d)
    templ = flatten_dict(template)

    # All keys are required and no extras are allowed, so any key path
    # found in only one of the two dictionaries is a mismatch.
    keys_diff = dt.keys() ^ templ.keys()
    if keys_diff:
        return False, f'mismatched keys: {", ".join(sorted(keys_diff))}'

    # Key paths match, so compare the type stored at each path
    for k, v in templ.items():
        if dt[k] is not v:
            return False, f'bad type: {k}'

    return True, ''


validate(michael, template)

(False, 'bad type: bio.dob.month')

That should return this:
* `validate(john, template) --> True, ''`
* `validate(eric, template) --> False, 'mismatched keys: bio.birthplace.city'`
* `validate(michael, template) --> False, 'bad type: bio.dob.month'`

Better yet, use exceptions instead of return codes and strings!
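One way the exception-based variant might look is sketched below. It recurses through the template directly instead of flattening first. The exception class names, `validate_or_raise`, and the small sample dictionaries are all hypothetical names chosen for this sketch.

```python
class SchemaError(Exception):
    """Base class for validation errors (hypothetical name)."""


class SchemaKeyMismatch(SchemaError):
    """Raised when a key is missing or an extra key is present."""


class SchemaTypeMismatch(SchemaError):
    """Raised when a value does not have the type the template requires."""


def validate_or_raise(data, template, path=''):
    # Keys must match exactly: all are required and no extras are allowed
    mismatched = data.keys() ^ template.keys()
    if mismatched:
        keys = ', '.join(f'{path}{k}' for k in sorted(mismatched))
        raise SchemaKeyMismatch(f'mismatched keys: {keys}')

    for key, expected in template.items():
        value = data[key]
        full_key = f'{path}{key}'
        if isinstance(expected, dict):
            # The template expects a nested dictionary here
            if not isinstance(value, dict):
                raise SchemaTypeMismatch(f'bad type: {full_key}')
            validate_or_raise(value, expected, f'{full_key}.')
        elif not isinstance(value, expected):
            raise SchemaTypeMismatch(f'bad type: {full_key}')


small_template = {'user_id': int, 'name': {'first': str, 'last': str}}
good = {'user_id': 100, 'name': {'first': 'John', 'last': 'Cleese'}}
bad = {'user_id': 100, 'name': {'first': 'John', 'last': 7}}

validate_or_raise(good, small_template)  # returns None, raises nothing

try:
    validate_or_raise(bad, small_template)
except SchemaTypeMismatch as ex:
    print(ex)  # bad type: name.last
```

Because both exceptions derive from `SchemaError`, a caller can catch either the specific error or the base class. Note that `isinstance` accepts subclasses, so a `bool` value would pass where the template says `int`; use `type(value) is expected` if that matters.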

# Sandbox

In [None]:
#     for k, v in flat_d.items():
#         if v == dict:
#             path += f'{k}.'
            
#         if k not in flat_t.keys():
#             raise KeyError(f'{k} not in template')
            
#         elif v != flat_t[k]:
#             raise TypeError(
#                 # f'Dtype mismatch for key:{k}. d={v}, template={flat_t[k]}'
#                 f'Mismatched data type: d.{path+k}={v}, template.{path+k}={flat_t[k]}'
#             )
    
#     return flat_d


In [234]:
def flatten_dict(d, prefix=''):
    # Map every dotted key path to a type: dict for a nested dictionary,
    # otherwise the value's type (template values are already types).
    # Keeping the full path avoids collisions between equal key names
    # at different nesting levels.
    res = {}
    for k, v in d.items():
        key_path = f'{prefix}{k}'
        if isinstance(v, dict):
            res[key_path] = dict
            res.update(flatten_dict(v, f'{key_path}.'))
        elif isinstance(v, type):
            res[key_path] = v
        else:
            res[key_path] = type(v)
    return res


def validate_key(flat_d1, flat_d2):
    # Key paths present in flat_d1 but missing from flat_d2
    return sorted(flat_d1.keys() - flat_d2.keys())


def validate(d, template):
    flat_d = flatten_dict(d)
    flat_t = flatten_dict(template)

    if (r_key := validate_key(flat_t, flat_d)):
        raise KeyError(f'{",".join(r_key)} not in dictionary')

    if (r_key := validate_key(flat_d, flat_t)):
        raise KeyError(f'{",".join(r_key)} not in template')

    # Key paths match, so compare the type stored at each path
    for k, expected in flat_t.items():
        if flat_d[k] is not expected:
            raise TypeError(
                f'Mismatched data type: d.{k}={flat_d[k]}, template.{k}={expected}'
            )


# Define the test data before calling validate
d1 = {'a': 1,
      'b': 2,
      'c': {'x': 1,
            'y': 2
           }
     }

d2 = {'a': int,
      'b': int,
      'c': {'x': int,
            'y': int,
            'z': int,
           }
     }

validate(d1, d2)

KeyError: 'c.z not in dictionary'

In [215]:
d2 = {'a':int,
     'b':int,
     'c':{'x':int,
          'y':int,
          'z':int,
         }
    }

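def path(d, missing_key, prefix=''):
    # Walk the nested dictionary and return the dotted path of the first
    # key equal to missing_key, or None if there is no such key.
    for k, v in d.items():
        if k == missing_key:
            return f'{prefix}{k}'
        if isinstance(v, dict):
            found = path(v, missing_key, f'{prefix}{k}.')
            if found is not None:
                return found
    return None

path(d2, 'z')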

'c.z'

In [212]:
t1 = {'a','b'}
t2 = {'a'}

t1 - t2

{'b'}
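The same set difference works directly on dictionary key views, without converting them to sets first. A small sketch with made-up dictionaries, showing how the direction of the difference separates missing keys from extra keys:

```python
template_keys = {'a': int, 'b': int}.keys()
data_keys = {'a': 1, 'c': 3}.keys()

# Required by the template but absent from the data
missing = template_keys - data_keys
# Present in the data but not allowed by the template
extra = data_keys - template_keys

print(missing)  # {'b'}
print(extra)    # {'c'}
```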