
# Saving Records with Objects

### Introduction

As we may have noticed, certain SQL operations are tedious and tricky to perform.  For example, creating a new record in a database requires an INSERT INTO statement that is easy to get wrong.  Let's make it easier.

In this lesson, we'll walk through some operations for saving our records.

### Saving Made Easier

Instead of writing a separate INSERT INTO statement for each table, we'll write a `save` function.  It will let us create an instance of a class and then automatically store a new record, with the correct attributes, in the corresponding table.  For example, if we create an instance of User with the following attributes:

```python
user = User()
user.name = 'bob'
user.birthday = '8/3/1997'
```

Then, ideally, we could just call the function `save` like so:
    
```python
save(user, test_conn, test_cursor)
```

And save will execute the following command.

```python
insert_str = f"""INSERT INTO users (name, birthday) VALUES (%s, %s);"""
cursor.execute(insert_str, ('bob', '8/3/1997'))
```

### How it works

The key to automating this is to realize that every time we save a user instance, we are inserting into the same table, and we want each attribute of the user instance to be stored in its respective column in the database.  So we begin by telling each instance of the User class about its table and the columns in the database, with the following:

In [None]:
class User():
    __table__ = 'users'
    columns = ['id', 'name', 'birthday']

In [None]:
user = User()
user.name = 'bob'
user.birthday = '3/5/1997'

In [None]:
user.__table__

'users'

In [None]:
user.columns

['id', 'name', 'birthday']
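As a brief sketch of why this works: `__table__` and `columns` are class attributes, so every instance can read them, while values assigned on an instance (like `name`) live in that instance's own `__dict__`.  The variable names `first`, `second`, `shared_table` and `instance_attrs` are only for illustration.

```python
class User():
    __table__ = 'users'
    columns = ['id', 'name', 'birthday']

first = User()
second = User()
first.name = 'bob'

# both instances read the same class-level table name
shared_table = (first.__table__, second.__table__)

# only attributes set on the instance itself appear in its __dict__
instance_attrs = first.__dict__

print(shared_table)
print(instance_attrs)
```

This separation is what later lets us tell the record's data apart from the information about the table.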

With this in place, our save function can begin to remove some of the hardcoding, starting with the table name.

```python
def save(obj, conn, cursor):
    insert_str = f"""INSERT INTO {obj.__table__} (name, birthday) VALUES (%s, %s);"""
    cursor.execute(insert_str, ('bob', '8/3/1997'))
    conn.commit()  # the commit is issued on the connection, not the cursor
```

So above, the table name comes from the `__table__` attribute.  

Going forward, we'll want to remove the remaining hard coded values, since they change from table to table.  This means we'll need to write functions that return:
* the names of the columns we want to insert into,
* the number of `%s` placeholders we need after the word VALUES, and
* the tuple that we pass into `cursor.execute()` (e.g. `('bob', '8/3/1997')`).

### Begin with the Tuple

Let's start with the tuple of values, which looked like `('bob', '8/3/1997')` above.  Given our User class and the user instance below, we'll need a way to generate this tuple automatically from the instance's attributes.

In [None]:
class User():
    __table__ = 'users'
    columns = ['id', 'name', 'birthday']

user = User()
user.name = 'bob'
user.birthday = '3/5/1997'

A first thought may be to simply gather the values on the user instance like so:

In [None]:
user.__dict__.values()

dict_values(['bob', '3/5/1997'])

But we should also make sure that each of these attributes has a corresponding column in the database.  This way, when we create a new record in a table, every value we insert lines up with a column in that table.

Recall that we have hardcoded the list of columns in our class with `columns`.
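To see why this check matters, here is a small sketch.  The `nickname` attribute is hypothetical (it is not part of the lesson's `users` table) and stands in for any attribute without a matching column.

```python
class User():
    __table__ = 'users'
    columns = ['id', 'name', 'birthday']

user = User()
user.name = 'bob'
user.birthday = '3/5/1997'
user.nickname = 'bobby'  # hypothetical attribute with no matching column

# __dict__ holds every instance attribute, whether or not the table has a column for it
all_values = list(user.__dict__.values())

# attributes that would not line up with any column in the users table
extra_attrs = [attr for attr in user.__dict__ if attr not in user.columns]

print(all_values)
print(extra_attrs)
```

Passing `all_values` straight into an INSERT would supply three values for only two named columns, which is the kind of mismatch we want to avoid.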

In [None]:
user.columns

['id', 'name', 'birthday']

So we can go one by one through those columns, and retrieve the corresponding value in the user attributes.

In [None]:
obj_attrs = user.__dict__
obj_attrs

{'name': 'bob', 'birthday': '3/5/1997'}

> Below, we access the values of the dictionary that correspond to our database columns with the following.

In [None]:
obj_attrs = user.__dict__
[obj_attrs[attr] for attr in user.columns if attr in obj_attrs.keys() ]

['bob', '3/5/1997']

So that can be our `values` function.

In [None]:
def values(obj):
    obj_attrs = obj.__dict__
    return [obj_attrs[attr] for attr in obj.columns if attr in obj_attrs.keys()]

And now, given an instance, we can find the correct values to insert a new record with:

In [None]:
values(user)

['bob', '3/5/1997']

And use this `values` function in our `cursor.execute` statement.

```python
insert_str = f"""INSERT INTO {obj.__table__} (name, birthday) VALUES (%s, %s);"""
cursor.execute(insert_str, values(obj))
```
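One property of `values` worth checking is that its output follows the order of `columns`, not the order in which attributes were assigned.  The sketch below assigns `birthday` before `name` to illustrate this; the names `raw_order` and `column_order` are just for the example.

```python
class User():
    __table__ = 'users'
    columns = ['id', 'name', 'birthday']

def values(obj):
    obj_attrs = obj.__dict__
    return [obj_attrs[attr] for attr in obj.columns if attr in obj_attrs.keys()]

user = User()
user.birthday = '3/5/1997'  # assigned first this time
user.name = 'bob'

# the raw dictionary keeps assignment order
raw_order = list(user.__dict__.values())

# values() walks through columns, so name comes before birthday
column_order = values(user)

print(raw_order)
print(column_order)
```

Because the column names will be generated by walking through the same `columns` list, the values and the column names stay aligned.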

### Finding the columns to update

The next part of the INSERT statement to tackle is the column names, `(name, birthday)`.  We'd like to avoid hard coding them.

We can get the name and birthday columns in a similar way to how we wrote the values function.  This time, we just return the key itself.  

In [None]:
def keys(obj):
    obj_attrs = obj.__dict__
    return [attr for attr in obj.columns if attr in obj_attrs.keys()]

In [None]:
keys(user)

['name', 'birthday']

> Compare this with our values function.

In [None]:
# def values(obj):
#     obj_attrs = obj.__dict__
#     return [obj_attrs[attr] for attr in obj.columns if attr in obj_attrs.keys()]

Then, to place the column names in our insert string, we join them together with a comma.

In [None]:
def keys(obj):
    obj_attrs = obj.__dict__
    selected = [attr for attr in obj.columns if attr in obj_attrs.keys()]
    return ', '.join(selected)

In [None]:
keys(user)

'name, birthday'

So now if we update our INSERT INTO code to use our `keys` function, we'll see that we are almost there.

```python
insert_str = f"""INSERT INTO {obj.__table__} ({keys(obj)}) VALUES (%s, %s);"""
cursor.execute(insert_str, values(obj))
```

The last item is to place the correct number of `%s` placeholders in our insert string, one for each value we are inserting.  We can do so with the following:

In [None]:
', '.join(len(values(user)) * ['%s'])

'%s, %s'
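The same expression scales with the number of values.  As a small sketch, the hypothetical helper `placeholders` below (not part of the lesson's code) wraps it so we can try different counts.

```python
def placeholders(count):
    # hypothetical helper: one %s per value, separated by commas
    return ', '.join(count * ['%s'])

one = placeholders(1)
three = placeholders(3)

print(one)
print(three)
```

With a single value there is nothing to join, so no comma appears.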

So putting all of this together, we have a save function that looks like the following:

In [None]:
def save(obj, conn, cursor):
    s_str = ', '.join(len(values(obj)) * ['%s'])
    insert_str = f"""INSERT INTO {obj.__table__} ({keys(obj)}) VALUES ({s_str});"""
    cursor.execute(insert_str, list(values(obj)))
    conn.commit()

And it relies on the keys and values functions.

In [None]:
def keys(obj):
    obj_attrs = obj.__dict__
    selected = [attr for attr in obj.columns if attr in obj_attrs.keys()]
    # join into a single string so it can be placed directly in the SQL
    return ', '.join(selected)

In [None]:
# def values(obj):
#     obj_attrs = obj.__dict__
#     return [obj_attrs[attr] for attr in obj.columns if attr in obj_attrs.keys()]

As well as defining the `__table__` and `columns` on each class.

In [None]:
class User():
    __table__ = 'users'
    columns = ['id', 'name', 'birthday']
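Because nothing in `keys` or `values` mentions users specifically, the same functions should work for any class that defines `__table__` and `columns`.  The sketch below assumes a hypothetical `Restaurant` class and `restaurants` table, neither of which appears in this lesson, and uses the comma-joined version of `keys`.

```python
def values(obj):
    obj_attrs = obj.__dict__
    return [obj_attrs[attr] for attr in obj.columns if attr in obj_attrs.keys()]

def keys(obj):
    obj_attrs = obj.__dict__
    selected = [attr for attr in obj.columns if attr in obj_attrs.keys()]
    return ', '.join(selected)

class Restaurant():  # hypothetical second class, for illustration only
    __table__ = 'restaurants'
    columns = ['id', 'name', 'city']

restaurant = Restaurant()
restaurant.name = 'Taqueria'
restaurant.city = 'Austin'

restaurant_keys = keys(restaurant)
restaurant_values = values(restaurant)

print(restaurant_keys)
print(restaurant_values)
```

Note that `id` is skipped in both results because it was never set on the instance.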

### Summary

In this lesson, we walked through the fundamentals of writing a save function.  The key is to add the `__table__` attribute and the list of columns to each class.  Then we can retrieve the table name, the columns to insert into, and the values to insert into the table.  Our relevant code looks like the following:

In [None]:
class User():
    __table__ = 'users'
    columns = ['id', 'name', 'birthday']

And the code for our save function looks like:

In [None]:
def values(obj):
    obj_attrs = obj.__dict__
    return [obj_attrs[attr] for attr in obj.columns if attr in obj_attrs.keys()]

def keys(obj):
    obj_attrs = obj.__dict__
    selected = [attr for attr in obj.columns if attr in obj_attrs.keys()]
    # join into a single string so it can be placed directly in the SQL
    return ', '.join(selected)

def save(obj, conn, cursor):
    s_str = ', '.join(len(values(obj)) * ['%s'])
    insert_str = f"""INSERT INTO {obj.__table__} ({keys(obj)}) VALUES ({s_str});"""
    cursor.execute(insert_str, values(obj))
    conn.commit()
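To check the SQL that `save` builds without connecting to a database, one option is to pass in stand-in objects that only record what they are asked to do.  The sketch below assumes two hypothetical classes, `RecordingCursor` and `RecordingConnection`, which are not part of any database library, and uses the comma-joined version of `keys` from earlier in the lesson.

```python
class RecordingCursor:
    """Hypothetical stand-in for a database cursor; it only records calls."""
    def __init__(self):
        self.executed = []

    def execute(self, sql, params):
        self.executed.append((sql, params))

class RecordingConnection:
    """Hypothetical stand-in for a connection; it only counts commits."""
    def __init__(self):
        self.commits = 0

    def commit(self):
        self.commits += 1

class User():
    __table__ = 'users'
    columns = ['id', 'name', 'birthday']

def values(obj):
    obj_attrs = obj.__dict__
    return [obj_attrs[attr] for attr in obj.columns if attr in obj_attrs.keys()]

def keys(obj):
    obj_attrs = obj.__dict__
    selected = [attr for attr in obj.columns if attr in obj_attrs.keys()]
    return ', '.join(selected)

def save(obj, conn, cursor):
    s_str = ', '.join(len(values(obj)) * ['%s'])
    insert_str = f"""INSERT INTO {obj.__table__} ({keys(obj)}) VALUES ({s_str});"""
    cursor.execute(insert_str, values(obj))
    conn.commit()

user = User()
user.name = 'bob'
user.birthday = '3/5/1997'

conn = RecordingConnection()
cursor = RecordingCursor()
save(user, conn, cursor)

sql, params = cursor.executed[0]
print(sql)
print(params)
```

With a real connection and cursor, the same call would send this statement and these parameters to the database and then commit.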