# Data Representation
- We will use object-oriented programming (OOP) to handle our data.
- Up to this point, we have scraped the data and stored it in CSV files.
- Now we want to load the data back from those CSV files.
- Rather than treating each record as a "row" in the database, we encapsulate it in an object, so that we can work with objects and the interactions between them.

- We will represent our data as ``Media`` objects
  - A ``Media`` object has attributes such as `media_id`, `image`, etc.
  - A ``Media`` object also holds references to a `User` and a `Location`
  - ``User`` and ``Location`` are objects as well
  
  - There are three types of ``Media``, which we model using an OOP mechanism called [inheritance](https://www.w3schools.com/python/python_inheritance.asp):
      - `MediaImage`
      - `MediaVideo`
      - `MediaCarousel`
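
The structure above can be sketched with dataclasses. This is a minimal illustration rather than the project's actual models: the field sets are reduced to a few, `video_url` is a hypothetical field, and treating `location` as optional is an assumption. The real classes are imported from `core.db.models` in the next cells and replace these stand-ins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: str
    username: str


@dataclass
class Location:
    id: str
    name: str


@dataclass
class Media:
    """Simplified base type holding the fields shared by every kind of media."""
    media_id: str
    user: User                    # reference to another object
    location: Optional[Location]  # assumption: some media have no location


@dataclass
class MediaImage(Media):
    image_url: str = ""


@dataclass
class MediaVideo(Media):
    video_url: str = ""  # hypothetical field, for illustration only


post = MediaImage(
    media_id="1",
    user=User(id="42", username="alice"),
    location=None,
    image_url="https://example.com/a.jpg",
)
print(isinstance(post, Media))  # True: a MediaImage is also a Media
print(post.user.username)       # alice
```

Code that only needs the shared attributes can accept any `Media`, while image-specific code can ask for a `MediaImage`.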

Let's import the libraries and the models.

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import os, sys, time, datetime, pathlib
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline

PROJECT_ROOT = pathlib.Path('../../../project-TT')
sys.path.append(str(PROJECT_ROOT / 'backend'))

from core.db.models import User, Location, Media, MediaCarousel, MediaImage, MediaVideo

# Object definitions

- We briefly describe what each object looks like and which attributes it has.
- First, `User` is an object which has `id`, `username` and `full_name`.

```python
class User:
    def __init__(self, id, username, full_name):
        ...  # body omitted
```

- Second, `Location` is an object which has `id`, `name` and `slug`.

```python
class Location:
    def __init__(self, id, name, slug):
        ...  # body omitted
```

### Media class and subclasses
- Every Instagram post has the following attributes:
    - user
    - location
    - id
    - url
    - time created
    - caption
    - comments
    - number of likes
    - etc.
- A post might be an image, a video or a carousel, and we do not know which in advance.
- To represent the common "template" underlying all Instagram posts, we define the following abstract class called `Media`.

Parent class:
```python
import abc


class Media(abc.ABC):
    """ Abstract class representing Media object """
    def __init__(
        self,
        id,
        url,
        time,
        likes_count,
        caption,
        comments,
        user,
        location,
    ):
        ...  # body omitted
```

- To represent a concrete kind of media (e.g., media containing an image), we have `MediaImage`, which subclasses `Media`.


Child class:
```python
class MediaImage(Media):
    """ Concrete class representing image media """
    def __init__(
        self,
        id,
        url,
        time,
        likes_count,
        caption,
        comments,
        user,
        location,
        image_url,
        thumbnail_url,
        image=None,
        thumbnail=None,
    ):
        ...  # body omitted
```
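
One way the parent and child constructors could fit together is sketched below. It is an illustration under assumptions, not the project's code: `BaseMedia` and `ImageMedia` are hypothetical stand-ins with a reduced set of attributes, and `media_type` is a hypothetical abstract method added only to show how `abc` prevents the parent from being instantiated directly.

```python
import abc


class BaseMedia(abc.ABC):
    """Hypothetical stand-in for Media, reduced to two shared attributes."""

    def __init__(self, id, url):
        self.id = id
        self.url = url

    @abc.abstractmethod
    def media_type(self):
        """Hypothetical hook that every concrete subclass must implement."""


class ImageMedia(BaseMedia):
    """Hypothetical stand-in for MediaImage."""

    def __init__(self, id, url, image_url, image=None):
        super().__init__(id, url)   # shared attributes are set by the parent
        self.image_url = image_url  # image-specific attributes are set here
        self.image = image          # stays None until the image is loaded

    def media_type(self):
        return "image"


item = ImageMedia(
    id="1",
    url="https://example.com/p/1",
    image_url="https://example.com/1.jpg",
)
print(item.id, item.media_type())  # 1 image

try:
    BaseMedia(id="2", url="https://example.com/p/2")
except TypeError as exc:
    # A class with unimplemented abstract methods cannot be instantiated
    print("cannot instantiate the abstract class:", exc)
```

Delegating to `super().__init__` keeps the shared attributes in one place, so a video or carousel subclass would only add its own fields.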

- We also have `MediaVideo` and `MediaCarousel`.

- Notice that a `Media` object has references to `User` and `Location`.
- Now, how can we convert each row of a pandas table into a `Media` object? Several steps are needed:
    - Create the `User` and `Location` objects
    - Parse `comments` into a list of comments
    - Convert the raw "time" representation into a Python `datetime` object
    - Load the image from the database
    
- These steps must all happen before the data can be loaded into a model. We do this in a method called `MediaImage.create_from_row(dataframe_row)`.
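
The conversion steps above can be sketched as follows. This is not the actual `MediaImage.create_from_row`: the class names (`RowUser`, `RowLocation`, `RowMediaImage`) and all column names are hypothetical, the comments are assumed to be stored as a JSON list, and the raw time is assumed to be a Unix timestamp in seconds. Image loading is left out.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class RowUser:
    id: str
    username: str
    full_name: str


@dataclass
class RowLocation:
    id: str
    name: str
    slug: str


@dataclass
class RowMediaImage:
    id: str
    time: datetime
    comments: List[str]
    user: RowUser
    location: RowLocation

    @classmethod
    def create_from_row(cls, row):
        """Build an object from one row (a dict or a pandas Series)."""
        # 1. Build the referenced objects first (column names are assumed)
        user = RowUser(row["user_id"], row["username"], row["full_name"])
        location = RowLocation(
            row["location_id"], row["location_name"], row["location_slug"]
        )
        # 2. Parse the comments (assumed to be stored as a JSON list)
        comments = json.loads(row["comments"])
        # 3. Convert the raw time (assumed to be a Unix timestamp in seconds)
        time = datetime.fromtimestamp(int(row["time"]), tz=timezone.utc)
        return cls(row["id"], time, comments, user, location)


example_row = {
    "id": "1",
    "time": "1577836800",
    "comments": '["nice!", "great shot"]',
    "user_id": "42",
    "username": "alice",
    "full_name": "Alice Example",
    "location_id": "7",
    "location_name": "Tokyo",
    "location_slug": "tokyo",
}
media_example = RowMediaImage.create_from_row(example_row)
print(media_example.time.isoformat())  # 2020-01-01T00:00:00+00:00
print(media_example.comments)          # ['nice!', 'great shot']
```

Keeping all parsing inside a single class method means the rest of the notebook only ever deals with fully built objects.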

Let's try loading a row of the dataset into an object!

In [None]:
# load dataset csv
dataset = pd.read_csv('../../data/data.csv', quotechar="'")

In [None]:
# load first element
media = dataset.iloc[0]

In [None]:
# load the first element into MediaImage
media_image = MediaImage.create_from_row(media)
print(media_image)

That's it!

In [None]:
# we can access attributes like this:
print(media_image.id)
print(media_image.caption)
print(media_image.comments)
print(media_image.user)
print(media_image.location)

In [None]:
# load the thumbnail
media_image.load_thumbnail(root_dir=PROJECT_ROOT)

In [None]:
# load the image
media_image.load_image(root_dir=PROJECT_ROOT)

We can load the first 100 rows in the same way.

In [None]:
# list to keep track of them
media_images = []
for i in range(100):
    media_image = MediaImage.create_from_row(dataset.iloc[i])
    media_image.load_image(root_dir=PROJECT_ROOT)
    media_image.load_thumbnail(root_dir=PROJECT_ROOT)
    media_images.append(media_image)

In [None]:
# let's visualize the thumbnails!
fig, axes = plt.subplots(nrows=10, ncols=10, figsize=(15,15))
axes = axes.ravel()
for i, media in enumerate(media_images):
    axes[i].imshow(media.thumbnail)
    axes[i].axis('off')
plt.subplots_adjust()
plt.show()