# Task No.2 Normalization

**Student:** Jingyu Yan

## 0. Task

1. Read Lecture #2, Normalization.

2. Normalize the dataset from Task 1 and create a relational database schema for the OLTP system based on it.

3. Create the relational database. You can use any RDBMS, such as Oracle, MS SQL Server, PostgreSQL, MySQL, or MariaDB.

4. Provide the created relational database schema (the attached schema is an example).

![](images/database_schema.jpg)

## 1. Analysis

In Task 1, I selected the dataset "eCommerce Events History in Cosmetics Shop" and carried out a basic data analysis.

Task 1 URL: [https://github.com/tunmx/Databases_and_Data_Storages_Technologies/blob/main/task_1/Task1_OLTP_Systems.ipynb](https://github.com/tunmx/Databases_and_Data_Storages_Technologies/blob/main/task_1/Task1_OLTP_Systems.ipynb)

The dataset consists of a single flat 'events' table that may have been produced by a query joining several tables. This makes it convenient for data analysis, but product, category, and brand attributes are repeated on every event row.

In [1]:
import pandas as pd
from IPython.display import display

# read the csv
orders_data = pd.read_csv('../dataset/2020-Jan.csv')
# show top 5
display(orders_data.head(5))

|   | event_time | event_type | product_id | category_id | category_code | brand | price | user_id | user_session |
|---|------------|------------|------------|-------------|---------------|-------|-------|---------|--------------|
| 0 | 2020-01-01 00:00:00 UTC | view | 5809910 | 1602943681873052386 |  | grattol | 5.24 | 595414620 | 4adb70bb-edbd-4981-b60f-a05bfd32683a |
| 1 | 2020-01-01 00:00:09 UTC | view | 5812943 | 1487580012121948301 |  | kinetics | 3.97 | 595414640 | c8c5205d-be43-4f1d-aa56-4828b8151c8a |
| 2 | 2020-01-01 00:00:19 UTC | view | 5798924 | 1783999068867920626 |  | zinger | 3.97 | 595412617 | 46a5010f-bd69-4fbe-a00d-bb17aa7b46f3 |
| 3 | 2020-01-01 00:00:24 UTC | view | 5793052 | 1487580005754995573 |  |  | 4.92 | 420652863 | 546f6af3-a517-4752-a98b-80c4c5860711 |
| 4 | 2020-01-01 00:00:25 UTC | view | 5899926 | 2115334439910245200 |  |  | 3.92 | 484071203 | cff70ddf-529e-4b0c-a4fc-f43a749c0acb |


After reviewing the columns, I dropped the 'user_session' field and used the remaining columns as the starting point for a database, splitting them into separate tables and adding further tables to form a complete e-commerce system.
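The split into separate tables can be illustrated on the five sample rows shown earlier. This is only a minimal sketch using the standard library: the function name `split_events` and the dictionary layouts are illustrative assumptions, and keeping the first price seen per product is an assumption as well, not something the dataset guarantees.

```python
import csv
import io

# The five sample rows from the dataset preview (index column omitted).
SAMPLE_CSV = """\
event_time,event_type,product_id,category_id,category_code,brand,price,user_id,user_session
2020-01-01 00:00:00 UTC,view,5809910,1602943681873052386,,grattol,5.24,595414620,4adb70bb-edbd-4981-b60f-a05bfd32683a
2020-01-01 00:00:09 UTC,view,5812943,1487580012121948301,,kinetics,3.97,595414640,c8c5205d-be43-4f1d-aa56-4828b8151c8a
2020-01-01 00:00:19 UTC,view,5798924,1783999068867920626,,zinger,3.97,595412617,46a5010f-bd69-4fbe-a00d-bb17aa7b46f3
2020-01-01 00:00:24 UTC,view,5793052,1487580005754995573,,,4.92,420652863,546f6af3-a517-4752-a98b-80c4c5860711
2020-01-01 00:00:25 UTC,view,5899926,2115334439910245200,,,3.92,484071203,cff70ddf-529e-4b0c-a4fc-f43a749c0acb
"""


def split_events(rows):
    """Split flat event rows into per-entity collections (hypothetical helper)."""
    categories, products, users, events = {}, {}, set(), []
    for row in rows:
        category_id = int(row["category_id"])
        product_id = int(row["product_id"])
        user_id = int(row["user_id"])
        # Empty CSV cells become None so they can be stored as SQL NULL.
        categories.setdefault(
            category_id, {"id": category_id, "code": row["category_code"] or None}
        )
        # Assumption: the first price seen for a product is kept as its price.
        products.setdefault(
            product_id,
            {
                "id": product_id,
                "category_id": category_id,
                "brand": row["brand"] or None,
                "price": float(row["price"]),
            },
        )
        users.add(user_id)
        # user_session is deliberately not copied into the event record.
        events.append(
            {
                "user_id": user_id,
                "product_id": product_id,
                "event_type": row["event_type"],
                "event_time": row["event_time"],
            }
        )
    return categories, products, users, events


categories, products, users, events = split_events(
    csv.DictReader(io.StringIO(SAMPLE_CSV))
)
print(len(categories), len(products), len(users), len(events))
```

Each product and category is stored once, and every event row only keeps the IDs that point to them.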

## 2. Design

Based on this dataset, I designed the following tables to form a complete e-commerce database: User, Product, Category, Order, OrderDetail, Event, Cart, and CartItem.

In [2]:
from IPython.display import display, HTML

css = '''
<style>
    .rendered_html table, .rendered_html th, .rendered_html tr, .rendered_html td {
        text-align: left !important;
        border: none !important;
        margin-left: 0 !important;
        margin-right: 0 !important;
    }
    .rendered_html h3, .rendered_html h2, .rendered_html h1 {
        margin-left: 0 !important;
        margin-right: 0 !important;
    }
</style>
'''
display(HTML(css))

### User Table

| Field             | Type         | Description            |
|-------------------|--------------|------------------------|
| id                | Integer      | The unique identifier for the user, primary key. |
| username          | String(100)  | The username, must be non-null and unique in the table. |
| email             | String(100)  | The user's email address, must be non-null and unique in the table. |
| password_hash     | String(255)  | The hash of the user's password, must be non-null. |
| registration_date | DateTime     | The date and time of user registration, defaults to the current time. |
| last_login        | DateTime     | The date and time of the user's last login, updated upon each login. |


### Product Table

| Field          | Type         | Description                              |
|----------------|--------------|------------------------------------------|
| id             | Integer      | Unique identifier for the product, primary key.                    |
| name           | String(100)  | Product name, cannot be null.                        |
| category_id    | Integer      | Foreign key reference to the product category.                    |
| brand          | String(50)   | Product brand.                                 |
| price          | Float        | Product price, cannot be null.                        |
| description    | String(255)  | Product description.                                 |
| stock_quantity | Integer      | Product stock quantity, cannot be null.        |

### Category Table

| Field              | Type    | Description                                       |
|--------------------|---------|---------------------------------------------------|
| id                 | Integer | The unique identifier for the category, primary key. |
| name               | String(100) | The category name, must be non-null and unique within the table. |
| code               | String(50)  | The category code, must be non-null and unique within the table. |
| parent_category_id | Integer | Foreign key reference to the parent category (if applicable). |

### Order Table

| Field       | Type    | Description                                      |
|-------------|---------|--------------------------------------------------|
| id          | Integer | The unique identifier for the order, primary key. |
| user_id     | Integer | Foreign key reference to the user who created the order. |
| order_date  | DateTime | The date and time when the order was created, defaults to the current time. |
| status      | String(50) | The status of the order.                        |
| total_price | Float   | The total price of the order, must be non-null.  |

### OrderDetail Table

| Field      | Type    | Description                                      |
|------------|---------|--------------------------------------------------|
| id         | Integer | The unique identifier for the order detail, primary key. |
| order_id   | Integer | Foreign key reference to the associated order.   |
| product_id | Integer | Foreign key reference to the purchased product.  |
| quantity   | Integer | The quantity of the product purchased, must be non-null. |
| price      | Float   | The price of the product purchased, must be non-null. |
| discount   | Float   | The discount on the purchased product (if applicable). |
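Since `total_price` in the Order table can be derived from the OrderDetail rows, it may help to see how the two relate. The following is a rough sketch, not part of the original design: it assumes `discount` is a fraction between 0 and 1 applied to the line total (the schema does not say whether it is a fraction or an absolute amount), and it uses `Decimal` because binary floats cannot represent most decimal prices exactly. The function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def line_total(quantity, price, discount=None):
    """Return quantity * price, reduced by an optional fractional discount.

    Assumption: discount is a fraction such as "0.10" for 10 percent.
    """
    total = Decimal(quantity) * Decimal(str(price))
    if discount:
        total *= Decimal("1") - Decimal(str(discount))
    # Round to cents, half up.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def order_total(details):
    """Sum the line totals of all order detail rows."""
    return sum((line_total(**detail) for detail in details), Decimal("0.00"))


details = [
    {"quantity": 2, "price": "5.24"},
    {"quantity": 1, "price": "3.97", "discount": "0.10"},
]
print(order_total(details))
```

If exact amounts matter, a fixed-point column type such as SQLAlchemy's `Numeric` could be considered in place of `Float` for the price columns.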

### Event Table

| Field      | Type    | Description                                       |
|------------|---------|---------------------------------------------------|
| id         | Integer | The unique identifier for the event, primary key.  |
| user_id    | Integer | Foreign key reference to the user related to the event. |
| product_id | Integer | Foreign key reference to the product related to the event. |
| event_type | String(50) | The type of event (e.g., view, add to cart, etc.). |
| event_time | DateTime | The date and time when the event occurred, defaults to the current time. |
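To check that the normalized tables still carry the information of the original flat rows, the rows can be reconstructed with a join. The sketch below uses the standard-library `sqlite3` module with an in-memory database purely for illustration (the project itself targets MySQL), includes only the columns needed for the demonstration, and omits the User table for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE categories (id INTEGER PRIMARY KEY, code TEXT);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        category_id INTEGER REFERENCES categories(id),
        brand TEXT,
        price REAL NOT NULL
    );
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        product_id INTEGER REFERENCES products(id),
        event_type TEXT,
        event_time TEXT
    );
    """
)

# First sample row of the dataset, split across the three tables.
conn.execute("INSERT INTO categories VALUES (1602943681873052386, NULL)")
conn.execute(
    "INSERT INTO products VALUES (5809910, 1602943681873052386, 'grattol', 5.24)"
)
conn.execute(
    "INSERT INTO events (user_id, product_id, event_type, event_time) "
    "VALUES (595414620, 5809910, 'view', '2020-01-01 00:00:00')"
)

# Join the tables back into the shape of the original flat event row.
flat_rows = conn.execute(
    """
    SELECT e.event_time, e.event_type, e.product_id, p.category_id,
           c.code, p.brand, p.price, e.user_id
    FROM events AS e
    JOIN products AS p ON p.id = e.product_id
    JOIN categories AS c ON c.id = p.category_id
    """
).fetchall()
print(flat_rows)
```

The query returns one row per event with the product and category attributes attached, which is the structure the source dataset appears to have.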

### Cart Table

| Field   | Type    | Description                                       |
|---------|---------|---------------------------------------------------|
| id      | Integer | The unique identifier for the cart, primary key.   |
| user_id | Integer | Foreign key reference to the user owning the cart, unique. |

### CartItem Table

| Field      | Type     | Description                                       |
|------------|----------|---------------------------------------------------|
| id         | Integer  | The unique identifier for the cart item, primary key. |
| cart_id    | Integer  | Foreign key reference to the cart's identifier.   |
| product_id | Integer  | Foreign key reference to the product's identifier.|
| quantity   | Integer  | The quantity of the product in the cart, must be non-null. |
| added_time | DateTime | The date and time when the product was added to the cart, defaults to the current time. |

### ERD

I designed an Entity-Relationship (ER) diagram for these tables, shown below.

![](ERD.drawio.png)

## 3. Implementation

I chose MySQL as the database and implemented the schema in a Python environment. SQLAlchemy, a widely used ORM tool for Python, offers a convenient way to define models in code and map them to database tables, so I use it for this task. The configuration is outlined in the table below.

| Parameter      | Value       | Description                     |
|----------------|-------------|---------------------------------|
| System         | Unix-like   | macOS.                          |
| Container      | Docker      | Indicates that MySQL is running in a Docker container. |
| Database       | MySQL 5.7       | The chosen database system.     |
| Environment    | Python 3.7      | The programming environment.    |
| ORM Tool       | SQLAlchemy  | The tool for object-relational mapping. |
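The models in this section import `get_engine` from a `setting` module that is not shown here. A minimal sketch of what such a module might look like is given below. Everything in it is an assumption: the PyMySQL driver, the environment variable names, the default host and port, and the database name `ecommerce` are placeholders and should be adapted to the actual Docker container configuration.

```python
import os
from urllib.parse import quote_plus


def build_mysql_url(user, password, host="127.0.0.1", port=3306, database="ecommerce"):
    """Build a SQLAlchemy URL for MySQL (assumes the PyMySQL driver)."""
    # quote_plus escapes characters such as '@' or spaces in the credentials.
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}?charset=utf8mb4"
    )


def get_engine():
    """Create an engine from environment variables (hypothetical names)."""
    # Imported lazily so that the URL helper works without SQLAlchemy installed.
    from sqlalchemy import create_engine

    url = build_mysql_url(
        user=os.environ.get("MYSQL_USER", "root"),
        password=os.environ.get("MYSQL_PASSWORD", ""),
        host=os.environ.get("MYSQL_HOST", "127.0.0.1"),
        port=int(os.environ.get("MYSQL_PORT", "3306")),
        database=os.environ.get("MYSQL_DATABASE", "ecommerce"),
    )
    # pool_pre_ping checks a connection before using it from the pool.
    return create_engine(url, pool_pre_ping=True)


print(build_mysql_url("app", "p@ss word"))
```

Reading the credentials from the environment keeps them out of the notebook and the repository.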


### 3.1 ORM Model Definition with SQLAlchemy

I define the database schema as SQLAlchemy ORM models, one class per table, and annotate each column and relationship with a comment.

In [3]:
# %load models.py
from sqlalchemy import create_engine, Column, Integer, String, Float, DateTime, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.sql import func
from setting import get_engine

Base = declarative_base()

class User(Base):
    """User model"""
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)  # User ID, primary key
    username = Column(String(100), nullable=False, unique=True)  # Username, must be non-null and unique
    email = Column(String(100), nullable=False, unique=True)  # Email, must be non-null and unique
    password_hash = Column(String(255), nullable=False)  # Password hash, must be non-null
    registration_date = Column(DateTime, default=func.now())  # Registration date and time, defaults to now
    last_login = Column(DateTime, onupdate=func.now())  # Last login date and time; note that onupdate fires on every UPDATE of the row, not only on login

    # Relationships with other tables
    orders = relationship('Order', back_populates='user')  # User's orders
    events = relationship('Event', back_populates='user')  # User's events
    cart = relationship('Cart', uselist=False, back_populates='user', cascade='all, delete-orphan')  # User's cart

class Product(Base):
    """Product model"""
    __tablename__ = 'products'
    id = Column(Integer, primary_key=True)  # Product ID, primary key
    name = Column(String(100), nullable=False)  # Product name, must be non-null
    category_id = Column(Integer, ForeignKey('categories.id'))  # Foreign key to category
    brand = Column(String(50))  # Brand
    price = Column(Float, nullable=False)  # Price, must be non-null
    description = Column(String(255))  # Description
    stock_quantity = Column(Integer, nullable=False)  # Stock quantity, must be non-null

    # Relationships with other tables
    category = relationship('Category', back_populates='products')  # Category of this product; counterpart of Category.products
    order_details = relationship('OrderDetail', back_populates='product')  # Product's order details
    events = relationship('Event', back_populates='product')  # Product's events
    cart_items = relationship('CartItem', back_populates='product')  # Products in the cart

class Category(Base):
    """Category model"""
    __tablename__ = 'categories'
    id = Column(Integer, primary_key=True)  # Category ID, primary key
    name = Column(String(100), nullable=False, unique=True)  # Category name, must be non-null and unique
    code = Column(String(50), nullable=False, unique=True)  # Category code, must be non-null and unique
    parent_category_id = Column(Integer, ForeignKey('categories.id'))  # Parent category ID, foreign key

    # Relationship with the product table
    products = relationship('Product', back_populates='category')  # Products belonging to this category

class Order(Base):
    """Order model"""
    __tablename__ = 'orders'
    id = Column(Integer, primary_key=True)  # Order ID, primary key
    user_id = Column(Integer, ForeignKey('users.id'))  # Foreign key to user who created the order
    order_date = Column(DateTime, default=func.now())  # Order creation date and time, defaults to now
    status = Column(String(50))  # Order status
    total_price = Column(Float, nullable=False)  # Total price of the order, must be non-null

    # Relationships with other tables
    user = relationship('User', back_populates='orders')  # User who placed the order
    order_details = relationship('OrderDetail', back_populates='order')  # Order details

class OrderDetail(Base):
    """Order detail model"""
    __tablename__ = 'order_details'
    id = Column(Integer, primary_key=True)  # Order detail ID, primary key
    order_id = Column(Integer, ForeignKey('orders.id'))  # Foreign key to the order
    product_id = Column(Integer, ForeignKey('products.id'))  # Foreign key to the product
    quantity = Column(Integer, nullable=False)  # Quantity of the product, must be non-null
    price = Column(Float, nullable=False)  # Price of the product, must be non-null
    discount = Column(Float)  # Discount on the product, if applicable

    # Relationships with other tables
    order = relationship('Order', back_populates='order_details')  # The order to which this detail belongs
    product = relationship('Product', back_populates='order_details')  # The product in this order detail

class Event(Base):
    """Event model"""
    __tablename__ = 'events'
    id = Column(Integer, primary_key=True)  # Event ID, primary key
    user_id = Column(Integer, ForeignKey('users.id'))  # Foreign key to the user related to the event
    product_id = Column(Integer, ForeignKey('products.id'))  # Foreign key to the product related to the event
    event_type = Column(String(50))  # Type of event (e.g., view, add to cart, etc.)
    event_time = Column(DateTime, default=func.now())  # Date and time of the event, defaults to now

    # Relationships with other tables
    user = relationship('User', back_populates='events')  # User related to this event
    product = relationship('Product', back_populates='events')  # Product related to this event

class Cart(Base):
    """Cart model"""
    __tablename__ = 'carts'
    id = Column(Integer, primary_key=True)  # Cart ID, primary key
    user_id = Column(Integer, ForeignKey('users.id'), unique=True)  # Foreign key to the user, unique

    # Relationships with other tables
    user = relationship('User', back_populates='cart')  # User owning this cart
    cart_items = relationship('CartItem', back_populates='cart', cascade='all, delete-orphan')  # Items in the cart

class CartItem(Base):
    """Cart item model"""
    __tablename__ = 'cart_items'
    id = Column(Integer, primary_key=True)  # Cart item ID, primary key
    cart_id = Column(Integer, ForeignKey('carts.id'))  # Foreign key to the cart
    product_id = Column(Integer, ForeignKey('products.id'))  # Foreign key to the product
    quantity = Column(Integer, nullable=False)  # Quantity of the product in the cart, must be non-null
    added_time = Column(DateTime, default=func.now())  # Date and time the product was added to the cart, defaults to now

    # Relationships with other tables
    cart = relationship('Cart', back_populates='cart_items')  # The cart to which this item belongs
    product = relationship('Product', back_populates='cart_items')  # The product in this cart item

if __name__ == "__main__":
    engine = get_engine()
    Base.metadata.create_all(engine)


### 3.2 Using Navicat to View the Entity-Relationship Model

After running the code above to create these tables in MySQL, I viewed the resulting entity-relationship diagram in Navicat:

![](navicat_erd.png)