<a href="https://colab.research.google.com/github/brendanpshea/database_sql/blob/main/Database_05_Data_Modeling_and_ER_Diagrams.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Data Modeling and E-R Diagrams
### Brendan Shea, PhD


**Data modeling** is like building the blueprints for a house. But instead of creating spaces for people to live in, we're making space for data to live in a database. This chapter will guide you through the steps of drawing up these blueprints for your database, starting from your initial concept to the final design.

First, we'll get to grips with the basics of data modeling, explaining why it's so important for any database project. You'll learn how business rules, which are essentially the "house rules" for your data, help shape the structure and organization of your database.

Next, we'll venture into the realms of conceptual and logical modeling. Conceptual modeling is like sketching your house’s layout, giving us a high-level overview of how everything fits together. Logical modeling, on the other hand, is where we decide what materials to use and how to construct the house, providing a detailed view of the data structures and relationships.

We'll also introduce you to two commonly used notations for data modeling: Chen's Entity-Relationship (E-R) Diagrams and Crow's Foot E-R Diagrams. These are like different styles of architectural drawings, each with its own way of showing how the data in your database are connected.

To make things more interesting and practical, we're going to walk through this process using a case study. We'll be helping Wednesday Addams, who wants to build a database for her online shop selling quirky magical items. By the end of this chapter, you'll be equipped with the knowledge to create your own robust and efficient database designs.

## Case Study: Wednesday Designs a Database
Wednesday Addams, a unique entrepreneur known for her love of all things eccentric and magical, has embarked on an exciting digital venture. She's bringing her unique taste to the digital marketplace with an online store offering an assortment of quirky magical items, ranging from enchanted crystals to rare spell books, and even mysterious potion ingredients.

Being a detail-oriented individual, Wednesday is well aware that a robust and efficient database is essential for managing her diverse inventory, handling customer details, and streamlining transactions. However, she's facing a challenge: how to structure and organize her data in a manner that suits her distinct business rules, effectively serving her customers while maintaining the unique charm of her store.

This database isn't just about cataloging items for sale; it's about tracking customers' purchase history, preferred items, payment methods, shipping details, and so much more. Each of these elements of data needs to be accessible and interrelated in a way that makes sense for her business.

Over the course of this chapter, we're going to assist Wednesday in her journey from conceptualizing to finalizing her database structure. We'll apply the principles of data modeling, keeping in mind her specific business rules and needs. Along the way, we'll be using both Chen's Entity-Relationship (E-R) Diagrams and Crow's Foot E-R Diagrams, giving you an understanding of how these tools can aid in the data modeling process.

Through Wednesday's case study, we'll explore the trials, triumphs, and critical decision points of designing a database. The goal? To provide you with a real-world application of the concepts discussed in this chapter, equipping you with the tools and knowledge to take on your own data modeling challenges.

## What is Data Modeling?
**Data modeling** is the process of constructing a framework that represents data, their relationships, and the rules that govern these relationships. It’s akin to creating a blueprint, or map, that provides an overview of how the data within a system will interact. The different types of data models are like different kinds of maps, each with its own unique representation of the data landscape.

Now, you may wonder, why go through all the trouble of creating a model? Can't we dive right in and start building the database? Well, the answer lies in the famous quote usually attributed to the statistician George E.P. Box: "All models are wrong, but some are useful."

Even though all models are simplifications of reality and hence technically "wrong," the utility of a model comes from its ability to effectively represent the most important aspects of that reality. They help us understand complex systems and communicate that understanding to others. In the context of databases, a good data model serves as a useful guide. It gives the database designer a clear vision of the structure to be created and helps them avoid errors that could be expensive to fix later.

There are many types of data models, and each has its unique strengths and applications. For example, the **Entity-Relationship (E-R) model** uses entities, attributes, and relationships to represent data. In our case study of Wednesday Addams' online store, an entity might be a 'Customer,' an attribute could be the 'CustomerName,' and a relationship could be a 'Purchase' linking the 'Customer' and a 'Product.' This model is often depicted visually (and we'll see examples later).

**Relational models**, on the other hand, represent data in tables, or relations. Each row in a table represents a unique entity, and each column represents an attribute of that entity. For instance, in Wednesday’s store, a 'Product' table might have rows representing each product and columns for attributes like 'ProductID,' 'ProductName,' 'Price,' and 'Stock.' We've already encountered relational data models when working with SQL.
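To make the idea of a relation concrete, here is a minimal sketch of the 'Product' table using Python's built-in `sqlite3` module. The column types and the two sample products are assumptions made for illustration; they are not part of Wednesday's actual catalog.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Each column is an attribute of the Product entity (types are assumed).
conn.execute("""
    CREATE TABLE Product (
        ProductID   INTEGER PRIMARY KEY,
        ProductName TEXT    NOT NULL,
        Price       REAL    NOT NULL,
        Stock       INTEGER NOT NULL
    )
""")

# Each row is one product (hypothetical sample data).
conn.executemany(
    "INSERT INTO Product (ProductID, ProductName, Price, Stock) VALUES (?, ?, ?, ?)",
    [(1, "Enchanted Crystal", 19.99, 12), (2, "Rare Spell Book", 49.50, 3)],
)

rows = conn.execute(
    "SELECT ProductName, Stock FROM Product ORDER BY ProductID"
).fetchall()
print(rows)
```

Each tuple printed at the end corresponds to one row of the table, and each position in the tuple corresponds to one of the selected columns.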

**Graph data models**, increasingly popular in the era of social networks, emphasize the relationships between data. **Nodes** represent entities, while **edges** represent relationships. In the context of Wednesday's store, a customer could be a node, and their purchasing behavior could be represented by edges connecting them to different product nodes.
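As a rough illustration, nodes and edges can be sketched with plain Python data structures. This is not how a real graph database stores data; the customer names, node IDs, and the `PURCHASED` edge label are all hypothetical.

```python
# Nodes: one entry per entity, keyed by a made-up ID.
nodes = {
    "c1": {"type": "Customer", "name": "Enid"},
    "c2": {"type": "Customer", "name": "Eugene"},
    "p1": {"type": "Product", "name": "Enchanted Crystal"},
    "p2": {"type": "Product", "name": "Rare Spell Book"},
}

# Edges: (source node, relationship label, target node).
edges = [
    ("c1", "PURCHASED", "p1"),
    ("c1", "PURCHASED", "p2"),
    ("c2", "PURCHASED", "p1"),
]

def purchased_by(customer_id):
    """Follow PURCHASED edges from a customer node to product names."""
    return [
        nodes[target]["name"]
        for source, label, target in edges
        if source == customer_id and label == "PURCHASED"
    ]

print(purchased_by("c1"))
```

Notice that answering "what did this customer buy?" is a matter of following edges, which is the kind of question graph models are designed around.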

The type of model used can change throughout the database design process, often progressing from more abstract to more detailed models. This progression mirrors the stages of data modeling: conceptual, logical, and physical.

1. In the **conceptual modeling** stage, we create a high-level design that represents the major entities, their attributes, and relationships. This is usually technology-independent and focuses more on describing the data from a business perspective. An E-R model is often used in this stage.

2. **Logical modeling** goes a step further, translating the conceptual design into precise data structures: tables, columns and their data types, primary keys, foreign keys, and other constraints. This is where a relational model often comes into play.

3. Finally, the **physical modeling** stage is where the specific technical details of the database are decided, including the physical storage method, access paths, and optimization for performance. This could involve detailed schemas and definitions that are specific to the chosen database management system (DBMS).
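To give a feel for what a physical-stage decision looks like, the sketch below adds an index (one kind of access path) to a 'Product' table in SQLite. The choice of SQLite, the column types, and the index name `idx_product_name` are assumptions for illustration only; a different DBMS would offer different options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table definition itself is mostly a logical-stage decision.
conn.execute("""
    CREATE TABLE Product (
        ProductID   INTEGER PRIMARY KEY,
        ProductName TEXT    NOT NULL,
        Price       REAL    NOT NULL,
        Stock       INTEGER NOT NULL
    )
""")

# A physical-stage decision: speed up lookups by product name.
conn.execute("CREATE INDEX idx_product_name ON Product (ProductName)")

# SQLite records its schema objects in the sqlite_master table.
index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'Product'"
    )
]
print(index_names)
```

The index changes nothing about what data the table holds. It only affects how quickly the DBMS can find rows, which is why it belongs to the physical rather than the conceptual stage.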

In essence, data modeling is a critical step in the database creation process, and different kinds of models are useful at different stages. Each model serves as a guide, helping us navigate the complex landscape of data and create an efficient, effective database system.

## Business Rules are the "Base" of Models
**Business rules** are integral to the data modeling process. They are the guiding principles of an organization, defining the operations, logic, and transformations of the business. Simply put, business rules are the "rules of the game," dictating how data should be handled, related, and manipulated within a system. Incorporating business rules into your data model ensures that the resulting database will align with your business operations and fulfill its needs.

For Wednesday Addams' online store, her business rules might be influenced by her specific product range, her customer base, and her operational strategy. Let's consider seven business rules that Wednesday may establish:

1.  Customers must create an account before making a purchase.
2.  Each product can belong to multiple categories (e.g., "potions", "books", "crystals").
3.  Customers can place multiple orders, but each order is associated with one customer.
4.  An order can contain multiple products.
5.  Customers can write a review for a product only if they have purchased that product.
6.  Each product must have an inventory count that decreases with each purchase.
7.  Customers' payment information must be stored securely for future purchases but should not be visible to store employees.

These rules emerge from Wednesday's understanding of her business needs. For instance, Rule 1 is essential for order tracking and personalized customer experience. Rule 2 reflects the nature of her products, and Rule 7 is a common requirement for businesses to ensure customer trust and meet data protection regulations.

In the data modeling process, these rules help define the entities, attributes, and relationships. For instance, Rule 1 suggests a 'Customer' entity with attributes like 'CustomerName', 'Email', etc., and a 'has' relationship with an 'Account' entity. Rule 2 hints at a many-to-many relationship between 'Product' and 'Category' entities. Rule 3 points towards a one-to-many relationship between 'Customer' and 'Order', while Rule 4 points towards a many-to-many relationship between 'Order' and 'Product' (an order can contain many products, and the same product can appear in many orders). Rule 5 implies a conditional relationship among 'Customer', 'Product', and 'Review'. Rule 6 calls for an 'Inventory' attribute in the 'Product' entity. Finally, Rule 7 shapes how 'Payment' information is stored and who can see it.
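Looking ahead to how rules like these eventually show up in a relational schema, here is a hedged sketch of Rules 2 and 3 in SQLite. It assumes the common technique of a linking table (here called `ProductCategory`) for the many-to-many relationship and a foreign key for the one-to-many relationship. The table is named `Orders` because `ORDER` is a reserved word in SQL, and all sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.executescript("""
    CREATE TABLE Customer (
        CustomerID   INTEGER PRIMARY KEY,
        CustomerName TEXT NOT NULL,
        Email        TEXT NOT NULL
    );
    -- Rule 3: each order belongs to exactly one customer (one-to-many).
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID)
    );
    CREATE TABLE Product  (ProductID  INTEGER PRIMARY KEY, ProductName  TEXT NOT NULL);
    CREATE TABLE Category (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT NOT NULL);
    -- Rule 2: a product can be in many categories and vice versa (many-to-many).
    CREATE TABLE ProductCategory (
        ProductID  INTEGER NOT NULL REFERENCES Product(ProductID),
        CategoryID INTEGER NOT NULL REFERENCES Category(CategoryID),
        PRIMARY KEY (ProductID, CategoryID)
    );
""")

conn.execute("INSERT INTO Customer VALUES (1, 'Enid', 'enid@example.com')")
conn.execute("INSERT INTO Orders VALUES (100, 1)")
conn.execute("INSERT INTO Product VALUES (1, 'Moonstone Grimoire')")
conn.executemany("INSERT INTO Category VALUES (?, ?)", [(1, "books"), (2, "crystals")])
conn.executemany("INSERT INTO ProductCategory VALUES (?, ?)", [(1, 1), (1, 2)])

# One product, two categories.
categories = [
    row[0]
    for row in conn.execute("""
        SELECT c.CategoryName
        FROM Category AS c
        JOIN ProductCategory AS pc ON pc.CategoryID = c.CategoryID
        WHERE pc.ProductID = 1
        ORDER BY c.CategoryName
    """)
]

# An order for a customer who does not exist should be rejected.
try:
    conn.execute("INSERT INTO Orders VALUES (101, 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(categories, orphan_rejected)
```

The point of the sketch is that each business rule leaves a visible trace in the schema: Rule 3 becomes a foreign key column, and Rule 2 becomes a separate table whose only job is to pair products with categories.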

Thus, business rules directly shape the data model by defining the **entities** (the nouns of the business, like 'Customer', 'Product', 'Order'), **attributes** (properties of these entities), and **relationships** (how these entities interact). By understanding and implementing these rules, Wednesday ensures that her database fits her business like a glove, enabling smooth operations and efficient use of data.
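Some rules constrain attribute values rather than relationships. As one possible illustration of Rule 6, the sketch below uses a `CHECK` constraint in SQLite so that the inventory count can decrease with each purchase but can never drop below zero. The `purchase` helper, the column types, and the sample product are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Product (
        ProductID   INTEGER PRIMARY KEY,
        ProductName TEXT    NOT NULL,
        Inventory   INTEGER NOT NULL CHECK (Inventory >= 0)
    )
""")
conn.execute("INSERT INTO Product VALUES (1, 'Potion Ingredient Kit', 2)")

def purchase(conn, product_id, quantity):
    """Decrease inventory; return False if there is not enough stock."""
    try:
        with conn:  # commits on success, rolls back if the constraint fails
            conn.execute(
                "UPDATE Product SET Inventory = Inventory - ? WHERE ProductID = ?",
                (quantity, product_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first = purchase(conn, 1, 2)   # buys the last two kits
second = purchase(conn, 1, 1)  # nothing left to sell
remaining = conn.execute(
    "SELECT Inventory FROM Product WHERE ProductID = 1"
).fetchone()[0]
print(first, second, remaining)
```

Putting the rule in the database itself, rather than only in application code, means every program that touches the table has to respect it.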

## Conceptual Modeling
Determining the business rules is part of the **conceptual modeling** stage. In this first step of the data modeling process, the goal is to figure out the broad strokes of your database design. It's the time to identify the crucial components of the business: the key entities, their attributes, and the relationships between them. At this stage, you're not worrying about finer details like the specific data types of attributes or the database management system you'll use. Instead, you're focusing on the "big picture" of your data landscape.

When you are doing conceptual modeling, there are three key questions you should consider:

1.  What are the significant entities? Identifying the entities, or the "nouns" of the business, is an essential starting point. An entity is any significant object or concept that the business needs to keep track of. It could be a tangible object like a 'Product' or an 'Order', or an abstract concept like an 'Account' or a 'Category'. A thorough understanding of your business rules will guide you in identifying these entities.

2.  What are the attributes of these entities, and how do they relate to each other? Once you've identified your entities, you'll need to pinpoint their attributes, or the properties that define them. For instance, a 'Customer' entity might have attributes like 'CustomerName', 'Email', and 'Address'. Also, think about the relationships between these entities. How do they interact? Is there a one-to-one, one-to-many, or many-to-many relationship?

3.  Are there any potential issues or conflicts with these entities, attributes, or relationships? This stage is also about problem recognition. Are there any ambiguities or conflicts in how entities relate to each other? Are there any entities or relationships that could potentially violate the business rules? Addressing these issues early on can save a lot of trouble in the later stages of design.

During this stage, a tool like an Entity-Relationship (E-R) diagram can be helpful. It visually depicts your entities (usually represented by rectangles), attributes (depicted by ovals connected to their entities), and relationships (shown as diamonds connecting the entities).

By answering these three key questions during conceptual modeling, you lay a strong foundation for your database design. This high-level understanding of your business needs will guide the detailed design and implementation in the following stages of data modeling.
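The third question, spotting problems early, can even be partly automated once the model is written down. The sketch below records a draft model in plain Python and flags any relationship that mentions an entity nobody has declared yet. The `Relationship` class, the cardinality labels, and the deliberately incomplete draft are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relationship:
    name: str
    left: str
    right: str
    cardinality: str  # "1:1", "1:N", or "M:N"

# Question 1 and 2: entities and their attributes (a draft, intentionally incomplete).
entities = {
    "Customer": ["CustomerName", "Email", "Address"],
    "Order": [],
    "Product": [],
}

# Question 2: how the entities relate.
relationships = [
    Relationship("places", "Customer", "Order", "1:N"),
    Relationship("writes", "Customer", "Review", "1:N"),  # 'Review' is not declared above
]

def undeclared_entities(entities, relationships):
    """Question 3: find entities used in a relationship but never declared."""
    mentioned = {e for r in relationships for e in (r.left, r.right)}
    return sorted(mentioned - set(entities))

print(undeclared_entities(entities, relationships))  # ['Review']
```

A check like this will not catch every modeling mistake, but it shows how writing the model down explicitly makes gaps easier to notice.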


## Example: Conceptual Modeling
To begin the conceptual modeling stage for Wednesday Addams' online store, let's go back to the business rules we discussed earlier and the key entities, attributes, and relationships they suggest.

1.  Entities: From the business rules, we can identify several entities such as 'Customer', 'Product', 'Order', 'Category', and 'Review'.

2.  Attributes: Each of these entities will have specific attributes. For example, 'Customer' might have 'CustomerID', 'Name', 'Email', etc. 'Product' may include 'ProductID', 'ProductName', 'Price', etc.

3.  Relationships: The relationships between these entities can be inferred from the business rules. For instance, 'Customer' and 'Order' have a one-to-many relationship (a customer can place multiple orders, but each order belongs to one customer), whereas 'Order' and 'Product' share a many-to-many relationship (an order can include multiple products and a product can be part of multiple orders).
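Before drawing anything, it can help to jot the conceptual model down in a simple, tool-independent form. The sketch below does this in plain Python. The attribute lists are abbreviated, the cardinality labels are an assumed shorthand, and the 'Product'-'Category' relationship comes from Rule 2.

```python
# Entities and a few of their attributes (abbreviated).
entities = {
    "Customer": ["CustomerID", "Name", "Email"],
    "Product": ["ProductID", "ProductName", "Price"],
    "Order": ["OrderID"],
    "Category": ["CategoryID"],
    "Review": ["ReviewID"],
}

# Relationships as (entity, entity, cardinality).
relationships = [
    ("Customer", "Order", "1:N"),    # Rule 3
    ("Order", "Product", "M:N"),     # Rule 4
    ("Product", "Category", "M:N"),  # Rule 2
]

def many_to_many(relationships):
    """Return the pairs of entities joined by a many-to-many relationship."""
    return [(a, b) for a, b, cardinality in relationships if cardinality == "M:N"]

print(many_to_many(relationships))
```

Many-to-many relationships are worth singling out because, in a relational design, they are usually implemented with an extra linking table rather than a single foreign key.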

Now, let's create Entity-Relationship (E-R) diagrams using two different notations: Chen's notation and Crow's Foot notation, both widely used but with slight differences in their visual representation.

## Chen's Notation

In Chen's notation, entities are represented by rectangles, attributes by ovals, and relationships by diamonds. For our case, the 'Customer' and 'Order' entities and their relationship could be represented like this: