Creating a great data model is one of the most important tasks that a data analyst can perform in Microsoft Power BI. By doing this job well, we help make it easier for people to understand our data, which will make building valuable Power BI reports easier for them and for us.

A good data model offers the following benefits:

* Faster data exploration
* Simpler aggregation building
* More accurate reporting
* Quicker report writing
* Easier report maintenance

Providing set rules for what makes a good data model is difficult because all data is different, and the usage of that data varies. Generally, a smaller data model is better because it performs faster and is simpler to use. However, defining what a smaller data model entails is equally problematic because "smaller" is a heuristic, subjective concept.

Typically, a smaller data model consists of fewer tables and fewer visible columns in each table. If we import all necessary tables from a sales database and end up with 30 tables, the user won't find the model intuitive. Collapsing those tables into five will make the data model easier to understand. Likewise, a user who opens a table and finds 100 columns might be overwhelmed. Removing unneeded columns to leave a more manageable number increases the likelihood that the user will read all column names. To summarize, we should aim for simplicity when designing our data models.

The following image is an example data model. The boxes contain tables of data, where each line item within the box is a column. The lines that connect the boxes represent relationships between the tables. These relationships can be complex, even in such a simple model. The data model can easily become disorganized, and the total table count in the model can gradually increase. Keeping a data model simple, comprehensive, and accurate requires constant effort.

![image.png](attachment:image.png)

Relationships are defined between tables through primary and foreign keys. A primary key is a column, or a set of columns, whose values are unique and non-null, so that each value identifies exactly one row. For instance, if we have a Customers table, we could have an index that identifies each unique customer. The first row will have an ID of `1`, the second row an ID of `2`, and so on.

Because each row is assigned a unique value, the **primary key**, we can refer to the row by that value alone. This becomes important when we reference rows from a different table, which is what foreign keys do: a foreign key column stores primary key values from another table. A relationship between two tables is formed when the foreign key in one table matches the primary key in the other.
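To make the idea of keys concrete outside Power BI, here is a minimal Python sketch. The table contents, the column names, and the two helper functions are all hypothetical; they only illustrate what "unique, non-null" and "references a row in another table" mean.

```python
# Hypothetical sample rows; table and column names are illustrative only.
customers = [
    {"CustomerID": 1, "Name": "Avery"},
    {"CustomerID": 2, "Name": "Blake"},
    {"CustomerID": 3, "Name": "Casey"},
]
orders = [
    {"OrderID": 101, "CustomerID": 1},
    {"OrderID": 102, "CustomerID": 3},
    {"OrderID": 103, "CustomerID": 1},
]


def is_valid_primary_key(rows, column):
    """Return True if every value in `column` is non-null and unique."""
    values = [row.get(column) for row in rows]
    return None not in values and len(values) == len(set(values))


def orphaned_foreign_keys(child_rows, column, parent_rows, parent_column):
    """Return foreign key values that have no matching primary key."""
    parent_keys = {row[parent_column] for row in parent_rows}
    return sorted({row[column] for row in child_rows} - parent_keys)


# CustomerID identifies each customer exactly once, so it can be a primary key.
pk_ok = is_valid_primary_key(customers, "CustomerID")
# In Orders the same CustomerID repeats, so there it acts as a foreign key.
fk_is_pk = is_valid_primary_key(orders, "CustomerID")
# Every order should point at an existing customer.
orphans = orphaned_foreign_keys(orders, "CustomerID", customers, "CustomerID")

print(pk_ok, fk_is_pk, orphans)  # True False []
```

The same column name plays two roles here: it is the primary key of Customers and a foreign key in Orders, which is the pattern a relationship relies on.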

Power BI allows relationships to be built between tables from different data sources, a powerful feature that enables us to pull one table from Microsoft Excel and another from a relational database. We would then create the relationship between those two tables and treat them as a unified dataset.
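The idea of relating tables that come from different sources can be sketched in plain Python. This is only an analogy, not how Power BI works internally: a CSV string stands in for the spreadsheet, an in-memory SQLite database stands in for the relational source, and all table names, column names, and values are made up for illustration.

```python
import csv
import io
import sqlite3

# Stand-in for a spreadsheet export (hypothetical data).
csv_text = "ProductID,ProductName\n10,Bike\n20,Helmet\n"
products = {
    int(row["ProductID"]): row["ProductName"]
    for row in csv.DictReader(io.StringIO(csv_text))
}

# Stand-in for a relational database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales (SalesID INTEGER PRIMARY KEY, ProductID INTEGER, Quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [(1, 10, 2), (2, 20, 1), (3, 10, 5)],
)
sales_rows = conn.execute(
    "SELECT SalesID, ProductID, Quantity FROM Sales ORDER BY SalesID"
).fetchall()
conn.close()

# Relate the two sources on their shared key, ProductID.
combined = [
    {"SalesID": sales_id, "ProductName": products[product_id], "Quantity": quantity}
    for sales_id, product_id, quantity in sales_rows
]

for row in combined:
    print(row)
```

Once the shared key is in place, it no longer matters where each table originated; the rows can be read as one dataset.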

Now that we have learned about the relationships that make up the data schema, we can explore a specific type of schema design, the star schema, which is optimized for high performance and usability. We'll get to that next, but first, let's load some sample data provided by Microsoft.

We can design a star schema to simplify our data. It's not the only way to simplify our data, but it's a popular method; therefore, every Power BI data analyst should understand it. In a star schema, each table within our dataset is defined as a dimension or a fact table, as we see in the following visual.

![image.png](attachment:image.png)

**Fact tables** contain observational or event data values:

* sales orders
* product counts
* prices
* transactional dates and times
* quantities

Fact tables can contain several repeated values. For example, one product can appear multiple times in multiple rows, for different customers on different dates. These values can be aggregated to create visuals. For instance, a visual of the total sales orders is an aggregation of all sales orders in the fact table. With fact tables, it is common to see columns that are filled with numbers and dates. The numbers can be units of measurement, such as sale amount, or they can be keys, such as a customer ID. The dates represent time that is being recorded, like order date or shipped date.
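As a rough illustration of what aggregating a fact table means, the following Python sketch uses a small, hypothetical Sales table. The column names and values are invented; in Power BI the visual or a measure would perform the aggregation for us.

```python
# Hypothetical fact table: one row per sales order.
# Keys (ProductID, CustomerID) repeat; amounts and dates describe the event.
sales = [
    {"OrderID": 1, "ProductID": 10, "CustomerID": 1, "OrderDate": "2024-01-05", "SaleAmount": 20.0},
    {"OrderID": 2, "ProductID": 20, "CustomerID": 2, "OrderDate": "2024-01-06", "SaleAmount": 35.5},
    {"OrderID": 3, "ProductID": 10, "CustomerID": 3, "OrderDate": "2024-01-06", "SaleAmount": 20.0},
    {"OrderID": 4, "ProductID": 30, "CustomerID": 1, "OrderDate": "2024-01-09", "SaleAmount": 12.5},
]

# Aggregations collapse many event rows into a single value for a visual.
total_orders = len({row["OrderID"] for row in sales})
total_amount = sum(row["SaleAmount"] for row in sales)

print(total_orders, total_amount)  # 4 88.0
```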

**Dimension tables** contain the details about the data in fact tables: products, locations, employees, and order types. In contrast to fact tables, dimension tables contain unique values: for instance, one row for each product in the Products table and one row for each customer in the Customer table. These tables are connected to the fact table through key columns and are used to filter and group the data in fact tables. For the total sales orders visual, we could group the data so that we see total sales orders by product, where product is data from the dimension table.
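Grouping a fact table by a dimension attribute can be sketched as follows. The Products and Sales rows are hypothetical, and the lookup dictionary plays the part that a relationship plays in Power BI.

```python
from collections import Counter

# Hypothetical dimension table: exactly one row per product.
products = [
    {"ProductID": 10, "ProductName": "Bike"},
    {"ProductID": 20, "ProductName": "Helmet"},
]

# Hypothetical fact table: the same ProductID can appear on many orders.
sales = [
    {"OrderID": 1, "ProductID": 10},
    {"OrderID": 2, "ProductID": 10},
    {"OrderID": 3, "ProductID": 20},
    {"OrderID": 4, "ProductID": 10},
]

# The key column connects each fact row to its descriptive dimension row.
name_by_id = {row["ProductID"]: row["ProductName"] for row in products}

# Total sales orders grouped by product name.
orders_by_product = dict(Counter(name_by_id[row["ProductID"]] for row in sales))

print(orders_by_product)  # {'Bike': 3, 'Helmet': 1}
```

The fact table supplies the numbers to count, and the dimension table supplies the labels to group by.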

Fact tables are usually much larger than dimension tables because they record many events, such as individual sales. Dimension tables are typically smaller because the number of items that we can filter and group on is limited. For instance, a year contains only so many months, and the United States comprises only a certain number of states.

Considering this information about fact tables and dimension tables, we might wonder how we can build this visual in Power BI.

The pertinent data resides in two tables, **Employee** and **Sales**, as shown in the following data model.
* Because the **Sales** table contains the sales order values, which can be aggregated, it is considered a fact table. 
* The **Employee** table contains the specific employee name, which filters the sales orders, so it would be a dimension table. 

The common column between the two tables, which is the primary key in the **Employee** table, is `EmployeeID`, so we can establish a relationship between the two tables based on this column.
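The effect of this relationship can be approximated in Python. The employee names, amounts, and the `total_sales_for` helper are hypothetical; the sketch only shows how selecting a value in the dimension table narrows down the fact rows through `EmployeeID`.

```python
# Hypothetical dimension table: EmployeeID is the primary key.
employees = [
    {"EmployeeID": 1, "EmployeeName": "Dana"},
    {"EmployeeID": 2, "EmployeeName": "Eli"},
]

# Hypothetical fact table: EmployeeID is a foreign key and may repeat.
sales = [
    {"SalesOrderID": 1001, "EmployeeID": 1, "SaleAmount": 100.0},
    {"SalesOrderID": 1002, "EmployeeID": 2, "SaleAmount": 50.0},
    {"SalesOrderID": 1003, "EmployeeID": 1, "SaleAmount": 25.0},
]


def total_sales_for(employee_name):
    """Filter Sales through the EmployeeID relationship, then aggregate."""
    matching_ids = {
        row["EmployeeID"] for row in employees if row["EmployeeName"] == employee_name
    }
    return sum(row["SaleAmount"] for row in sales if row["EmployeeID"] in matching_ids)


dana_total = total_sales_for("Dana")
eli_total = total_sales_for("Eli")
print(dana_total, eli_total)  # 125.0 50.0
```

Without the shared `EmployeeID` column there would be no way to tell which sales belong to which employee name.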

![image.png](attachment:image.png)

After creating this relationship, we can build the visual according to the requirements, as shown in the following figure. If we hadn't established this relationship based on the column that the two tables have in common, we would have had more difficulty building our visual.

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

When users see fewer tables, they will enjoy using our data model considerably more. For example, suppose we've imported dozens of tables from many data sources and the model now appears disorganized. In this case, before we begin building reports, we need to ensure that our data model and table structure are simplified.

A simple table structure will have the following features:

* Simple to navigate because of column and table properties that are specific and user-friendly
* Merged or appended tables to simplify the tables within our data structure
* Good-quality relationships between tables (that make sense)

There are two ways to work with tables:

* Configuring the data model and building relationships between tables
* Configuring table and column properties

![image.png](attachment:image.png)

Assuming that we've already retrieved our data and cleaned it in Power Query, we can then go to the **Model** tab, where the data model is located. The following image shows how we can see the relationship between the **Order** and **Sales** tables through the `OrderDate` column.

![image.png](attachment:image.png)

To manage these relationships, go to **Manage Relationships** on the ribbon, where the following window will appear:

![image.png](attachment:image.png)

In this view, we can create, edit, and delete relationships between tables, and also autodetect relationships that already exist. When we load our data into Power BI, the **Autodetect** feature will help us establish relationships between columns that are named similarly. Relationships can be inactive or active. Only one active relationship can exist between two tables, which we will discuss in a future module.
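To build some intuition for name-based detection, here is a simplified Python sketch. It is not Power BI's actual Autodetect algorithm, which is not described here. The `candidate_relationships` function, the exact-name matching rule, the uniqueness check, and the sample tables are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical tables, stored as column name -> list of values.
tables = {
    "Employee": {"EmployeeID": [1, 2, 3], "EmployeeName": ["Dana", "Eli", "Fay"]},
    "Product": {"ProductID": [10, 20], "ProductName": ["Bike", "Helmet"]},
    "Sales": {
        "SalesID": [1, 2, 3, 4],
        "EmployeeID": [1, 1, 2, 3],
        "ProductID": [10, 20, 10, 10],
    },
}


def candidate_relationships(tables):
    """Suggest (table, table, column) triples for identically named columns.

    Assumption: a candidate is kept only if the column is unique on at
    least one side, so that side could serve as the primary key.
    """
    found = []
    for (left, left_cols), (right, right_cols) in combinations(sorted(tables.items()), 2):
        for column in sorted(set(left_cols) & set(right_cols)):
            left_unique = len(set(left_cols[column])) == len(left_cols[column])
            right_unique = len(set(right_cols[column])) == len(right_cols[column])
            if left_unique or right_unique:
                found.append((left, right, column))
    return found


relationships = candidate_relationships(tables)
for relationship in relationships:
    print(relationship)
```

Suggestions like these are only a starting point, so it is still worth reviewing each detected relationship before relying on it.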

While the **Manage Relationships** feature allows us to configure relationships between tables, we can also configure table and column properties to ensure organization in our table structure.

Let's inspect the relationship of the tables in the **Sales and Marketing sample report**.

![image.png](attachment:image.png)

![image.png](attachment:image.png)