When we import data from multiple sources into Power BI Desktop, the data retains its predefined table and column names. We might want to change some of these names so that they are in a consistent format, easier to work with, and more meaningful to a user. We can use Power Query Editor in Power BI Desktop to make these name changes and simplify our data structure.

In this lesson, we will continue to clean and transform our data. We will learn how to do the following:

* Apply user-friendly value replacements
* Profile data so we can learn more about a specific column before using it
* Evaluate and transform column data types
* Apply user-friendly naming conventions to queries

To continue with the previous scenario where we shaped the initial data in our model, we need to take further action to simplify the structure of the sales data and get it ready for developing reports for the sales team. We have already renamed the columns, but now we need to examine the names of the queries (tables) to determine if we can make any improvements. We also need to review the contents of the columns and replace any values that require correction.

It's good practice to change uncommon or unhelpful query names to names that are more obvious or that the user is more familiar with. For instance, if we import a product fact table into Power BI Desktop, and the query name displays as `FactProductTable`, we might want to change it to a more user-friendly name, such as `Products`.

Similarly, if we import a view, the view might have a name that contains a prefix of `v`, such as `vProduct`. People might find this name unclear and confusing, so we might want to remove the prefix.

In this example, we have examined the name of the `TargetSales` query and realized that this name is unhelpful because we'll have a query with this name for every year. To avoid confusion, we want to add the year to the query name.

In Power Query Editor, in the **Queries** pane to the left of our data, we select the query that we want to rename. We **right-click** the query and select **Rename**. Then we edit the current name or type a new name, and press **Enter**.

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

We can also use the **Remove Duplicates** feature in Power Query to keep only the unique values in a selected column.

In this example, notice that the Category Name column contains duplicates for each category. We want to create a table with unique categories and use it in our data model. To do this, we select the column, **right-click** its header, and then select the **Remove Duplicates** option.

We might consider copying the table before removing the duplicates. The Copy option is at the top of the context menu, as shown in the following screenshot. Copying the table before removing duplicates will give us a comparison of the tables, and it will let us use both tables, if necessary.

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)
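To make the idea concrete outside of Power BI, here is a minimal Python sketch of what removing duplicates on one column amounts to. The sample rows and the `remove_duplicates` helper are hypothetical, and the sketch simply keeps the first row it sees for each value; it is an illustration of the logic, not Power Query's implementation.

```python
# Hypothetical sample rows; the real table's columns may differ.
rows = [
    {"Category Name": "Bikes", "Product": "Road-150"},
    {"Category Name": "Bikes", "Product": "Mountain-200"},
    {"Category Name": "Accessories", "Product": "Helmet"},
    {"Category Name": "Accessories", "Product": "Bottle"},
]


def remove_duplicates(table, column):
    """Keep the first row seen for each distinct value in `column`."""
    seen = set()
    unique_rows = []
    for row in table:
        key = row[column]
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Work on a copy so the original table stays available for comparison.
categories = remove_duplicates(list(rows), "Category Name")
print([row["Category Name"] for row in categories])
```

Because the function works on a copy, `rows` still holds all four records afterward, which mirrors the advice to copy the table before removing duplicates.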

We can use the **Replace Values** feature in Power Query Editor to replace any value with another value in a selected column.

In this example, we notice that, in the `Attribute` column, the month **December** is misspelled. We need to correct this spelling mistake. For that, we select the column that contains the value that we want to replace (`Attribute` in this case), and then we select **Replace Values** on the **Transform** tab. Note that **Replace Values** is located on the right side of the ribbon in the Dataquest Power BI interface. Sometimes, only the **Replace Values** icon will appear, depending on the screen size.

![image.png](attachment:image.png)

In the **Value to Find** box, we enter the value that we want to replace. In the **Replace With** box, we enter the correct value, and then we select **OK**. In Power Query, we can't select a single cell and change its value, as we might in Excel; the replacement applies to every matching value in the selected column.

![image.png](attachment:image.png)
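The column-wide nature of this replacement can be sketched in a few lines of Python. The `replace_values` helper and the misspelling `Decmber` are assumptions made for illustration; the lesson doesn't state how the month was actually misspelled.

```python
def replace_values(table, column, value_to_find, replace_with):
    """Return a new table with every exact match in `column` replaced."""
    return [
        {**row, column: replace_with} if row[column] == value_to_find else row
        for row in table
    ]


# Hypothetical data: "Decmber" stands in for the misspelled month.
rows = [
    {"Attribute": "November", "Value": 120},
    {"Attribute": "Decmber", "Value": 95},
    {"Attribute": "Decmber", "Value": 130},
]

fixed = replace_values(rows, "Attribute", "Decmber", "December")
print([row["Attribute"] for row in fixed])
```

Every matching row is corrected in one pass, and the source rows are left untouched, much like a query step that transforms the data without editing the source.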

We can review the list of steps that we took to restructure and correct our data in the **Query Settings** pane. When we have completed all steps that we want to take, we can select **Close & Apply** to close Power Query Editor and apply our changes to our data model. However, we can take further action to clean and transform our data.

Occasionally, we might find that our data sources contain **null values**. For example, a freight amount on a sales order might be null when it really means zero. If the value stays null, averages will not calculate as intended, because null rows are left out of the calculation instead of being counted as zero. One solution is to change the nulls to zero, which will produce a more accurate freight average. In this instance, we can follow the same **Replace Values** steps as before to replace the null values with zero.
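A small Python sketch shows why this matters. It assumes an averaging rule that skips nulls, and the freight amounts are made up for illustration.

```python
# Hypothetical freight amounts; None plays the role of null.
freight = [10.0, None, 20.0, None]


def average_ignoring_nulls(values):
    """Average only the values that are present."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


def average_with_nulls_as_zero(values):
    """Replace nulls with zero first, then average over every row."""
    filled = [0.0 if v is None else v for v in values]
    return sum(filled) / len(filled)


print(average_ignoring_nulls(freight))      # 15.0
print(average_with_nulls_as_zero(freight))  # 7.5
```

If two of the four orders truly had no freight charge, 7.5 is the figure that reflects all four orders, while 15.0 describes only the orders that had a recorded amount.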

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

Naming conventions for tables, columns, and values have no fixed rules. However, we recommend using the language and abbreviations that are common within our organization and that everyone agrees on.

A best practice is to give our tables, columns, and measures descriptive business terms and replace underscores (`_`) with spaces.

Be consistent with abbreviations, prefixes, and words like `number` and `ID`. Excessively short abbreviations can cause confusion if they aren't commonly used within the organization.

Also, by removing prefixes or suffixes that we might use in table names and instead naming them in a simple format, we will help avoid confusion.
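When many names need the same cleanup, it can help to write the convention down as a rule. The following Python sketch is one possible rule, not a Power BI feature: the `friendly_name` helper and its default prefix and suffix lists are hypothetical and would need to match our own organization's naming habits.

```python
def friendly_name(raw_name, prefixes=("Fact", "v"), suffixes=("Table",)):
    """Strip a known prefix and suffix, then replace underscores with spaces."""
    name = raw_name
    for prefix in prefixes:
        # Only strip a prefix when an uppercase letter follows it,
        # so a name such as "vendors" is left alone.
        rest = name[len(prefix):]
        if name.startswith(prefix) and rest[:1].isupper():
            name = rest
            break
    for suffix in suffixes:
        if name.endswith(suffix) and len(name) > len(suffix):
            name = name[: -len(suffix)]
            break
    return name.replace("_", " ").strip()


for raw in ("FactProductTable", "vProduct", "order_qty", "vendors"):
    print(raw, "->", friendly_name(raw))
```

A rule like this only produces a starting point. Choosing a plural such as `Products`, or adding a year to a query name, is still a judgment call for the report author.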

When replacing values, try to imagine how those values will appear on the report.

* Values that are too long might be difficult to read and fit on a visual.
* Values that are too short might be difficult to interpret.
* Avoiding acronyms in values is also a good idea, provided that the text will fit on the visual.

When we import a table from any data source, Power BI Desktop automatically scans the first 1,000 rows (the default setting) and tries to detect the type of data in each column. In some situations, Power BI Desktop doesn't detect the correct data type. When incorrect data types occur, we can experience performance issues.
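The following Python sketch illustrates why detection based on a sample can go wrong. It is a simplified stand-in, not Power BI's actual detection algorithm: the `detect_type` helper, the type labels it returns, and the assumed `YYYY-MM-DD` date format are all assumptions made for this example.

```python
from datetime import datetime


def detect_type(values, sample_size=1000):
    """Guess a column type from the first `sample_size` text values only."""
    sample = values[:sample_size]

    def all_parse(parser):
        for value in sample:
            try:
                parser(value)
            except ValueError:
                return False
        return True

    if all_parse(int):
        return "Whole Number"
    if all_parse(float):
        return "Decimal Number"
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "Date"
    return "Text"


# 1,000 clean values followed by one manually entered placeholder.
column = [str(n) for n in range(1000)] + ["N/A"]

print(detect_type(column))                    # Whole Number
print(detect_type(column, sample_size=1001))  # Text
```

The first call never sees row 1,001, so it reports a numeric type for a column that contains text. This is the kind of mismatch worth checking for before loading the data.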

We have a higher chance of getting data type errors when we're dealing with flat files, such as comma-separated values (`.csv`) files and Excel workbooks (`.xlsx`), because the data was often entered manually into the worksheets and mistakes were made. Conversely, in databases, the data types are predefined when tables or views are created.

A best practice is to evaluate the column data types in Power Query Editor before we load the data into a Power BI data model. If we determine that a data type is incorrect, we can change it. We might also want to apply a format to the values in a column and change the summarization default for a column.

To continue with the scenario wherein we are cleaning and transforming sales data in preparation for reporting, we now need to evaluate the columns to ensure that they have the correct data type. We need to correct any errors that we identify.

We evaluate the `OrderDate` column. As expected, it contains numeric data, but Power BI Desktop has incorrectly set the column data type to Text. To report on this column, we need to change the data type of this column from `Text` to `Date`.

![image.png](attachment:image.png)

We can change the data type of a column in two places:

* In Power Query Editor
* In the Power BI Desktop Report view, by using the column tools

It's best to change the data type in the Power Query Editor before we load the data.

We can change the column data type in Power Query Editor in two ways.

* One way is to **select the column** that has the issue, select **Data Type** in the **Transform** tab, and then **select the correct data type** from the list.

![image.png](attachment:image.png)

* Another method is to **select the data type icon** next to the column header and then **select the correct data type** from the list.

![image.png](attachment:image.png)

As with any other changes that we make in Power Query Editor, the change that we make to the column data type is saved as a programmed step. This step is called **Changed Type**, and it will be reapplied every time the data is refreshed.
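One way to picture saved steps is as an ordered list of functions that is replayed on every refresh. The Python sketch below is only a mental model under stated assumptions: the step functions, the `APPLIED_STEPS` list, the `Freight` column, and the ISO date format are all hypothetical, and Power Query stores its steps differently.

```python
from datetime import date


def replaced_value(rows):
    """Replace null freight amounts with zero."""
    return [
        {**row, "Freight": 0.0 if row["Freight"] is None else row["Freight"]}
        for row in rows
    ]


def changed_type(rows):
    """Convert OrderDate from text to a real date (ISO format assumed)."""
    return [
        {**row, "OrderDate": date.fromisoformat(row["OrderDate"])}
        for row in rows
    ]


# The recorded steps, in the order they were created.
APPLIED_STEPS = [("Replaced Value", replaced_value), ("Changed Type", changed_type)]


def refresh(source_rows):
    """Replay every recorded step against freshly loaded source rows."""
    rows = source_rows
    for _step_name, step in APPLIED_STEPS:
        rows = step(rows)
    return rows


first_load = refresh([{"OrderDate": "2019-12-31", "Freight": None}])
later_load = refresh([{"OrderDate": "2020-01-15", "Freight": 4.5}])
print(first_load)
print(later_load)
```

New source rows go through the same steps without any manual rework, which is why it pays to get each step right once in Power Query Editor.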

After we have completed all steps to clean and transform our data, we select **Close & Apply** to close Power Query Editor and apply the changes to the data model. At this stage, our data should be in great shape for analysis and reporting.

For more information, see [Data types in Power BI Desktop](https://docs.microsoft.com/en-us/power-bi/connect-data/desktop-data-types/).

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

The following information provides insight into problems that can arise when Power BI doesn't detect the correct data type.

Incorrect data types will prevent us from creating certain calculations or creating proper relationships with other tables. For example, if we try to calculate the `Quantity of Orders YTD` measure, we'll get the following error stating that the `OrderDate` column data type isn't `Date`, which is required in time-based calculations.

`Quantity of Orders YTD = TOTALYTD(SUM('Sales'[OrderQty]), 'Sales'[OrderDate])`

![image.png](attachment:image.png)
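The same dependency on real dates shows up in any language. The Python sketch below computes a simplified year-to-date quantity; it assumes a calendar year and ignores everything else a DAX measure takes into account, so it is an analogy for the measure above rather than a reimplementation of `TOTALYTD`. The sample rows are hypothetical.

```python
from datetime import date

# Hypothetical sales rows with OrderDate stored as real dates.
sales = [
    {"OrderDate": date(2019, 11, 30), "OrderQty": 5},
    {"OrderDate": date(2020, 1, 10), "OrderQty": 3},
    {"OrderDate": date(2020, 2, 20), "OrderQty": 4},
    {"OrderDate": date(2020, 3, 5), "OrderQty": 2},
]


def quantity_ytd(rows, as_of):
    """Sum OrderQty from January 1 of as_of's year through as_of, inclusive."""
    start = date(as_of.year, 1, 1)
    return sum(r["OrderQty"] for r in rows if start <= r["OrderDate"] <= as_of)


print(quantity_ytd(sales, date(2020, 2, 29)))  # 7

# With OrderDate left as text, the date comparison fails.
try:
    quantity_ytd([{"OrderDate": "2020-01-10", "OrderQty": 3}], date(2020, 2, 29))
except TypeError as error:
    print("Text dates can't be compared with real dates:", error)
```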

Another issue with having an incorrect data type applied on a date field is the inability to create a date hierarchy, which would allow us to analyze our data on a yearly, monthly, or weekly basis. The following screenshot shows that the `SalesDate` field isn't recognized as type `Date` and will only be presented as a list of dates in the **Table** visual. However, it's a best practice to use a date table and turn off the **Auto date/time** option to get rid of the auto-generated hierarchy.

For more information about this process, see the [**Auto date/time** documentation](https://docs.microsoft.com/en-us/power-bi/guidance/auto-date-time/).
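To see why a hierarchy needs real dates, consider this Python sketch, which derives year, quarter, and month levels from each date and totals the quantities per level. The sample orders and the `hierarchy_key` helper are hypothetical, and the choice of levels is an assumption made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (order date, quantity) pairs.
orders = [
    (date(2019, 12, 31), 5),
    (date(2020, 1, 10), 3),
    (date(2020, 1, 25), 4),
    (date(2020, 4, 2), 2),
]


def hierarchy_key(day):
    """Return the (year, quarter, month number) levels for a date."""
    quarter = (day.month - 1) // 3 + 1
    return (day.year, f"Q{quarter}", day.month)


totals = defaultdict(int)
for day, qty in orders:
    totals[hierarchy_key(day)] += qty

for key in sorted(totals):
    print(key, totals[key])
```

None of these levels can be derived while the column is stored as text, which is why the field appears only as a flat list of values in that case.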

This lesson explained how we can take data that is difficult to read, build calculations on, and explore, and make it simpler for report authors and others to use.

Additionally, we learned how to replace values and change data types.