Let's say our job requires us to build Microsoft Power BI reports. The data is in several different databases and files. These data repositories are different from each other; some are in Comma Separated Value (CSV) files, some are in Microsoft Excel, but all the data is related.

**Before we can create reports, we must first extract data from the various data sources**. Each data source is accessed differently, so we need to learn the particulars of each system. Once we have, we can use **Power Query** (the query engine used by Power BI and Excel) to clean the data: renaming columns, replacing values, removing errors, and combining query results.

After the data has been cleaned and organized, we are ready to build reports in Power BI. Finally, we publish our combined dataset and reports to **Power BI service (PBIS)**.

From there, other people can use our dataset to build their own reports, or they can use the reports that we've already built. Likewise, if someone else has built a dataset that we'd like to use, we can build reports from that, too!

This lesson will focus on the first step: getting the data from the most common data sources (Excel and CSV files) and importing it into Power BI using **Power Query Editor**.

By the end of this lesson, we’ll be able to do the following:

* Identify and connect to a data source
* Retrieve data from Microsoft Excel and CSV files
* Load data into the **Power Query Editor**
* Change the path to a file

Organizations often export and store data in files. One possible file format is a **flat file**.

A flat file contains a single data table in which every row has the same structure. The file does not contain hierarchies. We may be familiar with the most common types of flat files:

* Comma-separated values (`.csv`) files
* Delimited text (`.txt`) files
* Fixed-width files
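To make the three flat file types concrete, here is a minimal Python sketch that parses the same small table from comma-separated, tab-delimited, and fixed-width text. The employee names, the column widths, and the helper functions `read_delimited` and `read_fixed_width` are all invented for illustration; this is not how Power BI reads these files internally.

```python
import csv
import io

# Hypothetical records, invented for this illustration only.
RECORDS = [("Ada", "Analyst"), ("Grace", "Engineer")]

# The same table serialized three ways.
CSV_TEXT = "name,position\n" + "".join(f"{n},{p}\n" for n, p in RECORDS)
TAB_TEXT = "name\tposition\n" + "".join(f"{n}\t{p}\n" for n, p in RECORDS)
# Fixed-width: each field is padded to 10 characters and there is no header row.
FIXED_TEXT = "".join(n.ljust(10) + p.ljust(10) + "\n" for n, p in RECORDS)


def read_delimited(text, delimiter):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text), delimiter=delimiter)]


def read_fixed_width(text, fields):
    """Parse fixed-width text; fields maps a column name to (start, end) offsets."""
    rows = []
    for line in text.splitlines():
        rows.append({name: line[start:end].strip()
                     for name, (start, end) in fields.items()})
    return rows


csv_rows = read_delimited(CSV_TEXT, ",")
tab_rows = read_delimited(TAB_TEXT, "\t")
fixed_rows = read_fixed_width(FIXED_TEXT, {"name": (0, 10), "position": (10, 20)})

# All three formats describe the same single table with one row structure.
print(csv_rows == tab_rows == fixed_rows)
```

In each case the result is one table whose rows all share the same columns, which is what makes these files "flat".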

Other files are the output of specific applications, such as Microsoft Excel workbooks (`.xlsx`).

![image.png](attachment:image.png)

Power BI Desktop allows us to retrieve data from many types of files. We can find a list of the available options when we use the **Get data** feature in Power BI Desktop. The following sections explain how we can import data from an Excel file that is stored on a local computer.

**Scenario**

The Human Resources (HR) team at Tailwind Traders has prepared a flat file that contains some of our organization's employee data, such as employee name, hire date, position, and manager. They've requested that we build Power BI reports using this data and data that is located in several other data sources.

**Flat file location**: The first step is to determine which file location we want to use to export and store our data.

Our Excel files might exist in one of the following locations:

* **Local** - We can import data from a local file into Power BI. The file isn't moved into Power BI, and no link to it remains. Instead, a new dataset is created in Power BI, and data from the Excel file is loaded into it. Accordingly, changes to the original Excel file are not reflected in our Power BI dataset. We can use local data import for data that doesn't change.

* **OneDrive for Business** - We can pull data from OneDrive for Business into Power BI. This method is effective in keeping an Excel file and our dataset, reports, and dashboards in Power BI synchronized. Power BI connects regularly to our file on OneDrive. If any changes appear, our dataset, reports, and dashboards automatically update in Power BI.

* **OneDrive - Personal** - We can use data from files on a personal OneDrive account and get many of the same benefits that we would with OneDrive for Business. However, we'll need to sign in with our personal OneDrive account and select the **Keep me signed in** option. We should check with our system administrator to determine whether this type of connection is allowed in our organization.

* **SharePoint - Team Sites** - Saving our Power BI Desktop files to SharePoint Team Sites is similar to saving to OneDrive for Business. The main difference is how we connect to the file from Power BI. We can specify a URL or connect to the root folder.

![image.png](attachment:image.png)

Using a cloud option such as OneDrive or SharePoint Team Sites is the most effective way to keep our file and our dataset, reports, and dashboards in Power BI in sync. However, if our data does not change regularly, saving files on a local computer is a suitable option.

![image.png](attachment:image.png)

In Power BI, to connect to data in a file, we can follow these steps.

1. On the **Home** tab, click on **Get data**.
2. In the list that displays, select the type of file (for example, `Text/CSV`, `XML`, or `Excel`).
3. Select the data file to import.

**Note**: The **Home** tab contains quick access data source options (such as Excel) next to the **Get data** button.

For this example, we will select **Excel**.

![image.png](attachment:image.png)

Depending on the selection, we need to find and open our data source. We might be prompted to sign in to a service, such as OneDrive, to authenticate our request. In this example, we will open the **Employee Data Excel workbook** that is stored on the Desktop.

![image.png](attachment:image.png)

After the file has connected to Power BI Desktop, the Navigator window opens. This window shows us the data that is available in our data source (the Excel file in this example).

We can select a table or entity to preview its contents, to ensure that the correct data is loaded into the Power BI model.

We select the check box(es) of the table(s) that we want to bring in to Power BI. This selection activates the **Load** and **Transform Data** buttons, as shown in the following image.

![image.png](attachment:image.png)

We now have the option to select the **Load** button to automatically load our data into the *Power BI* model or select the **Transform Data** button to launch the Power Query Editor, where we can review and clean our data before loading it into the Power BI model.

We often recommend transforming data, but we'll discuss that process later in this file. For this example, we can select **Load**.

To explore the data, select **Data** on the left side of the Power BI Desktop window.

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

We might have to change the location of a source file for a data source during development, or if a file storage location changes. To keep our reports up to date, we'll need to update our file connection paths in Power BI.

**Power Query** provides several ways for us to make this type of change:

* Data source settings
* Query settings
* Advanced editor

**Warning**: If we're changing a file path, make sure that we reconnect to the same file with the same file structure. Any structural changes to a file (such as deleting or renaming columns in the source file) will break the reporting model.
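One way to reduce the risk described in the warning is to compare the column headers of the old and new files before changing the path. The following Python sketch does this for CSV data. The function names `read_header` and `compare_structure` and the sample headers are hypothetical, and the check runs outside Power BI; it is a sanity check we could do ourselves, not a Power BI feature.

```python
import csv
import io


def read_header(csv_file):
    """Return the first row of a CSV file-like object as a list of column names."""
    return next(csv.reader(csv_file), [])


def compare_structure(old_header, new_header):
    """Report columns that are missing from, or new in, the replacement file."""
    return {
        "removed": [c for c in old_header if c not in new_header],
        "added": [c for c in new_header if c not in old_header],
        "identical": old_header == new_header,
    }


# Hypothetical headers: the replacement file renamed "HireDate" to "StartDate".
old_file = io.StringIO("EmployeeName,HireDate,Position,Manager\n")
new_file = io.StringIO("EmployeeName,StartDate,Position,Manager\n")

report = compare_structure(read_header(old_file), read_header(new_file))
print(report)
```

Here the report lists `HireDate` as removed and `StartDate` as added, which signals that steps referring to `HireDate` would likely fail after the path change.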

For example, let's try changing the data source file path in the data source settings:

* We click on the **Transform Data** icon to open **Power Query Editor**.
* We select **Data source settings** in the **Home** tab.
* In the **Data source settings** window, we select a file, and then select **Change Source**.
* We now update the file path or use the **Browse** option to locate the file, select **OK**, and then select **Close**.

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

After we've connected the file to Power BI Desktop, the Navigator window displays the data available in our data source. We can select a sheet to preview its contents and make sure that the correct data will be loaded into the Power BI model.

After selecting the sheets that we want to bring in to **Power BI Desktop**, we then have to select either the **Load** or **Transform Data** option. What's the difference between these options?

* **Load** - Automatically load our data into a Power BI model in its current state
* **Transform Data** - Open our data in **Power Query Editor**, where we can perform actions such as deleting unnecessary rows or columns, grouping our data, removing errors, and many other data quality tasks

![image.png](attachment:image.png)

We generally recommend selecting **Load** if we only plan to explore the data. When we know that data manipulation will be required, we recommend selecting the **Transform Data** option.
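To give a feel for the kind of work that **Transform Data** is meant for, here is a small Python sketch of three typical clean-up actions: removing rows with errors, dropping an unnecessary column, and renaming columns. The rows, the column names, and the `clean` function are invented for illustration. In Power BI these actions are performed in Power Query Editor, not in Python.

```python
from datetime import date

# Hypothetical raw rows, as they might arrive from a source file.
raw_rows = [
    {"emp_name": "Ada", "hire_date": "2020-01-15", "notes": "n/a"},
    {"emp_name": "Grace", "hire_date": "not a date", "notes": ""},
    {"emp_name": "Linus", "hire_date": "2021-07-01", "notes": ""},
]


def clean(rows):
    """Drop rows whose hire date cannot be parsed, drop 'notes', rename columns."""
    cleaned = []
    for row in rows:
        try:
            hired = date.fromisoformat(row["hire_date"])
        except ValueError:
            continue  # remove rows with errors
        # Keep only the needed columns, under friendlier names.
        cleaned.append({"Employee Name": row["emp_name"], "Hire Date": hired})
    return cleaned


clean_rows = clean(raw_rows)
for row in clean_rows:
    print(row)
```

Loading `raw_rows` as-is corresponds to **Load**; running them through a step like `clean` first corresponds to **Transform Data**.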

![image.png](attachment:image.png)

To connect to a CSV file, we can use the **Get data** feature in Power BI Desktop and select the applicable option for CSV. This process is very similar to importing an Excel file. Here are the steps.

1. On the **Home** tab, click on **Get data**.
2. In the list that displays, select **Text/CSV**.
3. Select the data file to import.
4. Click on **Load** or **Transform Data**.

Let's use this to connect the `Finance Sample To Clean.csv` file to Power BI.
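Before importing a CSV file, it can help to peek at its header and first few rows, much like the preview that Power BI shows before we load. The Python sketch below does this with the standard library. The `preview_csv` function, the file name `sample.csv`, and its columns and values are all assumptions made for the demo; they are not the contents of the lesson's finance file.

```python
import csv
import itertools
import os
import tempfile


def preview_csv(path, max_rows=5):
    """Return the header and up to max_rows data rows from a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        rows = list(itertools.islice(reader, max_rows))
    return header, rows


# Build a small stand-in file; the columns and values are invented for the demo.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "sample.csv")
    with open(sample_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Segment", "Country", "Units Sold"])
        writer.writerows([
            ["Retail", "Canada", "120"],
            ["Online", "Mexico", "75"],
            ["Retail", "France", "98"],
        ])
    header, rows = preview_csv(sample_path, max_rows=2)

print(header)
print(rows)
```

Note that every value comes back as text, so numeric columns such as the hypothetical `Units Sold` still need a data type assigned after import.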

![image.png](attachment:image.png)

![image.png](attachment:image.png)

In this file, we learned about pulling data from different file-based data sources into Power BI. Each data source has to be treated differently. For instance, Microsoft Excel data should be pulled in from an Excel table.

It's important to select the correct storage mode for our data. Do we need visuals that respond quickly, and are we willing to refresh the dataset when the underlying data source changes? If so, select Import to import data into Power BI. If we prefer to see updates to data as soon as they happen, at the cost of interactivity performance, then choose DirectQuery for our data instead.
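The trade-off between the two storage modes can be pictured as the difference between keeping a copy of the data and keeping a reference to it. The Python sketch below is only an analogy under that assumption. The classes `ImportedTable` and `LiveTable` are hypothetical and do not represent how Power BI implements Import or DirectQuery.

```python
class ImportedTable:
    """Copies the rows once; later source changes need an explicit refresh."""

    def __init__(self, source):
        self._source = source
        self._rows = list(source)  # snapshot taken at import time

    def refresh(self):
        self._rows = list(self._source)

    def row_count(self):
        return len(self._rows)  # fast: answered from the local copy


class LiveTable:
    """Holds only a reference; every read goes back to the source."""

    def __init__(self, source):
        self._source = source

    def row_count(self):
        return len(self._source)  # always current, but depends on the source


source_rows = [{"id": 1}, {"id": 2}]
imported = ImportedTable(source_rows)
live = LiveTable(source_rows)

source_rows.append({"id": 3})  # the underlying data changes

count_before_refresh = imported.row_count()
live_count = live.row_count()
imported.refresh()
count_after_refresh = imported.row_count()

print(count_before_refresh, live_count, count_after_refresh)
```

The imported copy only sees the new row after a refresh, while the live reference sees it immediately.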

In this lesson, we learned how to identify and connect to a data source, and how to pull data from Excel and CSV files into Power BI. We learned that **Load** automatically loads data into a Power BI model in its current state, whereas **Transform Data** opens data in **Power Query Editor**. Opening data in **Power Query Editor** is preferred when we need to perform actions such as deleting unnecessary rows or columns, grouping our data, removing errors, and many other data quality tasks. We also learned how to change the path to a file.