# Basics 9: Keep Power BI dataset up-to-date

## Step 1: Data pipeline implementation

Our data pipeline implementation has to keep the Power BI data up-to-date.

### a) Inserting new data

If you only need to insert new data, it's pretty straightforward:


In [24]:
# Imports
%run ../../common/jupyter.ipynb
import os  # needed for os.environ below
import src_common_database as db
import src_common_powerbi as bi

# Define database connection
database = db.create_engine()

# Define Power BI dataset based on environment variables
dataset = bi.PowerBIDataset(
    api_headers=bi.get_api_headers(bi.get_app()),
    group_id=os.environ['POWERBI_GROUP_ID'],
    dataset_id=os.environ['POWERBI_DATASET_ID'],
    table_name_prefix="public "
)

# Define new sales rows (e.g. read from a CSV file)
new_sales = {
  "rows": [
    {
      "key": "00000000100.1-1",
      "date_key": "2021-05-26",
      "store_key": "Finland - Super Shop",
      "product_key": "1-1",
      "order_number": "00000000100",
      "quantity": 10,
      "price": 2129.0
    },
    {
      "key": "00000000101.1-2",
      "date_key": "2021-05-27",
      "store_key": "Finland - Super Shop",
      "product_key": "1-2",
      "order_number": "00000000101",
      "quantity": 2,
      "price": 2659.0
    }
  ]
}

# Insert new data
dataset.insert_table_data(table_name='fact_sales', data=new_sales)
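
The comment above mentions reading the new rows from a CSV file. A minimal sketch of that step, assuming a CSV whose column names match the example rows (the sample data and the `read_sales_rows` helper are hypothetical, not part of `src_common_powerbi`):

```python
import csv
import io

# Hypothetical CSV content; the columns mirror the example rows above.
SAMPLE_CSV = """key,date_key,store_key,product_key,order_number,quantity,price
00000000100.1-1,2021-05-26,Finland - Super Shop,1-1,00000000100,10,2129.0
00000000101.1-2,2021-05-27,Finland - Super Shop,1-2,00000000101,2,2659.0
"""


def read_sales_rows(csv_file):
    """Parse an open CSV file into a {"rows": [...]} payload."""
    rows = []
    for record in csv.DictReader(csv_file):
        row = dict(record)
        # The csv module yields strings only, so convert numeric columns explicitly.
        # Keys and order numbers stay strings, which preserves their leading zeros.
        row["quantity"] = int(row["quantity"])
        row["price"] = float(row["price"])
        rows.append(row)
    return {"rows": rows}


new_sales = read_sales_rows(io.StringIO(SAMPLE_CSV))
print(len(new_sales["rows"]), "rows parsed")
```

With a real file you would pass `open(path, newline="")` instead of the `io.StringIO` wrapper, and hand the result to `dataset.insert_table_data` as shown above.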

### b) Inserting new data and updating existing data

If you also need to update existing data, you need to refresh the whole Power BI table, because the Power BI REST API does not support updating existing rows. For example:

In [3]:
# Update data in database first, for example by reading data from a CSV file like in lesson 1.

# Then, instead of inserting, copy all data from the database table to the Power BI dataset table with delete_data=True
dataset.copy_table_data(table_name='fact_sales', order_by='key', database=database, delete_data=True)
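
The implementation of `copy_table_data` is not shown in this lesson. As a rough sketch of what a full reload might involve, the Power BI REST API offers a `DELETE` and a `POST` on the table's `rows` endpoint, and at the time of writing the push dataset documentation caps a single `POST` at 10,000 rows. The `plan_table_refresh` helper below is hypothetical. It only builds the list of requests and sends nothing:

```python
from urllib.parse import quote

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def plan_table_refresh(group_id, dataset_id, table_name, rows, batch_size=10000):
    """Return the (method, url, body) requests for a full table reload."""
    # Table names may contain spaces (e.g. a "public " prefix), so URL-encode them.
    url = (
        f"{API_ROOT}/groups/{group_id}/datasets/{dataset_id}"
        f"/tables/{quote(table_name)}/rows"
    )
    # First remove all existing rows, then push the rows back in batches.
    planned = [("DELETE", url, None)]
    for start in range(0, len(rows), batch_size):
        planned.append(("POST", url, {"rows": rows[start:start + batch_size]}))
    return planned


# Hypothetical identifiers and rows, with a tiny batch size to show the batching.
example_rows = [{"key": str(i)} for i in range(5)]
plan = plan_table_refresh(
    "my-group", "my-dataset", "public fact_sales", example_rows, batch_size=2
)
for method, url, body in plan:
    print(method, url, len(body["rows"]) if body else 0)
```

Because the table is empty between the delete and the last insert, reports may briefly show incomplete data during a reload.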


## Step 2: Update schema

If we add new columns to our database tables or make other changes to the database schema, we also need to update the Power BI schema. This can be done in the following way.

In [2]:
# Read schema from the database table and update Power BI dataset schema based on it
fact_sales_schema = bi.as_powerbi_table_schema(table_name='fact_sales', database=database)
dataset.update_table_schema(table_name='fact_sales', table_schema=fact_sales_schema)

# Copy all data from the database table to the Power BI dataset table
dataset.copy_table_data(table_name='fact_sales', order_by='key', database=database, delete_data=True)
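
How `as_powerbi_table_schema` derives the schema is not shown here. A minimal sketch of the idea, assuming PostgreSQL-style column type names and the data types documented for push datasets (`Int64`, `Double`, `Boolean`, `DateTime`, `String`); the type mapping, the column types and the `build_table_schema` helper are all assumptions for illustration:

```python
# Assumed mapping from database column types to push dataset data types.
POWERBI_TYPES = {
    "integer": "Int64",
    "bigint": "Int64",
    "numeric": "Double",
    "double precision": "Double",
    "boolean": "Boolean",
    "date": "DateTime",
    "timestamp without time zone": "DateTime",
}


def build_table_schema(table_name, columns):
    """Build a table definition from a {column name: database type} mapping."""
    return {
        "name": table_name,
        "columns": [
            # Unknown types (text, varchar, ...) fall back to String.
            {"name": name, "dataType": POWERBI_TYPES.get(db_type, "String")}
            for name, db_type in columns.items()
        ],
    }


# Hypothetical column types for the fact_sales table used in this lesson.
fact_sales_columns = {
    "key": "text",
    "date_key": "date",
    "store_key": "text",
    "product_key": "text",
    "order_number": "text",
    "quantity": "integer",
    "price": "numeric",
}
schema = build_table_schema("fact_sales", fact_sales_columns)
print(schema["columns"])
```

A definition of this shape is what the REST API expects in the body of a `PUT` on the table endpoint. After a schema change the table is reloaded in full, as the `copy_table_data` call above does.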


Unfortunately, you cannot add new tables or relationships to an existing Power BI push dataset using the Power BI REST API. If such a need arises, you need to create a new dataset and update your Power BI dashboard to use the new dataset instead of the old one.
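
When a new dataset is needed, its tables and relationships are declared together in the request body of the dataset creation call (`POST` on `/groups/{groupId}/datasets`). The sketch below only builds such a body. The `build_dataset_definition` helper, the dataset name and the `dim_store` table are hypothetical, and the shared helpers in `src_common_powerbi` may do this differently:

```python
def build_dataset_definition(name, tables, relationships):
    """Build a push dataset definition with tables and relationships."""
    return {
        "name": name,
        "defaultMode": "Push",
        "tables": tables,
        "relationships": [
            {
                "name": f"{from_table}.{from_column} to {to_table}.{to_column}",
                "fromTable": from_table,
                "fromColumn": from_column,
                "toTable": to_table,
                "toColumn": to_column,
            }
            for from_table, from_column, to_table, to_column in relationships
        ],
    }


# Hypothetical tables: a sales fact table linked to a store dimension.
tables = [
    {"name": "fact_sales", "columns": [
        {"name": "key", "dataType": "String"},
        {"name": "store_key", "dataType": "String"},
    ]},
    {"name": "dim_store", "columns": [
        {"name": "key", "dataType": "String"},
    ]},
]
payload = build_dataset_definition(
    "sales_v2", tables, [("fact_sales", "store_key", "dim_store", "key")]
)
print(payload["relationships"][0]["name"])
```

The new dataset gets a new ID, so `POWERBI_DATASET_ID` would need to point to it before the pipeline pushes data again.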

## Next lesson: [Basics 10 - Real implementation and project configuration](10.ipynb)