# **Hands-on Lab: Setting up a Staging Area**

https://www.coursera.org/learn/getting-started-with-data-warehousing-and-bi-analytics/ungradedLti/ixPVN/hands-on-lab-setting-up-a-staging-area


## **Purpose of the Lab:**

This lab gives you practical experience setting up and managing a staging server for a data warehouse, using PostgreSQL. You will create a database schema, load data into its tables, and run a sample query against the data. Together these steps cover the core work of preparing a data warehouse environment.

## **Benefits of Learning the Lab:**

The key benefit of this lab is hands-on experience with data warehousing concepts and practices. Working in the Skills Network Cloud IDE, an interactive environment, you practice database management skills that are highly valued in real-world business scenarios. The lab also exposes you to the operational side of a data warehouse, which prepares you for data management and analysis work in professional settings.

## **Objectives**

In this lab you will:

- Set up a staging server for a data warehouse
- Create the schema to store the data
- Load the data into the tables
- Run a sample query

# **Exercise 1 - Start the PostgreSQL server**

We will be using the PostgreSQL server as our staging server.

Start the PostgreSQL server.

Open a new terminal by clicking the menu bar and selecting **Terminal** -> **New Terminal**.

This will open a new terminal at the bottom of the screen.

Run the commands below in the newly opened terminal.

Start the PostgreSQL server
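Once the server has been started, it can help to confirm that something is actually listening before moving on. The sketch below is a minimal check, not part of the original lab: it assumes the server listens on `localhost` port `5432` (the host and port passed to the later commands in this lab), and the helper name `is_port_open` is hypothetical. It only tests that a TCP connection succeeds, not that the listener is PostgreSQL.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: the staging server listens on localhost:5432,
    # matching the -h and -p options used later in this lab.
    state = "accepting" if is_port_open("localhost", 5432) else "not accepting"
    print(f"localhost:5432 is {state} TCP connections")
```

PostgreSQL also ships a `pg_isready` utility that performs a protocol-level check, which is a stronger test when it is available on the machine.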

# **Exercise 2 - Create Database**

Create the database on the data warehouse.

Using PostgreSQL's `createdb` command-line utility, we can create the database directly from the terminal.

If a database named `billingDW` already exists from a previous run, drop it first.

In [3]:
#dropdb -h localhost -U postgres -p 5432 billingDW

Then run the command below to create a database named `billingDW`.

In [1]:
#createdb -h localhost -U postgres -p 5432 billingDW
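The drop-then-create sequence can be made safe to re-run. The sketch below only builds and prints the two command lines; it does not execute them. It assumes the same connection settings as the commands above, the helper name `db_commands` is hypothetical, and it adds `dropdb`'s `--if-exists` option so that the drop step does not fail when the database is absent.

```python
import shlex


def db_commands(dbname, host="localhost", user="postgres", port=5432):
    """Build the dropdb/createdb argument lists for a fresh database."""
    conn = ["-h", host, "-U", user, "-p", str(port)]
    return [
        # --if-exists: do not raise an error if the database does not exist.
        ["dropdb", "--if-exists", *conn, dbname],
        ["createdb", *conn, dbname],
    ]


# Print the commands so they can be reviewed before being run in a terminal.
for cmd in db_commands("billingDW"):
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the printed commands look right.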

# **Exercise 3 - Create data warehouse schema**

Step 1: Download the schema files

The commands to create the schema are available in the file below.

https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DB0260EN-SkillsNetwork/labs/Setting%20up%20a%20staging%20area/billing-datawarehouse.tgz

Run the commands below to download and extract the schema files.

In [4]:
#curl -O https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DB0260EN-SkillsNetwork/labs/Setting%20up%20a%20staging%20area/billing-datawarehouse.tgz

Extract the schema files.

In [5]:
#tar -xvzf billing-datawarehouse.tgz

You should see 5 .sql files listed in the output.

In [7]:
!ls *.sql

DimCustomer.sql DimMonth.sql    FactBilling.sql star-schema.sql verify.sql
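The archive contents can also be inspected without extracting anything. The following is a small sketch using Python's standard `tarfile` module; the function name `list_sql_members` is hypothetical, and it assumes the archive has already been downloaded to the current directory under the name used above.

```python
import os
import tarfile


def list_sql_members(archive_path):
    """Return the sorted names of .sql files inside a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".sql")
        )


if __name__ == "__main__":
    archive = "billing-datawarehouse.tgz"
    # Only inspect the archive if the download step has already been run.
    if os.path.exists(archive):
        print(list_sql_members(archive))
```

Member names are reported exactly as stored in the archive, so they may carry a directory prefix.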


Step 2: Create the schema

Run the command below to create the schema in the `billingDW` database.

In [8]:
#psql  -h localhost -U postgres -p 5432 billingDW < star-schema.sql

# **Exercise 4 - Load data into Dimension tables**

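When we load data into the tables, it is good practice to load the dimension tables first. Rows in the fact table refer to dimension rows by key, so the dimension rows should exist before the fact rows that point to them.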
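The load order matters most when the fact table declares foreign keys to the dimension tables. The sketch below illustrates this with an in-memory SQLite database as a stand-in for PostgreSQL. The two tables and their columns are simplified, hypothetical versions of a customer dimension and a billing fact table; the actual definitions live in `star-schema.sql`, and whether that file declares foreign keys is an assumption here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical, simplified tables (not the lab's actual schema).
conn.execute("CREATE TABLE DimCustomer (customerid INTEGER PRIMARY KEY, category TEXT)")
conn.execute(
    "CREATE TABLE FactBilling ("
    " billid INTEGER PRIMARY KEY,"
    " customerid INTEGER NOT NULL REFERENCES DimCustomer(customerid),"
    " billedamount INTEGER)"
)


def try_insert_fact():
    """Try to insert one fact row that points at customer 100."""
    try:
        conn.execute("INSERT INTO FactBilling VALUES (1, 100, 250)")
        return "inserted"
    except sqlite3.IntegrityError:
        return "rejected"


before = try_insert_fact()  # customer 100 does not exist yet
conn.execute("INSERT INTO DimCustomer VALUES (100, 'Individual')")
after = try_insert_fact()   # the dimension row is now present
print(before, after)        # prints: rejected inserted
```

The same fact row is rejected before the dimension row exists and accepted afterwards, which is why the dimension files are loaded ahead of the fact file.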

Step 1: Load data into the DimCustomer table

Run the command below to load the data into the `DimCustomer` table in the `billingDW` database.

In [9]:
#psql  -h localhost -U postgres -p 5432 billingDW < DimCustomer.sql

Step 2: Load data into the DimMonth table

Run the command below to load the data into the `DimMonth` table in the `billingDW` database.

In [10]:
#psql  -h localhost -U postgres -p 5432 billingDW < DimMonth.sql

# **Exercise 5 - Load data into Fact table**

Load data into the FactBilling table.

Run the command below to load the data into the `FactBilling` table in the `billingDW` database.

In [11]:
#psql  -h localhost -U postgres -p 5432 billingDW < FactBilling.sql

# **Exercise 6 - Run a sample query**

Run the command below to check the number of rows in all the tables in the `billingDW` database.

In [12]:
#psql  -h localhost -U postgres -p 5432 billingDW < verify.sql

You should see an output similar to the one below.

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DB0260EN-SkillsNetwork/labs/Setting%20up%20a%20staging%20area/images/verify.png" alt="Output of verify.sql">

Your data warehouse staging area is now ready.
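The schema, load, and verify steps from Exercises 3 to 6 can be scripted so that they always run in the same order. The following is a sketch under stated assumptions rather than part of the original lab: it uses `psql -f` instead of input redirection, sets `ON_ERROR_STOP=1` so a failing file halts the run, and defaults to a dry run that only prints the commands. The names `psql_command` and `run_all` are hypothetical.

```python
import shlex
import subprocess

# Connection settings taken from the commands used throughout this lab.
CONN = ["-h", "localhost", "-U", "postgres", "-p", "5432"]

# Order matters: schema first, then dimension tables, then the fact table.
SQL_FILES = [
    "star-schema.sql",
    "DimCustomer.sql",
    "DimMonth.sql",
    "FactBilling.sql",
    "verify.sql",
]


def psql_command(sql_file, dbname="billingDW"):
    """Build a psql invocation that runs one SQL file and stops at the first error."""
    return ["psql", *CONN, "-d", dbname, "-v", "ON_ERROR_STOP=1", "-f", sql_file]


def run_all(dry_run=True):
    """Print each command in order; execute it only when dry_run is False."""
    for sql_file in SQL_FILES:
        cmd = psql_command(sql_file)
        print(shlex.join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if psql exits with an error.
            subprocess.run(cmd, check=True)


run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` executes the files for real, which requires the server to be running and the `.sql` files to be in the current directory.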