### Tutorial 10: Extracting Data from Wide World Importers and Analyzing in SQLite

In this tutorial, students will extract customer-related data from the **Wide World Importers** database, create a **SQLite database** on their computer, and perform a full analysis using SQL queries.

#### **Prerequisites**
- Access to the **Wide World Importers OLTP** database in Microsoft SQL Server.
- Python 3 installed (recommended for working with SQLite). The Python import option in Step 2 also requires the `pandas` package.
- SQLite installed (or use SQLite via Python's `sqlite3` module).

#### This tutorial covers fundamental data engineering tasks, including:

- ✅ Extracting structured data from a transactional database.
- ✅ Transforming data for compatibility with a new database system (SQLite).
- ✅ Loading the transformed data into another database for analysis, which completes the ETL (extract, transform, load) process.
- ✅ Querying and analyzing data to derive business insights.

---

#### **Step 1: Extract Customer Data from Wide World Importers**

##### **Query to Retrieve Customer Data**
Run the following SQL query in SQL Server to extract relevant customer data:

```sql
SELECT c.CustomerID, c.CustomerName, c.BillToCustomerID, c.AccountOpenedDate, 
       c.StandardDiscountPercentage, ci.CityName, co.CountryName
FROM Sales.Customers AS c
JOIN Application.Cities AS ci ON c.DeliveryCityID = ci.CityID
JOIN Application.StateProvinces AS sp ON ci.StateProvinceID = sp.StateProvinceID
JOIN Application.Countries AS co ON sp.CountryID = co.CountryID;
```

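- Export the results as a **CSV file** (`customers.csv`) with the column headers included as the first row. The import steps in Step 2 assume this header row is present.
- Save the file in the folder where you will create the SQLite database.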
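Before importing, it can help to confirm that the exported file really contains the header row and all seven columns from the query. The following is a minimal sketch using only the standard library; the helper name `missing_columns` and the sample row are illustrative assumptions, not part of the tutorial's required files.

```python
import csv
import io

# Column names produced by the extraction query in Step 1.
EXPECTED_COLUMNS = [
    "CustomerID", "CustomerName", "BillToCustomerID", "AccountOpenedDate",
    "StandardDiscountPercentage", "CityName", "CountryName",
]

def missing_columns(csv_file, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    # Remove a UTF-8 byte-order mark and stray whitespace that some export tools add.
    found = {name.lstrip("\ufeff").strip() for name in header}
    return [name for name in expected if name not in found]

# Demonstration with an in-memory sample (a made-up row, not real WWI data).
sample = io.StringIO(
    "CustomerID,CustomerName,BillToCustomerID,AccountOpenedDate,"
    "StandardDiscountPercentage,CityName,CountryName\n"
    "1,Example Customer,1,2013-01-01,0.000,Example City,Example Country\n"
)
print(missing_columns(sample))  # an empty list means every column is present
```

To check the real export, open it with `open("customers.csv", newline="", encoding="utf-8-sig")` and pass the file object to `missing_columns`.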

---

#### **Step 2: Create a SQLite Database and Import Data**

##### **1. Create a SQLite Database**
Run the following command in a terminal to open the SQLite shell on a new database file (the file is created if it does not exist):

```sh
sqlite3 customers_analysis.db
```

Or, using Python:

```python
import sqlite3
conn = sqlite3.connect("customers_analysis.db")
cursor = conn.cursor()
```

##### **2. Create a Table for Customer Data**
Run this SQL command in SQLite:

```sql
CREATE TABLE Customers (
    CustomerID INTEGER PRIMARY KEY,
    CustomerName TEXT,
    BillToCustomerID INTEGER,
    AccountOpenedDate TEXT,
    StandardDiscountPercentage REAL,
    CityName TEXT,
    CountryName TEXT
);
```
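The same table can also be created from Python. The sketch below wraps the `CREATE TABLE` statement in a hypothetical helper, `create_customers_table`, and reads the schema back with `PRAGMA table_info` to confirm the columns and types. It uses an in-memory database so it can be run safely; `IF NOT EXISTS` is an addition here so the helper can be called more than once.

```python
import sqlite3

CREATE_CUSTOMERS = """
CREATE TABLE IF NOT EXISTS Customers (
    CustomerID INTEGER PRIMARY KEY,
    CustomerName TEXT,
    BillToCustomerID INTEGER,
    AccountOpenedDate TEXT,
    StandardDiscountPercentage REAL,
    CityName TEXT,
    CountryName TEXT
);
"""

def create_customers_table(conn):
    """Create the Customers table (if missing) and return its (name, type) pairs."""
    conn.execute(CREATE_CUSTOMERS)
    conn.commit()
    # PRAGMA table_info yields one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Customers)")]

# ":memory:" keeps this demonstration off disk; the tutorial file would be
# "customers_analysis.db" instead.
conn = sqlite3.connect(":memory:")
columns = create_customers_table(conn)
for name, col_type in columns:
    print(name, col_type)
```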

##### **3. Import Data from CSV into SQLite**
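Using the SQLite command-line shell, run these dot-commands at the `sqlite>` prompt. Because the `Customers` table already exists, every line of the file is treated as data, so `--skip 1` is needed to skip the CSV header row (this option requires SQLite 3.32 or later):

```sh
.mode csv
.import --skip 1 customers.csv Customers
```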

Using Python:

```python
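import pandas as pd
import sqlite3

conn = sqlite3.connect("customers_analysis.db")

# read_csv treats the first row of customers.csv as the column names.
df = pd.read_csv("customers.csv")

# "append" keeps the table created above, including its primary key and
# column types; "replace" would drop it and create a new table without them.
# Run this only once: a second run would violate the primary key.
df.to_sql("Customers", conn, if_exists="append", index=False)
conn.close()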
```
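If `pandas` is not available, the import can be approximated with the standard library alone. This is a sketch under the assumption that the CSV has a header row with the seven column names from Step 1; the helper name `load_customers` and the sample rows are made up for illustration.

```python
import csv
import io
import sqlite3

def load_customers(conn, csv_file):
    """Insert rows from a CSV file object (with a header row) into Customers."""
    reader = csv.DictReader(csv_file)
    rows = [
        (
            int(r["CustomerID"]), r["CustomerName"], int(r["BillToCustomerID"]),
            r["AccountOpenedDate"], float(r["StandardDiscountPercentage"]),
            r["CityName"], r["CountryName"],
        )
        for r in reader
    ]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany("INSERT INTO Customers VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return len(rows)

# Demonstration against an in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, "
    "BillToCustomerID INTEGER, AccountOpenedDate TEXT, "
    "StandardDiscountPercentage REAL, CityName TEXT, CountryName TEXT)"
)
sample_csv = io.StringIO(
    "CustomerID,CustomerName,BillToCustomerID,AccountOpenedDate,"
    "StandardDiscountPercentage,CityName,CountryName\n"
    "1,Example Head Office,1,2013-01-01,0.000,Example City,Example Country\n"
    "2,Example Branch,1,2014-06-15,2.500,Other City,Example Country\n"
)
inserted = load_customers(conn, sample_csv)
print(inserted, "rows inserted")
```

For the real file, connect to `customers_analysis.db` and pass `open("customers.csv", newline="", encoding="utf-8-sig")` as `csv_file`; the `utf-8-sig` encoding tolerates a byte-order mark at the start of the export.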

---

#### **Step 3: Perform Data Analysis in SQLite**

##### **1. Find the Total Number of Customers**
```sql
SELECT COUNT(*) AS TotalCustomers FROM Customers;
```

##### **2. List the Top 10 Customers with the Highest Discounts**
```sql
SELECT CustomerName, StandardDiscountPercentage 
FROM Customers 
ORDER BY StandardDiscountPercentage DESC 
LIMIT 10;
```

##### **3. Find the Number of Customers Per Country**
```sql
SELECT CountryName, COUNT(*) AS NumberOfCustomers 
FROM Customers 
GROUP BY CountryName 
ORDER BY NumberOfCustomers DESC;
```

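##### **4. Find Customers Whose Accounts Were Opened More Than 5 Years Ago**
`AccountOpenedDate` is stored as text, so this comparison is only correct when the dates are in ISO 8601 format (`YYYY-MM-DD`), which sorts in chronological order. The query measures account age, not recent activity.

```sql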
SELECT CustomerName, AccountOpenedDate 
FROM Customers 
WHERE AccountOpenedDate <= DATE('now', '-5 years');
```
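If an export tool wrote the dates in another layout, they would need to be converted before this query gives correct results. The sketch below assumes a hypothetical `MM/DD/YYYY` source format and a made-up helper, `to_iso`; it then runs the same five-year filter on two illustrative rows.

```python
import sqlite3
from datetime import date, datetime

def to_iso(date_text, fmt="%m/%d/%Y"):
    """Convert a date string in the given format to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(date_text, fmt).date().isoformat()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, AccountOpenedDate TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("Old Account", to_iso("01/15/2013")),     # stored as 2013-01-15
        ("New Account", date.today().isoformat()),  # opened today
    ],
)

# Same filter as the tutorial query: ISO text dates compare chronologically.
old_accounts = [
    row[0]
    for row in conn.execute(
        "SELECT CustomerName FROM Customers "
        "WHERE AccountOpenedDate <= DATE('now', '-5 years')"
    )
]
print(old_accounts)
```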

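##### **5. Identify Duplicate Billing Accounts**
A `BillToCustomerID` that appears on more than one row is a billing account shared by several customers (for example, branches invoiced through a head office), not necessarily a data-entry duplicate.

```sql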
SELECT BillToCustomerID, COUNT(*) AS DuplicateCount 
FROM Customers 
GROUP BY BillToCustomerID 
HAVING COUNT(*) > 1;
```
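To make the result easier to read, the billing account's name can be looked up by joining `Customers` to itself. This is an optional sketch with made-up rows; it assumes every `BillToCustomerID` also appears as a `CustomerID` in the extracted table.

```python
import sqlite3

SHARED_BILLING_QUERY = """
SELECT b.CustomerName AS BillToName, COUNT(*) AS LinkedCustomers
FROM Customers AS c
JOIN Customers AS b ON b.CustomerID = c.BillToCustomerID
GROUP BY b.CustomerID, b.CustomerName
HAVING COUNT(*) > 1
ORDER BY LinkedCustomers DESC;
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, "
    "CustomerName TEXT, BillToCustomerID INTEGER)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        (1, "Head Office A", 1),   # bills to itself
        (2, "Branch A1", 1),       # billed through customer 1
        (3, "Branch A2", 1),       # billed through customer 1
        (4, "Independent B", 4),   # bills to itself only
    ],
)

shared = conn.execute(SHARED_BILLING_QUERY).fetchall()
for bill_to_name, linked in shared:
    print(bill_to_name, linked)
```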

##### **6. Rank Customers Based on Discount Percentage Using Window Functions**
**Task:** Rank customers based on their **StandardDiscountPercentage** within each **CountryName** using window functions. Window functions require SQLite 3.25 or later.

```sql
SELECT CustomerName, CountryName, StandardDiscountPercentage,
       RANK() OVER (PARTITION BY CountryName ORDER BY StandardDiscountPercentage DESC) AS DiscountRank
FROM Customers;
```
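`RANK()` gives tied customers the same rank and then leaves a gap before the next one. The sketch below runs the ranking from Python on made-up rows and adds `DENSE_RANK()` (not part of the task, shown only for comparison) to make the tie behaviour visible. An `ORDER BY` is added so the output order is deterministic.

```python
import sqlite3

RANK_QUERY = """
SELECT CustomerName, CountryName, StandardDiscountPercentage,
       RANK() OVER (PARTITION BY CountryName
                    ORDER BY StandardDiscountPercentage DESC) AS DiscountRank,
       DENSE_RANK() OVER (PARTITION BY CountryName
                          ORDER BY StandardDiscountPercentage DESC) AS DenseDiscountRank
FROM Customers
ORDER BY CountryName, DiscountRank, CustomerName;
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerName TEXT, CountryName TEXT, "
    "StandardDiscountPercentage REAL)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        ("A", "X", 5.0),
        ("B", "X", 5.0),  # ties with A
        ("C", "X", 2.0),
        ("D", "Y", 1.0),
    ],
)

ranked = conn.execute(RANK_QUERY).fetchall()
for row in ranked:
    print(row)
```

In this sample, A and B share rank 1 in country X; C then receives rank 3 from `RANK()` but 2 from `DENSE_RANK()`.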

---

#### **Step 4: Generate a Report**
- Create a summary report including:
  - Total number of customers
  - Customer distribution by country
  - Top customers with highest discounts
  - Any insights from the analysis
- Save the report as `customer_analysis_report.pdf`.
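The figures for the report can be gathered in one pass from Python. The following sketch builds a Markdown-formatted summary string; the helper name `build_summary` and the sample rows are assumptions for illustration, and converting the text to `customer_analysis_report.pdf` is left to whatever tool you prefer.

```python
import sqlite3

def build_summary(conn, top_n=10):
    """Return a Markdown summary of the Customers table as a single string."""
    total = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]
    by_country = conn.execute(
        "SELECT CountryName, COUNT(*) AS n FROM Customers "
        "GROUP BY CountryName ORDER BY n DESC, CountryName"
    ).fetchall()
    top = conn.execute(
        "SELECT CustomerName, StandardDiscountPercentage FROM Customers "
        "ORDER BY StandardDiscountPercentage DESC, CustomerName LIMIT ?",
        (top_n,),
    ).fetchall()

    lines = ["# Customer Analysis Summary", "", f"Total customers: {total}", ""]
    lines.append("## Customers by country")
    lines += [f"- {country}: {n}" for country, n in by_country]
    lines += ["", f"## Top {len(top)} customers by discount"]
    lines += [f"- {name}: {pct}%" for name, pct in top]
    return "\n".join(lines)

# Demonstration with made-up rows in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerName TEXT, CountryName TEXT, "
    "StandardDiscountPercentage REAL)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [("A", "X", 5.0), ("B", "X", 2.0), ("C", "Y", 1.0)],
)
summary = build_summary(conn)
print(summary)
```

Written insights from the analysis still need to be added by hand; the sketch only collects the numbers.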

---

#### **Submission Instructions**
- Submit the SQLite database file (`customers_analysis.db`).
- Provide the SQL queries used (`queries.sql`).
- Include a written report (`customer_analysis_report.pdf`).
- If using Python, submit the Python script (`import_script.py`).

By completing this tutorial, students will gain hands-on experience in **data extraction, transformation, and analysis using SQL and SQLite**. 🚀
