### Tutorial 09: Extracting and Exploring Data in Microsoft SQL Server
 
This tutorial shows how to explore and extract data from a Microsoft SQL Server database using Python and SQL queries.

#### **Objectives**
- Connect to a Microsoft SQL Server database using Python.
- Retrieve and explore the database schema.
- Perform basic `SELECT` queries.
- Retrieve specific columns and filter data using the `WHERE` clause.
- Understand how to sort and limit results.

---

#### **Pre-requisites**
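1. Python installed with the `pyodbc`, `pandas`, `sqlalchemy`, and `python-dotenv` libraries, plus the Microsoft ODBC Driver 18 for SQL Server.
2. Access to a Microsoft SQL Server instance that hosts the WideWorldImporters sample database.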

---
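The connection cell later in this notebook reads `DB_SERVER`, `DB_NAME`, `DB_USER`, and `DB_PASSWORD` from the environment. A small check run beforehand can report a missing value more clearly than the ODBC driver would. The sketch below is one possible way to do that; the helper name `missing_settings` is hypothetical and not part of the tutorial.

```python
import os

# Variable names match the ones read with os.getenv() in the connection cell.
REQUIRED_SETTINGS = ("DB_SERVER", "DB_NAME", "DB_USER", "DB_PASSWORD")


def missing_settings(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to os.environ; pass a plain dict to check other sources.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]
```

Calling `missing_settings()` after `load_dotenv()` returns an empty list when every setting is present, so the connection attempt can be skipped with a readable message otherwise.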

#### **Step 1: Connect to Microsoft SQL Server**
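- Use Python to establish a connection to your SQL Server instance. The connection cell reads the server name, database name, and credentials from environment variables loaded from a `.env` file.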

#### **Step 2: Explore the Schema**
- Before writing queries, inspect the structure of the database: which schemas, tables, and columns it contains.

### SQL Exercises for WideWorldImporters

#### **Exercise 1: List Relevant Tables**
🔹 **Task:**  
- Write a query to **list all tables** from the `Sales` and `Warehouse` schemas only.  
- Identify the schema and table names that contain useful data.  
---

#### **Exercise 2: List Columns of a Table**
🔹 **Task:**  
- Write a query to list all columns in the `Sales.Orders` table.  
---
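Column names are also available on the Python side: after any `execute` call, the DB-API attribute `cursor.description` holds one tuple per result column, with the name in position 0. The code cells in this notebook repeat that lookup by hand, so it could be wrapped in a helper. This is only a sketch, and `rows_as_dicts` is a hypothetical name.

```python
def rows_as_dicts(cursor):
    """Pair each fetched row with the column names in cursor.description.

    Expects a DB-API cursor on which execute() has already been called.
    """
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

With an open pyodbc cursor, `rows_as_dicts(cursor)` right after `cursor.execute(query)` gives a list of dictionaries keyed by column name.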

#### **Exercise 3: Retrieve Specific Columns**
🔹 **Task:**  
- Get the `OrderID`, `OrderDate`, and `CustomerID` of every order in `Sales.Orders` placed in 2013.  
---
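The objectives also mention sorting and limiting results, which Exercise 3 can be extended to practice. The sketch below assumes T-SQL's `ORDER BY` and parenthesised `TOP (?)`, and passes the year boundaries as `?` parameters instead of writing them into the SQL text. The names `year_bounds`, `ORDERS_IN_RANGE`, and `latest_orders` are hypothetical.

```python
from datetime import date


def year_bounds(year):
    """Return the half-open range [Jan 1 of year, Jan 1 of the next year)."""
    return date(year, 1, 1), date(year + 1, 1, 1)


# Assumption: the driver accepts a parameter marker inside TOP (...).
ORDERS_IN_RANGE = """
    SELECT TOP (?) OrderID, OrderDate, CustomerID
    FROM Sales.Orders
    WHERE OrderDate >= ? AND OrderDate < ?
    ORDER BY OrderDate DESC, OrderID DESC;
"""


def latest_orders(cursor, year, limit=10):
    """Run the query on an open DB-API cursor and return the fetched rows."""
    start, end = year_bounds(year)
    cursor.execute(ORDERS_IN_RANGE, (limit, start, end))
    return cursor.fetchall()
```

For example, `latest_orders(cursor, 2013, limit=5)` would ask for the five most recent orders of 2013. The second sort key, `OrderID`, keeps the order stable when several orders share a date.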

## **Final Note**
Always close the connection when you are done:

```python
conn.close()
print("Connection closed.")
```
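If a query raises an exception, an explicit `conn.close()` at the end of the notebook is never reached. One way to guarantee cleanup is `contextlib.closing` from the standard library, sketched here. Note that pyodbc's documentation describes `with conn:` as committing on exit rather than closing, which is why `closing` is used instead. The helper name `fetch_all` and its `connect` argument are hypothetical; `connect` stands for any zero-argument callable that returns a DB-API connection, such as `lambda: pyodbc.connect(connection_string)`.

```python
from contextlib import closing


def fetch_all(connect, query):
    """Open a connection, run one query, and close cursor and connection.

    Both close() calls run even if execute() or fetchall() raises.
    """
    with closing(connect()) as conn, closing(conn.cursor()) as cursor:
        cursor.execute(query)
        return cursor.fetchall()
```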
---

## **Summary of Exercises**
1. List all tables in the `Sales` and `Warehouse` schemas.
2. List all columns in the `Sales.Orders` table.
3. Retrieve specific columns for orders placed in 2013.
---


In [3]:
import pyodbc
from dotenv import load_dotenv
import os

load_dotenv()  # Load environment variables from .env file

server = os.getenv("DB_SERVER")
database = os.getenv("DB_NAME")
username = os.getenv("DB_USER")
password = os.getenv("DB_PASSWORD") 
driver = '{ODBC Driver 18 for SQL Server}'  # Ensure the driver matches your installation

try:
    # ENCRYPT=yes encrypts the connection; TrustServerCertificate=yes skips
    # certificate validation, so use it only with local or test servers.
    conn = pyodbc.connect(
        f'DRIVER={driver};SERVER={server};DATABASE={database};'
        f'UID={username};PWD={password};'
        'ENCRYPT=yes;TrustServerCertificate=yes'
    )
    cursor = conn.cursor()
    print("Connection successful!")
except pyodbc.Error as e:
    print(f"Error: {e}")
    raise  # Stop here: the later cells need `conn` and `cursor`



Connection successful!


In [None]:
# **Exercise 1: List Relevant Tables**
# INFORMATION_SCHEMA.TABLES lists tables and views; keep base tables only

query = """
    SELECT TABLE_SCHEMA, TABLE_NAME 
    FROM INFORMATION_SCHEMA.TABLES 
    WHERE TABLE_TYPE = 'BASE TABLE' AND
    TABLE_SCHEMA IN ('Sales', 'Warehouse')
    """

cursor.execute(query)
tables = cursor.fetchall()
print("Tables in the database:")
for schema, table in tables:
    print(f"{schema}.{table}")
    


In [14]:
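import pandas as pd
import urllib.parse  # `import urllib` alone does not guarantee that `urllib.parse` is loaded
from sqlalchemy import create_engine

driver = 'ODBC Driver 18 for SQL Server'

# URL-encode the ODBC connection string so it can be embedded in the SQLAlchemy URL
params = urllib.parse.quote_plus(
    f"DRIVER={{{driver}}};SERVER={server};DATABASE={database};"
    f"UID={username};PWD={password};ENCRYPT=yes;TrustServerCertificate=yes"
)

engine = create_engine(f"mssql+pyodbc:///?odbc_connect={params}")

# Re-run the Exercise 1 query (still stored in `query`) through pandas
tables_df = pd.read_sql(query, engine)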

# Display the tables
print("Tables in the database:")
for _, row in tables_df.iterrows():
    print(f"{row['TABLE_SCHEMA']}.{row['TABLE_NAME']}")

Tables in the database:
Warehouse.Colors
Warehouse.Colors_Archive
Sales.OrderLines
Warehouse.PackageTypes
Warehouse.PackageTypes_Archive
Warehouse.StockGroups
Warehouse.StockItemStockGroups
Warehouse.StockGroups_Archive
Sales.CustomerTransactions
Sales.InvoiceLines
Warehouse.StockItemTransactions
Sales.Customers
Sales.Customers_Archive
Sales.Orders
Warehouse.ColdRoomTemperatures
Warehouse.VehicleTemperatures
Warehouse.StockItems
Warehouse.ColdRoomTemperatures_Archive
Warehouse.StockItems_Archive
Warehouse.StockItemHoldings
Sales.SpecialDeals
Sales.BuyingGroups
Sales.Invoices
Sales.BuyingGroups_Archive
Sales.CustomerCategories
Sales.CustomerCategories_Archive


In [None]:
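## **Exercise 2: List Columns of a Table** ##
import pandas as pd

# Query the catalog instead of fetching every row of Sales.Orders
query = """
    SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = 'Sales' AND TABLE_NAME = 'Orders'
    ORDER BY ORDINAL_POSITION
    """

cursor.execute(query)
column_names = [column[0] for column in cursor.description]
rows = cursor.fetchall()
df = pd.DataFrame.from_records(rows, columns=column_names)
df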



In [None]:
# Check the data type of OrderDate before filtering on it
query = """
    SELECT COLUMN_NAME, DATA_TYPE
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = 'Sales' AND TABLE_NAME = 'Orders' AND COLUMN_NAME = 'OrderDate';
"""

cursor.execute(query)
column_names = [column[0] for column in cursor.description]
rows = cursor.fetchall()
print(column_names)
print(rows)


In [None]:
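## **Exercise 3: Extract the orders in 2013** ##
import pandas as pd

# A half-open date range lets SQL Server use an index on OrderDate,
# which wrapping the column in YEAR() can prevent
query = """
    SELECT OrderID, OrderDate, CustomerID
    FROM Sales.Orders
    WHERE OrderDate >= '2013-01-01' AND OrderDate < '2014-01-01';
"""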
cursor.execute(query)
column_names = [column[0] for column in cursor.description]
rows = cursor.fetchall()

df2 = pd.DataFrame.from_records(rows, columns=column_names)
df2


In [None]:
cursor.close()
conn.close()