# Ungraded Lab: Create a Database Table Challenge

## 📋 Overview 

Welcome to the final lab of Module 4! In this challenge, you'll apply the concepts you've learned throughout the course to create a database table from an online shop's product data. This hands-on exercise lets you practice schema design in a realistic setting and prepares you for similar tasks you may encounter as a data scientist.


## 🎯 Learning Outcomes
By the end of this lab, you will be able to:
- Create a new schema and tables based on given business requirements
- Implement the schema and tables using SQL CREATE TABLE statements
- Apply appropriate data types and constraints to ensure data integrity
- Import sample data into the newly created tables


## 📚 Dataset Information
You'll be working with the <b>product.csv</b> dataset, which contains information about products listed in an online shop. The data includes <b>product IDs, city locations, product URLs, tags for categorization, and links to product images</b>.


## 🖥️ Activities

### Activity 1: Analyze the Dataset  

Before creating our schema, we need to understand the structure of our data.

<b>Step 1:</b> Review the dataset information provided:
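- Examine the columns: product_id, city, product_url, tags, product_picture (the extra "Unnamed: 0" column in the preview below appears to be a saved row index rather than product data)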
- Note the data types and any potential constraints

In [1]:
# Import the necessary libraries
import sqlite3
import pandas as pd

# Take a look at the given data
products_data = pd.read_csv("product.csv")
products_data.head()

| Unnamed: 0 | product_id | city | product_url | tags | product_picture |
|---|---|---|---|---|---|
| 0 | 1 | New York | https://data_shop_inc.com/product/1 | electronics | https://picsum.photos/300?random=1 |
| 1 | 2 | Miami | https://data_shop_inc.com/product/2 | electronics | https://picsum.photos/300?random=2 |
| 2 | 3 | New York | https://data_shop_inc.com/product/3 | summer | https://picsum.photos/300?random=3 |
| 3 | 4 | Houston | https://data_shop_inc.com/product/4 | beauty | https://picsum.photos/300?random=4 |
| 4 | 5 | Miami | https://data_shop_inc.com/product/5 | beauty | https://picsum.photos/300?random=5 |


<b>Step 2:</b> Consider the following questions:
- What could be the primary key for this dataset?
- Are there any columns that should be required (NOT NULL)?
- Are there any columns that should be unique?

<b>💡 Tip:</b> Pay attention to the product_id column as a potential primary key.
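
One way to answer these questions is to profile each column for missing and repeated values. The sketch below is a minimal illustration that uses only the five preview rows, copied inline as a hypothetical `SAMPLE_CSV` string; the helper `profile_columns` is also made up for this example. On the full file, the pandas equivalents would be `Series.is_unique` and `DataFrame.isna()`.

```python
import csv
import io

# Hypothetical inline copy of the five preview rows shown above;
# in the lab you would profile the full product.csv instead.
SAMPLE_CSV = """product_id,city,product_url,tags,product_picture
1,New York,https://data_shop_inc.com/product/1,electronics,https://picsum.photos/300?random=1
2,Miami,https://data_shop_inc.com/product/2,electronics,https://picsum.photos/300?random=2
3,New York,https://data_shop_inc.com/product/3,summer,https://picsum.photos/300?random=3
4,Houston,https://data_shop_inc.com/product/4,beauty,https://picsum.photos/300?random=4
5,Miami,https://data_shop_inc.com/product/5,beauty,https://picsum.photos/300?random=5
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))


def profile_columns(records):
    """Return, per column, whether it has blanks and whether its values are unique."""
    report = {}
    for column in records[0].keys():
        values = [record[column] for record in records]
        report[column] = {
            "has_missing": any(value.strip() == "" for value in values),
            "is_unique": len(set(values)) == len(values),
        }
    return report


profile = profile_columns(rows)
for column, stats in profile.items():
    print(f"{column:16} missing={stats['has_missing']!s:5} unique={stats['is_unique']}")
```

A column that is unique and never missing is a primary key candidate. A column that is unique in five rows is not guaranteed to stay unique in the full dataset, so treat the result as a hint, not proof.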

### Activity 2: Creating the Schema

In SQL, a schema is a way to organize how a database is structured. It includes all the elements like tables, views, indexes, and procedures within the database. Think of it as a blueprint that groups everything together, making it easier to manage and secure the data.
In SQLite, unlike databases such as PostgreSQL or MySQL, you don’t explicitly create schemas. Instead, the database file itself serves as the schema, and all database objects like tables, views, and indexes are created directly within that database file.

Create a new database file, my_schema.db, which serves as the schema in SQLite:


In [2]:
# Create or connect to a schema
conn = sqlite3.connect("my_schema.db") 
cursor = conn.cursor()
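
To see what "the database file is the schema" means in practice, the following sketch lists the schema names SQLite exposes on a connection. It assumes an in-memory database so that no file is left behind, and the `scratch` table is hypothetical.

```python
import sqlite3

# An in-memory database is used here so the sketch leaves no file behind;
# the lab itself connects to "my_schema.db".
demo_conn = sqlite3.connect(":memory:")

# PRAGMA database_list returns (seq, name, file); the primary database is named "main".
schema_names = [row[1] for row in demo_conn.execute("PRAGMA database_list;")]
print(schema_names)

# Objects created without a prefix land in "main"; the prefix can also be explicit.
demo_conn.execute("CREATE TABLE main.scratch (id INTEGER PRIMARY KEY);")  # hypothetical table
table_names = [
    row[0]
    for row in demo_conn.execute(
        "SELECT name FROM main.sqlite_master WHERE type = 'table';"
    )
]
print(table_names)

demo_conn.close()
```

Whatever the file is called on disk, SQL statements refer to its contents through the schema name `main`.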

### Activity 3:  Design the Products Table

Now that we have created our schema, let's create the table.

<b>Step 1:</b> Design a new table called 'products' with appropriate columns: 

Choose appropriate data types for each column:
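- product_id: INTEGER (SQLite stores INT and BIGINT the same way, but only a column declared exactly as INTEGER PRIMARY KEY becomes an alias for the built-in rowid)
- city: VARCHAR or TEXT for city names (SQLite treats both as text and does not enforce a VARCHAR length)
- product_url: VARCHAR or TEXT for URLs
- tags: VARCHAR or TEXT for product categories
- product_picture: VARCHAR or TEXT for image URLs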
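
SQLite derives a type affinity from the declared type instead of enforcing it strictly, which affects how much the choice between VARCHAR and TEXT matters. The sketch below illustrates this on a hypothetical `type_demo` table; the column names are invented for the example.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")

# Hypothetical table used only to illustrate how SQLite treats declared types.
demo_conn.execute(
    "CREATE TABLE type_demo (short_name VARCHAR(5), amount INT, note TEXT);"
)

# The first value is longer than 5 characters; the second is a numeric string.
demo_conn.execute(
    "INSERT INTO type_demo VALUES (?, ?, ?);",
    ("San Francisco", "42", "free text"),
)

stored = demo_conn.execute(
    "SELECT short_name, typeof(short_name), amount, typeof(amount) FROM type_demo;"
).fetchone()
print(stored)

demo_conn.close()
```

The long string is stored in full despite VARCHAR(5), and the numeric string is converted to an integer because of the column's integer affinity. Other database systems such as PostgreSQL or MySQL do enforce declared lengths, so the declared type still documents your intent.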

```
CREATE TABLE products (
  -- Your code here
);
```

<b>Step 2:</b> Add constraints to ensure data integrity:
- Set product_id as the primary key
- Mark every column that must always have a value as NOT NULL
- Consider adding a UNIQUE constraint to product_url if each product should have a unique URL

<b>💡 Tip:</b> Separate column definitions with commas (no comma after the last one), and end the entire statement with a semicolon.
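
To see what these constraints do at insert time without giving away the challenge, here is a minimal sketch on a hypothetical `customers` table. The table, its columns, and the sample values are all assumptions made for illustration.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    """
    CREATE TABLE customers (              -- hypothetical table, not part of the lab
        customer_id INTEGER PRIMARY KEY,  -- unique row identifier
        email       TEXT NOT NULL UNIQUE, -- required, no duplicates
        city        TEXT NOT NULL         -- required, duplicates allowed
    );
    """
)
demo_conn.execute("INSERT INTO customers VALUES (1, 'ana@example.com', 'Miami');")

# Each of these rows violates exactly one constraint.
bad_rows = {
    "duplicate primary key": (1, "ben@example.com", "Houston"),
    "duplicate email": (2, "ana@example.com", "Houston"),
    "missing city": (3, "cy@example.com", None),
}

rejections = {}
for label, row in bad_rows.items():
    try:
        demo_conn.execute("INSERT INTO customers VALUES (?, ?, ?);", row)
    except sqlite3.IntegrityError as error:
        rejections[label] = str(error)

for label, message in rejections.items():
    print(f"{label}: {message}")

row_count = demo_conn.execute("SELECT COUNT(*) FROM customers;").fetchone()[0]
demo_conn.close()
```

All three inserts are rejected with `sqlite3.IntegrityError`, so only the original valid row remains in the table.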

### Activity 4: Implement the Table
Let's turn our design into actual SQL code.

<b>Step 1:</b> Write the SQL statement to create the 'products' table:

In [None]:
create_table_query = """
CREATE TABLE products (
  -- Your code here
);
"""

# Execute the query
cursor.execute(create_table_query)
conn.commit()

<b>💡 Tip:</b> Double-check your SQL syntax before running the query. Common errors include missing commas or semicolons.
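
When the syntax is wrong, `sqlite3` raises `sqlite3.OperationalError`, and the message usually points at the token near the mistake. The sketch below triggers this on purpose with a hypothetical `scratch` table that has a comma after its last column.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")

# Hypothetical statement with a deliberate mistake: a comma after the last column.
broken_query = """
CREATE TABLE scratch (
  id INTEGER PRIMARY KEY,
  label TEXT NOT NULL,
);
"""

try:
    demo_conn.execute(broken_query)
    error_message = None
except sqlite3.OperationalError as error:
    error_message = str(error)
    print(f"SQL rejected: {error_message}")

demo_conn.close()
```

Wrapping `cursor.execute` in a `try`/`except` like this while you iterate on your statement keeps the error readable instead of producing a long traceback.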

### Activity 5: Verify the Table
It's important to verify that the table was created correctly.

<b>Step 1:</b> Write a query to describe the 'products' table:

In [None]:
describe_query = """
PRAGMA table_info(products);
"""

# Execute the query and display results
df = pd.read_sql_query(describe_query, conn)
display(df)

<b>Step 2:</b> Run the cell and examine the output. Verify that:
- All columns are present
- Data types are correct
- Constraints are applied correctly: the notnull and pk columns of the output reflect NOT NULL and PRIMARY KEY (UNIQUE constraints are not listed by table_info)
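
`PRAGMA table_info` does not report UNIQUE constraints. SQLite enforces them through automatically created indexes, which you can inspect with `PRAGMA index_list` and `PRAGMA index_info`. The sketch below shows the idea on a hypothetical `links` table; substitute `products` to check your own table.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(  # hypothetical table for illustration
    "CREATE TABLE links (link_id INTEGER PRIMARY KEY, url TEXT NOT NULL UNIQUE);"
)

# table_info rows: (cid, name, type, notnull, dflt_value, pk)
for column in demo_conn.execute("PRAGMA table_info(links);"):
    print(column)

# index_list rows start with (seq, name, unique, ...); collect columns of unique indexes.
unique_columns = []
for index_row in demo_conn.execute("PRAGMA index_list(links);").fetchall():
    index_name, is_unique = index_row[1], index_row[2]
    if is_unique:
        # index_info rows: (seqno, cid, name)
        for info_row in demo_conn.execute(f'PRAGMA index_info("{index_name}");'):
            unique_columns.append(info_row[2])

print(unique_columns)
demo_conn.close()
```

The INTEGER PRIMARY KEY column does not appear in this list because it is backed by the rowid and needs no separate index.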


<b>💡 Tip:</b> If you notice any issues, you can drop the table and recreate it with corrections.
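
A drop-and-recreate step can be made safe to re-run by combining `DROP TABLE IF EXISTS` with the `CREATE TABLE` statement. The sketch below assumes a hypothetical `scratch` table and a made-up helper named `rebuild_table`. Keep in mind that dropping a table also deletes any rows it holds.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")


def rebuild_table(connection):
    """Drop and recreate a hypothetical scratch table so the cell can be re-run."""
    connection.execute("DROP TABLE IF EXISTS scratch;")
    connection.execute(
        "CREATE TABLE scratch (id INTEGER PRIMARY KEY, label TEXT NOT NULL);"
    )
    connection.commit()


rebuild_table(demo_conn)
rebuild_table(demo_conn)  # second run succeeds instead of raising "already exists"

table_count = demo_conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'scratch';"
).fetchone()[0]
print(table_count)

demo_conn.close()
```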

In [None]:
# Close the database connection 
conn.close()

## ✅ Success Checklist
- The 'products' table is created without errors
- All required columns are present in the table
- Data types are appropriate for each column
- Constraints (PRIMARY KEY, NOT NULL, etc.) are correctly applied
- You can successfully describe the table structure

## 🔍 Common Issues & Solutions 

- Problem: "Table already exists" error 
    - Solution: If you're re-running your code, drop the existing table first:<br>
    <b>DROP TABLE IF EXISTS products;</b>

- Problem: Incorrect data type causing insertion issues  
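    - Solution: Double-check that your declared data types match the sample data provided. SQLite does not enforce VARCHAR lengths, so long strings are rarely the cause; a more likely culprit is a non-integer value going into an INTEGER PRIMARY KEY column, which SQLite rejects with a "datatype mismatch" error.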
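
To show how loading rows interacts with declared types, here is a minimal sketch that inserts a few preview values into a hypothetical two-column `products_demo` table. It is deliberately not the full lab solution, and the table name and the bad sample value are assumptions.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")

# Hypothetical two-column table; deliberately not the full lab solution.
demo_conn.execute(
    "CREATE TABLE products_demo (product_id INTEGER PRIMARY KEY, city TEXT NOT NULL);"
)

# Values taken from the preview rows of product.csv.
good_rows = [(1, "New York"), (2, "Miami"), (3, "New York")]
with demo_conn:  # commits on success, rolls back if an insert fails
    demo_conn.executemany("INSERT INTO products_demo VALUES (?, ?);", good_rows)

# A non-numeric id cannot be stored in an INTEGER PRIMARY KEY column.
try:
    with demo_conn:
        demo_conn.execute(
            "INSERT INTO products_demo VALUES (?, ?);", ("not-a-number", "Houston")
        )
    mismatch_message = None
except sqlite3.Error as error:
    mismatch_message = str(error)

inserted_count = demo_conn.execute("SELECT COUNT(*) FROM products_demo;").fetchone()[0]
print(inserted_count, mismatch_message)

demo_conn.close()
```

Using parameter placeholders (`?`) instead of string formatting lets `sqlite3` handle quoting and type conversion for you.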

## ➡️ Summary
Great job completing this lab! You've now designed and implemented a database schema, a fundamental skill for any data scientist working with structured data. In your future projects, remember to always start with a well-thought-out schema that accurately represents your data and enforces necessary constraints.

### 🔑 Key Points
- Properly designing a database schema is crucial for efficient data management
- SQL CREATE TABLE statements define the structure of your database
- Choosing appropriate data types and constraints ensures data integrity
- Always verify your schema after creation to catch any potential issues