77 changes: 75 additions & 2 deletions 02_activities/assignments/Assignment2.md
The store wants to keep customer addresses. Propose two architectures for the CU
**HINT:** search type 1 vs type 2 slowly changing dimensions.

```
To store customer addresses, we propose two architectures:

Type 1: Overwriting Changes

A simple structure where the latest address replaces the previous one:

Table: CUSTOMER_ADDRESS
- customer_id (Primary Key, Foreign Key to CUSTOMER)
- address
- city
- state
- zip

In this model, whenever a customer updates their address, the old data is overwritten, so only the current address is ever stored.

Type 2: Retaining History

A richer structure that preserves historical address changes:

Table: CUSTOMER_ADDRESS_HISTORY
- customer_id (Foreign Key to CUSTOMER; the primary key is composite, e.g. customer_id + start_date)
- address
- city
- state
- zip
- start_date
- end_date (NULL while the address is current)

With this approach, whenever a customer changes their address, a new record is created with the current start_date, and the previous record is closed out by setting its end_date, preserving the full history.

Type 1 is best when historical data is not needed; Type 2 is essential when address changes must be tracked over time.
```
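The Type 2 update flow described above can be sketched end-to-end. The following is a minimal illustration using SQLite via Python's `sqlite3`; the table name follows the answer, but the column list is trimmed and all row data is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE customer_address_history (
        customer_id INTEGER,
        address     TEXT,
        start_date  TEXT,
        end_date    TEXT          -- NULL while the address is current
    )
""")

def change_address(cur, customer_id, new_address, today):
    # Close out the currently open record, if any...
    cur.execute(
        "UPDATE customer_address_history SET end_date = ? "
        "WHERE customer_id = ? AND end_date IS NULL",
        (today, customer_id),
    )
    # ...then open a new record for the new address.
    cur.execute(
        "INSERT INTO customer_address_history VALUES (?, ?, ?, NULL)",
        (customer_id, new_address, today),
    )

change_address(cur, 1, "12 Elm St", "2024-01-01")
change_address(cur, 1, "99 Oak Ave", "2024-06-15")

rows = cur.execute(
    "SELECT address, start_date, end_date FROM customer_address_history "
    "WHERE customer_id = 1 ORDER BY start_date"
).fetchall()
# Both addresses survive: the old one now carries an end_date.
```

A Type 1 design would instead issue a plain `UPDATE` against a single-row-per-customer table, discarding the old value.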

***
Consider, for example, concepts of labour, bias, LLM proliferation, moderating c


```
Section 4: Ethics in AI and Data Processing

Ethical Issues in "Neural Nets are Just People All the Way Down"

The article by Vicki Boykis explores the ethical complexities surrounding AI, specifically Large Language Models (LLMs). Key ethical concerns include:

Bias in AI Models

AI systems inherit biases from their training data, which reflects societal prejudices.

This perpetuates discrimination in automated decision-making.

Labor and Automation

LLMs rely on vast amounts of training data labeled by underpaid human workers.

Building AI systems on this low-paid, global workforce raises concerns about labor exploitation.

Challenges in Moderating AI-Generated Content

AI-generated content can be harmful or misleading.

There is no perfect moderation system, as AI models lack human context and ethics.

AI in Society & Ethical Dilemmas

The rapid growth of LLMs concentrates AI development in the hands of a few large corporations.

Ethical concerns arise about transparency, accessibility, and misinformation.

Conclusion

While AI provides immense benefits, its ethical implications cannot be ignored. Mitigating bias, labor exploitation, and misinformation requires continuous oversight, regulation, and a commitment to ethical AI development. AI is ultimately shaped by human values, and ensuring fairness and accountability remains a shared responsibility.
```
Binary file added 02_activities/assignments/BOOKSTORE ERD.jpeg
209 changes: 107 additions & 102 deletions 02_activities/assignments/assignment2.sql
/* ASSIGNMENT 2 */ --- FESOBI OLUWAMUYIWA
/* SECTION 2 */

-- COALESCE
/* 1. Our favourite manager wants a detailed long list of products, but is afraid of tables!
We tell them, no problem! We can produce a list with all of the appropriate details.

Using the following syntax, create our super cool and not at all needy manager a list:

-- COALESCE - Handle NULL values
SELECT
product_name || ', ' || product_size|| ' (' || product_qty_type || ')'
FROM product

But wait! The product table has some bad data (a few NULL values).
Find the NULLs and then using COALESCE, replace the NULL with a
blank for the first problem, and 'unit' for the second problem.

HINT: keep the syntax the same, but edit the correct components with the string.
The `||` values concatenate the columns into strings.
Edit the appropriate columns -- you're making two edits -- and the NULL rows will be fixed.
All the other rows will remain the same. */



--Windowed Functions
/* 1. Write a query that selects from the customer_purchases table and numbers each customer’s
visits to the farmer’s market (labeling each market date with a different number).
Each customer’s first visit is labeled 1, second visit is labeled 2, etc.

You can either display all rows in the customer_purchases table, with the counter changing on
each new market date for each customer, or select only the unique market dates per customer
(without purchase details) and number those visits.
HINT: One of these approaches uses ROW_NUMBER() and one uses DENSE_RANK(). */



/* 2. Reverse the numbering of the query from part 1 so each customer’s most recent visit is labeled 1,
then write another query that uses this one as a subquery (or temp table) and filters the results to
only the customer’s most recent visit. */



/* 3. Using a COUNT() window function, include a value along with each row of the
customer_purchases table that indicates how many different times that customer has purchased that product_id. */



-- String manipulations
/* 1. Some product names in the product table have descriptions like "Jar" or "Organic".
These are separated from the product name with a hyphen.
Create a column using SUBSTR (and a couple of other commands) that captures these, but is otherwise NULL.
Remove any trailing or leading whitespaces. Don't just use a case statement for each product!

| product_name | description |
|----------------------------|-------------|
| Habanero Peppers - Organic | Organic |

Hint: you might need to use INSTR(product_name,'-') to find the hyphens. INSTR will help split the column. */



/* 2. Filter the query to show any product_size value that contains a number with REGEXP. */

-- COALESCE answer: blank out NULL product_size, default NULL product_qty_type to 'unit'
SELECT
product_name || ', ' || COALESCE(product_size, '') || ' (' || COALESCE(product_qty_type, 'unit') || ')'
FROM product;
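A quick way to see the two COALESCE fixes in action, using an in-memory SQLite table with invented rows (a NULL size becomes a blank, a NULL qty_type becomes 'unit'):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_name TEXT, product_size TEXT, product_qty_type TEXT);
INSERT INTO product VALUES
    ('Habanero Peppers', 'medium', 'lbs'),
    ('Eggs', NULL, NULL);     -- the two NULL problems
""")
rows = conn.execute("""
SELECT product_name || ', ' || COALESCE(product_size, '')
       || ' (' || COALESCE(product_qty_type, 'unit') || ')'
FROM product
""").fetchall()
# Without COALESCE, the NULLs would make the whole concatenation NULL.
```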

--Window function
SELECT
customer_id,
market_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY market_date ASC) AS visit_number
FROM customer_purchases;

--Reversing the numbering
SELECT
customer_id,
market_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY market_date DESC) AS visit_number
FROM customer_purchases;

--Filtering only the most recent visit for each customer:

WITH RankedVisits AS (
SELECT
customer_id,
market_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY market_date DESC) AS visit_number
FROM customer_purchases
)
SELECT customer_id, market_date
FROM RankedVisits
WHERE visit_number = 1;
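As a sanity check, the most-recent-visit pattern above (ROW_NUMBER() ordered by market_date DESC, then filter to 1) behaves as expected on toy data. This assumes SQLite 3.25+ for window-function support, and the rows below are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_purchases (customer_id INT, market_date TEXT)")
conn.executemany(
    "INSERT INTO customer_purchases VALUES (?, ?)",
    [(1, "2024-04-01"), (1, "2024-04-08"), (2, "2024-04-01")],
)
rows = conn.execute("""
    WITH RankedVisits AS (
        SELECT customer_id, market_date,
               ROW_NUMBER() OVER (PARTITION BY customer_id
                                  ORDER BY market_date DESC) AS visit_number
        FROM customer_purchases
    )
    SELECT customer_id, market_date FROM RankedVisits WHERE visit_number = 1
""").fetchall()
# Each customer keeps only their latest market_date.
```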



-- Count Window Function - Number of times a customer has purchased a product
SELECT
customer_id,
product_id,
COUNT(*) OVER (PARTITION BY customer_id, product_id) AS purchase_count
FROM customer_purchases;

-- UNION
/* 1. Using a UNION, write a query that displays the market dates with the highest and lowest total sales.
--String Manipulation
SELECT
product_name,
TRIM(SUBSTR(product_name, INSTR(product_name, '-') + 1)) AS description
FROM product
WHERE INSTR(product_name, '-') > 0;

--UNION - Market dates with highest and lowest total sales

WITH SalesData AS (
SELECT
market_date,
SUM(quantity * cost_to_customer_per_qty) AS total_sales
FROM customer_purchases
GROUP BY market_date
),
RankedSales AS (
SELECT
market_date,
total_sales,
RANK() OVER (ORDER BY total_sales DESC) AS highest_rank,
RANK() OVER (ORDER BY total_sales ASC) AS lowest_rank
FROM SalesData
)
SELECT market_date, total_sales, 'Highest Sales' AS category
FROM RankedSales WHERE highest_rank = 1
UNION
SELECT market_date, total_sales, 'Lowest Sales' AS category
FROM RankedSales WHERE lowest_rank = 1;

HINT: There are possibly a few ways to do this query, but if you're struggling, try the following:
1) Create a CTE/Temp Table to find sales values grouped by dates;
with a UNION binding them. */
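The highest/lowest-sales query can be verified on a small invented dataset. Note that UNION deduplicates, and on a single-date dataset the same date would appear once under each label:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE customer_purchases
    (market_date TEXT, quantity REAL, cost_to_customer_per_qty REAL)""")
conn.executemany(
    "INSERT INTO customer_purchases VALUES (?, ?, ?)",
    [("2024-04-01", 2, 5.0),    # 10.0
     ("2024-04-01", 1, 2.0),    # +2.0 -> 12.0 total for the date
     ("2024-04-08", 1, 3.0)],   # 3.0
)
rows = conn.execute("""
WITH SalesData AS (
    SELECT market_date, SUM(quantity * cost_to_customer_per_qty) AS total_sales
    FROM customer_purchases GROUP BY market_date
),
RankedSales AS (
    SELECT market_date, total_sales,
           RANK() OVER (ORDER BY total_sales DESC) AS highest_rank,
           RANK() OVER (ORDER BY total_sales ASC)  AS lowest_rank
    FROM SalesData
)
SELECT market_date, total_sales, 'Highest Sales' FROM RankedSales WHERE highest_rank = 1
UNION
SELECT market_date, total_sales, 'Lowest Sales'  FROM RankedSales WHERE lowest_rank = 1
""").fetchall()
# One row per extreme: the 12.0 date is highest, the 3.0 date lowest.
```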

/* SECTION 3 */

-- Cross Join
/*1. Suppose every vendor in the `vendor_inventory` table had 5 of each of their products to sell to **every**
customer on record. How much money would each vendor make per product?
Show this by vendor_name and product name, rather than using the IDs.

HINT: Be sure you select only relevant columns and rows.
Remember, CROSS JOIN will explode your table rows, so CROSS JOIN should likely be a subquery.
Think a bit about the row counts: how many distinct vendors, product names are there (x)?
How many customers are there (y).
Before your final group by you should have the product of those two queries (x*y). */



-- INSERT
/*1. Create a new table "product_units".
This table will contain only products where the `product_qty_type = 'unit'`.
It should use all of the columns from the product table, as well as a new column for the `CURRENT_TIMESTAMP`.
Name the timestamp column `snapshot_timestamp`. */



/*2. Using `INSERT`, add a new row to the product_units table (with an updated timestamp).
This can be any product you desire (e.g. add another record for Apple Pie). */


--CROSS JOIN - Vendor revenue per product (5 of each product sold to every customer)
SELECT
v.vendor_name,
p.product_name,
SUM(5 * x.original_price) AS total_revenue
FROM (
SELECT DISTINCT vendor_id, product_id, original_price
FROM vendor_inventory
) x
CROSS JOIN customer c
JOIN vendor v ON v.vendor_id = x.vendor_id
JOIN product p ON p.product_id = x.product_id
GROUP BY v.vendor_name, p.product_name;
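To sanity-check the cross-join revenue logic (5 units × price × number of customers), here is a toy run with one vendor, one product at $4.00, and three customers; all names, IDs, and prices are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vendor (vendor_id INT, vendor_name TEXT);
CREATE TABLE product (product_id INT, product_name TEXT);
CREATE TABLE customer (customer_id INT);
CREATE TABLE vendor_inventory (vendor_id INT, product_id INT, original_price REAL);
INSERT INTO vendor VALUES (1, 'Marco');
INSERT INTO product VALUES (10, 'Habanero Peppers');
INSERT INTO customer VALUES (1), (2), (3);          -- 3 customers
INSERT INTO vendor_inventory VALUES (1, 10, 4.0);
""")
rows = conn.execute("""
SELECT v.vendor_name, p.product_name, SUM(5 * x.original_price) AS total_revenue
FROM (SELECT DISTINCT vendor_id, product_id, original_price FROM vendor_inventory) x
CROSS JOIN customer c
JOIN vendor v ON v.vendor_id = x.vendor_id
JOIN product p ON p.product_id = x.product_id
GROUP BY v.vendor_name, p.product_name
""").fetchall()
# 5 units x $4.00 x 3 customers = $60 per vendor/product pair
```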

-- DELETE
/* 1. Delete the older record for the whatever product you added.

HINT: If you don't specify a WHERE clause, you are going to have a bad time.*/
--INSERT - Create a product_units table
CREATE TABLE product_units AS
SELECT *, CURRENT_TIMESTAMP AS snapshot_timestamp
FROM product
WHERE product_qty_type = 'unit';


--Insert a new record into product_units:

-- UPDATE
/* 1. We want to add the current_quantity to the product_units table.
First, add a new column, current_quantity to the table using the following syntax.
INSERT INTO product_units (product_id, product_name, product_size, product_category_id, product_qty_type, snapshot_timestamp)
VALUES (999, 'Apple Pie', 'Medium', 3, 'unit', CURRENT_TIMESTAMP);

ALTER TABLE product_units
ADD current_quantity INT;
--DELETE - Remove older record
DELETE FROM product_units
WHERE product_id = 999
AND snapshot_timestamp = (SELECT MIN(snapshot_timestamp) FROM product_units WHERE product_id = 999);

Then, using UPDATE, change the current_quantity equal to the last quantity value from the vendor_inventory details.

HINT: This one is pretty hard.
First, determine how to get the "last" quantity per product.
Second, coalesce null values to 0 (if you don't have null values, figure out how to rearrange your query so you do.)
Third, SET current_quantity = (...your select statement...), remembering that WHERE can only accommodate one column.
Finally, make sure you have a WHERE statement to update the right row,
you'll need to use product_units.product_id to refer to the correct row within the product_units table.
When you have all of these components, you can run the update statement. */
--UPDATE - Add current_quantity column and update it
ALTER TABLE product_units ADD COLUMN current_quantity INT;

UPDATE product_units
SET current_quantity = COALESCE(
(SELECT quantity FROM vendor_inventory vi WHERE vi.product_id = product_units.product_id ORDER BY market_date DESC LIMIT 1),
0
);
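The correlated-subquery UPDATE above can be checked on invented data. Here product 1 has two inventory snapshots (latest quantity 9) and product 2 has none, so COALESCE falls back to 0; the table is created with current_quantity inline rather than via ALTER TABLE for brevity:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_units (product_id INT, current_quantity INT);
INSERT INTO product_units (product_id) VALUES (1), (2);
CREATE TABLE vendor_inventory (product_id INT, market_date TEXT, quantity INT);
INSERT INTO vendor_inventory VALUES
    (1, '2024-04-01', 5),
    (1, '2024-04-08', 9);   -- latest quantity for product 1
""")
conn.execute("""
UPDATE product_units
SET current_quantity = COALESCE(
    (SELECT quantity FROM vendor_inventory vi
     WHERE vi.product_id = product_units.product_id
     ORDER BY market_date DESC LIMIT 1),
    0)
""")
rows = conn.execute(
    "SELECT product_id, current_quantity FROM product_units ORDER BY product_id"
).fetchall()
# Product 1 picks up its latest quantity; product 2 defaults to 0.
```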