**QUERY 1 BEST 1**

**Problem:** List all abilities shared by Pokémon across different generations, including the count of Pokémon per ability and the generations they span.

**Tables:** 

PokemonGen1, 

PokemonGen2, 

PokemonGen3

**Columns:** 

Ability1, 

Generation (derived column), 

TotalCount (derived column)

**Predicate:** 

HAVING COUNT(DISTINCT Generation) \> 1: This predicate is used to filter abilities that appear in more than one generation. It ensures that only abilities shared across different generations are included in the final output. 

**Process:**

The AbilityCounts CTE combines Ability1 and a generation label from the first three Pokémon generation tables, treating them as a single source. The RankedAbilities CTE then groups this data by ability, counts how many Pokémon have each ability (TotalCount), and keeps only abilities found in more than one generation. Because STRING_AGG runs over every row in the group, the Generations column repeats a generation label once per Pokémon rather than listing each generation once. The final SELECT statement orders the results by total count (descending) and then alphabetically by ability name.

In [1]:
--listing all abilities shared by Pokémon across different generations, including the count of Pokémon per ability and the generations they span


USE PokemonGen1;
USE PokemonGen2;
USE PokemonGen3;

WITH AbilityCounts AS (
  SELECT Ability1, 'Gen1' AS Generation
  FROM PokemonGen1.dbo.PokemonGen1
  UNION ALL
  SELECT ability1, 'Gen2'
  FROM PokemonGen2.dbo.PokemonGen2
  UNION ALL
  SELECT ability1, 'Gen3'
  FROM PokemonGen3.dbo.PokemonGen3
),
RankedAbilities AS (
  SELECT Ability1, COUNT(*) AS TotalCount, STRING_AGG(Generation, ', ') WITHIN GROUP (ORDER BY Generation) AS Generations
  FROM AbilityCounts
  GROUP BY Ability1
  HAVING COUNT(DISTINCT Generation) > 1 -- Abilities that appear in more than one generation
)
SELECT Ability1, TotalCount, Generations
FROM RankedAbilities
ORDER BY TotalCount DESC, Ability1;

Ability1,TotalCount,Generations
Chlorophyll,19,"Gen1, Gen1, Gen1, Gen1, Gen1, Gen1, Gen1, Gen1, Gen1, Gen2, Gen2, Gen2, Gen2, Gen2, Gen2, Gen3, Gen3, Gen3, Gen3"
Swift Swim,19,"Gen1, Gen1, Gen1, Gen1, Gen1, Gen1, Gen1, Gen1, Gen2, Gen2, Gen3, Gen3, Gen3, Gen3, Gen3, Gen3, Gen3, Gen3, Gen3"
Levitate,16,"Gen1, Gen1, Gen1, Gen1, Gen2, Gen2, Gen3, Gen3, Gen3, Gen3, Gen3, Gen3, Gen3, Gen3, Gen3, Gen3"
Intimidate,13,"Gen1, Gen1, Gen1, Gen1, Gen1, Gen1, Gen2, Gen2, Gen2, Gen2, Gen3, Gen3, Gen3"
Thick Fat,13,"Gen1, Gen1, Gen2, Gen2, Gen2, Gen3, Gen3, Gen3, Gen3, Gen3, Gen3, Gen3, Gen3"
Pressure,12,"Gen1, Gen1, Gen1, Gen1, Gen2, Gen2, Gen2, Gen2, Gen2, Gen3, Gen3, Gen3"
Keen Eye,11,"Gen1, Gen1, Gen1, Gen1, Gen1, Gen1, Gen1, Gen2, Gen3, Gen3, Gen3"
Oblivious,11,"Gen1, Gen1, Gen1, Gen2, Gen2, Gen2, Gen2, Gen3, Gen3, Gen3, Gen3"
Run Away,11,"Gen1, Gen1, Gen1, Gen1, Gen1, Gen1, Gen1, Gen2, Gen2, Gen2, Gen3"
Synchronize,11,"Gen1, Gen1, Gen1, Gen1, Gen2, Gen2, Gen2, Gen2, Gen3, Gen3, Gen3"
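
The Generations column above repeats a label once per Pokémon. One possible way to list each generation only once is to group by ability and generation first and then aggregate again. The sketch below illustrates the idea on a few made-up rows, using Python's built-in sqlite3 module as a stand-in for SQL Server (so group_concat takes the place of STRING_AGG). The function name shared_abilities and the sample rows are assumptions for illustration, not part of the original notebook.

```python
import sqlite3


def shared_abilities(rows):
    """Return (ability, total_count, generations) for abilities in 2+ generations.

    rows holds one (ability, generation) pair per Pokémon, like the
    AbilityCounts CTE in the query above.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE AbilityCounts (Ability1 TEXT, Generation TEXT)")
    con.executemany("INSERT INTO AbilityCounts VALUES (?, ?)", rows)
    result = con.execute(
        """
        SELECT Ability1,
               SUM(PerGen)                    AS TotalCount,
               group_concat(Generation, ', ') AS Generations
        FROM (
            -- one row per (ability, generation), keeping the Pokémon count
            SELECT Ability1, Generation, COUNT(*) AS PerGen
            FROM AbilityCounts
            GROUP BY Ability1, Generation
        ) AS PerGeneration
        GROUP BY Ability1
        HAVING COUNT(*) > 1   -- more than one generation row per ability
        ORDER BY TotalCount DESC, Ability1
        """
    ).fetchall()
    con.close()
    return result


# Toy rows, not taken from the PokemonGen tables.
sample = [
    ("Chlorophyll", "Gen1"), ("Chlorophyll", "Gen1"), ("Chlorophyll", "Gen2"),
    ("Overgrow", "Gen1"),
    ("Levitate", "Gen1"), ("Levitate", "Gen3"),
]
for row in shared_abilities(sample):
    print(row)
```

In T-SQL the same two-step shape should work with STRING_AGG in the outer query; the HAVING clause then counts generation rows instead of COUNT(DISTINCT Generation).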


**QUERY 2** 

**Problem:** Find Pokémon abilities that are unique to their generation, excluding abilities that also appear in other generations

**Tables:** 

PokemonGen1, 

PokemonGen2, 

PokemonGen3

**Columns:** 

Ability1, Ability2, Ability3 (unpivoted into a single Ability column), 

Generation (derived column)

**Predicate:**

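HAVING COUNT(\*) = 1: This condition is used in the UniqueAbilities CTE to keep only abilities that occur exactly once across all three generations, that is, abilities that appear in exactly one row of the unpivoted data. An ability that belongs to a single generation but is held by several Pokémon is therefore excluded.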

**Process:**

The Abilities CTE applies the UNPIVOT operator to each generation's table to turn the Ability1, Ability2, and Ability3 columns into rows, so each ability becomes a separate record tagged with its generation; UNPIVOT also drops NULL ability slots. The three result sets are combined with UNION ALL. The UniqueAbilities CTE groups this data by ability and keeps only those abilities that appear exactly once (HAVING COUNT(\*) = 1). It also compiles the generations for each ability, although the count condition means each surviving ability has exactly one generation. The final SELECT statement lists the unique abilities alphabetically with their generation.

In [2]:
--Find abilities that are unique to their generation, excluding abilities that also appear in other generations
USE PokemonGen1;
USE PokemonGen2;
USE PokemonGen3;

WITH Abilities AS (
  SELECT Ability, 'Gen1' AS Generation FROM PokemonGen1.dbo.PokemonGen1
  UNPIVOT
  (Ability FOR AbilityNumber IN (Ability1, Ability2, Ability3)) AS unpvt
  UNION ALL
  SELECT Ability, 'Gen2' FROM PokemonGen2.dbo.PokemonGen2
  UNPIVOT
  (Ability FOR AbilityNumber IN (Ability1, Ability2, Ability3)) AS unpvt
  UNION ALL
  SELECT Ability, 'Gen3' FROM PokemonGen3.dbo.PokemonGen3
  UNPIVOT
  (Ability FOR AbilityNumber IN (Ability1, Ability2, Ability3)) AS unpvt
),
UniqueAbilities AS (
  SELECT Ability, COUNT(*) AS Count, STRING_AGG(Generation, ', ') WITHIN GROUP (ORDER BY Generation) AS Generations
  FROM Abilities
  GROUP BY Ability
  HAVING COUNT(*) = 1
)
SELECT Ability, Generations
FROM UniqueAbilities
ORDER BY Ability;

Ability,Generations
Air Lock,Gen3
Color Change,Gen3
Filter,Gen1
Forecast,Gen3
Honey Gather,Gen2
Imposter,Gen1
Mold Breaker,Gen1
Protean,Gen3
Sand Stream,Gen2
Simple,Gen3
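
HAVING COUNT(\*) = 1 and "appears in only one generation" are not the same filter. The sketch below contrasts the two on a few made-up rows, using Python's built-in sqlite3 module as a stand-in for SQL Server. The function name unique_abilities and the sample rows are assumptions for illustration; both HAVING clauses are written in syntax that T-SQL also accepts.

```python
import sqlite3


def unique_abilities(rows):
    """Return (held_once, single_generation) ability lists.

    rows holds (ability, generation) pairs, like the unpivoted Abilities CTE.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Abilities (Ability TEXT, Generation TEXT)")
    con.executemany("INSERT INTO Abilities VALUES (?, ?)", rows)
    # Filter used in the notebook: the ability occurs in exactly one row.
    held_once = con.execute(
        "SELECT Ability FROM Abilities GROUP BY Ability "
        "HAVING COUNT(*) = 1 ORDER BY Ability"
    ).fetchall()
    # Alternative: the ability occurs in exactly one generation,
    # no matter how many Pokémon in that generation have it.
    single_generation = con.execute(
        "SELECT Ability FROM Abilities GROUP BY Ability "
        "HAVING COUNT(DISTINCT Generation) = 1 ORDER BY Ability"
    ).fetchall()
    con.close()
    return [a for (a,) in held_once], [a for (a,) in single_generation]


# Toy rows, not taken from the PokemonGen tables.
sample = [
    ("Air Lock", "Gen3"),                     # one Pokémon, one generation
    ("Plus", "Gen3"), ("Plus", "Gen3"),       # two Pokémon, one generation
    ("Levitate", "Gen1"), ("Levitate", "Gen3"),
]
held_once, single_generation = unique_abilities(sample)
print(held_once)
print(single_generation)
```

With these rows the second filter additionally returns the ability held by two Pokémon of the same generation, which is closer to the wording of the problem statement.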


**QUERY 3**

**Problem:** How many Pokémon are there for each type? Provide a list sorted by the most common type first

**Tables:** 

PokemonGen1, 

PokemonGen2, 

PokemonGen3

**Columns:** 

Type1 

Type2

**Predicate:**

WHERE Type2 IS NOT NULL 

**Process:** 

This SQL code aggregates Pokémon types across three generations (Gen 1, Gen 2, and Gen 3), including both primary (Type1) and secondary types (Type2). It counts how many times each type appears, taking into account both primary and secondary types, excludes null values for secondary types to ensure accurate counting, and then orders the results by the count in descending order to highlight the most common types.

In [4]:
--How many Pokémon are there for each type? Provide a list sorted by the most common type first

USE PokemonGen1;
USE PokemonGen2;
USE PokemonGen3;

SELECT Type, COUNT(*) AS Count
FROM (
    SELECT Type1 AS Type FROM PokemonGen1.dbo.PokemonGen1
    UNION ALL
    SELECT Type2 FROM PokemonGen1.dbo.PokemonGen1 WHERE Type2 IS NOT NULL
    UNION ALL
    SELECT Type1 FROM PokemonGen2.dbo.PokemonGen2
    UNION ALL
    SELECT Type2 FROM PokemonGen2.dbo.PokemonGen2 WHERE Type2 IS NOT NULL
    UNION ALL
    SELECT Type1 FROM PokemonGen3.dbo.PokemonGen3
    UNION ALL
    SELECT Type2 FROM PokemonGen3.dbo.PokemonGen3 WHERE Type2 IS NOT NULL
) AS CombinedTypes
GROUP BY Type
ORDER BY Count DESC;

Type,Count
Water,78
Normal,55
Flying,50
Psychic,44
Poison,41
Grass,41
Bug,36
Ground,35
Rock,30
Fire,28


**QUERY 4**

**Problem:** Identify the Pokémon with the highest 'Speed' in each generation

**Tables:** 

PokemonGen1, 

PokemonGen2, 

PokemonGen3

**Columns:** 

Name

Speed

**Predicate:**

WHERE Speed = (SELECT MAX(Speed) FROM PokemonGen1.dbo.PokemonGen1) 

WHERE Speed = (SELECT MAX(Speed) FROM PokemonGen2.dbo.PokemonGen2) 

WHERE Speed = (SELECT MAX(Speed) FROM PokemonGen3.dbo.PokemonGen3) 

**Process:** 

The predicate checks for Pokémon whose 'Speed' matches the maximum 'Speed' value found in the corresponding generation's table. The subquery SELECT MAX(Speed) FROM... calculates the maximum 'Speed' value within each table, and the outer query filters for Pokémon with a 'Speed' that equals this maximum value. This ensures that only the fastest Pokémon in each generation is selected, or several Pokémon if they tie for the highest 'Speed'.

In [5]:
--identify the Pokémon with the highest 'Speed' in each generation
USE PokemonGen1;
USE PokemonGen2;
USE PokemonGen3;

-- Generation 1
SELECT 'Gen1' AS Generation, Name, Speed AS MaxSpeed
FROM PokemonGen1.dbo.PokemonGen1
WHERE Speed = (SELECT MAX(Speed) FROM PokemonGen1.dbo.PokemonGen1)

UNION ALL

-- Generation 2
SELECT 'Gen2' AS Generation, Name, Speed AS MaxSpeed
FROM PokemonGen2.dbo.PokemonGen2
WHERE Speed = (SELECT MAX(Speed) FROM PokemonGen2.dbo.PokemonGen2)

UNION ALL

-- Generation 3
SELECT 'Gen3' AS Generation, Name, Speed AS MaxSpeed
FROM PokemonGen3.dbo.PokemonGen3
WHERE Speed = (SELECT MAX(Speed) FROM PokemonGen3.dbo.PokemonGen3);


Generation,Name,MaxSpeed
Gen1,Electrode,150
Gen2,Crobat,130
Gen3,Ninjask,160


**QUERY 5**

**Problem:** Identify Pokémon with an Attack-to-Speed ratio greater than 1.0, meaning their Attack stat is higher than their Speed stat.

**Tables:** 

PokemonGen1, 

PokemonGen2, 

PokemonGen3

**Columns:** 

Name

Type1

Attack

Speed

Ratio (derived column of attack to speed)

**Predicate:**

WHERE Attack \> Speed

**Process:** 

The WHERE Attack \> Speed condition is applied to each of the three tables separately. For a positive Speed it is equivalent to an Attack-to-Speed ratio greater than 1.0, so only Pokémon whose Attack stat exceeds their Speed stat are included. Attack is cast to FLOAT so that the ratio is not truncated if the columns are integers. The combined result is ordered by the calculated ratio in descending order, highlighting Pokémon with a much higher Attack than Speed.

In [None]:
-- Identify Pokémon with an Attack-to-Speed ratio greater than 1.0, meaning their Attack stat is higher than their Speed stat.

USE PokemonGen1;
USE PokemonGen2;
USE PokemonGen3;

SELECT Name, Type1, Attack, Speed, (CAST(Attack AS FLOAT) / Speed) AS Ratio
FROM PokemonGen1.dbo.PokemonGen1
WHERE Attack > Speed
UNION ALL
SELECT Name, Type1, Attack, Speed, (CAST(Attack AS FLOAT) / Speed)
FROM PokemonGen2.dbo.PokemonGen2
WHERE Attack > Speed
UNION ALL
SELECT Name, Type1, Attack, Speed, (CAST(Attack AS FLOAT) / Speed)
FROM PokemonGen3.dbo.PokemonGen3
WHERE Attack > Speed
ORDER BY Ratio DESC;

**QUERY 6 BEST 2**

**Problem:** Determine the battle readiness of each Pokémon, which is the sum of Attack, Defense, and the higher of Special Attack or Special Defense, plus twice their Speed

**Tables:** 

PokemonGen1, 

PokemonGen2, 

PokemonGen3

**Columns:** 

Name

Type1

Attack

Defense

SpecialAttack

SpecialDefense

Speed 

Generation (derived column) 

BattleScore (derived column)

**Predicate:** 

BattleScore is computed as the sum of Attack, Defense, and the higher value between SpecialAttack and SpecialDefense, plus twice the Speed stat of the Pokémon.

ORDER BY BattleScore DESC sorts the results by BattleScore in descending order.

**Process:** 

This SQL code combines Pokémon data from three different generation-specific tables into a single dataset, calculates a "BattleScore" for each Pokémon based on their Attack, Defense, Special Attack or Special Defense (whichever is higher), and Speed stats, orders the combined list by this score in descending order, and returns the ten highest-scoring Pokémon (TOP 10).

In [6]:
-- determine battle readiness of each pokemon, which is a sum of attack, defense, and the higher of special att or special def, plus twice their speed

USE PokemonGen1; 
USE PokemonGen2;
USE PokemonGen3; 

SELECT TOP 10 * FROM (
    SELECT
        Name,
        Type1,
        'Gen1' AS Generation,
        Attack + Defense + CASE WHEN SpecialAttack > SpecialDefense THEN SpecialAttack ELSE SpecialDefense END + (Speed * 2) AS BattleScore
    FROM PokemonGen1.dbo.PokemonGen1
    UNION ALL
    SELECT
        Name,
        Type1,
        'Gen2',
        Attack + Defense + CASE WHEN SpecialAttack > SpecialDefense THEN SpecialAttack ELSE SpecialDefense END + (Speed * 2)
    FROM PokemonGen2.dbo.PokemonGen2
    UNION ALL
    SELECT
        Name,
        Type1,
        'Gen3',
        Attack + Defense + CASE WHEN SpecialAttack > SpecialDefense THEN SpecialAttack ELSE SpecialDefense END + (Speed * 2)
    FROM PokemonGen3.dbo.PokemonGen3
) AS Combined
ORDER BY BattleScore DESC;


Name,Type1,Generation,BattleScore
Deoxys,Psychic,Gen3,650
Mewtwo,Psychic,Gen1,614
Lugia,Psychic,Gen2,594
Rayquaza,Dragon,Gen3,580
Groudon,Ground,Gen3,570
Slaking,Normal,Gen3,555
Ho-Oh,Fire,Gen2,554
Salamence,Dragon,Gen3,525
Kyogre,Water,Gen3,520
Latias,Dragon,Gen3,520


**QUERY 7 WORST 1 BEST 3**

**Problem:**  Which new abilities (Ability1, Ability2, Ability3) were introduced in each generation, and how are they distributed among different Pokémon types?

**Tables:** 

PokemonGen1, 

PokemonGen2, 

PokemonGen3

**Columns:** 

Ability1

Ability2

Ability3

Ability (derived from Ability1, Ability2, Ability3)

Generation (derived from each generation's table)

Type1 

Type2 

**Predicate:**

The WHERE Ability IS NOT NULL predicate filters out rows where the ability is NULL, ensuring that only entries with a defined ability are considered. 

The GROUP BY Ability, Generation, Type1, Type2 clause in the AbilityCount common table expression (CTE) groups the data by ability, generation, and Pokémon type, which is necessary for counting the number of occurrences. 

The ORDER BY Ability, Generation, Type1, Type2 clause at the end of the query sorts the final result set based on the ability name, generation, and Pokémon types, in ascending order.

**Process:** 

AbilityIntro CTE: The query creates a unified view (AbilityIntro) of all abilities across the three generations, tagging each ability with its generation and associated Pokémon types, ensuring only non-null abilities are included. 

AbilityCount CTE: It then aggregates this data (AbilityCount), counting the occurrences of each ability within each generation and type combination. 

Final SELECT: Finally, it selects from AbilityCount to present the ability, its generation, associated types, and the count of occurrences, ordered alphabetically by ability and then by generation and type.

In [7]:
-- Which new abilities (Ability1, Ability2, Ability3) were introduced in each generation, and how are they distributed among different Pokémon types?
USE PokemonGen1;
USE PokemonGen2;
USE PokemonGen3;

WITH AbilityIntro AS (
  SELECT Ability, 'Gen1' AS Generation, Type1, Type2
  FROM (
    SELECT Ability1 AS Ability, Type1, Type2 FROM PokemonGen1.dbo.PokemonGen1
    UNION ALL
    SELECT Ability2, Type1, Type2 FROM PokemonGen1.dbo.PokemonGen1
    UNION ALL
    SELECT Ability3, Type1, Type2 FROM PokemonGen1.dbo.PokemonGen1
  ) Gen1
  WHERE Ability IS NOT NULL
  UNION ALL
  SELECT Ability, 'Gen2', Type1, Type2
  FROM (
    SELECT Ability1 AS Ability, Type1, Type2 FROM PokemonGen2.dbo.PokemonGen2
    UNION ALL
    SELECT Ability2, Type1, Type2 FROM PokemonGen2.dbo.PokemonGen2
    UNION ALL
    SELECT Ability3, Type1, Type2 FROM PokemonGen2.dbo.PokemonGen2
  ) Gen2
  WHERE Ability IS NOT NULL
  UNION ALL
  SELECT Ability, 'Gen3', Type1, Type2
  FROM (
    SELECT Ability1 AS Ability, Type1, Type2 FROM PokemonGen3.dbo.PokemonGen3
    UNION ALL
    SELECT Ability2, Type1, Type2 FROM PokemonGen3.dbo.PokemonGen3
    UNION ALL
    SELECT Ability3, Type1, Type2 FROM PokemonGen3.dbo.PokemonGen3
  ) Gen3
  WHERE Ability IS NOT NULL
),
AbilityCount AS (
  SELECT Ability, Generation, Type1, Type2, COUNT(*) AS Count
  FROM AbilityIntro
  GROUP BY Ability, Generation, Type1, Type2
)
SELECT Ability, Generation, Type1, Type2, Count
FROM AbilityCount
ORDER BY Ability, Generation, Type1, Type2;


Ability,Generation,Type1,Type2,Count
Adaptability,Gen1,Normal,,1
Adaptability,Gen3,Water,,2
Adaptability,Gen3,Water,Dark,1
Aftermath,Gen1,Electric,,2
Air Lock,Gen3,Dragon,Flying,1
Analytic,Gen1,Electric,Steel,2
Analytic,Gen1,Normal,,1
Analytic,Gen1,Water,,1
Analytic,Gen1,Water,Psychic,1
Analytic,Gen2,Normal,,1


BETTER CODE 

The Abilities CTE collects every non-null (Ability, Generation) pair from the three tables. Because the chained set operators are evaluated left to right and the last one is UNION, the combined result contains each pair only once, even though UNION ALL is used between generations. The RankedAbilities CTE then numbers the rows for each ability with ROW_NUMBER(), ordered from Gen1 to Gen3, and the final SELECT keeps only the first row (Rank = 1). The result is one row per ability showing the earliest generation in which it appears, sorted alphabetically by ability name.

In [13]:
USE PokemonGen1;
USE PokemonGen2;
USE PokemonGen3;

WITH Abilities AS (
    SELECT Ability1 AS Ability, 'Gen1' AS Generation FROM PokemonGen1.dbo.PokemonGen1 WHERE Ability1 IS NOT NULL
    UNION
    SELECT Ability2, 'Gen1' FROM PokemonGen1.dbo.PokemonGen1 WHERE Ability2 IS NOT NULL
    UNION
    SELECT Ability3, 'Gen1' FROM PokemonGen1.dbo.PokemonGen1 WHERE Ability3 IS NOT NULL
    UNION ALL
    SELECT Ability1, 'Gen2' FROM PokemonGen2.dbo.PokemonGen2 WHERE Ability1 IS NOT NULL
    UNION
    SELECT Ability2, 'Gen2' FROM PokemonGen2.dbo.PokemonGen2 WHERE Ability2 IS NOT NULL
    UNION
    SELECT Ability3, 'Gen2' FROM PokemonGen2.dbo.PokemonGen2 WHERE Ability3 IS NOT NULL
    UNION ALL
    SELECT Ability1, 'Gen3' FROM PokemonGen3.dbo.PokemonGen3 WHERE Ability1 IS NOT NULL
    UNION
    SELECT Ability2, 'Gen3' FROM PokemonGen3.dbo.PokemonGen3 WHERE Ability2 IS NOT NULL
    UNION
    SELECT Ability3, 'Gen3' FROM PokemonGen3.dbo.PokemonGen3 WHERE Ability3 IS NOT NULL
),
RankedAbilities AS (
    SELECT Ability, Generation,
           ROW_NUMBER() OVER(PARTITION BY Ability ORDER BY CASE Generation WHEN 'Gen1' THEN 1 WHEN 'Gen2' THEN 2 WHEN 'Gen3' THEN 3 END) AS Rank
    FROM Abilities
)
SELECT Ability, Generation
FROM RankedAbilities
WHERE Rank = 1
ORDER BY Ability;



Ability,Generation
Adaptability,Gen1
Aftermath,Gen1
Air Lock,Gen3
Analytic,Gen1
Anger Point,Gen1
Anticipation,Gen1
Arena Trap,Gen1
Battle Armor,Gen1
Big Pecks,Gen1
Blaze,Gen1
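
The "earliest generation per ability" result can also be reached with a plain GROUP BY and MIN, since the labels 'Gen1' to 'Gen3' happen to sort in generation order as strings. The sketch below shows this alternative on made-up rows, using Python's built-in sqlite3 module as a stand-in for SQL Server. The function name first_generation and the sample rows are assumptions for illustration; this is a swapped-in technique, not the ROW_NUMBER approach used in the notebook.

```python
import sqlite3


def first_generation(rows):
    """Return (ability, earliest_generation) pairs from (ability, generation) rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Abilities (Ability TEXT, Generation TEXT)")
    con.executemany("INSERT INTO Abilities VALUES (?, ?)", rows)
    result = con.execute(
        """
        SELECT Ability, MIN(Generation) AS FirstGeneration
        FROM Abilities
        WHERE Ability IS NOT NULL      -- skip empty ability slots
        GROUP BY Ability
        ORDER BY Ability
        """
    ).fetchall()
    con.close()
    return result


# Toy rows, not taken from the PokemonGen tables.
sample = [
    ("Adaptability", "Gen3"),
    ("Adaptability", "Gen1"),
    ("Air Lock", "Gen3"),
    (None, "Gen2"),
]
print(first_generation(sample))
```

String comparison only matches generation order while the labels have a single digit; with labels such as 'Gen10' a numeric generation column would be the safer choice.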


**QUERY 8**

**Problem:**  write an SQL query that lists each product that has a unit price above the average unit price within its category

**Tables:** 

Production.Product

Production.Category

**Columns:**   

Product

> ProductID
> 
> ProductName
> 
> UnitPrice
> 
> CategoryID

Category

> CategoryName
> 
> CategoryID

**Predicate:**

The WHERE clause specifies the condition for the products to be selected: p.UnitPrice \> (SELECT AVG(UnitPrice) FROM Production.Product WHERE CategoryID = p.CategoryID). This means that only those products whose UnitPrice is greater than the average UnitPrice of all products within the same category (as determined by CategoryID) are selected. 

The query uses an inner join between the Production.Product and Production.Category tables on their CategoryID fields to associate each product with its category name. 

Finally, the ORDER BY clause orders the results first by CategoryName (alphabetically) and then by UnitPrice in descending order, within each category.

In [None]:
--write an SQL query that lists each product that has a unit price above the average unit price within its category
USE Northwinds2022TSQLV7; 
SELECT p.ProductID, p.ProductName, p.UnitPrice, c.CategoryName
FROM Production.Product p
INNER JOIN Production.Category c ON p.CategoryID = c.CategoryID
WHERE p.UnitPrice > (
    SELECT AVG(UnitPrice)
    FROM Production.Product
    WHERE CategoryID = p.CategoryID
)
ORDER BY c.CategoryName, p.UnitPrice DESC;
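
To see how the correlated subquery behaves, the sketch below runs the same WHERE pattern on a tiny made-up product table, using Python's built-in sqlite3 module as a stand-in for SQL Server. The function name above_category_average, the simplified single-table schema, and the sample rows are assumptions for illustration.

```python
import sqlite3


def above_category_average(products):
    """Return (name, price) for products priced above their category's average.

    products holds (ProductName, CategoryID, UnitPrice) tuples.
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Product (ProductName TEXT, CategoryID INTEGER, UnitPrice REAL)"
    )
    con.executemany("INSERT INTO Product VALUES (?, ?, ?)", products)
    result = con.execute(
        """
        SELECT p.ProductName, p.UnitPrice
        FROM Product AS p
        WHERE p.UnitPrice > (
            -- re-evaluated for the category of each outer row
            SELECT AVG(UnitPrice)
            FROM Product
            WHERE CategoryID = p.CategoryID
        )
        ORDER BY p.CategoryID, p.UnitPrice DESC
        """
    ).fetchall()
    con.close()
    return result


# Toy rows: category 1 averages 20.0, category 2 averages 5.0.
sample = [("A", 1, 10.0), ("B", 1, 30.0), ("C", 2, 5.0), ("D", 2, 5.0)]
print(above_category_average(sample))
```

Because the comparison is strict, a category whose products all share the same price contributes no rows.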

**QUERY 9**

**Problem:**  find the average unit price of products in each category, displaying the category name and the average price, sorted by average price 

**Tables:** 

Production.Product

Production.Category

**Columns:** 

Product

> UnitPrice
> 
> CategoryID

Category

> CategoryName
> 
> CategoryID

**Predicate:**

The core of this query involves calculating the average UnitPrice for products within each CategoryName, facilitated by a JOIN operation between the Product and Category tables on the CategoryID field. This join ensures that each product is correctly associated with its category. 

The GROUP BY c.CategoryName clause groups the products by their category name, which is necessary for the AVG(p.UnitPrice) function to calculate the average unit price per category. 

The ORDER BY AveragePrice DESC clause orders the results by the calculated average unit price in descending order, allowing users to see the categories with the highest average product prices first.

In [None]:
USE Northwinds2022TSQLV7;
SELECT c.CategoryName, AVG(p.UnitPrice) AS AveragePrice
FROM Production.Product p
JOIN Production.Category c ON p.CategoryID = c.CategoryID
GROUP BY c.CategoryName
ORDER BY AveragePrice DESC;

**QUERY 10**

**Problem:** Calculate the total order amount for each customer and display those who have spent more than $100,000.

**Tables:**   
Sales.Order

Sales.OrderDetail

**Columns:** 

Order

> CustomerID

OrderDetail

> UnitPrice
> 
> Quantity

**Predicate:**

The JOIN operation links the Sales.\[Order\] and Sales.OrderDetail tables on their OrderID fields, ensuring that the details of each order are correctly matched with the order's metadata (including the customer ID). 

The GROUP BY o.CustomerID clause aggregates the results by customer, which is necessary for the SUM function to calculate the total amount spent per customer. 

The HAVING SUM(od.UnitPrice \* od.Quantity) \> 100000 clause filters the grouped results to include only those customers whose total spending exceeds 100,000 units of currency. This is a critical part of the query that focuses on high-spending customers. 

Finally, the ORDER BY TotalSpent DESC clause sorts the results by the calculated TotalSpent in descending order, allowing users to easily identify the highest spending customers first.

In [None]:
USE Northwinds2022TSQLV7; 
SELECT o.CustomerID, SUM(od.UnitPrice * od.Quantity) AS TotalSpent
FROM Sales.[Order] o
JOIN Sales.OrderDetail od ON o.OrderID = od.OrderID
GROUP BY o.CustomerID
HAVING SUM(od.UnitPrice * od.Quantity) > 100000
ORDER BY TotalSpent DESC;

**QUERY 11**

**Problem:**  listing all products that have never been ordered

**Tables:** 

Production.Product

Sales.OrderDetail

**Columns:** 

Product

> ProductID
> 
> ProductName

  

**Predicate:**

The WHERE NOT EXISTS clause is the core predicate of this query. It filters the list of products to include only those for which no corresponding entry exists in the Sales.OrderDetail table. This is determined by checking if the ProductID of the product from Production.Product does not match any ProductID in Sales.OrderDetail.

In [None]:
-- listing all products that have never been ordered
USE Northwinds2022TSQLV7; 
SELECT p.ProductID, p.ProductName
FROM Production.Product p
WHERE NOT EXISTS (
    SELECT 1
    FROM Sales.OrderDetail od
    WHERE p.ProductID = od.ProductID
);
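
NOT EXISTS is often preferred over NOT IN for this kind of anti-join because NOT IN returns no rows at all when the subquery yields a NULL. The sketch below demonstrates the difference on made-up data, using Python's built-in sqlite3 module as a stand-in for SQL Server. The function name never_ordered, the reduced schema, and the NULL ProductID row are assumptions for illustration; whether Sales.OrderDetail.ProductID can actually be NULL in the Northwinds database is not stated in this notebook.

```python
import sqlite3


def never_ordered(products, ordered_ids):
    """Return (not_exists_names, not_in_names) for products with no order line."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Product (ProductID INTEGER, ProductName TEXT)")
    con.execute("CREATE TABLE OrderDetail (ProductID INTEGER)")
    con.executemany("INSERT INTO Product VALUES (?, ?)", products)
    con.executemany(
        "INSERT INTO OrderDetail VALUES (?)", [(pid,) for pid in ordered_ids]
    )
    not_exists = con.execute(
        """
        SELECT p.ProductName
        FROM Product AS p
        WHERE NOT EXISTS (
            SELECT 1 FROM OrderDetail AS od WHERE p.ProductID = od.ProductID
        )
        ORDER BY p.ProductID
        """
    ).fetchall()
    not_in = con.execute(
        """
        SELECT p.ProductName
        FROM Product AS p
        WHERE p.ProductID NOT IN (SELECT ProductID FROM OrderDetail)
        ORDER BY p.ProductID
        """
    ).fetchall()
    con.close()
    return [n for (n,) in not_exists], [n for (n,) in not_in]


# Toy data: product 1 was ordered; one order line has a NULL ProductID.
products = [(1, "Product A"), (2, "Product B")]
not_exists_names, not_in_names = never_ordered(products, [1, None])
print(not_exists_names)
print(not_in_names)
```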


**QUERY 12**

**Problem:** Find the average price of products in each category that has more than five products, sorted from the lowest to the highest average price

**Tables:** 

Production.Product

Production.Category

**Columns:** 

Product

> UnitPrice
> 
> CategoryID
> 
> ProductID

Category

> CategoryName

**Predicate:**

The JOIN clause creates a connection between the Product and Category tables on the CategoryID field, ensuring each product is matched with its category name for the aggregation. 

The GROUP BY c.CategoryName clause groups the results by category name, allowing the AVG(p.UnitPrice) function to calculate the average price for products within each category. 

The HAVING COUNT(p.ProductID) \> 5 clause further filters these grouped results to include only those categories that have more than 5 products. This condition ensures that the query focuses on categories with a significant number of products, potentially excluding niche or less populated categories. 

Finally, the ORDER BY AveragePrice ASC clause sorts the categories by their calculated average product price in ascending order, from the lowest to the highest average price.

In [None]:
-- find the average price of products in each category
USE Northwinds2022TSQLV7;
SELECT c.CategoryName, AVG(p.UnitPrice) AS AveragePrice
FROM Production.Product p
JOIN Production.Category c ON p.CategoryID = c.CategoryID
GROUP BY c.CategoryName
HAVING COUNT(p.ProductID) > 5
ORDER BY AveragePrice ASC;

**QUERY 13**

**Problem:** calculating the total sales for each month and year

**Tables:** 

Sales.Order

Sales.OrderDetail

**Columns:**   

Order

> OrderDate

OrderDetail

> Quantity
> 
> UnitPrice

**Predicate:**

The query groups the results by YEAR(o.OrderDate) and MONTH(o.OrderDate) to aggregate sales data by month and year, ensuring that sales are correctly summed up within these periods. 

The ORDER BY OrderYear, OrderMonth clause orders the results chronologically, starting from the earliest year and month to the latest, making it easy to track sales trends over time.

In [None]:
--calculating the total sales for each month and year
USE Northwinds2022TSQLV7;
SELECT
    YEAR(o.OrderDate) AS OrderYear,
    MONTH(o.OrderDate) AS OrderMonth,
    SUM(od.Quantity * od.UnitPrice) AS TotalSales
FROM Sales.[Order] o
JOIN Sales.OrderDetail od ON o.OrderID = od.OrderID
GROUP BY YEAR(o.OrderDate), MONTH(o.OrderDate)
ORDER BY OrderYear, OrderMonth;


**QUERY 14 WORST CODE 2**

**Problem:** how many orders each customer has placed in each year

**Tables:** 

Sales.Order

**Columns:** 

Order

> CustomerID
> 
> OrderDate
> 
> OrderID

**Predicate:**

The query groups the results by CustomerID and YEAR(OrderDate) to ensure that the count of orders is calculated separately for each customer in each year. 

The ORDER BY CustomerID, OrderYear clause sorts the results first by CustomerID to group all years for a customer together, and then by OrderYear to arrange those years in ascending order.

In [5]:
--how many orders each customer has placed in each year
USE Northwinds2022TSQLV7; 
SELECT 
    CustomerID, 
    YEAR(OrderDate) AS OrderYear, 
    COUNT(OrderID) AS NumberOfOrders
FROM Sales.[Order]
GROUP BY CustomerID, YEAR(OrderDate)
ORDER BY CustomerID, OrderYear;


CustomerID,OrderYear,NumberOfOrders
1,2015,3
1,2016,3
2,2014,1
2,2015,2
2,2016,1
3,2014,1
3,2015,5
3,2016,1
4,2014,2
4,2015,7


BETTER CODE 

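This static pivot query creates a table where each row represents a customer, and each column (beyond the CustomerID) represents the number of orders they placed in each year from 2014 to 2016. A customer with no orders in a given year gets an empty (NULL) cell, as customer 1 does for 2014 in the output below.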

In [4]:
--BETTER CODE 
USE Northwinds2022TSQLV7;
SELECT
    CustomerID,
    [2014] AS Orders_2014,
    [2015] AS Orders_2015,
    [2016] AS Orders_2016
FROM
    (SELECT
        CustomerID,
        YEAR(OrderDate) AS OrderYear,
        COUNT(OrderID) AS NumberOfOrders
    FROM Sales.[Order]
    GROUP BY CustomerID, YEAR(OrderDate)) AS SourceTable
PIVOT
    (SUM(NumberOfOrders) FOR OrderYear IN ([2014], [2015], [2016])) AS PivotTable
ORDER BY CustomerID;


CustomerID,Orders_2014,Orders_2015,Orders_2016
1,,3.0,3.0
2,1.0,2.0,1.0
3,1.0,5.0,1.0
4,2.0,7.0,4.0
5,3.0,10.0,5.0
6,,4.0,3.0
7,3.0,7.0,1.0
8,1.0,1.0,1.0
9,3.0,8.0,6.0
10,1.0,5.0,8.0
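
The same one-row-per-customer layout can also be produced without PIVOT by using conditional aggregation, which returns 0 instead of NULL for years without orders. The sketch below shows this on made-up orders, using Python's built-in sqlite3 module as a stand-in for SQL Server (strftime('%Y', ...) takes the place of YEAR()). The function name orders_per_year, the reduced Orders table, and the sample rows are assumptions for illustration.

```python
import sqlite3


def orders_per_year(orders):
    """Return (CustomerID, Orders_2014, Orders_2015, Orders_2016) rows.

    orders holds (CustomerID, 'YYYY-MM-DD') tuples, one per order.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Orders (CustomerID INTEGER, OrderDate TEXT)")
    con.executemany("INSERT INTO Orders VALUES (?, ?)", orders)
    result = con.execute(
        """
        SELECT CustomerID,
               SUM(CASE WHEN strftime('%Y', OrderDate) = '2014' THEN 1 ELSE 0 END) AS Orders_2014,
               SUM(CASE WHEN strftime('%Y', OrderDate) = '2015' THEN 1 ELSE 0 END) AS Orders_2015,
               SUM(CASE WHEN strftime('%Y', OrderDate) = '2016' THEN 1 ELSE 0 END) AS Orders_2016
        FROM Orders
        GROUP BY CustomerID
        ORDER BY CustomerID
        """
    ).fetchall()
    con.close()
    return result


# Toy orders, not taken from Sales.[Order].
sample = [
    (1, "2015-03-01"), (1, "2015-07-09"), (1, "2016-01-15"),
    (2, "2014-12-31"),
]
for row in orders_per_year(sample):
    print(row)
```

Like the static PIVOT, this form hard-codes the list of years, so a new year means a new column expression.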


**QUERY 15**

**Problem:** list the number of customers by city and country.

**Tables:** 

DimCustomer

DimGeography

**Columns:** 

Customer

> CustomerKey

Geography

> CountryRegionCode
> 
> City

**Predicate:**

The query groups the results by g.CountryRegionCode and g.City to ensure that the customer count is calculated separately for each unique city within each country. 

The ORDER BY g.CountryRegionCode, COUNT(c.CustomerKey) DESC clause sorts the results first by CountryRegionCode (alphabetically) to group cities within the same country together, and then by the count of customers in descending order within each country.

In [None]:
--list the number of customers by city and country.
USE AdventureWorksDW2017; 
SELECT 
    g.CountryRegionCode, 
    g.City, 
    COUNT(c.CustomerKey) AS NumberOfCustomers
FROM 
    DimCustomer c
JOIN 
    dbo.DimGeography g ON c.GeographyKey = g.GeographyKey
GROUP BY 
    g.CountryRegionCode, 
    g.City
ORDER BY 
    g.CountryRegionCode, 
    COUNT(c.CustomerKey) DESC;


**QUERY 16**

**Problem**: comparison of the geographic distribution of customers and resellers by country

**Tables:** 

DimCustomer

DimGeography 

DimReseller

**Columns:** 

Customer

> CustomerCount (derived column)

Geography

> CountryRegionCode

Reseller

> ResellerCount (derived column)

**Predicate:** 

The SQL code counts customers and resellers per GeographyKey in two subqueries, left-joins both to DimGeography, and sums the counts per CountryRegionCode; COALESCE turns a missing count into 0. The results are ordered alphabetically by CountryRegionCode to list countries and their respective counts of customers and resellers.

In [None]:
--16 comparison of the geographic distribution of customers and resellers by country
USE AdventureWorksDW2017;
SELECT 
    g.CountryRegionCode,
    COALESCE(SUM(c.CustomerCount), 0) AS CustomerCount,
    COALESCE(SUM(r.ResellerCount), 0) AS ResellerCount
FROM 
    DimGeography g
LEFT JOIN 
    (SELECT GeographyKey, COUNT(*) AS CustomerCount FROM DimCustomer GROUP BY GeographyKey) c ON g.GeographyKey = c.GeographyKey
LEFT JOIN 
    (SELECT GeographyKey, COUNT(*) AS ResellerCount FROM DimReseller GROUP BY GeographyKey) r ON g.GeographyKey = r.GeographyKey
GROUP BY 
    g.CountryRegionCode
ORDER BY 
    g.CountryRegionCode;
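
Counting customers and resellers in separate subqueries before joining matters: joining both detail tables directly to DimGeography multiplies customer rows by reseller rows for each GeographyKey and inflates both counts. The sketch below compares the two shapes on made-up data, using Python's built-in sqlite3 module as a stand-in for SQL Server. The three tables are reduced to the columns the query needs, and the function name compare_counts, the key values, and the rows are assumptions for illustration.

```python
import sqlite3

# Same shape as the notebook query: aggregate first, then join.
PRE_AGGREGATED = """
SELECT g.CountryRegionCode,
       COALESCE(SUM(c.CustomerCount), 0) AS CustomerCount,
       COALESCE(SUM(r.ResellerCount), 0) AS ResellerCount
FROM DimGeography AS g
LEFT JOIN (SELECT GeographyKey, COUNT(*) AS CustomerCount
           FROM DimCustomer GROUP BY GeographyKey) AS c
       ON g.GeographyKey = c.GeographyKey
LEFT JOIN (SELECT GeographyKey, COUNT(*) AS ResellerCount
           FROM DimReseller GROUP BY GeographyKey) AS r
       ON g.GeographyKey = r.GeographyKey
GROUP BY g.CountryRegionCode
ORDER BY g.CountryRegionCode
"""

# Naive shape: join the detail rows directly, then count.
NAIVE = """
SELECT g.CountryRegionCode,
       COUNT(c.CustomerKey) AS CustomerCount,
       COUNT(r.ResellerKey) AS ResellerCount
FROM DimGeography AS g
LEFT JOIN DimCustomer AS c ON g.GeographyKey = c.GeographyKey
LEFT JOIN DimReseller AS r ON g.GeographyKey = r.GeographyKey
GROUP BY g.CountryRegionCode
ORDER BY g.CountryRegionCode
"""


def compare_counts():
    """Return (pre_aggregated_rows, naive_rows) for a small toy dataset."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE DimGeography (GeographyKey INTEGER, CountryRegionCode TEXT);
        CREATE TABLE DimCustomer (CustomerKey INTEGER, GeographyKey INTEGER);
        CREATE TABLE DimReseller (ResellerKey INTEGER, GeographyKey INTEGER);
        INSERT INTO DimGeography VALUES (1, 'US'), (2, 'US'), (3, 'FR');
        INSERT INTO DimCustomer VALUES (10, 1), (11, 1), (12, 3);
        INSERT INTO DimReseller VALUES (20, 1), (21, 1), (22, 1);
        """
    )
    pre = con.execute(PRE_AGGREGATED).fetchall()
    naive = con.execute(NAIVE).fetchall()
    con.close()
    return pre, naive


pre, naive = compare_counts()
print(pre)
print(naive)
```

In this toy data the US has 2 customers and 3 resellers; the naive join reports 6 for both because the 2 customer rows are paired with each of the 3 reseller rows.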

**QUERY 17**

**Problem**: query to calculate the average dealer price of products grouped by color.

**Tables:** 

DimProduct

**Columns:** 

DimProduct

> Color
> 
> DealerPrice

**Predicate:**  

The WHERE clause filters the data to include only those records where Color and DealerPrice are not null: Color IS NOT NULL AND DealerPrice IS NOT NULL. This ensures that the calculation only considers products with both a defined color and dealer price, eliminating potential data quality issues or missing values that could skew the results.

In [None]:
--17 query to calculate the average dealer price of products grouped by color.
USE AdventureWorksDW2017; 
SELECT 
    Color, 
    AVG(DealerPrice) AS AverageDealerPrice
FROM 
    DimProduct
WHERE 
    Color IS NOT NULL AND DealerPrice IS NOT NULL
GROUP BY 
    Color
ORDER BY 
    AverageDealerPrice DESC;
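
How AVG treats NULL can be checked on a few made-up rows. The sketch below does not touch AdventureWorks. The inline rows, the alias `p`, and the MONEY type (an assumption about the column's data type) are purely illustrative:

```sql
-- Hypothetical rows: 'Red' has one NULL price, 'Blue' has only a NULL price,
-- and one priced product has no color.
SELECT Color,
       COUNT(*)           AS RowsInGroup,        -- counts every row
       COUNT(DealerPrice) AS RowsWithPrice,      -- counts only non-NULL prices
       AVG(DealerPrice)   AS AverageDealerPrice  -- NULL prices are ignored
FROM (VALUES
        ('Red',  CAST(10.00 AS MONEY)),
        ('Red',  CAST(NULL  AS MONEY)),
        ('Red',  CAST(20.00 AS MONEY)),
        ('Blue', CAST(NULL  AS MONEY)),
        (NULL,   CAST(5.00  AS MONEY))
     ) AS p (Color, DealerPrice)
GROUP BY Color;
```

Without any filter, 'Red' averages over its two priced rows, 'Blue' gets a NULL average, and the uncolored product forms its own NULL group. Adding the WHERE clause from the query above would remove the 'Blue' and NULL groups and leave the 'Red' average unchanged.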

**QUERY 18**

**Problem**: SQL query to determine the distribution of product sizes available. Calculate the total number of products for each distinct size

**Tables:** 

DimProduct

**Columns:** 

DimProduct

> Size

**Predicate:** 

The WHERE clause (Size IS NOT NULL) filters out any records where the Size is null, ensuring the analysis only includes products with a defined size.

In [None]:
-- 18 SQL query to determine the distribution of product sizes available. Calculate the total number of products for each distinct size
USE AdventureWorksDW2017; 
SELECT 
    Size, 
    COUNT(*) AS ProductCount
FROM 
    DimProduct
WHERE 
    Size IS NOT NULL
GROUP BY 
    Size
ORDER BY 
    ProductCount DESC;


**QUERY 19 WORST 3**

**Problem**:  Write an SQL query to analyze the distribution of weekdays throughout the year.

**Tables:** 

DimDate

**Columns:** 

DimDate

> CalendarYear
> 
> CalendarSemester
> 
> EnglishDayNameOfWeek
> 
> DayCount (derived column)

**Predicate:** 

The query groups the results by CalendarYear, CalendarSemester, and EnglishDayNameOfWeek using the GROUP BY clause. This aggregation allows for a detailed analysis of how the days of the week are distributed across different years and semesters. 

The results are ordered by CalendarYear, CalendarSemester, and then by DayCount DESC with the ORDER BY clause. This sorting prioritizes the display of information first chronologically by year and semester, and within those groups, it shows the most to least frequent days of the week.

In [8]:
--19 Write an SQL query to analyze the distribution of weekdays throughout the year.
USE AdventureWorksDW2017;
SELECT 
    CalendarYear,
    CalendarSemester,
    EnglishDayNameOfWeek,
    COUNT(*) AS DayCount
FROM 
    DimDate
GROUP BY 
    CalendarYear, 
    CalendarSemester, 
    EnglishDayNameOfWeek
ORDER BY 
    CalendarYear, 
    CalendarSemester, 
    DayCount DESC;

CalendarYear,CalendarSemester,EnglishDayNameOfWeek,DayCount
2005,1,Saturday,26
2005,1,Sunday,26
2005,1,Wednesday,26
2005,1,Monday,26
2005,1,Thursday,26
2005,1,Tuesday,26
2005,1,Friday,25
2005,2,Friday,27
2005,2,Saturday,27
2005,2,Monday,26


BETTER CODE

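Adding a WHERE clause limits the query to a specific range of years (here 2011 to 2014), which reduces the number of rows the database engine has to group and count. It also narrows the result set to those years, so the output is not directly comparable with the unfiltered query above.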

In [7]:
USE AdventureWorksDW2017;
SELECT 
    CalendarYear,
    CalendarSemester,
    EnglishDayNameOfWeek,
    COUNT(*) AS DayCount
FROM 
    DimDate
WHERE 
    CalendarYear BETWEEN 2011 AND 2014
GROUP BY 
    CalendarYear, 
    CalendarSemester, 
    EnglishDayNameOfWeek
ORDER BY 
    CalendarYear, 
    CalendarSemester, 
    DayCount DESC;


CalendarYear,CalendarSemester,EnglishDayNameOfWeek,DayCount
2011,1,Thursday,26
2011,1,Tuesday,26
2011,1,Monday,26
2011,1,Wednesday,26
2011,1,Saturday,26
2011,1,Sunday,26
2011,1,Friday,25
2011,2,Saturday,27
2011,2,Friday,27
2011,2,Sunday,26
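
Whether the filter actually saves work can be measured rather than assumed. The sketch below runs both versions with SQL Server's `SET STATISTICS IO` and `SET STATISTICS TIME` switched on, so the logical reads and CPU time of each statement are reported alongside the results. It is only a way to compare the two queries, and no particular numbers are claimed here. If DimDate has no index that supports a seek on CalendarYear (an assumption to verify on your copy), both versions may still read the whole table, and the saving is then limited to the grouping step.

```sql
USE AdventureWorksDW2017;

-- Report I/O and timing figures for each statement in the messages output.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Original query: all years.
SELECT CalendarYear, CalendarSemester, EnglishDayNameOfWeek, COUNT(*) AS DayCount
FROM DimDate
GROUP BY CalendarYear, CalendarSemester, EnglishDayNameOfWeek
ORDER BY CalendarYear, CalendarSemester, DayCount DESC;

-- Filtered query: 2011 to 2014 only.
SELECT CalendarYear, CalendarSemester, EnglishDayNameOfWeek, COUNT(*) AS DayCount
FROM DimDate
WHERE CalendarYear BETWEEN 2011 AND 2014
GROUP BY CalendarYear, CalendarSemester, EnglishDayNameOfWeek
ORDER BY CalendarYear, CalendarSemester, DayCount DESC;

-- Switch the reporting off again for later cells.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```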


**QUERY 20**

**Problem**:  Write an SQL query to track monthly trends in call volume and average time per issue in the call center.

**Tables:** 

FactCallCenter

DimDate

**Columns:** 

DimDate

> CalendarYear
> 
> MonthNumberOfYear

FactCallCenter

> FactCallCenterID
> 
> AverageTimePerIssue

**Predicate:** 

The query groups the results by d.CalendarYear and d.MonthNumberOfYear using the GROUP BY clause. This aggregation is crucial for analyzing monthly trends within each year.

The results are ordered by d.CalendarYear and d.MonthNumberOfYear using the ORDER BY clause. This ensures that the output is sorted first by year and then by month, making it easy to follow and analyze trends over time.

In [1]:
--20  Write an SQL query to track monthly trends in call volume and average time per issue in the call center. 
USE AdventureWorksDW2017;
SELECT 
    d.CalendarYear,
    d.MonthNumberOfYear,
    COUNT(f.FactCallCenterID) AS TotalCalls,
    AVG(f.AverageTimePerIssue) AS AverageTimePerIssue
FROM 
    FactCallCenter f
JOIN 
    DimDate d ON f.DateKey = d.DateKey
GROUP BY 
    d.CalendarYear, 
    d.MonthNumberOfYear
ORDER BY 
    d.CalendarYear, 
    d.MonthNumberOfYear;

CalendarYear,MonthNumberOfYear,TotalCalls,AverageTimePerIssue
2014,5,120,79
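
Two details of this result are worth checking. `COUNT(f.FactCallCenterID)` counts fact rows, which equals the number of calls only if each row represents a single call. The whole-number average (79) also suggests that AverageTimePerIssue is stored as an integer type, in which case AVG returns a truncated integer. The variant below is a sketch under two assumptions that should be verified against the schema: that FactCallCenter has a `Calls` column holding the number of calls per row, and that AverageTimePerIssue is an integer column.

```sql
USE AdventureWorksDW2017;
SELECT
    d.CalendarYear,
    d.MonthNumberOfYear,
    COUNT(f.FactCallCenterID) AS FactRows,    -- number of fact rows, as in the query above
    SUM(f.Calls) AS TotalCalls,               -- assumption: Calls holds the call volume per row
    AVG(CAST(f.AverageTimePerIssue AS DECIMAL(10, 2))) AS AverageTimePerIssue  -- cast avoids integer truncation
FROM
    FactCallCenter f
JOIN
    DimDate d ON f.DateKey = d.DateKey
GROUP BY
    d.CalendarYear,
    d.MonthNumberOfYear
ORDER BY
    d.CalendarYear,
    d.MonthNumberOfYear;
```

This is an unweighted mean of the per-row averages. Weighting each row by its number of issues would need an issue-count column, which is not part of the original query.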
