**Prithibi Paul | Group 6 | Project 1**

**Complex Queries:**

## Complex Queries: Query 1

### Proposition:
Identify the most frequently occurring Pokémon types (primary and secondary) across the three Pokémon generations by ranking the occurrences and selecting the type with the highest count in each category.

### Tables:
- `PokemonGen1.dbo.PokemonGen1`
- `PokemonGen2.dbo.PokemonGen2`
- `PokemonGen3.dbo.PokemonGen3`

### Columns:
- `Type1`: The primary type of a Pokémon.
- `Type2`: The secondary type of a Pokémon, which can be NULL if the Pokémon does not possess a secondary type.

### Predicate:
The query first counts the occurrences of each type across all three generations in two separate CTEs, `Type1Counts` for `Type1` and `Type2Counts` for `Type2`, excluding NULL values. Two further CTEs, `RankedType1` and `RankedType2`, then rank the types by count using the `RANK()` function. The final `SELECT` combines the rank-1 rows from both ranked CTEs with `UNION ALL`. Because `RANK()` gives tied counts the same rank, all types tied for the highest count are displayed.

In [None]:
-- Count each primary type across all three generations, ignoring NULLs
WITH Type1Counts AS (
    SELECT Type1 AS Type, COUNT(*) AS Occurrences
    FROM (
        SELECT Type1 FROM PokemonGen1.dbo.PokemonGen1
        UNION ALL
        SELECT Type1 FROM PokemonGen2.dbo.PokemonGen2
        UNION ALL
        SELECT Type1 FROM PokemonGen3.dbo.PokemonGen3
    ) AS AllType1
    WHERE Type1 IS NOT NULL
    GROUP BY Type1
),
-- Count each secondary type across all three generations, ignoring NULLs
Type2Counts AS (
    SELECT Type2 AS Type, COUNT(*) AS Occurrences
    FROM (
        SELECT Type2 FROM PokemonGen1.dbo.PokemonGen1
        UNION ALL
        SELECT Type2 FROM PokemonGen2.dbo.PokemonGen2
        UNION ALL
        SELECT Type2 FROM PokemonGen3.dbo.PokemonGen3
    ) AS AllType2
    WHERE Type2 IS NOT NULL
    GROUP BY Type2
),
RankedType1 AS (
    SELECT Type, Occurrences, RANK() OVER (ORDER BY Occurrences DESC) AS TypeRank
    FROM Type1Counts
),
RankedType2 AS (
    SELECT Type, Occurrences, RANK() OVER (ORDER BY Occurrences DESC) AS TypeRank
    FROM Type2Counts
)
-- RANK() assigns the same rank to ties, so every type tied for first place is returned
SELECT 'Type1' AS Category, Type, Occurrences FROM RankedType1 WHERE TypeRank = 1
UNION ALL
SELECT 'Type2', Type, Occurrences FROM RankedType2 WHERE TypeRank = 1;
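
The ranking approach described in the Query 1 predicate can be tried on a small dataset without access to the Pokémon databases. The following is a minimal sketch, assuming Python's built-in `sqlite3` module with SQLite 3.25 or later (for window functions). The single `Pokemon` table and its rows are hypothetical stand-ins for the three generation tables, chosen so that both categories end in a tie.

```python
import sqlite3

# Hypothetical sample rows (Generation, Type1, Type2); not real Pokémon data.
rows = [
    ("Gen1", "Water", None),
    ("Gen1", "Water", "Flying"),
    ("Gen2", "Fire", "Flying"),
    ("Gen2", "Fire", None),
    ("Gen3", "Grass", "Poison"),
    ("Gen3", "Bug", "Poison"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pokemon (Generation TEXT, Type1 TEXT, Type2 TEXT)")
conn.executemany("INSERT INTO Pokemon VALUES (?, ?, ?)", rows)

query = """
WITH Type1Counts AS (
    SELECT Type1 AS Type, COUNT(*) AS Occurrences
    FROM Pokemon WHERE Type1 IS NOT NULL GROUP BY Type1
),
Type2Counts AS (
    SELECT Type2 AS Type, COUNT(*) AS Occurrences
    FROM Pokemon WHERE Type2 IS NOT NULL GROUP BY Type2
),
RankedType1 AS (
    SELECT Type, Occurrences, RANK() OVER (ORDER BY Occurrences DESC) AS TypeRank
    FROM Type1Counts
),
RankedType2 AS (
    SELECT Type, Occurrences, RANK() OVER (ORDER BY Occurrences DESC) AS TypeRank
    FROM Type2Counts
)
SELECT 'Type1' AS Category, Type, Occurrences FROM RankedType1 WHERE TypeRank = 1
UNION ALL
SELECT 'Type2', Type, Occurrences FROM RankedType2 WHERE TypeRank = 1
"""

# Sort in Python so the printed order does not depend on the engine.
top_types = sorted(conn.execute(query).fetchall())
print(top_types)
```

In this sample, Water and Fire tie as primary types and Flying and Poison tie as secondary types, so four rows come back instead of two. That is the tie behavior the predicate describes.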


## Complex Queries: Query 2

### Proposition:
Find the average stats (HP, Attack, Defense, SpecialAttack, SpecialDefense, Speed) for each primary type (Type1) across all three Pokémon generations, focusing only on Pokémon with a secondary type (Type2). The query presents one row of averages per primary type, computed over the combined generations.

### Tables:
- PokemonGen1.dbo.PokemonGen1
- PokemonGen2.dbo.PokemonGen2
- PokemonGen3.dbo.PokemonGen3

### Columns:
- `Type1`: The primary type of a Pokémon.
- `Type2`: The secondary type of a Pokémon, which can be NULL if the Pokémon does not possess a secondary type.
- Other stat columns like HP, Attack, Defense, SpecialAttack, SpecialDefense, Speed.

### Predicate:
The query first creates a combined dataset of Pokémon from all three generations using a UNION ALL approach, including only those Pokémon with a secondary type. Then, it computes the average stats for each primary type (Type1) across the combined data. The results are grouped by Type1 and sorted by average HP and average Attack in descending order.


In [None]:
WITH CombinedData AS (
    SELECT 'Gen1' AS Generation, Type1, HP, Attack, Defense, SpecialAttack, SpecialDefense, Speed
    FROM PokemonGen1.dbo.PokemonGen1
    WHERE Type2 IS NOT NULL
    UNION ALL
    SELECT 'Gen2' AS Generation, Type1, HP, Attack, Defense, SpecialAttack, SpecialDefense, Speed
    FROM PokemonGen2.dbo.PokemonGen2
    WHERE Type2 IS NOT NULL
    UNION ALL
    SELECT 'Gen3' AS Generation, Type1, HP, Attack, Defense, SpecialAttack, SpecialDefense, Speed
    FROM PokemonGen3.dbo.PokemonGen3
    WHERE Type2 IS NOT NULL
),
AverageStats AS (
    SELECT Type1, AVG(HP) AS AvgHP, AVG(Attack) AS AvgAttack, AVG(Defense) AS AvgDefense,
           AVG(SpecialAttack) AS AvgSpecialAttack, AVG(SpecialDefense) AS AvgSpecialDefense, AVG(Speed) AS AvgSpeed
    FROM CombinedData
    GROUP BY Type1
)
SELECT Type1, AvgHP, AvgAttack, AvgDefense, AvgSpecialAttack, AvgSpecialDefense, AvgSpeed
FROM AverageStats
ORDER BY AvgHP DESC, AvgAttack DESC;


## Complex Queries: Query 3

### Proposition:
Determine the Pokémon with the highest value in each stat category (HP, Attack, Defense, SpecialAttack, SpecialDefense, Speed) across all generations, displaying the top Pokémon for each stat.

### Tables:
- PokemonGen1.dbo.PokemonGen1
- PokemonGen2.dbo.PokemonGen2
- PokemonGen3.dbo.PokemonGen3

### Columns:
- `Name`: The name of the Pokémon.
- `HP`, `Attack`, `Defense`, `SpecialAttack`, `SpecialDefense`, `Speed`: The various stat columns for a Pokémon.

### Predicate:
The query consolidates data from all three Pokémon generations into one dataset. It then uses a series of CTEs (Common Table Expressions), one per stat, to find the Pokémon with the highest value in that stat. The final output is a union of these CTEs, presenting the top Pokémon for each stat across all generations. Because each CTE uses `TOP 1`, only one Pokémon is returned per stat even when several are tied for the highest value.

In [None]:
WITH CombinedData AS (
    SELECT 'Gen1' AS Generation, Name, HP, Attack, Defense, SpecialAttack, SpecialDefense, Speed
    FROM PokemonGen1.dbo.PokemonGen1
    UNION ALL
    SELECT 'Gen2' AS Generation, Name, HP, Attack, Defense, SpecialAttack, SpecialDefense, Speed
    FROM PokemonGen2.dbo.PokemonGen2
    UNION ALL
    SELECT 'Gen3' AS Generation, Name, HP, Attack, Defense, SpecialAttack, SpecialDefense, Speed
    FROM PokemonGen3.dbo.PokemonGen3
),
TopHP AS (
    SELECT TOP 1 'Highest HP' AS Category, Generation, Name, HP AS StatValue
    FROM CombinedData
    ORDER BY HP DESC
),
TopAttack AS (
    SELECT TOP 1 'Highest Attack' AS Category, Generation, Name, Attack
    FROM CombinedData
    ORDER BY Attack DESC
),
TopDefense AS (
    SELECT TOP 1 'Highest Defense' AS Category, Generation, Name, Defense
    FROM CombinedData
    ORDER BY Defense DESC
),
TopSpecialAttack AS (
    SELECT TOP 1 'Highest SpecialAttack' AS Category, Generation, Name, SpecialAttack
    FROM CombinedData
    ORDER BY SpecialAttack DESC
),
TopSpecialDefense AS (
    SELECT TOP 1 'Highest SpecialDefense' AS Category, Generation, Name, SpecialDefense
    FROM CombinedData
    ORDER BY SpecialDefense DESC
),
TopSpeed AS (
    SELECT TOP 1 'Highest Speed' AS Category, Generation, Name, Speed
    FROM CombinedData
    ORDER BY Speed DESC
)
SELECT * FROM TopHP
UNION ALL
SELECT * FROM TopAttack
UNION ALL
SELECT * FROM TopDefense
UNION ALL
SELECT * FROM TopSpecialAttack
UNION ALL
SELECT * FROM TopSpecialDefense
UNION ALL
SELECT * FROM TopSpeed;
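
As an alternative to one `TOP 1` CTE per stat, the same result can be obtained by stacking the stats into rows and ranking within each category, which also keeps ties. The following is a minimal sketch, assuming Python's `sqlite3` module with SQLite 3.25 or later. The `Pokemon` table, the names, and the stat values are hypothetical, and only two stats are shown to keep it short.

```python
import sqlite3

# Hypothetical rows (Generation, Name, HP, Attack); names and values are made up.
rows = [
    ("Gen1", "Alpha", 250, 5),
    ("Gen2", "Beta", 255, 10),
    ("Gen3", "Gamma", 150, 160),
    ("Gen3", "Delta", 100, 160),  # deliberately tied with Gamma on Attack
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pokemon (Generation TEXT, Name TEXT, HP INTEGER, Attack INTEGER)")
conn.executemany("INSERT INTO Pokemon VALUES (?, ?, ?, ?)", rows)

query = """
WITH Unpivoted AS (
    -- One row per Pokémon per stat, with a shared StatValue column
    SELECT 'Highest HP' AS Category, Generation, Name, HP AS StatValue FROM Pokemon
    UNION ALL
    SELECT 'Highest Attack', Generation, Name, Attack FROM Pokemon
),
Ranked AS (
    SELECT Category, Generation, Name, StatValue,
           RANK() OVER (PARTITION BY Category ORDER BY StatValue DESC) AS StatRank
    FROM Unpivoted
)
SELECT Category, Generation, Name, StatValue
FROM Ranked
WHERE StatRank = 1
ORDER BY Category, Name
"""

top_stats = conn.execute(query).fetchall()
print(top_stats)
```

Here both Pokémon tied on Attack are returned, whereas a `TOP 1` query would return only one of them, chosen arbitrarily.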

## Complex Queries: Query 4

### Proposition:
Analyze the distribution of Pokémon types (primary and secondary) across all generations, and determine the generation where each type is most prevalent.

### Tables:
- PokemonGen1.dbo.PokemonGen1
- PokemonGen2.dbo.PokemonGen2
- PokemonGen3.dbo.PokemonGen3

### Columns:
- `Type1`: The primary type of a Pokémon.
- `Type2`: The secondary type of a Pokémon, which can be NULL if the Pokémon does not possess a secondary type.

### Predicate:
The query begins by consolidating the counts of primary and secondary types across all three generations. It then utilizes window functions to identify the generation in which each type is most prevalent. The final output lists each type along with the generation where it appears most frequently.

In [None]:
WITH TypeCounts AS (
    SELECT 'Gen1' AS Generation, Type1 AS Type
    FROM PokemonGen1.dbo.PokemonGen1
    WHERE Type1 IS NOT NULL
    UNION ALL
    SELECT 'Gen1', Type2
    FROM PokemonGen1.dbo.PokemonGen1
    WHERE Type2 IS NOT NULL
    UNION ALL
    SELECT 'Gen2', Type1
    FROM PokemonGen2.dbo.PokemonGen2
    WHERE Type1 IS NOT NULL
    UNION ALL
    SELECT 'Gen2', Type2
    FROM PokemonGen2.dbo.PokemonGen2
    WHERE Type2 IS NOT NULL
    UNION ALL
    SELECT 'Gen3', Type1
    FROM PokemonGen3.dbo.PokemonGen3
    WHERE Type1 IS NOT NULL
    UNION ALL
    SELECT 'Gen3', Type2
    FROM PokemonGen3.dbo.PokemonGen3
    WHERE Type2 IS NOT NULL
),
CountsByGeneration AS (
    SELECT Type, Generation, COUNT(*) AS Count
    FROM TypeCounts
    GROUP BY Type, Generation
),
RankedTypes AS (
    SELECT Type, Generation, Count, RANK() OVER (PARTITION BY Type ORDER BY Count DESC) AS Rank
    FROM CountsByGeneration
)
SELECT Type, Generation
FROM RankedTypes
WHERE Rank = 1;


## Complex Queries: Query 5

### Proposition:
Compare the average base stat totals of each Pokémon type across the three generations to find out which type has evolved the most in terms of average total stats.

### Tables:
- PokemonGen1.dbo.PokemonGen1
- PokemonGen2.dbo.PokemonGen2
- PokemonGen3.dbo.PokemonGen3

### Columns:
- `Type1`: The primary type of a Pokémon.
- `HP`, `Attack`, `Defense`, `SpecialAttack`, `SpecialDefense`, `Speed`: The various stat columns for a Pokémon.

### Predicate:
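This query calculates the average total base stats for each primary type (Type1) in each generation. It then compares these averages to determine which type has shown the greatest increase in average total stats from Generation 1 to Generation 3. A type that is missing from Generation 1 or Generation 3 has no difference to compute, so `COALESCE` reports its improvement as 0.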

In [None]:
WITH Gen1Avg AS (
    SELECT Type1, AVG(HP + Attack + Defense + SpecialAttack + SpecialDefense + Speed) AS AvgTotalStats
    FROM PokemonGen1.dbo.PokemonGen1
    GROUP BY Type1
),
Gen2Avg AS (
    SELECT Type1, AVG(HP + Attack + Defense + SpecialAttack + SpecialDefense + Speed) AS AvgTotalStats
    FROM PokemonGen2.dbo.PokemonGen2
    GROUP BY Type1
),
Gen3Avg AS (
    SELECT Type1, AVG(HP + Attack + Defense + SpecialAttack + SpecialDefense + Speed) AS AvgTotalStats
    FROM PokemonGen3.dbo.PokemonGen3
    GROUP BY Type1
),
CombinedAverages AS (
    -- COALESCE keeps the type name even when a type is absent from an earlier generation
    SELECT COALESCE(g1.Type1, g2.Type1, g3.Type1) AS Type1,
           g1.AvgTotalStats AS Gen1Avg, g2.AvgTotalStats AS Gen2Avg, g3.AvgTotalStats AS Gen3Avg
    FROM Gen1Avg g1
    FULL OUTER JOIN Gen2Avg g2 ON g1.Type1 = g2.Type1
    FULL OUTER JOIN Gen3Avg g3 ON COALESCE(g1.Type1, g2.Type1) = g3.Type1
),
StatsImprovement AS (
    SELECT 
        Type1, 
        Gen1Avg, 
        Gen2Avg, 
        Gen3Avg, 
        COALESCE(Gen3Avg - Gen1Avg, 0) AS Improvement
    FROM CombinedAverages
)
SELECT Type1, Gen1Avg, Gen2Avg, Gen3Avg, Improvement
FROM StatsImprovement
ORDER BY Improvement DESC;

## Complex Queries: Query 6

### Proposition:
Assess the balance of Pokémon types within each generation, determining the proportion of each type in relation to the total number of Pokémon in that generation.

### Tables:
- PokemonGen1.dbo.PokemonGen1
- PokemonGen2.dbo.PokemonGen2
- PokemonGen3.dbo.PokemonGen3

### Columns:
- `Type1`: The primary type of a Pokémon.

### Predicate:
This query calculates the proportion of each Pokémon type (Type1) within each generation. It does this by first determining the count of each type and the total number of Pokémon in each generation, and then computing the proportion of each type.


In [None]:
WITH TypeCounts AS (
    SELECT 'Gen1' AS Generation, Type1, COUNT(*) AS Count
    FROM PokemonGen1.dbo.PokemonGen1
    GROUP BY Type1
    UNION ALL
    SELECT 'Gen2', Type1, COUNT(*)
    FROM PokemonGen2.dbo.PokemonGen2
    GROUP BY Type1
    UNION ALL
    SELECT 'Gen3', Type1, COUNT(*)
    FROM PokemonGen3.dbo.PokemonGen3
    GROUP BY Type1
),
TotalCounts AS (
    SELECT Generation, COUNT(*) AS Total
    FROM (
        SELECT 'Gen1' AS Generation, Type1
        FROM PokemonGen1.dbo.PokemonGen1
        UNION ALL
        SELECT 'Gen2', Type1
        FROM PokemonGen2.dbo.PokemonGen2
        UNION ALL
        SELECT 'Gen3', Type1
        FROM PokemonGen3.dbo.PokemonGen3
    ) AS AllGenerations
    GROUP BY Generation
),
TypeProportions AS (
    SELECT tc.Generation, tc.Type1, tc.Count, tc.Count * 100.0 / tt.Total AS Proportion
    FROM TypeCounts tc
    JOIN TotalCounts tt ON tc.Generation = tt.Generation
)
SELECT Generation, Type1, Count, Proportion
FROM TypeProportions
ORDER BY Generation, Proportion DESC;
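
The proportion calculation can also be written with a window function instead of a separate totals CTE and join. The following is a minimal sketch, assuming Python's `sqlite3` module with SQLite 3.25 or later. The single `Pokemon` table and its rows are hypothetical stand-ins for the three generation tables.

```python
import sqlite3

# Hypothetical rows (Generation, Type1); not real Pokémon data.
rows = [
    ("Gen1", "Water"), ("Gen1", "Water"), ("Gen1", "Fire"), ("Gen1", "Grass"),
    ("Gen2", "Fire"), ("Gen2", "Water"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pokemon (Generation TEXT, Type1 TEXT)")
conn.executemany("INSERT INTO Pokemon VALUES (?, ?)", rows)

query = """
WITH TypeCounts AS (
    SELECT Generation, Type1, COUNT(*) AS TypeCount
    FROM Pokemon
    GROUP BY Generation, Type1
)
SELECT Generation, Type1, TypeCount,
       -- The window SUM gives the generation total on every row, so no join is needed
       TypeCount * 100.0 / SUM(TypeCount) OVER (PARTITION BY Generation) AS Proportion
FROM TypeCounts
ORDER BY Generation, Proportion DESC, Type1
"""

proportions = conn.execute(query).fetchall()
for row in proportions:
    print(row)
```

Multiplying by `100.0` rather than `100` forces decimal arithmetic, which is the same reason the original query uses `100.0`. A quick sanity check is that the proportions within each generation sum to 100.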


## Complex Queries: Query 7

### Proposition:
Analyze the distribution of Pokémon abilities across all generations, identifying the most common and least common abilities.

### Tables:
- PokemonGen1.dbo.PokemonGen1
- PokemonGen2.dbo.PokemonGen2
- PokemonGen3.dbo.PokemonGen3

### Columns:
- `Ability1`, `Ability2`, `Ability3`: The abilities of a Pokémon.

### Predicate:
This query focuses on the distribution of Pokémon abilities. It first aggregates the occurrences of each ability across all generations, then identifies the most and least common abilities.

In [None]:
WITH AbilityCounts AS (
    SELECT Ability1 AS Ability
    FROM PokemonGen1.dbo.PokemonGen1
    WHERE Ability1 IS NOT NULL
    UNION ALL
    SELECT Ability2
    FROM PokemonGen1.dbo.PokemonGen1
    WHERE Ability2 IS NOT NULL
    UNION ALL
    SELECT Ability3
    FROM PokemonGen1.dbo.PokemonGen1
    WHERE Ability3 IS NOT NULL
    UNION ALL
    SELECT Ability1
    FROM PokemonGen2.dbo.PokemonGen2
    WHERE Ability1 IS NOT NULL
    UNION ALL
    SELECT Ability2
    FROM PokemonGen2.dbo.PokemonGen2
    WHERE Ability2 IS NOT NULL
    UNION ALL
    SELECT Ability3
    FROM PokemonGen2.dbo.PokemonGen2
    WHERE Ability3 IS NOT NULL
    UNION ALL
    SELECT Ability1
    FROM PokemonGen3.dbo.PokemonGen3
    WHERE Ability1 IS NOT NULL
    UNION ALL
    SELECT Ability2
    FROM PokemonGen3.dbo.PokemonGen3
    WHERE Ability2 IS NOT NULL
    UNION ALL
    SELECT Ability3
    FROM PokemonGen3.dbo.PokemonGen3
    WHERE Ability3 IS NOT NULL
),
TotalAbilityCounts AS (
    SELECT Ability, COUNT(*) AS Count
    FROM AbilityCounts
    GROUP BY Ability
),
MostCommonAbility AS (
    SELECT TOP 1 Ability, Count, 'Most Common' AS Category
    FROM TotalAbilityCounts
    ORDER BY Count DESC
),
LeastCommonAbility AS (
    SELECT TOP 1 Ability, Count, 'Least Common' AS Category
    FROM TotalAbilityCounts
    ORDER BY Count
)
SELECT * FROM MostCommonAbility
UNION ALL
SELECT * FROM LeastCommonAbility;


\-------------------------------------

**Medium Queries**

<span style="color: #800000;font-weight: bold;">-------------------------------------</span>

## Medium Query: 1

### Proposition:
Implement a query to identify the locations shared by the Human Resources and Sales data in the Northwinds2022TSQLV7 database. The goal is to determine which cities have both employees and customers, indicating regions of overlapping corporate interest.

### Requirements:

#### Shared Location Identification:
- Use the INTERSECT operator to find distinct common locations between employees and customers.
- Ensure that the resulting list contains only unique entries with no duplicates.

#### Advanced Intersection Analysis (Optional):
- Emulate `INTERSECT ALL`, which T-SQL does not support, so that a shared location is returned once for each occurrence that has a match in both tables.
- Use the `ROW_NUMBER()` function to number the rows within each location group before applying INTERSECT.

#### Location Details:
- Retrieve the country, region, and city for each shared entry.

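### Tables:
- `[HumanResources].[Employee]`
- `[Sales].[Customer]`

### Columns:
- From `[HumanResources].[Employee]`: EmployeeCountry, EmployeeRegion, EmployeeCity
- From `[Sales].[Customer]`: CustomerCountry, CustomerRegion, CustomerCity

### Predicate:
- The INTERSECT operator ensures that only locations present in both the `HumanResources.Employee` and `Sales.Customer` tables are included in the result set, each listed once.
- The advanced version adds a `ROW_NUMBER()` partitioned by country, region, and city to each side, so the nth occurrence of a location among employees matches only the nth occurrence among customers. A location therefore appears as many times as the smaller of its two counts.

### Sorting:
- Neither query has an ORDER BY clause, so the order of the returned rows is not guaranteed.
- `ORDER BY (SELECT NULL)` inside `ROW_NUMBER()` only satisfies the function's syntax. It numbers the rows in arbitrary order and does not influence the final output order.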

In [None]:
USE Northwinds2022TSQLV7;

SELECT EmployeeCountry AS country, EmployeeRegion AS region, EmployeeCity AS city 
FROM HumanResources.Employee
INTERSECT
SELECT CustomerCountry AS country, CustomerRegion AS region, CustomerCity AS city 
FROM Sales.Customer;

SELECT
  ROW_NUMBER() OVER(PARTITION BY country, region, city ORDER BY (SELECT NULL)) AS rownum,
  country, region, city
FROM
  (SELECT EmployeeCountry AS country, EmployeeRegion AS region, EmployeeCity AS city 
   FROM HumanResources.Employee) AS Employees

INTERSECT

SELECT
  ROW_NUMBER() OVER(PARTITION BY country, region, city ORDER BY (SELECT NULL)),
  country, region, city
FROM
  (SELECT CustomerCountry AS country, CustomerRegion AS region, CustomerCity AS city 
   FROM Sales.Customer) AS Customers;
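
The difference between the plain INTERSECT and the `ROW_NUMBER()` variant is easiest to see on a few rows. The following is a minimal sketch, assuming Python's `sqlite3` module with SQLite 3.25 or later. The `Employee` and `Customer` tables and their rows are hypothetical. SQLite allows `ROW_NUMBER()` without an ORDER BY, so the `ORDER BY (SELECT NULL)` that T-SQL requires is omitted here.

```python
import sqlite3

# Hypothetical rows (country, region, city); the real tables are
# HumanResources.Employee and Sales.Customer.
employees = [("UK", None, "London")] * 3 + [("USA", "WA", "Seattle"), ("USA", "WA", "Tacoma")]
customers = [("UK", None, "London")] * 2 + [("USA", "WA", "Seattle")] * 2 + [("France", None, "Paris")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (country TEXT, region TEXT, city TEXT)")
conn.execute("CREATE TABLE Customer (country TEXT, region TEXT, city TEXT)")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)", employees)
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)", customers)

# Plain INTERSECT: each shared location once, NULL regions treated as equal.
distinct_sql = """
SELECT country, region, city FROM Employee
INTERSECT
SELECT country, region, city FROM Customer
"""

# Emulated INTERSECT ALL: number the duplicates on each side, then intersect.
multiset_sql = """
SELECT ROW_NUMBER() OVER (PARTITION BY country, region, city) AS rownum,
       country, region, city
FROM Employee
INTERSECT
SELECT ROW_NUMBER() OVER (PARTITION BY country, region, city),
       country, region, city
FROM Customer
"""

# Sort on non-NULL columns so the comparison order is well defined in Python.
distinct_rows = sorted(conn.execute(distinct_sql).fetchall(), key=lambda r: (r[0], r[2]))
multiset_rows = sorted(conn.execute(multiset_sql).fetchall(), key=lambda r: (r[1], r[3], r[0]))
print(distinct_rows)
print(multiset_rows)
```

London has three employee rows and two customer rows, so it appears twice in the second result. Seattle has one employee row and two customer rows, so it appears once.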

## Medium Query: 2

### Proposition:
Create a comprehensive list of locations from the Northwinds2022TSQLV7 database by combining location information from both the Human Resources and Sales modules. The list should include all locations where either an employee or a customer is based, without any omissions. Additionally, create a distinct list of locations that excludes duplicates.

### Requirements:

#### All-Inclusive Location List:
- Combine location data from both employees and customers into a single list.
- Include all entries, even duplicates, for completeness.

#### Distinct Location List:
- Generate a unique set of locations by removing duplicates.
- Ensure each location appears only once in this distinct list.

#### Location Details:
- Retrieve country, region, and city for each entry in the lists.

### Tables:
- `[HumanResources].[Employee]`
- `[Sales].[Customer]`

### Columns:
- From `[HumanResources].[Employee]`: EmployeeCountry, EmployeeRegion, EmployeeCity
- From `[Sales].[Customer]`: CustomerCountry, CustomerRegion, CustomerCity

### Predicate:
- Use `UNION ALL` to compile all location entries, including duplicates.
- Use `UNION` to filter out duplicates, creating a distinct list of locations.

### Sorting:
- The distinct list will be sorted by country, region, and city to organize unique entries.

### Context:
This combined location data is vital for businesses needing a complete view of their geographic presence for market analysis, distribution planning, and demographic research. The all-inclusive list provides a full count of location-based entries, while the distinct list offers a clear picture of the company's spread without repeated locations.


In [None]:
USE Northwinds2022TSQLV7;

SELECT EmployeeCountry AS country, EmployeeRegion AS region, EmployeeCity AS city 
FROM HumanResources.Employee
UNION ALL
SELECT CustomerCountry AS country, CustomerRegion AS region, CustomerCity AS city 
FROM Sales.Customer;

-- Distinct location list from Employees and Customers using UNION, sorted by country, region, and city
USE Northwinds2022TSQLV7;

SELECT EmployeeCountry AS country, EmployeeRegion AS region, EmployeeCity AS city 
FROM HumanResources.Employee
UNION
SELECT CustomerCountry AS country, CustomerRegion AS region, CustomerCity AS city 
FROM Sales.Customer
ORDER BY country, region, city;
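
The effect of `UNION ALL` versus `UNION` can be checked on a handful of rows. The following is a minimal sketch, assuming Python's built-in `sqlite3` module. The `Employee` and `Customer` tables and their rows are hypothetical stand-ins for `HumanResources.Employee` and `Sales.Customer`.

```python
import sqlite3

# Hypothetical rows (country, region, city).
employees = [("UK", None, "London"), ("UK", None, "London"), ("USA", "WA", "Seattle")]
customers = [("UK", None, "London"), ("France", None, "Paris")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (country TEXT, region TEXT, city TEXT)")
conn.execute("CREATE TABLE Customer (country TEXT, region TEXT, city TEXT)")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)", employees)
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)", customers)

# UNION ALL keeps every row from both inputs, duplicates included.
all_rows = conn.execute("""
    SELECT country, region, city FROM Employee
    UNION ALL
    SELECT country, region, city FROM Customer
""").fetchall()

# UNION removes duplicates; the ORDER BY applies to the combined result.
distinct_rows = conn.execute("""
    SELECT country, region, city FROM Employee
    UNION
    SELECT country, region, city FROM Customer
    ORDER BY country, region, city
""").fetchall()

print(len(all_rows))
print(distinct_rows)
```

The all-inclusive list has five rows, one per employee or customer, while the distinct list collapses the three London rows into one.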

## Medium Query: 3

### Proposition:
Identify the locations that appear in both the Human Resources and Sales data of the Northwinds2022TSQLV7 database, and rank these shared locations by overlap intensity. This analysis highlights where the company's employees and customers are based in the same place, which is useful for strategic planning and market analysis.

### Requirements:

#### Shared Location List:
- Use `INTERSECT` to keep only the locations present for both employees (Human Resources) and customers (Sales).
- Present each shared location only once.

#### Overlap Ranking:
- Group the shared locations by country, region, and city.
- Use `RANK()` ordered by `COUNT(*)` in descending order to assign an `OverlapIntensityRank` to each location.

#### Location Details:
- Extract and display the country, region, and city for each location.
- Source location data from the specified columns in the `HumanResources.Employee` and `Sales.Customer` tables.

### Tables:
- `[HumanResources].[Employee]`
- `[Sales].[Customer]`

### Columns:
- From `[HumanResources].[Employee]`: EmployeeCountry, EmployeeRegion, EmployeeCity
- From `[Sales].[Customer]`: CustomerCountry, CustomerRegion, CustomerCity

### Predicate:
- The `OverlapLocations` CTE applies `INTERSECT` to the employee and customer locations.
- The `RankedOverlap` CTE groups the result by location and ranks each location with `RANK()`.

### Sorting:
- The final SELECT has no ORDER BY clause, so the output order is not guaranteed.


In [None]:
USE Northwinds2022TSQLV7;

WITH OverlapLocations AS (
    SELECT 
        E.EmployeeCountry AS Country, 
        E.EmployeeRegion AS Region, 
        E.EmployeeCity AS City
    FROM 
        HumanResources.Employee AS E
    INTERSECT
    SELECT 
        C.CustomerCountry AS Country, 
        C.CustomerRegion AS Region, 
        C.CustomerCity AS City
    FROM 
        Sales.Customer AS C
)

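, RankedOverlap AS (
    -- INTERSECT returns distinct rows, so counting OverlapLocations alone always yields 1.
    -- Count the employee and customer rows at each shared location instead.
    SELECT 
        o.Country, 
        o.Region, 
        o.City,
        RANK() OVER (ORDER BY COUNT(*) DESC) AS OverlapIntensityRank
    FROM 
        OverlapLocations AS o
    INNER JOIN (
        SELECT EmployeeCountry AS Country, EmployeeRegion AS Region, EmployeeCity AS City
        FROM HumanResources.Employee
        UNION ALL
        SELECT CustomerCountry, CustomerRegion, CustomerCity
        FROM Sales.Customer
    ) AS a
        ON  a.Country = o.Country
        AND a.City = o.City
        AND (a.Region = o.Region OR (a.Region IS NULL AND o.Region IS NULL))
    GROUP BY 
        o.Country, 
        o.Region, 
        o.City
)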

SELECT 
    Country, 
    Region, 
    City, 
    OverlapIntensityRank
FROM 
    RankedOverlap;

## Medium Query: 4

### Proposition:
Summarize each order in the Northwinds2022TSQLV7 database by the number of distinct products it contains, the total quantity ordered, and the total order value. This summary highlights the largest orders, which is useful for sales analysis.

### Requirements:

#### Order Summary:
- Count the distinct products on each order.
- Sum the quantity and the line value (`UnitPrice * Quantity`) across the order's detail rows.

#### Order Details:
- Include the order ID and order date for each order.

### Tables:
- `[Sales].[Order]`
- `[Sales].[OrderDetail]`
- `[Production].[Product]`

### Columns:
- From `[Sales].[Order]`: OrderId, OrderDate
- From `[Sales].[OrderDetail]`: OrderId, ProductId, UnitPrice, Quantity
- From `[Production].[Product]`: ProductId

### Joins:
- Join `Sales.Order` with `Sales.OrderDetail` on `OrderId`.
- Join `Sales.OrderDetail` with `Production.Product` on `ProductId`.

### Grouping and Sorting:
- Group the results by order ID and order date.
- Order the results by total value in descending order to list the largest orders first.


In [None]:
SELECT 
    o.OrderId,
    o.OrderDate,
    COUNT(distinct od.ProductId) AS NumberOfProducts,
    SUM(od.Quantity) AS TotalQuantity,
    SUM(od.UnitPrice * od.Quantity) AS TotalValue
FROM 
    Sales.[Order] AS o
INNER JOIN 
    Sales.OrderDetail AS od 
ON 
    o.OrderId = od.OrderId
INNER JOIN 
    Production.Product AS p 
ON 
    od.ProductId = p.ProductId
GROUP BY 
    o.OrderId, 
    o.OrderDate
ORDER BY 
    TotalValue DESC;


## Medium Query: 5

### Proposition:
Construct a query to identify and analyze the sales performance by product categories in the Northwinds2022TSQLV7 database. This analysis is intended to uncover sales trends and performance metrics that can inform product strategy and inventory management decisions.

### Requirements:

#### Sales Performance by Category:
- Aggregate sales data to calculate the total sales and average unit price for each product category.
- Provide insights into the quantity sold and the revenue generated per category.

#### Performance Metrics:
- Include calculations for the total quantity of products sold and the total revenue per category.
- Calculate the average price per unit within each category to assess pricing strategies.

#### Category Details:
- Include the category name in the results to identify the categories.
- Utilize the `Production.Category` and `Production.Product` tables for category information, and join with `Sales.OrderDetail` to include sales data.

### Tables:
- `[Production].[Category]`
- `[Production].[Product]`
- `[Sales].[OrderDetail]`

### Columns:
- From `[Production].[Category]`: CategoryId, CategoryName
- From `[Production].[Product]`: ProductId, CategoryId
- From `[Sales].[OrderDetail]`: ProductId, UnitPrice, Quantity

### Joins:
- Join the tables on their respective `CategoryId` and `ProductId` to collate the sales data for each category.

### Grouping and Sorting:
- Group the results by category name to consolidate the sales metrics.
- Order the results by total revenue generated in descending order to highlight the top-performing categories.


In [None]:
SELECT 
    c.CategoryName,
    COUNT(DISTINCT p.ProductId) AS NumberOfProducts,
    SUM(od.Quantity) AS TotalQuantitySold,
    SUM(od.UnitPrice * od.Quantity) AS TotalRevenue,
    AVG(od.UnitPrice) AS AverageUnitPrice
FROM 
    [Production].[Category] AS c
INNER JOIN 
    [Production].[Product] AS p 
    ON c.CategoryId = p.CategoryId
INNER JOIN 
    [Sales].[OrderDetail] AS od 
    ON p.ProductId = od.ProductId
GROUP BY 
    c.CategoryName
ORDER BY 
    TotalRevenue DESC;


## Medium Query: 6

### Proposition:
Develop a query to evaluate the inventory status across various suppliers in the Northwinds2022TSQLV7 database. This assessment is aimed at providing a detailed view of the stock levels and supply chain efficiency, which is crucial for inventory control and order fulfillment processes.

### Requirements:

#### Inventory Assessment:
- Summarize the inventory data to reflect the total number of products supplied and the average price per supplier.
- Highlight the suppliers' contribution to the inventory in terms of product diversity and value.

#### Supplier Performance Metrics:
- Calculate the total number of distinct products supplied by each supplier.
- Determine the average unit price of products supplied by each supplier to gauge cost efficiency.

#### Supplier and Product Details:
- Pull supplier name details from the `Production.Supplier` table.
- Use the `Production.Product` table for product inventory and pricing information.

### Tables:
- `[Production].[Supplier]`
- `[Production].[Product]`

### Columns:
- From `[Production].[Supplier]`: SupplierId, SupplierCompanyName
- From `[Production].[Product]`: ProductId, SupplierId, UnitPrice

### Joins:
- Join the tables on their respective `SupplierId` to associate the products with their suppliers.

### Grouping and Sorting:
- Group the results by supplier to consolidate inventory data.
- Sort the results by the total number of products in descending order to identify the suppliers with the widest range of products.

In [None]:
SELECT 
    s.SupplierCompanyName,
    COUNT(p.ProductId) AS TotalNumberOfProducts,
    AVG(p.UnitPrice) AS AveragePricePerProduct
FROM 
    [Production].[Supplier] AS s
INNER JOIN 
    [Production].[Product] AS p 
    ON s.SupplierId = p.SupplierId
GROUP BY 
    s.SupplierCompanyName
ORDER BY 
    TotalNumberOfProducts DESC;


## Medium Query: 7

### Proposition:
Implement a query to assess the tenure and attrition of employees within the Northwinds2022TSQLV7 database. The goal is to gain insight into the duration of employment and to identify any notable trends in staff turnover which can inform human resources strategies and retention policies.

### Requirements:

#### Tenure Evaluation:
- Calculate the length of employment for each staff member from the `HumanResources.Employee` table.
- Distinguish between current employees and those who are no longer with the company.

#### Attrition Analysis:
- Compute the tenure in years for all employees, considering the HireDate for the start and SysEnd (or equivalent) for the termination of employment.
- Examine the rate of attrition by analyzing the proportion of employees still active versus those who have left.

#### Employee Details:
- Include pertinent employee details such as name, title, start date, and, if available, the end date of employment.
- Derive information from the `HumanResources.Employee` table, with additional data from `SystemVersioned.Employee` where available.

### Tables:
- `[HumanResources].[Employee]`
- `[SystemVersioned].[Employee]`

### Columns:
- From `[HumanResources].[Employee]`: EmployeeId, EmployeeFirstName, EmployeeLastName, EmployeeTitle, HireDate
- From `[SystemVersioned].[Employee]`: EmployeeId, SysStart, SysEnd (assuming SysEnd signifies the end of employment)

### Predicate:
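- A non-NULL `SysEnd` value indicates an employee who has left the company. For employees without one, the query substitutes `GETDATE()` as the end date.
- `DATEDIFF(year, ...)` counts the calendar-year boundaries crossed between the two dates, so `YearsOfService` can exceed the number of full years worked by up to one.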

### Sorting:
- Order the results by length of service to prioritize employees with the longest tenure.


In [None]:
SELECT 
    e.EmployeeFirstName + ' ' + e.EmployeeLastName AS EmployeeFullName,
    e.EmployeeTitle,
    e.HireDate,
    ISNULL(sv.SysEnd, GETDATE()) AS EndDate,
    DATEDIFF(year, e.HireDate, ISNULL(sv.SysEnd, GETDATE())) AS YearsOfService
FROM 
    [HumanResources].[Employee] AS e
LEFT JOIN 
    [SystemVersioned].[Employee] AS sv 
    ON e.EmployeeId = sv.EmployeeId
WHERE 
    e.EmployeeTitle IS NOT NULL
ORDER BY 
    YearsOfService DESC;


## Medium Query: 8

### Proposition:
Generate a query to track the order fulfillment times across different shippers in the Northwinds2022TSQLV7 database. This analysis aims to optimize logistics by assessing the efficiency of delivery services used by the company.

### Requirements:

#### Fulfillment Time Analysis:
- Calculate the average time taken from order placement to shipment for each shipper.
- Identify patterns or outliers in delivery times that could indicate performance issues.

#### Shipper Performance Metrics:
- Determine the number of orders shipped by each shipper and the average fulfillment time in days.
- Provide data that can be used to compare shippers and make informed decisions on logistics partnerships.

#### Shipper and Order Details:
- Include shipper name details from the `Sales.Shipper` table.
- Use the `Sales.Order` table to obtain order and shipment dates.

### Tables:
- `[Sales].[Shipper]`
- `[Sales].[Order]`

### Columns:
- From `[Sales].[Shipper]`: ShipperId, ShipperCompanyName
- From `[Sales].[Order]`: OrderId, ShipperId, OrderDate, ShipToDate

### Joins:
- Join the `Sales.Order` table with the `Sales.Shipper` table on `ShipperId` to associate orders with their shippers.

### Grouping and Sorting:
- Group the results by shipper company name to aggregate the fulfillment data.
- Sort the results by the average fulfillment time to prioritize the shippers with the quickest delivery times.


In [None]:
SELECT 
    s.ShipperCompanyName,
    COUNT(o.OrderId) AS TotalOrdersShipped,
    -- Multiply by 1.0 so AVG does not truncate the integer DATEDIFF result to whole days
    AVG(1.0 * DATEDIFF(day, o.OrderDate, o.ShipToDate)) AS AverageFulfillmentTime
FROM 
    [Sales].[Shipper] AS s
JOIN 
    [Sales].[Order] AS o 
    ON s.ShipperId = o.ShipperId
GROUP BY 
    s.ShipperCompanyName
ORDER BY 
    AverageFulfillmentTime;


## Medium Query: 9

### Proposition:
Design a query to assess the sales performance of employees in the Northwinds2022TSQLV7 database. This exploration is aimed at identifying top performers based on sales revenue and understanding how employee sales efforts contribute to the company's overall success.

### Requirements:

#### Sales Performance Evaluation:
- Analyze total sales generated by each employee, providing a measure of individual contribution to the company's revenue.
- Highlight top-performing employees based on the revenue they have generated.

#### Performance Metrics:
- Include calculations for the total sales revenue associated with each employee.
- Assess the number of orders handled by each employee to gauge their productivity.

#### Employee and Sales Details:
- Source employee details from the `HumanResources.Employee` table.
- Utilize the `Sales.Order` table for order details, and join with `Sales.OrderDetail` to incorporate sales data.

### Tables:
- `[HumanResources].[Employee]`
- `[Sales].[Order]`
- `[Sales].[OrderDetail]`

### Columns:
- From `[HumanResources].[Employee]`: EmployeeId, EmployeeFirstName, EmployeeLastName
- From `[Sales].[Order]`: OrderId, EmployeeId, OrderDate
- From `[Sales].[OrderDetail]`: OrderId, UnitPrice, Quantity

### Joins:
- Join `Sales.Order` with `HumanResources.Employee` on `EmployeeId` to link orders to employees.
- Join `Sales.Order` with `Sales.OrderDetail` on `OrderId` to calculate total sales.

### Grouping and Sorting:
- Group the results by employee to aggregate sales data.
- Sort the results by total sales revenue in descending order to identify the highest contributors.


In [None]:
SELECT 
    e.EmployeeFirstName + ' ' + e.EmployeeLastName AS EmployeeName,
    COUNT(DISTINCT o.OrderId) AS NumberOfOrders,  -- DISTINCT: the join yields one row per order line
    SUM(od.UnitPrice * od.Quantity) AS TotalSalesRevenue
FROM 
    [HumanResources].[Employee] e
JOIN 
    [Sales].[Order] o ON e.EmployeeId = o.EmployeeId
JOIN 
    [Sales].[OrderDetail] od ON o.OrderId = od.OrderId
GROUP BY 
    e.EmployeeId, e.EmployeeFirstName, e.EmployeeLastName  -- EmployeeId keeps same-named employees separate
ORDER BY 
    TotalSalesRevenue DESC;
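
Joining `Sales.Order` to `Sales.OrderDetail` produces one row per order line, so a plain `COUNT` of `OrderId` counts lines rather than orders. The sketch below illustrates the difference on a few made-up rows; `SampleOrder` and `SampleOrderDetail` are hypothetical stand-ins, not tables in the database.

```sql
-- Hypothetical sample data: one employee, two orders, three order lines.
WITH SampleOrder AS (
    SELECT * FROM (VALUES (1, 100), (2, 100)) AS v(OrderId, EmployeeId)
),
SampleOrderDetail AS (
    SELECT * FROM (VALUES (1, 10.00, 2), (1, 5.00, 1), (2, 20.00, 3))
        AS v(OrderId, UnitPrice, Quantity)
)
SELECT
    o.EmployeeId,
    COUNT(o.OrderId)          AS JoinedRows,      -- 3: one per order line
    COUNT(DISTINCT o.OrderId) AS NumberOfOrders,  -- 2: one per order
    SUM(od.UnitPrice * od.Quantity) AS TotalSalesRevenue
FROM SampleOrder AS o
JOIN SampleOrderDetail AS od
    ON o.OrderId = od.OrderId
GROUP BY o.EmployeeId;
```

The revenue sum is unaffected by the fan-out because each order line contributes exactly once; only the order count needs `DISTINCT`.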


## Medium Query: 10

### Proposition:
Craft a query to analyze customer satisfaction based on the frequency and volume of repeat orders within the Northwinds2022TSQLV7 database. This investigation aims to leverage order history to gauge customer loyalty and satisfaction, which are crucial metrics for business growth and customer relationship management.

### Requirements:

#### Customer Loyalty Evaluation:
- Determine the frequency of orders placed by each customer as an indicator of satisfaction and loyalty.
- Calculate the total volume of products ordered by each customer to assess their value to the company.

#### Loyalty and Value Metrics:
- Include counts for the total number of orders and the total quantity of products ordered by each customer.
- Evaluate the average order value (AOV) for each customer to understand spending behavior.

#### Customer and Order Details:
- Source customer details from the `Sales.Customer` table.
- Utilize the `Sales.Order` and `Sales.OrderDetail` tables to gather order and product quantity information.

### Tables:
- `[Sales].[Customer]`
- `[Sales].[Order]`
- `[Sales].[OrderDetail]`

### Columns:
- From `[Sales].[Customer]`: CustomerId, CustomerCompanyName
- From `[Sales].[Order]`: OrderId, CustomerId
- From `[Sales].[OrderDetail]`: OrderId, Quantity, UnitPrice

### Joins:
- Join `Sales.Order` with `Sales.Customer` on `CustomerId` to link orders to customers.
- Join `Sales.Order` with `Sales.OrderDetail` on `OrderId` to obtain details about each order.

### Grouping and Sorting:
- Group the results by customer to compile loyalty and value data.
- Sort the results by the number of orders and total quantity ordered in descending order to highlight the most loyal and valuable customers.


In [None]:
SELECT 
    c.CustomerCompanyName,
    COUNT(DISTINCT o.OrderId) AS NumberOfOrders,
    SUM(od.Quantity) AS TotalQuantityOrdered,
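    -- Revenue divided by distinct orders; AVG over joined rows would give the average line value instead
    SUM(od.UnitPrice * od.Quantity) / COUNT(DISTINCT o.OrderId) AS AverageOrderValue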
FROM 
    [Sales].[Customer] c
JOIN 
    [Sales].[Order] o ON c.CustomerId = o.CustomerId
JOIN 
    [Sales].[OrderDetail] od ON o.OrderId = od.OrderId
GROUP BY 
    c.CustomerCompanyName
ORDER BY 
    NumberOfOrders DESC, TotalQuantityOrdered DESC;
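
Because the proposition is about repeat orders, it may help to total each order first and then keep only customers with more than one order. The following is a sketch under that reading; the CTE name `OrderTotal` is introduced here for illustration.

```sql
-- Sketch: per-order totals first, then per-customer loyalty metrics.
WITH OrderTotal AS (
    SELECT
        o.CustomerId,
        o.OrderId,
        SUM(od.UnitPrice * od.Quantity) AS OrderValue
    FROM [Sales].[Order] AS o
    JOIN [Sales].[OrderDetail] AS od
        ON o.OrderId = od.OrderId
    GROUP BY o.CustomerId, o.OrderId
)
SELECT
    c.CustomerCompanyName,
    COUNT(*)           AS NumberOfOrders,
    AVG(ot.OrderValue) AS AverageOrderValue  -- average over orders, not order lines
FROM [Sales].[Customer] AS c
JOIN OrderTotal AS ot
    ON c.CustomerId = ot.CustomerId
GROUP BY c.CustomerId, c.CustomerCompanyName
HAVING COUNT(*) > 1  -- keep only customers who ordered more than once
ORDER BY NumberOfOrders DESC;
```

Aggregating to one row per order before averaging means each order carries equal weight, regardless of how many lines it contains.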


## Medium Query: 11

### Proposition:
Implement a query to explore the relationship between product pricing, sales volume, and discounts applied within the Northwinds2022TSQLV7 database. This study aims to understand how pricing strategies and discounting affect product sales, enabling more informed decisions on pricing and promotions.

### Requirements:

#### Pricing and Sales Volume Analysis:
- Assess the impact of unit price and discounts on the sales volume of products.
- Investigate the correlation between high sales volume and the level of discounts offered.

#### Discount Impact Evaluation:
- Quantify the average discount applied to products and its effect on the quantity sold.
- Explore variations in discounting practices across different product categories.

#### Product and Sales Details:
- Use the `Production.Product` table for product details, including pricing.
- Employ the `Sales.OrderDetail` table for sales data, including quantities sold and discounts applied.

### Tables:
- `[Production].[Product]`
- `[Sales].[OrderDetail]`

### Columns:
- From `[Production].[Product]`: ProductId, ProductName, UnitPrice, CategoryId
- From `[Sales].[OrderDetail]`: ProductId, Quantity, DiscountPercentage

### Joins:
- Join `Sales.OrderDetail` with `Production.Product` on `ProductId` to correlate sales data with product details.

### Grouping and Sorting:
- Group the results by product to aggregate sales and discount data.
- Sort the results by the quantity sold and average discount applied to highlight the effects of discounting on sales volume.

In [None]:
SELECT 
    p.ProductName,
    p.UnitPrice,
    AVG(od.DiscountPercentage) AS AverageDiscount,
    SUM(od.Quantity) AS TotalQuantitySold
FROM 
    [Production].[Product] p
JOIN 
    [Sales].[OrderDetail] od ON p.ProductId = od.ProductId
GROUP BY 
    p.ProductName, p.UnitPrice
ORDER BY 
    TotalQuantitySold DESC, AverageDiscount DESC;
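
The requirements also mention variations in discounting across product categories. A minimal sketch of that view, grouping by the `CategoryId` column listed above instead of by product, could look like this:

```sql
-- Sketch: average discount and total quantity sold per product category.
SELECT
    p.CategoryId,
    AVG(od.DiscountPercentage) AS AverageDiscount,
    SUM(od.Quantity)           AS TotalQuantitySold
FROM [Production].[Product] AS p
JOIN [Sales].[OrderDetail] AS od
    ON p.ProductId = od.ProductId
GROUP BY p.CategoryId
ORDER BY AverageDiscount DESC;
```

Only `CategoryId` is shown because no category name column is listed in this section; joining to a category table for readable names would be a natural extension.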


## Medium Query: 12

### Proposition:
Create a query to identify the impact of discount strategies on sales volume in the Northwinds2022TSQLV7 database. This analysis is intended to reveal how different levels of discounts affect the quantity of products sold, aiding in the optimization of pricing and discount policies.

### Requirements:

#### Discount Impact Analysis:
- Examine the relationship between discount rates and the quantity of products sold for each order.
- Group data to compare average quantities sold across different discount brackets (e.g., 0%, 1-5%, 6-10%, etc.).

#### Sales Volume Metrics:
- Calculate the average quantity sold for each discount bracket to identify trends in customer purchasing behavior.
- Assess the overall effectiveness of discounting strategies on sales volume.

#### Order and Discount Details:
- Source data from the `Sales.OrderDetail` table, which contains detailed order information, including discount rates and quantities.

### Tables:
- `[Sales].[OrderDetail]`

### Columns:
- From `[Sales].[OrderDetail]`: OrderId, ProductId, UnitPrice, Quantity, DiscountPercentage

### Calculation and Grouping:
- Dynamically categorize each order line into a discount bracket based on its `DiscountPercentage`.
- Aggregate the data to compute the average quantity sold within each discount bracket.

### Sorting:
- Sort the results by discount brackets in ascending order to facilitate a straightforward analysis of discount impact.


In [None]:
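-- Assumes DiscountPercentage holds whole-number percentages (5 = 5%).
-- If it is stored as a fraction (0.05 = 5%), scale the thresholds below to match.
SELECT 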
    CASE 
        WHEN DiscountPercentage = 0 THEN '0%'
        WHEN DiscountPercentage > 0 AND DiscountPercentage <= 5 THEN '1-5%'
        WHEN DiscountPercentage > 5 AND DiscountPercentage <= 10 THEN '6-10%'
        WHEN DiscountPercentage > 10 AND DiscountPercentage <= 15 THEN '11-15%'
        WHEN DiscountPercentage > 15 AND DiscountPercentage <= 20 THEN '16-20%'
        WHEN DiscountPercentage > 20 AND DiscountPercentage <= 25 THEN '21-25%'
        ELSE 'Above 25%'
    END AS DiscountBracket,
    AVG(CAST(Quantity AS DECIMAL(10, 2))) AS AverageQuantitySold  -- cast avoids integer truncation by AVG
FROM 
    [Sales].[OrderDetail]
GROUP BY 
    CASE 
        WHEN DiscountPercentage = 0 THEN '0%'
        WHEN DiscountPercentage > 0 AND DiscountPercentage <= 5 THEN '1-5%'
        WHEN DiscountPercentage > 5 AND DiscountPercentage <= 10 THEN '6-10%'
        WHEN DiscountPercentage > 10 AND DiscountPercentage <= 15 THEN '11-15%'
        WHEN DiscountPercentage > 15 AND DiscountPercentage <= 20 THEN '16-20%'
        WHEN DiscountPercentage > 20 AND DiscountPercentage <= 25 THEN '21-25%'
        ELSE 'Above 25%'
    END
ORDER BY 
    MIN(DiscountPercentage);
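
The bracket expression has to appear in both `SELECT` and `GROUP BY`, which makes it easy for the two copies to drift apart. One possible way to define it once is `CROSS APPLY` with a single-row `VALUES` constructor. The sketch below runs on made-up rows; `SampleDetail` is a hypothetical stand-in for `[Sales].[OrderDetail]`, the sample discounts are whole-number percentages (an assumption), and the bracket list is shortened for brevity.

```sql
-- Hypothetical sample rows: (DiscountPercentage, Quantity).
WITH SampleDetail AS (
    SELECT * FROM (VALUES (0, 10), (0, 11), (5, 20), (10, 30))
        AS v(DiscountPercentage, Quantity)
)
SELECT
    b.DiscountBracket,
    -- The cast keeps the fractional part: the '0%' bracket averages 10.5, not 10.
    AVG(CAST(d.Quantity AS DECIMAL(10, 2))) AS AverageQuantitySold
FROM SampleDetail AS d
CROSS APPLY (VALUES (
    CASE
        WHEN d.DiscountPercentage = 0   THEN '0%'
        WHEN d.DiscountPercentage <= 5  THEN '1-5%'
        WHEN d.DiscountPercentage <= 10 THEN '6-10%'
        ELSE 'Above 10%'
    END
)) AS b(DiscountBracket)
GROUP BY b.DiscountBracket
ORDER BY MIN(d.DiscountPercentage);
```

The same pattern should carry over to the real table by replacing `SampleDetail` with `[Sales].[OrderDetail]` and restoring the full set of brackets.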


## Medium Query: 13

### Proposition:
Design a query to evaluate the seasonal impact on sales within the Northwinds2022TSQLV7 database. This study aims to uncover how different seasons affect the sales volume and revenue, enabling targeted marketing campaigns and inventory adjustments based on seasonal trends.

### Requirements:

#### Seasonal Sales Analysis:
- Identify patterns in sales volume and revenue across different seasons of the year.
- Categorize sales data into seasonal brackets (e.g., Spring, Summer, Fall, Winter) based on order dates.

#### Sales Trends by Season:
- Calculate the total sales revenue and the average order size for each season to assess the seasonal impact on sales.
- Highlight the season with the highest sales to guide marketing and inventory strategies.

#### Date and Sales Details:
- Utilize the `Sales.Order` table for order dates and associate these with the `Sales.OrderDetail` for sales data.

### Tables:
- `[Sales].[Order]`
- `[Sales].[OrderDetail]`

### Columns:
- From `[Sales].[Order]`: OrderId, OrderDate
- From `[Sales].[OrderDetail]`: OrderId, UnitPrice, Quantity

### Grouping and Calculation:
- Dynamically assign each order to a season based on its order date.
- Group the results by season to compute total revenue and average order size.

### Sorting:
- Organize the results by total sales revenue in descending order to easily identify the most lucrative season.


In [None]:
SELECT 
    CASE 
        WHEN MONTH(OrderDate) IN (3, 4, 5) THEN 'Spring'
        WHEN MONTH(OrderDate) IN (6, 7, 8) THEN 'Summer'
        WHEN MONTH(OrderDate) IN (9, 10, 11) THEN 'Fall'
        ELSE 'Winter'
    END AS Season,
    COUNT(DISTINCT o.OrderId) AS NumberOfOrders,
    SUM(od.Quantity * od.UnitPrice) AS TotalSalesRevenue,
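    -- Revenue divided by distinct orders; AVG over joined rows would give the average line value instead
    SUM(od.Quantity * od.UnitPrice) / COUNT(DISTINCT o.OrderId) AS AverageOrderValue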
FROM 
    [Sales].[Order] o
JOIN 
    [Sales].[OrderDetail] od ON o.OrderId = od.OrderId
GROUP BY 
    CASE 
        WHEN MONTH(OrderDate) IN (3, 4, 5) THEN 'Spring'
        WHEN MONTH(OrderDate) IN (6, 7, 8) THEN 'Summer'
        WHEN MONTH(OrderDate) IN (9, 10, 11) THEN 'Fall'
        ELSE 'Winter'
    END
ORDER BY 
    TotalSalesRevenue DESC;
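
As an alternative to repeating the `CASE` expression, the month-to-season mapping can be held in a small lookup set and joined. The following is a sketch only; `SeasonMap` is a name introduced here, and the month boundaries simply mirror the ones used in the query above.

```sql
-- Sketch: month-to-season lookup defined once, then joined to the orders.
WITH SeasonMap AS (
    SELECT * FROM (VALUES
        (12, 'Winter'), (1, 'Winter'), (2, 'Winter'),
        (3, 'Spring'),  (4, 'Spring'), (5, 'Spring'),
        (6, 'Summer'),  (7, 'Summer'), (8, 'Summer'),
        (9, 'Fall'),    (10, 'Fall'),  (11, 'Fall')
    ) AS v(MonthNumber, Season)
)
SELECT
    sm.Season,
    COUNT(DISTINCT o.OrderId)       AS NumberOfOrders,
    SUM(od.Quantity * od.UnitPrice) AS TotalSalesRevenue
FROM [Sales].[Order] AS o
JOIN [Sales].[OrderDetail] AS od
    ON o.OrderId = od.OrderId
JOIN SeasonMap AS sm
    ON sm.MonthNumber = MONTH(o.OrderDate)
GROUP BY sm.Season
ORDER BY TotalSalesRevenue DESC;
```

Changing a season boundary then means editing one row of the lookup rather than two copies of a `CASE` expression.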


Written in collaboration with ChatGPT from OpenAI to improve understanding and assist with the explanation of the query.