## 1715. Count Apples and Oranges
### Table: Boxes

| Column Name  | Type |
|--------------|------|
| box_id       | int  |
| chest_id     | int  |
| apple_count  | int  |
| orange_count | int  |

box_id is the primary key for this table.  
chest_id is a foreign key of the chests table.  
This table contains information about the boxes and the number of oranges and apples they contain. Each box may contain a chest, which can also contain oranges and apples.

---

### Table: Chests

| Column Name  | Type |
|--------------|------|
| chest_id     | int  |
| apple_count  | int  |
| orange_count | int  |

chest_id is the primary key for this table.  
This table contains information about the chests we have, and the corresponding number of oranges and apples they contain.

---

### Problem Statement

Write an SQL query to count the number of apples and oranges in all the boxes. If a box contains a chest, you should also include the number of apples and oranges the chest has.

Return the result table in any order.

---

### Sample Input

#### Boxes table:

| box_id | chest_id | apple_count | orange_count |
|--------|----------|-------------|--------------|
| 2      | null     | 6           | 15           |
| 18     | 14       | 4           | 15           |
| 19     | 3        | 8           | 4            |
| 12     | 2        | 19          | 20           |
| 20     | 6        | 12          | 9            |
| 8      | 6        | 9           | 9            |
| 3      | 14       | 16          | 7            |

#### Chests table:

| chest_id | apple_count | orange_count |
|----------|-------------|--------------|
| 6        | 5           | 6            |
| 14       | 20          | 10           |
| 2        | 8           | 8            |
| 3        | 19          | 4            |
| 16       | 19          | 19           |

---

### Output

| apple_count | orange_count |
|-------------|--------------|
| 151         | 123          |

---

### Explanation

box 2 has 6 apples and 15 oranges.  
box 18 has 4 + 20 (from the chest) = 24 apples and 15 + 10 (from the chest) = 25 oranges.  
box 19 has 8 + 19 (from the chest) = 27 apples and 4 + 4 (from the chest) = 8 oranges.  
box 12 has 19 + 8 (from the chest) = 27 apples and 20 + 8 (from the chest) = 28 oranges.  
box 20 has 12 + 5 (from the chest) = 17 apples and 9 + 6 (from the chest) = 15 oranges.  
box 8 has 9 + 5 (from the chest) = 14 apples and 9 + 6 (from the chest) = 15 oranges.  
box 3 has 16 + 20 (from the chest) = 36 apples and 7 + 10 (from the chest) = 17 oranges.  
Total number of apples = 6 + 24 + 27 + 27 + 17 + 14 + 36 = 151  
Total number of oranges = 15 + 25 + 8 + 28 + 15 + 15 + 17 = 123
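
The arithmetic above can be double-checked with a few lines of plain Python. This is only a sketch for verifying the sample, not part of the required SQL answer; the helper name `box_totals` and the dictionary layout for chests are assumptions made for illustration.

```python
# Sample rows copied from the tables above; None stands in for a null chest_id.
boxes = [
    (2, None, 6, 15),
    (18, 14, 4, 15),
    (19, 3, 8, 4),
    (12, 2, 19, 20),
    (20, 6, 12, 9),
    (8, 6, 9, 9),
    (3, 14, 16, 7),
]

# chest_id -> (apple_count, orange_count)
chests = {6: (5, 6), 14: (20, 10), 2: (8, 8), 3: (19, 4), 16: (19, 19)}


def box_totals(boxes, chests):
    """Return {box_id: (apples, oranges)}, adding chest contents when a chest exists."""
    totals = {}
    for box_id, chest_id, apples, oranges in boxes:
        # A box without a chest contributes nothing extra.
        chest_apples, chest_oranges = chests.get(chest_id, (0, 0))
        totals[box_id] = (apples + chest_apples, oranges + chest_oranges)
    return totals


per_box = box_totals(boxes, chests)
total_apples = sum(apples for apples, _ in per_box.values())
total_oranges = sum(oranges for _, oranges in per_box.values())
print(total_apples, total_oranges)  # 151 123
```

Chest 16 is not referenced by any box, so it never enters the totals, and chests 6 and 14 are each counted twice because two boxes point at them.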

In [0]:
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, IntegerType

spark = SparkSession.builder.getOrCreate()

# Define schemas
boxes_schema = StructType([
    StructField("box_id", IntegerType(), True),
    StructField("chest_id", IntegerType(), True),
    StructField("apple_count", IntegerType(), True),
    StructField("orange_count", IntegerType(), True)
])

chests_schema = StructType([
    StructField("chest_id", IntegerType(), True),
    StructField("apple_count", IntegerType(), True),
    StructField("orange_count", IntegerType(), True)
])

# Sample data
boxes_data = [
    (2, None, 6, 15),
    (18, 14, 4, 15),
    (19, 3, 8, 4),
    (12, 2, 19, 20),
    (20, 6, 12, 9),
    (8, 6, 9, 9),
    (3, 14, 16, 7)
]

chests_data = [
    (6, 5, 6),
    (14, 20, 10),
    (2, 8, 8),
    (3, 19, 4),
    (16, 19, 19)
]

# Create DataFrames
boxes_df = spark.createDataFrame(boxes_data, schema=boxes_schema)
chests_df = spark.createDataFrame(chests_data, schema=chests_schema)

# Register temp views
boxes_df.createOrReplaceTempView("Boxes")
chests_df.createOrReplaceTempView("Chests")


In [0]:

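from pyspark.sql import functions as F  # avoids shadowing Python's built-in sum

# Rename columns so box and chest counts stay distinct after the join.
box = boxes_df.selectExpr("box_id", "chest_id as b_c_id", "apple_count as b_ac", "orange_count as b_oc")
chest = chests_df.selectExpr("chest_id as c_c_id", "apple_count as c_ac", "orange_count as c_oc")

# The left join keeps boxes without a chest; coalesce turns their missing chest counts into 0.
(box.join(chest, F.col("b_c_id") == F.col("c_c_id"), "left")
    .selectExpr("coalesce(b_ac, 0) + coalesce(c_ac, 0) as apple_count",
                "coalesce(b_oc, 0) + coalesce(c_oc, 0) as orange_count")
    .agg(F.sum("apple_count").alias("apple_count"),
         F.sum("orange_count").alias("orange_count"))
    .display())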


In [0]:
%sql
SELECT SUM(COALESCE(b.apple_count, 0) + COALESCE(c.apple_count, 0)) AS apple_count,
       SUM(COALESCE(b.orange_count, 0) + COALESCE(c.orange_count, 0)) AS orange_count
FROM Boxes b
LEFT JOIN Chests c
  ON b.chest_id = c.chest_id

In [0]:

# SQL logic
query = """
SELECT 
    SUM(b.apple_count + COALESCE(c.apple_count, 0)) AS apple_count,
    SUM(b.orange_count + COALESCE(c.orange_count, 0)) AS orange_count
FROM Boxes b
LEFT JOIN Chests c
ON b.chest_id = c.chest_id
"""

result_df = spark.sql(query)
display(result_df)
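
The queries above wrap the chest columns in `COALESCE` because a box with no chest gets NULL chest counts from the left join. The sketch below illustrates the effect without a Spark cluster by running the same query shape in SQLite through Python's standard `sqlite3` module. SQLite is swapped in here only for convenience; the variable names `with_coalesce` and `without_coalesce` are illustrative.

```python
import sqlite3

# In-memory SQLite database holding the sample data from the problem statement.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Chests (chest_id INTEGER PRIMARY KEY, apple_count INTEGER, orange_count INTEGER);
CREATE TABLE Boxes (box_id INTEGER PRIMARY KEY, chest_id INTEGER, apple_count INTEGER, orange_count INTEGER);
""")
conn.executemany(
    "INSERT INTO Boxes VALUES (?, ?, ?, ?)",
    [(2, None, 6, 15), (18, 14, 4, 15), (19, 3, 8, 4), (12, 2, 19, 20),
     (20, 6, 12, 9), (8, 6, 9, 9), (3, 14, 16, 7)],
)
conn.executemany(
    "INSERT INTO Chests VALUES (?, ?, ?)",
    [(6, 5, 6), (14, 20, 10), (2, 8, 8), (3, 19, 4), (16, 19, 19)],
)

# Same shape as the Spark SQL query: missing chest counts become 0.
with_coalesce = conn.execute("""
    SELECT SUM(b.apple_count + COALESCE(c.apple_count, 0)),
           SUM(b.orange_count + COALESCE(c.orange_count, 0))
    FROM Boxes b
    LEFT JOIN Chests c ON b.chest_id = c.chest_id
""").fetchone()

# Without COALESCE, box 2's row evaluates to NULL and SUM skips it entirely.
without_coalesce = conn.execute("""
    SELECT SUM(b.apple_count + c.apple_count),
           SUM(b.orange_count + c.orange_count)
    FROM Boxes b
    LEFT JOIN Chests c ON b.chest_id = c.chest_id
""").fetchone()

print(with_coalesce)     # (151, 123)
print(without_coalesce)  # (145, 108)
conn.close()
```

The second result is short by exactly box 2's contents (6 apples, 15 oranges), which is the row whose `chest_id` is null.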