## Importing Libraries

In [0]:
from pyspark.sql.functions import *
from pyspark.sql.types import *
from pyspark.sql.window import Window

**1098. Unpopular Books (Medium)**

**Table: Books**

| Column Name    | Type    |
|----------------|---------|
| book_id        | int     |
| name           | varchar |
| available_from | date    |

book_id is the primary key (column with unique values) of this table.
 
**Table: Orders**

| Column Name    | Type    |
|----------------|---------|
| order_id       | int     |
| book_id        | int     |
| quantity       | int     |
| dispatch_date  | date    |

order_id is the primary key (column with unique values) of this table.
book_id is a foreign key (reference column) to the Books table.
 
**Write a solution to report the books that have sold fewer than 10 copies in the last year, excluding books that have been available for less than one month from today. Assume today is 2019-06-23.**

Return the result table in any order.

The result format is in the following example.

**Example 1:**

**Input:** 

**Books table:**

| book_id | name               | available_from |
|---------|--------------------|----------------|
| 1       | "Kalila And Demna" | 2010-01-01     |
| 2       | "28 Letters"       | 2012-05-12     |
| 3       | "The Hobbit"       | 2019-06-10     |
| 4       | "13 Reasons Why"   | 2019-06-01     |
| 5       | "The Hunger Games" | 2008-09-21     |

**Orders table:**

| order_id | book_id | quantity | dispatch_date |
|----------|---------|----------|---------------|
| 1        | 1       | 2        | 2018-07-26    |
| 2        | 1       | 1        | 2018-11-05    |
| 3        | 3       | 8        | 2019-06-11    |
| 4        | 4       | 6        | 2019-06-05    |
| 5        | 4       | 5        | 2019-06-20    |
| 6        | 5       | 9        | 2009-02-02    |
| 7        | 5       | 8        | 2010-04-13    |

**Output:**

| book_id   | name               |
|-----------|--------------------|
| 1         | "Kalila And Demna" |
| 2         | "28 Letters"       |
| 5         | "The Hunger Games" |


In [0]:
books_data_1098 = [
    (1, "Kalila And Demna", "2010-01-01"),
    (2, "28 Letters", "2012-05-12"),
    (3, "The Hobbit", "2019-06-10"),
    (4, "13 Reasons Why", "2019-06-01"),
    (5, "The Hunger Games", "2008-09-21"),
]

books_columns_1098 = ["book_id", "name", "available_from"]
books_df_1098 = spark.createDataFrame(books_data_1098, books_columns_1098)
books_df_1098.show()

orders_data_1098 = [
    (1, 1, 2, "2018-07-26"),
    (2, 1, 1, "2018-11-05"),
    (3, 3, 8, "2019-06-11"),
    (4, 4, 6, "2019-06-05"),
    (5, 4, 5, "2019-06-20"),
    (6, 5, 9, "2009-02-02"),
    (7, 5, 8, "2010-04-13"),
]

orders_columns_1098 = ["order_id", "book_id", "quantity", "dispatch_date"]
orders_df_1098 = spark.createDataFrame(orders_data_1098, orders_columns_1098)
orders_df_1098.show()

+-------+----------------+--------------+
|book_id|            name|available_from|
+-------+----------------+--------------+
|      1|Kalila And Demna|    2010-01-01|
|      2|      28 Letters|    2012-05-12|
|      3|      The Hobbit|    2019-06-10|
|      4|  13 Reasons Why|    2019-06-01|
|      5|The Hunger Games|    2008-09-21|
+-------+----------------+--------------+

+--------+-------+--------+-------------+
|order_id|book_id|quantity|dispatch_date|
+--------+-------+--------+-------------+
|       1|      1|       2|   2018-07-26|
|       2|      1|       1|   2018-11-05|
|       3|      3|       8|   2019-06-11|
|       4|      4|       6|   2019-06-05|
|       5|      4|       5|   2019-06-20|
|       6|      5|       9|   2009-02-02|
|       7|      5|       8|   2010-04-13|
+--------+-------+--------+-------------+



In [0]:
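# Keep books available for at least one month as of 2019-06-23, i.e. released on or before 2019-05-23.
eligible_books_df_1098 = books_df_1098.filter(col("available_from") <= to_date(lit('2019-05-23')))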
eligible_books_df_1098.show()

+-------+----------------+--------------+
|book_id|            name|available_from|
+-------+----------------+--------------+
|      1|Kalila And Demna|    2010-01-01|
|      2|      28 Letters|    2012-05-12|
|      5|The Hunger Games|    2008-09-21|
+-------+----------------+--------------+
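
The cutoff `2019-05-23` in the filter above is written out by hand. As a minimal sketch in plain Python (no Spark needed), it can be derived from the assumed "today" of 2019-06-23, along with the one-year cutoff `2018-06-23` that the order filter relies on. The helper `shift_months` is a hypothetical name introduced here for illustration; it is not part of PySpark or the standard library.

```python
import calendar
from datetime import date


def shift_months(d: date, months: int) -> date:
    """Shift d by a number of calendar months (hypothetical helper).

    The day is clamped to the last day of the target month,
    so 31 March minus one month gives the last day of February.
    """
    month_index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(month_index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


today = date(2019, 6, 23)               # "today" as fixed by the problem statement
one_month_ago = shift_months(today, -1)
one_year_ago = shift_months(today, -12)

print(one_month_ago)  # 2019-05-23
print(one_year_ago)   # 2018-06-23
```

Deriving both dates from a single `today` value keeps the two filters consistent if the reference date ever changes.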



In [0]:
# Keep orders dispatched within the last year, i.e. after 2018-06-23.
recent_orders_df_1098 = orders_df_1098.filter(col("dispatch_date") > to_date(lit('2018-06-23')))
recent_orders_df_1098.show()

+--------+-------+--------+-------------+
|order_id|book_id|quantity|dispatch_date|
+--------+-------+--------+-------------+
|       1|      1|       2|   2018-07-26|
|       2|      1|       1|   2018-11-05|
|       3|      3|       8|   2019-06-11|
|       4|      4|       6|   2019-06-05|
|       5|      4|       5|   2019-06-20|
+--------+-------+--------+-------------+



In [0]:
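# Total copies sold per book over the last year.
# Note: sum here is pyspark.sql.functions.sum (from the wildcard import), not the Python builtin.
sales_df_1098 = recent_orders_df_1098 \
                .groupBy("book_id") \
                .agg(sum("quantity").alias("total_sold"))
sales_df_1098.show()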

+-------+----------+
|book_id|total_sold|
+-------+----------+
|      1|         3|
|      3|         8|
|      4|        11|
+-------+----------+



In [0]:
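# The left join keeps books with no recent orders; fillna turns their null total_sold into 0.
eligible_books_df_1098.join(sales_df_1098, on="book_id", how="left") \
                        .fillna(0, subset=["total_sold"]) \
                        .filter(col("total_sold") < 10) \
                        .select("book_id", "name").show()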

+-------+----------------+
|book_id|            name|
+-------+----------------+
|      1|Kalila And Demna|
|      2|      28 Letters|
|      5|The Hunger Games|
+-------+----------------+
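
As a cross-check that does not depend on Spark, the same three steps (filter eligible books, total recent sales, keep totals below 10) can be sketched in plain Python on the example data. This is only an illustrative re-implementation under the same assumptions as the cells above (today is 2019-06-23, so the cutoffs are 2019-05-23 and 2018-06-23); the variable names are chosen here and do not come from the notebook.

```python
from datetime import date

# Example data from the problem statement.
books = [
    (1, "Kalila And Demna", date(2010, 1, 1)),
    (2, "28 Letters", date(2012, 5, 12)),
    (3, "The Hobbit", date(2019, 6, 10)),
    (4, "13 Reasons Why", date(2019, 6, 1)),
    (5, "The Hunger Games", date(2008, 9, 21)),
]
orders = [
    (1, 1, 2, date(2018, 7, 26)),
    (2, 1, 1, date(2018, 11, 5)),
    (3, 3, 8, date(2019, 6, 11)),
    (4, 4, 6, date(2019, 6, 5)),
    (5, 4, 5, date(2019, 6, 20)),
    (6, 5, 9, date(2009, 2, 2)),
    (7, 5, 8, date(2010, 4, 13)),
]

available_cutoff = date(2019, 5, 23)  # one month before the assumed today
orders_cutoff = date(2018, 6, 23)     # one year before the assumed today

# Total quantity per book, counting only orders dispatched after the one-year cutoff.
sold = {}
for _order_id, book_id, quantity, dispatch_date in orders:
    if dispatch_date > orders_cutoff:
        sold[book_id] = sold.get(book_id, 0) + quantity

# Books with no recent orders default to 0 sold, mirroring the left join plus fillna.
unpopular = [
    (book_id, name)
    for book_id, name, available_from in books
    if available_from <= available_cutoff and sold.get(book_id, 0) < 10
]

print(unpopular)
# [(1, 'Kalila And Demna'), (2, '28 Letters'), (5, 'The Hunger Games')]
```

The `sold.get(book_id, 0)` default plays the role of the left join followed by `fillna`: without it, books 2 and 5, which have no orders in the last year, would be dropped from the result.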

