Cookbook Recipes

You are given a table with the titles of recipes from a cookbook and their page numbers. Your task is to show how the recipes are distributed across the pages of the book.
Produce a table consisting of three columns: left_page_number, left_title and right_title. The k-th row (counting from 0) should contain the number and the title of the page with the number 2×k in the first and second columns respectively, and the title of the page with the number 2×k+1 in the third column.
Each page contains at most 1 recipe. If the page does not contain a recipe, the appropriate cell should remain empty (NULL value). Page 0 (the internal side of the front cover) is guaranteed to be empty.

In [1]:
import pandas as pd

In [2]:
cookbook_titles = pd.read_excel("../CSV/cookbook_titles.xlsx")
cookbook_titles

Unnamed: 0,Tаблица 1,Unnamed: 1
0,page_number,title
1,1,Scrambled eggs
2,2,Fondue
3,3,Sandwich
4,4,Tomato soup
5,6,Liver
6,11,Fried duck
7,12,Boiled duck
8,15,Baked chicken


In [3]:
dic={}
for i, j in cookbook_titles.iloc[:1].items():
    dic[i] = j.values[0]
dic

{'Tаблица 1': 'page_number', 'Unnamed: 1': 'title'}

In [4]:
cookbook_titles.rename(columns=dic, inplace=True)
cookbook_titles.drop(0, inplace=True)
cookbook_titles

Unnamed: 0,page_number,title
1,1,Scrambled eggs
2,2,Fondue
3,3,Sandwich
4,4,Tomato soup
5,6,Liver
6,11,Fried duck
7,12,Boiled duck
8,15,Baked chicken


In [5]:
series = pd.DataFrame({'page_number': range(max(cookbook_titles.page_number) + 1)})
series

Unnamed: 0,page_number
0,0
1,1
2,2
3,3
4,4
5,5
6,6
7,7
8,8
9,9
10,10
11,11
12,12
13,13
14,14
15,15


In [6]:
df = series.merge(cookbook_titles, how='left', on='page_number')
df

Unnamed: 0,page_number,title
0,0,
1,1,Scrambled eggs
2,2,Fondue
3,3,Sandwich
4,4,Tomato soup
5,5,
6,6,Liver
7,7,
8,8,
9,9,
10,10,
11,11,Fried duck
12,12,Boiled duck
13,13,
14,14,
15,15,Baked chicken


In [7]:
df['row_num'] = df.page_number // 2
df

Unnamed: 0,page_number,title,row_num
0,0,,0
1,1,Scrambled eggs,0
2,2,Fondue,1
3,3,Sandwich,1
4,4,Tomato soup,2
5,5,,2
6,6,Liver,3
7,7,,3
8,8,,4
9,9,,4
10,10,,5
11,11,Fried duck,5
12,12,Boiled duck,6
13,13,,6
14,14,,7
15,15,Baked chicken,7


In [8]:
df_left = df[df.page_number % 2 == 0].groupby('row_num').first().reset_index()
df_left

Unnamed: 0,row_num,page_number,title
0,0,0,
1,1,2,Fondue
2,2,4,Tomato soup
3,3,6,Liver
4,4,8,
5,5,10,
6,6,12,Boiled duck
7,7,14,


In [9]:
df_right = df[df.page_number % 2 == 1].groupby('row_num').first().reset_index()
df_right

Unnamed: 0,row_num,page_number,title
0,0,1,Scrambled eggs
1,1,3,Sandwich
2,2,5,
3,3,7,
4,4,9,
5,5,11,Fried duck
6,6,13,
7,7,15,Baked chicken


In [10]:
result = df_left.merge(df_right, how='outer', on='row_num').sort_values(by='row_num')[
    ['page_number_x','title_x', 'title_y']].rename(
        columns={'page_number_x': 'left_page_number', 'title_x': 'left_title', 'title_y': 'right_title'})
result

Unnamed: 0,left_page_number,left_title,right_title
0,0,,Scrambled eggs
1,2,Fondue,Sandwich
2,4,Tomato soup,
3,6,Liver,
4,8,,
5,10,,Fried duck
6,12,Boiled duck,
7,14,,Baked chicken


Solution Walkthrough
This solution involves using the pandas library in Python to manipulate and merge data frames. The problem asks us to represent how recipes will be distributed in a cookbook, by creating a table with three columns: left_page_number, left_title, and right_title. Each row in the table represents a pair of facing pages in the cookbook, with the left page number and title in the first two columns, and the right page title in the third column.

To solve this problem, we will perform the following steps:

1. Understand the data and tables involved
2. Define the problem statement and requirements
3. Break down the given code and explain its functionality
4. Bring it all together to understand the final solution

Let's now dive into each of these steps.

Understanding The Data
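The data provided consists of a table of recipe titles from a cookbook and their corresponding page numbers. The table has two columns: page_number and title. Each row represents one recipe and the page on which it appears. As loaded from the spreadsheet, the real column names sit in the first data row, which is why cells In [3] and In [4] rename the columns and drop that row before any further processing.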
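The header clean-up can be reproduced without the spreadsheet. The sketch below is only an illustration: the frame raw is a hand-built stand-in for what read_excel returned above, and the names raw and titles are not part of the original notebook. It also converts the page numbers to integers, a step the notebook does not perform explicitly.

```python
import pandas as pd

# Hand-built stand-in for the raw sheet: the real header sits in row 0.
raw = pd.DataFrame(
    {
        "Tаблица 1": ["page_number", 1, 2, 3, 4, 6, 11, 12, 15],
        "Unnamed: 1": [
            "title", "Scrambled eggs", "Fondue", "Sandwich", "Tomato soup",
            "Liver", "Fried duck", "Boiled duck", "Baked chicken",
        ],
    }
)

# Promote the first row to column names, then keep only the data rows.
titles = raw.iloc[1:].copy()
titles.columns = raw.iloc[0].tolist()
titles = titles.reset_index(drop=True)

# The column was read as generic objects; make the page numbers integers.
titles["page_number"] = titles["page_number"].astype(int)

print(titles)
```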

The Problem Statement
The task is to create a table with three columns: left_page_number, left_title, and right_title. Each row corresponds to a pair of facing pages in the cookbook. For row k (counting from 0), the first two columns hold the page number and title of page 2 * k, and the third column holds the title of page 2 * k + 1. A page without a recipe leaves its title cell empty (NULL), and page 0 is always empty. Rows appear in order of k; the solution tracks k in a helper column named row_num, which is not part of the output.
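To make the row layout concrete before turning to pandas, here is a small plain-Python sketch of the same mapping. It is not the notebook's method; the dictionary titles_by_page and the list rows are illustrative names, filled with the sample data shown earlier.

```python
# Page -> title lookup from the sample data; absent pages hold no recipe.
titles_by_page = {
    1: "Scrambled eggs", 2: "Fondue", 3: "Sandwich", 4: "Tomato soup",
    6: "Liver", 11: "Fried duck", 12: "Boiled duck", 15: "Baked chicken",
}

last_page = max(titles_by_page)

# Row k pairs the left page 2*k with the right page 2*k + 1.
rows = [
    (2 * k, titles_by_page.get(2 * k), titles_by_page.get(2 * k + 1))
    for k in range(last_page // 2 + 1)
]

for row in rows:
    print(row)
```

dict.get returns None for a page without a recipe, which plays the role of the NULL cell.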

Breaking Down The Code
Let's break down the given code and explain its functionality step by step:

series = pd.DataFrame(
    {"page_number": range(max(cookbook_titles.page_number) + 1)}
)
This statement creates a new DataFrame called series with a single column named page_number. Because range(n) stops at n - 1, range(max(cookbook_titles.page_number) + 1) yields every page number from 0 up to and including the highest page that holds a recipe (0 through 15 for the sample data), so pages without a recipe also get a row.
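A quick check of the off-by-one reasoning, assuming the sample page numbers from the table above (the variable page_numbers is introduced here only for illustration):

```python
import pandas as pd

# Page numbers that carry a recipe in the sample data.
page_numbers = pd.Series([1, 2, 3, 4, 6, 11, 12, 15])

# range(max + 1) covers 0..max inclusive, so page 0 and the gaps get rows too.
series = pd.DataFrame({"page_number": range(page_numbers.max() + 1)})

print(len(series))
print(series["page_number"].tolist())
```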
df = series.merge(cookbook_titles, how="left", on="page_number")
This line merges series with cookbook_titles on the shared page_number column. With how='left', every page number from series is kept in the resulting DataFrame df; pages that have no match in cookbook_titles get a missing value (NaN) in the title column.
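The effect of the left merge is easiest to see on a shortened example. The three-recipe frame below is made up for brevity, and the names titles and pages are illustrative:

```python
import pandas as pd

# Shortened sample: recipes on pages 1, 2 and 4 only.
titles = pd.DataFrame(
    {"page_number": [1, 2, 4], "title": ["Scrambled eggs", "Fondue", "Tomato soup"]}
)
pages = pd.DataFrame({"page_number": range(titles["page_number"].max() + 1)})

# how="left" keeps all five pages; pages 0 and 3 get a missing title.
df = pages.merge(titles, how="left", on="page_number")

print(df)
```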
df["row_num"] = df.page_number // 2
This line adds a column called row_num to df: the index k of each pair of facing pages, computed by integer division of page_number by 2. Pages 2 * k and 2 * k + 1 therefore share the same row_num.
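As a small illustration of how integer division and the remainder split a page number into a row and a side, consider the sketch below. The side column is an addition for readability only and does not appear in the original code.

```python
import pandas as pd

df = pd.DataFrame({"page_number": range(6)})

# Integer division gives the facing-pair index k.
df["row_num"] = df["page_number"] // 2

# The remainder tells left (even) from right (odd) pages.
df["side"] = df["page_number"].mod(2).map({0: "left", 1: "right"})

print(df)
```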
df_left = (
    df[df.page_number % 2 == 0]
    .groupby("row_num")
    .first()
    .reset_index()
)
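This statement keeps only the rows of df whose page_number is even (divisible by 2), which are the left pages of each pair. The rows are then grouped by row_num and first() is applied. Note that first() returns the first non-null value of each column within a group, not the first row as such. Since each group contains exactly one row after the filter, the grouping does not change which titles appear; its visible effect is that row_num becomes the leading column once reset_index() is called. The resulting DataFrame df_left contains the left page number and title for each row.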
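The difference between "first non-null value" and "first row" only shows up when a group has more than one row. The toy frame below is constructed for this purpose and is not taken from the cookbook data; head(1) is used here as one way to get the literal first row per group.

```python
import numpy as np
import pandas as pd

# Group 0 has a missing title followed by a real one; group 1 is all missing.
demo = pd.DataFrame(
    {"row_num": [0, 0, 1], "title": [np.nan, "Fondue", np.nan]}
)

# first() skips nulls column by column within each group.
first_non_null = demo.groupby("row_num").first()

# head(1) keeps the literal first row of each group, nulls included.
first_row = demo.groupby("row_num").head(1).set_index("row_num")

print(first_non_null)
print(first_row)
```

In the cookbook solution this distinction is harmless, because the parity filter leaves one row per group.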
df_right = (
    df[df.page_number % 2 == 1]
    .groupby("row_num")
    .first()
    .reset_index()
)
This statement keeps only the rows of df whose page_number is odd (not divisible by 2), which are the right pages of each pair. As with df_left, the rows are grouped by row_num and first() takes the first non-null value of each column per group; with one row per group this leaves the titles as they were. The resulting DataFrame df_right contains the right page number and title for each row.
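Because the parity filter already leaves one row per row_num, the grouping step could arguably be replaced by a plain column selection. The comparison below is a sketch on a six-page example; right_grouped and right_plain are illustrative names, and the second variant is an alternative, not what the notebook does.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "page_number": range(6),
        "title": [None, "Scrambled eggs", "Fondue", "Sandwich", "Tomato soup", None],
    }
)
df["row_num"] = df["page_number"] // 2
is_right = df["page_number"] % 2 == 1

# Variant used in the notebook: filter, group, take first().
right_grouped = df[is_right].groupby("row_num").first().reset_index()

# Alternative: filter and reorder the columns, no grouping.
right_plain = df.loc[is_right, ["row_num", "page_number", "title"]].reset_index(drop=True)

print(right_grouped)
print(right_plain)
```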
result = (
    df_left.merge(df_right, how="outer", on="row_num")
    .sort_values(by="row_num")[
        ["page_number_x", "title_x", "title_y"]
    ]
    .rename(
        columns={
            "page_number_x": "left_page_number",
            "title_x": "left_title",
            "title_y": "right_title",
        }
    )
)
This statement merges df_left and df_right on row_num to create the final DataFrame result. Both inputs have page_number and title columns, so pandas appends its default suffixes: _x for columns coming from df_left and _y for columns coming from df_right. how='outer' keeps every row_num found in either DataFrame, so a final left page with no right-hand counterpart is not dropped. The merged frame is sorted by row_num, reduced to page_number_x, title_x and title_y, and those columns are renamed to left_page_number, left_title, and right_title to match the problem statement's requirements.
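The suffix behaviour and the role of the outer join can be seen on a two-row example. The sketch below passes explicit suffixes ("_left", "_right") instead of the defaults purely for readability; the frames left and right are made up, with the last left page deliberately lacking a right-hand partner.

```python
import pandas as pd

left = pd.DataFrame(
    {"row_num": [0, 1], "page_number": [0, 2], "title": [None, "Fondue"]}
)
right = pd.DataFrame(
    {"row_num": [0], "page_number": [1], "title": ["Scrambled eggs"]}
)

# Overlapping column names get the suffixes; the outer join keeps row_num 1
# even though it has no match on the right side.
merged = left.merge(right, how="outer", on="row_num", suffixes=("_left", "_right"))

print(merged)
```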
Bringing It All Together
The given code can be summarized as follows:

1. Create a DataFrame series with a page_number column covering every page from 0 to the highest page that holds a recipe.
2. Left-merge series with cookbook_titles to obtain a DataFrame df with all page numbers and their titles, where pages without a recipe have a missing title.
3. Compute the row_num column in df as page_number // 2, the index of each pair of facing pages.
4. Keep the even pages of df, group by row_num, and take first() to obtain df_left, one row per left page.
5. Keep the odd pages of df, group by row_num, and take first() to obtain df_right, one row per right page.
6. Outer-merge df_left and df_right on row_num, sort by row_num, select the three required columns, and rename them to match the problem statement, giving the final DataFrame result.
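The steps above can be collected into one function. This is a condensed sketch rather than the notebook's exact code: the function name facing_pages is invented here, the groupby step is left out on the assumption that each page number occurs at most once, and the input is assumed to already have clean page_number and title columns.

```python
import pandas as pd


def facing_pages(titles: pd.DataFrame) -> pd.DataFrame:
    """Lay out recipes as (left_page_number, left_title, right_title) rows."""
    # One row per page, from 0 to the last page that holds a recipe.
    last_page = int(titles["page_number"].max())
    pages = pd.DataFrame({"page_number": range(last_page + 1)})
    df = pages.merge(titles, how="left", on="page_number")
    df["row_num"] = df["page_number"] // 2

    # Even pages are on the left, odd pages on the right.
    left = df[df["page_number"] % 2 == 0]
    right = df[df["page_number"] % 2 == 1]

    merged = left.merge(right, how="outer", on="row_num").sort_values("row_num")
    result = merged[["page_number_x", "title_x", "title_y"]].rename(
        columns={
            "page_number_x": "left_page_number",
            "title_x": "left_title",
            "title_y": "right_title",
        }
    )
    return result.reset_index(drop=True)


titles = pd.DataFrame(
    {
        "page_number": [1, 2, 3, 4, 6, 11, 12, 15],
        "title": [
            "Scrambled eggs", "Fondue", "Sandwich", "Tomato soup",
            "Liver", "Fried duck", "Boiled duck", "Baked chicken",
        ],
    }
)
result = facing_pages(titles)
print(result)
```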
Conclusion
In this walkthrough, we have explained the given code step by step to understand its functionality and how it solves the problem statement. The code involves manipulating DataFrames using the pandas library in Python to represent the distribution of recipes in a cookbook.