# One-to-many relationships

In the previous lesson, we explored how to merge DataFrames using the `merge` method. Now, we will move on to understanding different kinds of relationships between tables, with a focus on the concept of a **one-to-many relationship**. Before diving into that, let’s briefly revisit the idea of a **one-to-one relationship**.

A **one-to-one relationship** occurs when each row in one table corresponds to exactly one row in another table. In other words, there is a direct match between entries across both tables, and no row is linked to multiple rows on the other side.
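As a minimal sketch of a one-to-one merge, the example below uses two small made-up tables (`licenses_toy` and `addresses_toy` are hypothetical and not part of the lesson's datasets). Each `account` value appears exactly once in both tables, so the merged result has the same number of rows as the left table. The optional `validate="one_to_one"` argument asks pandas to raise an error if the keys turn out not to be unique on either side.

```python
import pandas as pd

# Hypothetical toy tables: every account appears exactly once in each.
licenses_toy = pd.DataFrame(
    {"account": [1, 2, 3], "business": ["Cafe A", "Shop B", "Bar C"]}
)
addresses_toy = pd.DataFrame(
    {"account": [1, 2, 3], "zip": ["60601", "60602", "60603"]}
)

# validate="one_to_one" makes pandas check that the merge keys are
# unique in both tables before merging.
one_to_one = licenses_toy.merge(addresses_toy, on="account", validate="one_to_one")

# Row counts before and after the merge are identical.
print(len(licenses_toy), len(one_to_one))
```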

On the other hand, a **one-to-many relationship** happens when each row in the first table can be linked to one or multiple rows in the second table. This type of relationship naturally leads to the repetition of values from the first table in the merged output, since one entry connects to several records on the other side.

When performing a merge in a one-to-many scenario, pandas handles this relationship without requiring special instructions. The important point is that each row of the left table is repeated once for every matching row in the right table, so the resulting DataFrame can have more rows than the left table. With the default inner join, left rows that have no match in the right table are dropped from the result.
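To make the row repetition concrete, here is a small sketch with made-up data (`licenses_toy` and `owners_toy` are hypothetical stand-ins, not the lesson's real tables). Account 10 has three owners, account 20 has one, and account 30 has none.

```python
import pandas as pd

# Hypothetical left table: one row per account.
licenses_toy = pd.DataFrame(
    {"account": [10, 20, 30], "business": ["Cafe A", "Shop B", "Bar C"]}
)

# Hypothetical right table: an account can appear several times.
owners_toy = pd.DataFrame(
    {
        "account": [10, 10, 10, 20],
        "title": ["PRESIDENT", "SECRETARY", "VICE PRESIDENT", "SOLE PROPRIETOR"],
    }
)

# Default inner merge; validate="one_to_many" checks that the keys
# are unique in the left table only.
merged = licenses_toy.merge(owners_toy, on="account", validate="one_to_many")

# "Cafe A" is repeated for each of its three owners, and "Bar C"
# is absent because account 30 has no match.
print(merged)
print(len(licenses_toy), len(merged))
```

The left table has three rows while the merged table has four: the three owners of account 10 plus the single owner of account 20.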


## Prepare Data

In [2]:
# Import pandas library
import pandas as pd

# Load the licenses and business owners tables from pickle files
licenses = pd.read_pickle("datasets/licenses.p")
biz_owners = pd.read_pickle("datasets/business_owners.p")

## Exercise: One-to-many merge

In this task, you’ll practice working with one-to-many relationships by combining the `licenses` table with the `biz_owners` table. Since a single business can have multiple owners, merging these tables will show how rows from the left table may be repeated when connected to multiple matches in the right table. After merging, you will identify the most frequent owner titles, such as CEO, secretary, or vice president.

Both the `licenses` and `biz_owners` DataFrames have already been provided.

### Instructions

1. Merge the `licenses` table (left) with the `biz_owners` table (right) using the `account` column, and store the result as `licenses_owners`.
2. Group the merged DataFrame by the `title` column and count how many accounts fall under each title. Save this as `counted_df`.
3. Sort `counted_df` by the account counts in descending order, and assign the result to `sorted_df`.
4. Display the first few rows of `sorted_df` using `.head()`.

In [3]:
# Merge licenses with biz_owners on the 'account' column
licenses_owners = licenses.merge(biz_owners, on="account")

# Count accounts for each owner title (the result is a Series indexed by title)
counted_df = licenses_owners.groupby("title")["account"].count()

# Sort the results in descending order
sorted_df = counted_df.sort_values(ascending=False)

# View the first few rows
print(sorted_df.head())

title
PRESIDENT          6259
SECRETARY          5205
SOLE PROPRIETOR    1658
OTHER              1200
VICE PRESIDENT      970
Name: account, dtype: int64
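As a cross-check on the group, count, and sort steps, the sketch below runs the same pipeline on a small made-up merged table and compares it with `value_counts()`, which counts and sorts in one call. The toy DataFrames are hypothetical, and the comparison assumes the `account` column has no missing values, since `count()` skips them.

```python
import pandas as pd

# Hypothetical toy tables standing in for licenses and biz_owners.
licenses_toy = pd.DataFrame({"account": [10, 20, 30]})
owners_toy = pd.DataFrame(
    {
        "account": [10, 10, 20, 30, 30],
        "title": ["PRESIDENT", "SECRETARY", "PRESIDENT", "PRESIDENT", "SECRETARY"],
    }
)
merged = licenses_toy.merge(owners_toy, on="account")

# Same steps as in the exercise: group, count, sort descending.
by_group = merged.groupby("title")["account"].count().sort_values(ascending=False)

# Alternative: value_counts() returns counts sorted in descending order.
by_value_counts = merged["title"].value_counts()

print(by_group)
print(by_value_counts)
```

Both approaches give the same counts per title here. The order of tied titles may differ between the two.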
