# Simpsons: Merging and Concatenation

## Imports

In [None]:
import numpy as np
import pandas as pd
from pandas import DataFrame, Series

## Relationships between DataFrames

When multiple `DataFrame`s share common keys, the entities they describe can be related to one another. Three types of entity **relationships** are possible:

* 1-to-1
* 1-to-many
* many-to-many
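As a rough illustration, the relationship type can be read off from whether the key values repeat on each side. The sketch below uses a hypothetical helper, `relationship`, which is not part of pandas, and made-up key lists:

```python
import pandas as pd

def relationship(left_keys, right_keys):
    """Classify the relationship implied by two key columns (hypothetical helper)."""
    left_unique = pd.Series(left_keys).is_unique
    right_unique = pd.Series(right_keys).is_unique
    if left_unique and right_unique:
        return '1-to-1'        # no key repeats on either side
    if left_unique or right_unique:
        return '1-to-many'     # keys repeat on exactly one side
    return 'many-to-many'      # keys repeat on both sides

print(relationship(['a', 'b', 'c'], ['a', 'b', 'c']))
print(relationship(['a', 'b', 'c'], ['a', 'a', 'b']))
print(relationship(['a', 'a', 'b'], ['a', 'b', 'b']))
```

This only inspects key uniqueness; it does not check that the keys on the two sides actually match.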

Here is a small data set from the TV show [The Simpsons](https://en.wikipedia.org/wiki/The_Simpsons) to illustrate these relationships.

First, here is a `DataFrame` with the students' first and last names, indexed by a unique student id:

In [None]:
students = DataFrame({'fname': ['Bart','Lisa','Milhouse'],
                      'lname': ['Simpson','Simpson','Van Houten']},
                     index=list('abc'))
students

Here is a `DataFrame` with the students' social security numbers, indexed by their unique student id:

In [None]:
ssns = DataFrame({'ssn':[1234,5678,9101]}, index=list('abc'))
ssns

Each student can have aliases or nicknames:

In [None]:
aliases = DataFrame({'alias':['Bartman','Bartron','Cosmos','Truth Teller','Lady Penelope Ariel',
                              'Jake Boyman','Lou La Trec','Eagle Eye','Maestro'],
                     'student': list('aaabbbccc')})
aliases

Here are the students' home addresses, again indexed by student id:

In [None]:
addresses = DataFrame({'address':['742 Evergreen Terrace','742 Evergreen Terrace','316 Pikeland Ave.']},
                      index=list('abc'))
addresses

A table of courses the students can be enrolled in:

In [None]:
courses = DataFrame({'name':['Biology','Math','PE','Underwater electronics']}, index=range(4))
courses

This table contains the enrollment for each course. Each row pairs a student id (the `student` column) with a course id (the index, which matches the index of `courses`).

In [None]:
enroll = DataFrame({'student':['a','b','b','c','c','c']},index=(2,0,1,0,1,2))
enroll

## 1-1 relationships

* Each student has exactly one SSN.
* Each SSN belongs to exactly one student.
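A minimal sketch of a 1-to-1 merge on a shared index, using hypothetical toy tables (`people`, `badges`) rather than the notebook's data. The `validate='one_to_one'` argument asks pandas to raise an error if the keys are not unique on both sides:

```python
import pandas as pd

# Hypothetical toy tables, both indexed by the same unique key.
people = pd.DataFrame({'name': ['Ann', 'Bob']}, index=['x', 'y'])
badges = pd.DataFrame({'badge': [101, 102]}, index=['x', 'y'])

# Merge on the index of both frames and check the 1-to-1 assumption.
joined = pd.merge(people, badges, left_index=True, right_index=True,
                  validate='one_to_one')
print(joined)
```

Because every key appears exactly once on each side, the result has the same number of rows as either input.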

Create a `DataFrame` with the students' first name, last name and social security number:

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
merge1

In [None]:
assert list(merge1.columns)==['fname', 'lname', 'ssn']
assert list(merge1.index)==list('abc')

## 1-many relationships

### Students and addresses

* Each student has exactly one address.
* Each address can have many students.
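One way to see the "many" side of such a relationship is to join on the shared index and then count how often each value occurs. The sketch below uses hypothetical toy tables (`people`, `rooms`), not the notebook's data:

```python
import pandas as pd

# Hypothetical toy tables: each person has one room, rooms can be shared.
people = pd.DataFrame({'name': ['Ann', 'Bob', 'Cy']}, index=['x', 'y', 'z'])
rooms = pd.DataFrame({'room': ['R1', 'R1', 'R2']}, index=['x', 'y', 'z'])

# DataFrame.join aligns the two frames on their index.
joined = people.join(rooms)

# Count the people per room to expose the 1-to-many structure.
per_room = joined.groupby('room').size()
print(joined)
print(per_room)
```

Since the key (the index) is still unique in both tables, the join itself keeps one row per person; the repetition shows up only in the `room` values.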

Create a `DataFrame` with the students' first name, last name and address:

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
merge2

In [None]:
assert list(merge2.columns)==['fname', 'lname', 'address']
assert list(merge2.index)==list('abc')

### Students and aliases

* Each student can have many aliases.
* Each alias belongs to exactly one student.
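When the key is the index of one table but an ordinary column of the other, `pd.merge` can mix `left_index` with `right_on`. A small sketch with hypothetical toy tables (`owners`, `pets`), not the notebook's data:

```python
import pandas as pd

# Hypothetical toy tables: owners are indexed by id, pets refer to that id
# through an ordinary column.
owners = pd.DataFrame({'name': ['Ann', 'Bob']}, index=['x', 'y'])
pets = pd.DataFrame({'pet': ['cat', 'dog', 'fish'],
                     'owner': ['x', 'x', 'y']})

# Match the index of `owners` against the `owner` column of `pets`.
joined = pd.merge(owners, pets, left_index=True, right_on='owner')

# Move the key column into the index and fix the column order explicitly.
joined = joined.set_index('owner')[['name', 'pet']]
print(joined)
```

Each owner row is repeated once per matching pet, so the result has as many rows as the "many" side.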

Create a `DataFrame` with the students' first name, last name and alias. The index of the `DataFrame` should hold the values of the `student` column (a, b, c).

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
merge3

In [None]:
assert list(merge3.columns)==['fname', 'lname', 'alias']
assert list(merge3.index)==list('aaabbbccc')

## Many-many relationships

* A student can take multiple classes.
* A single class can have multiple students.
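A many-to-many relationship is commonly stored in a separate link table with one row per pair, and resolved with two merges. The sketch below assumes hypothetical toy tables (`authors`, `books`, `wrote`) rather than the notebook's data:

```python
import pandas as pd

# Hypothetical toy tables.
authors = pd.DataFrame({'author': ['Ann', 'Bob']}, index=['x', 'y'])
books = pd.DataFrame({'title': ['T0', 'T1']}, index=[0, 1])

# Link table: one row per (author id, book id) pair.
wrote = pd.DataFrame({'author_id': ['x', 'x', 'y'],
                      'book_id': [0, 1, 1]})

# First attach the author details, then the book details.
step1 = pd.merge(wrote, authors, left_on='author_id', right_index=True)
result = pd.merge(step1, books, left_on='book_id', right_index=True)
print(result)
```

With the default `how='inner'`, rows without a match on the other side are dropped; `how='outer'` keeps them and fills the missing fields with `NaN`.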

Create a `DataFrame` with the students' first name, last name, student id (a, b, c) and course name. Multiple merges may be required.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
merge4

In [None]:
assert list(merge4.columns)==['fname', 'lname', 'student', 'name']
assert len(merge4)==7

## Concatenation

Use pandas' `concat` function to combine the `students` and `ssns` `DataFrame`s by columns:
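As a reminder of how column-wise concatenation behaves, here is a small sketch on hypothetical toy tables (`left`, `right`). With `axis=1`, `pd.concat` places the frames side by side and aligns rows by index label, not by position:

```python
import pandas as pd

# Hypothetical toy tables whose index labels are in different orders.
left = pd.DataFrame({'p': [1, 2]}, index=['x', 'y'])
right = pd.DataFrame({'q': [3, 4]}, index=['y', 'x'])

# axis=1 concatenates along the columns, aligning rows on the index.
wide = pd.concat([left, right], axis=1)
print(wide)
```

Any number of frames can be passed in the list; labels missing from one frame would produce `NaN` in that frame's columns under the default outer join.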

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
concat1

In [None]:
assert list(concat1.columns)==['fname', 'lname', 'ssn']
assert list(concat1.index)==list('abc')

Do the same thing for the `students`, `ssns` and `addresses` `DataFrame`s:

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
concat2

In [None]:
assert list(concat2.columns)==['fname', 'lname', 'ssn', 'address']
assert list(concat2.index)==list('abc')