You have 3 tables, one called PROFESSOR, one called COURSE, one called SCHEDULE.
<br><br>
**PROFESSOR** table has columns:
- professor_id (unique int)
- professor_department (int)
- professor_name (string)

**COURSE** table has columns:
- course_id (unique int)
- course_department (int)
- course_name (string)

**SCHEDULE** table has columns:
- professor_id
- course_id
- semester ("FALL", "SPRING", or "SUMMER")
- year (int)

Write the query to return unique rows that have two columns, professor_name and course_name, but only where the professor is teaching a course that isn't in his/her department.

In [1]:
import numpy as np
import pandas as pd
import time
from faker import Faker
fake = Faker()

### Generate professor table

In [2]:
n_prof = 30
n_dep = 5
n_course = 25 
seed = 0

professor_name = set([])
while len(professor_name) < n_prof:
    professor_name.add(fake.name())

np.random.seed(seed)
professor_name = list(professor_name)
professor = pd.DataFrame(
    enumerate(professor_name), columns=["professor_id", "professor_name"])
professor["professor_department"] = np.random.choice(range(n_dep), size=n_prof)


### Generate course table

In [3]:
course = pd.DataFrame(
    ((i, "Course_{}".format(i)) for i in range(n_course)),
    columns=["course_id", "course_name"])

course["course_department"] = np.random.choice(range(n_dep), n_course)

### Generate schedule table

In [4]:
n_year = 5
start_year = 2015
most_courses = n_course - 5
least_courses = most_courses - 3
num_courses_semester = np.random.uniform(least_courses, most_courses, n_year*3).astype(int)

schedule_lst = []
for i in range(n_year):
    year = start_year + i
    for semester in ["SPRING", "SUMMER", "FALL"]:
        size = num_courses_semester[i]
        tmp = pd.DataFrame(
            np.random.choice(course["course_id"], size, replace=False), 
            columns=["course_id"])
        tmp["year"] = year
        tmp["semester"] = semester
        tmp["professor_id"] = np.random.choice(professor["professor_id"], size, replace=False)
        schedule_lst.append(tmp)
    
schedule = pd.concat(schedule_lst)

Unnamed: 0,course_id,year,semester,professor_id
0,6,2015,SPRING,8
1,10,2015,SPRING,12
2,24,2015,SPRING,28
3,7,2015,SPRING,5
4,23,2015,SPRING,25
...,...,...,...,...
12,18,2019,FALL,8
13,21,2019,FALL,5
14,13,2019,FALL,22
15,12,2019,FALL,4


In [6]:
professor.to_csv("./csv/professor.csv", index=False)
course.to_csv("./csv/course.csv", index=False)
schedule.to_csv("./csv/schedule.csv", index=False)