# Setup Connection
The first step in working with SQL is to establish a **connection** to the database. <br>
SQLite can be queried through the **sqlite3 library** directly or through **pandas**, and the two behave slightly differently. 
Using the sqlite3 library, I **form a connection named 'conn'** and then read the desired database with **pandas 'read_sql'**. <br>
Triple quotes (**''' '''**) wrap the SQL statement that the user wants to run. <br>
1. Establish a connection. If the database file doesn't exist, sqlite3 creates a new, empty database with the name given. 
2. Execute a command to request data from a table. 
3. Close the connection once the task is completed. 
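
The three steps can be sketched with the standard `sqlite3` module alone. This is a minimal sketch, not the notebook's own data: the in-memory database (`":memory:"`), the table `persons_demo` and its two rows are assumptions made up for illustration.

```python
import sqlite3

# Step 1: connect. ":memory:" (an assumption for this sketch) creates a
# temporary database in RAM instead of opening a file on disk.
conn_demo = sqlite3.connect(":memory:")

# Hypothetical toy table standing in for a real one.
conn_demo.execute("CREATE TABLE persons_demo (person_id TEXT, primary_name TEXT);")
conn_demo.execute("INSERT INTO persons_demo VALUES ('p1', 'Ada'), ('p2', 'Grace');")

# Step 2: run a query and fetch the rows as a list of tuples.
rows = conn_demo.execute(
    "SELECT * FROM persons_demo ORDER BY person_id;"
).fetchall()
print(rows)

# Step 3: close the connection when the task is finished.
conn_demo.close()
```

The same `conn_demo` object could be passed to `pd.read_sql` before closing to get a DataFrame instead of a list of tuples.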

In [1]:
import sqlite3 as sql
import pandas as pd

conn = sql.connect('data/im.db/im.db/im.db')   # connect to the database
pd.read_sql('''SELECT * FROM persons;''', conn)   # Select all data from the table, persons

### Same result but with SQL syntax

# cur = conn.cursor()   # Cursor object used to execute SQL commands
# cur.execute("""SELECT * FROM persons;""")   # Execute the command, but does not display like pandas
# cur.fetchall()   # Return the result of the executed command. Notice the result is a list of tuples
# cur.description   # Column metadata for the last query (the first item of each entry is the column name)
# pd.DataFrame(
#     data=cur.execute("""SELECT * FROM persons;""").fetchall(),
#     columns=[x[0] for x in cur.description]
# )

Unnamed: 0,person_id,primary_name,birth_year,death_year,primary_profession
0,nm0061671,Mary Ellen Bauder,,,"miscellaneous,production_manager,producer"
1,nm0061865,Joseph Bauer,,,"composer,music_department,sound_department"
2,nm0062070,Bruce Baum,,,"miscellaneous,actor,writer"
3,nm0062195,Axel Baumann,,,"camera_department,cinematographer,art_department"
4,nm0062798,Pete Baxter,,,"production_designer,art_department,set_decorator"
...,...,...,...,...,...
606643,nm9990381,Susan Grobes,,,actress
606644,nm9990690,Joo Yeon So,,,actress
606645,nm9991320,Madeline Smith,,,actress
606646,nm9991786,Michelle Modigliani,,,producer


## CLAUSE Example
Clauses are the keywords that make up a SQL statement; together they describe which data to retrieve or change. <br>
**CREATE** - Create a table from scratch <br>
**SELECT** - Select which columns to display. * selects every column instead. <br>
**WHERE** - Filter rows with a condition, similar to if logic <br>
**GROUP BY** - Group the rows based on one or more columns. <br>
**HAVING** - Condition on a group, checked after an aggregate function has processed the columns. Written after GROUP BY <br>
Two built-in string functions are also useful: <br>
**length()** - Return the length of a string <br>
**substr()** - Return a specific part of a string <br> 

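| Data Type | Description |
| :-----: | :------: |
| **INTEGER** | Signed whole number. A 32-bit signed integer spans -2147483648 to +2147483647 because its 2^32 possible values are split between negative and non-negative numbers; SQLite stores integers in up to 8 bytes, so it accepts larger values as well |
| **REAL** | Floating-point number, stored by SQLite as an 8-byte IEEE float |
| **TEXT** | A string of any size |
| **CHAR** | Character string; SQLite treats CHAR columns as TEXT |
| **DATE** | Date; SQLite has no separate date storage type, so dates are usually kept as TEXT such as YYYY-MM-DD |
| **BLOB** | Binary data stored exactly as given, such as a large file |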
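
SQLite's built-in `typeof()` function reports how a value is actually stored, which is a quick way to check the types in the table above. A minimal sketch, assuming an in-memory database that is not part of the notebook's data:

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")  # temporary database, an assumption of this sketch

# typeof() returns the storage class SQLite uses for each value.
storage_classes = conn_demo.execute(
    "SELECT typeof(42), typeof(3.14), typeof('abc'), typeof(x'00ff'), typeof(NULL);"
).fetchone()
print(storage_classes)

conn_demo.close()
```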


## CREATE Clause
Function: Create a table in a database.
Requirement: A connection to a database is needed. The sqlite3 library creates the database file if it does not already exist.
1. Add a **primary key** to identify each row uniquely so that tables can be connected and related. 
2. A table is created **empty**, with its columns **defined in advance**. 

**Note:** In this notebook the **sqlite3** library is used to create tables and to **manipulate data**, while **pandas** `read_sql` is used only to **view** the result of a query and does **not modify** the data. 
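
To illustrate the primary key point, here is a minimal sketch in which a second row with an already used key is rejected. The table `games_demo`, its rows and the in-memory database are hypothetical and exist only for this example.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")  # temporary database, an assumption of this sketch

# Hypothetical table: id is the primary key, so each value may appear only once.
conn_demo.execute("""
    CREATE TABLE games_demo (
        id INTEGER PRIMARY KEY,
        name TEXT,
        price REAL
    );
""")
conn_demo.execute("INSERT INTO games_demo (id, name, price) VALUES (1, 'Chess', 9.99);")

try:
    # Reusing id = 1 violates the primary key constraint.
    conn_demo.execute("INSERT INTO games_demo (id, name, price) VALUES (1, 'Go', 14.5);")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn_demo.execute("SELECT count(*) FROM games_demo;").fetchone()[0]
print(duplicate_rejected, row_count)

conn_demo.close()
```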

In [2]:
# SQL command to execute, stored as a string
command = '''CREATE TABLE games(
            id INTEGER,
            name TEXT,
            price REAL
            );'''

# Form a connection to the test database
test_conn = sql.connect('data/test_db.db')

# Execute command to create table
test_conn.execute(command)

# Select all data from the table, games. Should be empty
display(pd.read_sql('''SELECT * FROM games;''', test_conn))

# Drop the Table to be recreated again once code runs 
command = '''DROP TABLE games;'''

# Execute command to drop the table 
test_conn.execute(command)

# Close the database
test_conn.close()

Unnamed: 0,id,name,price


## DELETE FROM 
Function: Delete rows from a table. An optional WHERE condition limits which rows are removed; without it, every row in the table is deleted.
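
A minimal sketch of a conditional delete, run against a hypothetical `games_demo` table in an in-memory database (both are assumptions for illustration, not the notebook's `games` table):

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")  # temporary database, an assumption of this sketch
conn_demo.execute("CREATE TABLE games_demo (id INTEGER, name TEXT, price REAL);")
conn_demo.executemany(
    "INSERT INTO games_demo VALUES (?, ?, ?);",
    [(1, "Chess", 9.99), (2, "Console bundle", 299.0), (3, "Go", 59.5)],
)

# Remove only the rows whose price is above 50.
cursor = conn_demo.execute("DELETE FROM games_demo WHERE price > 50;")
deleted = cursor.rowcount  # number of rows the DELETE removed

remaining = conn_demo.execute("SELECT name FROM games_demo;").fetchall()
print(deleted, remaining)

conn_demo.commit()  # make the change permanent before closing
conn_demo.close()
```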

In [3]:
# Delete every game that costs more than 50
command = '''DELETE FROM games
            WHERE price > 50;'''

## SELECT Command
**Function**: Select the columns to retrieve from a table in the database. 
1. **\*** can be used to **select all columns**.
2. Column names can be **listed** to select **specific columns**.
3. The **AS clause** can be used to **rename** the columns to different names. <br>
    **Note**: The query can still refer to the **original name**, but the resulting DataFrame **displays the given name**. 
4. **SQLite** allows **tableName.columnName** to be used as well. 
5. Columns appear in the DataFrame **in the order they are listed** after **SELECT**. <br> 
    **Note**: Listing the columns explicitly makes the order clear, and a column can be listed more than once to repeat it. 

In [4]:
# Select all columns from the table, persons 
display(pd.read_sql('''SELECT * FROM persons LIMIT 5;''',conn )) 

# Select all ID and Name from the table, persons 
display(pd.read_sql('''SELECT person_id, primary_name FROM persons LIMIT 5;''',conn )) 

# Select person ID and rename to ID from the table, persons 
display(pd.read_sql('''SELECT person_id AS ID FROM persons LIMIT 5;''',conn )) 

# Select all columns AND repeat primary_name as an extra column from the table, persons 
display(pd.read_sql('''SELECT * , primary_name FROM persons LIMIT 5;''',conn )) 


Unnamed: 0,person_id,primary_name,birth_year,death_year,primary_profession
0,nm0061671,Mary Ellen Bauder,,,"miscellaneous,production_manager,producer"
1,nm0061865,Joseph Bauer,,,"composer,music_department,sound_department"
2,nm0062070,Bruce Baum,,,"miscellaneous,actor,writer"
3,nm0062195,Axel Baumann,,,"camera_department,cinematographer,art_department"
4,nm0062798,Pete Baxter,,,"production_designer,art_department,set_decorator"


Unnamed: 0,person_id,primary_name
0,nm0061671,Mary Ellen Bauder
1,nm0061865,Joseph Bauer
2,nm0062070,Bruce Baum
3,nm0062195,Axel Baumann
4,nm0062798,Pete Baxter


Unnamed: 0,ID
0,nm0061671
1,nm0061865
2,nm0062070
3,nm0062195
4,nm0062798


Unnamed: 0,person_id,primary_name,birth_year,death_year,primary_profession,primary_name.1
0,nm0061671,Mary Ellen Bauder,,,"miscellaneous,production_manager,producer",Mary Ellen Bauder
1,nm0061865,Joseph Bauer,,,"composer,music_department,sound_department",Joseph Bauer
2,nm0062070,Bruce Baum,,,"miscellaneous,actor,writer",Bruce Baum
3,nm0062195,Axel Baumann,,,"camera_department,cinematographer,art_department",Axel Baumann
4,nm0062798,Pete Baxter,,,"production_designer,art_department,set_decorator",Pete Baxter


## WHERE Command
**Function:** Filter the data with a **logical statement** or **condition**.
1. Some operators to use are comparisons (>, <, =, !=), AND, OR, BETWEEN, IN, LIKE, IS, NOT, etc. <br> 
2. WHERE can use the original column name or the new name given with AS. SQLite accepts both. <br> 
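
The cells in this section only use `=`, `>` and AND, so here is a minimal sketch of BETWEEN, IN, LIKE and IS NULL. The table `persons_demo`, its four rows and the helper `names_where` are hypothetical and exist only for this example.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")  # temporary database, an assumption of this sketch
conn_demo.execute(
    "CREATE TABLE persons_demo "
    "(person_id TEXT, name TEXT, birth_year INTEGER, profession TEXT);"
)
conn_demo.executemany(
    "INSERT INTO persons_demo VALUES (?, ?, ?, ?);",
    [
        ("p1", "Ada", 1815, "writer"),
        ("p2", "Grace", 1906, "actress"),
        ("p3", "Alan", 1912, "actor"),
        ("p4", "Linus", None, "actor,producer"),
    ],
)


def names_where(condition):
    """Return the names that satisfy a fixed WHERE condition, ordered by ID."""
    sql = f"SELECT name FROM persons_demo WHERE {condition} ORDER BY person_id;"
    return [row[0] for row in conn_demo.execute(sql)]


between_rows = names_where("birth_year BETWEEN 1900 AND 1950")  # inclusive range
in_rows = names_where("profession IN ('actor', 'actress')")     # exact match on a list
like_rows = names_where("profession LIKE '%actor%'")            # % matches any text
null_rows = names_where("birth_year IS NULL")                   # missing values

print(between_rows, in_rows, like_rows, null_rows)
conn_demo.close()
```

Note that `IN` compares the whole value, so `'actor,producer'` is matched by the LIKE pattern but not by the IN list.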

In [5]:
# Select all actresses from the table, persons 
display(pd.read_sql('''
                    SELECT * 
                        FROM persons 
                    WHERE primary_profession = 'actress' 
                    LIMIT 5;''', conn))

# Select all actresses from the table, persons and with ID greater than nm0061671
display(pd.read_sql('''
                    SELECT * 
                        FROM persons 
                    WHERE primary_profession = 'actress' AND person_id > 'nm0061671' 
                    LIMIT 5;''', conn)) 

# Select IDs greater than nm0061671 from the table, persons, filtering on the original column name
display(pd.read_sql('''
                    SELECT person_ID AS ID 
                        FROM persons 
                    WHERE person_id > 'nm0061671' 
                    LIMIT 5;''', conn)) 
# Select IDs greater than nm0061671 from the table, persons, filtering on the alias ID 
display(pd.read_sql('''
                    SELECT person_ID AS ID 
                        FROM persons 
                    WHERE ID > 'nm0061671' 
                    LIMIT 5;''', conn)) 

Unnamed: 0,person_id,primary_name,birth_year,death_year,primary_profession
0,nm0067845,Sondos Belhassen,,,actress
1,nm0073381,Roxana Berco,,,actress
2,nm0076139,Andrée Bernard,1966.0,,actress
3,nm0082740,Shirin Bina,,,actress
4,nm0086205,Bre Blair,1980.0,,actress


Unnamed: 0,person_id,primary_name,birth_year,death_year,primary_profession
0,nm0067845,Sondos Belhassen,,,actress
1,nm0073381,Roxana Berco,,,actress
2,nm0076139,Andrée Bernard,1966.0,,actress
3,nm0082740,Shirin Bina,,,actress
4,nm0086205,Bre Blair,1980.0,,actress


Unnamed: 0,ID
0,nm0061865
1,nm0062070
2,nm0062195
3,nm0062798
4,nm0062879


Unnamed: 0,ID
0,nm0061865
1,nm0062070
2,nm0062195
3,nm0062798
4,nm0062879


## ORDER BY
Similar to a sort: it orders the requested data based on the given column name. <br>

**ASC** - Ascending order, the default <br>
**DESC** - Descending order  <br>
**ORDER BY** also works with custom-made columns such as aliases.

Multiple sorts can be performed by listing columns one after the other. The first column sets the main order, and each following column only breaks ties among rows that share the same value in the columns before it. <br>
To ensure that a column is sorted properly, use CAST to convert the column to a given data type. For example, numbers stored as text sort character by character unless they are cast to a numeric type. 
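
A minimal sketch of why CAST matters when sorting. The table `votes_demo`, which stores numbers in a TEXT column, is hypothetical and used only for illustration.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")  # temporary database, an assumption of this sketch

# Hypothetical table where the numbers were stored as text.
conn_demo.execute("CREATE TABLE votes_demo (title TEXT, votes TEXT);")
conn_demo.executemany(
    "INSERT INTO votes_demo VALUES (?, ?);",
    [("A", "9"), ("B", "10"), ("C", "100")],
)

# Sorting the raw column compares the values as strings.
sorted_as_text = [
    row[0] for row in conn_demo.execute("SELECT votes FROM votes_demo ORDER BY votes;")
]

# CAST converts each value to an integer before comparing.
sorted_as_number = [
    row[0]
    for row in conn_demo.execute(
        "SELECT votes FROM votes_demo ORDER BY CAST(votes AS INTEGER);"
    )
]

print(sorted_as_text)
print(sorted_as_number)
conn_demo.close()
```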

In [6]:
# Select the person IDs from the table, persons, sorted by ID in ascending order (default) 
display(pd.read_sql('''
                    SELECT person_ID AS ID 
                        FROM persons 
                    ORDER BY ID 
                    LIMIT 5;''', conn)) 

display(pd.read_sql('''
                    SELECT person_ID AS ID 
                        FROM persons 
                    ORDER BY ID DESC
                    LIMIT 5;''', conn)) 

display(pd.read_sql('''
                    SELECT person_ID AS ID, primary_name AS name 
                        FROM persons 
                    ORDER BY ID , name DESC
                    LIMIT 5;''', conn)) 

Unnamed: 0,ID
0,nm0000002
1,nm0000003
2,nm0000005
3,nm0000006
4,nm0000007


Unnamed: 0,ID
0,nm9993680
1,nm9993650
2,nm9993616
3,nm9993573
4,nm9993494


Unnamed: 0,ID,name
0,nm0000002,Lauren Bacall
1,nm0000003,Brigitte Bardot
2,nm0000005,Ingmar Bergman
3,nm0000006,Ingrid Bergman
4,nm0000007,Humphrey Bogart


## LIMIT 

A simple clause that limits the result to a given number of rows. <br>
The clause has already been used several times above and doesn't require any special treatment.

## GROUP BY

Usually used along with an aggregate function. Rows can also be grouped by multiple columns. 
If a number is given instead of a column name, it refers to the position of the column in the SELECT list, starting with the number 1. 
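
Grouping by several columns and grouping by position can be sketched as follows. The table `credits_demo` and its rows are hypothetical, chosen only to keep the counts easy to check by hand.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")  # temporary database, an assumption of this sketch
conn_demo.execute("CREATE TABLE credits_demo (profession TEXT, decade INTEGER);")
conn_demo.executemany(
    "INSERT INTO credits_demo VALUES (?, ?);",
    [("actor", 1980), ("actor", 1980), ("actor", 1990), ("writer", 1990)],
)

# One group per distinct (profession, decade) pair.
by_two_columns = conn_demo.execute("""
    SELECT profession, decade, count(*)
        FROM credits_demo
    GROUP BY profession, decade
    ORDER BY profession, decade;
""").fetchall()

# "GROUP BY 1" groups by the first column of the SELECT list (profession).
by_position = conn_demo.execute("""
    SELECT profession, count(*)
        FROM credits_demo
    GROUP BY 1
    ORDER BY 1;
""").fetchall()

print(by_two_columns)
print(by_position)
conn_demo.close()
```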

In [7]:
# Display the number of people for each value of primary_profession in the table. 
display(pd.read_sql('''
                    SELECT primary_profession, count(*)
                        FROM persons 
                    GROUP BY primary_profession;''', conn)) 

Unnamed: 0,primary_profession,count(*)
0,,51340
1,actor,88306
2,"actor,animation_department",46
3,"actor,animation_department,art_department",12
4,"actor,animation_department,art_director",1
...,...,...
8643,"writer,visual_effects,editorial_department",3
8644,"writer,visual_effects,miscellaneous",7
8645,"writer,visual_effects,producer",14
8646,"writer,visual_effects,production_manager",2


## HAVING

Similar to WHERE but used after the GROUP BY clause. <br>
Once the dataset is grouped by a column, HAVING checks the aggregated values. <br>
It can also be combined with WHERE, as long as WHERE doesn't use aggregate functions. 
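
The difference between the two filters is easiest to see side by side: WHERE removes individual rows before grouping, while HAVING removes whole groups afterwards. A minimal sketch on a hypothetical `people_demo` table:

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")  # temporary database, an assumption of this sketch
conn_demo.execute("CREATE TABLE people_demo (profession TEXT, birth_year INTEGER);")
conn_demo.executemany(
    "INSERT INTO people_demo VALUES (?, ?);",
    [
        ("actor", 1890),
        ("actor", 1950),
        ("writer", 1920),
        ("writer", 1960),
        ("editor", 1975),
    ],
)

# HAVING: drop every group whose earliest birth year is 1900 or before.
filtered_groups = conn_demo.execute("""
    SELECT profession, count(*), min(birth_year)
        FROM people_demo
    GROUP BY profession
    HAVING min(birth_year) > 1900
    ORDER BY profession;
""").fetchall()

# WHERE: drop individual rows first, then group what is left.
filtered_rows = conn_demo.execute("""
    SELECT profession, count(*), min(birth_year)
        FROM people_demo
    WHERE birth_year > 1900
    GROUP BY profession
    ORDER BY profession;
""").fetchall()

print(filtered_groups)
print(filtered_rows)
conn_demo.close()
```

With HAVING the whole `actor` group disappears because one actor was born in 1890; with WHERE only that one row is removed and the group survives with a count of 1.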

In [8]:
# Display the number of people for each profession in the table where the earliest birth year (aliased youngest) is after 1900
display(pd.read_sql('''
                    SELECT primary_profession, count(*), min(birth_year) AS youngest
                        FROM persons 
                    GROUP BY primary_profession
                    HAVING youngest > 1900;''', conn)) 

# Display the number of people for each profession containing 'actor' where the earliest birth year (aliased youngest) is after 1900
display(pd.read_sql('''
                    SELECT primary_profession, count(*) AS num_people, min(birth_year) AS youngest
                        FROM persons 
                    WHERE primary_profession LIKE '%actor%'
                    GROUP BY primary_profession
                    HAVING youngest > 1900;''', conn)) 

Unnamed: 0,primary_profession,count(*),youngest
0,"actor,animation_department",46,1949.0
1,"actor,animation_department,art_department",12,1934.0
2,"actor,animation_department,cinematographer",2,1973.0
3,"actor,animation_department,director",7,1945.0
4,"actor,animation_department,editor",4,1976.0
...,...,...,...
3431,"writer,visual_effects,director",12,1983.0
3432,"writer,visual_effects,editor",19,1993.0
3433,"writer,visual_effects,miscellaneous",7,1953.0
3434,"writer,visual_effects,producer",14,1979.0


Unnamed: 0,primary_profession,num_people,youngest
0,"actor,animation_department",46,1949.0
1,"actor,animation_department,art_department",12,1934.0
2,"actor,animation_department,cinematographer",2,1973.0
3,"actor,animation_department,director",7,1945.0
4,"actor,animation_department,editor",4,1976.0
...,...,...,...
857,"writer,music_department,actor",15,1955.0
858,"writer,production_designer,actor",4,1929.0
859,"writer,production_manager,actor",3,1935.0
860,"writer,stunts,actor",4,1936.0


## JOIN 
Function: Combine tables based on the column given. Can be used with INNER, LEFT, RIGHT or FULL <br>
    **INNER**: Keep only the rows that have a match in **both** tables. **Default** to **INNER** if only JOIN is used. <br>
    **OUTER**: Optional keyword for LEFT, RIGHT and FULL joins (for example LEFT OUTER JOIN); these joins keep rows even when they have **no match**. <br>
    **FULL**: Include all rows from **both** tables, matched where possible. <br>
    **LEFT**: Contains all the rows from the left table along with the matching data from the right table. <br>
    **RIGHT**: Contains all the rows from the right table along with the matching data from the left table.<br>
If a row has no match in the other table, NULL is filled in for that table's columns. 
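
A minimal sketch comparing an inner join with a left join on two hypothetical tables, `movies_demo` and `ratings_demo` (toy data, not the notebook's tables). RIGHT and FULL joins are left out because, as far as I know, older SQLite versions do not support them.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")  # temporary database, an assumption of this sketch
conn_demo.execute("CREATE TABLE movies_demo (movie_id TEXT, title TEXT);")
conn_demo.execute("CREATE TABLE ratings_demo (movie_id TEXT, rating REAL);")
conn_demo.executemany(
    "INSERT INTO movies_demo VALUES (?, ?);",
    [("m1", "Alpha"), ("m2", "Beta"), ("m3", "Gamma")],
)
conn_demo.executemany(
    "INSERT INTO ratings_demo VALUES (?, ?);",
    [("m1", 7.5), ("m3", 6.0)],
)

# INNER JOIN: only movies that have a rating.
inner_rows = conn_demo.execute("""
    SELECT title, rating
        FROM movies_demo
    JOIN ratings_demo USING (movie_id)
    ORDER BY movie_id;
""").fetchall()

# LEFT JOIN: every movie; the rating is NULL (None in Python) when missing.
left_rows = conn_demo.execute("""
    SELECT title, rating
        FROM movies_demo
    LEFT JOIN ratings_demo USING (movie_id)
    ORDER BY movie_id;
""").fetchall()

# Filtering the left join for NULL isolates the movies without a match.
unmatched = conn_demo.execute("""
    SELECT title
        FROM movies_demo
    LEFT JOIN ratings_demo USING (movie_id)
    WHERE rating IS NULL;
""").fetchall()

print(inner_rows, left_rows, unmatched)
conn_demo.close()
```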

In [9]:
# Count the principals for each movie and show the movie title instead of its ID for clarity.
display(pd.read_sql('''
                    SELECT movie_basics.primary_title AS Title, count(*) AS num_count
                        FROM principals
                    JOIN movie_basics
                        ON principals.movie_id = movie_basics.movie_id
                    GROUP BY principals.movie_id''', conn)) 

Unnamed: 0,Title,num_count
0,Sunghursh,10
1,One Day Before the Rainy Season,7
2,The Other Side of the Wind,10
3,Sabse Bada Sukh,10
4,The Wandering Soap Opera,10
...,...,...
143449,Kuambil Lagi Hatiku,10
143450,Rodolpho Teóphilo - O Legado de um Pioneiro,10
143451,Dankyavar Danka,10
143452,6 Gunn,9


## SQL Practice Problem
Determine the number of workers for each movie and their average ratings. 

### Data Exploration:
1. Determine the number of currently registered workers.
2. Determine the number of movies. 
3. Determine the total number of opportunities (rows in principals). 
4. Determine the average rating for each movie. 

In [10]:
# Setup 
## Number of movies in the table movie_basics
display(pd.read_sql('''
                    SELECT count(*) AS num_movies
                        FROM movie_basics
                    ''', conn)) 

## Number of unique workers in persons
display(pd.read_sql('''
                    SELECT count(*) AS num_workers
                        FROM persons
                    ''', conn)) 

## Number of workers for the movies 
display(pd.read_sql('''
                    SELECT count(*) AS num_of_principals
                        FROM principals
                    ''', conn)) 

## Average rating for each movie
display(pd.read_sql('''
                    SELECT averagerating AS ratings
                        FROM movie_ratings
                    ''', conn)) 

Unnamed: 0,num_movies
0,146144


Unnamed: 0,num_workers
0,606648


Unnamed: 0,num_of_principals
0,1028186


Unnamed: 0,ratings
0,8.3
1,8.9
2,6.4
3,4.2
4,6.5
...,...
73851,8.1
73852,7.5
73853,4.7
73854,7.0


#### Note 1
1. There are **146,144** movies.
2. There are **606,648** workers. 
3. There are **1,028,186** opportunities recorded in this database. 
4. There are **73,856** movies that have ratings.

Initial Impressions: The database has some limiting factors for this problem. Not every movie has a rating, so only the movies that do are considered. 

### Final Steps:
The solution will have the following properties:
1. Only movies with ratings will be included. 
2. Each movie will have a rating.
3. Each movie will have the number of workers for the movie. 

In [23]:
## Movies with ratings 
display(pd.read_sql('''
                    SELECT movie_id AS ID, averagerating AS AvgRating
                        FROM movie_ratings
                    ''', conn)) 

## Movie titles from Movie Basics
display(pd.read_sql('''
                    SELECT movie_id AS ID, primary_title AS Title
                        FROM movie_basics
                    ''', conn)) 

## Join the two tables together
display(pd.read_sql('''
                    SELECT *
                        FROM movie_basics
                    JOIN (movie_ratings)
                        USING (movie_id)
                    ''', conn)) 


Unnamed: 0,ID,AvgRating
0,tt10356526,8.3
1,tt10384606,8.9
2,tt1042974,6.4
3,tt1043726,4.2
4,tt1060240,6.5
...,...,...
73851,tt9805820,8.1
73852,tt9844256,7.5
73853,tt9851050,4.7
73854,tt9886934,7.0


Unnamed: 0,ID,Title
0,tt0063540,Sunghursh
1,tt0066787,One Day Before the Rainy Season
2,tt0069049,The Other Side of the Wind
3,tt0069204,Sabse Bada Sukh
4,tt0100275,The Wandering Soap Opera
...,...,...
146139,tt9916538,Kuambil Lagi Hatiku
146140,tt9916622,Rodolpho Teóphilo - O Legado de um Pioneiro
146141,tt9916706,Dankyavar Danka
146142,tt9916730,6 Gunn


Unnamed: 0,movie_id,primary_title,original_title,start_year,runtime_minutes,genres,averagerating,numvotes
0,tt0063540,Sunghursh,Sunghursh,2013,175.0,"Action,Crime,Drama",7.0,77
1,tt0066787,One Day Before the Rainy Season,Ashad Ka Ek Din,2019,114.0,"Biography,Drama",7.2,43
2,tt0069049,The Other Side of the Wind,The Other Side of the Wind,2018,122.0,Drama,6.9,4517
3,tt0069204,Sabse Bada Sukh,Sabse Bada Sukh,2018,,"Comedy,Drama",6.1,13
4,tt0100275,The Wandering Soap Opera,La Telenovela Errante,2017,80.0,"Comedy,Drama,Fantasy",6.5,119
...,...,...,...,...,...,...,...,...
73851,tt9913084,Diabolik sono io,Diabolik sono io,2019,75.0,Documentary,6.2,6
73852,tt9914286,Sokagin Çocuklari,Sokagin Çocuklari,2019,98.0,"Drama,Family",8.7,136
73853,tt9914642,Albatross,Albatross,2017,,Documentary,8.5,8
73854,tt9914942,La vida sense la Sara Amat,La vida sense la Sara Amat,2019,,,6.6,5


In [48]:
## Number of workers per movie and their ratings
display(pd.read_sql('''
                    SELECT movie_id AS ID, primary_title AS Title, averagerating AS Rating, count(*)
                        FROM movie_basics
                    LEFT JOIN (movie_ratings)
                        USING (movie_id)
                    LEFT JOIN (principals)
                        USING (movie_id) 
                    GROUP BY movie_id
                    ''', conn)) 

## Movies that have a NULL rating 
## 146144 - 72288 = 73856 (Check)
display(pd.read_sql('''
                    SELECT movie_id AS ID, primary_title AS Title, averagerating AS Rating, count(*)
                        FROM movie_basics
                    LEFT JOIN (movie_ratings)
                        USING (movie_id)
                    LEFT JOIN (principals)
                        USING (movie_id) 
                    GROUP BY movie_id
                    HAVING Rating IS NULL
                    ''', conn)) 

Unnamed: 0,ID,Title,Rating,count(*)
0,tt0063540,Sunghursh,7.0,10
1,tt0066787,One Day Before the Rainy Season,7.2,7
2,tt0069049,The Other Side of the Wind,6.9,10
3,tt0069204,Sabse Bada Sukh,6.1,10
4,tt0100275,The Wandering Soap Opera,6.5,10
...,...,...,...,...
146139,tt9916538,Kuambil Lagi Hatiku,,10
146140,tt9916622,Rodolpho Teóphilo - O Legado de um Pioneiro,,10
146141,tt9916706,Dankyavar Danka,,10
146142,tt9916730,6 Gunn,,9


Unnamed: 0,ID,Title,Rating,count(*)
0,tt0111414,A Thin Life,,3
1,tt0139613,O Silêncio,,6
2,tt0144449,Nema aviona za Zagreb,,10
3,tt0187902,How Huang Fei-hong Rescued the Orphan from the...,,1
4,tt0262759,Seven Jews from My Class,,6
...,...,...,...,...
72283,tt9916538,Kuambil Lagi Hatiku,,10
72284,tt9916622,Rodolpho Teóphilo - O Legado de um Pioneiro,,10
72285,tt9916706,Dankyavar Danka,,10
72286,tt9916730,6 Gunn,,9


Optional Analysis
1. Which movies have the best ratings?
2. Which movies have the fewest workers?
3. Which movies have the best rating-to-worker ratio?

In [58]:
## Best rated movie titles
display(pd.read_sql('''
                    SELECT movie_id AS ID, primary_title AS Title, averagerating AS Rating, count(*) AS num_workers
                        FROM movie_basics
                    JOIN (movie_ratings)
                        USING (movie_id)
                    JOIN (principals)
                        USING (movie_id) 
                    GROUP BY movie_id
                    ORDER BY Rating DESC
                    LIMIT 5
                    ''', conn)) 

display(pd.read_sql('''
                    SELECT movie_id AS ID, primary_title AS Title, averagerating AS Rating, count(*) AS num_workers
                        FROM movie_basics
                    JOIN (movie_ratings)
                        USING (movie_id)
                    JOIN (principals)
                        USING (movie_id) 
                    GROUP BY movie_id
                    ORDER BY num_workers
                    LIMIT 5
                    ''', conn)) 

display(pd.read_sql('''
                    SELECT movie_id AS ID, primary_title AS Title, averagerating AS Rating, count(*) AS num_workers, (averagerating / count(*)) AS Ratio
                        FROM movie_basics
                    JOIN (movie_ratings)
                        USING (movie_id)
                    JOIN (principals)
                        USING (movie_id) 
                    GROUP BY movie_id
                    ORDER BY Ratio DESC
                    LIMIT 5
                    ''', conn)) 

Unnamed: 0,ID,Title,Rating,num_workers
0,tt9715646,Renegade,10.0,6
1,tt8730716,Pick It Up! - Ska in the '90s,10.0,10
2,tt7259300,Calamity Kevin,10.0,8
3,tt7227500,Ellis Island: The Making of a Master Race in A...,10.0,10
4,tt6991826,A Dedicated Life: Phoebe Brand Beyond the Group,10.0,4


Unnamed: 0,ID,Title,Rating,num_workers
0,tt0170651,T.G.M. - osvoboditel,7.5,1
1,tt10015432,Forest of the Dead Sharks,8.4,1
2,tt10027696,Patterns of Evidence: The Moses Controversy,7.9,1
3,tt10100842,William Tecumseh Sherman: Beyond the March to ...,7.1,1
4,tt10122528,Miel-Emile,8.2,1


Unnamed: 0,ID,Title,Rating,num_workers,Ratio
0,tt5089804,Fly High: Story of the Disc Dog,10.0,1,10.0
1,tt10378660,The Dark Knight: The Ballad of the N Word,10.0,1,10.0
2,tt9820678,Moscow we will lose,9.9,1,9.9
3,tt7541970,From Shock to Awe,9.7,1,9.7
4,tt8751896,Gangter in Morteni,9.6,1,9.6


### Note 2
There are movies with only one worker that have very high ratings. The ratio is somewhat misleading because the number of votes differs from movie to movie: a movie could have only 1 or 2 votes behind its rating. The next step is to consider only the movies with more than the mean number of votes to give a better interpretation of the results. 
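
One possible way to apply such a threshold without typing the mean by hand is a subquery in the WHERE clause. This is a minimal sketch on a hypothetical `ratings_demo` table; the column names mirror `movie_ratings`, but the rows are made up for illustration.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")  # temporary database, an assumption of this sketch
conn_demo.execute(
    "CREATE TABLE ratings_demo (movie_id TEXT, averagerating REAL, numvotes INTEGER);"
)
conn_demo.executemany(
    "INSERT INTO ratings_demo VALUES (?, ?, ?);",
    [("m1", 10.0, 2), ("m2", 8.0, 600), ("m3", 7.0, 1000), ("m4", 9.5, 6)],
)

# Mean number of votes across all movies: (2 + 600 + 1000 + 6) / 4.
mean_votes = conn_demo.execute("SELECT AVG(numvotes) FROM ratings_demo;").fetchone()[0]

# The subquery recomputes the mean inside the query, so no number is hard-coded.
well_voted = conn_demo.execute("""
    SELECT movie_id, averagerating, numvotes
        FROM ratings_demo
    WHERE numvotes > (SELECT AVG(numvotes) FROM ratings_demo)
    ORDER BY averagerating DESC;
""").fetchall()

print(mean_votes)
print(well_voted)
conn_demo.close()
```

The two highest-rated toy movies drop out because their ratings rest on only a handful of votes.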

In [64]:
display(pd.read_sql('''
                    SELECT AVG(numvotes)
                        FROM movie_ratings
                    ''', conn)) 


display(pd.read_sql('''
                    SELECT movie_id AS ID, primary_title AS Title, averagerating AS Rating, count(*) AS num_workers, (averagerating / count(*)) AS Ratio, numvotes
                        FROM movie_basics
                    JOIN (movie_ratings)
                        USING (movie_id)
                    JOIN (principals)
                        USING (movie_id) 
                    WHERE numvotes > 3523
                    GROUP BY movie_id
                    ORDER BY Ratio DESC
                    LIMIT 5
                    ''', conn)) 

Unnamed: 0,AVG(numvotes)
0,3523.662167


Unnamed: 0,ID,Title,Rating,num_workers,Ratio,numvotes
0,tt2396224,It's Such a Beautiful Day,8.3,1,8.3,9099
1,tt4359416,Taxi,7.3,1,7.3,11870
2,tt7905466,They Shall Not Grow Old,8.4,3,2.8,15612
3,tt2400291,The Barkley Marathons: The Race That Eats Its ...,7.8,4,1.95,3740
4,tt1667905,This Is Not a Film,7.5,4,1.875,4313


## Conclusion
We can conclude that, among the movies with more than the average number of votes, the 5 movies listed above have the best rating-to-worker ratio.

In [None]:
# conn.close() 