## Overview of the data
- All four MovieLens datasets are stored in a single SQLite database, movies.db.
- The 4 datasets are as follows:
    - **links.csv**: maps each *movieId* (the identifier we use for movies) to its *imdbId*, which will be useful later for looking up plot information and images on IMDb.
    - **movies.csv**: contains the *title* and *genres* for each *movieId*.
        - *sidenote*: the corresponding table in the database file is called '*movielens*'.
    - **ratings.csv**: contains the *rating* that each user (identified by a *userId*) gave a particular movie (identified by its *movieId*) at a particular time (a *timestamp*, stored as a string of Unix epoch seconds). Note that each user rated multiple movies.
    - **tags.csv**: contains the *tags* that a particular user (identified by their *userId*) applied to a particular movie (identified by its *movieId*), each with its own *timestamp*. Note that a user can tag multiple movies and can apply multiple tags to the same movie.
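- When joining tables in SQL, join on **movieId**: it is the one key shared by every table in the database. It identifies a single row only in the movie and link tables; the ratings and tags tables hold many rows per movie, so a join between those two should also match on *userId*.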
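
As a minimal sketch of a join on **movieId**, the cell below builds a tiny in-memory stand-in for movies.db and joins the movie table to the link table. The toy rows and the table name `links` are assumptions for illustration; only `movielens`, `ratings`, and `tags` are confirmed as table names by the queries in this notebook.

```python
import sqlite3
import pandas as pd

# In-memory stand-in for movies.db. The rows and the `links` table name
# are assumptions made for this sketch, not taken from the real database.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE movielens (movieId INTEGER, title TEXT, genres TEXT);
    CREATE TABLE links (movieId INTEGER, imdbId TEXT);
    INSERT INTO movielens VALUES (1, 'Toy Story (1995)', 'Adventure|Animation|Children|Comedy|Fantasy');
    INSERT INTO movielens VALUES (2, 'Jumanji (1995)', 'Adventure|Children|Fantasy');
    INSERT INTO links VALUES (1, '0114709');
    INSERT INTO links VALUES (2, '0113497');
""")

# movieId identifies one row in each of these two tables, so the join is one-to-one.
join_query = '''SELECT movielens.movieId, title, imdbId
                FROM movielens
                JOIN links ON movielens.movieId = links.movieId
             '''
df_links = pd.read_sql(join_query, demo)
demo.close()
```

Against the real database, the same query would be passed to `pd.read_sql` with the `db` connection opened below, provided the link table is in fact named `links`.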

In [1]:
import pandas as pd
import sqlite3

In [2]:
db = sqlite3.connect('data/movies.db')

In [3]:
query = '''SELECT * FROM movielens
        '''

In [4]:
df_out = pd.read_sql(query, db)
df_out.head()

Unnamed: 0,movieId,title,genres
0,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
1,2,Jumanji (1995),Adventure|Children|Fantasy
2,3,Grumpier Old Men (1995),Comedy|Romance
3,4,Waiting to Exhale (1995),Comedy|Drama|Romance
4,5,Father of the Bride Part II (1995),Comedy


In [11]:
query2 = '''SELECT title, genres, ratings.*, tags.tag, tags.timestamp AS ts
            FROM movielens
            JOIN ratings ON movielens.movieId = ratings.movieId
            LEFT JOIN tags ON movielens.movieId = tags.movieId AND ratings.userId = tags.userId
         '''
df2 = pd.read_sql(query2, db)
df2

Unnamed: 0,title,genres,userId,movieId,rating,timestamp,tag,ts
0,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,1,1,4.0,964982703,,
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,5,1,4.0,847434962,,
2,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,7,1,4.5,1106635946,,
3,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,15,1,2.5,1510577970,,
4,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,17,1,4.5,1305696483,,
5,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,18,1,3.5,1455209816,,
6,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,19,1,4.0,965705637,,
7,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,21,1,3.5,1407618878,,
8,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,27,1,3.0,962685262,,
9,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,31,1,5.0,850466616,,
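
Because a user may apply several tags to the same movie, a LEFT JOIN like the one above returns one row per (rating, tag) pair, so a single rating can appear more than once. The sketch below shows that effect on toy in-memory data (the rows are made up for illustration) and one possible way to keep a single row per rating by aggregating tags with SQLite's `GROUP_CONCAT`. It is not necessarily how the rest of this project handles tags.

```python
import sqlite3
import pandas as pd

# Toy stand-in tables; the rows are invented for this sketch.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE ratings (userId INTEGER, movieId INTEGER, rating REAL, timestamp INTEGER);
    CREATE TABLE tags (userId INTEGER, movieId INTEGER, tag TEXT, timestamp INTEGER);
    INSERT INTO ratings VALUES (1, 1, 4.0, 964982703);
    INSERT INTO ratings VALUES (2, 1, 3.5, 964982800);
    INSERT INTO tags VALUES (2, 1, 'pixar', 964982900);
    INSERT INTO tags VALUES (2, 1, 'fun', 964982950);
""")

# Plain LEFT JOIN: user 2's rating is repeated once per tag,
# and user 1 (no tags) appears once with a NULL tag.
fanned = pd.read_sql('''SELECT ratings.userId, ratings.movieId, rating, tag
                        FROM ratings
                        LEFT JOIN tags ON ratings.movieId = tags.movieId
                                      AND ratings.userId = tags.userId
                     ''', demo)

# Aggregating the tags keeps exactly one row per (userId, movieId) rating.
collapsed = pd.read_sql('''SELECT ratings.userId, ratings.movieId, rating,
                                  GROUP_CONCAT(tag, '|') AS tags
                           FROM ratings
                           LEFT JOIN tags ON ratings.movieId = tags.movieId
                                         AND ratings.userId = tags.userId
                           GROUP BY ratings.userId, ratings.movieId, rating
                        ''', demo)
demo.close()
```

Here `fanned` has three rows for two ratings, while `collapsed` has two. Averaging `rating` over the fanned-out frame would weight heavily tagged ratings more than once, which is the main reason to be aware of the duplication.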


In [None]:
db.close()
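
If a query raises before `db.close()` runs, the connection stays open. One alternative, sketched here with an in-memory database standing in for 'data/movies.db', is to wrap the connection in `contextlib.closing`. Note that using a `sqlite3` connection directly in a `with` statement only manages transactions and does not close it.

```python
import sqlite3
from contextlib import closing

import pandas as pd

# ":memory:" and the single toy row stand in for 'data/movies.db'.
with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE movielens (movieId INTEGER, title TEXT, genres TEXT)")
    conn.execute(
        "INSERT INTO movielens VALUES "
        "(1, 'Toy Story (1995)', 'Adventure|Animation|Children|Comedy|Fantasy')"
    )
    df_demo = pd.read_sql("SELECT * FROM movielens", conn)
# conn.close() has been called at this point, even if a query above had raised.
```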