# Core Competency: Databases

## Setup

The data required to establish a database connection is read from the configuration file here.

In [1]:
from pymongo import MongoClient
import pandas
import configparser
import pprint

config = configparser.ConfigParser()
config.read('config.ini')

db_username = config['Database']['USER']
db_password = config['Database']['PASS']
db_hostname = config['Database']['HOST']

jsonp = pprint.pprint
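The notebook does not show `config.ini` itself. As a minimal sketch, assuming it contains a `[Database]` section with the `USER`, `PASS` and `HOST` keys read above, the file could look like the string below. All values are made-up placeholders, and `load_db_settings` is a hypothetical helper used only for illustration.

```python
import configparser

# Hypothetical contents of config.ini; the values are placeholders.
SAMPLE_CONFIG = """
[Database]
USER = example_user
PASS = example_password
HOST = cluster0.example.mongodb.net
"""


def load_db_settings(text):
    """Parse the [Database] section and return (user, password, host)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["Database"]
    return section["USER"], section["PASS"], section["HOST"]


user, password, host = load_db_settings(SAMPLE_CONFIG)
print(user, host)
```

Keeping credentials in a separate file like this means the notebook can be shared without exposing them, provided the file itself is not committed.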

The database connection is established here.

In [2]:
client = MongoClient("mongodb+srv://{USERNAME}:{PASSWORD}@{HOSTNAME}".format(USERNAME = db_username, 
                                                                             PASSWORD = db_password, 
                                                                             HOSTNAME = db_hostname))
db = client.sample_mflix
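If the username or password contains characters such as `@`, `:` or `/`, the formatted URI above becomes ambiguous. One common remedy is to percent-encode both values with `urllib.parse.quote_plus` before inserting them. A minimal sketch; `build_mongo_uri` and the sample credentials are hypothetical:

```python
from urllib.parse import quote_plus


def build_mongo_uri(username, password, hostname):
    """Build a mongodb+srv URI with percent-encoded credentials."""
    return "mongodb+srv://{}:{}@{}".format(
        quote_plus(username), quote_plus(password), hostname
    )


# Placeholder credentials chosen to contain reserved characters.
uri = build_mongo_uri("example_user", "p@ss/word", "cluster0.example.mongodb.net")
print(uri)
```

The resulting string could then be passed to `MongoClient` in place of the formatted string used above.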

Add join functionality with pandas

In [3]:
def join(result1, result2, column1, column2):
    result1 = pandas.DataFrame(list(result1))
    result2 = pandas.DataFrame(list(result2))
    return pandas.merge(result1, result2, left_on = column1, right_on = column2)
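`pandas.merge` performs an inner join by default, so only rows whose key appears on both sides survive. To make that behavior concrete without a database, here is a plain-Python sketch of the same idea on hypothetical in-memory records. It is not how pandas implements the merge, and it does not rename clashing columns the way pandas does with its `_x`/`_y` suffixes.

```python
def inner_join(left_rows, right_rows, left_key, right_key):
    """Pair every left row with every right row that has a matching key."""
    # Index the right-hand rows by their join key for quick lookup.
    index = {}
    for row in right_rows:
        index.setdefault(row[right_key], []).append(row)

    joined = []
    for left in left_rows:
        for right in index.get(left[left_key], []):
            joined.append((left, right))
    return joined


# Hypothetical stand-ins for documents from `movies` and `comments`.
movies = [{"_id": 1, "title": "A"}, {"_id": 2, "title": "B"}]
comments = [
    {"movie_id": 1, "text": "first"},
    {"movie_id": 1, "text": "second"},
    {"movie_id": 3, "text": "orphan"},
]

pairs = inner_join(movies, comments, "_id", "movie_id")
for movie, comment in pairs:
    print(movie["title"], "-", comment["text"])
```

Movie `B` has no comments and the comment for movie `3` has no movie, so neither appears in the result.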

Test query:

In [4]:
print(db.list_collection_names())

['sessions', 'movies', 'users', 'comments', 'theaters']


Print 5 users

In [5]:
users = db.users.find({})[:5]
for user in users:
    jsonp(user)

{'_id': ObjectId('59b99db4cfa9a34dcd7885b6'),
 'email': 'sean_bean@gameofthron.es',
 'name': 'Ned Stark',
 'password': '$2b$12$UREFwsRUoyF0CRqGNK0LzO0HM/jLhgUCNNIJ9RJAqMUQ74crlJ1Vu'}
{'_id': ObjectId('59b99db5cfa9a34dcd7885b8'),
 'email': 'nikolaj_coster-waldau@gameofthron.es',
 'name': 'Jaime Lannister',
 'password': '$2b$12$6vz7wiwO.EI5Rilvq1zUc./9480gb1uPtXcahDxIadgyC3PS8XCUK'}
{'_id': ObjectId('59b99db6cfa9a34dcd7885ba'),
 'email': 'lena_headey@gameofthron.es',
 'name': 'Cersei Lannister',
 'password': '$2b$12$FExjgr7CLhNCa.oUsB9seub8mqcHzkJCFZ8heMc8CeIKOZfeTKP8m'}
{'_id': ObjectId('59b99dbacfa9a34dcd7885c2'),
 'email': 'richard_madden@gameofthron.es',
 'name': 'Robb Stark',
 'password': '$2b$12$XPLvWQW7tjWc/PX9jMVRnO8w.lR6hv144ee8pc8nDsWIAWxfwxHzy'}
{'_id': ObjectId('59b99dbbcfa9a34dcd7885c4'),
 'email': 'isaac_hempstead_wright@gameofthron.es',
 'name': 'Bran Stark',
 'password': '$2b$12$Z7/ztVm8eWMDwTg.doS.UO7JbsbA9IbLomND1VxIZEdAN3keW6csS'}


Print comments

In [6]:
comments = db.comments.find({})[:5]
for comment in comments:
    jsonp(comment)

{'_id': ObjectId('5a9427648b0beebeb6957a21'),
 'date': datetime.datetime(1981, 11, 8, 4, 32, 25),
 'email': 'tom_wlaschiha@gameofthron.es',
 'movie_id': ObjectId('573a1390f29313caabcd516c'),
 'name': "Jaqen H'ghar",
 'text': 'Minima odit officiis minima nam. Aspernatur id reprehenderit eius '
         'inventore amet laudantium. Eos unde enim recusandae fugit sint.'}
{'_id': ObjectId('5a9427648b0beebeb6957abd'),
 'date': datetime.datetime(1972, 4, 16, 14, 52, 53),
 'email': 'john_bishop@fakegmail.com',
 'movie_id': ObjectId('573a1391f29313caabcd6f98'),
 'name': 'John Bishop',
 'text': 'Accusamus qui distinctio ut ab saepe tenetur. Quae optio aut eius '
         'deleniti veritatis error. Eligendi ducimus rerum recusandae '
         'doloribus. Natus quisquam expedita voluptatum voluptatibus natus '
         'quidem.'}
{'_id': ObjectId('5a9427648b0beebeb6957bde'),
 'date': datetime.datetime(1997, 7, 26, 0, 4, 16),
 'email': 'roger_ashton-griffiths@gameofthron.es',
 'movie_id': ObjectId(

find(): print documents that match a specific condition

In [7]:
users2 = db.users.find({'name':"Cersei Lannister"})
for user in users2:
    jsonp(user)

{'_id': ObjectId('59b99db6cfa9a34dcd7885ba'),
 'email': 'lena_headey@gameofthron.es',
 'name': 'Cersei Lannister',
 'password': '$2b$12$FExjgr7CLhNCa.oUsB9seub8mqcHzkJCFZ8heMc8CeIKOZfeTKP8m'}


In [40]:
query = {'title': 'Interstellar'}
movie = db.movies.find(query)
for film in movie:
    jsonp(film)

{'_id': ObjectId('573a13b9f29313caabd4ddff'),
 'awards': {'nominations': 100,
            'text': 'Won 1 Oscar. Another 44 wins & 100 nominations.',
            'wins': 45},
 'cast': ['Ellen Burstyn',
          'Matthew McConaughey',
          'Mackenzie Foy',
          'John Lithgow'],
 'countries': ['USA', 'UK', 'Canada'],
 'directors': ['Christopher Nolan'],
 'fullplot': 'In the near future, Earth has been devastated by drought and '
             'famine, causing a scarcity in food and extreme changes in '
             'climate. When humanity is facing extinction, a mysterious rip in '
             'the space-time continuum is discovered, giving mankind the '
             'opportunity to widen its lifespan. A group of explorers must '
             'travel beyond our solar system in search of a planet that can '
             'sustain life. The crew of the Endurance are required to think '
             'bigger and go further than any human in history as they embark '
             'on 

find() with "$gt" and "$lt" queries
"$gt" => greater than
"$lt" => less than

In [9]:
yearquery = {'year':{"$lt":1893}}
movie2 = db.movies.find(yearquery)[:3]
for film in movie2:
    jsonp(film)

{'_id': ObjectId('573a13a3f29313caabd0d5a4'),
 'awards': {'nominations': 0, 'text': '1 win.', 'wins': 1},
 'countries': ['USA'],
 'directors': ['William K.L. Dickson'],
 'fullplot': 'A young man stands before the camera holding a club in each '
             'hand, horizontal to the ground. He raises the heads of the two '
             'clubs in unison, by rotating the clubs without lifting his arms. '
             'The film then shows the same footage over again, at different '
             'speeds.',
 'genres': ['Documentary', 'Short'],
 'imdb': {'id': 241763, 'rating': 4.9, 'votes': 827},
 'languages': ['English'],
 'lastupdated': '2015-08-03 00:57:26.680000000',
 'num_mflix_comments': 1,
 'plot': 'An athlete swings Indian clubs.',
 'rated': 'NOT RATED',
 'runtime': 1,
 'title': 'Newark Athlete',
 'tomatoes': {'lastUpdated': datetime.datetime(2012, 9, 30, 0, 0)},
 'type': 'movie',
 'year': 1891}
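Several comparison operators can be combined on one field to express a range. The sketch below builds such a filter and checks it against hypothetical sample documents with a tiny pure-Python matcher. `matches_range` only imitates the `$gte`/`$lt` semantics for illustration and is not part of PyMongo.

```python
def year_range_query(start, end):
    """Filter for start <= year < end, in MongoDB query syntax."""
    return {"year": {"$gte": start, "$lt": end}}


def matches_range(document, query):
    """Imitate $gte/$lt matching for numeric fields (illustration only)."""
    for field, conditions in query.items():
        value = document.get(field)
        if value is None:
            return False
        if "$gte" in conditions and not value >= conditions["$gte"]:
            return False
        if "$lt" in conditions and not value < conditions["$lt"]:
            return False
    return True


query = year_range_query(1890, 1900)

# Hypothetical sample documents.
sample = [{"title": "Old", "year": 1891}, {"title": "New", "year": 1999}]
selected = [doc["title"] for doc in sample if matches_range(doc, query)]
print(query)
print(selected)

# With a live connection the same filter could be passed on unchanged:
# db.movies.find(query).limit(3)
```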


find() with a "$regex" query for a string pattern

titlequery = {'title':{"$regex":"^s"}}
movie3 = db.movies.find(titlequery)[:2]
for film in movie3:
    jsonp(film)

In [10]:
genrequery = {'genres':{"$regex":"^Drama"}}
movie4 = db.movies.find(genrequery)[:2]
for film in movie4:
    jsonp(film)

{'_id': ObjectId('573a1390f29313caabcd4eaf'),
 'awards': {'nominations': 0, 'text': '1 win.', 'wins': 1},
 'cast': ['Jane Gail', 'Ethel Grandin', 'William H. Turner', 'Matt Moore'],
 'countries': ['USA'],
 'directors': ['George Loane Tucker'],
 'genres': ['Crime', 'Drama'],
 'imdb': {'id': 3471, 'rating': 6.0, 'votes': 371},
 'languages': ['English'],
 'lastupdated': '2015-09-15 02:07:14.247000000',
 'num_mflix_comments': 1,
 'plot': 'A woman, with the aid of her police officer sweetheart, endeavors to '
         'uncover the prostitution ring that has kidnapped her sister, and the '
         'philanthropist who secretly runs it.',
 'poster': 'https://m.media-amazon.com/images/M/MV5BYzk0YWQzMGYtYTM5MC00NjM2LWE5YzYtMjgyNDVhZDg1N2YzXkEyXkFqcGdeQXVyMzE0MjY5ODA@._V1_SY1000_SX677_AL_.jpg',
 'rated': 'TV-PG',
 'released': datetime.datetime(1913, 11, 24, 0, 0),
 'runtime': 88,
 'title': 'Traffic in Souls',
 'tomatoes': {'dvd': datetime.datetime(2008, 8, 26, 0, 0),
              'lastUpdated':
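A `$regex` filter on an array field such as `genres` matches when at least one element matches, which is why a movie with the genres `['Crime', 'Drama']` appears for `^Drama`. Matching is case-sensitive unless `$options` contains `i`. The sketch below imitates this with Python's `re` module on hypothetical documents. `regex_matches` is an illustrative helper, and Python's regex dialect is not identical to the one MongoDB uses.

```python
import re


def regex_matches(value, pattern, options=""):
    """Imitate a $regex filter: arrays match if any element matches."""
    flags = re.IGNORECASE if "i" in options else 0
    candidates = value if isinstance(value, list) else [value]
    return any(
        isinstance(item, str) and re.search(pattern, item, flags)
        for item in candidates
    )


# Hypothetical sample documents.
docs = [
    {"title": "Traffic in Souls", "genres": ["Crime", "Drama"]},
    {"title": "Sample Title", "genres": ["Documentary"]},
    {"title": "second sample", "genres": ["Short"]},
]

by_genre = [d["title"] for d in docs if regex_matches(d["genres"], "^Drama")]
lower_s = [d["title"] for d in docs if regex_matches(d["title"], "^s")]
any_s = [d["title"] for d in docs if regex_matches(d["title"], "^s", "i")]
print(by_genre, lower_s, any_s)

# Equivalent case-insensitive filter in MongoDB query syntax:
case_insensitive_query = {"title": {"$regex": "^s", "$options": "i"}}
```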

find(): print multiple users matching one of several conditions

In [11]:
users3 = db.users.find({"$or":[{'name':"Cersei Lannister"},{'name':"Catelyn Stark"}]})
for user in users3:
    jsonp(user)

{'_id': ObjectId('59b99db6cfa9a34dcd7885ba'),
 'email': 'lena_headey@gameofthron.es',
 'name': 'Cersei Lannister',
 'password': '$2b$12$FExjgr7CLhNCa.oUsB9seub8mqcHzkJCFZ8heMc8CeIKOZfeTKP8m'}
{'_id': ObjectId('59b99db5cfa9a34dcd7885b9'),
 'email': 'michelle_fairley@gameofthron.es',
 'name': 'Catelyn Stark',
 'password': '$2b$12$fiaTH5Sh1zKNFX2i/FTEreWGjxoJxvmV7XL.qlfqCr8CwOxK.mZWS'}
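When every branch of an `$or` tests the same field for equality, the `$in` operator expresses the same condition more compactly. A small sketch; `names_query` is a hypothetical helper:

```python
def names_query(names):
    """Build a filter matching any of the given names via $in."""
    return {"name": {"$in": list(names)}}


query = names_query(["Cersei Lannister", "Catelyn Stark"])
print(query)

# With a live connection the filter could be used like this:
# for user in db.users.find(query):
#     jsonp(user)
```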


Combine find() with sort()
-1 = descending
1  = ascending

In [12]:
movie5 = db.movies.find().sort('runtime', -1)[:5] 
for film in movie5:
    jsonp(film)

{'_id': ObjectId('573a1397f29313caabce69db'),
 'awards': {'nominations': 2,
            'text': 'Nominated for 2 Golden Globes. Another 3 wins & 2 '
                    'nominations.',
            'wins': 5},
 'cast': ['Raymond Burr',
          'Barbara Carrera',
          'Richard Chamberlain',
          'Robert Conrad'],
 'countries': ['USA'],
 'fullplot': 'This is the story of the evolution of the town Centennial, '
             'Colorado. It follows the paths of dozens of people who come to '
             'the area for many reasons: money, freedom, or crime. It also '
             'shows the bigoted treatment of the Native Indians by the '
             'advancing US colonists. It is topped off with a murder mystery '
             'that takes 100 years to solve.',
 'genres': ['Action', 'Adventure', 'Drama'],
 'imdb': {'id': 76993, 'rating': 8.5, 'votes': 2071},
 'languages': ['English'],
 'lastupdated': '2015-09-02 00:38:39.193000000',
 'num_mflix_comments': 1,
 'plot': 'The economi

Multiple conditions at once

In [38]:
movie6 = db.movies.find({ 'year': {'$gt': 2007}, '$and': [{'genres': 'Drama'},{'type': 'series'}]})[:2]
for film in movie6:
    jsonp(film)

{'_id': ObjectId('573a13aef29313caabd2c99e'),
 'awards': {'nominations': 32,
            'text': 'Nominated for 1 Golden Globe. Another 23 wins & 32 '
                    'nominations.',
            'wins': 24},
 'cast': ['James Badge Dale',
          'Joseph Mazzello',
          'Jon Seda',
          'Sebastian Bertoli'],
 'countries': ['USA'],
 'fullplot': 'The Pacific follows the lives of a U.S Marine Corps squad during '
             'the campaign within the Pacific against the Japanese Empire '
             'during WW2. Made by the creators of Band of Brothers, it follows '
             'a similar line of thought to outline the hardships of the common '
             'man during war. the Pacific is in parts a fast paced war series '
             'that can be enjoyed by action lovers whilst containing a more '
             'sensitive side when projecting the relationships (brotherhood) '
             'of Marines on the battlefield. where the Pacific takes a new '
             'direc

Query all genres in movies with distinct()

In [14]:
allgenres = db.movies.distinct('genres')
print(allgenres)

['Action', 'Adventure', 'Animation', 'Biography', 'Comedy', 'Crime', 'Documentary', 'Drama', 'Family', 'Fantasy', 'Film-Noir', 'History', 'Horror', 'Music', 'Musical', 'Mystery', 'News', 'Romance', 'Sci-Fi', 'Short', 'Sport', 'Talk-Show', 'Thriller', 'War', 'Western']


In [15]:
query = {'awards.nominations': {'$gt': 1}}
nomination = db.movies.find(query)
for num in nomination:
    print(num['awards']['nominations'])

2
2
3
2
3
3
2
3
4
2
...

In [44]:
titles = db.movies.find({ 'year': {'$gt': 2013}, '$and': [{'genres': 'Sci-Fi'},{'type': 'movie'},{'awards.wins': {'$gt': 44}}]})
for film in titles:
    jsonp(film['title'])

'Interstellar'
'Guardians of the Galaxy'
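Top-level keys in a filter document are already combined with a logical AND, so an explicit `$and` is only required when the same field or operator has to appear more than once. The query above can therefore be written as a flat document, and a projection can restrict the result to the title. A sketch under those assumptions; the variable names are illustrative:

```python
# Flat filter: all top-level conditions must hold (implicit AND).
flat_filter = {
    "year": {"$gt": 2013},
    "genres": "Sci-Fi",
    "type": "movie",
    "awards.wins": {"$gt": 44},
}

# Projection: return only the title and suppress the default _id field.
title_only = {"title": 1, "_id": 0}

print(flat_filter)
print(title_only)

# With a live connection:
# for film in db.movies.find(flat_filter, title_only):
#     print(film["title"])
```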


Join of movies and comments

In [17]:
allmovies = db.movies.find({})
allcomments = db.comments.find({})

print(join(allmovies, allcomments, "_id", "movie_id"))

                          _id_x  \
0      573a1390f29313caabcd4eaf   
1      573a1390f29313caabcd516c   
2      573a1390f29313caabcd587d   
3      573a1391f29313caabcd6d40   
4      573a1391f29313caabcd6f98   
...                         ...   
41074  573a13f9f29313caabdea062   
41075  573a13faf29313caabdeba84   
41076  573a13faf29313caabdebcea   
41077  573a13faf29313caabdec056   
41078  573a13faf29313caabdeca48   

                                                    plot  \
0      A woman, with the aid of her police officer sw...   
1      Original advertising for the film describes it...   
2      At 10 years old, Owens becomes a ragged orphan...   
3      A tipsy doctor encounters his patient sleepwal...   
4      A romantic rivalry among members of a secret s...   
...                                                  ...   
41074                                                NaN   
41075  A man who supposley killed his family hides at...   
41076  In the form of an animated docu-

Collection of complex questions
Which movies receive the most comments? (actors, genres, etc.)
Which movies have the most comments?
Which movies with 10 award wins have the most comments?

In [32]:
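# Note: documents in `movies` have no `movie_id` field (it belongs to `comments`),
# so "$movie_id" resolves to None and every movie falls into a single group.
test = db.movies.aggregate([
    {"$group" : {"_id":{"_id":"$movie_id"}, "count":{"$sum":1}}}
])
for test2 in test:
    jsonp(test2)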

{'_id': {'_id': None}, 'count': 23530}
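To count comments per movie, one option is to group the `comments` collection (whose documents carry `movie_id`) rather than `movies`. The pipeline below is a sketch of that idea and has not been run against the database. The `Counter`-based function merely imitates what `$group`, `$sort` and `$limit` would compute, using hypothetical in-memory comments.

```python
from collections import Counter

# Pipeline intended for db.comments.aggregate(...): count comments per movie,
# then keep the five movies with the most comments.
top_commented_pipeline = [
    {"$group": {"_id": "$movie_id", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
    {"$limit": 5},
]


def count_by_movie(comments, limit=5):
    """Pure-Python imitation of the pipeline above (illustration only)."""
    counts = Counter(comment["movie_id"] for comment in comments)
    return [
        {"_id": movie_id, "count": count}
        for movie_id, count in counts.most_common(limit)
    ]


# Hypothetical comments; real documents use ObjectId values for movie_id.
sample_comments = [
    {"movie_id": "m1"}, {"movie_id": "m2"}, {"movie_id": "m1"}, {"movie_id": "m1"},
]
result = count_by_movie(sample_comments)
print(result)

# With a live connection:
# for row in db.comments.aggregate(top_commented_pipeline):
#     jsonp(row)
```

The `_id` values in the result are movie identifiers, so the titles could then be looked up in `movies` or attached with the `join` helper defined earlier.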
