Commit 3d29468

Final transcripts (ch 9 and 10) for course.
1 parent edf7c0c commit 3d29468

26 files changed

+1457
-0
lines changed

Diff for: transcripts/ch10-conclusion/1.txt

+23
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,23 @@
1+
00:00 There it is, the finish line!
2+
00:03 That's right, you've made it all the way to the end of this course,
3+
00:05 I hope you found it super interesting and you've learned a lot,
4+
00:08 because I believe you now have enough to build production ready applications
5+
00:13 and deploy them based on MongoDB.
6+
00:15 So really, the big question you need to be asking yourself is
7+
00:17 what are you going to build now,
8+
00:19 you have this amazing new power, this amazing new database,
9+
00:22 and way of writing data driven applications, what are you going to build?
10+
00:24 I hope you take what you learned in this course,
11+
00:26 and you go build something amazing.
12+
00:28 Now, before you do leave, and you go build that thing,
13+
00:31 let's talk about a few wrap up details;
14+
00:33 first of all, make sure you get the materials from the GitHub repository,
15+
00:36 if you haven't already, go to
16+
00:38 github.com/mikeyckennedy/mongodb-for-python-developers,
17+
00:42 the url is there at the bottom, and star this, and consider also forking it
18+
00:46 so you have a permanent version for yourself.
19+
00:49 As far as I know, the git materials are entirely finished and published,
20+
00:54 there is a chance that somebody will find a small bug
21+
00:57 somewhere in the course and I'll amend that,
22+
00:59 so very likely what you see at this GitHub repository is the final materials,
23+
01:04 it's certainly what you saw me create online during these videos.

Diff for: transcripts/ch10-conclusion/2.txt

+18
@@ -0,0 +1,18 @@
1+
00:01 Before we put the wraps on this course
2+
00:04 let's do a quick lightning review of each chapter that we've covered.
3+
00:06 We're certainly not going to cover everything that we covered in the chapter,
4+
00:09 this is just a really quick review, but maybe the main takeaway from each chapter.
5+
00:13 So we began the course by talking about what NoSQL is,
6+
00:17 and I think there's a little bit of a misunderstanding
7+
00:20 or maybe multiple definitions of what NoSQL means,
8+
00:23 sometimes people say it's not only SQL,
9+
00:26 sometimes people say it means that there's no SQL, the language, involved in this.
10+
00:31 Well what we saw is looking at the history back in 2009,
11+
00:34 this concept of NoSQL came about by a meeting of people
12+
00:39 working on horizontally scalable types of databases,
13+
00:42 like what trade-offs do they make against relational databases,
14+
00:45 so that they are more easily horizontally scalable,
15+
00:48 and basically cluster friendly databases.
16+
00:50 That word, it's not about whether or not there's no SQL
17+
00:53 or there is SQL in the language, it's really about the style of databases
18+
00:57 and the trade-offs around how they work with that data.

Diff for: transcripts/ch10-conclusion/3.txt

+33
@@ -0,0 +1,33 @@
1+
00:01 The MongoDB shell and native query syntax;
2+
00:03 we saw that the MongoDB shell which you start by typing the word 'mongo'
3+
00:07 and it just runs the shell, tries to talk to the local one,
4+
00:10 there's all the different ways to get it to connect to different servers as we've seen.
5+
00:13 So once it starts you get this little greater than prompt
6+
00:16 and you write JavaScript, so we interact with MongoDB at the lowest level
7+
00:22 in JavaScript in a textual way
8+
00:24 and actually this is converted to BSON, a binary extended version of JSON.
9+
00:29 So here we type something like db so this is the database we have active
10+
00:33 and book would be the collection name
11+
00:35 or table if you're still thinking relationally, but the collection name,
12+
00:38 and we say things like find or count or sort, or things like this
13+
00:42 and what we give it is this prototypical JSON object
14+
00:45 and what we get back are all the things that match the elements of that prototype.
15+
00:50 So here you can see we got two records back
16+
00:53 and they both had the same title as the title we indicated here.
17+
00:56 So it's very much about passing these prototypical JSON documents,
18+
01:00 however sometimes we have to do more than just say
19+
01:04 I want basically equality in my search,
20+
01:07 I would like to express things like greater than.
21+
01:09 So this query here that we have written
22+
01:12 is actually doing a couple of very interesting things,
23+
01:14 maybe the thing that stands out the most is this greater than operator,
24+
01:17 so the dollar gte is indicating, the dollar indicates an operator,
25+
01:20 and gte is the name of the greater than or equal to operator,
26+
01:23 so instead of just saying ratings.value is nine,
27+
01:26 we're saying I'd like all the ratings where the value is either equal to or greater than nine.
28+
01:30 The other powerful and interesting thing here is
29+
01:33 we're actually traversing this hierarchy of the document
30+
01:36 we're going to find the ratings array which is a list of subdocuments
31+
01:39 which has a value as an integer,
32+
01:41 so we're actually reaching down inside that document
33+
01:44 and we're doing this query with this operator.
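The matching rules described in this transcript (equality against a prototypical document, a dotted path that reaches into an array of subdocuments, and the $gte operator) can be modeled in a few lines of plain Python. This is only a toy model for illustration, not how MongoDB implements matching; the sample books and the helper names `values_at` and `matches` are made up here.

```python
from typing import Any


def values_at(doc: Any, path: str) -> list:
    """Collect every value reachable through a dotted path such as 'ratings.value'."""
    current = [doc]
    for key in path.split("."):
        next_level = []
        for item in current:
            # An array of subdocuments is traversed element by element.
            candidates = item if isinstance(item, list) else [item]
            for candidate in candidates:
                if isinstance(candidate, dict) and key in candidate:
                    next_level.append(candidate[key])
        current = next_level
    return current


def matches(doc: dict, query: dict) -> bool:
    """Return True when doc satisfies every entry of the prototype document."""
    for path, condition in query.items():
        values = values_at(doc, path)
        if isinstance(condition, dict) and "$gte" in condition:
            # Operator form: at least one value must be >= the threshold.
            found = any(value >= condition["$gte"] for value in values)
        else:
            # Plain form: at least one value must equal the prototype value.
            found = any(value == condition for value in values)
        if not found:
            return False
    return True


# Hypothetical sample data; only the field names follow the transcript.
books = [
    {"title": "Example Title", "ratings": [{"value": 9}, {"value": 4}]},
    {"title": "Example Title", "ratings": [{"value": 3}]},
    {"title": "Another Title", "ratings": [{"value": 10}]},
]

same_title = [b for b in books if matches(b, {"title": "Example Title"})]
highly_rated = [b for b in books if matches(b, {"ratings.value": {"$gte": 9}})]
```

Here `same_title` plays the role of the simple equality search, and `highly_rated` corresponds to the query that reaches down into the ratings array with the greater than or equal to operator.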

Diff for: transcripts/ch10-conclusion/4.txt

+63
@@ -0,0 +1,63 @@
1+
00:00 Next up, we worked with PyMongo.
2+
00:02 So we put our JavaScript away, we said all right, enough with the JavaScript stuff,
3+
00:05 we're going to write in Python basically for the rest of this course.
4+
00:08 So the lowest level way to talk to MongoDB from Python is with PyMongo.
5+
00:13 So let's look at a couple of the CRUD operations here.
6+
00:16 We'll start of course by importing the package, import pymongo,
7+
00:20 and if you don't have it just pip install it;
8+
00:22 and then we need to create a Mongo client by passing a connection string,
9+
00:26 I believe if you actually get a hold of the PyMongo connection
10+
00:29 you can use it directly, but you should not, because the Mongo client handles
11+
00:34 reconnects and connection pooling and stuff like that
12+
00:36 whereas the connection itself wouldn't do those kinds of things.
13+
00:39 Then if we want to work with the database,
14+
00:42 we have this sort of interesting, highly dynamic API,
15+
00:46 we go to the client and we just say . (dot) the name of the database
16+
00:49 so we say client.the_small_bookstore, and we assign that to db
17+
00:54 so it looks like the rest of the shell stuff that we have been doing,
18+
00:57 but technically that's optional.
19+
00:59 This database doesn't even have to exist,
20+
01:02 we could create the database in this style just by doing our first insert into it.
21+
01:05 Whether or not it exists, we get a hold of the database
22+
01:08 and now we can operate on the collections.
23+
01:11 Let's imagine that in that database there's a collection called books
24+
01:15 and we want to know how many of them there are,
25+
01:17 we would just say db.books.count
26+
01:20 and that would actually go there and do this operation.
27+
01:22 If it happens to be that either the database or the collection doesn't exist,
28+
01:25 it doesn't crash, you get zero.
29+
01:27 We could also do a find_one, this line here is notable
30+
01:31 because in the JavaScript API it's findOne
31+
01:34 and they've made a pythonic version here, so find_one
32+
01:39 just be aware that it's not always a one to one
33+
01:42 exact verbatim match from the native query syntax over to PyMongo.
34+
01:46 We can also do an actual search,
35+
01:50 before, we said find_one and basically got the first one,
36+
01:54 here we're going to say I want to find a book by isbn, I want to pass it over,
37+
01:57 here we use Python dictionaries
38+
01:59 which play the role of those prototypical JSON objects.
39+
02:01 We also insert new data, so here we're going to say
40+
02:06 insert this thing which is a dictionary, it has a title called new book
41+
02:10 and an isbn of whatever is written there and we get back this result,
42+
02:15 the result will have this object id in the field inserted_id,
43+
02:20 we can go requery it and do all sorts of stuff with it.
44+
02:23 Basically when we say insert one, we get this result
45+
02:26 which, if it succeeds has the inserted id.
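The basic operations just described can be sketched as small helper functions. This is a minimal sketch, not the code shown in the video: the function names are made up for illustration, and each function takes a PyMongo collection as an argument so the block itself does not need a running server. The transcript says count; recent PyMongo releases expose that operation as count_documents, which is what the sketch assumes.

```python
# Getting a real collection would look roughly like this (left as comments,
# since it needs the pymongo package and a reachable MongoDB server):
#
#   import pymongo
#   client = pymongo.MongoClient("mongodb://localhost:27017")
#   db = client.the_small_bookstore
#   books = db.books


def count_books(books):
    """How many documents are in the collection (0 if it does not exist yet)."""
    return books.count_documents({})


def first_book(books):
    """find_one with no filter returns the first document, or None."""
    return books.find_one()


def find_by_isbn(books, isbn):
    """A Python dictionary plays the role of the prototypical JSON object."""
    return books.find_one({"isbn": isbn})


def add_book(books, title, isbn):
    """Insert a new document and hand back the id the database generated."""
    result = books.insert_one({"title": title, "isbn": isbn})
    return result.inserted_id
```

Passing the collection in also makes these helpers easy to exercise with a stand-in object in tests.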
46+
02:29 Now these are the straightforward CRUD operations,
47+
02:31 we can also use our fancy in place operators,
48+
02:34 so here let's just insert this book, so we see what we get,
49+
02:36 and we grab a hold of the inserted id,
50+
02:38 and now suppose we want to add a field called favorited_by,
51+
02:43 and this is going to be a list, and we want the list to be basically distinct
52+
02:47 we're adding the ids of the customers or people visiting our site
53+
02:50 who have favorited this book, and we'd like to put them in there
54+
02:54 but there's no reason to have them in there twice,
55+
02:56 that can cause all sorts of problems.
56+
02:58 We're going to use the dollar addToSet operator, so we run this,
57+
03:01 run it again for 1002, and hey we could run it a second time for 1002,
58+
03:05 and what we'll end up with is an object that looks like this,
59+
03:08 the two things we inserted, the generated _id
60+
03:11 and its favorited_by list which has 1001 and 1002.
61+
03:15 Definitely keep in mind these in place operators
62+
03:19 because they're very powerful and they leverage some of the special properties
63+
03:23 of the way MongoDB treats documents atomically.
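To make the effect of the addToSet operator concrete, here is a plain-Python model of what happens to the document. It only imitates the outcome on a dictionary; the real operator runs inside the database. The helper name `add_to_set` and the sample book are made up for illustration, and the commented PyMongo call shows the form the update would presumably take.

```python
def add_to_set(doc: dict, field: str, value) -> dict:
    """Append value to the list at doc[field] only if it is not already there."""
    values = doc.setdefault(field, [])  # the field is created on first use
    if value not in values:
        values.append(value)
    return doc


# With PyMongo the same intent would be expressed roughly as:
#   db.books.update_one({"_id": book_id},
#                       {"$addToSet": {"favorited_by": 1001}})

book = {"_id": "generated-id", "title": "New Book", "isbn": "1234567890"}

add_to_set(book, "favorited_by", 1001)
add_to_set(book, "favorited_by", 1002)
add_to_set(book, "favorited_by", 1002)  # running it again changes nothing
```

After the three calls, `book["favorited_by"]` holds 1001 and 1002 once each, which is the distinct list the transcript describes.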

Diff for: transcripts/ch10-conclusion/5.txt

+48
@@ -0,0 +1,48 @@
1+
00:01 Next up was document design.
2+
00:03 Some of the concepts and ideas of relational databases still apply here,
3+
00:07 you still are modeling data, you still put it into a database,
4+
00:10 but many of the techniques fall down,
5+
00:13 this whole concept of third normal form
6+
00:15 doesn't make nearly as much sense as it does in a relational database.
7+
00:18 What we focus on more often is really
8+
00:21 how do we make relationships either between documents or within documents.
9+
00:25 We saw the primary question, not the only one, but the most challenging one,
10+
00:30 the one you have to think most carefully about is to embed or not to embed,
11+
00:34 and I gave you a few rules or tips to help you guide this decision.
12+
00:38 One: is the embedded data wanted, and do you use it 80 percent of the time or more,
13+
00:44 most of the time when you get that containing document?
14+
00:48 If that's true, you probably want to embed,
15+
00:51 if that's false, maybe consider that as a warning sign not to.
16+
00:54 How often do you want the embedded document without the outer containing document?
17+
00:59 If often what you really want to get access to is these little inside pieces,
18+
01:03 there's a lot of overhead and it really kind of complicates the way
19+
01:07 you access it through your application,
20+
01:09 if you want to get them most of the time, or frequently, on their own.
21+
01:13 Is the embedded data a bounded set?
22+
01:16 Remember, these documents can only be sixteen megabytes or smaller,
23+
01:19 the number is way higher than you really want it to be,
24+
01:22 if this is an unbounded set you're going to continue to add to it,
25+
01:25 it very easily could outgrow the actual size that you're allowed to store.
26+
01:28 Really for a performance reason though, is it a bounded set and is that set small?
27+
01:34 Because if you put huge amounts of data in there,
28+
01:36 you're going to really slow down your read time
29+
01:38 for these database operations that involve this document.
30+
01:41 These are the four main rules here,
31+
01:43 you also want to consider how your application accesses this data,
32+
01:47 it might be really easy to answer these four questions
33+
01:50 because there's a very constrained and small set of queries
34+
01:53 you run against your database;
35+
01:55 or it could be that you ask all sorts of questions in highly varied ways
36+
01:59 in which case it's harder to answer those questions,
37+
02:02 the more types of queries you have the harder it is to know
38+
02:05 whether most of the time you want the embedded data for example.
39+
02:08 The more varied your queries are, the more you'll trend
40+
02:11 towards third normal form, relational style and less embedding.
41+
02:15 One of the situations where you have lots of varied queries is
42+
02:18 if you have this thing called an integration database,
43+
02:21 which we talked about sort of sharing a database across different applications,
44+
02:24 versus having one dedicated to a particular application
45+
02:27 where you can understand these questions very clearly.
46+
02:30 So when you're designing these documents
47+
02:33 you want to really think most carefully about do you want to embed this data
48+
02:36 or create a soft foreign key type of relationship.
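The two shapes being weighed here, embedding versus a soft foreign key, can be written out as plain Python dictionaries. This is a sketch with made-up field names and ids. The size check uses the length of the JSON encoding as a rough stand-in for the stored BSON size, which is an approximation and not the exact number MongoDB would compute.

```python
import json

MAX_DOCUMENT_BYTES = 16 * 1024 * 1024  # the 16 MB per-document limit

# Option 1: embed the ratings inside the containing document.
book_embedded = {
    "_id": "book-1",
    "title": "Example Book",
    "ratings": [{"user_id": 1001, "value": 9}, {"user_id": 1002, "value": 7}],
}

# Option 2: keep ratings in their own collection and point back at the book
# with a soft foreign key (book_id). Nothing in the database enforces it.
book_referenced = {"_id": "book-1", "title": "Example Book"}
ratings = [
    {"_id": "rating-1", "book_id": "book-1", "user_id": 1001, "value": 9},
    {"_id": "rating-2", "book_id": "book-1", "user_id": 1002, "value": 7},
    {"_id": "rating-3", "book_id": "book-2", "user_id": 1001, "value": 5},
]


def approximate_size(doc: dict) -> int:
    """Rough byte size of a document, using JSON as a proxy for BSON."""
    return len(json.dumps(doc).encode("utf-8"))


def fits_in_one_document(doc: dict, limit: int = MAX_DOCUMENT_BYTES) -> bool:
    """An unbounded embedded list would eventually make this return False."""
    return approximate_size(doc) <= limit


def ratings_for(book: dict, all_ratings: list) -> list:
    """Resolve the soft foreign key in application code (a second lookup)."""
    return [r for r in all_ratings if r["book_id"] == book["_id"]]
```

With the embedded shape one read returns everything; with the referenced shape the ratings can be fetched on their own, at the cost of the extra lookup in `ratings_for`.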

Diff for: transcripts/ch10-conclusion/6.txt

+72
@@ -0,0 +1,72 @@
1+
00:00 After we talked about document design
2+
00:03 and we talked about the raw access from PyMongo
3+
00:05 we said let's take this up a level of abstraction,
4+
00:08 let's actually build classes and map those over ORM style into MongoDB.
5+
00:14 We saw a really nice way to do that is with the ODM called MongoEngine.
6+
00:19 Let's review the main way that we sort of define classes
7+
00:23 and add constraints and things like that.
8+
00:25 Over here we are going to create this car object, this is our dealership example
9+
00:30 and we are going to store the car in the database.
10+
00:33 The way we create something that MongoEngine can manage
11+
00:37 in MongoDB as a top level document,
12+
00:40 is that we're going to derive from mongoengine.Document.
13+
00:43 And then every field is going to be one of these fundamental field types,
14+
00:46 like StringField, IntField, FloatField and so on.
15+
00:50 And we can have some of them required, the first three required,
16+
00:53 we can have some of them with basic default values, like mileage defaults to zero
17+
00:59 but we can also have interesting functions,
18+
01:01 for example the vin number is automatically generated
19+
01:04 and we're basing this on the uuid4 random alphanumeric thing,
20+
01:08 so what we have here so far is really sort of equivalent
21+
01:11 to what you might have in a traditional relational database,
22+
01:15 there's an entry and there is a flat set of what you would call columns,
23+
01:19 this is only part of the story,
24+
01:21 remember we can have nested documents,
25+
01:24 we can have actually a rich hierarchy of nested objects.
26+
01:27 One thing we might want to store in the car is an engine
27+
01:30 and the engine itself is a special type,
28+
01:33 here in the field it's going to be an embedded document field
29+
01:36 and Engine derives from mongoengine.EmbeddedDocument,
30+
01:40 not document, embedded document.
31+
01:42 These we're never going to directly insert into the database,
32+
01:44 in fact, we're going to always put them into a car,
33+
01:48 so this is like a strong relationship between a car and its engine,
34+
01:51 we can even mark it as required.
35+
01:53 Now going a little further than that,
36+
01:55 our service history actually contains a list of subdocuments,
37+
01:58 each one modeled by the service record.
38+
02:00 The service record has things like the customer satisfaction,
39+
02:03 what service was performed and so on.
40+
02:06 Now if we take this, put some appropriate data into it and store it,
41+
02:10 we'll get something looking along the lines of this,
42+
02:12 in our document database in MongoDB,
43+
02:15 so here we have the first few elements that are just the flat fields
44+
02:18 and then we have the nested engine, one of them,
45+
02:21 we have the nested array of nested items for the service histories,
46+
02:24 and this really gets at the power of MongoDB,
47+
02:28 this nesting and these strong relationships
48+
02:31 where you get this aggregate object the car,
49+
02:34 that always contains everything we need to know about it.
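The shape of this aggregate car object can be sketched with standard-library dataclasses. This is a stand-in, not MongoEngine code: in the course the classes derive from mongoengine.Document and mongoengine.EmbeddedDocument and use field types like StringField. The field names model, year, horsepower and liters are assumptions made for illustration, as are the sample values.

```python
import uuid
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Engine:
    # Stand-in for an embedded document; these field names are assumptions.
    horsepower: int
    liters: float


@dataclass
class ServiceRecord:
    # One entry in the car's service history.
    description: str
    price: float
    customer_rating: int


def new_vin() -> str:
    """Generate a random alphanumeric id based on uuid4."""
    return uuid.uuid4().hex


@dataclass
class Car:
    # Fields without defaults behave like required fields.
    model: str
    make: str
    year: int
    engine: Engine
    # A plain default value, and defaults produced by calling a function.
    mileage: float = 0.0
    vin: str = field(default_factory=new_vin)
    service_history: List[ServiceRecord] = field(default_factory=list)


car = Car(
    model="Example Model",
    make="Example Make",
    year=2017,
    engine=Engine(horsepower=300, liters=3.0),
)
car.service_history.append(ServiceRecord("Oil change", 45.0, 4))

# asdict gives the nested dictionary shape a document database would store.
document = asdict(car)
```

The resulting `document` has the flat fields first, then one nested engine, then a list of nested service records, which mirrors the stored document described above.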
50+
02:37 How about querying? We're not going to write in the low level API now,
51+
02:42 we're going to use basically the properties of these objects.
52+
02:46 Here's the function that we wrote where we wanted to ask the question
53+
02:49 what percentage of cars have bad customer rating,
54+
02:53 that would be average or below,
55+
02:56 so we're going to go to the car and we say objects,
56+
02:58 we could do lots of these objects.filter.filter.filter
57+
03:02 but if you just have one query you can just stick it in objects,
58+
03:04 so we say objects, then service_history, now we can't say dot here,
59+
03:08 because service_history . customer_rating
60+
03:10 would not be a valid variable name or parameter name in Python,
61+
03:13 so we're going to traverse a hierarchy with a double underscore.
62+
03:17 We also might want to apply one of the operators,
63+
03:18 in this case we're going to say less than 4,
64+
03:21 so we're going to use again this double underscore,
65+
03:24 but in this case it's going to say on the left is the name of the target
66+
03:28 and on the right is the operator we're going to apply to it.
67+
03:31 You don't put the dollar again, that wouldn't be valid in Python,
68+
03:34 but double underscore __lt, and then we can ask
69+
03:38 things like count, or go and get the first one, or things like that.
70+
03:42 We can even do paging by slicing on that result.
71+
03:45 This syntax lets us use almost the entire spectrum of ways of querying MongoDB
72+
03:50 really straightforward and in a way that ties back to the car object that we defined.
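The double underscore convention can be illustrated by translating keyword arguments into the dotted, dollar-prefixed form of the native query syntax. This is a simplified model written for illustration, not MongoEngine's actual translation code; the function name `to_mongo_query` and the short operator list are assumptions. In the course, the call itself has the form Car.objects(service_history__customer_rating__lt=4).

```python
# A small subset of operator suffixes, enough for this illustration.
OPERATORS = {"lt", "lte", "gt", "gte", "ne", "in", "nin"}


def to_mongo_query(**filters) -> dict:
    """Turn keyword filters like a__b__lt=4 into {"a.b": {"$lt": 4}}."""
    query = {}
    for keyword, value in filters.items():
        parts = keyword.split("__")
        if len(parts) > 1 and parts[-1] in OPERATORS:
            # The right-most part names the operator; the dollar sign is
            # added here because it is not valid in a Python keyword.
            operator = "$" + parts.pop()
            query[".".join(parts)] = {operator: value}
        else:
            # No operator suffix means plain equality.
            query[".".join(parts)] = value
    return query


bad_rating_query = to_mongo_query(service_history__customer_rating__lt=4)
by_make_query = to_mongo_query(make="Example Make")
```

On the left of the last double underscore is the path to the target field, and on the right is the operator, which is the split the transcript describes.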

Diff for: transcripts/ch10-conclusion/7.txt

+46
@@ -0,0 +1,46 @@
1+
00:01 At this point, we pretty much had MongoDB
2+
00:03 doing everything we needed it to do,
3+
00:05 and we'd heard MongoDB was fast,
4+
00:07 but it turned out it didn't really seem to be behaving as quickly as maybe we hoped,
5+
00:11 we put a ton of data from our dealership in there,
6+
00:14 and we were getting query times of like one second, 700 milliseconds, stuff like that.
7+
00:17 It was okay, but really, we saw it can do much better.
8+
00:20 What levers and knobs do we have to turn to make this faster?
9+
00:24 The most important one, even more important than in relational databases,
10+
00:28 are the indexes, we'll see MongoEngine as well as PyMongo and the shell
11+
00:33 all have really good ways to deal with this.
12+
00:35 Document design is really important, mostly around this embedding question
13+
00:39 but there are many ways to think about document design,
14+
00:42 there's a lot of really non intuitive and powerful patterns,
15+
00:45 design patterns you can apply here.
16+
00:48 What is your query style, maybe one query is better than another
17+
00:51 and using projections to only pull back a subset of responses,
18+
00:56 suppose we have a car that has a ton of those service histories
19+
00:59 and we don't care about them for a particular query
20+
01:02 we could suppress returning those from the database
21+
01:04 which saves us a lot of bandwidth on the network,
22+
01:07 disk reads on the database server and deserialization processing on our side.
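The saving from suppressing the service histories can be shown with a small plain-Python model of an exclusion projection. This only imitates the effect on a dictionary; the sample car, the helper name `apply_exclusion`, and the use of JSON length as a rough size measure are assumptions for illustration. With PyMongo, the projection document would be passed as the second argument to find.

```python
import json

# Hypothetical car with a large embedded service history.
car = {
    "make": "Example Make",
    "model": "Example Model",
    "service_history": [
        {"description": f"Service visit {n}", "price": 100.0, "customer_rating": 4}
        for n in range(1000)
    ],
}


def apply_exclusion(doc: dict, projection: dict) -> dict:
    """Drop the top-level fields that the projection flags with 0."""
    excluded = {name for name, flag in projection.items() if flag == 0}
    return {key: value for key, value in doc.items() if key not in excluded}


# With PyMongo this would be roughly:
#   db.cars.find({"make": "Example Make"}, {"service_history": 0})
projection = {"service_history": 0}
slim = apply_exclusion(car, projection)

full_bytes = len(json.dumps(car))
slim_bytes = len(json.dumps(slim))
```

Comparing `full_bytes` with `slim_bytes` gives a feel for how much less data has to be read, sent, and deserialized when the histories are left out.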
23+
01:11 We also saw there are some network topology things we can do,
24+
01:15 replication and sharding, and those are both interesting and powerful
25+
01:19 but not part of this course, so go check that out on your own if you're interested.
26+
01:23 For indexes, we took an example like our car
27+
01:27 and we said let's suppose we have make here
28+
01:30 that we're interested in querying by, and service history as well,
29+
01:32 and if you look below, service history is defined as a list of service record objects
30+
01:36 and they have a description and a customer rating
31+
01:39 and things like this, price for example,
32+
01:41 so our goal is to query these things, the make, the service history and stuff, quickly,
33+
01:45 so we saw adding an index is a really powerful way to do that,
34+
01:48 so all we've got to do is go to our meta object, our meta element here
35+
01:52 and say these are the indexes, as an array,
36+
01:55 now these indexes can simply be the name of the thing,
37+
01:58 like make that's super straightforward,
38+
02:01 they could traverse the hierarchy using the Javascript style, using the dot,
39+
02:05 so we'll say service_history.customer_rating
40+
02:08 and that would go down and let us do queries deep into these cars
41+
02:12 and say let's find the ones that have either good or low customer ratings
42+
02:17 and we can even do composite indexes,
43+
02:19 so here we're having a composite index on price and description,
44+
02:22 within the service history, so we do that by having this fields dictionary thing
45+
02:27 and the fields are an array, so you can use the simple version
46+
02:29 or if you need to, you can get a more complex definition of the index there.
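The three kinds of index entries mentioned here can be written out as the meta dictionary that would sit inside the car class. The dictionary follows the format described in the transcript; the helper `index_keys` is a made-up illustration that expands each entry into (field, direction) pairs, assuming ascending order (1) for every field. It is not what MongoEngine runs internally.

```python
# The index definitions as they would appear in the class's meta dictionary.
meta = {
    "indexes": [
        # Simple index on a top-level field.
        "make",
        # Index that traverses the hierarchy with the dot style.
        "service_history.customer_rating",
        # Composite index, using the more complex dictionary form.
        {"fields": ["service_history.price", "service_history.description"]},
    ]
}


def index_keys(spec) -> list:
    """Expand one index entry into a list of (field, direction) pairs."""
    if isinstance(spec, str):
        return [(spec, 1)]
    return [(name, 1) for name in spec["fields"]]


expanded = [index_keys(spec) for spec in meta["indexes"]]
```

The first two entries expand to one key each, while the composite entry expands to two keys in the order they were listed, and that order matters for which queries the index can serve.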
