# Introduction to Indexes

<img src="img/2.png">

<img src="img/1.png">

<img src="img/3.png">

---

# How Data is Stored on Disk

<img src="img/4.png">

<img src="img/5.png">

<img src="img/6.png">

# Single Field Indexes

In [None]:
# execute the following query and collect execution statistics
db.people.find({ "ssn" : "720-38-5636" }).explain("executionStats")

<img src="img/7.png">

<img src="img/8.png">

In [None]:
# create an ascending index on ssn
db.people.createIndex( { ssn : 1 } )

In [None]:
# create an explainable object for the people collection
exp = db.people.explain("executionStats")

# execute the same query again (should use an index)
exp.find( { "ssn" : "720-38-5636" } ) 

<img src="img/9.png">

<img src="img/10.png">

In [None]:
# execute a new query on the explainable object (can't use the index)
exp.find( { last_name : "Acevedo" } )

<img src="img/11.png">

<img src="img/12.png">

In [None]:
# insert two documents, each with an embedded document
db.examples.insertOne( { _id : 0, subdoc : { indexedField: "value", otherField : "value" } } )
db.examples.insertOne( { _id : 1, subdoc : { indexedField : "wrongValue", otherField : "value" } } )

# create an index using dot-notation
db.examples.createIndex( { "subdoc.indexedField" : 1 } )

# explain a query using dot-notation
db.examples.explain("executionStats").find( { "subdoc.indexedField" : "value" } )

<img src="img/13.png">

In [None]:
# explain a range query (using an index)
exp.find( { ssn : { $gte : "555-00-0000", $lt : "556-00-0000" } } )

In [None]:
# explain a query on a set of values
exp.find( { "ssn" : { $in : [ "001-29-9184", "177-45-0950", "265-67-9973" ] } } )

In [None]:
# explain a query where only some of the predicates can use an index
exp.find( { "ssn" : { $in : [ "001-29-9184", "177-45-0950", "265-67-9973" ] }, last_name : { $gte : "H" } } )
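
The difference between the collection scan and the index scan in the explain output above can be sketched in plain JavaScript. This is only a toy model: the `people` array, the sorted `ssnIndex` array, and the helper names are made up for illustration, and a real MongoDB index is a B-tree rather than a sorted array.

```javascript
// Hypothetical in-memory stand-in for the people collection (not the real course data).
const people = [
  { ssn: "265-67-9973", last_name: "Acevedo" },
  { ssn: "001-29-9184", last_name: "Hill" },
  { ssn: "720-38-5636", last_name: "Johnson" },
  { ssn: "177-45-0950", last_name: "Garcia" },
  { ssn: "555-12-3456", last_name: "Lopez" },
  { ssn: "555-98-7654", last_name: "Nguyen" },
  { ssn: "610-44-1212", last_name: "Smith" },
  { ssn: "834-20-0001", last_name: "Young" },
];

// Collection scan: every document is examined, whether it matches or not.
function collScan(docs, ssn) {
  let examined = 0;
  const found = [];
  for (const doc of docs) {
    examined += 1;
    if (doc.ssn === ssn) found.push(doc);
  }
  return { found, examined };
}

// Toy "index": ssn keys kept in sorted order, each pointing at a document position.
const ssnIndex = people
  .map((doc, pos) => ({ key: doc.ssn, pos }))
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

// Index lookup: binary search over the sorted keys, then fetch the document.
function indexLookup(index, docs, ssn) {
  let lo = 0;
  let hi = index.length - 1;
  let probes = 0; // number of index keys compared during the search
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    probes += 1;
    if (index[mid].key === ssn) {
      return { found: [docs[index[mid].pos]], probes };
    }
    if (index[mid].key < ssn) lo = mid + 1;
    else hi = mid - 1;
  }
  return { found: [], probes };
}

const scan = collScan(people, "720-38-5636");
const viaIndex = indexLookup(ssnIndex, people, "720-38-5636");
console.log("collection scan examined:", scan.examined);
console.log("index lookup probes:", viaIndex.probes);
```

The scan cost grows with the size of the collection, while the number of probes grows only logarithmically, which is why `totalDocsExamined` drops so sharply once the `ssn` index exists.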

# Understanding Explain

In [None]:
# switch to the m201 database
use m201

In [None]:
# create an explainable object with no parameters
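# the verbosity modes below return progressively more detailed output
# this form (the default queryPlanner mode) shows the plan without executing the query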
exp = db.people.explain()

In [None]:
# create an explainable object with the 'executionStats' parameter
# unlike the default mode, this one does execute the query
expRun = db.people.explain("executionStats")

In [None]:
# and one final explainable object with the 'allPlansExecution' parameter
# this one executes the query as well
expRunVerbose = db.people.explain("allPlansExecution")

In [None]:
# execute and explain the query, collecting execution statistics
expRun.find({"last_name":"Johnson", "address.state":"New York"})

In [None]:
# create an index on last_name
db.people.createIndex({last_name:1})

In [None]:
# rerun the query (uses the index)
# good, but not the best we can do
expRun.find({"last_name":"Johnson", "address.state":"New York"})

In [None]:
# create a compound index
db.people.createIndex({"address.state": 1, last_name: 1})

In [None]:
# rerun the query (uses the new index)
# better performance
# the output now also contains a rejected plan (see rejectedPlans)
expRun.find({"last_name":"Johnson", "address.state":"New York"})

In [None]:
# run a sort query
var res = db.people.find({"last_name":"Johnson", "address.state":"New York"}).sort({"birthday":1}).explain("executionStats")

In [None]:
# check out the execution stages (doing an in-memory sort)
# executionStages is read from the bottom up, so the last step is the sort
res.executionStats.executionStages
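
Reading the stages from the innermost one outward can be automated with a small helper. The sketch below assumes a trimmed-down, hypothetical `executionStages` object that keeps only the `stage` and `inputStage` fields; real explain output carries many more fields, the exact stage names vary by server version, and some stages use `inputStages` (plural), which this helper ignores.

```javascript
// Hypothetical, trimmed-down executionStages object for a sorted query.
const executionStages = {
  stage: "SORT",
  inputStage: {
    stage: "FETCH",
    inputStage: { stage: "IXSCAN" },
  },
};

// Return the stage names in execution order: the innermost (leaf) stage first.
function stagesInExecutionOrder(root) {
  const names = [];
  for (let node = root; node; node = node.inputStage) {
    names.unshift(node.stage); // each outer stage consumes the one nested inside it
  }
  return names;
}

console.log(stagesInExecutionOrder(executionStages).join(" -> "));
```

Seeing `SORT` at the end of the chain is the sign that the sort happened in memory after the documents were fetched, rather than being satisfied by the index order.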

<img src="img/14.png">

### Example 01

In [None]:
"executionStats" : {
  "executionSuccess" : true,
  "nReturned" : 23217,
  "executionTimeMillis" : 91,
  "totalKeysExamined" : 23217,
  "totalDocsExamined" : 23217,
  "executionStages" : {
    "stage" : "SORT",
    "nReturned" : 23217,
    "executionTimeMillisEstimate" : 26,
    "works" : 46437,
    "advanced" : 23217,
    "needTime" : 23219,
    "needYield" : 0,
    "saveState" : 363,
    "restoreState" : 363,
    "isEOF" : 1,
    "sortPattern" : {
      "stars" : 1
    },
    "memUsage" : 32522511,
    "memLimit" : 33554432,

# Understanding Explain for Sharded Clusters

In [None]:
# switch to the m201 database
use m201

In [None]:
# enable sharding on the m201 database
sh.enableSharding("m201")

In [None]:
# shard the people collection on the _id index
sh.shardCollection("m201.people", {_id: 1})

In [None]:
# after the import, check the shard distribution (data should be on both shards)
db.people.getShardDistribution()

In [None]:
# check out the explain output for a sharded collection
db.people.find({"last_name":"Johnson", "address.state":"New York"}).explain("executionStats")

---

### Example 01

# Sorting with Indexes

<img src="img/15.png">

<img src="img/16.png">

In [None]:
# switch to the m201 database
use m201

In [None]:
db.people.dropIndexes()

In [None]:
db.people.getIndexes()

In [None]:
db.people.createIndex({ "ssn": 1 })

In [None]:
# find all documents and sort them by ssn
db.people.find({}, { _id : 0, last_name: 1, first_name: 1, ssn: 1 }).sort({ ssn: 1 })

In [None]:
# create an explainable object for the people collection
var exp = db.people.explain('executionStats')

# and rerun the query (uses the index for sorting)
exp.find({}, { _id : 0, last_name: 1, first_name: 1, ssn: 1 }).sort({ ssn: 1 })

<img src="img/17.png">

In [None]:
# this time, sort by first_name (didn't use the index for sorting)
exp.find({}, { _id : 0, last_name: 1, first_name: 1, ssn: 1 }).sort({ first_name: 1 })

<img src="img/18.png">

In [None]:
# and rerun the first query, but sort descending (walks the index backward)
exp.find({}, { _id : 0, last_name: 1, first_name: 1, ssn: 1 }).sort({ ssn: -1 })

In [None]:
# filtering and sorting in the same query (both using the index, backward)
exp.find( { ssn : /^555/ }, { _id : 0, last_name: 1, first_name: 1, ssn: 1 } ).sort( { ssn : -1 } )

<img src="img/19.png">

In [None]:
# drop all indexes
db.people.dropIndexes()

# create a new descending (instead of ascending) index on ssn
db.people.createIndex({ ssn: -1 })

In [None]:
# rerun the same query, now walking the index forward
exp.find( { ssn : /^555/ }, { _id : 0, last_name: 1, first_name: 1, ssn: 1 } ).sort( { ssn : -1 } )
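
Why the same query walks an ascending index backward and a descending index forward can be sketched with sorted arrays. The `ssn` values and the `scanPrefix` helper are hypothetical, and `filter` merely stands in for seeking to the index bounds.

```javascript
// Hypothetical ssn values; an ascending index keeps them in sorted order.
const ascendingIndex = [
  "120-45-0001",
  "555-10-2222",
  "555-47-9000",
  "555-90-1234",
  "556-00-0000",
  "720-38-5636",
];
// A descending index stores the same keys in the opposite order.
const descendingIndex = [...ascendingIndex].reverse();

// The regex /^555/ is a prefix match, so it can be expressed as the bounds ["555", "556").
// direction 1 walks the index as stored, direction -1 walks it in reverse.
function scanPrefix(index, lower, upper, direction) {
  const inBounds = index.filter((key) => key >= lower && key < upper);
  return direction === 1 ? inBounds : [...inBounds].reverse();
}

// sort({ ssn: -1 }) on the ascending index: walk backward.
const backwardOnAscending = scanPrefix(ascendingIndex, "555", "556", -1);
// sort({ ssn: -1 }) on the descending index: walk forward.
const forwardOnDescending = scanPrefix(descendingIndex, "555", "556", 1);

console.log(backwardOnAscending);
console.log(forwardOnDescending);
```

Both scans return the same keys in the same order, which is why the direction of a single-field index does not matter for sorting.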

---

# Querying on Compound Indexes 

<img src="img/20.png">

<img src="img/21.png">

<img src="img/22.png">

<img src="img/23.png">

<img src="img/24.png">

### Index Prefixes

<img src="img/25.png">

<img src="img/26.png">

<img src="img/27.png">

<img src="img/28.png">

<img src="img/29.png">

# When you can sort with Indexes

In [None]:
# confirm you still have an index on job, employer, last_name, & first_name
db.people.getIndexes()

In [None]:
# create an explainable object for the people collection
var exp = db.people.explain("executionStats")

In [None]:
# sort all documents using the verbatim index key pattern
exp.find({}).sort({ job: 1, employer: 1, last_name : 1, first_name : 1 })

In [None]:
# sort all documents using the first two fields of the index (uses the index)
exp.find({}).sort({ job: 1, employer: 1 })

In [None]:
# sort all documents, swapping employer and job (doesn't use the index)
exp.find({}).sort({ employer: 1, job: 1 })

In [None]:
# all of these queries can use the index
db.people.find({}).sort({ job: 1 })
db.people.find({}).sort({ job: 1, employer: 1 })
db.people.find({}).sort({ job: 1, employer: 1, last_name: 1 })

In [None]:
# will still use the index (for sorting)
exp.find({ email:"jenniferfreeman@hotmail.com" }).sort({ job: 1 })

In [None]:
# use the index for filtering and sorting
# important:
# if the second index field were matched with a range instead of an equality, the index probably could not be used for the sort
exp.find({ job: 'Graphic designer', employer: 'Wilson Ltd' }).sort({ last_name: 1 })

In [None]:
# doesn't follow an index prefix, and can't use the index for sorting, only filtering
exp.find({ job: 'Graphic designer' }).sort({ last_name: 1 })

---

In [None]:
# create a new compound index
db.coll.createIndex({ a: 1, b: -1, c: 1 })

In [None]:
# walk the index forward
db.coll.find().sort({ a: 1, b: -1, c: 1 })

In [None]:
# walk the index backward, by inverting the sort predicate
db.coll.find().sort({ a: -1, b: 1, c: -1 })

In [None]:
# all of these queries use the index for sorting
db.coll.find().sort({ a: 1 })
db.coll.find().sort({ a: 1, b: -1 })
db.coll.find().sort({ a: -1 })
db.coll.find().sort({ a: -1, b: 1 })

In [None]:
# uses the index for sorting
exp.find().sort({job: -1, employer: -1})

In [None]:
# sorting is done in-memory
exp.find().sort({job: -1, employer: 1})
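
The rules exercised in this section (index prefixes, equality predicates on the leading keys, and all-or-nothing direction inversion) can be collected into one helper. This is a simplified model written for these notes, not the query planner's actual algorithm; `canSortWithIndex` and its parameters are hypothetical names, and range predicates are deliberately not modelled.

```javascript
// indexSpec and sortSpec are ordered key patterns such as { job: 1, employer: 1 }.
// equalityFields lists the fields the query matches with an equality predicate.
function canSortWithIndex(indexSpec, sortSpec, equalityFields = []) {
  const indexKeys = Object.entries(indexSpec);
  const equality = new Set(equalityFields);
  let pos = 0;
  let orientation = 0; // 1 = walk forward, -1 = walk backward, 0 = not decided yet

  for (const [field, dir] of Object.entries(sortSpec)) {
    // Index keys pinned to one value by an equality predicate may be skipped.
    while (
      pos < indexKeys.length &&
      indexKeys[pos][0] !== field &&
      equality.has(indexKeys[pos][0])
    ) {
      pos += 1;
    }
    // The next index key must be exactly the next sort key.
    if (pos >= indexKeys.length || indexKeys[pos][0] !== field) return false;
    // Every sort key must agree on walking the index forward or backward.
    const needed = dir === indexKeys[pos][1] ? 1 : -1;
    if (orientation !== 0 && needed !== orientation) return false;
    orientation = needed;
    pos += 1;
  }
  return true;
}

const peopleIndex = { job: 1, employer: 1, last_name: 1, first_name: 1 };
console.log(canSortWithIndex(peopleIndex, { job: 1, employer: 1 }));
console.log(canSortWithIndex(peopleIndex, { employer: 1, job: 1 }));
console.log(canSortWithIndex(peopleIndex, { last_name: 1 }, ["job", "employer"]));
console.log(canSortWithIndex(peopleIndex, { last_name: 1 }, ["job"]));
console.log(canSortWithIndex({ a: 1, b: -1, c: 1 }, { a: -1, b: 1, c: -1 }));
console.log(canSortWithIndex(peopleIndex, { job: -1, employer: 1 }));
```

The six calls mirror queries from the cells above; the helper answers only whether the sort can use the index, not whether the filter can.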

---

# Multikey Indexes

In [None]:
# switch to the m201 database
use m201

In [None]:
# insert a document into the products collection
db.products.insert({
  productName: "MongoDB Short Sleeve T-Shirt",
  categories: ["T-Shirts", "Clothing", "Apparel"],
  stock: { size: "L", color: "green", quantity: 100 }
})

In [None]:
# create an index on stock.quantity
db.products.createIndex({ "stock.quantity": 1})

In [None]:
# create an explainable object on the products collection
var exp = db.products.explain()

# look at the explain output for the query (uses an index, isMultiKey is false)
exp.find({ "stock.quantity": 100 })

In [None]:
# insert a document where stock is now an array
db.products.insert({
  productName: "MongoDB Long Sleeve T-Shirt",
  categories: ["T-Shirts", "Clothing", "Apparel"],
  stock: [
    { size: "S", color: "red", quantity: 25 },
    { size: "S", color: "blue", quantity: 10 },
    { size: "M", color: "blue", quantity: 50 }
  ]
})

In [None]:
# rerun our same query (still uses an index, but isMultiKey is now true)
exp.find({ "stock.quantity": 100 })

In [None]:
# creating an index on two array fields will fail
db.products.createIndex({ categories: 1, "stock.quantity": 1 })

In [None]:
# but compound indexes with only 1 array field are good
db.products.createIndex({ productName: 1, "stock.quantity": 1 })

In [None]:
# productName can be an array if stock isn't
db.products.insert({
  productName: [
    "MongoDB Short Sleeve T-Shirt",
    "MongoDB Short Sleeve Shirt"
  ],
  categories: ["T-Shirts", "Clothing", "Apparel"],
  stock: { size: "L", color: "green", quantity: 100 }
});

In [None]:
# but this will fail, because both productName and stock are arrays
db.products.insert({
  productName: [
    "MongoDB Short Sleeve T-Shirt",
    "MongoDB Short Sleeve Shirt"
  ],
  categories: ["T-Shirts", "Clothing", "Apparel"],
  stock: [
    { size: "S", color: "red", quantity: 25 },
    { size: "S", color: "blue", quantity: 10 },
    { size: "M", color: "blue", quantity: 50 }
  ]
})
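
A rough way to picture a multikey index is that the server stores one index entry per array element, which is also why two array fields in the same compound index are rejected. The sketch below is a simplified illustration: `indexEntries` is a hypothetical helper, it handles only top-level fields (no dot notation), the `{ productName: 1, categories: 1 }` key pattern is chosen just for the example, and the error text is illustrative.

```javascript
// Build the index entries for one document and a list of indexed fields.
function indexEntries(doc, fields) {
  const arrayFields = fields.filter((f) => Array.isArray(doc[f]));
  if (arrayFields.length > 1) {
    // Indexing two arrays would need one entry per combination of elements.
    throw new Error("cannot index parallel arrays: " + arrayFields.join(", "));
  }
  const [arrayField] = arrayFields;
  if (arrayField === undefined) {
    return [fields.map((f) => doc[f])]; // no arrays: a single entry
  }
  // One entry per array element, with the scalar fields repeated in each entry.
  return doc[arrayField].map((element) =>
    fields.map((f) => (f === arrayField ? element : doc[f]))
  );
}

const shirt = {
  productName: "MongoDB Short Sleeve T-Shirt",
  categories: ["T-Shirts", "Clothing", "Apparel"],
};
console.log(indexEntries(shirt, ["productName", "categories"]));

const twoArrays = {
  productName: ["MongoDB Short Sleeve T-Shirt", "MongoDB Short Sleeve Shirt"],
  categories: ["T-Shirts", "Clothing", "Apparel"],
};
try {
  indexEntries(twoArrays, ["productName", "categories"]);
} catch (err) {
  console.log(err.message);
}
```

With three categories the first document produces three entries, so an index over a large array can grow much faster than the collection itself.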

---

### Example 01

In [None]:
{ name: 1, emails: 1 }

In [None]:
{
  "name": "Beatrice McBride",
  "age": 26,
  "emails": [
      "puovvid@wamaw.kp",
      "todujufo@zoehed.mh",
      "fakmir@cebfirvot.pm"
  ]
}

In [None]:
"Beatrice McBride", "puovvid@wamaw.kp"
"Beatrice McBride", "todujufo@zoehed.mh"
"Beatrice McBride", "fakmir@cebfirvot.pm"

# Partial Indexes

<img src="img/30.png">

<img src="img/31.png">

### Partial Index Restrictions

In [None]:
# switch to the m201 database
use m201

In [None]:
# insert a restaurant document
db.restaurants.insert({
   "name" : "Han Dynasty",
   "cuisine" : "Sichuan",
   "stars" : 4.4,
   "address" : {
      "street" : "90 3rd Ave",
      "city" : "New York",
      "state" : "NY",
      "zipcode" : "10003"
   }
})

In [None]:
# and run a find query on city and cuisine
db.restaurants.find({'address.city': 'New York', 'cuisine': 'Sichuan'})

In [None]:
# create an explainable object
var exp = db.restaurants.explain()

# and rerun the query
exp.find({'address.city': 'New York', cuisine: 'Sichuan'})

In [None]:
# create a partial index
db.restaurants.createIndex(
  { "address.city": 1, cuisine: 1 },
  { partialFilterExpression: { 'stars': { $gte: 3.5 } } }
)

In [None]:
# rerun the query (doesn't use the partial index)
db.restaurants.find({'address.city': 'New York', 'cuisine': 'Sichuan'})

In [None]:
exp.find({'address.city': 'New York', 'cuisine': 'Sichuan'})

In [None]:
# adding the stars predicate allows us to use the partial index
# the query must select a subset of the documents covered by partialFilterExpression
exp.find({'address.city': 'New York', cuisine: 'Sichuan', stars: { $gt: 4.0 }})
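
The subset requirement can be sketched for the one filter shape used here, `{ stars: { $gte: N } }`. The helper names are hypothetical, only equality, `$gt`, and `$gte` predicates with a single operator are handled, and the real planner's check is more general than this.

```javascript
// Smallest stars value a query predicate could match, or -Infinity if unbounded below.
function lowerBound(predicate) {
  if (predicate === undefined) return -Infinity; // no stars predicate at all
  if (typeof predicate === "number") return predicate; // equality match
  if (typeof predicate.$gte === "number") return predicate.$gte;
  if (typeof predicate.$gt === "number") return predicate.$gt;
  return -Infinity; // e.g. only an upper bound such as $lt
}

// The partial index is usable only if every document the query can match
// also satisfies stars >= filterMin.
function canUsePartialIndex(query, filterMin) {
  return lowerBound(query.stars) >= filterMin;
}

const base = { "address.city": "New York", cuisine: "Sichuan" };
console.log(canUsePartialIndex(base, 3.5));
console.log(canUsePartialIndex({ ...base, stars: { $gt: 4.0 } }, 3.5));
console.log(canUsePartialIndex({ ...base, stars: { $gte: 3.0 } }, 3.5));
```

Without a `stars` predicate the query could match restaurants the partial index never stored, so using the index would risk returning incomplete results.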

---

### Example 01

# Text Indexes

<img src="img/32.png">

<img src="img/33.png">

<img src="img/34.png">

### Many index keys

<img src="img/35.png">

In [None]:
# switch to the m201 database
use m201

In [None]:
# insert 2 example documents
db.textExample.insertOne({ "statement": "MongoDB is the best" })
db.textExample.insertOne({ "statement": "MongoDB is the worst." })

In [None]:
# create a text index on "statement"
db.textExample.createIndex({ statement: "text" })

In [None]:
# Search for the terms "MongoDB" and "best"
# note: this matches documents containing "MongoDB" OR "best", not the exact phrase
db.textExample.find({ $text: { $search: "MongoDB best" } })

In [None]:
# Display each document with its "textScore"
db.textExample.find({ $text: { $search : "MongoDB best" } }, { score: { $meta: "textScore" } })

In [None]:
# Sort the documents by their textScore so that the most relevant documents return first
db.textExample.find({ $text: { $search : "MongoDB best" } }, { score: { $meta: "textScore" } }).sort({ score: { $meta: "textScore" } })
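
The OR behaviour of `$search` and the idea of ranking by relevance can be sketched without a database. Everything here is a stand-in: the tokenizer is naive (no stemming, no stop-word removal), and the `score` is simply the number of distinct matching terms, which is not how MongoDB computes `textScore`.

```javascript
const docs = [
  { statement: "MongoDB is the best" },
  { statement: "MongoDB is the worst." },
];

// Naive tokenizer: lowercase, then split on anything that is not a letter or digit.
const tokenize = (text) =>
  text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

// A document matches when it contains at least one search term (logical OR).
function textSearch(collection, search) {
  const terms = new Set(tokenize(search));
  return collection
    .map((doc) => {
      const words = new Set(tokenize(doc.statement));
      const score = [...terms].filter((term) => words.has(term)).length;
      return { ...doc, score };
    })
    .filter((doc) => doc.score > 0) // keep documents matching any term
    .sort((a, b) => b.score - a.score); // most relevant first
}

const results = textSearch(docs, "MongoDB best");
console.log(results);
```

Both documents are returned because both contain "MongoDB"; the one that also contains "best" ranks first.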

---

### Example 01

# Collations

<img src="img/36.png">

In [None]:
# switch to the m201 database
use m201

In [None]:
# create a collection-level collation for Portuguese
db.createCollection( "foreign_text", {collation: {locale: "pt"}})

In [None]:
# insert an example document
db.foreign_text.insert({ "name": "Máximo", "text": "Bom dia minha gente!"})

In [None]:
# explain the following query (uses the Portuguese collation)
db.foreign_text.find({ _id: {$exists:1 } } ).explain()

<img src="img/37.png">

In [None]:
# specify an Italian collation for a find query
db.foreign_text.find({ _id: {$exists:1 } }).collation({locale: 'it'})

In [None]:
# specify a Spanish collation for an aggregation query
db.foreign_text.aggregate([ {$match: { _id: {$exists:1 }  }}], {collation: {locale: 'es'}})

In [None]:
# create an index with a collation that differs from the collection collation
db.foreign_text.createIndex( {name: 1},  {collation: {locale: 'it'}} )

In [None]:
# uses the collection collation (Portuguese)
db.foreign_text.find( {name: 'Máximo'}).explain()

<img src="img/38.png">

<img src="img/39.png">

In [None]:
# uses the index collation (Italian)
db.foreign_text.find( {name: 'Máximo'}).collation({locale: 'it'}).explain()

<img src="img/40.png">

In [None]:
# create a collection with a case-insensitive collation (strength: 1)
db.createCollection( "no_sensitivity", {collation: {locale: 'en', strength: 1}})

In [None]:
# insert some documents
db.no_sensitivity.insert({name: 'aaaaa'})
db.no_sensitivity.insert({name: 'aAAaa'})
db.no_sensitivity.insert({name: 'AaAaa'})

In [None]:
# sort them by name ascending
db.no_sensitivity.find().sort({name:1})

In [None]:
# even if we change the sort-order, the documents will be returned in the same
# order because of the case-insensitive collation
db.no_sensitivity.find().sort({name:-1})
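
The effect of a strength 1 collation on sorting can be imitated with the JavaScript runtime's own `Intl.Collator`. This is an analogy rather than MongoDB's collation engine: `sensitivity: "base"` is assumed here to play the role of `strength: 1`, and the array stands in for the three inserted documents.

```javascript
const names = ["aaaaa", "aAAaa", "AaAaa"];

// sensitivity "base" compares base letters only, ignoring case and accents.
const caseInsensitive = new Intl.Collator("en", { sensitivity: "base" });
// The default collator still distinguishes upper and lower case.
const caseSensitive = new Intl.Collator("en");

console.log(caseInsensitive.compare("aaaaa", "AaAaa"));
console.log(caseSensitive.compare("aaaaa", "AaAaa") !== 0);

// All three names compare as equal, so a stable sort leaves them untouched
// whether we sort ascending or descending.
const ascending = [...names].sort((a, b) => caseInsensitive.compare(a, b));
const descending = [...names].sort((a, b) => caseInsensitive.compare(b, a));
console.log(ascending);
console.log(descending);
```

Because every pair compares as equal, there is no order for the sort direction to reverse, which matches what the two `find().sort()` calls above return.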

### Example 01

( ) MongoDB only allows collations to be defined at the collection level

(x) Collations allow the creation of case-insensitive indexes

( ) Creating an index with a different collation from the base collection
    implies overriding the base collection collation.

(x) We can define specific collations in an index

# Wildcard Index Type

<img src="img/41.png">

In [None]:
db.data.createIndex({ "$**": 1 })

In [None]:
db.data.find({ "waveMeasurement.waves.height": 0.5 }).pretty()

<img src="img/42.png">

In [None]:
db.data.find({ "waveMeasurement.waves.height": 0.5, "waveMeasurement.seaState.quality": "9" }).pretty()

<img src="img/43.png">

<img src="img/44.png">

In [None]:
# index only the fields inside waveMeasurement
db.data.createIndex({ "$**": 1 }, { "wildcardProjection": { waveMeasurement: 1 } })

<img src="img/45.png">

In [None]:
# an alternative form: a wildcard index on a single subtree (here waveMeasurement.waves)
db.data.createIndex({ "waveMeasurement.waves.$**": 1 })

<img src="img/46.png">

<img src="img/47.png">
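
One way to picture what the three wildcard definitions above cover is to list the dotted paths of the scalar fields in a document. The sketch below uses a hypothetical document with made-up values, treats arrays as leaves for brevity, and only approximates how `wildcardProjection` and a subtree wildcard select fields.

```javascript
// Hypothetical document shaped like the ones queried above; the values are made up.
const doc = {
  station: "station-1",
  waveMeasurement: {
    waves: { height: 0.5, period: 14 },
    seaState: { quality: "9" },
  },
  airTemperature: { value: -3.1 },
};

// Collect the dotted path of every scalar field, descending into embedded documents.
function fieldPaths(value, prefix = "") {
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    return prefix ? [prefix] : [];
  }
  return Object.entries(value).flatMap(([key, child]) =>
    fieldPaths(child, prefix ? prefix + "." + key : key)
  );
}

// { "$**": 1 }: every field path is covered.
const allPaths = fieldPaths(doc);
// wildcardProjection { waveMeasurement: 1 }: only paths under waveMeasurement.
const projected = allPaths.filter((p) => p.startsWith("waveMeasurement."));
// { "waveMeasurement.waves.$**": 1 }: only paths under waveMeasurement.waves.
const subtree = allPaths.filter((p) => p.startsWith("waveMeasurement.waves."));

console.log(allPaths);
console.log(projected);
console.log(subtree);
```

Narrowing the covered paths keeps the index smaller, at the cost of not supporting queries on fields outside the selected subtree.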

---

# Wildcard Index Use Cases

<img src="img/48.png">

---

<img src="img/49.png">

<img src="img/50.png">

<img src="img/51.png">

---

### Example 01

In this lab you're going to determine which queries are able to successfully use a given index for both filtering and sorting.

Given the following index:

In [None]:
{ "first_name": 1, "address.state": -1, "address.city": -1, "ssn": 1 }

### Example 02

In [None]:
> db.people.find({
    "address.state": "Nebraska",
    "last_name": /^G/,
    "job": "Police officer"
  })

In [None]:
> db.people.find({
    "job": /^P/,
    "first_name": /^C/,
    "address.state": "Indiana"
  }).sort({ "last_name": 1 })

In [None]:
> db.people.find({
    "address.state": "Connecticut",
    "birthday": {
      "$gte": ISODate("2010-01-01T00:00:00.000Z"),
      "$lt": ISODate("2011-01-01T00:00:00.000Z")
    }
  })