# Views & Capped Collections

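Views in MongoDB are read-only: you cannot insert into or update them, as you can with some SQL views. A view is defined by an aggregation pipeline over a source collection. This makes views useful for role-based access, because a user can be granted access to a view without being given access to the underlying collection.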

References:
https://docs.mongodb.com/manual/core/views/

https://dzone.com/articles/taking-a-look-at-mongodb-views

In [1]:
import pymongo as pm
from pymongo import MongoClient
from bson import Code

# Connect to a local MongoDB instance on the default port
# client = MongoClient('localhost', 27017)
client = MongoClient('mongodb://localhost:27017/')

# Database handles
stockDB = client.stock
companyDB = client.companyData

# Collection handles
stockCol = stockDB.stocks
companyCol = companyDB.companies

# Suppress warnings in the notebook output
import warnings
warnings.filterwarnings('ignore')

In [2]:
# Let's have a quick look at one document from the companies collection

companyCol.find_one()

{'_id': ObjectId('52cdef7c4bab8bd675297d8b'),
 'name': 'AdventNet',
 'permalink': 'abc3',
 'crunchbase_url': 'http://www.crunchbase.com/company/adventnet',
 'homepage_url': 'http://adventnet.com',
 'blog_url': '',
 'blog_feed_url': '',
 'twitter_username': 'manageengine',
 'category_code': 'enterprise',
 'number_of_employees': 600,
 'founded_year': 1996,
 'deadpooled_year': 2,
 'tag_list': '',
 'alias_list': 'Zoho ManageEngine ',
 'email_address': 'pr@adventnet.com',
 'phone_number': '925-924-9500',
 'description': 'Server Management Software',
 'created_at': datetime.datetime(2007, 5, 25, 19, 24, 22),
 'updated_at': 'Wed Oct 31 18:26:09 UTC 2012',
 'overview': '<p>AdventNet is now <a href="/company/zoho-manageengine" title="Zoho ManageEngine" rel="nofollow">Zoho ManageEngine</a>.</p>\n\n<p>Founded in 1996, AdventNet has served a diverse range of enterprise IT, networking and telecom customers.</p>\n\n<p>AdventNet supplies server and network management software.</p>',
 'image': {'avail

Now, let's say we're developing an API where users can read an overview of a company. In the web service, we might want to restrict access to the underlying collections. One option is to create a separate user and grant that user access to a view instead of the full collection.
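One way to set this up is sketched below, under stated assumptions: the role name `overviewReader`, the user name `apiUser`, and the placeholder password are hypothetical, while `companyOverview` is the view created later in this notebook. The `createRole` and `createUser` database commands are built as plain documents and would be passed to `Database.command`, the same method used here to create the view.

```python
def build_role_command(role_name, db_name, view_name):
    """Build a createRole command granting read-only (find) access to one view."""
    return {
        "createRole": role_name,
        "privileges": [
            {
                "resource": {"db": db_name, "collection": view_name},
                "actions": ["find"],
            }
        ],
        "roles": [],  # no inherited roles
    }


def build_user_command(user_name, password, role_name, db_name):
    """Build a createUser command that assigns only the given role."""
    return {
        "createUser": user_name,
        "pwd": password,
        "roles": [{"role": role_name, "db": db_name}],
    }


def grant_view_access(db, role_cmd, user_cmd):
    """Run both commands on a pymongo Database (needs user-admin privileges)."""
    db.command(role_cmd)
    db.command(user_cmd)


# Hypothetical names, for illustration only
role_cmd = build_role_command("overviewReader", "companyData", "companyOverview")
user_cmd = build_user_command("apiUser", "change-me", "overviewReader", "companyData")

# grant_view_access(companyDB, role_cmd, user_cmd)  # uncomment with a live connection
```

Because the role lists only the view as its resource, a client authenticated as this user should be able to query the view but not the `companies` collection itself.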

In [4]:
# Build the overview pipeline and preview its output for three companies

companies = ['Microsoft', 'Google', 'AdventNet']

overview_pipeline = [
    {"$match" : {"name" : {"$in": companies}}},
    {"$project" : {"_id" : 0, "name" : {"$toUpper" : "$name"}, "permalink" : {"$toLower" : "$permalink"},
                  "number_of_employees" : {"$toString" : "$number_of_employees"}, "phone_number": 1, "email_address": 1,
                  "description" : 1, "founded_year" : {"$toString" : "$founded_year"}}},
    {"$project" : {"name": 1, "Overview" : 
                   {"$concat" : ["$name", " was founded in ", "$founded_year", " having ","$number_of_employees",
                                " employees and specializing in ", "$description", ". They can be contacted through ",
                                "$phone_number", ", ", "$email_address" , " and ", "$permalink"]}}}
]

list(companyCol.aggregate(overview_pipeline))

[{'name': 'ADVENTNET',
  'Overview': 'ADVENTNET was founded in 1996 having 600 employees and specializing in Server Management Software. They can be contacted through 925-924-9500, pr@adventnet.com and abc3'},
 {'name': 'GOOGLE',
  'Overview': 'GOOGLE was founded in 1998 having 28000 employees and specializing in . They can be contacted through 650.253.0000, google@google.com and google'},
 {'name': 'MICROSOFT',
  'Overview': 'MICROSOFT was founded in 1974 having 90000 employees and specializing in . They can be contacted through ,  and microsoft'}]

In [5]:
# Let's create a view on the 'companies' collection using the pipeline
# (the $match stage is dropped so the view covers every company)

overview_pipeline = [
    {"$project" : {"_id" : 0, "name" : {"$toUpper" : "$name"}, "permalink" : {"$toLower" : "$permalink"},
                  "number_of_employees" : {"$toString" : "$number_of_employees"}, "phone_number": 1, "email_address": 1,
                  "description" : 1, "founded_year" : {"$toString" : "$founded_year"}}},
    {"$project" : {"name": 1, "Overview" : 
                   {"$concat" : ["$name", " was founded in ", "$founded_year", " having ","$number_of_employees",
                                " employees and specializing in ", "$description", ". They can be contacted through ",
                                "$phone_number", ", ", "$email_address" , " and ", "$permalink"]}}}
]

companyDB.command({
    "create": "companyOverview",
    "viewOn": "companies", 
    "pipeline": overview_pipeline
})

{'ok': 1.0}

In [7]:
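# Views are listed alongside regular collections; their definitions are stored in system.views
# (list_collection_names() replaces the deprecated collection_names())

companyDB.list_collection_names()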

['system.views', 'companies', 'companyOverview']

In [8]:
# We can query a view the same way we query a collection

companyView = companyDB.companyOverview

In [9]:
list(companyView.find().limit(5))

[{'name': 'ADVENTNET',
  'Overview': 'ADVENTNET was founded in 1996 having 600 employees and specializing in Server Management Software. They can be contacted through 925-924-9500, pr@adventnet.com and abc3'},
 {'name': 'OMNIDRIVE', 'Overview': None},
 {'name': 'POSTINI', 'Overview': None},
 {'name': 'FLEKTOR', 'Overview': None},
 {'name': 'GENI',
  'Overview': 'GENI was founded in 2006 having 18 employees and specializing in Geneology social network site. They can be contacted through ,  and geni'}]
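The `None` overviews above are consistent with how `$concat` behaves: if any argument resolves to null or refers to a missing field, the whole result is null. A possible workaround, sketched here rather than taken from the original notebook, wraps each field in `$ifNull` with a fallback string. The helper name `text_or` and the fallback value `"unknown"` are hypothetical.

```python
def text_or(field, fallback="unknown"):
    """Convert a field to a string, using a fallback when it is null or missing."""
    return {"$ifNull": [{"$toString": "$" + field}, fallback]}


# A shortened variant of the overview pipeline that never yields a null Overview
safe_overview_pipeline = [
    {
        "$project": {
            "_id": 0,
            "name": {"$toUpper": "$name"},
            "Overview": {
                "$concat": [
                    {"$toUpper": "$name"},
                    " was founded in ", text_or("founded_year"),
                    " having ", text_or("number_of_employees"),
                    " employees and specializing in ", text_or("description"),
                    ".",
                ]
            },
        }
    }
]

# list(companyCol.aggregate(safe_overview_pipeline + [{"$limit": 5}]))  # needs a live connection
```

Note that `$ifNull` only catches null or missing values. Fields that hold an empty string, such as the blank descriptions seen earlier, still produce gaps in the sentence.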

In [11]:
list(companyView.find({"name" : "GENI"}))

[{'name': 'GENI',
  'Overview': 'GENI was founded in 2006 having 18 employees and specializing in Geneology social network site. They can be contacted through ,  and geni'}]
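Conceptually, a query on a view behaves like an aggregation on the source collection: the view's pipeline runs first, and the query's filter is applied to its output. The helper below, whose name `view_query_as_pipeline` is made up for illustration, spells that out. Treat it as a mental model rather than a description of the server's actual query plan.

```python
def view_query_as_pipeline(view_pipeline, query_filter=None, limit=None):
    """Express find(query_filter).limit(limit) on a view as one aggregation pipeline."""
    stages = list(view_pipeline)  # copy so the view definition is not mutated
    if query_filter:
        stages.append({"$match": query_filter})
    if limit is not None:
        stages.append({"$limit": limit})
    return stages


# A shortened stand-in for the view definition used in this notebook
example_view_pipeline = [{"$project": {"_id": 0, "name": {"$toUpper": "$name"}}}]

equivalent = view_query_as_pipeline(example_view_pipeline, {"name": "GENI"})

# list(companyCol.aggregate(equivalent))  # run against the source collection
```

The filter matches on `"GENI"` in upper case because it is applied to the view's output, where `name` has already passed through `$toUpper`.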

# Capped Collections

Capped collections are fixed-size collections that support high-throughput operations that insert and retrieve documents based on insertion order. Capped collections work in a way similar to circular buffers: once a collection fills its allocated space, it makes room for new documents by overwriting the oldest documents in the collection.

## Insertion Order

Capped collections guarantee preservation of the insertion order. As a result, queries do not need an index to return documents in insertion order. Without this indexing overhead, capped collections can support higher insertion throughput.

## Automatic Removal of Oldest Documents

To make room for new documents, capped collections automatically remove the oldest documents in the collection without requiring scripts or explicit remove operations.
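The circular-buffer behavior can be imitated in plain Python with `collections.deque` and a `maxlen`. This is only an analogy: a real capped collection is bounded by its size in bytes (and optionally by a document count), while the deque here is bounded by item count.

```python
from collections import deque

# A buffer that keeps at most three entries
log = deque(maxlen=3)

for i in range(1, 6):
    log.append({"event": i})  # appending to a full deque discards the oldest entry

print(list(log))  # [{'event': 3}, {'event': 4}, {'event': 5}]
```

Events 1 and 2 were dropped without any explicit delete, and the remaining entries are still in insertion order.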

Consider the following potential use cases for capped collections:

- Store log information generated by high-volume systems. Inserting documents in a capped collection without an index is close to the speed of writing log information directly to a file system. Furthermore, the built-in first-in-first-out property maintains the order of events, while managing storage use.

- Cache small amounts of data in a capped collection. Since caches are read-heavy rather than write-heavy, you would either need to ensure that this collection always remains in the working set (i.e. in RAM) or accept some write penalty for the required index or indexes.

In [16]:
# mongo shell equivalent: db.createCollection("caplog", { capped: true, size: 1000 })
# "size" is the maximum size of the collection in bytes

companyDB.command({
    "create": "caplog",
    "capped": True,
    "size": 1000
})

{'ok': 1.0}
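Besides `size`, the `create` command also accepts an optional `max` document count for capped collections. The sketch below builds such a command and shows one way to read the newest entries by sorting on `$natural` in reverse. The helper names, the collection name `caplog_demo`, and the limit of 10 documents are illustrative assumptions, and `latest_entries` expects a pymongo collection.

```python
def capped_create_command(name, size_bytes, max_docs=None):
    """Build a create command for a capped collection, optionally limited by count."""
    cmd = {"create": name, "capped": True, "size": size_bytes}
    if max_docs is not None:
        cmd["max"] = max_docs  # the byte limit in "size" still applies
    return cmd


def latest_entries(collection, n=5):
    """Return the n most recently inserted documents (reverse insertion order)."""
    return list(collection.find().sort("$natural", -1).limit(n))


# Hypothetical collection name, chosen to avoid clashing with "caplog"
cmd = capped_create_command("caplog_demo", 1000, max_docs=10)

# companyDB.command(cmd)                      # needs a live connection
# latest_entries(companyDB.caplog_demo, n=3)  # newest three documents
```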

In [17]:
# Collection names are case-sensitive, so use the same lower-case name passed to "create"
capLog = companyDB.caplog

In [21]:
# companies is a regular (uncapped) collection, so options() returns an empty dict
companyCol.options()

{}

In [23]:
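# Note: collection names are case-sensitive. 'capLog' is not the 'caplog' collection
# created above, which most likely explains the all-zero statistics below.
companyDB.command('collstats','capLog')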

{'ns': 'companyData.capLog',
 'size': 0,
 'count': 0,
 'storageSize': 0,
 'nindexes': 0,
 'totalIndexSize': 0,
 'indexDetails': {},
 'indexSizes': {},
 'scaleFactor': 1,
 'ok': 1.0}