# MongoDB

## SQL vs NoSQL

SQL:
- The data model is relational
- Data is stored in tables
- Suitable for solutions where every record is of the same kind and possesses the same properties
- Adding a new property means you have to alter the whole schema
- The schema is very strict
- ACID transactions are supported
- Scales well vertically

NoSQL:
- The model is non-relational
- Data may be stored as JSON documents, key-value pairs, etc., depending on the type of NoSQL database
- Records do not all have to share the same structure, which makes the model very flexible
- New properties can be added to a record without affecting existing data
- No fixed schema has to be adhered to
- Support for ACID transactions can vary depending on which NoSQL DB is used
- Consistency can vary
- Scales well horizontally

Common types of NoSQL databases:
- Key-Value Store: DynamoDB, Riak KV, Redis, etcd, Memcached
- Document Store: MongoDB, Couchbase, CouchDB, RethinkDB
- Column Store: Cassandra, HBase
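
The schema difference described in the two lists above can be sketched with the standard library alone. This is only an illustration under stated assumptions: `sqlite3` stands in for a relational database, plain Python dictionaries stand in for documents, and the `phone` field and its value are made up for the example.

```python
import json
import sqlite3

# Relational side: the table schema fixes the set of columns up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, address TEXT)")
conn.execute("INSERT INTO customers VALUES (?, ?)", ("Amy", "Apple st 652"))

# Storing a new property first requires altering the schema for all rows.
conn.execute("ALTER TABLE customers ADD COLUMN phone TEXT")  # hypothetical column
conn.execute(
    "INSERT INTO customers VALUES (?, ?, ?)", ("Ben", "Park Lane 38", "555-0100")
)
rows = conn.execute("SELECT name, address, phone FROM customers").fetchall()
print(rows)
conn.close()

# Document side: each record carries its own fields, so records may differ.
documents = [
    {"name": "Amy", "address": "Apple st 652"},
    {"name": "Ben", "address": "Park Lane 38", "phone": "555-0100"},
]
print(json.dumps(documents, indent=2))
```

After the `ALTER TABLE`, the existing row gets an empty `phone` value, whereas the first document simply has no such field.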


## Install MongoDB

MongoDB can be downloaded for free from https://www.mongodb.com.

Installation guide for Ubuntu: https://docs.mongodb.com/manual/tutorial/install-mongodb-on-ubuntu/

Docker (add the following to docker-compose):


    version:  '3'
    services:
      jupyter-scipy-notebook:
        image: jupyter/tensorflow-notebook
        volumes:
          - C:\Users\18ICTA2\Desktop\Analitika2:/home/jovyan/work
        ports:
          - 8888:8888
      mongo-db:
        image: mongo
        restart: always
        environment:
          MONGO_INITDB_ROOT_USERNAME: root
          MONGO_INITDB_ROOT_PASSWORD: example



## Install the Python Driver

Documentation: https://api.mongodb.com/python/current/

In [1]:
#!pip install pymongo



In [3]:
import pymongo

## PyMongo

### Establishing a Connection

In [4]:
from pymongo import MongoClient

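# Host `mongo-db` is the docker-compose service name; the credentials come from its environment settings
client = MongoClient('mongodb://root:example@mongo-db:27017')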
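
To make the parts of the connection string explicit, the sketch below splits it with the standard library's generic URL parser. This is only an illustration of the URI layout, not how PyMongo itself parses connection strings, and it does not cover features such as multiple hosts or query options.

```python
from urllib.parse import urlsplit

uri = "mongodb://root:example@mongo-db:27017"
parts = urlsplit(uri)

print(parts.scheme)    # protocol prefix
print(parts.username)  # matches MONGO_INITDB_ROOT_USERNAME in docker-compose
print(parts.password)  # matches MONGO_INITDB_ROOT_PASSWORD in docker-compose
print(parts.hostname)  # docker-compose service name of the database
print(parts.port)      # port the server listens on
```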

In [5]:
db = client.admin
db.command("serverStatus")

{'host': 'e68cc0732986',
 'version': '4.4.2',
 'process': 'mongod',
 'pid': 1,
 'uptime': 37935.0,
 'uptimeMillis': 37935192,
 'uptimeEstimate': 37935,
 'localTime': datetime.datetime(2020, 11, 30, 17, 35, 55, 194000),
 'connections': {'current': 12,
  'available': 838848,
  'totalCreated': 15,
  'active': 5,
  'exhaustIsMaster': 4,
  'exhaustHello': 0,
  'awaitingTopologyChanges': 4},
 'electionMetrics': {'stepUpCmd': {'called': 0, 'successful': 0},
  'priorityTakeover': {'called': 0, 'successful': 0},
  'catchUpTakeover': {'called': 0, 'successful': 0},
  'electionTimeout': {'called': 0, 'successful': 0},
  'freezeTimeout': {'called': 0, 'successful': 0},
  'numStepDownsCausedByHigherTerm': 0,
  'numCatchUps': 0,
  'numCatchUpsSucceeded': 0,
  'numCatchUpsAlreadyCaughtUp': 0,
  'numCatchUpsSkipped': 0,
  'numCatchUpsTimedOut': 0,
  'numCatchUpsFailedWithError': 0,
  'numCatchUpsFailedWithNewTerm': 0,
  'numCatchUpsFailedWithReplSetAbortPrimaryCatchUpCmd': 0,
  'averageCatchUpOps': 0.

### Creating and Accessing a Database

In [6]:
db = client['mydatabase']

### Check if Database Exists

In [11]:
client.list_database_names()

['admin', 'config', 'local']

`mydatabase` is not listed yet: MongoDB creates a database only once data is first written to it, so the check below prints nothing until a document has been inserted.

In [8]:
if "mydatabase" in client.list_database_names():
    print("Database exists")

In [10]:
client.drop_database('business')

### Collections and Documents

<img src="https://webassets.mongodb.com/_com_assets/cms/JSON_Example_Python_MongoDB-mzqqz0keng.png" alt="JSON document example">

<table class="table table-bordered" summary="MongoDB vs Relational">
<thead>
    <tr>
        <th><b>Relational concept</b></th>
        <th><b>MongoDB equivalent</b></th>
</tr></thead>
<tbody>
    <tr>
        <td>Database</td>
        <td>Database</td>
    </tr>
    <tr>
        <td>Tables</td>
        <td>Collections</td>  
    </tr>
    <tr>
        <td>Rows</td>
        <td>Documents</td>
    </tr>
    <tr>
        <td>Index</td>
        <td>Index</td>    
    </tr>
</tbody>
</table>

### Inserting Documents

In [14]:
x = client['mydatabase']['custumers'].insert_one({'dsfsd': 'sfsf', 'sdsd':[1,2,3]})

In [15]:
x.inserted_id

ObjectId('5fc5300214c121b69c125983')

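Every document has a unique `_id` field. When it is omitted, as in the insert above, an `ObjectId` is generated for it automatically; an `_id` can also be set explicitly, as in the second list further down.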
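
An `ObjectId` is a 12-byte value whose first 4 bytes store its creation time as seconds since the Unix epoch. The following sketch decodes that timestamp by hand from the hex string shown in the output above; it uses only the standard library and is meant as an illustration. In PyMongo the same information should be available through the `generation_time` attribute of an `ObjectId`.

```python
from datetime import datetime, timezone

object_id_hex = "5fc5300214c121b69c125983"  # value from the insert_one output above

# The first 4 bytes (8 hex characters) hold seconds since the Unix epoch.
seconds = int(object_id_hex[:8], 16)
created_at = datetime.fromtimestamp(seconds, tz=timezone.utc)
print(created_at.isoformat())
```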

In [16]:
mylist = [
  { "name": "Amy", "address": "Apple st 652"},
  { "name": "Hannah", "address": "Mountain 21"},
  { "name": "Michael", "address": "Valley 345"},
  { "name": "Sandy", "address": "Ocean blvd 2"},
  { "name": "Betty", "address": "Green Grass 1"},
  { "name": "Richard", "address": "Sky st 331"},
  { "name": "Susan", "address": "One way 98"},
  { "name": "Vicky", "address": "Yellow Garden 2"},
  { "name": "Ben", "address": "Park Lane 38"},
  { "name": "William", "address": "Central st 954"},
  { "name": "Chuck", "address": "Main Road 989"},
  { "name": "Viola", "address": "Sideway 1633"}
]



In [17]:
x = client['mydatabase']['custumers'].insert_many(mylist)

In [18]:
x.inserted_ids

[ObjectId('5fc5304714c121b69c125984'),
 ObjectId('5fc5304714c121b69c125985'),
 ObjectId('5fc5304714c121b69c125986'),
 ObjectId('5fc5304714c121b69c125987'),
 ObjectId('5fc5304714c121b69c125988'),
 ObjectId('5fc5304714c121b69c125989'),
 ObjectId('5fc5304714c121b69c12598a'),
 ObjectId('5fc5304714c121b69c12598b'),
 ObjectId('5fc5304714c121b69c12598c'),
 ObjectId('5fc5304714c121b69c12598d'),
 ObjectId('5fc5304714c121b69c12598e'),
 ObjectId('5fc5304714c121b69c12598f')]

In [20]:
mylist = [
  { "_id": 1, "name": "John", "address": "Highway 37"},
  { "_id": 2, "name": "Peter", "address": "Lowstreet 27"},
  { "_id": 3, "name": "Amy", "address": "Apple st 652"},
  { "_id": 4, "name": "Hannah", "address": "Mountain 21"},
  { "_id": 5, "name": "Michael", "address": "Valley 345"},
  { "_id": 6, "name": "Sandy", "address": "Ocean blvd 2"},
  { "_id": 7, "name": "Betty", "address": "Green Grass 1"},
  { "_id": 8, "name": "Richard", "address": "Sky st 331"},
  { "_id": 9, "name": "Susan", "address": "One way 98"},
  { "_id": 10, "name": "Vicky", "address": "Yellow Garden 2"},
  { "_id": 11, "name": "Ben", "address": "Park Lane 38"},
  { "_id": 12, "name": "William", "address": "Central st 954"},
  { "_id": 13, "name": "Chuck", "address": "Main Road 989"},
  { "_id": 14, "name": "Viola", "address": "Sideway 1633"}
]



In [21]:
x = client['mydatabase']['custumers'].insert_many(mylist)

In [22]:
x.inserted_ids

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

### Generating Sample Data

In [23]:
from pymongo import MongoClient
from random import randint

client = MongoClient('mongodb://root:example@mongo-db:27017')
db = client['business']


names = ['Kitchen','Animal','State', 'Tastey', 'Big','City','Fish', 'Pizza','Goat', 'Salty','Sandwich','Lazy', 'Fun']
company_type = ['LLC','Inc','Company','Corporation']
company_cuisine = ['Pizza', 'Bar Food', 'Fast Food', 'Italian', 'Mexican', 'American', 'Sushi Bar', 'Vegetarian']
for x in range(1, 501):
    business = {
        'name' : names[randint(0, (len(names)-1))] + ' ' + names[randint(0, (len(names)-1))]  + ' ' + company_type[randint(0, (len(company_type)-1))],
        'rating' : randint(1, 5),
        'cuisine' : company_cuisine[randint(0, (len(company_cuisine)-1))] 
    }
    

    result=db['reviews'].insert_one(business)
    
    #print('Created {0} of 500 as {1}'.format(x,result.inserted_id))
    
print('finished creating 500 business reviews')

finished creating 500 business reviews
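
As a possible variation on the loop above, the documents can be built first and written afterwards in one batch. The sketch below covers only the generation step, so it runs without a database; `make_business` and the fixed seed are assumptions introduced for this example, and `random.choice` replaces the manual `randint` indexing.

```python
import random

names = ['Kitchen', 'Animal', 'State', 'Tastey', 'Big', 'City', 'Fish',
         'Pizza', 'Goat', 'Salty', 'Sandwich', 'Lazy', 'Fun']
company_type = ['LLC', 'Inc', 'Company', 'Corporation']
company_cuisine = ['Pizza', 'Bar Food', 'Fast Food', 'Italian', 'Mexican',
                   'American', 'Sushi Bar', 'Vegetarian']


def make_business(rng):
    """Build one random review document with the same fields as above."""
    return {
        'name': f"{rng.choice(names)} {rng.choice(names)} {rng.choice(company_type)}",
        'rating': rng.randint(1, 5),
        'cuisine': rng.choice(company_cuisine),
    }


rng = random.Random(42)  # fixed seed (an assumption) for repeatable data
businesses = [make_business(rng) for _ in range(500)]
print(len(businesses))

# With a live connection, the whole batch could then be written in one call:
# db['reviews'].insert_many(businesses)
```

Writing the batch with `insert_many` avoids one round trip to the server per document.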


### MongoDB Find

In [24]:
x = client['business']['reviews'].find_one()

In [25]:
x

{'_id': ObjectId('5fc530b214c121b69c125991'),
 'name': 'Goat Goat Company',
 'rating': 2,
 'cuisine': 'Mexican'}

In [41]:
x = client['business']['reviews'].find({'rating': {"$gt": 3}})

In [42]:
# Materialize the cursor into a list of documents
result = list(x)

In [43]:
result[:6]

[{'_id': ObjectId('5fc530b214c121b69c125997'),
  'name': 'Big State Corporation',
  'rating': 4,
  'cuisine': 'Vegetarian'},
 {'_id': ObjectId('5fc530b214c121b69c125999'),
  'name': 'Pizza Fish Corporation',
  'rating': 4,
  'cuisine': 'Pizza'},
 {'_id': ObjectId('5fc530b214c121b69c1259a4'),
  'name': 'Goat Fish LLC',
  'rating': 5,
  'cuisine': 'Vegetarian'},
 {'_id': ObjectId('5fc530b214c121b69c1259ac'),
  'name': 'Lazy Fish LLC',
  'rating': 5,
  'cuisine': 'Sushi Bar'},
 {'_id': ObjectId('5fc530b214c121b69c1259b0'),
  'name': 'Kitchen Goat Company',
  'rating': 4,
  'cuisine': 'Sushi Bar'},
 {'_id': ObjectId('5fc530b214c121b69c1259b9'),
  'name': 'Fish Animal Inc',
  'rating': 4,
  'cuisine': 'Pizza'}]
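
To show what the filter `{'rating': {"$gt": 3}}` selects, here is a small pure-Python sketch that applies the same condition to a list of dictionaries. The `matches` helper is hypothetical and supports only plain equality and `$gt`; it illustrates the meaning of the query and says nothing about how MongoDB evaluates it internally.

```python
def matches(document, query):
    """Return True if `document` satisfies a simplified MongoDB-style filter.

    Only plain equality and the `$gt` operator are supported.
    """
    for field, condition in query.items():
        if field not in document:
            return False
        value = document[field]
        if isinstance(condition, dict):
            for operator, operand in condition.items():
                if operator != "$gt":
                    raise ValueError(f"unsupported operator: {operator}")
                if not value > operand:
                    return False
        elif value != condition:
            return False
    return True


reviews = [
    {'name': 'Goat Goat Company', 'rating': 2, 'cuisine': 'Mexican'},
    {'name': 'Big State Corporation', 'rating': 4, 'cuisine': 'Vegetarian'},
    {'name': 'Goat Fish LLC', 'rating': 5, 'cuisine': 'Vegetarian'},
]

selected = [doc for doc in reviews if matches(doc, {'rating': {"$gt": 3}})]
print([doc['name'] for doc in selected])
```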

In [44]:
import pandas as pd
df = pd.DataFrame(result)
df.head()

Unnamed: 0,_id,name,rating,cuisine
0,5fc530b214c121b69c125997,Big State Corporation,4,Vegetarian
1,5fc530b214c121b69c125999,Pizza Fish Corporation,4,Pizza
2,5fc530b214c121b69c1259a4,Goat Fish LLC,5,Vegetarian
3,5fc530b214c121b69c1259ac,Lazy Fish LLC,5,Sushi Bar
4,5fc530b214c121b69c1259b0,Kitchen Goat Company,4,Sushi Bar


### Delete Documents

In [39]:
x = client['business']['reviews'].delete_one({'name': 'Tastey Lazy Inc'})

In [40]:
x = client['business']['reviews'].delete_many({'cuisine': 'American'})

### Drop Collection

In [45]:
client['business']['reviews'].drop()

## Exercise: Importing a JSON File into MongoDB

In [46]:
import json

In [47]:
with open('data/cities.json') as f:
    template_dc = json.load(f)

In [48]:
cities = client['city_data']['cities']

In [49]:
for cita in template_dc:
    cities.insert_one(cita)
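
Inserting the records one at a time works, but for larger files it may be worth writing them in batches with `insert_many`. The sketch below shows only the batching step so that it runs without a database; `chunked` is a hypothetical helper, and the inline JSON string is a stand-in for the contents of `data/cities.json`.

```python
import json


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Stand-in for the file contents (hypothetical, shortened sample records).
raw = '[{"name": "Aachen", "id": "1"}, {"name": "Aarhus", "id": "2"}, {"name": "Abee", "id": "6"}]'
records = json.loads(raw)

batches = list(chunked(records, 2))
print([len(batch) for batch in batches])

# With a live connection, each batch could be written in a single call:
# for batch in chunked(records, 1000):
#     cities.insert_many(batch)
```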

In [51]:
cities_data = client['city_data']['cities'].find({})

# Materialize the cursor into a list of documents
data = list(cities_data)

df = pd.DataFrame(data)
df.head()

Unnamed: 0,_id,name,id,nametype,recclass,mass,fall,year,reclat,reclong,geolocation,:@computed_region_cbhk_fwbd,:@computed_region_nnqa_25f4
0,5fc532b414c121b69c125b85,Aachen,1,Valid,L5,21,Fell,1880-01-01T00:00:00.000,50.775,6.08333,"{'type': 'Point', 'coordinates': [6.08333, 50....",,
1,5fc532b414c121b69c125b86,Aarhus,2,Valid,H6,720,Fell,1951-01-01T00:00:00.000,56.18333,10.23333,"{'type': 'Point', 'coordinates': [10.23333, 56...",,
2,5fc532b414c121b69c125b87,Abee,6,Valid,EH4,107000,Fell,1952-01-01T00:00:00.000,54.21667,-113.0,"{'type': 'Point', 'coordinates': [-113, 54.216...",,
3,5fc532b414c121b69c125b88,Acapulco,10,Valid,Acapulcoite,1914,Fell,1976-01-01T00:00:00.000,16.88333,-99.9,"{'type': 'Point', 'coordinates': [-99.9, 16.88...",,
4,5fc532b414c121b69c125b89,Achiras,370,Valid,L6,780,Fell,1902-01-01T00:00:00.000,-33.16667,-64.95,"{'type': 'Point', 'coordinates': [-64.95, -33....",,
