# Setting up MongoDB

It was decided that a NoSQL database would be the best choice for querying, filtering, and returning the data.

To that end, MongoDB was installed along with its shell, mongosh.

![title](images/mongosh.png)

The video series on Net Ninja about setting up MongoDB was very helpful.

https://www.youtube.com/watch?v=ojKJqNQYaOI



## BSON

All the data will need to be stored in binary JSON format (BSON). MongoDB will add a special ObjectId to each document, in its _id field, so that every document can be identified by a unique ID.

The documents themselves can be nested or contain arrays of information. Embedding data this way is an option that avoids having to reference another document.
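
As a rough illustration of embedding, the sketch below builds one yeast document as a Python dictionary with a nested sub-document and an array. The `Fermentation` and `Beer_Styles` fields, and the values inside them, are hypothetical and exist only for this example; they are not part of the current data.

```python
# One yeast strain as a single self-contained document.
# "Fermentation" and "Beer_Styles" are hypothetical fields used only for illustration.
strain = {
    "Strain_Name": "Lager Thiol",
    "Supplier": "Propagate Lab",
    # Nested sub-document instead of a reference to another document
    "Fermentation": {"Temperature": "50's", "Flocculation": "Medium"},
    # Array of values stored directly on the document
    "Beer_Styles": ["Example Style A", "Example Style B"],
}

# Nested values are reached by walking the structure
print(strain["Fermentation"]["Flocculation"])
print(len(strain["Beer_Styles"]))
```

If a document were shaped like this, a nested field could be matched in a MongoDB filter with dot notation, for example `{"Fermentation.Flocculation": "Medium"}`.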


### Verified that service is running

![title](images/servicescheck.png)

### Uploaded the CSV file that was created to the database

![title](images/initialdatabase.png)
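
One scripted way to do a similar upload, which is not necessarily how it was done here, is to read the CSV rows into dictionaries and hand them to PyMongo. The sketch below is only an assumption about how that could look: `csv_to_documents` is a hypothetical helper, and the in-memory sample stands in for the real file.

```python
import csv
import io


def csv_to_documents(csv_file):
    """Turn each CSV row into a dict keyed by the header row."""
    return [dict(row) for row in csv.DictReader(csv_file)]


# Small in-memory stand-in for the real CSV file (contents are illustrative)
sample = io.StringIO(
    "Strain_Name,Supplier,Flocculation\n"
    "Lager Thiol,Propagate Lab,Medium\n"
)
documents = csv_to_documents(sample)
print(documents)

# With a running server, the rows could then be inserted with PyMongo, e.g.:
#   from pymongo import MongoClient
#   collection = MongoClient("localhost", 27017)["MSDS_696"]["yeastCollection"]
#   collection.insert_many(documents)
```

Every value read this way arrives as a string, so any numeric columns would need converting before insertion if they should be stored as numbers.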

## Querying the New DB
One of the most powerful aspects of holding all the data in a NoSQL document-based database is how it can be queried. For example, there are no tags for strains that might produce the aromatic isoamyl acetate, which is perceived as banana. However, I can query the database to see whether "banana" appears in any of the descriptions.

I did that with the following query: `{"Description": {"$regex": "banana", "$options": "i"}}`

The query returned 31 of the 689 strains, each of which includes "banana" in its description. The goal now is to connect this database to a front-end user interface so that a user can search it however they want, essentially "googling" all the information. The beer styles that each strain is good for will also be added, which should allow a user to type in "stout" and get back all the strains that are good for stout.


![title](images/banana.png)
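
The banana search can be generalised to any search term. The sketch below is a minimal illustration, assuming Python is used to build the filter: `description_filter` and `matches` are hypothetical helper names, and the two sample documents are made up. `matches` only imitates the filter locally so the idea can be tried without a database.

```python
import re


def description_filter(term):
    """Build a case-insensitive MongoDB filter on the Description field."""
    # re.escape keeps characters such as "+" or "." from being read as regex syntax
    return {"Description": {"$regex": re.escape(term), "$options": "i"}}


def matches(document, term):
    """Local stand-in for what the filter selects, for documents held in memory."""
    text = document.get("Description", "")
    return re.search(re.escape(term), text, re.IGNORECASE) is not None


# Made-up documents for illustration only
docs = [
    {"Strain_Name": "Example A", "Description": "Banana and clove esters"},
    {"Strain_Name": "Example B", "Description": "Clean, crisp lager profile"},
]

print(description_filter("banana"))
print([d["Strain_Name"] for d in docs if matches(d, "banana")])
```

With PyMongo, the same filter could be passed to `collection.find(description_filter("banana"))` or `collection.count_documents(description_filter("banana"))`.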

## Using the MongoDB Shell

test: show dbs
- lists all the databases

: use database
- switches to whatever database you want to use

: show collections
- shows all the collections inside the database

: var name = "mario"
- creates a variable called name with the value "mario"

: help
- lists all the commands

: db.collection
- returns the collection

: db.collection.insertOne({})
- adds a new document to the collection

: db.collection.insertMany([{},{},{}])
- inserts an array of documents

: db.collection.find()
- finds and returns the first 20 documents in the collection

![title](images/findall.png)


: db.collection.find({Supplier: "Propagate Lab"})
- this can be used to find all the yeast supplied by Propagate Lab

![title](images/findsupplier.png)


: db.collection.find({Supplier: "Propagate Lab", Flocculation: "Low"})
- this will return all the yeast supplied by Propagate Lab that have a low flocculation rate

![title](images/supplierflocculation.png)

: db.collection.find({Supplier: "Propagate Lab"}, {Strain_Name: 1})
- filters the returned fields, so each result shows only Strain_Name (plus _id, which is included by default)

![title](images/strainfilter.png)

: db.collection.find({}, {Supplier: 1, Temperature: 1})
- returns everything in the collection, but just the supplier and temperature

: db.collection.find({Supplier: "Propagate Lab"}).count()
- returns a count of yeast supplied by Propagate Lab

: db.collection.find().limit(3).count()
- would return three and then count them

: db.collection.find().sort({Strain_Name: 1})
- sorts by whatever field you want; 1 sorts the list A to Z (ascending) and -1 sorts it Z to A (descending)

: db.collection.find().sort({Strain_Name: 1}).limit(3)
- would find, sort, and limit the results to three documents
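
The shell commands above have close PyMongo counterparts. The following is a minimal sketch rather than a definitive implementation: `strains_for_supplier` and `count_for_supplier` are hypothetical helper names, and both expect a PyMongo collection to be passed in.

```python
def strains_for_supplier(collection, supplier, limit=3):
    """Names of a supplier's low-flocculation strains, sorted A to Z."""
    cursor = (
        collection.find(
            {"Supplier": supplier, "Flocculation": "Low"},  # filter
            {"Strain_Name": 1, "_id": 0},  # projection: only the strain name
        )
        .sort("Strain_Name", 1)  # 1 = ascending, -1 = descending
        .limit(limit)
    )
    return [doc["Strain_Name"] for doc in cursor]


def count_for_supplier(collection, supplier):
    """Number of documents for one supplier."""
    # PyMongo 4 removed Cursor.count(); count_documents takes the filter directly
    return collection.count_documents({"Supplier": supplier})
```

Assuming the local server and the names used later in this notebook, these could be called with `collection = MongoClient("localhost", 27017)["MSDS_696"]["yeastCollection"]` and then `strains_for_supplier(collection, "Propagate Lab")`.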

## Edited Environment Variables Path

Added mongosh to the PATH environment variable, so it can now be run from any command line.

![title](images/mongosh2.png)

## Connecting Jupyter Lab to MongoDB


In [1]:
# Installing the pymongo package, which provides MongoClient

#pip install pymongo

from pymongo import MongoClient


In [2]:
# connecting to the local client
client = MongoClient("localhost", 27017)

In [3]:
# Checking that the database is still present
db = client['MSDS_696']
db

Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'MSDS_696')

In [4]:
db.list_collection_names()

['yeastCollection']

In [5]:
client.list_database_names()

['MSDS_696', 'admin', 'config', 'local']

In [6]:
collection = db.yeastCollection

In [7]:
import pprint
pprint.pprint(collection.find_one())

{'Alcohol_Tolerance': 'Low',
 'Attenuation': "Mid 70's",
 'Description': 'Parent strain is Mexican Lager OYL-113. Thiolized Mexican '
                'Lager - passion fruit, guava, NZ sauvignon blanc fruitiness, '
                'grapefruit',
 'Fermentation_Temperature': "50's",
 'Flocculation': 'Medium',
 'Number': 'MIP-043',
 'STA+': 'Negative',
 'Strain_Name': 'Lager Thiol',
 'Supplier': 'Propagate Lab',
 '_id': ObjectId('6601e26a2be1d29f48bc41d0')}


In [None]:
# finding all the ha

![title](images/picture.png)
