In [None]:
import pymongo
import pprint
from bson.objectid import ObjectId

# Python + MongoDB

MongoDB is a widely used NoSQL document-based database system.
With `pymongo` we can interact with a MongoDB server from Python.

Unlike SQLite, MongoDB is a server-based DB, so we first have to install MongoDB and start a MongoDB instance before we can interact with it via Python!

See here for more information: https://www.mongodb.com/docs/manual/administration/install-community/

## Create a DB Client

Assuming the MongoDB server is running, we can connect to it by running the following command.

If no host or port is specified, the client connects to `localhost` on the default port 27017.

In [None]:
DATABASE_USERNAME = "..."
DATABASE_PASSWORD = "..."
PORT = 27017

# Use URI string format
client = pymongo.MongoClient(f"mongodb://{DATABASE_USERNAME}:{DATABASE_PASSWORD}@localhost:{PORT}/")
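
If the username or password contains reserved characters such as `@`, `:` or `/`, they have to be percent-encoded before they are placed in the URI. The following is a minimal sketch using only the standard library; the helper name `build_mongo_uri` and the sample credentials are made up for illustration.

```python
from urllib.parse import quote_plus


def build_mongo_uri(username, password, host="localhost", port=27017):
    """Return a MongoDB URI with percent-encoded credentials."""
    return f"mongodb://{quote_plus(username)}:{quote_plus(password)}@{host}:{port}/"


# Hypothetical credentials that contain reserved characters
uri = build_mongo_uri("anna@example", "p:ss/word")
print(uri)
```

The resulting string can be passed to `pymongo.MongoClient(uri)` in the same way as the f-string above.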

## Create a New Database
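With the client in place, we can access a new database. A MongoDB instance can host multiple databases! Note that MongoDB creates a database lazily, i.e. only once data is first written to it.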

In [None]:
db = client.test_database

## Create a New Collection

MongoDB is a document-based database.
Data is organized in collections, which can roughly(!) be compared to the concept of tables in a relational database.
Like a table, a collection groups data that belongs together logically.
One element of a collection is called a "document".

In [None]:
# Create a new collection named "runners" in our database
runners = db.runners

## Add a Single Document

Un- or semi-structured data is organized in key-value pairs, like Python dicts or JSON objects.
A value can be a key-value pair itself ("nested structures").
In our example, we have a collection of runners that contains one "document" of key-value pairs per runner.
Since we are dealing with a schema-less DB, we do not have to specify up front which key-value pairs a document needs.
Furthermore, each document can have different keys!

Each document automatically gets a unique `_id` if none is explicitly defined in the document dict.

In [None]:
# Let's specify a document in a dict-like structure
runner = {"Full Name": "Anna Einstein", "Shoe Size": 38, "Shirt Size": 38, "Team": 1}

In [None]:
# Insert the single document into the "runners" collection
runners.insert_one(runner)
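
`insert_one()` returns a result object whose `inserted_id` attribute holds the `_id` of the new document. pymongo also writes that `_id` into the dict that was passed in, so inserting the very same dict a second time fails with a duplicate key error. A small sketch of one way around this, assuming `collection` is a pymongo collection such as `runners`; the helper name `insert_copy` is hypothetical.

```python
def insert_copy(collection, document):
    """Insert a shallow copy of `document` and return the generated `_id`.

    `collection` is assumed to be a pymongo collection such as `runners`.
    """
    # insert_one() adds an "_id" key to the dict it receives, so we pass a copy
    result = collection.insert_one(dict(document))
    return result.inserted_id
```

Calling `insert_copy(runners, runner)` leaves the original `runner` dict untouched, so the cell can be re-run without an error (each run then adds another document).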

## SELECT Data from MongoDB

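We can read data from a collection with `find_one()`, which returns the first matching document or `None`, and `find()`, which returns a cursor over all matching documents.
Filters are passed as arguments inside the parentheses.
Note that `find()` and `find_one()` query a single collection only; combining data from several collections requires the aggregation pipeline (for example the `$lookup` stage) or code on the client side.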

In [None]:
pprint.pprint(runners.find_one())

## Add Many Documents

Now we want to add several documents at once to the `runners` collection:

In [None]:
list_runners = [
    {"Full Name": "Marius Fermi", "Shoe Size": 44, "Shirt Size": 60, "Distance": 2, "Team": 5, "Age": 54},
    {"Full Name": "James Pauli", "Shoe Size": 44, "Shirt Size": 42, "Team": 8, "Country": "USA"},
    {"Full Name": "Selma Meitner", "Shoe Size": 41, "Shirt Size": 40, "Team": 3, "Teamleader": True},
]

In [None]:
runners.insert_many(list_runners)

... and select them all by using `find()`:

In [None]:
for runner in runners.find():
    pprint.pprint(runner)

## Filter Data

We can filter data by passing a dict to the `find()` method.
For example: find all runners in team 1!

In [None]:
for runner in runners.find({"Team": 1}):
    pprint.pprint(runner)
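
Besides exact matches, a filter dict can contain query operators such as `$gte` (greater than or equal) and `$in` (value is one of a list). The following sketch only builds such a filter dict and therefore runs without a database; the helper name `build_runner_filter` is an assumption made for this example.

```python
def build_runner_filter(min_shoe_size=None, teams=None):
    """Build a filter dict for find() from optional criteria."""
    query = {}
    if min_shoe_size is not None:
        # Match documents whose "Shoe Size" is at least min_shoe_size
        query["Shoe Size"] = {"$gte": min_shoe_size}
    if teams is not None:
        # Match documents whose "Team" is one of the given values
        query["Team"] = {"$in": list(teams)}
    return query


query = build_runner_filter(min_shoe_size=41, teams=[3, 5])
print(query)
```

Passing the result to `runners.find(query)` should, with the sample documents inserted above, match Marius Fermi and Selma Meitner.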

## List of Collections

In [None]:
db.list_collection_names()

## Add Another Collection

In [None]:
list_trainings = [
    {"runner": "Anna", "time": "38:45", "date": "2023-07-06", "distance": 6.4},
    {"runner": "Anna", "time": "35:22", "date": "2023-07-07", "distance": 6.4},
    {"runner": "Marius", "time": "1:04:22", "date": "2023-08-07", "distance": 7, "mean_heartrate": 121},
    {"runner": "James", "time": "45:22", "distance": "6 miles", "mean_velocity": "7.6 mph"}
]

In [None]:
trainings = db.trainings

In [None]:
trainings.insert_many(list_trainings)

In [None]:
for training in trainings.find():
    pprint.pprint(training)
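
With two collections in place, data from both can be combined on the client side in plain Python. The sketch below uses lists of dicts as stand-ins for the results of `runners.find()` and `trainings.find()`, so it runs without a database. Matching on the first name is an assumption that happens to fit the sample data, and the helper name `join_trainings_to_runners` is hypothetical.

```python
# Plain dicts standing in for the results of runners.find() and trainings.find()
runner_docs = [
    {"Full Name": "Anna Einstein", "Team": 1},
    {"Full Name": "Marius Fermi", "Team": 5},
    {"Full Name": "James Pauli", "Team": 8},
]
training_docs = [
    {"runner": "Anna", "time": "38:45"},
    {"runner": "Anna", "time": "35:22"},
    {"runner": "Marius", "time": "1:04:22"},
    {"runner": "James", "time": "45:22"},
]


def join_trainings_to_runners(runner_docs, training_docs):
    """Attach each runner's team to their trainings, matching on first name."""
    team_by_first_name = {
        doc["Full Name"].split()[0]: doc.get("Team") for doc in runner_docs
    }
    # Trainings without a matching runner get team None
    return [
        {**training, "team": team_by_first_name.get(training["runner"])}
        for training in training_docs
    ]


joined = join_trainings_to_runners(runner_docs, training_docs)
for row in joined:
    print(row)
```

For larger collections, a server-side join with the `$lookup` aggregation stage is usually the better fit, but it requires a field whose values are equal in both collections.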

## Update a Document

You can change a document by using the `update_one()` or `update_many()` method.
Both take two arguments: a filter that selects the document(s) to change, and an update document that names the operation and the key-value pairs to update. For example, `$set` simply sets a new value.

Other operators can be found here: https://www.mongodb.com/docs/manual/reference/operator/update/#std-label-update-operators

In [None]:
# Use a valid ID from the above printout
trainings.update_one({"_id": ObjectId("65141cbbc429eac58ebf20fb")}, {"$set": {"distance": 8.3}})

In [None]:
pprint.pprint(trainings.find_one({"_id": ObjectId("65141cbbc429eac58ebf20fb")}))
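
Since the generated `ObjectId` values differ on every machine, it can be more convenient to select the document by its fields. `update_one()` returns a result object whose `matched_count` attribute tells us whether the filter found a document. A minimal sketch, assuming `collection` is a pymongo collection such as `trainings`; the helper name `set_distance` is hypothetical.

```python
def set_distance(collection, runner, date, distance):
    """Set `distance` on the first training matching runner and date.

    Returns True if a document matched the filter.
    """
    result = collection.update_one(
        {"runner": runner, "date": date},  # filter: which document to update
        {"$set": {"distance": distance}},  # update: what to change
    )
    return result.matched_count == 1
```

For example, `set_distance(trainings, "Anna", "2023-07-06", 8.3)` would update Anna's first training from the sample data.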

---
_This notebook is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright © [Point 8 GmbH](https://point-8.de)_