# MongoDB from Python
This notebook introduces how we can talk to a running instance of `MongoDB` from a Python program. Any Python program can fetch data from or put data into `MongoDB`: it could be a standalone program, a Jupyter notebook like this one, an `IPython` shell, or your `Flask` server.

First, we must connect to a running instance of `MongoDB`. Remember that `MongoDB` is just a program that stores data and allows efficient querying of it. It can be running anywhere: on your machine (`localhost` or `127.0.0.1`), on a machine in the cloud (`vcm-0000.vm.duke.edu`), or on a MongoDB Cloud server. This notebook demonstrates connecting to a MongoDB Atlas cloud database. We connect to it as follows:

In [1]:
from pymodm import connect

In [2]:
connect("mongodb+srv://nh170:olina12345@cluster0.pld4uds.mongodb.net/db2023?retryWrites=true&w=majority")

In the above command, replace the string in the `connect()` function with the string you obtained from the MongoDB Atlas online interface. Your string will already have the `<username>`, `<clustername>` and `<folder>` populated with the correct entries for your database. You will need to fill in the `<password>` you created when making the `<username>` database access account.

The `<folder>` portion of the connection URL above specifies which MongoDB database or "folder" we want to talk to. Each `MongoDB` instance can have multiple "databases" that are independent of each other. Just think of this as a namespace. If we connect to `example2` instead and it does not exist, a blank database will be created under the namespace `/example2` the first time data is saved to it. You can change this `<folder>` to whatever you want.
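A password typed directly into a notebook is easy to leak when the notebook is shared. The following is a minimal sketch of one alternative, assuming the parts of the connection string live in environment variables. The variable names (`MONGO_USER`, `MONGO_PASSWORD`, `MONGO_HOST`, `MONGO_DB`) and the helper `build_atlas_uri` are hypothetical and not part of `pymodm`. `quote_plus` percent-encodes characters such as `@` or `:` that would otherwise break the URL.

```python
import os
from urllib.parse import quote_plus


def build_atlas_uri(username, password, host, database):
    """Assemble a mongodb+srv connection string from its parts."""
    # Percent-encode the credentials so reserved characters (@, :, /) are safe.
    user = quote_plus(username)
    pwd = quote_plus(password)
    return (
        f"mongodb+srv://{user}:{pwd}@{host}/{database}"
        "?retryWrites=true&w=majority"
    )


# Hypothetical environment variable names; the defaults are placeholders only.
uri = build_atlas_uri(
    os.environ.get("MONGO_USER", "myuser"),
    os.environ.get("MONGO_PASSWORD", "p@ss:word"),
    os.environ.get("MONGO_HOST", "cluster0.example.mongodb.net"),
    os.environ.get("MONGO_DB", "example2"),
)
# connect(uri)  # uncomment once the variables hold real values
```

The resulting `uri` can be passed to `connect()` in place of the literal string.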

__NOTE__: If you get an error similar to `pymongo.errors.ServerSelectionTimeoutError:...[SSL: CERTIFICATE_VERIFY_FAILED]...`, do the following. If you are using macOS, visit the <a href="https://github.com/dward2/BME547/blob/main/Resources/installations_mac.md#ssl-or-certificate-errors">installations_mac.md</a> page in the class repository for instructions on updating your certificate authority. If that doesn't work, or you are using Windows, you will need to import the `ssl` module and then add `ssl_cert_reqs=ssl.CERT_NONE` to the connect command as shown below. Note that PyMongo 4 removed the `ssl_cert_reqs` option; if the keyword is rejected, `tlsAllowInvalidCertificates=True` is its replacement.

```python
import ssl

connect("<YourConnectString>", ssl_cert_reqs=ssl.CERT_NONE)
```

## Models (schemas)
MongoDB is very forgiving, and does not _require_ us to specify what collections (tables) we want to store upfront, nor do we have to specify the structure of data we are going to store.

**However**, it is very useful and important to specify some of this structure in advance, so that your code makes clear what data structure you expect. For example, if we want to store a `User` in the database, we want it to be very clear in the code what fields a `User` is going to have and what type each of those fields will be. This allows for validation when storing and retrieving `User`s.

An example of a "model" or schema definition for our MongoDB interface library (`pymodm`) is below.

In [3]:
from pymodm import MongoModel, fields
class User(MongoModel):
    email = fields.EmailField(primary_key=True)
    first_name = fields.CharField()
    last_name = fields.CharField()
    age = fields.IntegerField()

As you can see, this `User` is just a normal old Python class. Since it inherits from the `MongoModel` class, it has many existing methods and properties, including an initialization method that is based on the fields (like `email`) we specified.

We can use and interact with `User` and its properties (variables) like a normal Python class.  See the examples below.

In [4]:
u = User(email="suyash@suyashkumar.com", first_name="Suyash", last_name="Kumar", age=1000)
print(u)

<User object>


In [5]:
print(u.email)

suyash@suyashkumar.com


In [6]:
print(u.first_name)

Suyash


## Save a User
The `User` class has some methods that allow us to interact with the MongoDB database. For example, to save this `User` `u` to the MongoDB database we connected to, we can simply call:

In [7]:
u.save()

User(email='suyash@suyashkumar.com', age=1000, first_name='Suyash', last_name='Kumar')

The user is now stored in the MongoDB database!

### Add more Users!
Let's add some more Users to this database:

In [8]:
u2 = User(email="mark@test.com", first_name="Mark", last_name="Palmeri", age=2000)
u2.save()
u3 = User(email="bob@test.com", first_name="Bob", last_name="Smith", age=2000)
u3.save()

User(email='bob@test.com', age=2000, first_name='Bob', last_name='Smith')

## Query Users
We can now search for **all** Users in our database as follows.

In [9]:
for user in User.objects.raw({}):
    print(user.email)

suyash@suyashkumar.com
mark@test.com
bob@test.com


As you can see, the `user` variable inside the loop is just an instance of the class `User` we created earlier! We can work with the `user` variable just like we are used to working with classes. We can even modify `user` and then call `user.save()` if we wanted to update the user.

We can also choose to **filter** the `User`s we want to query with conditions like this:

In [10]:
for user in User.objects.raw({"age": 2000}):
    print(user.email)

mark@test.com
bob@test.com


As you can see, only `mark` and `bob` have an "age" equal to 2000, so only those Users are fetched to iterate over.

If we expect a query to return only one result, or we just want the first `User` a query returns, we can do the following:

In [16]:
mark_user = User.objects.raw({"first_name": "Mark"}).first()
print(mark_user.first_name)
print(mark_user.email)

Mark
mark@test.com


If we want to look at a range of possible results, we use comparisons. Details on Comparison Query Operators in MongoDB can be found at <https://docs.mongodb.com/manual/reference/operator/query-comparison/>. Below is an example of a greater-than-or-equal (`$gte`) query.

In [14]:
for user in User.objects.raw({"age": {"$gte": 1000}}):
    print(user.first_name)

Suyash
Mark
Bob
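
Several comparison operators can be combined on the same field, in which case all of them must hold. The following sketch builds such a filter document for a closed range. The helper name `range_filter` is hypothetical and not part of `pymodm` or MongoDB; it only assembles the dictionary that would be passed to `raw()`.

```python
def range_filter(field, minimum=None, maximum=None):
    """Build a MongoDB filter document for minimum <= field <= maximum."""
    conditions = {}
    if minimum is not None:
        conditions["$gte"] = minimum
    if maximum is not None:
        conditions["$lte"] = maximum
    # With no bounds, the empty filter {} matches every document.
    return {field: conditions} if conditions else {}


query = range_filter("age", minimum=1000, maximum=2000)
print(query)  # {'age': {'$gte': 1000, '$lte': 2000}}

# With the User model defined earlier, the filter would be used like this:
# for user in User.objects.raw(query):
#     print(user.first_name)
```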


### Query and Update Users
As mentioned earlier, you can fetch a user from the database, update it in Python, and call `save()` to write the change back to the database. For example, if we wanted to update Bob's age:


In [None]:
bob_user = User.objects.raw({"first_name": "Bob"}).first()
bob_user.age = 9000
bob_user.save()

### Query by primary key

One thing you will notice is that we cannot query users by the field name `email`. This is because we set `email` to be the `primary_key` when we defined our `User` model/schema, so its value is stored in MongoDB under the key `_id` rather than `email`. When you are looking for a single user, you should usually query by the primary key.

In [None]:
suyash = User.objects.raw({"email": "suyash@suyashkumar.com"}).first()  # this will NOT work

Instead we must query primary key fields by the key `"_id"` like so:

In [None]:
suyash = User.objects.raw({"_id": "suyash@suyashkumar.com"}).first()
print(suyash.first_name)
print(suyash.email)
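
Because the primary-key field is stored under `_id`, a filter written with the model's own field name has to be translated before it is passed to `raw()`. Below is a small sketch of that translation. `to_raw_filter` is a hypothetical helper, not part of `pymodm`, and it only renames top-level keys.

```python
def to_raw_filter(query, primary_key_field):
    """Return a copy of query with the primary-key field renamed to "_id"."""
    return {
        ("_id" if key == primary_key_field else key): value
        for key, value in query.items()
    }


raw_query = to_raw_filter({"email": "suyash@suyashkumar.com"}, "email")
print(raw_query)  # {'_id': 'suyash@suyashkumar.com'}

# With the User model defined earlier, the filter would be used like this:
# suyash = User.objects.raw(raw_query).first()
```

Other keys pass through unchanged, so the same helper works for filters that mix the primary key with ordinary fields such as `age`.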