In [31]:
from pymongo import MongoClient

First, we need a connection string.

### MongoDB Connection String Guide

A MongoDB connection string is a URI that tells your client how to connect to the MongoDB server. Here is a detailed explanation.

---

#### 1. Basic Structure

```
mongodb://[username:password@]host1[:port1][,host2[:port2],...]/[database][?options]
```

Or for modern MongoDB Atlas clusters:

```
mongodb+srv://[username:password@]cluster-address/[database]?options
```

---

#### 2. Components Explained

1. **`mongodb://` or `mongodb+srv://`**

   * `mongodb://` → standard connection.
   * `mongodb+srv://` → DNS seed list connection (recommended for Atlas clusters).

2. **`[username:password@]`** *(optional for authentication)*

   * Include credentials if your database requires authentication.

3. **`host[:port]`**

   * Server address and port (default: 27017).
   * Can list multiple hosts for replica sets.

4. **`/database`** *(optional)*

   * Database to connect to initially.

5. **`?options`** *(optional query parameters)*

   * Additional settings like authentication mechanism, read preferences, SSL, etc.
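
One detail the components above leave implicit: if the username or password contains reserved characters such as `@`, `:` or `/`, they need to be percent-encoded before going into the URI. Below is a minimal sketch using only the standard library. The helper name `build_mongo_uri` and the sample credentials are hypothetical, chosen just for illustration.

```python
from urllib.parse import quote_plus, urlsplit


def build_mongo_uri(username, password, host, port=27017, database=""):
    """Assemble a mongodb:// URI, percent-encoding the credentials."""
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb://{user}:{pwd}@{host}:{port}/{database}"


# Hypothetical credentials; the password contains reserved characters.
uri = build_mongo_uri("myuser", "p@ss:word/1", "localhost", database="mydatabase")
print(uri)

# Splitting the URI again exposes the components described above.
parts = urlsplit(uri)
print(parts.scheme, parts.hostname, parts.port, parts.path)
```

The resulting string can be passed to `MongoClient` in the same way as the literal connection strings used elsewhere in this notebook.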

---

#### 3. Examples

##### Local MongoDB without authentication

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db = client.mydatabase
```

##### Local MongoDB with username/password

```python
client = MongoClient("mongodb://myuser:mypass@localhost:27017/mydatabase")
db = client["mydatabase"]
```

##### MongoDB Atlas cloud cluster

```python
client = MongoClient("mongodb+srv://myuser:mypass@cluster0.abcd.mongodb.net/mydatabase?retryWrites=true&w=majority")
db = client["mydatabase"]
```

---

#### 4. Common Options

| Option                              | Meaning                                                                   |
| ----------------------------------- | ------------------------------------------------------------------------- |
| `retryWrites=true`                  | Automatically retries certain write operations if they fail               |
| `w=majority`                        | Write concern; ensures writes are replicated to a majority of nodes       |
| `tls=true` or `ssl=true`            | Use TLS/SSL for encrypted connections                                     |
| `authSource=admin`                  | Database to authenticate against (default: the database named in the connection string, or `admin` if none is given) |
| `readPreference=secondaryPreferred` | Read from secondary nodes if possible                                     |
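
Options are ordinary `key=value` pairs joined with `&`. As a rough sketch, they can be kept in a dict and appended to the URI with the standard library's `urlencode`. The `base_uri` and the option values below are illustrative assumptions, not settings taken from this notebook.

```python
from urllib.parse import urlencode

# Illustrative base URI and options (assumed values).
base_uri = "mongodb://localhost:27017/mydatabase"
options = {"retryWrites": "true", "w": "majority", "authSource": "admin"}

# urlencode joins the pairs with "&" and escapes anything that needs it.
conn_str_with_options = f"{base_uri}?{urlencode(options)}"
print(conn_str_with_options)
```

Keeping the options in a dict makes it easier to toggle a single setting without editing a long string by hand.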



In [32]:
conn_str = "mongodb://root:1234@localhost:27017/myNewDatabase?authSource=admin"

In [33]:
client = MongoClient(conn_str)

The `client` object above represents the connection to the database server. Inspecting it shows the settings in use.

In [34]:
client

MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, authsource='admin')

In [35]:
client.list_database_names()

['admin', 'config', 'local', 'myNewDatabase', 'mySecondDatabase']

Next, we construct an object that represents a database.

In [36]:
myNewDatabase = client.myNewDatabase

In [37]:
myNewDatabase

Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, authsource='admin'), 'myNewDatabase')

In [38]:
myNewDatabase.list_collection_names()

['MyCollection']

`find()` returns an iterable cursor, since the collection may contain more than one document.

In [39]:
myNewDatabase.MyCollection.find()

<pymongo.synchronous.cursor.Cursor at 0x14eaa17b110>

In [40]:
import pprint

In [41]:
for i in myNewDatabase.MyCollection.find():
    pprint.pprint(i)

{'_id': ObjectId('68b25621aaccadc23089b03d'), 'hp': 99, 'name': 'jay'}
{'_id': ObjectId('68b2565aaaccadc23089b03e'), 'hp': 0, 'name': 'toon'}


### Create a new database

We can create it implicitly, simply by referring to it. MongoDB creates the database lazily, the first time data is written to it.

In [42]:
client.mySecondDatabase

Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, authsource='admin'), 'mySecondDatabase')

In [43]:
mySecondDatabase = client.mySecondDatabase

A new collection can be created implicitly in the same way.

In [44]:
second_list_doc = [{'mana': 77, 'name': 'jay'}, {'mana': 888, 'name': 'toon'}]

mySecondDatabase.SecondCollection.insert_many(second_list_doc)

InsertManyResult([ObjectId('68b333099ae1f910574bb69b'), ObjectId('68b333099ae1f910574bb69c')], acknowledged=True)

In [45]:
mySecondDatabase.SecondCollection.find_one()

{'_id': ObjectId('68b30f209ae1f910574bb698'), 'mana': 77, 'name': 'jay'}

**Note**: There are two ways to obtain database and collection objects.
1. `<client variable name>.<database name>` and `<database variable name>.<collection name>`
2. `<client variable name>["database name"]` and `<database variable name>["collection name"]`

The second option is recommended since it works with any database name, including names with spaces, dashes, or special characters.
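
To illustrate why bracket access is the safer default: attribute access only works when the name is a valid Python identifier. For example, `client.my-database` would be parsed as a subtraction. The sketch below checks this with the standard library. The helper `needs_bracket_access` and the sample names are hypothetical.

```python
import keyword


def needs_bracket_access(name):
    """Return True if `name` cannot be written as a Python attribute."""
    return not name.isidentifier() or keyword.iskeyword(name)


# Hypothetical database or collection names.
for name in ["myNewDatabase", "my-database", "my database", "2024_logs", "class"]:
    print(f"{name!r}: bracket access required = {needs_bracket_access(name)}")
```

This check does not cover names that collide with existing attributes or methods of the client or database object (such as `close`), which is another reason to prefer the bracket form.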

In [46]:
sec_col = mySecondDatabase["SecondCollection"]

In [47]:
sortedcol = sec_col.aggregate([{"$sortByCount": "$mana"}])

for i in sortedcol:
    pprint.pprint(i)

{'_id': 77, 'count': 2}
{'_id': 888, 'count': 2}
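
To make the pipeline above easier to read: `$sortByCount` groups the documents by the given expression, counts each group, and sorts the groups by count in descending order (it is shorthand for a `$group` stage followed by a `$sort` stage). The pure-Python sketch below emulates that logic on hypothetical in-memory documents. The sample list repeats each document so that the counts mirror the output shown above. It does not talk to MongoDB.

```python
from collections import Counter

# Hypothetical in-memory documents standing in for a collection.
docs = [
    {"mana": 77, "name": "jay"},
    {"mana": 888, "name": "toon"},
    {"mana": 77, "name": "jay"},
    {"mana": 888, "name": "toon"},
]

# Group by the "mana" value and count, then order by count (descending).
counts = Counter(doc["mana"] for doc in docs)
result = [{"_id": value, "count": n} for value, n in counts.most_common()]
print(result)
```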


For more on aggregation, see https://pymongo.readthedocs.io/en/stable/examples/aggregation.html

### Read a CSV file and write it to a collection

In [48]:
import pandas as pd

In [52]:
third_col = mySecondDatabase["ThirdCollection"]

df = pd.read_csv("username.csv", sep=";")

data = df.to_dict(orient="records")

third_col.insert_many(data)

for doc in third_col.find().limit(5):
    print(doc)

{'_id': ObjectId('68b334cf9ae1f910574bb6ac'), 'Username': 'booker12', ' Identifier': 9012, 'First name': 'Rachel', 'Last name': 'Booker'}
{'_id': ObjectId('68b334cf9ae1f910574bb6ad'), 'Username': 'grey07', ' Identifier': 2070, 'First name': 'Laura', 'Last name': 'Grey'}
{'_id': ObjectId('68b334cf9ae1f910574bb6ae'), 'Username': 'johnson81', ' Identifier': 4081, 'First name': 'Craig', 'Last name': 'Johnson'}
{'_id': ObjectId('68b334cf9ae1f910574bb6af'), 'Username': 'jenkins46', ' Identifier': 9346, 'First name': 'Mary', 'Last name': 'Jenkins'}
{'_id': ObjectId('68b334cf9ae1f910574bb6b0'), 'Username': 'smith79', ' Identifier': 5079, 'First name': 'Jamie', 'Last name': 'Smith'}


**Note**: By default, `to_dict` turns each column into a dict keyed by the index. I use `orient="records"` so that each row becomes a dict instead.
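
The difference between the two orientations can be seen on a small in-memory sample. The CSV text below is a hypothetical stand-in for `username.csv`. Its header is an assumption inferred from the output above, where the key `' Identifier'` carries a leading space. The sketch also strips that whitespace from the column names before converting.

```python
import io

import pandas as pd

# Hypothetical sample standing in for username.csv (header is assumed).
csv_text = (
    "Username; Identifier;First name;Last name\n"
    "booker12;9012;Rachel;Booker\n"
    "grey07;2070;Laura;Grey\n"
)

df = pd.read_csv(io.StringIO(csv_text), sep=";")
df.columns = df.columns.str.strip()  # ' Identifier' becomes 'Identifier'

by_column = df.to_dict()               # default: {column: {index: value}}
by_row = df.to_dict(orient="records")  # one dict per row

print(by_column["Username"])
print(by_row[0])
```

The list in `by_row` has the shape that `insert_many` expects: one dict per document.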

### Don't forget to close the connection

In [54]:
client.close()