# Connecting to MongoDB using PyMongo

To connect to MongoDB and work with data using Python, we will install the PyMongo driver.

The easiest way to install the driver is through the pip package manager. Run one of the following from a command line or the Anaconda Prompt:

Command line:

python -m pip install pymongo

Anaconda Prompt:

pip install pymongo

To use mongodb+srv:// connection strings (as with MongoDB Atlas), install the srv extra as well:

pip3 install 'pymongo[srv]'

Once PyMongo is installed, we can write our first application, which returns information about the MongoDB server. In your Python development environment or a text editor, enter the following code.

In [1]:
import warnings
warnings.filterwarnings('ignore')

In [5]:
pip install pymongo

Collecting pymongo
  Downloading pymongo-4.11.3-cp312-cp312-macosx_11_0_arm64.whl.metadata (22 kB)
Collecting dnspython<3.0.0,>=1.16.0 (from pymongo)
  Downloading dnspython-2.7.0-py3-none-any.whl.metadata (5.8 kB)
Downloading pymongo-4.11.3-cp312-cp312-macosx_11_0_arm64.whl (895 kB)
Downloading dnspython-2.7.0-py3-none-any.whl (313 kB)
Installing collected packages: dnspython, pymongo
Successfully installed dnspython-2.7.0 pymongo-4.11.3
Note: you may need to restart the kernel to use updated packages.


In [7]:
import pymongo
from pprint import pprint
# Connect to MongoDB; replace the connection string below with your own
client = pymongo.MongoClient("mongodb+srv://heeyaamin03:heeya@cluster0.ecfyk.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0")
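
Hard-coding a username and password in a notebook makes them easy to leak. A minimal sketch, assuming the credentials live in environment variables with the hypothetical names MONGODB_USER, MONGODB_PASSWORD and MONGODB_HOST (none of these appear in the original), builds the connection string at run time instead. quote_plus percent-encodes characters such as @ and : that would otherwise break the URI.

```python
import os
from urllib.parse import quote_plus


def build_mongo_uri(user, password, host):
    """Return a mongodb+srv connection string with escaped credentials.

    Hypothetical helper for illustration; it is not part of PyMongo.
    """
    return (
        f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/?retryWrites=true&w=majority"
    )


# Assumed environment variable names; the fallbacks are placeholders only.
uri = build_mongo_uri(
    os.environ.get("MONGODB_USER", "myUser"),
    os.environ.get("MONGODB_PASSWORD", "myP@ssword"),
    os.environ.get("MONGODB_HOST", "cluster0.example.mongodb.net"),
)
print(uri)
# The resulting string can then be passed to pymongo.MongoClient(uri).
```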


### Exploring Collections and Documents

A collection in MongoDB is a container for documents, and a database is a container for collections.
Some advantages of storing data in documents, such as a dynamic, flexible schema and the ability to store arrays, can be seen in our simple Python scripts.
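
To make the flexible schema concrete, here is a small sketch using plain Python dictionaries, which is how PyMongo represents documents. The field names (cast, rating) and the values are made up for illustration. Two documents destined for the same collection need not share the same fields, and a field can hold an array.

```python
import json

# Two hypothetical documents for the same collection. They share some
# fields but not all, and 'cast' holds an array (a Python list).
movie_a = {"title": "Movie A", "year": 2012, "cast": ["Actor One", "Actor Two"]}
movie_b = {"title": "Movie B", "year": 2013, "rating": 7.5}


def field_names(documents):
    """Return the sorted union of field names used across the documents."""
    return sorted({key for doc in documents for key in doc})


print(field_names([movie_a, movie_b]))
print(json.dumps(movie_a, indent=2))
```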

##### Connecting to a specific Database in our cluster.


In [9]:
# alternative
#db = client.test
#collection = db.video # or db['video']

In [11]:
#connecting to database "video"
db = client.video  # or client['video']
# Issue the serverStatus command and print the results
serverStatusResult=db.command("serverStatus")
pprint(serverStatusResult)

{'$clusterTime': {'clusterTime': Timestamp(1743901188, 51),
                  'signature': {'hash': b"]'\xec\x8c\xdb(\xcb<\xdf\xadH\x1b"
                                        b'z\xbb\xfey\xa6\x7f\xba\n',
                                'keyId': 7427503780691705873}},
 'atlasVersion': {'gitVersion': '8f5553016078a4aec29ed583dfdc238f744b02e8',
                  'version': '20250312.0.0.1741888777'},
 'connections': {'available': 489, 'current': 11, 'totalCreated': 53},
 'extra_info': {'note': 'fields vary by platform', 'page_faults': 0},
 'host': 'cluster0-shard-00-02.ecfyk.mongodb.net:27017',
 'localTime': datetime.datetime(2025, 4, 6, 0, 59, 48, 666000),
 'mem': {'bits': 64,
         'mapped': 0,
         'mappedWithJournal': 0,
         'resident': 0,
         'supported': True,
         'virtual': 0},
 'metrics': {'aggStageCounters': {'search': 0,
                                  'searchBeta': 0,
                                  'searchMeta': 0,
                                  

##### Exploring the Collections in a Database

With list_collection_names(), we get a list of the collections available in the database.

In [13]:
with client:
    
    db = client.video
    print(db.list_collection_names())

['movieDetails']


##### Connecting to a specific Database in our cluster.

We need to create the client again because the "with client:" block in the previous step closes the connection when it exits.


In [15]:
# alternative
#db = client.test
#collection = db.video # or db['video']

In [19]:
#connecting to database "video"
#insert your mongodb connection string here
client = pymongo.MongoClient("mongodb+srv://heeyaamin03:heeya@cluster0.ecfyk.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0")

db = client.video  # or client['video']
# Issue the serverStatus command and print the results
serverStatusResult=db.command("serverStatus")
pprint(serverStatusResult)

{'$clusterTime': {'clusterTime': Timestamp(1743901226, 45),
                  'signature': {'hash': b'\xec\x17\xc1E\xb9\x87QJ3\xd2at'
                                        b'+{\xfb\xd7\xe7\xed\x85\xf9',
                                'keyId': 7427503780691705873}},
 'atlasVersion': {'gitVersion': '8f5553016078a4aec29ed583dfdc238f744b02e8',
                  'version': '20250312.0.0.1741888777'},
 'connections': {'available': 489, 'current': 11, 'totalCreated': 56},
 'extra_info': {'note': 'fields vary by platform', 'page_faults': 0},
 'host': 'cluster0-shard-00-02.ecfyk.mongodb.net:27017',
 'localTime': datetime.datetime(2025, 4, 6, 1, 0, 26, 535000),
 'mem': {'bits': 64,
         'mapped': 0,
         'mappedWithJournal': 0,
         'resident': 0,
         'supported': True,
         'virtual': 0},
 'metrics': {'aggStageCounters': {'search': 0,
                                  'searchBeta': 0,
                                  'searchMeta': 0,
                                  'v

##### Reading Data in Sorted Order

In [21]:
from pymongo import DESCENDING
for x in db.myMovies.find().sort("year", DESCENDING):
    print(x)

##### Creating Sample Data in Movies Collection

In [23]:
#Step 1: Create sample data
_id = ['tt0084740','tt0084741','tt0084742','tt0084743','tt0084744','tt0084745','tt0084746','tt0084747','tt0084748','tt0084749',
      'tt0084750','tt0084751']
# Note: title has 13 entries, but only the first 12 are used by the loop below
title = ['Avengers','Batman Begins','Dark Knight', 'Dark Knight Rises', 
         'Wonder Women','Iron Man','Ant Man', 'Thor','Dr Strange', 'Captain America','X- Men:First Class','Superman', 'Hulk']
year = [2012,2012,2013,2014,2018,2012,2018,2013,2013,2012,2013,2018] # Don't rely on these as real release dates :P

for x in range(0, 12):
    movie = {
        'id': _id[x],
        'title' : title[x],
        'year' : year[x],
        'type' : 'movies'
    }
    #Step 2: Insert the movie document directly into MongoDB via insert_one
    result=db.myMovies.insert_one(movie)
    #Step 3: Print to the console the ObjectID of the new document
    print('Created {0} of 12 as {1}'.format(x,result.inserted_id))
#Step 4: Tell us that you are done
print('finished creating 12 movies')

Created 0 of 12 as 67f1d236318468ab1ed84169
Created 1 of 12 as 67f1d237318468ab1ed8416a
Created 2 of 12 as 67f1d237318468ab1ed8416b
Created 3 of 12 as 67f1d237318468ab1ed8416c
Created 4 of 12 as 67f1d237318468ab1ed8416d
Created 5 of 12 as 67f1d237318468ab1ed8416e
Created 6 of 12 as 67f1d237318468ab1ed8416f
Created 7 of 12 as 67f1d237318468ab1ed84170
Created 8 of 12 as 67f1d237318468ab1ed84171
Created 9 of 12 as 67f1d237318468ab1ed84172
Created 10 of 12 as 67f1d237318468ab1ed84173
Created 11 of 12 as 67f1d237318468ab1ed84174
finished creating 12 movies
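
The loop above makes one round trip to the server per document. As an alternative sketch, the documents can be built first with zip and then sent in one call, since PyMongo's insert_many accepts a list of dictionaries. The helper build_movies below is hypothetical and only prepares the data, so it runs without a database connection. Because zip stops at the shortest input, a surplus entry in one list is silently ignored.

```python
def build_movies(ids, titles, years):
    """Pair up ids, titles and years into movie documents.

    Hypothetical helper; zip() stops at the shortest list, so extra
    entries in a longer list are dropped.
    """
    return [
        {"id": movie_id, "title": title, "year": year, "type": "movies"}
        for movie_id, title, year in zip(ids, titles, years)
    ]


sample = build_movies(
    ["tt0084740", "tt0084741"],
    ["Avengers", "Batman Begins", "Dark Knight"],  # one title too many
    [2012, 2012],
)
print(len(sample))  # 2
# With a live connection, the list could be inserted in a single call:
# result = db.myMovies.insert_many(sample)
# print(len(result.inserted_ids))
```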


##### Accessing collection myMovies

In MongoDB, the find_one method queries for a single document, much like a SELECT statement in a relational database. To use find_one in PyMongo, we pass a Python dictionary that specifies the search criteria.

In [25]:
result = db.myMovies.find_one({'year': 2013}) # feel free to try a different year in case you get None
print(result)

{'_id': ObjectId('67f1d237318468ab1ed8416b'), 'id': 'tt0084742', 'title': 'Dark Knight', 'year': 2013, 'type': 'movies'}
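
As a rough mental model, an equality filter such as {'year': 2013} matches a document when every key in the filter is present in the document with the same value. The sketch below imitates that behaviour on a plain list of dictionaries. It is a simplification for illustration, not how MongoDB evaluates queries: it ignores query operators, array matching, and nested fields. The names matches and find_one_local are hypothetical.

```python
def matches(document, criteria):
    """True if every key/value pair in criteria is present in document."""
    return all(
        key in document and document[key] == value
        for key, value in criteria.items()
    )


def find_one_local(documents, criteria):
    """Return the first matching document, or None if nothing matches."""
    return next((doc for doc in documents if matches(doc, criteria)), None)


movies = [
    {"title": "Avengers", "year": 2012},
    {"title": "Dark Knight", "year": 2013},
    {"title": "Thor", "year": 2013},
]
print(find_one_local(movies, {"year": 2013}))  # {'title': 'Dark Knight', 'year': 2013}
print(find_one_local(movies, {"year": 1999}))  # None
```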


The find method returns a cursor over all documents that match the search criteria. To count the matching documents, use the collection's count_documents method, as below; the older cursor count() method was removed in PyMongo 4.

In [27]:
movies_2013 = db.myMovies.count_documents({'year': 2013})
print(movies_2013)

4


Consider the scenario where you want to count how many movies fall in each year across the entire data set. You can do this with a single query using the MongoDB aggregation pipeline.

In [29]:
stargroup = db.myMovies.aggregate(
    # The aggregation pipeline is defined as a list of stages
    [
        # The first stage groups the documents by year and counts them
        {'$group': {'_id': '$year', 'count': {'$sum': 1}}},
        # The second stage sorts the groups by year, ascending
        {'$sort': {'_id': 1}},
    ]
)
# Print the result
for group in stargroup:
    print(group)

{'_id': 2012, 'count': 4}
{'_id': 2013, 'count': 4}
{'_id': 2014, 'count': 1}
{'_id': 2018, 'count': 3}
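
To see what the two pipeline stages compute, the same grouping can be reproduced in plain Python with collections.Counter. This is only a local illustration of the $group and $sort logic (the real aggregation runs on the server). The function name count_by_year is hypothetical, and the year list is the one used when creating the sample data.

```python
from collections import Counter


def count_by_year(documents):
    """Mimic a $group on year with a $sum of 1, followed by an ascending $sort."""
    counts = Counter(doc["year"] for doc in documents)
    return [{"_id": year, "count": counts[year]} for year in sorted(counts)]


years = [2012, 2012, 2013, 2014, 2018, 2012, 2018, 2013, 2013, 2012, 2013, 2018]
for group in count_by_year([{"year": y} for y in years]):
    print(group)
```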


### Update data 

There are several methods for updating MongoDB data, including update_one, update_many and replace_one. The update_one method updates a single document that matches a query.

The following code adds a new “star_rating” field to a document.

In [31]:

SampleRecord = db.myMovies.find_one({'year': 2013})
print('A sample document:')
pprint(SampleRecord)

result = db.myMovies.update_one({'_id' : SampleRecord.get('_id') }, {'$inc': {'star_rating': 5}})
print('Number of documents modified : ' + str(result.modified_count))

UpdatedDocument = db.myMovies.find_one({'_id':SampleRecord.get('_id')})
print('The updated document:')
pprint(UpdatedDocument)

A sample document:
{'_id': ObjectId('67f1d237318468ab1ed8416b'),
 'id': 'tt0084742',
 'title': 'Dark Knight',
 'type': 'movies',
 'year': 2013}
Number of documents modified : 1
The updated document:
{'_id': ObjectId('67f1d237318468ab1ed8416b'),
 'id': 'tt0084742',
 'star_rating': 5,
 'title': 'Dark Knight',
 'type': 'movies',
 'year': 2013}


Notice that the original document did not have the “star_rating” field, and the update allowed us to add it easily. This ability to add keys dynamically, without the hassle of costly ALTER TABLE statements, is the power of MongoDB’s flexible data model. It makes rapid application development a reality.
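
The field appeared because $inc treats a missing field as a starting point: if the field does not exist, it is created with the given amount, otherwise the amount is added to the current value. A small local sketch of that rule follows; the helper apply_inc is hypothetical and works on a plain dictionary, not on the database.

```python
def apply_inc(document, increments):
    """Return a copy of document with $inc-style increments applied."""
    updated = dict(document)
    for field, amount in increments.items():
        # A missing field starts at 0, so it ends up set to amount.
        updated[field] = updated.get(field, 0) + amount
    return updated


movie = {"title": "Dark Knight", "year": 2013}
once = apply_inc(movie, {"star_rating": 5})
twice = apply_inc(once, {"star_rating": 5})
print(once["star_rating"], twice["star_rating"])  # 5 10
```

This also means that re-running the update cell raises star_rating to 10. To assign a fixed value regardless of how often the cell runs, the $set operator is the usual choice.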

If you want to replace all the fields of a document while keeping the same ObjectId, use the replace_one method.

### Deleting documents

Methods such as delete_one and delete_many take a query that matches the documents to delete as the first parameter.

In [33]:
#check data you want to delete

result = db.myMovies.find_one({'year': 2018})
print(result)

{'_id': ObjectId('67f1d237318468ab1ed8416d'), 'id': 'tt0084744', 'title': 'Wonder Women', 'year': 2018, 'type': 'movies'}


In [35]:
# Delete the records
result = db.myMovies.delete_many({"year": 2018})

In [37]:
# check whether the documents were deleted

result = db.myMovies.find_one({'year': 2018})
print(result)

None


If you need to delete every document in a large collection, it may be more efficient to drop the collection instead of deleting the documents.
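
A hedged sketch of that choice: the helper below drops the collection when it is large and falls back to delete_many otherwise. drop, delete_many and estimated_document_count are PyMongo Collection methods; the helper name clear_collection and the threshold value are assumptions for illustration. Note that dropping a collection also removes its indexes.

```python
def clear_collection(collection, drop_threshold=10_000):
    """Remove every document from a PyMongo collection.

    Hypothetical helper: at or above drop_threshold documents the collection
    is dropped (indexes are removed too); otherwise documents are deleted.
    """
    if collection.estimated_document_count() >= drop_threshold:
        collection.drop()
        return "dropped"
    result = collection.delete_many({})  # an empty filter matches all documents
    return f"deleted {result.deleted_count} documents"


# Usage with a live connection (not executed here):
# print(clear_collection(db.myMovies))
```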