# MongoDB Queries in Python <br>

## Muzammil Mushtaq <br>


## Tasks <br>

In this short project, we connected to a MongoDB database from a Python client (pymongo), created a collection, and populated it with documents. We also wrote a Python script that generates random JSON-style student records, then loaded that dataset into MongoDB. Finally, we ran several MongoDB operations from Python: inserting, updating, and deleting documents, computing an average, and counting documents per group.

### *Test Database Connection*

In [2]:
import pymongo
connection_string = "mongodb://localhost:27017"

try:
    # MongoClient connects lazily: a connection failure surfaces at the first real operation below
    client = pymongo.MongoClient(connection_string)
    database = client['Test_database']  # replace with your_database_name

    # List all available databases (uncomment the call below to print the names)
    print("List of available databases:")
    # print(client.list_database_names())

    # Access a collection in the database
    collection = database['Test_collection'] # replace with your_collection_name

    # Count the documents in the collection
    document_count = collection.count_documents({})
    print(f"Number of documents in 'Test_collection': {document_count}")

    print("Connection test successful!")

except pymongo.errors.ConnectionFailure as e:
    print(f"Could not connect to MongoDB: {e}")


List of available databases:
Number of documents in 'Test_collection': 0
Connection test successful!


### *Develop a Python application that generates random JSON-style student documents*

In [3]:
import json
import random

# Sample data for generating random student records
student_names = ["Alice", "Bob", "Charlie", "David", "Emma", "Frank", "Grace", "Henry", "Ivy", "Jack"]
subjects = ["Math", "Science", "History", "Literature", "Programming", "Data Science", "Astrophysics"]

def generate_student():
    # Generate a random student record
    name = random.choice(student_names)
    age = random.randint(18, 25)
    subject = random.choice(subjects)

    student = {
        "name": name,
        "age": age,
        "subject": subject
    }
    return student

def generate_student_records(num_records):
    # Generate multiple random student records
    students = [generate_student() for _ in range(num_records)]
    return students

if __name__ == "__main__":
    num_of_records = 100  # Number of student records to generate

    # Generate random student records
    students_data = generate_student_records(num_of_records)

    # Save generated data to a JSON file
    with open('students_data.json', 'w') as file:
        json.dump(students_data, file, indent=2)

    print(f"Generated {num_of_records} student records and saved to students_data.json")


Generated 100 student records and saved to students_data.json
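
Because the generator above draws from the global `random` state, every run produces a different dataset, which makes the later query results hard to compare between runs. A minimal sketch of a reproducible variant follows. The function name `generate_seeded_records` and the `seed` parameter are illustrative assumptions and are not part of the original script.

```python
import random

# Same sample pools as in the cell above
STUDENT_NAMES = ["Alice", "Bob", "Charlie", "David", "Emma",
                 "Frank", "Grace", "Henry", "Ivy", "Jack"]
SUBJECTS = ["Math", "Science", "History", "Literature",
            "Programming", "Data Science", "Astrophysics"]


def generate_seeded_records(num_records, seed):
    """Return num_records random student dicts, reproducible for a given seed.

    Hypothetical helper: the name and the seed parameter are assumptions.
    """
    # A private generator leaves the global random state untouched
    rng = random.Random(seed)
    return [
        {
            "name": rng.choice(STUDENT_NAMES),
            "age": rng.randint(18, 25),  # randint is inclusive on both ends
            "subject": rng.choice(SUBJECTS),
        }
        for _ in range(num_records)
    ]


if __name__ == "__main__":
    first = generate_seeded_records(100, seed=42)
    second = generate_seeded_records(100, seed=42)
    # Two runs with the same seed yield identical records
    print(first == second)
```

Passing the result to `json.dump` as in the original cell would then write the same file on every run with the same seed.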


### *Push the Student dataset (students_data.json) into MongoDB*

In [4]:
import pymongo
import json

with open('students_data.json') as file:
    students_data = json.load(file)
    
# Insert the data into the collection
result = collection.insert_many(students_data)

# Print inserted document IDs
print("Uncomment to print the Inserted document IDs:")
# for doc_id in result.inserted_ids:
#     print(doc_id)

Uncomment to print the Inserted document IDs:
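
Two details of the cell above are easy to miss: rerunning it appends another copy of the dataset each time, and `insert_many` adds an `_id` key to every dict it is given. The sketch below shows one possible way to make the load repeatable. The helper name `reload_collection` is an assumption, and `collection` is assumed to behave like a pymongo `Collection` (only `delete_many` and `insert_many` are used).

```python
import copy


def reload_collection(collection, records):
    """Replace the collection's contents with `records`; return the insert count.

    Hypothetical helper: `collection` is assumed to be a pymongo Collection.
    """
    # Remove every existing document so a rerun does not append duplicates
    collection.delete_many({})
    if not records:
        # insert_many rejects an empty list, so stop here
        return 0
    # insert_many adds an "_id" key to each dict it receives, so pass copies
    # to keep the caller's records unchanged
    docs = copy.deepcopy(records)
    result = collection.insert_many(docs)
    return len(result.inserted_ids)
```

It could be called as `reload_collection(collection, students_data)` in place of the plain `insert_many` call. Since `delete_many({})` wipes the whole collection, this only suits a scratch collection such as `Test_collection`.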


### *MongoDB Queries*

In [5]:
'''                                 Insert new documents
'''
insert_student = [{
        "name": 'Muzammil',
        "age": 31,
        "subject": 'Astrophysics'
    },
    {
        "name": 'Kira',
        "age": 26,
        "subject": 'German Language'
    }
]

# Insert the new student data into the collection
result = collection.insert_many(insert_student)

# Print the inserted document IDs
for doc_id in result.inserted_ids:
    print("Inserted document ID:", doc_id)
    
print (150*'*')
#**********************************************************************************

'''                                Delete documents
'''
delete_criteria = {
    "name": "Muzammil",
    "subject": "Astrophysics"
}

# Delete the first document that matches the criteria (delete_one removes at most one)
result = collection.delete_one(delete_criteria)

# Print the number of documents deleted
print("Number of documents deleted:", result.deleted_count)

print (150*'*')
#**********************************************************************************

'''                                Average Operation
'''

pipeline_avg = [
    {"$group": {
    "_id": None,
    "average_age": {"$avg": "$age"}
    }
    }
]
result_avg = list(collection.aggregate(pipeline_avg))
print("Average age for all students:", int(result_avg[0]['average_age']))

print (150*'*')
#**********************************************************************************

'''                             Group-by and count: number of students per subject
'''
pipeline_group_count = [
    {
        "$group": {
            "_id": "$subject",  # Group by the 'subject' field
            "count": {"$sum": 1}  # Count occurrences in each group
        }
    }
]
result_group_count = list(collection.aggregate(pipeline_group_count))
pretty_result = json.dumps(result_group_count, indent=1)
print(pretty_result)

print (150*'*')
#**********************************************************************************

'''                             Updating a document that matches specific criteria
'''
update_student = {
        "name": 'Jack',
        "age": 22,
        "subject": 'Science'
    }
filter_criteria = {'name': update_student['name']}
update_data = {'$set': update_student}  # $set overwrites only the listed fields; any other fields are left untouched

# update_one modifies only the first document that matches the filter
result = collection.update_one(filter_criteria, update_data)

Inserted document ID: 65a5686be6ecb0e20c00ac0e
Inserted document ID: 65a5686be6ecb0e20c00ac0f
******************************************************************************************************************************************************
Number of documents deleted: 1
******************************************************************************************************************************************************
Average age for all students: 21
******************************************************************************************************************************************************
[
 {
  "_id": "Astrophysics",
  "count": 19
 },
 {
  "_id": "German Language",
  "count": 1
 },
 {
  "_id": "Science",
  "count": 15
 },
 {
  "_id": "History",
  "count": 11
 },
 {
  "_id": "Programming",
  "count": 20
 },
 {
  "_id": "Math",
  "count": 6
 },
 {
  "_id": "Data Science",
  "count": 14
 },
 {
  "_id": "Literature",
  "count": 15
 }
]
************************************
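
The average and the per-subject counts produced by the `$group` pipelines can be cross-checked in plain Python on the same documents. The sketch below is one way to do that. The function name `summarize_students` is an assumption, and it expects a list of dicts with `age` and `subject` keys, for example the result of `list(collection.find({}, {"_id": 0}))`.

```python
from collections import Counter
from statistics import mean


def summarize_students(students):
    """Return (average_age, count_per_subject) computed in plain Python.

    Hypothetical helper mirroring the two $group pipelines above.
    """
    students = list(students)
    # Equivalent of {"$group": {"_id": None, "average_age": {"$avg": "$age"}}}
    average_age = mean(s["age"] for s in students)
    # Equivalent of {"$group": {"_id": "$subject", "count": {"$sum": 1}}}
    count_per_subject = Counter(s["subject"] for s in students)
    return average_age, dict(count_per_subject)


if __name__ == "__main__":
    sample = [
        {"name": "Alice", "age": 20, "subject": "Math"},
        {"name": "Bob", "age": 22, "subject": "Math"},
        {"name": "Ivy", "age": 24, "subject": "History"},
    ]
    average_age, count_per_subject = summarize_students(sample)
    print("Average age:", average_age)
    print("Students per subject:", count_per_subject)
```

Note that the original cell wraps the average in `int(...)`, which truncates the fractional part, so the comparison should truncate the Python value the same way.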