## Introduction to Databases
- A database is an organized collection of data that is stored and managed electronically. 
- Databases are designed to store large amounts of information in a structured way, making it easy to retrieve, update and delete information efficiently and securely.
- The data in a database is typically organized in tables composed of rows and columns. Each row represents a record, and each column represents a field in that record.
- Databases are used in various applications, from simple personal data storage to complex systems like banking, e-commerce, and enterprise resource planning (ERP). 
- They provide the backbone for managing the data needed by software applications to function effectively.

### Importance of Databases
- Efficient Data Management: Databases organize and store large amounts of data efficiently, making retrieval and updates quick and easy.
- Data Integrity: They ensure data is accurate and consistent through rules and transaction management.
- Security: Databases protect data with access controls and encryption, ensuring only authorized users can view or modify it.
- Data Retrieval: Using SQL and indexes, databases allow fast and precise data queries and searches.
- Backup and Recovery: Automated backups and recovery options protect data from loss or corruption.
- Concurrency Control: Databases manage simultaneous data access by multiple users without conflicts.
- Data Relationships: They enable linking related data across tables, simplifying complex queries.
- Reduced Data Redundancy: Databases minimize duplicate data and enforce consistency through normalization.
- Reporting and Analytics: They support data analysis and reporting, aiding in informed decision-making.
- Flexibility and Customization: Databases allow custom data structures and extend functionality with features like stored procedures.
- Collaboration and Sharing: Centralized databases enable easy data sharing and consistent information across teams and applications.
- Data Migration and Integration: They facilitate moving and integrating data between systems, ensuring smooth transitions.

### Types of Databases

#### Relational Databases:
- Data is stored in tables with rows and columns.
- Tables can be linked (related) based on common data, and this relationship makes it easy to retrieve and manage the data using Structured Query Language (SQL).

Examples:
- SQLite: A lightweight, serverless, and self-contained relational database engine. It’s embedded in many applications, including browsers and mobile apps, and is ideal for applications with a smaller data footprint. Python can use it out of the box through the `sqlite3` module in the standard library.
- MySQL: Widely used in web development for applications like WordPress.
- PostgreSQL: Known for its robustness and support for advanced features.
- Oracle Database: Used in large enterprises for handling vast amounts of data.

#### NoSQL Databases
- Used for more flexible or unstructured data (e.g., JSON-like documents).

Examples:

- MongoDB
- Firebase
- Cassandra

#### Object-Oriented Databases:

- These databases store data in the form of objects, just like how we define objects in Object-Oriented Programming (OOP) in Python.
- Each object includes both data and behavior (methods).
- This type of database works best when you're building software in an OOP language and want a seamless way to store complex data.

Use Case:
- Storing structured data like customers, orders, shapes, or multimedia files in object form.
- Games, simulations, or CAD systems where object properties and behaviors need to be stored together.

Examples:
- db4o: An object-oriented database for Java and .NET developers.
- ObjectDB: A high-performance object database for Java.

#### Graph Databases:
- These databases store data using nodes, edges, and properties — just like a graph in math.
    - Nodes = entities (like people)
    - Edges = relationships (like “follows” or “friends with”)
    - Properties = extra info (like age or city)
- They're perfect when you want to model complex relationships.

Use Case:
- Social networks (e.g., who is connected to whom)
- Fraud detection (tracking suspicious links)
- Recommendation systems (suggesting friends, products)

Examples:
- Neo4j – The most popular graph database, widely used for relationship-heavy data.
- Amazon Neptune – AWS-managed service for scalable graph applications.

#### In-Memory Databases:

- Instead of storing data on a hard disk, these databases keep everything in RAM (memory), making them extremely fast.
- Since memory is faster than disk, these databases are used where speed and real-time performance are critical.

Use Case:
- Real-time analytics
- High-frequency trading
- Gaming leaderboards or session data
- Caching frequently accessed data

Examples:
- SAP HANA – Used for enterprise-level real-time analytics.
- Memcached – A caching system to speed up dynamic websites by storing frequently used data in memory.

#### Cloud Databases:

- These are databases that run on cloud platforms like AWS, Google Cloud, or Azure. You don’t have to manage the hardware or software — it’s all handled for you.
- They offer scalability, automatic backups, and easy remote access.

Use Case:
- Web and mobile apps that need to scale quickly
- Teams working remotely or globally
- Startups that don’t want to manage their own database servers

Examples:
- Amazon RDS – A managed SQL database that supports MySQL, PostgreSQL, Oracle, and more.
- Google Cloud Firestore – A NoSQL document database for building serverless mobile/web apps.

### Real-World Examples 
- Social Media Platforms: Databases are used to store user profiles, posts, comments, likes, and more. For instance, Facebook uses databases to manage billions of user records.
- E-commerce Websites: Online stores like Amazon use databases to manage product inventories, customer orders, payment information, and shipment tracking.
- Banking Systems: Banks use databases to track customer accounts, transactions, loans, and financial histories securely and accurately.
- Healthcare Systems: Hospitals and clinics use databases to store patient records, treatment histories, medication prescriptions, and insurance information.

### Core Concepts in Relational Databases
| Term            | Meaning                                              |
| --------------- | ---------------------------------------------------- |
| **Table**       | A set of rows and columns (like Excel sheets)        |
| **Row**         | A single record (e.g., one student)                  |
| **Column**      | A field of data (e.g., name, age, grade)             |
| **Primary Key** | A unique identifier for each row                     |
| **Foreign Key** | A link between tables (e.g., student to course)      |
| **Query**       | A command to get or change data (`SELECT`, `INSERT`) |

### Structured Query Language (SQL) 
- Structured Query Language (SQL) is the standard language for interacting with relational databases. 
- SQL allows you to perform various operations, including creating and managing tables, inserting and querying data, and even automating tasks with stored procedures and triggers.

### Common SQL Commands
| SQL Command    | What It Does                   |
| -------------- | ------------------------------ |
| `CREATE TABLE` | Create a new table             |
| `INSERT INTO`  | Add data                       |
| `SELECT`       | Retrieve data                  |
| `UPDATE`       | Modify existing data           |
| `DELETE`       | Remove data                    |
| `WHERE`        | Clause that filters rows by a condition (used with `SELECT`, `UPDATE`, `DELETE`) |
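
The commands in the table can be tried end to end against a throwaway in-memory database. The sketch below is only an illustration: the `pets` table and its columns are made up for this example, and `":memory:"` keeps the database in RAM so no file is created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database, discarded on close
cur = conn.cursor()

# CREATE TABLE: define the structure (hypothetical example table)
cur.execute("CREATE TABLE pets (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# INSERT INTO: add rows
cur.executemany(
    "INSERT INTO pets (name, age) VALUES (?, ?)",
    [("Rex", 3), ("Milo", 5), ("Luna", 2)],
)

# UPDATE ... WHERE: change only the rows that match the condition
cur.execute("UPDATE pets SET age = age + 1 WHERE name = ?", ("Rex",))

# DELETE ... WHERE: remove only the rows that match the condition
cur.execute("DELETE FROM pets WHERE age > ?", (4,))

# SELECT: read back what is left
cur.execute("SELECT name, age FROM pets ORDER BY name")
remaining = cur.fetchall()
print(remaining)

conn.close()
```

Without the `WHERE` clause, the `UPDATE` and `DELETE` statements would apply to every row in the table.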

### Using SQLite in Python

#### 1. Import the Library

In [1]:
import sqlite3

#### 2. Create a New Database and Connect to It
This creates a file called `school.db`:

In [None]:
# Create or connect to the database
conn = sqlite3.connect("school.db")
conn.execute("PRAGMA foreign_keys = ON")  # Enforces foreign key constraints
cursor = conn.cursor()

`conn = sqlite3.connect("school.db")`
- `sqlite3.connect()` is a function that connects Python to an SQLite database file.
- `"school.db"` is the name of the database file you’re connecting to.
    - If this file already exists, Python connects to it.
    - If it doesn't exist, Python creates it automatically in the current directory.
- `conn` is the connection object, which keeps the link between Python and the database open.

`cursor = conn.cursor()`
- A cursor is like a “control tool” that lets you execute SQL commands (like SELECT, INSERT, UPDATE, DELETE) on the database.
- You need the cursor object to run any SQL queries.
- Think of `conn` as the door to the database, and `cursor` as the pen that writes commands.

#### 3. Create the students Table

In [30]:
# cursor.execute sends an SQL command to the database through the cursor
cursor.execute("""     
CREATE TABLE IF NOT EXISTS students (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    age INTEGER,
    grade TEXT,
    email TEXT UNIQUE
)
""")
conn.commit()  # Saves any changes made to the database permanently.
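
As an alternative to calling `conn.commit()` by hand, a `sqlite3` connection can be used as a context manager: the block is committed if it finishes normally and rolled back if it raises. The sketch below assumes a small made-up table in an in-memory database; note that the `with` block manages the transaction only and does not close the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")

# Successful block: committed automatically when the block ends.
with conn:
    conn.execute("INSERT INTO students (email) VALUES (?)", ("a@example.com",))

# Failing block: the duplicate email raises IntegrityError, so the whole
# block (including the first insert inside it) is rolled back.
try:
    with conn:
        conn.execute("INSERT INTO students (email) VALUES (?)", ("b@example.com",))
        conn.execute("INSERT INTO students (email) VALUES (?)", ("a@example.com",))
except sqlite3.IntegrityError as exc:
    print("Rolled back:", exc)

emails = [row[0] for row in conn.execute("SELECT email FROM students ORDER BY id")]
print(emails)
conn.close()
```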


#### 3.1 Add a new column
You can use ALTER TABLE to add a column:

In [None]:
cursor.execute("ALTER TABLE students ADD COLUMN phone TEXT")
conn.commit()

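`Note:` SQLite only allows limited structural changes via ALTER TABLE. You can add columns and rename the table; renaming a column requires SQLite 3.25.0 or later, and dropping a column requires 3.35.0 or later. Changing a column's type is not supported directly.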
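
Which `ALTER TABLE` forms work depends on the version of the SQLite library that Python is linked against: `RENAME COLUMN` arrived in SQLite 3.25.0 and `DROP COLUMN` in 3.35.0. A minimal sketch for checking this on your own machine (the variable names are just for illustration):

```python
import sqlite3

# Version of the underlying SQLite library (not of the sqlite3 Python module).
print("SQLite library version:", sqlite3.sqlite_version)

version = sqlite3.sqlite_version_info  # a tuple of integers, e.g. (3, 45, 1)
can_rename_column = version >= (3, 25, 0)  # ALTER TABLE ... RENAME COLUMN
can_drop_column = version >= (3, 35, 0)    # ALTER TABLE ... DROP COLUMN

print("RENAME COLUMN supported:", can_rename_column)
print("DROP COLUMN supported:", can_drop_column)
```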

#### 3.2 Rename a column (SQLite ≥ 3.25.0)

In [None]:
cursor.execute("ALTER TABLE students RENAME COLUMN grade TO class_level")
conn.commit()

#### 3.3 Rename the table

In [None]:
cursor.execute("ALTER TABLE students RENAME TO learners")
conn.commit()

#### 3.4 Delete a column or change a column type
SQLite cannot change a column's declared type with ALTER TABLE, and versions before 3.35.0 cannot drop a column directly either. In those cases you rebuild the table.

Step-by-step workaround:

Rename the old table:

In [None]:
cursor.execute("ALTER TABLE students RENAME TO students_old")

Create a new table with the updated structure:

In [None]:
cursor.execute("""
CREATE TABLE students (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    age INTEGER,
    email TEXT UNIQUE,
    phone TEXT
)
""")

Copy the data (only include columns that exist in both tables):

In [None]:
cursor.execute("""
INSERT INTO students (id, first_name, last_name, age, email)
SELECT id, first_name, last_name, age, email FROM students_old
""")

Drop the old table:

In [None]:
cursor.execute("DROP TABLE students_old")

Commit the changes:

In [None]:
conn.commit()
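
The same rebuild can be wrapped in a single transaction so that a failure halfway through does not leave a stray `students_old` table behind. The helper below is a sketch under simplified assumptions: `drop_grade_by_rebuild` is a made-up name, the table has fewer columns than the tutorial's, and the explicit `BEGIN` is there because Python's default transaction handling does not open a transaction for `ALTER TABLE` or `CREATE TABLE` on its own.

```python
import sqlite3

def drop_grade_by_rebuild(conn):
    """Rebuild `students` without its `grade` column (illustrative helper)."""
    with conn:  # commit if every step succeeds, roll back otherwise
        conn.execute("BEGIN")  # open the transaction before the schema changes
        conn.execute("ALTER TABLE students RENAME TO students_old")
        conn.execute("""
            CREATE TABLE students (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                first_name TEXT NOT NULL,
                age INTEGER
            )
        """)
        conn.execute("""
            INSERT INTO students (id, first_name, age)
            SELECT id, first_name, age FROM students_old
        """)
        conn.execute("DROP TABLE students_old")

# Small in-memory demonstration with one row.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE students (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        first_name TEXT NOT NULL,
        age INTEGER,
        grade TEXT
    )
""")
conn.execute("INSERT INTO students (first_name, age, grade) VALUES ('Alice', 16, '10th')")
conn.commit()

drop_grade_by_rebuild(conn)

columns = [row[1] for row in conn.execute("PRAGMA table_info(students)")]
rows = conn.execute("SELECT * FROM students").fetchall()
print(columns, rows)
conn.close()
```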

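#### 4. Insert a Record into students
This cell and the ones after it assume the original `students` table from step 3 (table name `students`, with a `grade` column), not the renamed or rebuilt versions from sections 3.2 to 3.4.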

In [31]:
cursor.execute("""
INSERT INTO students (first_name, last_name, age, grade, email)
VALUES (?, ?, ?, ?, ?)
""", ("Cate", "Michael", 17, "10th", "john.doe@gmail.com"))
conn.commit()


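IntegrityError: UNIQUE constraint failed: students.email

The error above appears when the cell is run a second time: `email` is declared `UNIQUE`, so SQLite rejects a second row with the same address.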
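
A duplicate value in a `UNIQUE` column raises `sqlite3.IntegrityError`, which can be caught instead of letting the cell fail. The sketch below is one possible way to do that; `add_student` is a hypothetical helper and the table is a trimmed-down version of `students` in an in-memory database.

```python
import sqlite3

def add_student(conn, first_name, last_name, email):
    """Insert a student; return the new row id, or None if the email is taken."""
    try:
        cur = conn.execute(
            "INSERT INTO students (first_name, last_name, email) VALUES (?, ?, ?)",
            (first_name, last_name, email),
        )
    except sqlite3.IntegrityError:
        conn.rollback()  # discard the failed statement
        return None
    conn.commit()
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE students (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        first_name TEXT NOT NULL,
        last_name TEXT NOT NULL,
        email TEXT UNIQUE
    )
""")

first = add_student(conn, "Cate", "Michael", "cate@example.com")
second = add_student(conn, "Cate", "Michael", "cate@example.com")  # duplicate email
print(first, second)
conn.close()
```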

#### 5. Select All Records (Read)

In [32]:
cursor.execute("SELECT * FROM students")
students = cursor.fetchall()

for student in students:
    print(student)

(2, 'Alice', 'Smith', 16, '10th', 'alice.smith@example.com')
(3, 'John', 'Doe', 15, '10th', 'john.doe@example.com')
(4, 'Cate', 'Michael', 17, '10th', 'john.doe@gmail.com')
(5, 'Melody', 'Bonareri', 16, '10th', 'melody@gmail.com')
(6, 'Joy', 'Kilaha', 15, '10th', 'joy@gmail.com')
(7, 'Jude', 'Wandera', 17, '11th', 'jude.@gmail.com')
(8, 'Alice', 'Odhiambo', 15, '9th', 'alice.odhiambo@gmail.com')
(9, 'Brian', 'Mutua', 16, '10th', 'brian.mutua@gmail.com')
(10, 'Cynthia', 'Mwende', 17, '11th', 'cynthia.mwende@gmail.com')
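
`SELECT *` returns every row as a plain tuple. Two common refinements are filtering with a parameterized `WHERE` clause and reading columns by name through `sqlite3.Row`. The following sketch uses a small made-up data set in an in-memory database rather than the `school.db` file:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can now be read by column name
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, first_name TEXT, age INTEGER, grade TEXT)"
)
conn.executemany(
    "INSERT INTO students (first_name, age, grade) VALUES (?, ?, ?)",
    [("Alice", 16, "10th"), ("Jude", 17, "11th"), ("Joy", 15, "10th")],
)

# The ? placeholder keeps the value separate from the SQL text.
rows = conn.execute(
    "SELECT first_name, age FROM students WHERE grade = ? ORDER BY age",
    ("10th",),
).fetchall()

names_and_ages = [(row["first_name"], row["age"]) for row in rows]
print(names_and_ages)
conn.close()
```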


#### 6. Update a Student’s Age
Update the student with `id = 2`:

In [33]:
cursor.execute("""
UPDATE students
SET age = 18
WHERE id = 2
""")
conn.commit()

In [34]:
cursor.execute("SELECT * FROM students")
students = cursor.fetchall()
students

[(2, 'Alice', 'Smith', 18, '10th', 'alice.smith@example.com'),
 (3, 'John', 'Doe', 15, '10th', 'john.doe@example.com'),
 (4, 'Cate', 'Michael', 17, '10th', 'john.doe@gmail.com'),
 (5, 'Melody', 'Bonareri', 16, '10th', 'melody@gmail.com'),
 (6, 'Joy', 'Kilaha', 15, '10th', 'joy@gmail.com'),
 (7, 'Jude', 'Wandera', 17, '11th', 'jude.@gmail.com'),
 (8, 'Alice', 'Odhiambo', 15, '9th', 'alice.odhiambo@gmail.com'),
 (9, 'Brian', 'Mutua', 16, '10th', 'brian.mutua@gmail.com'),
 (10, 'Cynthia', 'Mwende', 17, '11th', 'cynthia.mwende@gmail.com')]
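
An `UPDATE` whose `WHERE` clause matches no rows is not an error, so it can fail silently. One way to notice is the cursor's `rowcount` attribute, shown here on a tiny in-memory table invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, age INTEGER)")
conn.execute("INSERT INTO students (id, age) VALUES (2, 16)")

hit = conn.execute("UPDATE students SET age = 18 WHERE id = ?", (2,))
miss = conn.execute("UPDATE students SET age = 18 WHERE id = ?", (1,))
conn.commit()

updated = hit.rowcount   # rows changed by the first UPDATE
missed = miss.rowcount   # no row has id = 1, so nothing changed
print(updated, missed)
conn.close()
```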

#### 7. Delete a Student Record
Delete the student with `id = 10`:

In [35]:
cursor.execute("DELETE FROM students WHERE id = 10")
conn.commit()

In [36]:
cursor.execute("SELECT * FROM students")
students = cursor.fetchall()
students

[(2, 'Alice', 'Smith', 18, '10th', 'alice.smith@example.com'),
 (3, 'John', 'Doe', 15, '10th', 'john.doe@example.com'),
 (4, 'Cate', 'Michael', 17, '10th', 'john.doe@gmail.com'),
 (5, 'Melody', 'Bonareri', 16, '10th', 'melody@gmail.com'),
 (6, 'Joy', 'Kilaha', 15, '10th', 'joy@gmail.com'),
 (7, 'Jude', 'Wandera', 17, '11th', 'jude.@gmail.com'),
 (8, 'Alice', 'Odhiambo', 15, '9th', 'alice.odhiambo@gmail.com'),
 (9, 'Brian', 'Mutua', 16, '10th', 'brian.mutua@gmail.com')]

#### 8. Create a Second Table (courses) and Add a Foreign Key

- A `foreign key` is a column or a set of columns in one table that refers to the primary key in another table. 
- The table containing the foreign key is often referred to as the child table, while the table being referenced is called the parent table.
- Foreign keys are used to create a relationship between two tables, ensuring that the data in the child table corresponds to valid entries in the parent table.

In [None]:
cursor.execute("""
CREATE TABLE IF NOT EXISTS courses (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    student_id INTEGER,
    course_name TEXT,
    FOREIGN KEY(student_id) REFERENCES students(id) 
)
""")
conn.commit()
# This will allow us to link each course to a specific student.

The foreign key constraint is defined by `FOREIGN KEY (student_id) REFERENCES students(id)`, which ensures that every student_id in the courses table must correspond to a valid id in the students table.

### Why Use Foreign Keys?
- Referential Integrity: Foreign keys ensure that relationships between tables remain consistent. For example, you cannot add a record to the courses table with a student_id that does not exist in the students table.
- Cascading Actions: Foreign keys can be used with cascading actions to automate updates or deletions. For example, if a student is deleted from the students table, all their associated courses can also be deleted automatically.

### Enforcing Referential Integrity
Let’s say you try to insert a course with a student_id that doesn’t exist in the students table:

In [19]:
cursor.execute("""
               INSERT INTO courses (student_id, course_name)
               VALUES (?,?)""",(99, 'Physics'))
conn.commit()

IntegrityError: FOREIGN KEY constraint failed

### Enabling Foreign Keys in SQLite

By default, SQLite does not enforce foreign key constraints. Enforcement has to be switched on for each new connection, which is why the connection cell in step 2 runs:

`PRAGMA foreign_keys = ON;`

`PRAGMA` is a special command in SQLite that lets you query or change settings of your database engine.
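
A `PRAGMA` can be read as well as set. The sketch below, using an in-memory database with two minimal made-up tables, reads the setting before and after switching it on and then confirms that an invalid `student_id` is rejected:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Read the current setting: 0 means off, 1 means on.
before = conn.execute("PRAGMA foreign_keys").fetchone()[0]

conn.execute("PRAGMA foreign_keys = ON")
after = conn.execute("PRAGMA foreign_keys").fetchone()[0]
print(before, after)

conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE courses (
        id INTEGER PRIMARY KEY,
        student_id INTEGER REFERENCES students(id)
    )
""")

# There is no student with id 99, so this insert should be refused.
try:
    conn.execute("INSERT INTO courses (student_id) VALUES (99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
conn.close()
```

The setting applies to the connection, not to the database file, so it needs to be repeated every time you reconnect.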

### Using Foreign Keys with Cascading Actions
SQLite supports several actions that run automatically on the child table when a referenced row in the parent table is deleted or updated:

- `ON DELETE CASCADE`: Automatically deletes the related rows in the child table when a row in the parent table is deleted.
- `ON UPDATE CASCADE`: Automatically updates the foreign key values in the child table when the referenced key in the parent table is changed.

Example: Adding ON DELETE CASCADE

Suppose we want to ensure that when a student is deleted, all their associated courses are also deleted:

```sql
CREATE TABLE courses (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    student_id INTEGER,
    course_name TEXT NOT NULL,
    FOREIGN KEY (student_id) REFERENCES students(id) ON DELETE CASCADE
);
```
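
To see the cascade in action without touching `school.db`, the behavior can be reproduced in an in-memory database. This is a minimal sketch with a simplified `students` table; it uses `lastrowid` rather than assuming the new student gets `id = 1`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # cascades only fire when enforcement is on

conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY AUTOINCREMENT, first_name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE courses (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        student_id INTEGER,
        course_name TEXT NOT NULL,
        FOREIGN KEY (student_id) REFERENCES students(id) ON DELETE CASCADE
    )
""")

cur = conn.execute("INSERT INTO students (first_name) VALUES (?)", ("Alice",))
alice_id = cur.lastrowid  # use the generated id instead of guessing it
conn.execute(
    "INSERT INTO courses (student_id, course_name) VALUES (?, ?)",
    (alice_id, "Mathematics"),
)
conn.commit()

# Deleting the parent row removes the matching course rows as well.
conn.execute("DELETE FROM students WHERE id = ?", (alice_id,))
conn.commit()

remaining_courses = conn.execute("SELECT * FROM courses").fetchall()
print(remaining_courses)
conn.close()
```

Keep in mind that `CREATE TABLE IF NOT EXISTS` leaves an existing table untouched, so a `courses` table created earlier without the cascade rule has to be dropped or rebuilt before the rule takes effect.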

#### 9. Inserting Data with Foreign Keys
Now, let’s insert some data into both tables.

In [10]:
cursor.execute("INSERT INTO students (first_name, last_name, age, grade, email) VALUES (?, ?, ?, ?, ?)", 
               ("Alice", "Smith", 16, "10th", "alice.smith@example.com"))

cursor.execute("INSERT INTO courses (student_id, course_name) VALUES (?, ?)",
               (1, "Mathematics"))
conn.commit()

Perform a JOIN:

In [None]:
cursor.execute("""
SELECT students.first_name, students.last_name, courses.course_name
FROM students
JOIN courses ON students.id = courses.student_id
""")

for row in cursor.fetchall():
    print(row)

### Inserting Multiple Students

In [23]:
students_data = [
    ("Alice", "Odhiambo", 15, "9th", "alice.odhiambo@gmail.com"),
    ("Brian", "Mutua", 16, "10th", "brian.mutua@gmail.com"),
    ("Cynthia", "Mwende", 17, "11th", "cynthia.mwende@gmail.com"),
]

cursor.executemany("""
    INSERT INTO students (first_name, last_name, age, grade, email)
    VALUES (?, ?, ?, ?, ?)
""", students_data)

conn.commit()

In [26]:
cursor.execute("""SELECT * FROM students""")
students = cursor.fetchall()

for student in students:
    print(student)

(2, 'Alice', 'Smith', 16, '10th', 'alice.smith@example.com')
(3, 'John', 'Doe', 15, '10th', 'john.doe@example.com')
(4, 'Cate', 'Michael', 17, '10th', 'john.doe@gmail.com')
(5, 'Melody', 'Bonareri', 16, '10th', 'melody@gmail.com')
(6, 'Joy', 'Kilaha', 15, '10th', 'joy@gmail.com')
(7, 'Jude', 'Wandera', 17, '11th', 'jude.@gmail.com')
(8, 'Alice', 'Odhiambo', 15, '9th', 'alice.odhiambo@gmail.com')
(9, 'Brian', 'Mutua', 16, '10th', 'brian.mutua@gmail.com')
(10, 'Cynthia', 'Mwende', 17, '11th', 'cynthia.mwende@gmail.com')


### Deleting a Record with ON DELETE CASCADE
Now, if you delete Alice from the students table:

In [27]:
cursor.execute("""DELETE FROM students WHERE id = 1""")
conn.commit()

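With `ON DELETE CASCADE` on the `courses` table and foreign key enforcement switched on, every course row that references the deleted student is removed automatically. The output recorded below still lists both course rows because this run did not meet those conditions: the `courses` table from step 8 was created without `ON DELETE CASCADE`, and the student listing above contains no row with `id = 1`, so the `DELETE` matched nothing.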

In [28]:
cursor.execute("""SELECT * FROM courses""")
courses = cursor.fetchall()

for course in courses:
    print(course)

(1, 1, 'Mathematics')
(2, 99, 'Physics')


#### 10. Close the Database Connection

In [12]:
conn.close()
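
If you would rather not remember to call `conn.close()`, the standard library's `contextlib.closing` can do it when a block ends, even if an error occurs inside it. A small sketch with a made-up `notes` table:

```python
import sqlite3
from contextlib import closing

# closing() calls conn.close() when the block ends.
with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE notes (body TEXT)")
    conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]

print(count)

# Using the connection after it has been closed raises ProgrammingError.
try:
    conn.execute("SELECT 1")
    closed = False
except sqlite3.ProgrammingError:
    closed = True
print(closed)
```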

### Overview of NoSQL Databases
- NoSQL databases are non-relational databases designed to handle large volumes of unstructured, semi-structured, or structured data.

### Properties of NoSQL Databases
- `Schema Flexibility`: You don’t need to define the structure of your data before storing it. This means you can add new fields anytime without breaking your database.
    - Perfect for situations where the structure of your data keeps changing — like adding new features to a web app or working with data from different sources
- `Scalability`: NoSQL databases are built to scale horizontally, meaning they can handle more traffic by adding more machines (servers), not just upgrading one big server.
    - Ideal for systems like social media, e-commerce, or real-time apps that grow fast and need to support lots of users and data.
- `High Performance`: Designed to handle large volumes of data with low latency, NoSQL databases are optimized for performance, particularly in big data and real-time web applications.
- `Distributed Architecture`: NoSQL databases store and manage data across multiple machines by default. This makes them more fault-tolerant.
    - If one server goes down, the others keep the system running. This makes NoSQL a great choice for high-availability systems like cloud apps or global platforms.



### Types of NoSQL Databases
- Document Stores: Store data as JSON-like documents.
    - Example: MongoDB
- Key-Value Stores: Data is stored as key-value pairs.
    - Example: Redis, Amazon DynamoDB
- Wide-Column Stores: Data is stored in tables, rows, and columns, but with flexible columns.
    - Example: Apache Cassandra, HBase
- Graph Databases: Focus on relationships between entities, represented as nodes and edges.
    - Example: Neo4j, Amazon Neptune

### Applications of NoSQL Databases
1. Big Data: Handling vast amounts of data across distributed systems (e.g., Hadoop and Cassandra).
2. Real-Time Web Applications: Storing and retrieving data quickly for dynamic, real-time applications like social media, e-commerce, and online gaming (e.g., MongoDB, Redis).
3. Content Management Systems (CMS): Flexible storage for various types of content without predefined schema requirements (e.g., MongoDB).
4. Internet of Things (IoT): Efficiently managing streams of data from IoT devices (e.g., Cassandra, InfluxDB).

### Real-Life Examples
1. MongoDB:
- Application: Used by companies like Uber and eBay for flexible, scalable data storage, handling dynamic user data and real-time analytics.
2. Cassandra:
- Application: Originally developed at Facebook to power inbox search, handling massive amounts of data with high availability and fault tolerance.
3. Neo4j:
- Application: Suited to managing and analyzing social graphs, such as professional networks, where complex relationships between users need to be tracked.

# Introduction to MongoDB

**MongoDB** is a popular **NoSQL database** known for being:
- **Scalable** – easily handles growing data.
- **Flexible** – stores data without needing a fixed structure.
- **Easy to use** – works well with modern applications.

It stores data in **JSON-like documents**. 

Unlike SQL databases (which use tables and rows), MongoDB uses collections and documents.

---

## Key Features of MongoDB

- **Document-Oriented:**  
  Stores data in **BSON** (Binary JSON) format, which supports complex, nested data.

- **Schema-Less:**  
  Collections don’t require a fixed structure. Each document can look different — perfect for unstructured or semi-structured data.

- **Horizontal Scalability (Sharding):**  
  Can **spread data across servers**, helping your app scale as needed.

- **High Availability (Replication):**  
  Uses **replica sets** to copy data across multiple servers. If one fails, another takes over.

- **Indexing:**  
  Supports various indexes (e.g. text, geospatial, compound) to **speed up queries**.

- **Aggregation Framework:**  
  Built-in tools for **filtering, grouping, sorting**, and more—right inside the database.

- **File Storage (GridFS):**  
  Allows storing large files (e.g. images, videos) directly in MongoDB.

---

## MongoDB Architecture Overview

- **Document:**  
  - A record in MongoDB. 
  - Stored in BSON (JSON-like) format.
```json
{
  "firstName": "Alice",
  "age": 25,
  "skills": ["Python", "MongoDB"]
}
```

- **Collection:**  
  - Group of documents (like a table in SQL). 
  - Documents inside a collection can have different structures.

- **Database:**  
  - A container that holds multiple collections.

---

### MongoDB Terminology (vs SQL)
| SQL Term    | MongoDB Term                  |
| ----------- | ----------------------------- |
| Table       | Collection                    |
| Row         | Document                      |
| Column      | Field                         |
| Primary Key | `_id` field                   |
| JOIN        | Manual reference or embedding |


### How to Create an Account and Set Up MongoDB Atlas
#### 1. Create a MongoDB Atlas Account:
- Go to the MongoDB Atlas website.
- Click on “Start Free” and sign up using your email or Google account.
#### 2. Create a New Project:
- After logging in, create a new project (e.g., “School Database Project”).
#### 3. Create a Cluster:
- Select a cloud provider (AWS, GCP, Azure) and a region.
- Choose the free tier option and create the cluster.
#### 4. Create a Database User:
- Go to “Database Access” and create a new user with a username and password.
#### 5. Set Up Network Access:
- In “Network Access”, add your IP address to allow connections from your machine.
#### 6. Get Your Connection String:
- Go to “Clusters”, click on “Connect”, and choose “Connect your application”. Copy the provided connection string.

### Connecting MongoDB Atlas Using Python (pymongo)
#### 1. Install pymongo:

Open your terminal and install the pymongo package:

```bash
pip install pymongo
```

#### 2. Connect to MongoDB Atlas:

In [2]:
from pymongo import MongoClient

In [5]:
# Replace <username>, <password>, and <cluster-url> with your actual connection string details.
# Never leave real credentials in a notebook that you share or commit.
client = MongoClient("mongodb+srv://<username>:<password>@<cluster-url>/?retryWrites=true&w=majority&appName=Cluster0")

- MongoClient() connects your Python app to your MongoDB Atlas cluster.
- Replace `<username>`, `<password>`, and `<cluster-url>` with the values from your MongoDB connection string.
- This gives you access to the database from Python.
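
One way to keep credentials out of the notebook is to read them from environment variables and assemble the connection string at run time. The sketch below is an assumption-laden example: the variable names `MONGODB_USERNAME`, `MONGODB_PASSWORD`, and `MONGODB_CLUSTER_URL` and the helper `build_atlas_uri` are invented here, and the fallback values are dummies. `quote_plus` percent-escapes characters such as `@`, `:` or `/` that would otherwise break the URI.

```python
import os
from urllib.parse import quote_plus

def build_atlas_uri(username, password, cluster_url):
    """Assemble a mongodb+srv URI with the username and password escaped."""
    return (
        f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}"
        f"@{cluster_url}/?retryWrites=true&w=majority"
    )

# Hypothetical variable names; the defaults are placeholders, not real credentials.
username = os.environ.get("MONGODB_USERNAME", "demo_user")
password = os.environ.get("MONGODB_PASSWORD", "p@ss/word")
cluster_url = os.environ.get("MONGODB_CLUSTER_URL", "cluster0.example.mongodb.net")

uri = build_atlas_uri(username, password, cluster_url)
# client = MongoClient(uri)  # after `from pymongo import MongoClient`
```

The resulting `uri` is deliberately not printed, since it contains the password.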

In [27]:
# Connect to the 'school' database
db = client.school    # selects or creates the "school" database

- `client.school` selects a database called `school`; MongoDB creates it when the first document is written to it.
- The next cell accesses a collection (like a table) called `students` inside that database.

In [39]:
# Access the 'students' collection
student_collection = db.students

### CRUD Operations with MongoDB Atlas
#### Create a Collection and Insert a Document
Insert a Single Document:


In [40]:
student_doc = {
    "name": "John",
    "age": 18,
    "grade": "12th"
}

student_collection.insert_one(student_doc)

InsertOneResult(ObjectId('685ae67cce8d866fc75b53d8'), acknowledged=True)

Insert Multiple Documents:

In [41]:
student_documents = [
    {"name": "Alice", "age": 14, "grade": "9th"},
    {"name": "Bob", "age": 16, "grade": "11th"}
]
result = student_collection.insert_many(student_documents)
print(f"Inserted documents with IDs: {result.inserted_ids}")

Inserted documents with IDs: [ObjectId('685ae689ce8d866fc75b53d9'), ObjectId('685ae689ce8d866fc75b53da')]


### Read Documents (Query Data)
Find a Single Document:

In [43]:
student = student_collection.find_one({"name": "John"})
print(f"Found student: {student}")

Found student: {'_id': ObjectId('685ae55fce8d866fc75b53d5'), 'name': 'John', 'age': 18, 'grade': '12th'}


Find Multiple Documents:

In [45]:
students = student_collection.find({"grade": "9th"})
for student in students:
    print(student)

{'_id': ObjectId('685ae689ce8d866fc75b53d9'), 'name': 'Alice', 'age': 14, 'grade': '9th'}


Query Specific Fields:

In [47]:
students = student_collection.find({}, {"name": 1, "grade": 1, "_id": 0})  # Only return name and grade, exclude _id
for student in students:
    print(student)

{'name': 'John', 'grade': '12th'}
{'name': 'Alice', 'grade': '9th'}
{'name': 'Bob', 'grade': '11th'}


- `{}` = the filter part.
    - An empty dictionary means "match all documents" (no filtering).

- `{"name": 1, "grade": 1, "_id": 0}` = the projection part.
    - This tells MongoDB to include only the `name` and `grade` fields in the output.

- By default, MongoDB will also include the _id field unless you explicitly exclude it with "_id": 0.
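
Filters can do more than match exact values. The sketch below combines a comparison operator (`$gt`, "greater than") with a projection. `students_older_than` is a hypothetical helper written for this example; it accepts any pymongo collection, such as `student_collection` from the cells above.

```python
def students_older_than(collection, min_age):
    """Return the name and age of students older than `min_age`, without `_id`."""
    query_filter = {"age": {"$gt": min_age}}       # $gt means "greater than"
    projection = {"name": 1, "age": 1, "_id": 0}   # fields to include or exclude
    return list(collection.find(query_filter, projection))
```

For example, `students_older_than(student_collection, 15)` would return the matching documents as a list of dictionaries.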

### Update Data
- Update a Single Document:

The `$set` operator is used in an update operation to modify the value of a specific field in a document.

In [49]:
student_collection.update_one(
    {"name": "John"},
    {"$set": {"grade": "11th"}}
)
updated_student = student_collection.find_one({"name": "John"})
print(f"Updated student: {updated_student}")

Updated student: {'_id': ObjectId('685ae67cce8d866fc75b53d8'), 'name': 'John', 'age': 18, 'grade': '11th'}


- Update Multiple Documents:

In [54]:
student_collection.update_many(
    {"grade": "12th"},
    {"$set": {"grade": "11th"}}
)
students = student_collection.find({"grade": "11th"})
for student in students:
    print(student)

{'_id': ObjectId('685ae67cce8d866fc75b53d8'), 'name': 'John', 'age': 18, 'grade': '11th'}
{'_id': ObjectId('685ae689ce8d866fc75b53d9'), 'name': 'Alice', 'age': 14, 'grade': '11th'}
{'_id': ObjectId('685ae689ce8d866fc75b53da'), 'name': 'Bob', 'age': 16, 'grade': '11th'}


### Delete (Remove Data)
- Delete a Single Document:

In [56]:
student_collection.delete_one({"name": "John"})

`delete_one` returns a `DeleteResult` whose `deleted_count` attribute tells you how many documents were removed. A count of 0 means the filter matched nothing, for example because the field name in the filter does not match the field name stored in the documents.

- Delete Multiple Documents:

In [None]:
student_collection.delete_many({"grade": "9th"})
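
Because a delete that matches nothing does not raise an error, it is worth checking the result. The helper below is a sketch, not part of pymongo: `delete_student_by_name` is a made-up name, and it relies only on `delete_one` and the `deleted_count` attribute of the returned `DeleteResult`.

```python
def delete_student_by_name(collection, name):
    """Delete at most one student with this name; return how many were removed."""
    result = collection.delete_one({"name": name})
    if result.deleted_count == 0:
        print(f"No student named {name!r} was found; check the field name and value.")
    return result.deleted_count
```

It could be called as `delete_student_by_name(student_collection, "John")`.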

### Connect to the courses Collection

In [15]:
courses_collection = db.courses # selects or creates the "courses" collection

### Insert One or More Course Documents
- Insert One Course:

In [None]:
course_document = {
    "courseCode": "MATH101",
    "courseName": "Basic Mathematics",
    "teacher": "Grace Mwangi",
    "durationWeeks": 12,
    "credits": 3
}
course_result = courses_collection.insert_one(course_document)
print(f"Inserted course document with ID: {course_result.inserted_id}")
course_id = course_result.inserted_id

Inserted course document with ID: 685ad9b0ce8d866fc75b53cc


- Insert Multiple Courses:

In [17]:
course_documents = [
    {
        "courseCode": "ENG201",
        "courseName": "English Literature",
        "teacher": "John Okello",
        "durationWeeks": 10,
        "credits": 2
    },
    {
        "courseCode": "SCI301",
        "courseName": "Introduction to Physics",
        "teacher": "Mary Wanjiku",
        "durationWeeks": 14,
        "credits": 4
    }
]

courses_collection.insert_many(course_documents)
print("Multiple courses added to the 'courses' collection.")

Multiple courses added to the 'courses' collection.


### Verify the Insertion
You can list all courses like this:

In [20]:
for course in courses_collection.find({},{'courseCode': 1, 'courseName': 1, '_id': 0}):
    print(course)

{'courseCode': 'MATH101', 'courseName': 'Basic Mathematics'}
{'courseCode': 'ENG201', 'courseName': 'English Literature'}
{'courseCode': 'SCI301', 'courseName': 'Introduction to Physics'}


### Relationships in MongoDB
MongoDB doesn't support foreign keys, but you can relate data in two ways:

1. Manual Reference
```json
{
  "name": "John",
  "courseId": ObjectId("...")
}
```

2. Embedded Document
```json
{
  "name": "John",
  "course": {
    "name": "Math",
    "teacher": "Mr. Smith"
  }
}
```
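
The difference between the two approaches shows up when you read the data back. The sketch below uses plain Python dictionaries as stand-ins for MongoDB documents (no database is involved, and the `"c1"` identifier is a made-up placeholder for an `ObjectId`):

```python
# A plain dictionary standing in for the courses collection, keyed by _id.
courses = {
    "c1": {"_id": "c1", "name": "Math", "teacher": "Mr. Smith"},
}

# Manual reference: the student stores only the course's _id.
student_ref = {"name": "John", "courseId": "c1"}

# Embedded document: the course data lives inside the student.
student_embedded = {
    "name": "John",
    "course": {"name": "Math", "teacher": "Mr. Smith"},
}

# With a reference, reading the teacher takes a second lookup...
teacher_via_ref = courses[student_ref["courseId"]]["teacher"]
# ...while an embedded document already carries the data.
teacher_via_embed = student_embedded["course"]["teacher"]

print(teacher_via_ref, teacher_via_embed)
```

Embedding makes reads simpler but copies the course details into every student that takes the course; a reference keeps a single copy at the cost of an extra query.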

### Library Database

In [21]:
db = client.library

### Create the authors Collection and Insert Documents

In [22]:
authors_collection = db.authors

author_1 = {
    "name": "Chinua Achebe",
    "nationality": "Nigerian"
}

author_2 = {
    "name": "Ngũgĩ wa Thiong'o",
    "nationality": "Kenyan"
}

result = authors_collection.insert_many([author_1, author_2])
author_ids = result.inserted_ids

### Create the books Collection and Insert Books with Author References

In [24]:
books_collection = db.books

book_1 = {
    "title": "Things Fall Apart",
    "year": 1958,
    "authorId": author_ids[0]  # foreign-key-like reference
}

book_2 = {
    "title": "Petals of Blood",
    "year": 1977,
    "authorId": author_ids[1]
}

books_collection.insert_many([book_1, book_2])
print("Books added to the 'books' collection.")

Books added to the 'books' collection.


### Query Books and Join Author Data Manually

In [25]:
books = books_collection.find()
for book in books:
    author = authors_collection.find_one({"_id": book["authorId"]})
    print(f"{book['title']} was written by {author['name']} ({author['nationality']})")


Things Fall Apart was written by Chinua Achebe (Nigerian)
Petals of Blood was written by Ngũgĩ wa Thiong'o (Kenyan)
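
The loop above issues one extra query per book. MongoDB's aggregation framework can perform the join on the server with the `$lookup` stage instead. The function below is a sketch: `books_with_authors` is a hypothetical helper, and it assumes the `authors` collection and the `authorId` field created in the cells above.

```python
def books_with_authors(books_collection):
    """Join each book to its author on the server using $lookup."""
    pipeline = [
        {"$lookup": {
            "from": "authors",         # collection to join with
            "localField": "authorId",  # field in the books documents
            "foreignField": "_id",     # field in the authors documents
            "as": "author",            # name of the output array field
        }},
        {"$unwind": "$author"},        # turn the one-element array into a document
        {"$project": {"_id": 0, "title": 1, "author.name": 1}},
    ]
    return list(books_collection.aggregate(pipeline))
```

Calling `books_with_authors(books_collection)` would return one dictionary per book, each with a `title` and a nested `author` document holding the `name`.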


### How to See Your Database in MongoDB Atlas
#### 1. Log into MongoDB Atlas
- Go to: https://cloud.mongodb.com
- Sign in with your MongoDB Atlas account.

#### 2. Select Your Project
On the dashboard, click on the project where you created your cluster (e.g., "School Database Project").

#### 3. Go to Your Cluster
Click on the name of your cluster (e.g., Cluster0).

This takes you to the cluster overview page.

#### 4. Open the Data Explorer
In the cluster view, click on “Browse Collections” or “Data Explorer” (depending on your interface).

You'll now see a list of:

- Databases (like school)
- Collections inside those databases (like students)

#### 5. Browse Your Data
- Click on your database (e.g., school).
- Click on the collection (e.g., students).

You’ll see the documents (records) you inserted via your Python script using pymongo.

`Tip:` If you don’t see the database yet, make sure you inserted at least one document using your Python script. MongoDB only shows databases and collections after they contain data.