# MongoDB
Let's dive into MongoDB and how to use it via `mongosh` (MongoDB Shell). MongoDB is a popular NoSQL document database that stores data as flexible, JSON-like documents, encoded internally in a binary format called BSON. It’s widely used due to its scalability and ease of use. `mongosh` is the command-line interface for interacting with MongoDB.

To make this as clear as possible, I'll break down the process step-by-step, using analogies where appropriate to make the technical details easy to understand.

### MongoDB Concepts

1. **Database** (maps to Postgres DATABASE):
   - Think of a database as a file cabinet.

2. **Collection** (maps to Postgres TABLE):
   - A collection is like a drawer in the file cabinet.

3. **Document** (maps to Postgres row in a table):
   - A document is like a file within the drawer. It’s a single record in the collection, written as JSON-like fields and stored as BSON.
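
The hierarchy above can be pictured with plain Python containers. This is only an illustrative sketch: the names `server`, `myLibrary`, and `books` are made up for the example, and a real MongoDB server stores BSON rather than Python dictionaries.

```python
# Illustrative model only: a server holds databases, a database holds
# collections, and a collection holds documents.
server = {
    "myLibrary": {          # database
        "books": [          # collection
            # document
            {"title": "MongoDB Basics", "author": "John Doe", "year": 2023},
        ],
    },
}

# Documents in the same collection do not have to share the same fields.
server["myLibrary"]["books"].append({"title": "Untitled draft"})

for document in server["myLibrary"]["books"]:
    print(document)
```

The second document has no `author` or `year`, which a fixed-schema table row would not allow without nullable columns.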


### 1. Setting Up MongoDB and mongosh

Imagine MongoDB as a large, well-organized library where each section (database) holds books (collections) filled with pages (documents).

**Step 1: Pull Latest Mongo Docker Image**
  ```sh
  docker pull mongo
  ```

**Step 2: Create Mongo Container**

  ```sh
  docker run -d -p 27017:27017 --name=mongo-example mongo
  ```

**Step 3: Exec-ing into Mongo Container**

  ```sh
  docker exec -it mongo-example mongosh
  ```


### 2. Basic MongoDB Operations

#### 2.1 Databases

**Creating a Database**
- To create or switch to a database, use the `use` command:
  ```sh
  use myLibrary
  ```
  This doesn't create a database until you actually add data to it. Think of it as walking into a new section of the library and setting up shelves.

**Listing Databases**
- To list all databases:
  ```sh
  show dbs
  ```

#### 2.2 Collections

Collections in MongoDB are like books in the library. Each collection contains documents, analogous to pages in a book.

**Creating a Collection**
- MongoDB creates a collection implicitly the first time you insert a document into it. To create one explicitly instead:
  ```sh
  db.createCollection("myCollection")
  ```

**Listing Collections**
- To list all collections in the current database:
  ```sh
  show collections
  ```

#### 2.3 Documents

Documents are the main units of data in MongoDB, similar to how pages are units in a book. They are JSON-like structures (BSON).

**Inserting Documents**
- To insert a document into a collection:
  ```sh
  db.myCollection.insertOne({ title: "MongoDB Basics", author: "John Doe", year: 2023 })
  ```
  This command adds a page (document) with the given details to the book (collection).

**Finding Documents**
- To find documents in a collection:
  ```sh
  db.myCollection.find()
  ```
  This command retrieves all documents from the collection.

**Updating Documents**
- To update documents in a collection:
  ```sh
  db.myCollection.updateOne({ title: "MongoDB Basics" }, { $set: { year: 2024 } })
  ```
  This updates the `year` field of the document with the title "MongoDB Basics".
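
As a rough mental model, `updateOne` with `$set` changes only the first document that matches the filter. The sketch below imitates that in plain Python; the `update_one` helper and the second "copy" document are hypothetical and exist only for illustration, and this is not how the server is implemented.

```python
def update_one(collection, query, set_fields):
    """Apply set_fields to the first document whose fields equal the query."""
    for document in collection:
        if all(document.get(key) == value for key, value in query.items()):
            document.update(set_fields)
            return 1  # number of documents modified
    return 0


books = [
    {"title": "MongoDB Basics", "author": "John Doe", "year": 2023},
    {"title": "MongoDB Basics", "author": "Second Copy", "year": 2023},
]

modified = update_one(books, {"title": "MongoDB Basics"}, {"year": 2024})
print(modified)
print(books)
```

Only the first matching document is changed; updating every match is what `updateMany` is for.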

**Deleting Documents**
- To delete documents from a collection:
  ```sh
  db.myCollection.deleteOne({ title: "MongoDB Basics" })
  ```
  This deletes the document with the specified title.

### 3. Advanced Operations

#### 3.1 Indexes

Indexes in MongoDB are like the index in a book, helping you quickly find the information you need.

**Creating an Index**
- To create an index on a field:
  ```sh
  db.myCollection.createIndex({ title: 1 })
  ```
  This creates an ascending index on the `title` field.
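
To see why a sorted index helps, here is a small Python analogy using binary search over sorted keys. It is only an analogy under simplifying assumptions: the `index_lookup` helper and the sample titles are made up, and MongoDB's real indexes are tree structures managed by the server.

```python
import bisect

# Keys kept in ascending order, like an ascending index on `title`.
titles = ["Advanced MongoDB", "MongoDB Basics", "Zen of Databases"]


def index_lookup(sorted_keys, key):
    """Binary search: return the position of key, or None if it is absent."""
    position = bisect.bisect_left(sorted_keys, key)
    if position < len(sorted_keys) and sorted_keys[position] == key:
        return position
    return None


print(index_lookup(titles, "MongoDB Basics"))
print(index_lookup(titles, "Missing"))
```

Without the sorted keys, finding a title means scanning every entry; with them, each comparison halves the remaining search space.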

#### 3.2 Aggregation

Aggregation operations process data records and return computed results. They are similar to SQL's GROUP BY operations.

**Aggregation Example**
- To group documents by author and count the number of books by each:
  ```sh
  db.myCollection.aggregate([
    { $group: { _id: "$author", count: { $sum: 1 } } }
  ])
  ```
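
For intuition, the same grouping can be written in plain Python. This is a sketch of what the `$group` stage computes, not of how MongoDB executes it, and the three sample books are invented for the example.

```python
from collections import Counter

books = [
    {"title": "MongoDB Basics", "author": "John Doe"},
    {"title": "Advanced MongoDB", "author": "Jane Smith"},
    {"title": "MongoDB Recipes", "author": "John Doe"},
]

# Count one per document, keyed by author, like { $sum: 1 } per group.
counts = Counter(book["author"] for book in books)
results = [{"_id": author, "count": count} for author, count in counts.items()]
print(results)
```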

### 4. Practical Example

Let’s go through a practical example where we create a library database, add a collection of books, and perform some operations.

1. **Switch to the Library Database**
   ```sh
   use library
   ```

2. **Create a Books Collection and Insert Documents**
   ```sh
   db.createCollection("books")
   db.books.insertMany([
     { title: "MongoDB Basics", author: "John Doe", year: 2023 },
     { title: "Advanced MongoDB", author: "Jane Smith", year: 2024 }
   ])
   ```

3. **Find All Books**
   ```sh
   db.books.find()
   ```

4. **Update a Book's Year**
   ```sh
   db.books.updateOne({ title: "MongoDB Basics" }, { $set: { year: 2024 } })
   ```

5. **Delete a Book**
   ```sh
   db.books.deleteOne({ title: "Advanced MongoDB" })
   ```



# MongoDB with Sample Data
Data: https://github.com/neelabalan/mongodb-sample-dataset/blob/main/sample_airbnb/listingsAndReviews.json

**Step 1: Copy the downloaded data into the docker container**

```sh
cd Downloads
docker cp listingsAndReviews.json mongo-example:/
```


**Step 2: Use the `mongoimport` command to load the sample data into Mongo**

```sh
docker exec -it mongo-example bash
ls
mongoimport -h localhost:27017 --db sample_airbnb --collection listingsAndReviews --file listingsAndReviews.json
```

**Step 3: Exit from the docker container**

```sh
exit
```

**Step 4: Enter the docker container via `mongosh`**

```sh
docker exec -it mongo-example mongosh
```

**Step 5: Check if data import happened correctly**

```sh
show dbs
use sample_airbnb
db.listingsAndReviews.find();
```

With the data imported, the rest of this section looks at how documents in the `listingsAndReviews` collection are structured and which operations you can run against them.


### Understanding the `listingsAndReviews` Collection

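Let's take a closer look at a simplified, illustrative document modeled on the `listingsAndReviews.json` data. Field names in the imported dataset may differ from this sketch, so run `db.listingsAndReviews.findOne()` to inspect a real document before adapting the queries below: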

```json
{
  "listing_id": "10006546",
  "name": "Ribeira Charming Duplex",
  "summary": "Fantastic duplex apartment...",
  "property_type": "House",
  "space": {
    "bathrooms": 1,
    "bedrooms": 2,
    "beds": 3
  },
  "address": {
    "street": "Porto",
    "suburb": "Vila Nova de Gaia"
  },
  "reviews": [
    {
      "review_id": 1,
      "reviewer_name": "John",
      "comments": "Great place!"
    }
  ]
}
```

Each document includes:
- `listing_id`: A unique identifier for the listing.
- `name`: The name of the listing.
- `summary`: A brief description of the listing.
- `property_type`: The type of property (e.g., House, Apartment).
- `space`: An embedded document detailing the number of bathrooms, bedrooms, and beds.
- `address`: An embedded document containing the street and suburb.
- `reviews`: An array of embedded documents, each representing a review.

### MongoDB Operations on Sample Data

Now, let's perform various operations to interact with the data, including CRUD operations (Create, Read, Update, Delete).

#### 1. Inserting Data

To insert a document into a collection, you use the `insertOne` or `insertMany` methods.

```sh
db.listingsAndReviews.insertOne({
  "listing_id": "10006546",
  "name": "Ribeira Charming Duplex",
  "summary": "Fantastic duplex apartment...",
  "property_type": "House",
  "space": {
    "bathrooms": 1,
    "bedrooms": 2,
    "beds": 3
  },
  "address": {
    "street": "Porto",
    "suburb": "Vila Nova de Gaia"
  },
  "reviews": [
    {
      "reviewer_name": "John",
      "comments": "Great place!"
    }
  ]
});
```

In this example, a new listing is added to the `listingsAndReviews` collection.

#### 2. Querying Data

To find documents in a collection, use the `find` method.

```sh
db.listingsAndReviews.find({"address.street": "New York, NY, United States"});
```

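This command retrieves all listings whose embedded `address.street` field equals "New York, NY, United States". The dot notation (`address.street`) reaches into the embedded `address` document.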
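
Dot notation can be pictured as walking through nested dictionaries one key at a time. The `get_path` helper below is a hypothetical illustration that handles embedded documents only; MongoDB's own dot notation also reaches into arrays, which this sketch does not attempt.

```python
def get_path(document, dotted_path):
    """Follow a dotted path such as "address.street" through nested dicts."""
    value = document
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None  # path does not exist in this document
        value = value[key]
    return value


listing = {
    "name": "Ribeira Charming Duplex",
    "address": {"street": "Porto", "suburb": "Vila Nova de Gaia"},
}

print(get_path(listing, "address.street"))
print(get_path(listing, "address.country"))
```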

#### 3. Updating Data

To update existing documents, use the `updateOne` or `updateMany` methods.

```sh
db.listingsAndReviews.updateOne(
  {"listing_id": "10006546"},
  {$set: {"summary": "Updated summary"}}
);
```

This command updates the summary of the listing with `listing_id` "10006546".

#### 4. Deleting Data

To remove documents, use the `deleteOne` or `deleteMany` methods.

```sh
db.listingsAndReviews.deleteOne({"listing_id": "10006546"});
```

This command deletes the listing with `listing_id` "10006546".



### MongoDB Aggregation Framework

The aggregation framework is a powerful feature in MongoDB that allows you to perform data processing and transformation tasks. It works by passing documents through a series of stages, each performing a specific operation. The output of one stage is the input to the next.
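
The stage-by-stage flow can be sketched in plain Python as a list of functions applied in order. The helper names (`run_pipeline`, `match_houses`, `project_name`) and the sample listings are invented for this illustration; they stand in for stages such as `$match` and `$project`.

```python
def run_pipeline(documents, stages):
    """Pass documents through each stage in order; each output feeds the next."""
    for stage in stages:
        documents = stage(documents)
    return documents


def match_houses(documents):
    # Plays the role of a $match stage.
    return [doc for doc in documents if doc["property_type"] == "House"]


def project_name(documents):
    # Plays the role of a $project stage.
    return [{"name": doc["name"]} for doc in documents]


listings = [
    {"name": "Ribeira Charming Duplex", "property_type": "House"},
    {"name": "City Studio", "property_type": "Apartment"},
]

result = run_pipeline(listings, [match_houses, project_name])
print(result)
```

Stage order matters: filtering early means later stages handle fewer documents.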

#### Example 1: Finding the Average Number of Beds by Property Type

Let's calculate the average number of beds for each property type.

```sh
db.listingsAndReviews.aggregate([
  {
    $group: {
      _id: "$property_type",
      averageBeds: { $avg: "$space.beds" }
    }
  },
  {
    $sort: { averageBeds: -1 }
  }
]);
```

**Explanation:**
- `$group`: Groups documents by `property_type` and calculates the average number of beds (`$avg: "$space.beds"`).
- `_id`: The field by which to group (in this case, `property_type`).
- `$sort`: Sorts the results in descending order based on the `averageBeds`.

Roughly equivalent SQL, assuming the nested `space.beds` value were flattened into a `space_beds` column:
```sql
SELECT property_type, AVG(space_beds) AS averageBeds FROM listingsAndReviews GROUP BY property_type ORDER BY averageBeds DESC;
```
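
The same group-then-sort logic can be traced by hand in plain Python. This is a sketch over three made-up listings that follow the simplified document shape used in this section, not output from the real dataset.

```python
from collections import defaultdict

listings = [
    {"property_type": "House", "space": {"beds": 3}},
    {"property_type": "House", "space": {"beds": 5}},
    {"property_type": "Apartment", "space": {"beds": 2}},
]

# $group: collect the bed counts for each property type.
beds_by_type = defaultdict(list)
for listing in listings:
    beds_by_type[listing["property_type"]].append(listing["space"]["beds"])

# $avg per group, then $sort descending on averageBeds.
results = [
    {"_id": property_type, "averageBeds": sum(beds) / len(beds)}
    for property_type, beds in beds_by_type.items()
]
results.sort(key=lambda row: row["averageBeds"], reverse=True)
print(results)
```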

#### Example 2: Counting Listings per Suburb

To count the number of listings in each suburb:

```sh
db.listingsAndReviews.aggregate([
  {
    $group: {
      _id: "$address.suburb",
      listingCount: { $sum: 1 }
    }
  },
  {
    $sort: { listingCount: -1 }
  }
]);
```

**Explanation:**
- `$group`: Groups documents by `address.suburb` and counts the number of listings (`$sum: 1`).
- `$sort`: Sorts the results in descending order based on the `listingCount`.

#### Example 3: Aggregating Reviews Data

Let’s aggregate review data to find the average number of reviews per listing:

```sh
db.listingsAndReviews.aggregate([
  {
    $project: {
      listing_id: 1,
      reviewCount: { $size: "$reviews" }
    }
  },
  {
    $group: {
      _id: null,
      averageReviews: { $avg: "$reviewCount" }
    }
  }
]);
```

**Explanation:**
- `$project`: Creates a new field `reviewCount` which is the size of the `reviews` array.
- `$group`: Groups all documents (since `_id` is `null`) and calculates the average number of reviews (`$avg: "$reviewCount"`).
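
A plain-Python trace of the two stages, using three invented listings, may make the data flow easier to follow. It is a sketch of the computation only.

```python
listings = [
    {"listing_id": "a", "reviews": [{"comments": "Great place!"}, {"comments": "Lovely"}]},
    {"listing_id": "b", "reviews": []},
    {"listing_id": "c", "reviews": [{"comments": "Nice"}]},
]

# $project: keep listing_id and add reviewCount (the array length, like $size).
projected = [
    {"listing_id": listing["listing_id"], "reviewCount": len(listing["reviews"])}
    for listing in listings
]

# $group with _id: null: one group containing every document, averaged with $avg.
average_reviews = sum(row["reviewCount"] for row in projected) / len(projected)
print(projected)
print(average_reviews)
```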

#### Example 4: Filtering Listings with High Review Counts

Find listings with more than 10 reviews:

```sh
db.listingsAndReviews.aggregate([
  {
    $project: {
      listing_id: 1,
      name: 1,
      reviewCount: { $size: "$reviews" }
    }
  },
  {
    $match: { reviewCount: { $gt: 10 } }
  }
]);
```

**Explanation:**
- `$project`: Creates a new field `reviewCount` which is the size of the `reviews` array and includes `listing_id` and `name` in the output.
- `$match`: Filters documents where `reviewCount` is greater than 10.


#### Example 5: Listings with Specific Amenities

Find listings that have both "Wifi" and "Kitchen":

```sh
db.listingsAndReviews.find({
  amenities: { $all: ["Wifi", "Kitchen"] }
});
```

**Explanation:**
- `$all`: Ensures that the `amenities` array contains both "Wifi" and "Kitchen".
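
In plain Python terms, `$all` behaves like a subset check: every required value must appear in the array, in any order, and extra values are allowed. The `has_all` helper and the sample listings are hypothetical.

```python
def has_all(values, required):
    """True when every required item appears somewhere in values."""
    return set(required).issubset(values)


listings = [
    {"name": "Loft", "amenities": ["Wifi", "Kitchen", "Heating"]},
    {"name": "Cabin", "amenities": ["Kitchen"]},
]

matches = [
    listing["name"]
    for listing in listings
    if has_all(listing["amenities"], ["Wifi", "Kitchen"])
]
print(matches)
```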

#### Example 6: Text Search for Keywords in Summary

Perform a text search for listings whose summary contains the keyword "beautiful" or "cozy":

```sh
db.listingsAndReviews.createIndex({ summary: "text" });
db.listingsAndReviews.find({
  $text: { $search: "beautiful cozy" }
});
```

**Explanation:**
- `createIndex`: Creates a text index on the `summary` field to enable text search.
- `$text`: Searches the text-indexed `summary` field. Space-separated terms are combined with a logical OR, so this matches summaries containing "beautiful" or "cozy" (or both).
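
The sketch below imitates one aspect of `$text`: space-separated search terms match a document when any one of them appears. It is a deliberately rough approximation with a made-up `text_search` helper; the real text search also applies stemming, ignores stop words, and can score results, none of which is modeled here.

```python
import re


def text_search(summary, search):
    """True when any search term appears as a whole word in the summary."""
    words = set(re.findall(r"[a-z]+", summary.lower()))
    return any(term in words for term in search.lower().split())


summaries = [
    "A cozy flat near the river",
    "Beautiful views and a cozy fireplace",
    "Spacious modern loft",
]

hits = [summary for summary in summaries if text_search(summary, "beautiful cozy")]
print(hits)
```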


# MongoDB & Python
Let's dive into MongoDB and how to use it with Python. We'll use the `pymongo` library to interact with MongoDB from Python.

### Step-by-Step Guide

#### 1. Install `pymongo`
Install the `pymongo` library using pip:
```bash
pip install pymongo
```

#### 2. Connect to MongoDB
We'll start by connecting to the MongoDB server. Here's how you can do it:

```python
from pymongo import MongoClient

# Create a connection to the MongoDB server
client = MongoClient('localhost', 27017)

# Access the 'library' database
db = client['library']
```

#### 3. Create Sample Data
Let's create some sample data to insert into the MongoDB database. We'll use a collection (similar to a table in relational databases) called `books`.

Here’s a simple dataset of books:

```python
books = [
    {"title": "The Catcher in the Rye", "author": "J.D. Salinger", "published_year": 1951, "genres": ["Fiction"]},
    {"title": "To Kill a Mockingbird", "author": "Harper Lee", "published_year": 1960, "genres": ["Fiction", "Classic"]},
    {"title": "1984", "author": "George Orwell", "published_year": 1949, "genres": ["Fiction", "Dystopian"]},
    {"title": "The Great Gatsby", "author": "F. Scott Fitzgerald", "published_year": 1925, "genres": ["Fiction", "Classic"]},
    {"title": "Moby Dick", "author": "Herman Melville", "published_year": 1851, "genres": ["Fiction", "Adventure"]}
]
```

#### 4. Insert Data into MongoDB
We will now insert this sample data into the `books` collection.

```python
# Access the 'books' collection
collection = db['books']

# Insert the books into the collection
collection.insert_many(books)
```

#### 5. Query Data from MongoDB
Let's perform some queries to fetch data from the MongoDB collection.

**a. Fetch all documents:**
```python
for book in collection.find():
    print(book)
```

**b. Fetch books published after 1950:**
```python
for book in collection.find({"published_year": {"$gt": 1950}}):
    print(book)
```

**c. Fetch books by a specific author:**
```python
for book in collection.find({"author": "George Orwell"}):
    print(book)
```

**d. Fetch books that belong to the "Classic" genre:**
```python
for book in collection.find({"genres": "Classic"}):
    print(book)
```
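
To make the filter semantics concrete, here is a small stand-alone sketch of how the three kinds of filters above decide whether a document matches. The `matches` helper is hypothetical and covers only plain equality, `$gt`, and a scalar tested against an array field; it needs no MongoDB server and is not part of `pymongo`.

```python
def matches(document, query):
    """Return True when the document satisfies every condition in the query."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gt": 1950}; only $gt is sketched here.
            if "$gt" in condition and not (value is not None and value > condition["$gt"]):
                return False
        elif isinstance(value, list) and not isinstance(condition, list):
            # A scalar condition matches an array field that contains it.
            if condition not in value:
                return False
        elif value != condition:
            return False
    return True


sample_books = [
    {"title": "The Catcher in the Rye", "published_year": 1951, "genres": ["Fiction"]},
    {"title": "To Kill a Mockingbird", "published_year": 1960, "genres": ["Fiction", "Classic"]},
    {"title": "The Great Gatsby", "published_year": 1925, "genres": ["Fiction", "Classic"]},
]

after_1950 = [b["title"] for b in sample_books if matches(b, {"published_year": {"$gt": 1950}})]
classics = [b["title"] for b in sample_books if matches(b, {"genres": "Classic"})]
print(after_1950)
print(classics)
```

Note how `{"genres": "Classic"}` matches documents whose `genres` array merely contains "Classic"; the array does not have to equal it.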

### Complete Example Code
Here’s the complete code to perform the above steps:

```python
from pymongo import MongoClient

# Create a connection to the MongoDB server
client = MongoClient('localhost', 27017)

# Access the 'library' database
db = client['library']

# Sample data
books = [
    {"title": "The Catcher in the Rye", "author": "J.D. Salinger", "published_year": 1951, "genres": ["Fiction"]},
    {"title": "To Kill a Mockingbird", "author": "Harper Lee", "published_year": 1960, "genres": ["Fiction", "Classic"]},
    {"title": "1984", "author": "George Orwell", "published_year": 1949, "genres": ["Fiction", "Dystopian"]},
    {"title": "The Great Gatsby", "author": "F. Scott Fitzgerald", "published_year": 1925, "genres": ["Fiction", "Classic"]},
    {"title": "Moby Dick", "author": "Herman Melville", "published_year": 1851, "genres": ["Fiction", "Adventure"]}
]

# Access the 'books' collection
collection = db['books']

# Insert the books into the collection
collection.insert_many(books)

# Fetch all documents
print("All books:")
for book in collection.find():
    print(book)

# Fetch books published after 1950
print("\nBooks published after 1950:")
for book in collection.find({"published_year": {"$gt": 1950}}):
    print(book)

# Fetch books by George Orwell
print("\nBooks by George Orwell:")
for book in collection.find({"author": "George Orwell"}):
    print(book)

# Fetch books in the 'Classic' genre
print("\nBooks in the 'Classic' genre:")
for book in collection.find({"genres": "Classic"}):
    print(book)
```

### Recapping the Code
1. **Connection**: We connect to the MongoDB server running on `localhost` at port `27017`.
2. **Database Access**: We access a database named `library`, which MongoDB creates on the first insert if it does not exist yet.
3. **Data Preparation**: We prepare a list of dictionaries, each representing a book with various attributes.
4. **Data Insertion**: We insert the list of books into a collection named `books`.
5. **Queries**: We perform various queries to retrieve and print data based on different criteria.


# Database Paradigms

https://www.youtube.com/watch?v=W2Z7fbCLSTw&ab_channel=Fireship