# MongoDB

## MongoDB

https://pypi.org/project/torch/ <br>
https://www.mongodb.com/docs/

> **Recommended video**
>
> [Dataset and DataLoader in PyTorch.](https://www.youtube.com/watch?v=_BxXrFStVOQ)

In [1]:
!pip install torch



In [2]:
!pip install pymongo



In [None]:
from pymongo import MongoClient
from torch.utils.data import Dataset, DataLoader

### Define the BooksDataset Class

In [20]:
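class BooksDataset(Dataset):
    """Dataset of (book, review) pairs joined from the books and reviews collections."""

    def __init__(self, db):
        self.books = db["books"]
        self.reviews = db["reviews"]

        # Join each book with its reviews ($lookup), then emit one document
        # per review ($unwind). Books without reviews are dropped by $unwind.
        self.pipeline = [
            {
                "$lookup": {
                    "from": "reviews",
                    "localField": "_id",
                    "foreignField": "book_id",
                    "as": "book_reviews"
                }
            },
            {
                "$unwind": "$book_reviews"
            }
        ]

        # Run the aggregation once and keep the results in memory.
        self.books_data = list(self.books.aggregate(self.pipeline))
        self.batch_count = 0  # number of batches consumed, updated by the caller

    def __len__(self):
        # Number of (book, review) pairs, not number of books.
        return len(self.books_data)

    def __getitem__(self, idx):
        book = self.books_data[idx]
        review = book['book_reviews']

        # Return only string fields so the DataLoader can batch them by key.
        return {
            'name': book['name'],
            'author': book['author'],
            'genre': book['genre'],
            'review': review['text']
        }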

    def increase_batch_count(self):
        self.batch_count += 1
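
To make the pipeline's effect concrete, here is a minimal pure-Python sketch of what `$lookup` followed by `$unwind` produces. It runs on in-memory lists rather than on MongoDB, and the helper name `lookup_and_unwind` and the sample documents (with integer ids) are illustrative assumptions, not part of the notebook's data.

```python
def lookup_and_unwind(books, reviews):
    """Emulate $lookup (books._id == reviews.book_id) followed by $unwind."""
    rows = []
    for book in books:
        # $lookup: collect every review whose book_id matches this book's _id.
        matched = [r for r in reviews if r["book_id"] == book["_id"]]
        # $unwind: emit one copy of the book per matched review.
        # A book with no reviews yields no rows (default $unwind behavior).
        for review in matched:
            rows.append({**book, "book_reviews": review})
    return rows


# Hypothetical sample data, for illustration only.
sample_books = [
    {"_id": 1, "name": "Book A"},
    {"_id": 2, "name": "Book B"},  # has no reviews
]
sample_reviews = [
    {"book_id": 1, "text": "first review"},
    {"book_id": 1, "text": "second review"},
]

rows = lookup_and_unwind(sample_books, sample_reviews)
for row in rows:
    print(row["name"], "->", row["book_reviews"]["text"])
```

This is why each book appears once per review in the printed output, and why `__len__` counts book-review pairs rather than books.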

### Connect to MongoDB Database

In [21]:
user = 'user'
password = 'root'
databaseName = 'Library'

In [22]:
# Build the connection string from the variables above instead of hard-coding the credentials.
client = MongoClient(f'mongodb+srv://{user}:{password}@cluster0.djza6my.mongodb.net/')
db = client[databaseName]
collection = db['books']

### Create Dataset and Dataloader

In [23]:
dataset = BooksDataset(db)
dataloader = DataLoader(dataset, batch_size=2)
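
Because every sample is a dict of strings, the `DataLoader` groups the samples field by field, which is why the processing loop indexes `batch['name']`, `batch['author']`, and so on. The following is a simplified pure-Python stand-in for that behavior, not PyTorch's actual implementation; `collate_dicts`, `iter_batches`, and the sample values are hypothetical.

```python
def collate_dicts(samples):
    """Simplified stand-in: turn a list of dicts into a dict of lists."""
    keys = samples[0].keys()
    return {key: [sample[key] for sample in samples] for key in keys}


def iter_batches(samples, batch_size):
    """Yield consecutive batches; the last one may be smaller."""
    for start in range(0, len(samples), batch_size):
        yield collate_dicts(samples[start:start + batch_size])


# Hypothetical samples shaped like the output of BooksDataset.__getitem__.
samples = [
    {"name": "Book A", "review": "r1"},
    {"name": "Book A", "review": "r2"},
    {"name": "Book B", "review": "r3"},
]

batches = list(iter_batches(samples, batch_size=2))
for batch in batches:
    print(batch)
```

With three samples and `batch_size=2`, the sketch yields one full batch and one batch holding the remaining sample.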

### Process and Print Data

In [24]:
for batch in dataloader:
    for name, author, genre, review in zip(batch['name'], batch['author'], batch['genre'], batch['review']):
        print(f"Book: {name} by {author} ({genre})")
        print(f"  Review: {review}")

    dataset.increase_batch_count()

Book: 1984 by George Orwell (Dystopian Fiction)
  Review: I loved this book!
Book: 1984 by George Orwell (Dystopian Fiction)
  Review: Not as good as I expected
Book: The Hunger Games by Suzanne Collins (Science Fiction)
  Review: A beautiful and tragic story
Book: The Hunger Games by Suzanne Collins (Science Fiction)
  Review: One of my all-time favorites
Book: To Kill a Mockingbird by Harper Lee (Fiction)
  Review: I couldn't put this book down!
Book: To Kill a Mockingbird by Harper Lee (Fiction)
  Review: This book was amazing!
Book: The Da Vinci Code by Dan Brown (Mystery)
  Review: A fascinating exploration of human b
Book: The Da Vinci Code by Dan Brown (Mystery)
  Review: A fascinating exploration of human nature
Book: The Hitchhiker's Guide to the Galaxy by Douglas Adams (Science Fiction)
  Review: I enjoyed this book more than I thought I would
Book: The Hitchhiker's Guide to the Galaxy by Douglas Adams (Science Fiction)
  Review: An interesting premise, but not well-executed


In [25]:
print("Batch Count:", dataset.batch_count)

Batch Count: 10
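
The counter equals the number of batches the loop consumed. Assuming a single pass over a freshly created dataset with the default `drop_last=False`, that number is the sample count divided by the batch size, rounded up, so a count of 10 at `batch_size=2` corresponds to 19 or 20 samples. The helper `expected_batches` below is a hypothetical name used only to show the arithmetic.

```python
import math


def expected_batches(num_samples, batch_size, drop_last=False):
    """Number of batches one pass over the data produces."""
    if drop_last:
        # Incomplete final batch is discarded.
        return num_samples // batch_size
    # Incomplete final batch is kept.
    return math.ceil(num_samples / batch_size)


print(expected_batches(20, 2))
print(expected_batches(19, 2))
print(expected_batches(19, 2, drop_last=True))
```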


### How to get info from MongoDB


##### MongoDB has no direct equivalent of the SQL query `SELECT * FROM table_name;` because MongoDB is a document-oriented database, whereas SQL databases are table-based. The closest equivalent, which returns every document in a collection, is calling the `find` method with an empty filter.


In [26]:
cursor = db.books.find({})

for document in cursor:
    print(document)

{'_id': ObjectId('645f93cce0c7d86ffe1b7151'), 'name': '1984', 'author': 'George Orwell', 'genre': 'Dystopian Fiction'}
{'_id': ObjectId('645f93cce0c7d86ffe1b7158'), 'name': 'The Hunger Games', 'author': 'Suzanne Collins', 'genre': 'Science Fiction'}
{'_id': ObjectId('645f93cce0c7d86ffe1b7150'), 'name': 'To Kill a Mockingbird', 'author': 'Harper Lee', 'genre': 'Fiction'}
{'_id': ObjectId('645f93cce0c7d86ffe1b7159'), 'name': 'The Da Vinci Code', 'author': 'Dan Brown', 'genre': 'Mystery'}
{'_id': ObjectId('645f93cce0c7d86ffe1b7156'), 'name': "The Hitchhiker's Guide to the Galaxy", 'author': 'Douglas Adams', 'genre': 'Science Fiction'}
{'_id': ObjectId('645f93cce0c7d86ffe1b7157'), 'name': 'The Hobbit', 'author': 'J.R.R. Tolkien', 'genre': 'High Fantasy'}
{'_id': ObjectId('645f93cce0c7d86ffe1b7155'), 'name': 'The Lord of the Rings', 'author': 'J.R.R. Tolkien', 'genre': 'High Fantasy'}
{'_id': ObjectId('645f93cce0c7d86ffe1b7153'), 'name': 'The Great Gatsby', 'author': 'F. Scott Fitzgerald', 

##### Finding specific documents with a filter criterion


In [27]:
cursor = db.books.find({"genre": "Fiction"})
for document in cursor:
    print(document)

{'_id': ObjectId('645f93cce0c7d86ffe1b7150'), 'name': 'To Kill a Mockingbird', 'author': 'Harper Lee', 'genre': 'Fiction'}
{'_id': ObjectId('645f93cce0c7d86ffe1b7153'), 'name': 'The Great Gatsby', 'author': 'F. Scott Fitzgerald', 'genre': 'Fiction'}
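
The filter document passed to `find` is matched field by field, and an empty filter matches everything. As a rough mental model only (MongoDB's real matching supports many more operators and types), equality and `$in` filters behave like this pure-Python sketch; `matches` and the sample documents are hypothetical.

```python
def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict) and "$in" in condition:
            # {"field": {"$in": [...]}}: value must be one of the listed options.
            if value not in condition["$in"]:
                return False
        elif value != condition:
            # {"field": value}: plain equality.
            return False
    return True


# Hypothetical in-memory documents, for illustration only.
docs = [
    {"name": "Book A", "genre": "Fiction"},
    {"name": "Book B", "genre": "Mystery"},
    {"name": "Book C", "genre": "Fiction"},
]

fiction = [d["name"] for d in docs if matches(d, {"genre": "Fiction"})]
selected = [d["name"] for d in docs if matches(d, {"name": {"$in": ["Book B", "Book C"]}})]
everything = [d["name"] for d in docs if matches(d, {})]
print(fiction, selected, everything)
```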


##### The aggregation query with the `$lookup` operation, shown in the earlier steps, joins two collections that share a linking field.


In [28]:
from typing import List
import motor.motor_asyncio

In [30]:
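from bson import ObjectId

async def execute_query(query):
    """Run a find query on Library.books and return the matching documents as a list."""
    try:
        client = motor.motor_asyncio.AsyncIOMotorClient('mongodb+srv://user:root@cluster0.djza6my.mongodb.net/')
        db = client['Library']
        collection = db['books']

        # Passing None as the length returns all matching documents.
        result = await collection.find(query).to_list(None)

        return result
    except Exception as e:
        print('Error:', e)
        return []  # keep the return type consistent on failure

async def getFromMongo(ids: List[str]):
    # _id values are stored as ObjectId, so convert the string ids before querying.
    query = {'_id': {'$in': [ObjectId(i) for i in ids]}}
    result = await execute_query(query)

    return result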
