## Mongo performance testing

This study evaluates Mongo as a storage backend.

### Requirements

- number of users: 500_000
- number of movies: 20_000
- maximum database response time: 200 ms

### Startup

In [1]:
!docker-compose up -d

Creating network "mongo_default" with the default driver
Creating volume "mongo_ugc_mongo_cluster_data1" with default driver
Creating volume "mongo_ugc_mongo_cluster_data2" with default driver
Creating volume "mongo_ugc_mongo_cluster_data3" with default driver
Creating volume "mongo_ugc_mongo_cluster_data4" with default driver
Creating volume "mongo_ugc_mongo_cluster_data5" with default driver
Creating volume "mongo_ugc_mongo_cluster_data6" with default driver
Creating volume "mongo_ugc_mongo_cluster_config1" with default driver
Creating volume "mongo_ugc_mongo_cluster_config2" with default driver
Creating volume "mongo_ugc_mongo_cluster_config3" with default driver
Creating mongors2n3 ... 
Creating mongocfg2  ... 
Creating mongors2n1 ... 
Creating mongocfg3  ... 
Creating mongors1n3 ... 
Creating mongors1n1 ... 
Creating mongors2n2 ... 
Creating mongocfg1  ... 
Creating mongors1n2 ... 

### Cluster and database initialization

In [2]:
!docker-compose exec mongocfg1 sh -c "mongo < /scripts/init_config_server.js"
!docker-compose exec mongors1n1 sh -c "mongo < /scripts/init_shard_01.js"
!docker-compose exec mongors2n1 sh -c "mongo < /scripts/init_shard_02.js"

!sleep 30

!docker-compose exec mongos1 sh -c "mongo < /scripts/init_router.js"
!docker-compose exec mongos1 sh -c "mongo < /scripts/init_db.js"

MongoDB shell version v5.0.5
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("fe3e11cc-6d4a-488a-8101-7c143d9f3468") }
MongoDB server version: 5.0.5
which delivers improved usability and compatibility.The "mongo" shell has been deprecated and will be removed in
an upcoming release.
For installation instructions, see
https://docs.mongodb.com/mongodb-shell/install/
{
	"ok" : 1,
	"$gleStats" : {
		"lastOpTime" : Timestamp(1640244451, 1),
		"electionId" : ObjectId("000000000000000000000000")
	},
	"lastCommittedOpTime" : Timestamp(1640244451, 1)
}
bye
MongoDB shell version v5.0.5
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("47cbf436-8471-428b-91b4-15099df283b7") }
MongoDB server version: 5.0.5
which delivers improved usability and compatibility.The "mongo" shell has been deprecated and will be removed in
a

### Loading test data

The database is split into the following collections:

- **movies**
    - data schema:

            {
                "_id": <uuid_string>,
                "ratings_qty": <integer>,
                "ratings_sum": <integer>,
                "reviews": [<uuid_string>, ...]
            }
    - not sharded

- **users**
    - data schema:

            {
                "_id": <uuid_string>,
                "bookmarks": [<uuid_string>, ...]
            }
    - shard key: **_id**

- **movie_ratings**
    - data schema:

            {
                "_id": <uuid_string>,
                "movie_id": <uuid_string>,
                "user_id": <uuid_string>,
                "score": <integer>
            }
    - shard key: **user_id**

- **reviews**
    - data schema:

            {
                "_id": <uuid_string>,
                "author_id": <uuid_string>,
                "movie_id": <uuid_string>,
                "text": <string>,
                "pub_date": <datetime>,
                "movie_rating_id": <uuid_string>,
                "movie_rating_score": <integer>,
                "review_rating_sum": <integer>,
                "review_rating_qty": <integer>
            }
    - shard key: **author_id**
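
The initialization scripts such as `init_db.js` are not shown in this notebook. As a rough sketch only, the shard keys listed above could be expressed as MongoDB `enableSharding` and `shardCollection` admin commands built in Python. The hashed key strategy and the helper names `build_shard_commands` and `apply_shard_commands` are assumptions made for illustration; the actual scripts may use ranged keys or different options.

```python
# Shard keys taken from the collection list above; `movies` is left unsharded.
SHARD_KEYS = {
    "users": "_id",
    "movie_ratings": "user_id",
    "reviews": "author_id",
}


def build_shard_commands(db_name, strategy="hashed"):
    """Build admin command documents that enable sharding for `db_name`.

    `strategy` is an assumption: "hashed" for hashed sharding, 1 for ranged.
    """
    commands = [{"enableSharding": db_name}]
    for collection, field in SHARD_KEYS.items():
        commands.append({
            "shardCollection": f"{db_name}.{collection}",
            "key": {field: strategy},
        })
    return commands


def apply_shard_commands(client, db_name):
    """Send the commands through a mongos router (hypothetical helper).

    `client` is expected to be a pymongo MongoClient connected to mongos.
    """
    for command in build_shard_commands(db_name):
        client.admin.command(command)
```

Building the command documents separately from sending them keeps the key choices easy to review without a running cluster.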

In [4]:
!export PYTHONPATH="${PYTHONPATH}:${PWD}/../.."
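
A `!` command in Jupyter runs in its own subshell, so a variable exported there is not visible to the notebook kernel. One possible alternative, sketched here under the assumption that the project root is two directories above the working directory (as the `${PWD}/../..` path suggests), is to extend `sys.path` from the kernel itself:

```python
import os
import sys

# Assumption: the project root sits two levels above the current directory.
project_root = os.path.abspath(os.path.join(os.getcwd(), "..", ".."))

# Add the root so that `config` and `utils` can be imported by the kernel process.
if project_root not in sys.path:
    sys.path.insert(0, project_root)
```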

In [5]:
from multiprocessing import Pool

import tqdm
from pymongo import MongoClient

from config import DB_NAME, MONGO_HOST, MONGO_PORT
from utils.test_data_gen import (
    generate_movie_and_related_documents,
    generate_user_documents,
    movie_ids
)


def upload_users_documents():
    client = MongoClient(MONGO_HOST, MONGO_PORT)
    db = client.get_database(DB_NAME)

    collection = db.get_collection('users')
    collection.insert_many(generate_user_documents(), ordered=False)

    client.close()


def upload_movie_ratings_and_reviews(movie_id):
    # Each worker process creates its own client, since MongoClient is not fork-safe:
    # https://pymongo.readthedocs.io/en/stable/faq.html?highlight=never%20do%20this#using-pymongo-with-multiprocessing
    client = MongoClient(MONGO_HOST, MONGO_PORT)
    db = client.get_database(DB_NAME)

    movie, ratings, reviews = generate_movie_and_related_documents(movie_id)

    movies_coll = db.get_collection('movies')
    movies_coll.insert_one(movie)

    if ratings:
        ratings_coll = db.get_collection('movie_ratings')
        ratings_coll.insert_many(ratings, ordered=False)

    if reviews:
        reviews_coll = db.get_collection('reviews')
        reviews_coll.insert_many(reviews, ordered=False)

    client.close()

In [6]:
upload_users_documents()

with Pool() as pool:
    r = list(tqdm.tqdm(
        pool.imap(upload_movie_ratings_and_reviews, movie_ids),
        total=len(movie_ids)
    ))

100%|██████████| 20000/20000 [1:50:26<00:00,  3.02it/s]  


### Running test queries

#### Read

In [11]:
from utils.test_scenarios import READ_SCENARIOS

for scenario in READ_SCENARIOS:
    func = scenario.get('func')
    kwargs = scenario.get('kwargs')
    func(**kwargs)

Average execution time for get_movie_reviews_sort_pub_date (over 10 runs): 0.0920 seconds
Execution result:
 [{'_id': 'b409f231-e929-4bee-b5fe-ea3a9723e714', 'author_id': '229cc0e3-fe5a-4f4c-8926-1beb9eb4e286', 'movie_id': 'f2088dce-cf73-4638-b44c-c18516f5ff12', 'pub_date': datetime.datetime(2021, 11, 24, 7, 36, 5), 'text': 'Test review for f2088dce-cf73-4638-b44c-c18516f5ff12 by 229cc0e3-fe5a-4f4c-8926-1beb9eb4e286', 'movie_rating_id': '4c96dd2e-bcd5-4405-a4b0-c8ab2cd40e98', 'movie_rating_score': 3, 'review_rating_qty': 8, 'review_rating_sum': 37}, {'_id': '8dd8bf29-557a-412b-b01e-a6c5c849dce7', 'author_id': 'd7dba8f9-9e30-42ca-adb3-52b42de619d7', 'movie_id': 'f2088dce-cf73-4638-b44c-c18516f5ff12', 'pub_date': datetime.datetime(2021, 9, 1, 13, 56, 59), 'text': 'Test review for f2088dce-cf73-4638-b44c-c18516f5ff12 by d7dba8f9-9e30-42ca-adb3-52b42de619d7', 'movie_rating_id': '2cffd421-2844-4271-8f41-41a9b26d89a0', 'movie_rating_score': 5, 'review_rating_qty': 20, 'review_rating_sum': 10
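
The `utils.test_scenarios` module is not shown in this notebook. As a hedged sketch of how a report line such as "Average execution time for ... (over 10 runs)" might be produced, a helper can call a query function several times and average the wall-clock duration. The name `run_timed` and its parameters are hypothetical and are not the project's actual API.

```python
import time


def run_timed(func, runs=10, **kwargs):
    """Call `func(**kwargs)` `runs` times and report the average duration.

    Returns (average_seconds, last_result). Hypothetical helper for illustration.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")

    durations = []
    result = None
    for _ in range(runs):
        started = time.perf_counter()
        result = func(**kwargs)
        durations.append(time.perf_counter() - started)

    average = sum(durations) / len(durations)
    print(
        f"Average execution time for {func.__name__} "
        f"(over {runs} runs): {average:.4f} seconds"
    )
    return average, result
```

An average over a few runs smooths out one-off delays but hides outliers, so recording the maximum as well would be closer to a "maximum response time" requirement.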

#### Write

In [12]:
from utils.test_scenarios import WRITE_SCENARIOS

for scenario in WRITE_SCENARIOS:
    func = scenario.get('func')
    kwargs = scenario.get('kwargs')
    func(**kwargs)

Average execution time for add_movie_rating (over 10 runs): 0.1124 seconds
Execution result:
 Inserted rating with id: 360efcf1-ce75-47b1-8760-540b4fc4dac9

Average execution time for add_review (over 10 runs): 0.1330 seconds
Execution result:
 Added movie_review with id: 023e57b5-4984-471d-86b5-ff9d9aa2106c

Average execution time for add_bookmark (over 10 runs): 0.0232 seconds
Execution result:
 Added bookmark for movie: e86bf293-ec9e-4989-b161-488633facb4b to user: 4585147c-0f63-4461-b388-bf9622e9d3cc



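### Conclusion

Mongo meets the stated requirements: the average execution time of every scenario shown above is between 0.0232 and 0.1330 seconds, which is below the 200 ms limit.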
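
To make the comparison explicit, the averages reported in the outputs above can be checked against the 200 ms limit. The figures below are copied from the notebook output; the dictionary layout and the `find_violations` helper are illustrative assumptions. These are averages over 10 runs, so they do not bound the worst-case response time.

```python
MAX_RESPONSE_SECONDS = 0.200  # requirement: 200 ms

# Average execution times (seconds) reported in the notebook output.
AVERAGE_SECONDS = {
    "get_movie_reviews_sort_pub_date": 0.0920,
    "add_movie_rating": 0.1124,
    "add_review": 0.1330,
    "add_bookmark": 0.0232,
}


def find_violations(averages, limit=MAX_RESPONSE_SECONDS):
    """Return the scenarios whose average time exceeds `limit` (empty if none)."""
    return {name: seconds for name, seconds in averages.items() if seconds > limit}


violations = find_violations(AVERAGE_SECONDS)
print("violations:", violations)
```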

### Shutdown

In [13]:
!docker-compose down -v

Stopping mongos1    ... 
Stopping mongocfg1  ... 
Stopping mongors1n2 ... 
Stopping mongors1n3 ... 
Stopping mongors1n1 ... 
Stopping mongors2n2 ... 
Stopping mongors2n1 ... 
Stopping mongocfg3  ... 
Stopping mongocfg2  ... 
Stopping mongors2n3 ... 
Removing mongos1    ... 
Removing mongocfg1  ... 
Removing mongors1n2 ... 
Removing mongors1n3 ... 
Removing mongors1n1 ... 
Removing mongors2n2 ... 
Removing mongors2n1 ... 
Removing mongocfg3  ... 
Removing mongocfg2  ... 
Removing mongors2n3 ... 
Removing network mongo_default
Removing volume mongo_ugc_mongo_cluster_data1
Removing volume mongo_ugc_mongo_cluster_data2
Removing volume mongo_ugc_mongo_cluster_data3
Removing volume mongo_ugc_mongo_cluster_data4
Removing volume mongo_ugc_mongo_cluster_data5
Removing volume mongo_ugc_mongo_cluster_data6
Removing v