Mongo performance testing
This study evaluates Mongo as a data store.

Requirements
number of users: 500_000
number of movies: 20_000
maximum database response time: 200 ms

Startup

In [None]:
!docker-compose up -d

Cluster and database initialization

In [None]:
!docker-compose exec mongocfg1 sh -c "mongo < /scripts/init_config_server.js"
!docker-compose exec mongors1n1 sh -c "mongo < /scripts/init_shard_01.js"
!docker-compose exec mongors2n1 sh -c "mongo  < /scripts/init_shard_02.js"

!sleep 30

!docker-compose exec mongos1 sh -c "mongo < /scripts/init_router.js"
!docker-compose exec mongos1 sh -c "mongo < /scripts/init_db.js"
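
The fixed `sleep 30` in the cell above gives the replica sets time to come up before the router is configured. A polling loop is one alternative to a fixed delay. The following is a minimal sketch only: `wait_until` is a hypothetical helper that is not part of this project, and which readiness check suits this cluster is left as an assumption.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0):
    """Poll predicate() until it returns a truthy value or the timeout expires.

    Hypothetical helper for illustration; returns True on success, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if predicate():
                return True
        except Exception:
            # Treat errors (e.g. connection refused) as "not ready yet".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The predicate could, for instance, send a `ping` command through a `MongoClient` created with a short `serverSelectionTimeoutMS`. Whether a successful ping is a sufficient readiness signal for the sharded setup used here has not been verified.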


In [None]:
import sys
sys.path.append('../..')  # `!export` runs in a subshell and would not affect this kernel

In [None]:
from multiprocessing import Pool

import tqdm
from pymongo import MongoClient

from config import DB_NAME, MONGO_HOST, MONGO_PORT
from utils.test_data_gen import (
    generate_movie_and_related_documents,
    generate_user_documents,
    movie_ids
)


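def upload_users_documents():
    """Insert all generated user documents into the `users` collection."""
    client = MongoClient(MONGO_HOST, MONGO_PORT)
    db = client.get_database(DB_NAME)

    collection = db.get_collection('users')
    # ordered=False lets the server continue past individual failed inserts.
    collection.insert_many(generate_user_documents(), ordered=False)

    client.close()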


def upload_movie_ratings_and_reviews(movie_id):
    # Each worker process creates its own client: MongoClient is not fork-safe. See
    # https://pymongo.readthedocs.io/en/stable/faq.html?highlight=never%20do%20this#using-pymongo-with-multiprocessing
    client = MongoClient(MONGO_HOST, MONGO_PORT)
    db = client.get_database(DB_NAME)

    movie, ratings, reviews = generate_movie_and_related_documents(movie_id)

    movies_coll = db.get_collection('movies')
    movies_coll.insert_one(movie)

    if ratings:
        ratings_coll = db.get_collection('movie_ratings')
        ratings_coll.insert_many(ratings, ordered=False)

    if reviews:
        reviews_coll = db.get_collection('reviews')
        reviews_coll.insert_many(reviews, ordered=False)

    client.close()

In [None]:
upload_users_documents()

with Pool() as pool:
    r = list(tqdm.tqdm(
        pool.imap(upload_movie_ratings_and_reviews, movie_ids),
        total=len(movie_ids)
    ))


Running test queries

Read

In [None]:
from utils.test_scenarios import READ_SCENARIOS

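for scenario in READ_SCENARIOS:
    func = scenario['func']  # fail with a clear KeyError if the scenario has no function
    kwargs = scenario.get('kwargs') or {}  # tolerate scenarios without arguments
    func(**kwargs)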

Write

In [None]:
from utils.test_scenarios import WRITE_SCENARIOS

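for scenario in WRITE_SCENARIOS:
    func = scenario['func']  # fail with a clear KeyError if the scenario has no function
    kwargs = scenario.get('kwargs') or {}  # tolerate scenarios without arguments
    func(**kwargs)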
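
The 200 ms requirement can be checked by timing each scenario call and comparing the slowest run against the limit. The sketch below illustrates one way to do that. `time_call_ms`, `meets_requirement` and `MAX_RESPONSE_MS` are hypothetical names, and the scenario functions in `utils.test_scenarios` may already perform their own measurement.

```python
import time

MAX_RESPONSE_MS = 200  # maximum database response time from the requirements


def time_call_ms(func, repeats=10, **kwargs):
    """Call func(**kwargs) `repeats` times and return each duration in milliseconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(**kwargs)
        durations.append((time.perf_counter() - start) * 1000)
    return durations


def meets_requirement(durations_ms, limit_ms=MAX_RESPONSE_MS):
    """Return True if the slowest observed call stayed within the limit."""
    return max(durations_ms) <= limit_ms
```

With these helpers, a scenario could be checked as `meets_requirement(time_call_ms(func, **kwargs))`. Using the maximum is a strict reading of the requirement; a percentile would be a more lenient alternative.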

Conclusion

Mongo meets the stated requirements.

Shutdown

In [None]:
!docker-compose down -v