
DB ram drive work all over including Windows and Mac #9

Closed · goldbergyoni opened this issue Oct 1, 2020 · 11 comments · Fixed by #17

@goldbergyoni (Contributor)

No description provided.

@mikicho changed the title from "Mac DB ram drive work all over including Windows and Mac" to "DB ram drive work all over including Windows and Mac" on Oct 16, 2020
@mikicho (Contributor) commented Oct 16, 2020

I created a benchmark to validate this recommendation.

Postgres:

  1. RAM is roughly 3x faster than disk.
  2. The benchmark is a simple transaction of five SELECT, INSERT, and UPDATE statements.
  3. tps = transactions per second; higher is better.

Steps to run the benchmark (WSL2, Linux):

# On Disk:
docker run -it --rm -e POSTGRES_PASSWORD=secret postgres:13-alpine sh
# On Ram:
docker run -it --rm -e POSTGRES_PASSWORD=secret --tmpfs /var/lib/postgresql/data postgres:13-alpine sh

# Inside the container: start Postgres in the background and check it accepts connections
docker-entrypoint.sh postgres &
pg_isready -U postgres
# Initialize pgbench tables (scale factor 100), then run the benchmark for 240 seconds
pgbench -U postgres -i -s 100
pgbench -U postgres -T 240
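
It may be worth verifying that the mount actually took effect; a small sanity check I'd add (not part of the original steps):

# In the RAM variant, the Filesystem column should read "tmpfs"
df -h /var/lib/postgresql/data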

Results (higher is better):

  • Windows 10, WSL 2
  • 16 GB (12.45 available to the container)
  • Intel Core i5-6400 @ 2.7GHz

On Disk:

/ # pgbench -U postgres -T 240
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
duration: 240 s
number of transactions actually processed: 34755
latency average = 3.453 ms
tps = 289.620594 (including connections establishing)
tps = 289.625784 (excluding connections establishing)

On Ram:
(on Windows or Linux; Docker for Mac doesn't support the --tmpfs option)
/ # pgbench -U postgres -T 240
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
duration: 240 s
number of transactions actually processed: 101654
latency average = 1.180 ms
tps = 847.112136 (including connections establishing)
tps = 847.125445 (excluding connections establishing)

MySQL

  1. RAM is roughly 4-5 times faster than disk. The MEMORY engine performs poorly, probably because it isn't built for write-intensive scenarios.
  2. The benchmark is sysbench's oltp_read_write workload.
  3. tps = transactions per second; higher is better.

Steps to run the benchmark (WSL2, Linux):

# On Disk:
docker run --rm -it -e MYSQL_ROOT_PASSWORD=secret mysql:8 bash
# On Ram:
docker run --rm -it -e MYSQL_ROOT_PASSWORD=secret --tmpfs /var/lib/mysql mysql:8 bash

apt-get update
apt-get -qq -y install curl
curl -s https://packagecloud.io/install/repositories/akopytov/sysbench/script.deb.sh | bash
apt -y install sysbench
# You can commit the container at this stage: docker commit the-container-id mysql_bench
# Inside the container: start MySQL in the background and check it responds
./entrypoint.sh mysqld --default-authentication-plugin=mysql_native_password &
mysqladmin -psecret ping
# Create the sysbench database; raise max_heap_table_size to 2 GB so the
# MEMORY-engine tables used below fit in memory
mysql -psecret -e "create database sbtest; SET GLOBAL max_heap_table_size = 1024 * 1024 * 1024 * 2;"
sysbench \
  --db-driver=mysql \
  --mysql-user=root \
  --mysql-password=secret \
  --tables=16 \
  --table-size=100000 \
  --time=240 \
  /usr/share/sysbench/oltp_read_write.lua prepare
sysbench \
  --db-driver=mysql \
  --mysql-user=root \
  --mysql-password=secret \
  --tables=16 \
  --table-size=100000 \
  --time=240 \
  /usr/share/sysbench/oltp_read_write.lua run

To run with Memory Engine:

sysbench \
  --db-driver=mysql \
  --mysql-user=root \
  --mysql-password=secret \
  --mysql_storage_engine=heap \
  --tables=16 \
  --table-size=100000 \
  --time=240 \
  /usr/share/sysbench/oltp_read_write.lua prepare

sysbench \
  --db-driver=mysql \
  --mysql-user=root \
  --mysql-password=secret \
  --mysql_storage_engine=heap \
  --tables=16 \
  --table-size=100000 \
  --time=240 \
  /usr/share/sysbench/oltp_read_write.lua run

Results (higher is better):

  • Windows 10, WSL 2
  • 16 GB (12.45 available to the container)
  • Intel Core i5-6400 @ 2.7GHz

On Disk:

SQL statistics:
    queries performed:
        read:                            164808
        write:                           47088
        other:                           23544
        total:                           235440
    transactions:                        11772  (49.05 per sec.)
    queries:                             235440 (980.97 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          240.0061s
    total number of events:              11772

Latency (ms):
         min:                                    7.64
         avg:                                   20.38
         max:                                  247.43
         95th percentile:                       34.33
         sum:                               239953.27

Threads fairness:
    events (avg/stddev):           11772.0000/0.00
    execution time (avg/stddev):   239.9533/0.00

Memory Engine:

SQL statistics:
    queries performed:
        read:                            54222
        write:                           15492
        other:                           7746
        total:                           77460
    transactions:                        3873   (16.13 per sec.)
    queries:                             77460  (322.67 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          240.0558s
    total number of events:              3873

Latency (ms):
         min:                                   51.10
         avg:                                   61.98
         max:                                  136.51
         95th percentile:                       69.29
         sum:                               240038.79

Threads fairness:
    events (avg/stddev):           3873.0000/0.00
    execution time (avg/stddev):   240.0388/0.00

On Ram:
(on Windows or Linux; Docker for Mac doesn't support the --tmpfs option)

SQL statistics:
    queries performed:
        read:                            776020
        write:                           221720
        other:                           110860
        total:                           1108600
    transactions:                        55430  (230.95 per sec.)
    queries:                             1108600 (4619.05 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          240.0044s
    total number of events:              55430

Latency (ms):
         min:                                    2.72
         avg:                                    4.33
         max:                                   72.76
         95th percentile:                        5.37
         sum:                               239833.33

Threads fairness:
    events (avg/stddev):           55430.0000/0.00
    execution time (avg/stddev):   239.8333/0.00

@mikicho (Contributor) commented Oct 17, 2020

Another interesting thing I learned: there are two types of "on-RAM" file systems in Linux, tmpfs and ramfs.
The difference that is relevant for us is that tmpfs may use swap, so users are less likely to hit OOM errors, but performance degrades when swapping kicks in.
https://www.jamescoyle.net/knowledge/951-the-difference-between-a-tmpfs-and-ramfs-ram-disk
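
For reference, a minimal sketch of mounting each (mount points are hypothetical; run as root):

# tmpfs: size-capped; pages can be swapped out under memory pressure
mkdir -p /mnt/tmpfs-disk && mount -t tmpfs -o size=2g tmpfs /mnt/tmpfs-disk
# ramfs: never swapped, but no size limit is enforced (can grow until OOM)
mkdir -p /mnt/ramfs-disk && mount -t ramfs ramfs /mnt/ramfs-disk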

@mikicho (Contributor) commented Oct 17, 2020

@goldbergyoni @jhenaoz @Thormod
I get an illogical result when running the Postgres-on-RAM benchmark on Mac.

This is what I tried:

  1. diskutil partitionDisk $(hdiutil attach -nomount ram://10240000) 1 GPTFormat APFS 'ramdisk' '100%'
  2. Enable file sharing on /Volumes/ramdisk (on docker edge)

And then:

docker run -it --rm -e POSTGRES_PASSWORD=secret -v /Volumes/ramdisk/data:/var/lib/postgresql/data postgres:13-alpine sh
docker-entrypoint.sh postgres &
pg_isready -U postgres
pgbench -U postgres -i -s 100
pgbench -U postgres -T 240
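
For anyone reproducing this: ram://10240000 allocates 10,240,000 512-byte sectors, i.e. about 5 GB. When done, the RAM disk can be released (a cleanup step I'd add, not part of the original steps):

# Unmount and free the RAM disk
hdiutil detach /Volumes/ramdisk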

But it's slower than running it on disk.
Do any of you have experience with this?

@goldbergyoni (Contributor, Author) commented Oct 23, 2020

These benchmarks are gold. Once we go live and show these stats (in nice graphs, maybe even a video), it will greatly help in building trust with the audience. Who doesn't want tests that run 3x faster? :)

Core thoughts:

  1. Showing these improvements in a typical cloud CI and on Mac is really important. That should be our focus.
  2. Let's show this with real tests and docker-compose, as this is the real product. We can create a parameterized test (test.each) with 10K records, including a DB insert and get in every test, and then run these tests with this in-memory setup. This will show the value over real tests and also allow running them in cloud CI (a docker-compose sketch follows below). Note that some CI vendors like CircleCI require running docker-compose in machine executors; @Thormod can share an example here or maybe even set up CircleCI for us. I can create the parameterized test, or @mikicho can (if you want to practice this).

My main point here - This great value should be presented in the environment where 90% of the tests run - CI and Mac
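
A minimal sketch of what the docker-compose side could look like (service name and port mapping are my assumptions, not from this thread):

# Hypothetical compose file with the data directory on RAM, same idea as --tmpfs
cat > docker-compose.yml <<'EOF'
version: '3'
services:
  db:
    image: postgres:13-alpine
    environment:
      POSTGRES_PASSWORD: secret
    tmpfs:
      - /var/lib/postgresql/data
    ports:
      - '5432:5432'
EOF
docker-compose up -d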

  3. About the Mac issue: let's nail it. I just learned about pgbench, and my results are consistent with yours 😢. Let's learn more; we may also ask/investigate the Java project Testcontainers, which is a very established one.

My results:
pgbench -U postgres -i -s 1
pgbench -U postgres -T 20 -j 3 -c 3

In-mem
tps = 1280.949380 (including connections establishing)
tps = 1281.958821 (excluding connections establishing)

Disk
tps = 1644.570585 (including connections establishing)
tps = 1644.833938 (excluding connections establishing)

  4. Let's include the other optimizations in our benchmark, not only the disk, like fsync=off (a sketch follows below). There are other great ideas here.
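
A minimal sketch of running Postgres with durability settings relaxed (the exact flag set is my suggestion, not from this thread; only safe for throwaway test databases):

docker run -it --rm -e POSTGRES_PASSWORD=secret postgres:13-alpine \
  postgres -c fsync=off -c synchronous_commit=off -c full_page_writes=off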

@mikicho (Contributor) commented Oct 23, 2020

Added MySQL benchmark results.

@goldbergyoni Agree with all your thoughts.
I'll try to investigate the Mac issue further.

@mikicho (Contributor) commented Oct 24, 2020

It seems like we set up the ramdisk appropriately.
I thought it would be way faster, though :\

On Disk: (screenshot)

On Ram: (screenshot)

@goldbergyoni @Thormod @jhenaoz I think my lack of RAM makes my benchmarks on Mac inaccurate :(
Please try both benchmarks (Postgres, MySQL) on Mac and see if you get more reasonable results.

@goldbergyoni (Contributor, Author)

I'm also checking on Mac. Anyway, 3x faster sounds sensible given that modern SSDs are super fast. In any case, I think we have enough to show some solid data. I suggest starting by showing the performance baseline and then improving upon it. We can show that just like this, without advanced optimizations, 100 tests run in 5 seconds (hopefully), which by itself is a great argument and proves the setup is interesting (e.g., leave the docker-compose up during dev). Practically, I would do the following:

  1. Create a folder for performance under 'various-receipes'
  2. See the folder 'various-receipes/authentication', which reuses and calls the main application. It shows how not to copy everything but rather rely on the existing API
  3. In your new recipe, create a JSON with data for 100 orders; with copy-paste it should take about a minute
  4. Create a parameterized test (test.each), loop over this JSON, call the API, and ensure you get back 200
  5. Run it 10 times manually, even in watch mode. Maybe the performance is good enough? Run it also over CI; we will have one soon, others might take care of this
  6. Try different optimizations; interesting things might come up. For example: tmpfs, running tests in a single process (--runInBand), running with 2, 4, or 8 workers (--maxWorkers=), optimizing the DB with various flags, a RAM drive, etc. (see the sketch at the end of this comment). Yoni can create nice charts based on this

If you want me to do some of these tasks, I'll be glad to help. Thoughts?
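
For item 6, the worker-count runs could look like this (assuming Jest, which the test.each idea implies):

npx jest --runInBand    # everything in a single process
npx jest --maxWorkers=2
npx jest --maxWorkers=4
npx jest --maxWorkers=8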

@mikicho (Contributor) commented Nov 1, 2020

I did as you suggested.
See this PR's summary.

@mikicho mikicho linked a pull request Nov 27, 2020 that will close this issue
@mikicho mikicho closed this as completed Nov 27, 2020
@goldbergyoni (Contributor, Author)

@mikicho I would move the great analysis above (with the numbers) to the performance recipe.

@rluvaton (Collaborator)

Tested on my Mac:

Disk: (DiskSpeedTest screenshot)

RAM: (DiskSpeedTest screenshot)

@mikicho (Contributor) commented Jan 15, 2023

@rluvaton Interesting. Please add your Mac and OS versions.
