zio performance #3063

Closed
wants to merge 2 commits

Conversation

behlendorf
Contributor

Two patches designed to bring ZoL's zio create/destroy code back in sync with Illumos/FreeBSD. They also act as a starting point for further optimization. Profiling clearly shows that a considerable amount of time is being spent in zio_create() which can limit throughput on high-end systems.

This profiling also revealed significant contention in spa_dispatch_ent() during a heavy write workload. One possible fix for this is to increase the number of write taskqs to reduce the contention, and a patch is provided which does this. However, other possible solutions should be explored.
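
As a rough illustration of the taskq knob mentioned above (a hedged sketch only; the actual values in the patch may differ), the per-type layout comes from the zio_taskqs[] table in module/zfs/spa.c. Splitting the single batch write-issue taskq into several smaller ones spreads dispatch across more taskq locks:

/* Hedged sketch, not the actual patch: the WRITE row of zio_taskqs[].
 * ZTI_BATCH uses a single dynamically sized taskq, so every write
 * issue contends on its lock.  ZTI_P(t, n) creates t taskqs of n
 * threads each, spreading dispatch across t locks instead of one.
 */
	/* ISSUE	ISSUE_HIGH	INTR		INTR_HIGH */
	{ ZTI_BATCH,	ZTI_N(5),	ZTI_N(8),	ZTI_N(5) },	/* WRITE, before */
	{ ZTI_P(4, 8),	ZTI_N(5),	ZTI_N(8),	ZTI_N(5) },	/* WRITE, illustrative after */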

NOTE: These changes are based on the currently unmerged large block feature branch.

@sempervictus sempervictus mentioned this pull request Feb 5, 2015
@behlendorf behlendorf added this to the 0.6.5 milestone Feb 6, 2015
Long ago the zio_bulk_flags module parameter was introduced to
facilitate debugging and profiling the zio_buf_caches.  Today
this code works well and there's no compelling reason to keep
this functionality.  In fact it's preferable to revert this so
the code is more consistent with other ZFS implementations.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
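
For context, a hedged sketch of what the parameter did (illustrative only; the removed code may differ in detail): its value was OR'd into the flags passed to kmem_cache_create() for the per-size zio buffer caches built in zio_init(), so a debugging or profiling flag could be forced onto every cache from the command line.

/* Hedged sketch, not the actual removed code: inside the zio_init()
 * loop that creates one kmem cache per buffer size, the module
 * parameter was simply folded into the cache-creation flags.
 */
zio_buf_cache[c] = kmem_cache_create(name, size, align,
    NULL, NULL, NULL, NULL, NULL, cflags | zio_bulk_flags);
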
The zio_cons() constructor and zio_dest() destructor don't exist
in the upstream Illumos code.  They were introduced as a workaround
to avoid issue openzfs#2523.  Since this issue has now been resolved this
code is being reverted to bring ZoL back in sync with Illumos.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
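
For readers following along, a minimal sketch of the upstream pattern this revert returns to (illustrative, not the actual diff): Illumos creates zio_cache without a constructor or destructor and does the per-zio initialization inline in zio_create():

/* Hedged sketch of the upstream pattern, not the actual diff. */

/* zio_init(): no constructor or destructor is registered. */
zio_cache = kmem_cache_create("zio_cache", sizeof (zio_t), 0,
    NULL, NULL, NULL, NULL, NULL, 0);

/* zio_create(): each zio is zeroed and initialized on allocation. */
zio = kmem_cache_alloc(zio_cache, KM_SLEEP);
bzero(zio, sizeof (zio_t));
mutex_init(&zio->io_lock, NULL, MUTEX_DEFAULT, NULL);
cv_init(&zio->io_cv, NULL, CV_DEFAULT, NULL);
list_create(&zio->io_parent_list, sizeof (zio_link_t),
    offsetof(zio_link_t, zl_parent_node));
list_create(&zio->io_child_list, sizeof (zio_link_t),
    offsetof(zio_link_t, zl_child_node));
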
@sempervictus
Contributor

EDIT: Never mind, apparently I needed to rebase the patch stack from master; something was force-pushed, I guess. Apologies, I will test more thoroughly...

@behlendorf
Contributor Author

@sempervictus This used to be based on the large block patches but I recently rebased it on master because they're clearly unrelated changes.

@sempervictus
Contributor

Just looked over basic benchmarks from my own host from ~10 months ago, and the write improvement is significant.
Same dataset, same disk, same ZFS options...

0.6.2 with some patches, including scatter/gather ARC buffers. Note the read speed and the CPU ticks required for that (this is a 10-core Xeon):

Tiotest results for 8 concurrent io threads:
,----------------------------------------------------------------------.
| Item                  | Time     | Rate         | Usr CPU  | Sys CPU |
+-----------------------+----------+--------------+----------+---------+
| Write        1600 MBs |    5.6 s | 285.495 MB/s | 257.0 %  | 2664.3 % |
| Random Write   31 MBs |    0.2 s | 185.260 MB/s | 241.9 %  | 2066.0 % |
| Read         1600 MBs |    0.5 s | 3229.961 MB/s | 5411.6 %  | 814.4 % |
| Random Read    31 MBs |    0.0 s | 3176.459 MB/s | 5448.3 %  | 731.9 % |
`----------------------------------------------------------------------'
Tiotest latency results:
,-------------------------------------------------------------------------.
| Item         | Average latency | Maximum latency | % >2 sec | % >10 sec |
+--------------+-----------------+-----------------+----------+-----------+
| Write        |        0.044 ms |        5.911 ms |  0.00000 |   0.00000 |
| Random Write |        0.052 ms |        0.164 ms |  0.00000 |   0.00000 |
| Read         |        0.005 ms |        0.106 ms |  0.00000 |   0.00000 |
| Random Read  |        0.005 ms |        0.051 ms |  0.00000 |   0.00000 |
|--------------+-----------------+-----------------+----------+-----------|
| Total        |        0.025 ms |        5.911 ms |  0.00000 |   0.00000 |
`--------------+-----------------+-----------------+----------+-----------'

This is today with this stack applied:

Tiotest results for 8 concurrent io threads:
,----------------------------------------------------------------------.
| Item                  | Time     | Rate         | Usr CPU  | Sys CPU |
+-----------------------+----------+--------------+----------+---------+
| Write        1600 MBs |    1.7 s | 965.203 MB/s | 1004.4 %  | 2463.9 % |
| Random Write   31 MBs |    0.2 s | 162.832 MB/s | 149.5 %  | 1446.5 % |
| Read         1600 MBs |    1.1 s | 1440.195 MB/s | 919.6 %  | 4202.6 % |
| Random Read    31 MBs |    0.0 s | 1413.004 MB/s | 1008.3 %  | 4123.7 % |
`----------------------------------------------------------------------'
Tiotest latency results:
,-------------------------------------------------------------------------.
| Item         | Average latency | Maximum latency | % >2 sec | % >10 sec |
+--------------+-----------------+-----------------+----------+-----------+
| Write        |        0.018 ms |        2.618 ms |  0.00000 |   0.00000 |
| Random Write |        0.043 ms |        3.857 ms |  0.00000 |   0.00000 |
| Read         |        0.020 ms |        0.184 ms |  0.00000 |   0.00000 |
| Random Read  |        0.021 ms |        0.152 ms |  0.00000 |   0.00000 |
|--------------+-----------------+-----------------+----------+-----------|
| Total        |        0.020 ms |        3.857 ms |  0.00000 |   0.00000 |
`--------------+-----------------+-----------------+----------+-----------'

Linear and random reads appear to be bound somewhere, probably in the ARC read path (no SG patch applied). Random writes may well have suffered from the pool going from 15% to 65% capacity and 49% fragmentation. Whatever the case, and however many factors this basic benchmark involves, with LZ4 enabled it wrote 1.6G in about 2s, which is what makes users happy.

@kernelOfTruth
Contributor

Nice!

Note the significant increase in "Sys CPU" for Read and Random Read; it might be due to the fuller pool, but it is perhaps still something to look into.

So the up-to-date results are without the "scatter/gather ARC buffers" (#2129), if I understood correctly?

@sempervictus
Contributor

@kernelOfTruth: Yes, #2129 (ABD) was in use during the initial test. @ryao has mentioned work on scatter/gather buffers which don't require the data to be copied between different structures, with potentially wider application than ARC alone.

@behlendorf
Contributor Author

@sempervictus There have been a lot of changes between 0.6.2 and here but it's good to see that things are moving in the right direction. I'm personally optimistic that the recent ARC performance work done for Illumos and the ABD+SG changes should help with those read numbers.

@behlendorf behlendorf modified the milestones: 0.6.4, 0.6.5 Feb 9, 2015
behlendorf added a commit to behlendorf/zfs that referenced this pull request Feb 11, 2015
Long ago the zio_bulk_flags module parameter was introduced to
facilitate debugging and profiling the zio_buf_caches.  Today
this code works well and there's no compelling reason to keep
this functionality.  In fact it's preferable to revert this so
the code is more consistent with other ZFS implementations.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Issue openzfs#3063
behlendorf added a commit to behlendorf/zfs that referenced this pull request Feb 11, 2015
The zio_cons() constructor and zio_dest() destructor don't exist
in the upstream Illumos code.  They were introduced as a workaround
to avoid issue openzfs#2523.  Since this issue has now been resolved this
code is being reverted to bring ZoL back in sync with Illumos.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Issue openzfs#3063
@behlendorf
Contributor Author

@nedbass Thanks for the review. I've updated the commit message and merged this small bit of cleanup as:

3941503 Retire zio_cons()/zio_dest()
6442f3c Retire zio_bulk_flags

@behlendorf behlendorf closed this Feb 11, 2015
DeHackEd pushed a commit to DeHackEd/zfs that referenced this pull request Apr 4, 2015
Long ago the zio_bulk_flags module parameter was introduced to
facilitate debugging and profiling the zio_buf_caches.  Today
this code works well and there's no compelling reason to keep
this functionality.  In fact it's preferable to revert this so
the code is more consistent with other ZFS implementations.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Issue openzfs#3063
DeHackEd pushed a commit to DeHackEd/zfs that referenced this pull request Apr 5, 2015
Long ago the zio_bulk_flags module parameter was introduced to
facilitate debugging and profiling the zio_buf_caches.  Today
this code works well and there's no compelling reason to keep
this functionality.  In fact it's preferable to revert this so
the code is more consistent with other ZFS implementations.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Issue openzfs#3063
@behlendorf behlendorf deleted the zio-performance branch April 19, 2021 19:38