Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

log decimal TID on linux #2973

Closed
wants to merge 1 commit into from
Closed

log decimal TID on linux #2973

wants to merge 1 commit into from

Conversation

ajkr
Copy link
Contributor

@ajkr ajkr commented Oct 5, 2017

identify the thread using the output of gettid() syscall on Linux, which is a system-wide unique ID, unlike pthread_self(). Also changed from hex to decimal to be compatible with tools like top.

Test Plan:

  • make check -j64
  • run db_bench and correlate top entries with log entries
$ top -H
...
782508 andrewkr  20   0  319272  47864   8884 R 99.7  0.0   0:04.18  4 db_bench
...
$ grep '782508' /dev/shm/dbbench/LOG | head -1
2017/10/05-13:23:17.517942 782508 [db/db_impl_write.cc:1162] [default] New memtable created with log file: #6. Immutable memtables: 0.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ajkr
Copy link
Contributor Author

ajkr commented Oct 6, 2017

the travis failure is in DBTest.SparseMerge, it doesn't look related.

@ajkr ajkr removed their assignment Jan 23, 2018
@ajkr
Copy link
Contributor Author

ajkr commented Jan 23, 2018

Do we still want this? I wrote it because someone internal requested it but didn't really see the point. It needs to be benchmarked if we're going to check it in.

@siying
Copy link
Contributor

siying commented Sep 9, 2019

@ajkr it does seem to be a good diff in general. What benchmark do you mean?

@ajkr
Copy link
Contributor Author

ajkr commented Sep 10, 2019

I don't quite remember. Maybe any workload that writes a lot of log messages. We can make sure the syscall we're using doesn't consume too much CPU compared to the library call we used before. Let me know if you want this PR and I can rebase.

@siying
Copy link
Contributor

siying commented Sep 10, 2019

It sounds like a good one. I'm more concerned about how to close it:) Either closing with merging or not. I don't have any concern about merging it.

@anand1976
Copy link
Contributor

This seems like a good thing to have form a debugging point of view. Since @siying has no problem either, @ajkr can you rebase?

@ajkr
Copy link
Contributor Author

ajkr commented Nov 12, 2021

Do we still want this? I wrote it because someone internal requested it but didn't really see the point. It needs to be benchmarked if we're going to check it in.

Today I finally saw the point after trying to figure out what thread pool was running excessive parallel compactions. Will resurrect this.

ajkr added a commit to ajkr/rocksdb that referenced this pull request Nov 12, 2021
This makes it easier to debug with tools like `ps`. The change only
applies to builds with glibc 2.30+ and _GNU_SOURCE extensions enabled.
We could adopt it in more cases by using the syscall but this is enough
for our build.

Replaces facebook#2973.

Test Plan: will run some benchmarks and correlate with `ps`. Will also
test with ROCKSDB_NO_FBCODE to verify the fallback behavior.
@ajkr ajkr closed this Nov 12, 2021
facebook-github-bot pushed a commit that referenced this pull request Nov 13, 2021
Summary:
This makes it easier to debug with tools like `ps`. The change only
applies to builds with glibc 2.30+ and _GNU_SOURCE extensions enabled.
We could adopt it in more cases by using the syscall but this is enough
for our build.

Replaces #2973.

Pull Request resolved: #9164

Test Plan:
- ran some benchmarks and correlated logged thread IDs with those shown by `ps -L`.
- verified no noticeable regression in throughput for log heavy (more than 700k log lines and over 5k / second) scenario.

Benchmark command:

```
$ TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=filluniquerandom -compression_type=none -max_bytes_for_level_multiplier=2 -write_buffer_size=262144 -num_levels=7 -max_bytes_for_level_base=2097152 -target_file_size_base=524288 -level_compaction_dynamic_level_bytes=true -max_background_jobs=12 -num=20000000
```

Results before: 15.9MB/s, 15.8MB/s, 16.0MB/s
Results after: 16.3MB/s, 16.3MB/s, 15.8MB/s

- Rely on CI to test the fallback behavior

Reviewed By: riversand963

Differential Revision: D32399660

Pulled By: ajkr

fbshipit-source-id: c24d44fdf7782faa616ef0a0964eaca3539d9c24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants