Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce non-throwing variants of hasToken #45341

Merged
merged 5 commits into from
Jan 24, 2023

Conversation

ltrk2
Copy link
Contributor

@ltrk2 ltrk2 commented Jan 16, 2023

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Introduce non-throwing variants of hasToken and hasTokenCaseInsensitive

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Since the hasToken function isn't explicitly documented in the first place, I'll provide some explanation below:

  • Motivation: Providing the means to handle queries potentially containing an illegal needle parameter by returning null instead of throwing an exception.
  • Parameters: haystack as first argument and needle as second, matching that of hasToken.
  • Example use:
:) select hasTokenOrNull('asdf', 'asdf.ghjk');
SELECT hasTokenOrNull('asdf', 'asdf.ghjk')

Query id: ba232cb4-ff65-432a-b95a-73ae7d76bd8b

┌─hasTokenOrNull('asdf', 'asdf.ghjk')─┐
│                                     ᴺᵁᴸᴸ │
└──────────────────────────────────────────┘

1 row in set. Elapsed: 0.007 sec.

@ltrk2 ltrk2 force-pushed the feature/non-throwing-hastoken branch from 4222aa7 to 0cbc7f9 Compare January 16, 2023 21:20
@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-feature Pull request with new product feature label Jan 16, 2023
@qoega qoega added the can be tested Allows running workflows for external contributors label Jan 17, 2023
@ltrk2 ltrk2 force-pushed the feature/non-throwing-hastoken branch 2 times, most recently from c4221b3 to 357d724 Compare January 17, 2023 13:35
@kssenii kssenii self-assigned this Jan 17, 2023
@ltrk2
Copy link
Contributor Author

ltrk2 commented Jan 18, 2023

@kssenii I've fixed any issues I could find based on the checks. Could you please help me with the failing ones if there are any further I should aim to resolve?

@kssenii
Copy link
Member

kssenii commented Jan 19, 2023

Please take a look at failed AST fuzzer (msan). There is a crash related to your feature:
query: SELECT max(id) FROM bloom_filter WHERE hasTokenOrNull(s, 'abc,def,zzz')

2023.01.17 21:22:15.571097 [ 435 ] {} <Fatal> BaseDaemon: ########################################
2023.01.17 21:22:15.571524 [ 435 ] {} <Fatal> BaseDaemon: (version 22.13.1.1, build id: 254E273247196B58062EDA865B0E30864839FB8F) (from thread 400) (query_id: 81f4f535-624f-4f50-9eb0-d5257c8883e2) (query: SELECT max(id) FROM bloom_filter WHERE hasTokenOrNull(s, 'abc,def,zzz')) Received signal sanitizer trap (-3)
2023.01.17 21:22:15.571677 [ 435 ] {} <Fatal> BaseDaemon: Sanitizer trap.
2023.01.17 21:22:15.571855 [ 435 ] {} <Fatal> BaseDaemon: Stack trace: 0x29852b99 0x2a141aba 0xc5fb656 0xc6101f3 0x48d86d90 0x4c99ada1 0x4c99404f 0x4c98be95 0x4c987bf9 0x4c93e4ac 0x4c940ae2 0x4c9352de 0x4c9c786c 0x4b876ec9 0x4b8e9618 0x4b8afb1d 0x4b8b594e 0x29b311ed 0x29b3f363 0x7efeb7f13609 0x7efeb7e38133
2023.01.17 21:22:15.657461 [ 435 ] {} <Fatal> BaseDaemon: 0.1. inlined from ./build_docker/../src/Common/StackTrace.cpp:334: StackTrace::tryCapture()
2023.01.17 21:22:15.657634 [ 435 ] {} <Fatal> BaseDaemon: 0. ./build_docker/../src/Common/StackTrace.cpp:295: StackTrace::StackTrace() @ 0x29852b99 in /workspace/clickhouse
2023.01.17 21:22:15.829271 [ 435 ] {} <Fatal> BaseDaemon: 1. ./build_docker/../src/Daemon/BaseDaemon.cpp:431: sanitizerDeathCallback() @ 0x2a141aba in /workspace/clickhouse
2023.01.17 21:22:20.820926 [ 435 ] {} <Fatal> BaseDaemon: 2. __sanitizer::Die() @ 0xc5fb656 in /workspace/clickhouse
2023.01.17 21:22:25.714648 [ 435 ] {} <Fatal> BaseDaemon: 3. ? @ 0xc6101f3 in /workspace/clickhouse
2023.01.17 21:22:25.757086 [ 435 ] {} <Fatal> BaseDaemon: 4. ./build_docker/../src/Columns/FilterDescription.cpp:0: DB::FilterDescription::FilterDescription(DB::IColumn const&) @ 0x48d86d90 in /workspace/clickhouse
2023.01.17 21:22:25.926994 [ 435 ] {} <Fatal> BaseDaemon: 5. ./build_docker/../src/Storages/MergeTree/MergeTreeRangeReader.h:0: DB::FilterWithCachedCount::FilterWithCachedCount(COW<DB::IColumn>::immutable_ptr<DB::IColumn> const&) @ 0x4c99ada1 in /workspace/clickhouse
2023.01.17 21:22:26.128123 [ 435 ] {} <Fatal> BaseDaemon: 6. ./build_docker/../src/Storages/MergeTree/MergeTreeRangeReader.cpp:1351: DB::MergeTreeRangeReader::executePrewhereActionsAndFilterColumns(DB::MergeTreeRangeReader::ReadResult&) const @ 0x4c99404f in /workspace/clickhouse
2023.01.17 21:22:26.267714 [ 435 ] {} <Fatal> BaseDaemon: 7. ./build_docker/../src/Storages/MergeTree/MergeTreeRangeReader.cpp:1040: DB::MergeTreeRangeReader::read(unsigned long, std::__1::deque<DB::MarkRange, std::__1::allocator<DB::MarkRange>>&) @ 0x4c98be95 in /workspace/clickhouse
2023.01.17 21:22:26.408454 [ 435 ] {} <Fatal> BaseDaemon: 8. ./build_docker/../src/Storages/MergeTree/MergeTreeRangeReader.cpp:0: DB::MergeTreeRangeReader::read(unsigned long, std::__1::deque<DB::MarkRange, std::__1::allocator<DB::MarkRange>>&) @ 0x4c987bf9 in /workspace/clickhouse
2023.01.17 21:22:26.697535 [ 435 ] {} <Fatal> BaseDaemon: 9. ./build_docker/../src/Storages/MergeTree/MergeTreeBaseSelectProcessor.cpp:404: DB::IMergeTreeSelectAlgorithm::readFromPartImpl() @ 0x4c93e4ac in /workspace/clickhouse
2023.01.17 21:22:26.989096 [ 435 ] {} <Fatal> BaseDaemon: 10. ./build_docker/../src/Storages/MergeTree/MergeTreeBaseSelectProcessor.cpp:447: DB::IMergeTreeSelectAlgorithm::readFromPart() @ 0x4c940ae2 in /workspace/clickhouse
2023.01.17 21:22:27.262179 [ 435 ] {} <Fatal> BaseDaemon: 11. ./build_docker/../src/Storages/MergeTree/MergeTreeBaseSelectProcessor.cpp:224: DB::IMergeTreeSelectAlgorithm::read() @ 0x4c9352de in /workspace/clickhouse
2023.01.17 21:22:27.353008 [ 435 ] {} <Fatal> BaseDaemon: 12.1. inlined from ./build_docker/../src/Storages/MergeTree/MergeTreeSource.cpp:172: DB::MergeTreeSource::reportProgress(DB::ChunkAndProgress)
2023.01.17 21:22:27.353131 [ 435 ] {} <Fatal> BaseDaemon: 12. ./build_docker/../src/Storages/MergeTree/MergeTreeSource.cpp:214: DB::MergeTreeSource::tryGenerate() @ 0x4c9c786c in /workspace/clickhouse
2023.01.17 21:22:27.437788 [ 435 ] {} <Fatal> BaseDaemon: 13.1. inlined from ./build_docker/../contrib/llvm-project/libcxx/include/optional:344: std::__1::__optional_storage_base<DB::Chunk, false>::has_value[abi:v15000]() const

Also Stateless tests (msan) [4/6] — Tests are not finished, fail: 1.
If you open run.log on top of the details page, you will see that is also crashed on query query: SELECT max(id) FROM bloom_filter WHERE hasTokenOrNull(s, 'abc,def,zzz');.

2023.01.17 15:57:19.238417 [ 15082 ] {} <Fatal> BaseDaemon: ########################################
2023.01.17 15:57:19.238768 [ 15082 ] {} <Fatal> BaseDaemon: (version 22.13.1.1, build id: 254E273247196B58062EDA865B0E30864839FB8F) (from thread 6669) (query_id: fdd2b753-6f7e-4725-b970-ade665ff115f) (query: SELECT max(id) FROM bloom_filter WHERE hasTokenOrNull(s, 'abc,def,zzz');) Received signal sanitizer trap (-3)
2023.01.17 15:57:19.239053 [ 15082 ] {} <Fatal> BaseDaemon: Sanitizer trap.
2023.01.17 15:57:19.239347 [ 15082 ] {} <Fatal> BaseDaemon: Stack trace: 0x29852b99 0x2a141aba 0xc5fb656 0xc6101f3 0x48d86d90 0x4c99ada1 0x4c99404f 0x4c98be95 0x4c987bf9 0x4c93e4ac 0x4c940ae2 0x4c9352de 0x4c9c786c 0x4b876ec9 0x4b8e9618 0x4b8afb1d 0x4b8b594e 0x29b311ed 0x29b3f363 0x7f6d2951c609 0x7f6d29441133
2023.01.17 15:57:19.431300 [ 15082 ] {} <Fatal> BaseDaemon: 0.1. inlined from ./build_docker/../src/Common/StackTrace.cpp:334: StackTrace::tryCapture()
2023.01.17 15:57:19.431745 [ 15082 ] {} <Fatal> BaseDaemon: 0. ./build_docker/../src/Common/StackTrace.cpp:295: StackTrace::StackTrace() @ 0x29852b99 in /usr/bin/clickhouse
2023.01.17 15:57:19.704777 [ 15082 ] {} <Fatal> BaseDaemon: 1. ./build_docker/../src/Daemon/BaseDaemon.cpp:431: sanitizerDeathCallback() @ 0x2a141aba in /usr/bin/clickhouse
2023.01.17 15:57:29.809132 [ 15082 ] {} <Fatal> BaseDaemon: 2. __sanitizer::Die() @ 0xc5fb656 in /usr/bin/clickhouse
2023.01.17 15:57:42.104789 [ 15082 ] {} <Fatal> BaseDaemon: 3. ? @ 0xc6101f3 in /usr/bin/clickhouse
2023.01.17 15:57:42.187785 [ 15082 ] {} <Fatal> BaseDaemon: 4. ./build_docker/../src/Columns/FilterDescription.cpp:0: DB::FilterDescription::FilterDescription(DB::IColumn const&) @ 0x48d86d90 in /usr/bin/clickhouse
2023.01.17 15:57:42.550536 [ 15082 ] {} <Fatal> BaseDaemon: 5. ./build_docker/../src/Storages/MergeTree/MergeTreeRangeReader.h:0: DB::FilterWithCachedCount::FilterWithCachedCount(COW<DB::IColumn>::immutable_ptr<DB::IColumn> const&) @ 0x4c99ada1 in /usr/bin/clickhouse
2023.01.17 15:57:43.051134 [ 15082 ] {} <Fatal> BaseDaemon: 6. ./build_docker/../src/Storages/MergeTree/MergeTreeRangeReader.cpp:1351: DB::MergeTreeRangeReader::executePrewhereActionsAndFilterColumns(DB::MergeTreeRangeReader::ReadResult&) const @ 0x4c99404f in /usr/bin/clickhouse
2023.01.17 15:57:43.320972 [ 15082 ] {} <Fatal> BaseDaemon: 7. ./build_docker/../src/Storages/MergeTree/MergeTreeRangeReader.cpp:1040: DB::MergeTreeRangeReader::read(unsigned long, std::__1::deque<DB::MarkRange, std::__1::allocator<DB::MarkRange>>&) @ 0x4c98be95 in /usr/bin/clickhouse
2023.01.17 15:57:43.682675 [ 15082 ] {} <Fatal> BaseDaemon: 8. ./build_docker/../src/Storages/MergeTree/MergeTreeRangeReader.cpp:0: DB::MergeTreeRangeReader::read(unsigned long, std::__1::deque<DB::MarkRange, std::__1::allocator<DB::MarkRange>>&) @ 0x4c987bf9 in /usr/bin/clickhouse
2023.01.17 15:57:44.286262 [ 15082 ] {} <Fatal> BaseDaemon: 9. ./build_docker/../src/Storages/MergeTree/MergeTreeBaseSelectProcessor.cpp:404: DB::IMergeTreeSelectAlgorithm::readFromPartImpl() @ 0x4c93e4ac in /usr/bin/clickhouse

You can also find server logs there.

@ltrk2 ltrk2 force-pushed the feature/non-throwing-hastoken branch from 0437704 to 0fc68e5 Compare January 19, 2023 13:50
@ltrk2 ltrk2 force-pushed the feature/non-throwing-hastoken branch from 0fc68e5 to 810c9ba Compare January 20, 2023 15:24
Copy link
Member

@rschu1ze rschu1ze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I saw that @kssenii was assigned already. Just took a look because I did some work in this area/file already.

src/Functions/FunctionsStringSearch.h Outdated Show resolved Hide resolved
src/Functions/HasTokenImpl.h Outdated Show resolved Hide resolved
src/Functions/HasTokenImpl.h Outdated Show resolved Hide resolved
src/Functions/HasTokenImpl.h Outdated Show resolved Hide resolved
src/Functions/HasTokenImpl.h Outdated Show resolved Hide resolved
@ltrk2 ltrk2 force-pushed the feature/non-throwing-hastoken branch from 06ed25e to 9710b86 Compare January 23, 2023 22:28
Copy link
Member

@rschu1ze rschu1ze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm.

Restarted some tests, they failed due to unrelated infra issues.

@rschu1ze rschu1ze self-assigned this Jan 24, 2023
@ltrk2
Copy link
Contributor Author

ltrk2 commented Jan 24, 2023

@kssenii @rschu1ze thank you to both of you for taking the time to review my changes and make suggestions for improvement!

@kssenii
Copy link
Member

kssenii commented Jan 24, 2023

Integration tests failures not related - they are broken in master

if (i < res.size())
memset(&res[i], negate, (res.size() - i) * sizeof(res[0]));
}
catch (...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rschu1ze @kssenii

I'm surprised by this code.
It is obviously incorrect, it is slow, and record errors in the system.errors table.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is the only problem with the code, then I have a correction: #51425

Copy link
Contributor Author

@ltrk2 ltrk2 Jun 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexey-milovidov the range you highlighted is mostly unchanged. Or is it the try-catch as @rschu1ze is suggesting? If you could provide some way of reproducing potential issues with it I'd be happy to contribute an improvement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, catch (...) is the problem.

alexey-milovidov added a commit that referenced this pull request Jun 25, 2023
…-hastoken"

This reverts commit dd18dd2, reversing
changes made to 4bd3f0e.
@alexey-milovidov alexey-milovidov mentioned this pull request Jun 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
can be tested Allows running workflows for external contributors pr-feature Pull request with new product feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants