[CH-34239] HTTP User-Agent header #34330
Added HTTP User-Agent Header to HTTP requests. User-Agent: ClickHouse/VERSION_STRING
Can you add a test, please?
src/IO/ReadWriteBufferFromHTTP.h
@@ -276,6 +277,8 @@ namespace detail
"0 < http_retry_initial_backoff_ms < settings.http_retry_max_backoff_ms (now 0 < {} < {})",
settings.http_max_tries, settings.http_retry_initial_backoff_ms, settings.http_retry_max_backoff_ms);

http_header_entries.emplace_back(std::make_pair("User-Agent", fmt::format("ClickHouse/{}", VERSION_STRING)));
What if User-Agent is already specified in the headers?
Good point. The most efficient way to check this would be a linear traversal of the vector, which is O(n). Loading everything into a map would use extra memory and not improve efficiency.
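To make the linear check concrete, here is a minimal sketch of the idea. It is written in Python for brevity, while the PR's actual change is C++ in `ReadWriteBufferFromHTTP.h`. The helper name `with_default_user_agent` and the case-insensitive comparison are assumptions for illustration, not code from the PR.

```python
# Hypothetical helper illustrating the check discussed above; not the PR's code.
def with_default_user_agent(headers, version):
    """Return a copy of headers, adding a ClickHouse User-Agent only if none is present."""
    # One pass over the list is O(n); HTTP header names are case-insensitive,
    # so compare in lower case.
    for name, _ in headers:
        if name.lower() == "user-agent":
            return list(headers)  # a caller-supplied User-Agent wins
    return list(headers) + [("User-Agent", f"ClickHouse/{version}")]


print(with_default_user_agent([("Accept", "*/*")], "22.2.1"))
print(with_default_user_agent([("user-agent", "custom/1.0")], "22.2.1"))
```

Header lists here are short, so a single scan at construction time should cost very little compared with the request itself.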
Will do. I have not figured out the functional tests yet, but I will look into them.
Input header vectors could potentially contain User-Agent. If so, do not set another.
I'm not sure, but probably https://github.com/ClickHouse/ClickHouse/blob/master/tests/queries/0_stateless/00646_url_engine.python and https://github.com/ClickHouse/ClickHouse/blob/master/tests/queries/0_stateless/01854_HTTP_dict_decompression.python can help you (they are related to the URL table function and engine). You can check the headers that were sent to the server.
@FArthur-cmd thanks for the info. I have not used Python in a while and the
This routine makes the HTTP request to the ClickHouse server and collects the response:
This routine then splits the response and checks for correct response against the expected:
My suggestion would be to modify the two functions below:

    def get_ch_answer(query):
        host = CLICKHOUSE_HOST
        if IS_IPV6:
            host = f'[{host}]'
        url = os.environ.get('CLICKHOUSE_URL', 'http://{host}:{port}'.format(host=CLICKHOUSE_HOST, port=CLICKHOUSE_PORT_HTTP))
        return urllib.request.urlopen(url, data=query.encode())

The return value has been changed.

    def check_answers(query, answer):
        ch_answer = get_ch_answer(query)
        if ch_answer.read().decode().strip() != answer.strip():
            print("FAIL on query:", query, file=sys.stderr)
            print("Expected answer:", answer, file=sys.stderr)
            print("Fetched answer :", ch_answer, file=sys.stderr)
            raise Exception("Fail on query")
        headers = ch_answer.getheaders()
        found = [(key, value) for key, value in headers if key == 'User-Agent' and value.startswith('ClickHouse/')]
        if not found:
            raise Exception("Fail on query. Missing User-Agent")
Another possibility is to check on each request to the server:

    def get_ch_answer(query):
        host = CLICKHOUSE_HOST
        if IS_IPV6:
            host = f'[{host}]'
        url = os.environ.get('CLICKHOUSE_URL', 'http://{host}:{port}'.format(host=CLICKHOUSE_HOST, port=CLICKHOUSE_PORT_HTTP))
        request = urllib.request.urlopen(url, data=query.encode())
        headers = request.getheaders()
        found = [(key, value) for key, value in headers if key == 'User-Agent' and value.startswith('ClickHouse/')]
        if not found:
            raise Exception("Fail: Missing User-Agent header")
        return request.read().decode()

There are calls being made to create and drop test tables in the tests using the
What are your thoughts?
AFAIK, this won't work. You are right about requests to the ClickHouse server, but that request differs from the problem you are solving. You want to use the URL function to get information from other servers into ClickHouse, so you can make an additional server that will receive requests from ClickHouse. In these tests processors like So, I suggest changing the process and the test itself.
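A rough sketch of what such an additional server could look like, assuming Python's standard `http.server` module. The names `RecordingHandler`, `received_user_agents`, and `fetch_with_user_agent` are hypothetical. Here `urllib` stands in for ClickHouse as the client. In a real test the request would come from a `url(...)` query instead.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received_user_agents = []  # filled in by the handler below


class RecordingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record the header the client sent, then answer with a tiny body.
        received_user_agents.append(self.headers.get("User-Agent"))
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def fetch_with_user_agent(user_agent):
    """Start a throwaway server, send one GET with the given User-Agent, return the body."""
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), RecordingHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = f"http://127.0.0.1:{server.server_address[1]}/"
        request = urllib.request.Request(url, headers={"User-Agent": user_agent})
        # An empty ProxyHandler keeps the request local even if proxy env vars are set.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request) as response:
            return response.read().decode()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    fetch_with_user_agent("ClickHouse/22.2.1")
    print(received_user_agents[-1])
```

The point of this shape is that the assertion runs on the receiving side: the test inspects the request headers the server saw, not the response headers returned by ClickHouse.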
Thank you @FArthur-cmd for all the information and sorry for the delay. I had to spend some time today learning about Python's
Added stateless functional test 02205_HTTP_user_agent. Sends URL request staying true to the reported error example.
If you have problems with code style, you can run utils/check-style/check-style to check it locally.
Co-authored-by: Filatenkov Artur <58165623+FArthur-cmd@users.noreply.github.com>
Un-commented line required for test runner.
Looks like my test is failing. It passes when I run the Python script locally. I will look into it.
2022-02-08 20:06:59 02205_HTTP_user_agent: [ FAIL ] - return code: 1, result:
2022-02-08 20:06:59
2022-02-08 20:06:59
2022-02-08 20:06:59
2022-02-08 20:06:59 stdout:
2022-02-08 20:06:59
2022-02-08 20:06:59
Added noop logger in <HttpProcessor> to suppress <stderr> output logging on inbound connections. Pipe to </dev/null> in shell script no longer needed.
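For readers unfamiliar with the mechanism, a no-op logger of this kind can be sketched as below. This assumes the handler derives from the standard library's `BaseHTTPRequestHandler`, whose default `log_message` writes one line per request to `sys.stderr`. `QuietHandler` is a hypothetical name, not the class from the test.

```python
from http.server import BaseHTTPRequestHandler


class QuietHandler(BaseHTTPRequestHandler):
    """Request handler that produces no per-request log output."""

    def log_message(self, format, *args):
        # The base class writes to sys.stderr here; doing nothing keeps the
        # test's stderr empty, so no shell redirection is needed.
        pass
```

Since `log_request` and `log_error` both go through `log_message`, overriding this one method should silence all of the handler's default logging.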
I am not sure if the failing integration tests are related to the changes in this PR. The server logs seem to indicate an issue connecting to the databases. I do not know enough about the ClickHouse architecture to diagnose this but my changes delegate the setting of the
MongoDB
2022.02.10 16:49:19.057607 [ 68 ] {} <Error> bool DB::(anonymous namespace)::checkPermissionsImpl(): Code: 412. DB::Exception: Can't receive Netlink response: error -2. (NETLINK_ERROR), Stack trace (when copying this message, always include the lines below):
0. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/exception:0: Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x1c8a1dbb in /usr/bin/clickhouse
1. ./obj-x86_64-linux-gnu/../src/Common/Exception.cpp:58: DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xaa9e7bc in /usr/bin/clickhouse
2. ./obj-x86_64-linux-gnu/../src/Common/TaskStatsInfoGetter.cpp:0: DB::(anonymous namespace)::query(int, unsigned short, unsigned int, char8_t, unsigned short, void const*, int) @ 0xaadbb21 in /usr/bin/clickhouse
3. ./obj-x86_64-linux-gnu/../src/Common/TaskStatsInfoGetter.cpp:181: DB::(anonymous namespace)::getFamilyIdImpl(int) @ 0xaadbd08 in /usr/bin/clickhouse
4. ./obj-x86_64-linux-gnu/../src/Common/TaskStatsInfoGetter.cpp:0: DB::TaskStatsInfoGetter::TaskStatsInfoGetter() @ 0xaadb600 in /usr/bin/clickhouse
5. ./obj-x86_64-linux-gnu/../src/Common/TaskStatsInfoGetter.cpp:293: DB::(anonymous namespace)::checkPermissionsImpl() @ 0xaadb30b in /usr/bin/clickhouse
6. ./obj-x86_64-linux-gnu/../src/Common/TaskStatsInfoGetter.cpp:0: DB::TaskStatsInfoGetter::checkPermissions() @ 0xaadb279 in /usr/bin/clickhouse
7. ./obj-x86_64-linux-gnu/../src/Common/ThreadProfileEvents.cpp:0: DB::TasksStatsCounters::create(unsigned long) @ 0xaad29fc in /usr/bin/clickhouse
8. ./obj-x86_64-linux-gnu/../src/Interpreters/ThreadStatusExt.cpp:0: DB::ThreadStatus::initPerformanceCounters() @ 0x1895a1e4 in /usr/bin/clickhouse
9. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/atomic:993: DB::ThreadStatus::setupState(std::__1::shared_ptr<DB::ThreadGroupStatus> const&) @ 0x18959ecc in /usr/bin/clickhouse
10. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/memory:3211: DB::CurrentThread::initializeQuery() @ 0x1895bda5 in /usr/bin/clickhouse
11. ./obj-x86_64-linux-gnu/../src/Core/BackgroundSchedulePool.cpp:238: DB::BackgroundSchedulePool::attachToThreadGroup() @ 0x179248ab in /usr/bin/clickhouse
12. ./obj-x86_64-linux-gnu/../src/Core/BackgroundSchedulePool.cpp:0: DB::BackgroundSchedulePool::threadFunction() @ 0x1792498e in /usr/bin/clickhouse
13. ./obj-x86_64-linux-gnu/../src/Core/BackgroundSchedulePool.cpp:0: void std::__1::__function::__policy_invoker<void ()>::__call_impl<std::__1::__function::__default_alloc_func<ThreadFromGlobalPool::ThreadFromGlobalPool<DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, unsigned long, char const*)::$_1>(DB::BackgroundSchedulePool::BackgroundSchedulePool(unsigned long, unsigned long, char const*)::$_1&&)::'lambda'(), void ()> >(std::__1::__function::__policy_storage const*) @ 0x179251d2 in /usr/bin/clickhouse
14. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/functional:2210: ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0xab29fce in /usr/bin/clickhouse
15. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/memory:1655: void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()> >(void*) @ 0xab2d8d1 in /usr/bin/clickhouse
16. __tsan_thread_start_func @ 0xa9e9f7d in /usr/bin/clickhouse
17. ? @ 0x7f2b88a3e609 in ?
18. __clone @ 0x7f2b88965293 in ?
(version 22.2.1.4267)
Cassandra
2022.02.10 16:52:28.810078 [ 375 ] {} <Error> ExternalDictionariesLoader: Could not update external dictionary 'Cassandra_complex_key_hashed_', leaving the previous version, next update is scheduled at 2022-02-10 16:52:34: Code: 528. DB::Exception: Cassandra driver error 16777226: No hosts available: Underlying connection error: Connect error 'connection refused'. (CASSANDRA_INTERNAL_ERROR), Stack trace (when copying this message, always include the lines below):
0. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/exception:0: Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x1c8a1dbb in /usr/bin/clickhouse
1. ./obj-x86_64-linux-gnu/../src/Common/Exception.cpp:58: DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xaa9e7bc in /usr/bin/clickhouse
2. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/string:1444: DB::Exception::Exception<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, char const*, char const*&>(int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&, char const*&&, char const*&) @ 0x144c5651 in /usr/bin/clickhouse
3. ./obj-x86_64-linux-gnu/../src/Dictionaries/CassandraHelpers.cpp:0: DB::cassandraWaitAndCheck(DB::Cassandra::ObjectHolder<CassFuture_, &(cass_future_free), &(CassFuture_* DB::Cassandra::defaultCtor<CassFuture_>())>&) @ 0x144c45c6 in /usr/bin/clickhouse
4. ./obj-x86_64-linux-gnu/../src/Dictionaries/CassandraDictionarySource.cpp:0: DB::CassandraDictionarySource::getSession() @ 0x144c072e in /usr/bin/clickhouse
5. ./obj-x86_64-linux-gnu/../src/Dictionaries/CassandraDictionarySource.cpp:139: DB::CassandraDictionarySource::loadAll() @ 0x144c0294 in /usr/bin/clickhouse
6. DB::HashedDictionary<(DB::DictionaryKeyType)1, false>::loadData() @ 0x14f560a7 in /usr/bin/clickhouse
7. DB::HashedDictionary<(DB::DictionaryKeyType)1, false>::HashedDictionary(DB::StorageID const&, DB::DictionaryStructure const&, std::__1::shared_ptr<DB::IDictionarySource>, DB::HashedDictionaryStorageConfiguration const&, std::__1::shared_ptr<DB::Block>) @ 0x14f55ded in /usr/bin/clickhouse
8. std::__1::shared_ptr<DB::HashedDictionary<(DB::DictionaryKeyType)1, false> > std::__1::allocate_shared<DB::HashedDictionary<(DB::DictionaryKeyType)1, false>, std::__1::allocator<DB::HashedDictionary<(DB::DictionaryKeyType)1, false> >, DB::StorageID, DB::DictionaryStructure const&, std::__1::shared_ptr<DB::IDictionarySource>, DB::HashedDictionaryStorageConfiguration const&, std::__1::shared_ptr<DB::Block> const&, void>(std::__1::allocator<DB::HashedDictionary<(DB::DictionaryKeyType)1, false> > const&, DB::StorageID&&, DB::DictionaryStructure const&, std::__1::shared_ptr<DB::IDictionarySource>&&, DB::HashedDictionaryStorageConfiguration const&, std::__1::shared_ptr<DB::Block> const&) @ 0x15174e2b in /usr/bin/clickhouse
9. DB::HashedDictionary<(DB::DictionaryKeyType)1, false>::clone() const @ 0x14f5717d in /usr/bin/clickhouse
10. ./obj-x86_64-linux-gnu/../src/Interpreters/ExternalLoader.cpp:0: std::__1::shared_ptr<DB::IExternalLoadable const> std::__1::__function::__policy_invoker<std::__1::shared_ptr<DB::IExternalLoadable const> (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::ExternalLoader::ObjectConfig const&, std::__1::shared_ptr<DB::IExternalLoadable const> const&)>::__call_impl<std::__1::__function::__default_alloc_func<DB::ExternalLoader::ExternalLoader(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, Poco::Logger*)::$_1, std::__1::shared_ptr<DB::IExternalLoadable const> (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::ExternalLoader::ObjectConfig const&, std::__1::shared_ptr<DB::IExternalLoadable const> const&)> >(std::__1::__function::__policy_storage const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::ExternalLoader::ObjectConfig const&, std::__1::shared_ptr<DB::IExternalLoadable const> const&) @ 0x17fdc02c in /usr/bin/clickhouse
11. ./obj-x86_64-linux-gnu/../src/Interpreters/ExternalLoader.cpp:0: DB::ExternalLoader::LoadingDispatcher::loadSingleObject(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::ExternalLoader::ObjectConfig const&, std::__1::shared_ptr<DB::IExternalLoadable const>) @ 0x17fe4ef7 in /usr/bin/clickhouse
12. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/memory:2851: DB::ExternalLoader::LoadingDispatcher::doLoading(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, bool, unsigned long, bool, std::__1::shared_ptr<DB::ThreadGroupStatus>) @ 0x17fe225b in /usr/bin/clickhouse
13. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/type_traits:0: ThreadFromGlobalPool::ThreadFromGlobalPool<void (DB::ExternalLoader::LoadingDispatcher::*)(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, bool, unsigned long, bool, std::__1::shared_ptr<DB::ThreadGroupStatus>), DB::ExternalLoader::LoadingDispatcher*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, unsigned long&, bool&, unsigned long&, bool, std::__1::shared_ptr<DB::ThreadGroupStatus> >(void (DB::ExternalLoader::LoadingDispatcher::*&&)(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, bool, unsigned long, bool, std::__1::shared_ptr<DB::ThreadGroupStatus>), DB::ExternalLoader::LoadingDispatcher*&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, unsigned long&, bool&, unsigned long&, bool&&, std::__1::shared_ptr<DB::ThreadGroupStatus>&&)::'lambda'()::operator()() @ 0x17fe8b00 in /usr/bin/clickhouse
14. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/functional:2090: void std::__1::__function::__policy_invoker<void ()>::__call_impl<std::__1::__function::__default_alloc_func<ThreadFromGlobalPool::ThreadFromGlobalPool<void (DB::ExternalLoader::LoadingDispatcher::*)(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, bool, unsigned long, bool, std::__1::shared_ptr<DB::ThreadGroupStatus>), DB::ExternalLoader::LoadingDispatcher*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, unsigned long&, bool&, unsigned long&, bool, std::__1::shared_ptr<DB::ThreadGroupStatus> >(void (DB::ExternalLoader::LoadingDispatcher::*&&)(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, bool, unsigned long, bool, std::__1::shared_ptr<DB::ThreadGroupStatus>), DB::ExternalLoader::LoadingDispatcher*&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, unsigned long&, bool&, unsigned long&, bool&&, std::__1::shared_ptr<DB::ThreadGroupStatus>&&)::'lambda'(), void ()> >(std::__1::__function::__policy_storage const*) @ 0x17fe8862 in /usr/bin/clickhouse
15. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/functional:2210: ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0xab29fce in /usr/bin/clickhouse
16. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/memory:1655: void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()> >(void*) @ 0xab2d8d1 in /usr/bin/clickhouse
17. __tsan_thread_start_func @ 0xa9e9f7d in /usr/bin/clickhouse
18. ? @ 0x7f52d55d3609 in ?
19. __clone @ 0x7f52d54fa293 in ?
(version 22.2.1.4267)
2022.02.10 16:52:28.810406 [ 375 ] {} <Trace> ExternalDictionariesLoader: Next update time for 'Cassandra_complex_key_hashed_' was set to 2022-02-10 16:52:34
2022.02.10 16:52:28.810564 [ 376 ] {} <Trace> ExternalDictionariesLoader: Supposed update time for 'Cassandra_range_hashed_' is 2022-02-10 16:52:34 (backoff, 1 errors)
@Mergifyio update
✅ Branch has been successfully updated
These tests were broken at master some time ago. Now they are fixed. Let's see if the update will help with tests.
HDFS tests are broken in master, they are not related to this PR
Added HTTP User-Agent Header to HTTP requests.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Some servers expect a User-Agent header in their HTTP requests. A User-Agent header entry has been added to HTTP requests of the form: User-Agent: ClickHouse/VERSION_STRING
Resolves #34239.
...