Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ATS9 Crash in http_hdr_clone #6882

Closed
shinrich opened this issue Jun 10, 2020 · 3 comments · Fixed by #7004
Closed

ATS9 Crash in http_hdr_clone #6882

shinrich opened this issue Jun 10, 2020 · 3 comments · Fixed by #7004
Assignees
Labels
Milestone

Comments

@shinrich
Copy link
Member

We moved ATS 9 into a forward proxy environment, and get the following crash every 10 minutes or so

{code}
(gdb) bt
#0 0x000000000061ccf3 in http_hdr_clone (s_hh=0x0, s_heap=0x0, d_heap=0x2b3792da8800) at ../../../../../../_vcs/trafficserver9/proxy/hdrs/HTTP.cc:359
#1 0x000000000059057c in copy (hdr=0x2b3854860870, this=0x2b38548607f0)
at /sd/workspace/src/git.ouroath.com/Edge/build/_build/build_release_posix-x86_64_gcc_8/trafficserver9/build/../../../../_vcs/trafficserver9/proxy/hdrs/HTTP.h:815
#2 copy (hdr=0x2b3854860870, hdr@entry=0x2b38548607f0, this=0x2b38548607f0, this@entry=0x2b3854860870)
at /sd/workspace/src/git.ouroath.com/Edge/build/_build/build_release_posix-x86_64_gcc_8/trafficserver9/build/../../../../_vcs/trafficserver9/proxy/hdrs/HTTP.h:807
#3 HttpTransactHeaders::copy_header_fields (src_hdr=0x2b3854860870, new_hdr=new_hdr@entry=0x2b38548607f0, retain_proxy_auth_hdrs=false, date=1591803373)
at ../../../../../../_vcs/trafficserver9/proxy/http/HttpTransactHeaders.cc:207
#4 0x000000000056a1cf in HttpTransact::build_response_copy (s=0x2b3854860100, base_response=, outgoing_response=0x2b38548607f0, outgoing_version=...)
at ../../../../../../_vcs/trafficserver9/proxy/http/HttpTransact.cc:4102
#5 0x0000000000583578 in HttpTransact::OSDNSLookup (s=0x2b3854860100) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpTransact.cc:1910
#6 0x0000000000545e51 in HttpSM::call_transact_and_set_next_state (this=this@entry=0x2b3854860000, f=f@entry=0x0) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:7236
#7 0x000000000054879a in HttpSM::do_hostdb_lookup (this=this@entry=0x2b3854860000) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:4138
#8 0x000000000055c84a in HttpSM::set_next_state (this=0x2b3854860000) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:7594
#9 0x0000000000545d42 in HttpSM::call_transact_and_set_next_state (this=this@entry=0x2b3854860000, f=f@entry=0x0) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:7244
#10 0x00000000005569da in HttpSM::handle_api_return (this=0x2b3854860000) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:1604
#11 0x0000000000552651 in HttpSM::state_api_callout(int, void*) () at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:1536
#12 0x0000000000555887 in HttpSM::state_api_callback (this=this@entry=0x2b3854860000, event=event@entry=60000, data=data@entry=0x0)
at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:1340
#13 0x00000000004f7a6a in TSHttpTxnReenable () at ../../../../../_vcs/trafficserver9/src/traffic_server/InkAPI.cc:6105
#14 0x00002b375100f90e in main_handler (cont=, event=, edata=0x2b3854860000) at ../../../../../_vcs/trafficserver9/plugins/regex_revalidate/regex_revalidate.c:453
#15 0x00000000004e3563 in INKContInternal::handle_event (this=0x2b3724ee96e0, event=60015, edata=0x2b3854860000) at ../../../../../_vcs/trafficserver9/src/traffic_server/InkAPI.cc:1096
#16 0x00000000004f5adb in handleEvent (data=0x2b3854860000, event=60015, this=0x2b3724ee96e0)
at /sd/workspace/src/git.ouroath.com/Edge/build/_build/build_release_posix-x86_64_gcc_8/trafficserver9/build/../../../../_vcs/trafficserver9/iocore/eventsystem/I_Continuation.h:167
#17 handleEvent (data=0x2b3854860000, event=60015, this=0x2b3724ee96e0)
at /sd/workspace/src/git.ouroath.com/Edge/build/_build/build_release_posix-x86_64_gcc_8/trafficserver9/build/../../../../_vcs/trafficserver9/iocore/eventsystem/I_Continuation.h:163
#18 APIHook::invoke(int, void*) const () at ../../../../../_vcs/trafficserver9/src/traffic_server/InkAPI.cc:1333
#19 0x00000000005525ae in HttpSM::state_api_callout(int, void*) () at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:1469
#20 0x000000000055442f in setup_cache_lookup_complete_api (this=0x2b3854860000) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:2571
#21 HttpSM::state_cache_open_read (this=0x2b3854860000, event=, data=0xffffffffffffb050) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:2571
#22 0x0000000000554dc3 in main_handler (data=0xffffffffffffb050, event=1103, this=0x2b3854860000) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:2613
#23 HttpSM::main_handler (this=0x2b3854860000, event=1103, data=0xffffffffffffb050) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:2583
#24 0x00000000005a19bd in handleEvent (data=0xffffffffffffb050, event=1103, this=0x2b3854860000)
at /sd/workspace/src/git.ouroath.com/Edge/build/_build/build_release_posix-x86_64_gcc_8/trafficserver9/build/../../../../_vcs/trafficserver9/iocore/eventsystem/I_Continuation.h:167
#25 handleEvent (data=0xffffffffffffb050, event=1103, this=0x2b3854860000)
at /sd/workspace/src/git.ouroath.com/Edge/build/_build/build_release_posix-x86_64_gcc_8/trafficserver9/build/../../../../_vcs/trafficserver9/iocore/eventsystem/I_Continuation.h:163
#26 HttpCacheSM::state_cache_open_read (this=, event=, data=0xffffffffffffb050) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpCacheSM.cc:133
#27 0x000000000068cfdd in Cache::open_read(Continuation*, ats::CryptoHash const*, HTTPHdr*, OverridableHttpConfigParams const*, CacheFragType, char const*, int) ()
at ../../../../../../_vcs/trafficserver9/iocore/cache/CacheRead.cc:149
#28 0x000000000066a388 in CacheProcessor::open_read (this=, cont=cont@entry=0x2b3854861da0, key=, request=, params=,
pin_in_cache=, type=CACHE_FRAG_TYPE_HTTP) at ../../../../../../_vcs/trafficserver9/iocore/cache/Cache.cc:3149
#29 0x00000000005a17a1 in HttpCacheSM::do_cache_open_read (this=this@entry=0x2b3854861da0, key=...) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpCacheSM.cc:273
#30 0x00000000005a1c57 in HttpCacheSM::open_read (this=this@entry=0x2b3854861da0, key=key@entry=0x2b373850a630, url=url@entry=0x2b38548607c8, hdr=hdr@entry=0x2b38548607b0,
params=, pin_in_cache=) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpCacheSM.cc:307
#31 0x0000000000543102 in HttpSM::do_cache_lookup_and_read (this=this@entry=0x2b3854860000) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:4569
#32 0x000000000055c4f1 in HttpSM::set_next_state (this=0x2b3854860000) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:7409
#33 0x0000000000545d42 in HttpSM::call_transact_and_set_next_state (this=this@entry=0x2b3854860000, f=) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:7244
#34 0x0000000000556bbf in HttpSM::handle_api_return (this=0x2b3854860000) at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:1703
#35 0x0000000000552651 in HttpSM::state_api_callout(int, void*) () at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:1536
#36 0x0000000000555887 in HttpSM::state_api_callback (this=this@entry=0x2b3854860000, event=event@entry=60000, data=data@entry=0x0)
at ../../../../../../_vcs/trafficserver9/proxy/http/HttpSM.cc:1340
#37 0x00000000004f7a6a in TSHttpTxnReenable () at ../../../../../_vcs/trafficserver9/src/traffic_server/InkAPI.cc:6105
#38 0x00002b3760605883 in ?? ()
#39 0x0000000000000000 in ?? ()
{code}

@shinrich shinrich added the Crash label Jun 10, 2020
@shinrich shinrich added this to the 10.0.0 milestone Jun 10, 2020
@shinrich shinrich self-assigned this Jun 10, 2020
@shinrich
Copy link
Member Author

shinrich commented Jun 10, 2020

Looks the same as issue #4263 which should have been fixed by PR #4289. That fix is in 9.0.x, but we are still seem the same or a similar issue.

The original source of this logic is from PR #4145. Logic to handle redirecting to loopback addresses. In the crash case, the address is a non-routable (192.168.40.x) but not a loopback. I have no idea how the redirect_in_process flag got set, because the cache lookup failed and according to the SM history it has not communicated to any server yet.

However in our case, this is not a cache hit. Just was not able to get a dns response for the requested origin.

(gdb) print s->cache_lookup_result
$3 = HttpTransact::CACHE_LOOKUP_MISS

The comment suggests that this is error handling if the dns lookup failed, but that happens much earlier in the OSDNSLookup method. In my core, the dns lookup seems to have succeeded.

(gdb) print s->dns_info.lookup_success 
$14 = true

Here is the code in question from HttpTransact::OSDNLookup method.

  if (s->redirect_info.redirect_in_process) {
    // If dns lookup was not successful, the code below will handle the error.
    RedirectEnabled::Action action = RedirectEnabled::Action::INVALID;
    if (true == Machine::instance()->is_self(s->host_db_info.ip())) {
      action = s->http_config_param->redirect_actions_self_action;
    } else {
      // Make sure the return value from contains is big enough for a void*.
      intptr_t x{intptr_t(RedirectEnabled::Action::INVALID)};
      ink_release_assert(s->http_config_param->redirect_actions_map != nullptr);
      ink_release_assert(s->http_config_param->redirect_actions_map->contains(s->host_db_info.ip(), reinterpret_cast<void **>(&x)));
      action = static_cast<RedirectEnabled::Action>(x);
    }
    switch (action) {
    case RedirectEnabled::Action::FOLLOW:
      TxnDebug("http_trans", "[OSDNSLookup] Invalid redirect address. Following");
      break;
    case RedirectEnabled::Action::REJECT:
      TxnDebug("http_trans", "[OSDNSLookup] Invalid redirect address. Rejecting.");
      build_error_response(s, HTTP_STATUS_FORBIDDEN, nullptr, "request#syntax_error");
      SET_VIA_STRING(VIA_DETAIL_TUNNEL, VIA_DETAIL_TUNNEL_NO_FORWARD);
      TRANSACT_RETURN(SM_ACTION_SEND_ERROR_CACHE_NOOP, nullptr);
      break;
    case RedirectEnabled::Action::RETURN:
      TxnDebug("http_trans", "[OSDNSLookup] Configured to return on invalid redirect address.");
      // fall-through
    default:
      // Return this 3xx to the client as-is.
      TxnDebug("http_trans", "[OSDNSLookup] Invalid redirect address. Returning.");
      build_response_copy(s, &s->hdr_info.server_response, &s->hdr_info.client_response, s->client_info.http_version);
      TRANSACT_RETURN(SM_ACTION_INTERNAL_CACHE_NOOP, nullptr);
    }
  }

@shinrich
Copy link
Member Author

Reviewing my open issues. Looks like issue #6157 is describing something similar. We hadn't really wanted to do the follow redirects until our most recent deployment.

@shinrich
Copy link
Member Author

@bryancall has started to look at this again. Same build and it is no longer crashing after running for several days. Evidently I was examining this during very particular traffic flows that were causing the rapid crash. In any case @bryancall is looking through the code to find the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants