Skip to content

HttpService::GetRoute first-match-wins over std::unordered_map breaks ANY("*") + specific routes #831

@borisbat

Description

@borisbat

Hit something integrating libhv. Set up an HttpService with a bunch of specific GETs and POSTs, plus ANY("*") as a "unknown route, here is the help JSON" catch-all. The catch-all answers everything. Even paths that have their own handler.

Looked at http/server/HttpService.cpp around GetRoute(HttpRequest*, http_handler**). pathHandlers is std::unordered_map<std::string, ...>. GetRoute iterates it begin()..end() and returns on the first matching path. The wildcard branch is:

if (kp[0] == '*') { match = hv_strendswith(vp, kp+1); break; }

For "*" that is hv_strendswith(anything, "") which is unconditionally true. So it's not really "wildcard wins" -- it's "whichever path the hash function visits first wins". Order of registration is irrelevant, which was the part that confused me longest.

Single-file repro, links against built libhv plus its openssl deps, no other dependencies:

#include "hv/HttpServer.h"
#include "hv/HttpService.h"
#include "hv/requests.h"
#include "hv/hv.h"
#include <thread>
#include <chrono>
#include <cstdio>
#include <string>
#include <vector>
using namespace hv;
int main() {
    printf("libhv version: %s\n", hv_version());
    HttpService router;
    std::vector<std::string> paths;
    for (char c = 'a'; c <= 'z'; ++c) {
        std::string p = "/"; p.push_back(c); paths.push_back(p);
        router.GET(p.c_str(), [p](HttpRequest*, HttpResponse* r){
            r->body = "EXACT:" + p; return 200;
        });
    }
    router.Any("*", [](HttpRequest*, HttpResponse* r){
        r->body = "FALLBACK"; return 200;
    });
    printf("pathHandlers iteration order:\n");
    int i = 0;
    for (auto& kv : router.pathHandlers)
        printf("  [%2d] %s\n", i++, kv.first.c_str());
    HttpServer server(&router);
    server.setPort(19099);
    server.start();
    std::this_thread::sleep_for(std::chrono::milliseconds(150));
    int hijacked = 0;
    for (const auto& p : paths) {
        auto resp = requests::get(("http://127.0.0.1:19099" + p).c_str());
        std::string want = "EXACT:" + p;
        bool good = resp && resp->body == want;
        printf("GET %-4s -> %s%s\n", p.c_str(),
            resp ? resp->body.c_str() : "(no resp)",
            good ? "" : "  <-- HIJACKED");
        if (!good) ++hijacked;
    }
    server.stop();
    return hijacked ? 1 : 0;
}

MSVC 19.44 / Win11, libhv v1.3.4:

libhv version: 1.3.4
pathHandlers iteration order:
  [ 0] /a   [ 1] /b   ... [15] /p   [16] *   [17] /q   ... [26] /z
GET /a   -> EXACT:/a
...
GET /p   -> EXACT:/p
GET /q   -> FALLBACK   <-- HIJACKED
GET /r   -> FALLBACK   <-- HIJACKED
...
GET /z   -> FALLBACK   <-- HIJACKED
summary: 16 ok, 10 hijacked

"" lands at index 16; everything iterated after that point is shadowed. With libstdc++ the cutoff sits in a different place but the bug is the same. Same shape surfaced inside our project at GaijinEntertainment/daScript#2675 -- user registered GET("/status") next to ANY("") and could not figure out why /status came back with the help JSON.

Happy to PR. Asking first because the fix shape depends on whether you want to keep std::unordered_map for the O(1) exact-path lookup and add a fallback wildcard pass, or restructure routes entirely.

Thanks for libhv. Genuinely a pleasure to integrate.

Metadata

Metadata

Labels

No labels
No labels

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions