Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend HTTP server to handle more scenarios #14234

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jagannatharjun
Copy link
Member

@jagannatharjun jagannatharjun commented Jan 17, 2021

Untie IRequestHandler and Connection class. Make Interface of IRequestHandler more barebones, and implement existing behavior of IRequesteHandler class in new class BasicRequestHandler (implements IRequestHandler). Make Connection class directly handle serializing of HTTP responses and add new functions to the class accordingly.

Obsolete implementation summary Untie IRequestHandler and Connection classes. Have Server takes a new interface IConnectionHandler which only takes a Connection do as it wishes, Base IRequestHandler (renamed to RequestHandler) on IConnectionHandler that provides more user-friendly interface to work with. Also, add some more convenient functions to the Connection class to facilitate sending HTTP responses


Extends HTTP server by introducing two new classes HttpSocket, IHttpConnectionHandler. renaming IRequestHandler => RequestHandler. Previous class Connection and IRequestHandler were very closely tied making types of requests that can be served very small. Now Http::Server instantiate a HttpSocket with for every new connection, HttpSocket will parse the request and emit a signal, Server will catch this signal and pass the HttpSocket to IHttpConnectionhandler, The interface of IHttpConnectionhandler is fundamental. Thus class RequestHandler class is provided that transforms the IHttpConnectionHandler interface to IRequestHandler

Required for #12542

src/base/http/httpsocket.h Outdated Show resolved Hide resolved
src/base/http/httpsocket.h Outdated Show resolved Hide resolved
src/base/http/httpsocket.h Outdated Show resolved Hide resolved
virtual Response processRequest(const Request &request, const Environment &env) = 0;

private:
void handleConnection(Connection *socket) override final;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final implies override so it's redundant to use both.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe Connection *connection?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And what's the point to redeclare it private? It implements interface method that's public by definition.

Copy link
Member Author

@jagannatharjun jagannatharjun Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And what's the point to redeclare it private? It implements interface method that's public by definition.

At some point in time, I was instructed to do something similar in another PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And what's the point to redeclare it private? It implements interface method that's public by definition.

At some point in time, I was instructed to do something similar in another PR.

Are you sure it was about exactly the same case? Maybe it was about the case when you override previously protected method in final class?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final implies override so it's redundant to use both.

No it doesn't: https://stackoverflow.com/questions/29412412/does-final-imply-override

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it doesn't: https://stackoverflow.com/questions/29412412/does-final-imply-override

In such case it does (defacto). final requires the method to be virtual. But since this declaration does not have virtual specifier, it can only be virtual if it is overriden. I would prefer to use minimal set of specifiers in method declarations. Or does it seem less readable to you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was taking your comment at face value and didn't read the code. But if that particular function isn't virtual then it cannot be override and/or final anyway. It would be a compile error. Or are you saying something else?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scratch whatever I said. I missed your point and the rest of the SO answer. Yes, final override is superfluous. Because final requires a virtual function aka something that is overridable (sic). In that case override won't achieve something more. Maybe only a clearer intent indication to the reader. But let's not go there.

#include "connection.h"
#include "httputils.h"

using namespace Http;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this using statement if you still use fully qualified names below?

src/base/http/requesthandler.h Outdated Show resolved Hide resolved
connect(socket, &Connection::disconnected, this, [socket, this]() { removeConnection(socket); });
connect(socket, &Connection::readyRequest, this, [socket, this]()
{
m_connectionHandler->handleConnection(socket);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems it handles not Connection itself but each request received within Connection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, then I will have to use QObject::Connect in non QObject classes, I don't think you would like it either.

src/base/http/connection.h Show resolved Hide resolved

bool hasExpired(qint64 timeout) const;
bool isClosed() const;
void sendStatus(const Http::ResponseStatus &status);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to use fully qualified names within the same namespace.

src/base/http/httputils.cpp Show resolved Hide resolved
src/base/http/server.cpp Outdated Show resolved Hide resolved
src/base/http/connection.cpp Outdated Show resolved Hide resolved
src/base/http/connection.cpp Outdated Show resolved Hide resolved
src/base/http/connection.cpp Outdated Show resolved Hide resolved
src/base/http/connection.cpp Outdated Show resolved Hide resolved

void Http::RequestHandler::handleConnection(Connection *socket)
{
Q_ASSERT(socket->status() == Connection::PendingStatus);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a really bad idea to have such assertion in interface method.
If it handles Connection it shouldn't restrict the state (or other properties) of this connection to be accepted as input parameter.
As I said before it doesn't handle Connection currently. It simply handles each specific request received within the Connection, but in a poorly designed form that hides the essence of things.

If you really need "new interface IConnectionHandler which only takes a Connection do as it wishes" you should pass Connection once it is constructed and allow handler to do what it wants further.
Otherwise you need other handler interface that allows correct per-request handling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I agree with your comments about the interface, the problem is that I don't want to make RequestHandler a QObject. or since now it's not a pure interface maybe I can make it a QObject.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe I can make it a QObject.

Maybe...
Can you describe how your another ConnectionHandler should interacts with Connection?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is based on an older version of this PR jagannatharjun/qBittorrent@76002fc#diff-877ca73ad30842b71bcdadd27bac68ce563ed8aa9af32a65c667ad08b8de4d9aR155

Well, let's try to figure out what the differences are between the existing handlers (web application and tracker) and the one that is needed to support media streaming.

Existing implementation just handles each request and send response. It doesn't close connection after each response. Connection can be closed either by client side or by timeout (and maybe due to underlying network error). It's "request handling" in its essence so it can be easily implemented without direct access to connection. When request is received it is passed to the handler, who processes it and returns a response that is sent to the client (i.e. all network IO can be done outside of handler).

Another handler still "just handles each request and send response" as existing one. The differences (judging by the code above) are that:

  1. It closes connection after each request;
  2. It handles request asynchronously.

As for (1), I have a question, is this really required, or can it re-use the existing connection to send multiple requests?
Anyway the connection can be closed outside the handler based on some Response attribute. So the main problem is (2). Even if writing to the socket in parts, mentioned by @Chocobo1, turns out to be a bad idea, and we stop at accumulating the entire Response, and only then sending it to the client, it will still not be available immediately. Apparently, here it remains only to use the Qt meta-object system, and allow the handler to return the generated Response not directly from the method, but through the signal. You only need to establish some kind of relationship between the Request, its Connection, and the Response.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I was thinking about all this and writing the previous long comment, I came up with the idea of a compromise option.
You just add a property to the Request that contains a pointer to its Connection. When the next Request is received, you pass it to the handler, so that it will still be the RequestHandler, and then it is responsible for processing the Request and sending the response through the Connection.
I.e., you do not have to change almost anything. Only rename IConnectionHandler to IRequestHandler, RequestHandler to something like BasicRequestHandler. Also you need to obtain the Request instance outside of the handler and pass it through the IRequestHandler::handleRequest() method.
That's all, if I didn't miss anything important.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You just add a property to the Request that contains a pointer to its Connection.

IMO making it property will expose connection to too many places. so I am providing connection as an argument in IRequestHander::handleRequest.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making it property will expose connection to too many places.

What exactly are the places that bother you? In any case, you expose it to the outside world (even through the parameter). Do you want to hide it from the inner space, where it also comes from?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly are the places that bother you?

BasicRequestHandler::processReqeust tryCompressContent.

In any case, you expose it to the outside world (even through the parameter)

with parameter it is only exposed to IrequestHandler::handleRequest and doesn't leak to outside space.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with parameter it is only exposed to IrequestHandler::handleRequest and doesn't leak to outside space.

...unless you allow it in IRequestHandler::handleRequest().
Well, let's keep it that way for now.

@glassez glassez self-assigned this Jan 20, 2021
@glassez glassez added the Core label Jan 20, 2021
@glassez glassez changed the title Extend Http server to handle more scenerios Extend HTTP server to handle more scenarios Jan 20, 2021
public:
virtual ~BasicRequestHandler() = default;

virtual Response processRequest(const Request &request, const Environment &env) = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems now it is only intended to be implemented in subclasses. It should be used only inside this class (as an implementation detail of handleRequest()), isn't it?

src/base/http/basicrequesthandler.h Outdated Show resolved Hide resolved
@@ -108,9 +109,13 @@ void Server::incomingConnection(const qintptr socketDescriptor)
static_cast<QSslSocket *>(serverSocket)->startServerEncryption();
}

auto *c = new Connection(serverSocket, m_requestHandler, this);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So why not keep this logic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's cleaner to keep these two classes separated.

src/base/http/connection.h Outdated Show resolved Hide resolved
src/base/http/connection.h Outdated Show resolved Hide resolved
src/base/http/connection.h Outdated Show resolved Hide resolved
connection->sendStatus(response.status);
connection->sendHeaders(response.headers);
if (!response.content.isEmpty())
connection->sendContent(response.content);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So why not serialize/write response at once (as it was before)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in #12542 I send content in chunks, so to support that I've added sendHeaders and sendContent functions in Connection class. I can add sendResponse function in the Connection class but let's keep Response handling in this class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's keep Response handling in this class.

What's the reason? Connection acts on HTTP layer. It parses received HTTP requests into Request object so why it shouldn't make an opposite action, i.e. serialize Response object? In addition, it will give more control over its integrity and consistency.

in #12542 I send content in chunks

Is it really useful? IIRC, your tests show it is not so optimal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's keep Response handling in this class.

What's the reason? Connection acts on HTTP layer. It parses received HTTP requests into Request object so why it shouldn't make an opposite action, i.e. serialize Response object? In addition, it will give more control over its integrity and consistency.

connection class already provide an interface for sending data, why pollute it by providing other ways to do the same. As far as the integrity of responses is concerned. I don't want to add such logic in Connection class. I would like to keep it barebones and have the request handler handle everything.

in #12542 I send content in chunks

Is it really useful? IIRC, your tests show it is not so optimal.

in #12542 client can ask for 1 gig of file. would you like to send all that at once? with the latest revision, this PR uses a cache for sending headers and status this makes it on par with the current master in terms of performance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in #12542 client can ask for 1 gig of file.

Are you going to accept such request?

would you like to send all that at once?

I assume you don't, but then how do you come back later after sending partial of it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you going to accept such request?

yep, it's a valid request

I assume you don't, but then how do you come back later after sending partial of it?

this functionality is basically to allow https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests on downloading torrent files, I use libtorrent::read_piece which is asynchronous so there is a natural callback there. So we basically fulfill HTTP range requests in size of torrent piece length using libtorrent::read_piece. Implementation is here https://github.com/jagannatharjun/qBittorrent/blob/stream-final/src/base/streaming/streamfile.cpp#L144

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use libtorrent::read_piece which is asynchronous so there is a natural callback there.

Excuse me I haven't got time to check your code, so how do you handle libtorrent::read_piece() alert that will return in unspecified/random order? Do you issue only one read_piece() at a time?

this functionality is basically to allow https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests

Some more questions, which of the following do you intend to support?

  • Accept-Ranges: bytes (this is the minimum requirement right?)
  • Accept-Ranges: none
  • Single part ranges
  • Multipart ranges
  • Conditional range requests

I hope we can aim to support the smallest set that will satisfy your need.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you handle libtorrent::read_piece() alert that will return in unspecified/random order? Do you issue only one read_piece() at a time?

yes, read_piece() is issued one at a time.

which of the following do you intend to support?

only Single Part ranges.

src/base/http/basicrequesthandler.cpp Outdated Show resolved Hide resolved
src/base/http/connection.h Outdated Show resolved Hide resolved
src/base/http/connection.h Outdated Show resolved Hide resolved
bool isClosed() const;
void sendStatus(const ResponseStatus &status);
void sendHeaders(const HeaderMap &headers);
void sendContent(const QByteArray &data);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You didn't really send out anything by calling them, add*** would be perhaps more correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we see it's not sending content immediately as an implementation detail. I mean Python's HTTP server also has the same functions and they too don't send the data immediately. add*** in Connection doesn't seem right.

Copy link
Member

@Chocobo1 Chocobo1 Jan 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me explain what I would expect:

Response res;  // a specific class for building response 
res.setStatus(...);
res.setHeaders(...);
res.setContent(...);

connection.send(res);  // really send it out from socket

The current way of send* functions are not intuitive for general use and couldn't be more wrong as you are mixing higher layer data (http) with low level socket operations in the same class.

EDIT: please look up OSI model if you haven't already. I meant this: https://en.wikipedia.org/wiki/Communication_protocol#Software_layering

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, his main reason is the need to send very large data (several gigabytes) in a single HTTP response.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I aimed to implement something like Python Http Handler https://docs.python.org/3/library/http.server.html#http.server.BaseHTTPRequestHandler.send_header
it's used by a large audience so IMO there is nothing wrong with this PR's abstractions. I am open to ideas.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify once again (if this was unclear from my previous comments), I suggest having two types of Responses, one for general cases and the other for the case of sequential write content.

Agree, I think your suggestion really make the interface robust enough to avoid (most) misuse and would still preserve the correct layering.

#14234 (comment)

In the simplified example, I want to note that it might need checking if the socket is still open before sending data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be related to single responsibility principal.

If you talk about Http::Connection it is still responsible for sending/receiving HTTP "packets" (requests/responses).

your solution creates two branches in the behavior of a concrete class.

In fact, you can use the second one (#14234 (comment)) in all cases.

HTTP responses are well defined.

Yes.
I agree that it is possible to have the simplest interface with three methods (as you suggest) and make the correctness of their use the responsibility of the user of the class. And this should not even cause any problems in the simplest case, when you call them in a row in one place. But how to control the correctness of the logic in more complex cases (for example, in what you intend to implement next), where the calls of these methods are located far from each other (both in code and at runtime)? How are you going to check the correctness of your logic? Just inserting assertions and waiting for them to hit? Assertions is a good tool that should not be neglected, because we often can not predict all the mistakes. But why not add to it an interface that eliminates the possibility of making such mistakes (at least some of the possible ones)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok now it make sense.

How about this -

class ResponseContent
{
public:
 virtual size_t size() const = 0;
 virtual size_t read(void *data, size_t len) = 0;
signals:
 void readyRead();
};

and provide an implementation based on QByteArray for convenience;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this -

I'm sorry, it doesn't make sense to me until you show me how you're going to use it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make Response::Content (https://github.com/qbittorrent/qBittorrent/blob/master/src/base/http/types.h#L12) a std::shared_ptr< ResponseContent >
then everything will remain as it is in the current master. I just need to change Connection::sendResponse to handle this async type operation.

src/base/http/connection.h Outdated Show resolved Hide resolved
src/base/http/basicrequesthandler.cpp Outdated Show resolved Hide resolved
src/base/http/connection.h Show resolved Hide resolved
src/base/http/connection.h Show resolved Hide resolved
src/base/http/connection.h Show resolved Hide resolved
Copy link
Member

@glassez glassez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extend Http server to handle more scenerios

Isn't "scenerios" a mistake? I took the liberty of correcting the PR title. Please correct the commit message accordingly.

src/base/http/connection.h Outdated Show resolved Hide resolved
@jagannatharjun
Copy link
Member Author

@glassez @Chocobo1

would it be fine if I don't touch the existing HTTP library of qBittorrent at all for this feature and implement the proposed feature totally on its own? HTTP library of qBittorrent is not built for this feature at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants