Skip to content

Extract HttpUserAgent class from HttpSM#9869

Merged
maskit merged 15 commits intoapache:masterfrom
JosiahWI:refactor/http-user-agent
Jul 12, 2023
Merged

Extract HttpUserAgent class from HttpSM#9869
maskit merged 15 commits intoapache:masterfrom
JosiahWI:refactor/http-user-agent

Conversation

@JosiahWI
Copy link
Contributor

@JosiahWI JosiahWI commented Jun 16, 2023

This extracts a new class with 3 fields (entry, raw_buffer_reader, and txn). It doesn't have much responsibility yet. But it does provide a starting point for moving some behavior out of the state machine.

Edit: We extracted a lot more fields, and moved HttpVCTable to its own file.

@JosiahWI JosiahWI self-assigned this Jun 16, 2023
@JosiahWI JosiahWI added this to the 10.0.0 milestone Jun 16, 2023
@JosiahWI JosiahWI force-pushed the refactor/http-user-agent branch from f153eaa to 5846e2f Compare June 16, 2023 20:42
@JosiahWI JosiahWI marked this pull request as ready for review June 16, 2023 21:50
@JosiahWI JosiahWI requested a review from shinrich June 16, 2023 21:50
Copy link
Member

@maskit maskit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with making changes step by step, but I'd like to see the big picture before making changes.

HttpSMHandler, HttpVC_t and HttpVCTableEntry probably need a better place.

@JosiahWI
Copy link
Contributor Author

My objective is to move some of the logic for response headers into this new class. For example, HttpSM::setup_push_read_response_header() can be moved entirely or in part into this new class.

@JosiahWI JosiahWI requested a review from maskit June 20, 2023 18:26
@maskit
Copy link
Member

maskit commented Jun 20, 2023

Hmm, the goal is still unclear to me. If it's going to be like we can get anything about a user agent from HttpUserAgent, and anything about a server from HttpServer that kinda make sense. But we already have client_info and server_info in HttpTransact::State. What would be the difference? And why don't you move other variables that are related to client (e.g. client_request_hdr_bytes, client_tcp_reused, client_protocol`, etc) into the class?

@JosiahWI
Copy link
Contributor Author

I wanted to split up the class a bit based on the single responsibility principle to make it more manageable. I did not have a long-term vision of how the refactoring would continue, but I'd like to try moving the other client fields over (in this PR) like you suggested, and splitting out HttpServer in a separate one.

I'm pretty new to the codebase and it's not very clear to me yet which fields belong together. I thought client and user agent were somehow different concepts in the codebase, because they are named differently.

@JosiahWI
Copy link
Contributor Author

The goal is to encapsulate these fields such that the state machine doesn't need to access this data at all. If it has less data it has to manage directly it will be a lot easier to follow the logic. I want to move operations on client and server data out of the state machine, so that the state machine focuses on changing between states correctly and delegates the details of changing the state to those new classes.

@maskit
Copy link
Member

maskit commented Jun 20, 2023

Thanks for the clarification.

I wanted to split up the class a bit based on the single responsibility principle to make it more manageable.

Sounds reasonable. You've already answered my question on the other comment but I was not sure what the responsibility of the new class where you have only one for user agent side.

I'm pretty new to the codebase and it's not very clear to me yet which fields belong together. I thought client and user agent were somehow different concepts in the codebase, because they are named differently.

I'd say nobody fully understand the concepts 🙂 The two terms might have been used for a client on HTTP layer and a client on TCP layer, but this is just my guess.

The goal is to encapsulate these fields such that the state machine doesn't need to access this data at all. If it has less data it has to manage directly it will be a lot easier to follow the logic.

Sounds great. Let's make it happen.

@JosiahWI JosiahWI force-pushed the refactor/http-user-agent branch from 28a3007 to e1dc4ad Compare June 22, 2023 13:54
@JosiahWI
Copy link
Contributor Author

JosiahWI commented Jun 22, 2023

Rebased on master due to merge conflict.

@JosiahWI JosiahWI force-pushed the refactor/http-user-agent branch from e1dc4ad to 1035d35 Compare June 26, 2023 10:55
@JosiahWI JosiahWI force-pushed the refactor/http-user-agent branch from 1035d35 to b708860 Compare June 26, 2023 18:11
@JosiahWI
Copy link
Contributor Author

Rebased on master.

@bryancall
Copy link
Contributor

[approve ci autest]

@JosiahWI
Copy link
Contributor Author

The autest failure on this one indicates a real problem. I've been working on it but debugging the state machine without unit tests is a relatively large time sink.

@JosiahWI
Copy link
Contributor Author

I found the bug. I missed a single !, and I think I spent as much time finding that single missing ! as I spent making this entire PR. I'm going to add a unit test that covers that case, but I'm encountering a lot of segfaults in the test setup.

@JosiahWI JosiahWI force-pushed the refactor/http-user-agent branch from b708860 to df22de9 Compare July 10, 2023 12:04
@JosiahWI
Copy link
Contributor Author

Rebased on master. Some unfinished changes also got pushed by mistake.

JosiahWI added 2 commits July 12, 2023 07:04
This extracts a new class with 3 fields (entry, raw_buffer_reader,
and txn). It doesn't have much responsibility yet. But it does
provide a starting point for moving some behavior out of the
state machine.
This introduces HttpVCTable.h and HttpVCTable.cc to hold all the
stuff related to the HttpVCTable struct.
JosiahWI added 7 commits July 12, 2023 07:07
This moves all the client_ fields except the byte counts into
HttpUserAgent, and sets most of the fields when the transaction
is set.
This matches the behavior of the original code, and may be
important.
A not (!) operator was missed, which caused tcp_reused to be
set incorrectly on the user agent. This fixes the issue, and adds
Catch tests to check for regression.
There is one segfault left to fix.
@JosiahWI JosiahWI force-pushed the refactor/http-user-agent branch from 7292945 to acf0f03 Compare July 12, 2023 12:07
@JosiahWI
Copy link
Contributor Author

Rebased on master to resolve merge conflict.

@JosiahWI JosiahWI force-pushed the refactor/http-user-agent branch from 79708a0 to 0dfbfef Compare July 12, 2023 14:26
@JosiahWI
Copy link
Contributor Author

JosiahWI commented Jul 12, 2023

I'm wondering whether it's worth it to build the new test in the automake build. In the CMake build it was convenient to build a single test executable, but the automake build has things split into multiple test executables, and building the one new test would require a large amount of boilerplate (linking 10+ libs or whatnot), and I'm hoping the automake build will eventually be removed anyway so we don't have to maintain two build generators.

}

inline ProxyTransaction *
HttpSM::get_ua_txn()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered if we should keep this because now we can do sm->get_user_agent()->get_txn(). Maybe we should keep it, so we can get a user-agent transaction and a server transaction in a consistent way.

Copy link
Member

@maskit maskit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Assuming functions are just moved and there's no logic change.

@JosiahWI
Copy link
Contributor Author

JosiahWI commented Jul 12, 2023

There isn't supposed to be any logic change. The AuTests passing and the relative simplicity of the change (most of the code was directly copy/pasted) gives me a little confidence that the behavior is preserved, and I am planning to add some more unit tests in a followup PR.

@maskit maskit merged commit 7708542 into apache:master Jul 12, 2023
@JosiahWI JosiahWI deleted the refactor/http-user-agent branch July 13, 2023 20:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants