Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ticket31632 043 01 #1593

Closed
wants to merge 3 commits into from
Closed

Conversation

Labels
None yet
Projects
None yet
4 participants
@dgoulet-tor
Copy link
Contributor

@dgoulet-tor dgoulet-tor commented Dec 10, 2019

No description provided.

dgoulet-tor added 3 commits Dec 10, 2019
Signed-off-by: David Goulet <dgoulet@torproject.org>
Signed-off-by: David Goulet <dgoulet@torproject.org>
If the directory connection dies before being able to upload the descriptor,
we'll attempt to retry to upload again to that HSDir.

Closes #31632

Signed-off-by: David Goulet <dgoulet@torproject.org>
@coveralls
Copy link

@coveralls coveralls commented Dec 10, 2019

Pull Request Test Coverage Report for Build 7520

  • 21 of 42 (50.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.008%) to 62.969%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/feature/dirclient/dirclient.c 1 4 25.0%
src/feature/hs/hs_service.c 20 38 52.63%
Totals Coverage Status
Change from base Build 7517: -0.008%
Covered Lines: 49499
Relevant Lines: 78609

💛 - Coveralls

* attempt a re-upload of the corresponding descriptor. This is called when an
* HS descriptor upload fails.
*
* An upload does NOT occure when:
Copy link
Member

@asn-d6 asn-d6 Dec 10, 2019

s/occure/occur/

} else if (TO_CONN(conn)->purpose == DIR_PURPOSE_UPLOAD_HSDESC &&
conn->hs_ident) {
log_info(LD_DIR, "Giving up on uploading HS descriptor. Retrying.");
hs_service_reupload_desc_to_dir(conn->hs_ident, conn->identity_digest);
Copy link
Member

@asn-d6 asn-d6 Dec 10, 2019

Hmm are we talking about infinite retries here? Could this be a problem?
Like, if the network goes down (e.g. mobile), will the service go mental?

Copy link
Contributor Author

@dgoulet-tor dgoulet-tor Dec 11, 2019

Yeah I thought of that as well and figured you would notice it and give me a good idea here lol...

Would be wise to avoid spamming to infinity the reupload to a single HSDir. We would need something like last_hid_serv_request used for clients. And avoid too many retries.

But there is probably an larger problem here which is that the service side does not keep a state of its desc upload. So imagine all 6 upload failed and retried many times but then we stop, chances are that our network is down for instance. When it comes back up, we should notice that we never uploaded and do it again.

Else, the service thinks it made it happen and will wait between 60 and 120 min before retrying...

There is actually an argument now I think to extend this feature to correctly track our HSDir upload service side. What do you think?

Copy link
Member

@asn-d6 asn-d6 Dec 18, 2019

Hmm, you are right, I'm not sure what's the ideal behavior here. I'm generally afraid of this code path spinning in an andless loop when the network is down. Perhaps we can add a small delay when the network fails us before we retry. But not sure how hard that would be engineering wise :/

@@ -759,6 +760,10 @@ connection_dir_client_request_failed(dir_connection_t *conn)
log_info(LD_DIR, "Giving up on downloading microdescriptors from "
"directory server at '%s'; will retry", conn->base_.address);
connection_dir_download_routerdesc_failed(conn);
} else if (TO_CONN(conn)->purpose == DIR_PURPOSE_UPLOAD_HSDESC &&
conn->hs_ident) {
Copy link
Member

@asn-d6 asn-d6 Dec 10, 2019

Has this feature been tested and if yes, how? The lack of unittests is a bit distressing.

Copy link
Contributor Author

@dgoulet-tor dgoulet-tor Dec 11, 2019

Testing that specific feature is sorta hard in a regression test environment (let say Chutney). One would need to close the channel link in the HSDir upload circuit right when the descriptor uploads...

Even with unit test, this is difficult to test. Uploading a descriptor is async (request <-> reply) so at best the unit test would test the correctness of the function based on what we pass it. (which most of the code is already tested except, I give you that, the new hs_service_reupload_desc_to_dir()).

I'll add a unit test that actually confirms that if we end up in the code path of "connection ended badly", we end up re-uploading.

But I won't be able to test that up to the directory upload :S.

Copy link
Member

@asn-d6 asn-d6 Dec 18, 2019

Suggestion of real life test: Can you test that by setting up an onion service with a small descriptor upload time, and then shutting down the network before it needs to upload descriptor?

@torproject-pusher torproject-pusher deleted the branch torproject:master May 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment