Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abstract out the network IO #1250

Merged
merged 15 commits into from Mar 3, 2021
Merged

Conversation

sechkova
Copy link
Contributor

Description of the changes being introduced by the pull request:

Based on the discussion in #1213 and mainly the idea of @jku, implement an abstract FetcherInterface calss and
a concrete RequestsFetcher class .

The idea is to extract the network IO related code out of tuf.download and put it in RequestsFetcher class which is also the default FetcherInterface implementation which TUF uses if not given an external one.

RequestsFetcher implements only one public method 'fetch' which returns a bytes iterator. All file handling and verification
logic remains in tuf.download which is now wrapper in a FileDownload class for convenience.

Please verify and check that the pull request fulfills the following
requirements
:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

@sechkova
Copy link
Contributor Author

The new GitHub actions have been harsh on me :)) Will review the errors after the holiday break.

@sechkova
Copy link
Contributor Author

sechkova commented Jan 5, 2021

The new GitHub actions have been harsh on me :)) Will review the errors after the holiday break.

Fixed the linter issues, however according to Windows "Download was too slow." 🙄

@sechkova
Copy link
Contributor Author

sechkova commented Jan 6, 2021

Fixed a small issue and force-pushed but (un)fortunately I see no errors while testing on a local windows VM.

@sechkova
Copy link
Contributor Author

First thing, apologize for the spam and using the CI as a test server but I could not reproduce the Windows tests failure locally.

The problem can be summarized in two parts:

  1. Slow 'localhost' lookup on Windows tests: localhost name lookup takes 1sec on Windows in github actions  #1257
  2. A change in this PR which adds the time to connect to the server to the calculation of average_download_speed for the first downloaded chunk which raises SlowRetreivalError in the case of 1.

While 1. can quickly be fixed as suggested by @jku in #1257, 2. is related to the network abstraction api which may need some discussion. The main point is, given all the uncertainty around protection against slow retrieval attack and its recent removal from the specification, where should we implement the slow retrieval check:

  1. Keep the fetcher unaware of it and implement the download speed calculations in its caller function.
    • has the obvious pros that fetcher implementation is unaware of it and the check is performed by TUF itself
    • cons: incorrectly adds connection time to the actual chunk download time
  2. Keep the fetcher unaware of it but make fetcher API more complicated by splitting session establishment/connection and downloading data
    • pros: solves the timing issue
    • cons: undesirably complicated api
  3. Move the download speed calculation and slow retrieval checks as part of TUF's fetcher implementation
    • pros: replicates current behaviour
    • cons: different fetcher implementations can(and most likely will) totally ignore the slow retrieval check

My choice is in favour of point 3 but I would like to hear more opinions before going ahead.

@lukpueh
Copy link
Member

lukpueh commented Jan 12, 2021

Thanks for the detailed comment, @sechkova (and of course for the patch, which I haven't looked at yet). Here are my thoughts:

  • I'd rule out (2). This attack, which is no longer part of the spec (Remove slow retrieval attacks from protections specification#111), does not seem to justify a more complicated API.
  • I think (3) is perfectly fine, maybe we can add a big disclaimer to the API docs that says that only the default fetcher provides this protection, and any custom implementation must do it themselves.
  • Regardless, wouldn't it be possible to achieve (1) by granting some grace period in the beginning, where the fetcher may drop below the rate? AFAIU a successful slow-retrieval attack requires quite some time, so if we start checking the average rate only after enough time has passed to likely smooth out the spike caused by connection/session establishment, we are probably able to detect the attack early enough.

@trishankatdatadog
Copy link
Member

I agree with @lukpueh: (1) makes more sense to me, depending on the philosophical motivation for the fetcher. Are we simply trying to fetch bytes or do something more?

@jku
Copy link
Member

jku commented Jan 12, 2021

I lean towards option 3 (mostly because I do not believe our slow retrieval protection works without false positives -- Teodoras test failures are a strange example of this IMO -- and I would like to not have these false positives in pip)

wouldn't it be possible to achieve (1) by granting some grace period in the beginning, where the fetcher may drop below the rate? AFAIU a successful slow-retrieval attack requires quite some time, so if we start checking the average rate only after enough time has passed to likely smooth out the spike caused by connection/session establishment, we are probably able to detect the attack early enough.

I originally suggested the same thing, as this might also help with the massive bias that the current implementation has. Example of the bias: Imagine a download of 10 chunks where 9 chunks transfer at "normal internet speeds" but one is slow:

  • if the slow chunk is the first one the download will be canceled if speed is < 50B/s
  • if the slow chunk is the last one the download will happily succeed even if the chunk speed is 1 B/s (a very long time later but it will succeed)

However, solving the above issue issue properly is no doubt complex -- probably the best we could do is special case the first chunk handling, just giving it a few extra seconds... I don't see why this is better than letting the fetcher handle it.

@jku
Copy link
Member

jku commented Jan 12, 2021

Oh additional reason for choice 3 (that Teodora kind of mentioned): the patch will probably be cleaner as it will be easier to just move existing code from download.py to Fetcher instead of developing similar but not quite same code.

@jku
Copy link
Member

jku commented Jan 12, 2021

I agree with @lukpueh: (1) makes more sense to me, depending on the philosophical motivation for the fetcher. Are we simply trying to fetch bytes or do something more?

The issue is that the first call to the fetch() generator also creates the connection because there is no separate initialization function... now that I said it aloud, I think it has to be possible to initialize the generator separately. I'll look this up. Thanks.

@joshuagl
Copy link
Member

I am in favour of option three, both because we don't have a specification for slow retrieval attacks, and because we are not confident about the effectiveness of the current implementation of slow retrieval attacks.

First thing, apologize for the spam and using the CI as a test server but I could not reproduce the Windows tests failure locally.

I didn't receive any spam about CI failures, so I think no need to apologise here 😄

@sechkova sechkova force-pushed the fetcher branch 3 times, most recently from ebef202 to e8be76b Compare January 18, 2021 12:51
@sechkova
Copy link
Contributor Author

Thanks all for giving your opinion and especially @jku for the ideas, most of which I think are included in the latest changes.
Feel free to review 😉

tuf/fetcher.py Outdated

# Check response status.
response.raise_for_status()
response = session.get(url, stream=True, timeout=self.SOCKET_TIMEOUT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there's no longer a context manager in use for response, do we need to call some cleanup method?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also maybe double check what happens when connect times out. I expect that's a ReadTimeoutError which should be handled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did some cleanup in 77900af.

Connect time out will raise an exception that will be captured in NoWorkingMirrorError. This is the same as the current behaviour.

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it except for the fact that it rewrites download.py completely :) I understand it's mostly an indent change but I was really hoping that wasn't needed.

I have not yet reviewed download.py changes -- I will do that but I'll leave an early review for this question: What compromises would we need to make to keep download.py changes to a minimum?

tuf/fetcher.py Outdated



class FetcherSettings():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the construct? I can see it should work but what is the advantage of a separate FetcherSettings and RequestsFetcher?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to separate FetcherSettings so that other FetcherInterface implementations can reuse them (by inheriting) or replace them with their own. But given that we cannot remove them from tuf.settings just like that, I dropped this change too.

tuf/settings.py Outdated
Comment on lines 77 to 82
# Set a timeout value in seconds (float) for non-blocking socket operations.
SOCKET_TIMEOUT = 4 #seconds

# The maximum chunk of data, in bytes, we would download in every round.
CHUNK_SIZE = 400000 #bytes

Copy link
Member

@jku jku Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are settings that unfortunately maybe used by someone so I don't think we should just silently delete them without deprecation.

I wonder if the new implementation should still use these as the default values for its properties... and possibly mark these as deprecated, explaining that these values are now fetcher specific and that the default RequestsFetcher exposes the same properties (but that those properties might not be considered API(?) and those changing them should consider making their own fetcher)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally forgot about settings being part of the API. I dropped the commit and just used tuf.settings in RequestsFetcher.

I suggest we revise tuf.settings in general and then apply the decisions here.

Comment on lines 662 to 664
fetcher:
A conctrete 'FetcherInterface' implementation. If not provided,
tuf.fetcher.RequestsFetcher is used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo on concrete. Also it's lacking a sentence on what fetcher does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b0e8acb

@sechkova sechkova force-pushed the fetcher branch 3 times, most recently from a749a0d to abac87f Compare January 22, 2021 14:21
@sechkova
Copy link
Contributor Author

Reworked some changes after discussion with @jku. Download.py should be more appealing for a review now!

self.temp_file.seek(0)
temp_file_data = self.temp_file.read().decode('utf-8')
self.assertEqual(self.file_contents, temp_file_data)
self.assertEqual(self.file_length, len(temp_file_data))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to check that when you know that the self.file_contents is equal to temp_file_data from the previous assertEqual?

Copy link
Contributor Author

@sechkova sechkova Jan 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. Thanks for noting, seems like I'll have to take another look on the test cases.


# Test incorrect URL parsing
def test_url_parsing(self):
with self.assertRaises(tuf.exceptions.URLParsingError) as cm:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly will happen if this test fails? What about when it succeeds?
I just haven't seen the usage of a context manager with an exception...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with self.assertRaises(ExpectedError):
  function_under_test(argument1, argument2)

is functionally the same as

self.assertRaises(ExpectedError, function_under_test, argument1, argument2)

but more readable especially with longer names and more arguments, and also more extensible (as you can have more than one line in the block as test_url_parsing() does).

The as cm seems unneeded (but harmless) in this case as it's not used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but more readable especially with longer names and more arguments, and also more extensible (as you can have more than one line in the block as test_url_parsing() does).

This is what I've meant when writing it but I guess I had plans to use cm that didn't happen ...

@joshuagl
Copy link
Member

Would it be possible to ensure the new files (tuf/client/fetcher.py and tuf/requests_fetcher.py) adhere to the new coding style? Of particular note is the docstring format.

@sechkova
Copy link
Contributor Author

Would it be possible to ensure the new files (tuf/client/fetcher.py and tuf/requests_fetcher.py) adhere to the new coding style? Of particular note is the docstring format.

The docstrings are updated. I haven't applied the rules from the new coding style which cause linter errors. Since the changes in the PR are meant to work with the existing code they follow the old style. Let's review the coding style during the transition to TUF 1.0.0

Copy link
Member

@joshuagl joshuagl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments, which you may choose to address or not, and a couple of questions. I will be happy to merge this in its current state once we've released v0.17.0 (#1284).

Thank you for the solid work on this!

tests/test_fetcher.py Outdated Show resolved Hide resolved
tests/test_fetcher.py Outdated Show resolved Hide resolved
tests/test_fetcher.py Show resolved Hide resolved
tests/test_fetcher.py Outdated Show resolved Hide resolved
temp_file_data = self.temp_file.read().decode('utf-8')
self.assertEqual(self.file_contents, temp_file_data)
# Double check that we download more than one chunk
self.assertGreater(chunks_count, 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this check is unnecessary? We are checking for the expected number of chunks below, what additional value does this provide?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reworked it a bit.

Does this line express my intention better:
https://github.com/theupdateframework/tuf/blob/c794e0b917ba79327277f2846eda0c9a005b6730/tests/test_fetcher.py#L122-L123

The goal is to make sure that the calculation of expected_chunks is correct and results in more than one chunk downloaded.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I had confused variables. Apologies. 🤦
The locality of the check makes the intention clearer, I don't think we should be using assert() over self.assert* here though.

Copy link
Contributor Author

@sechkova sechkova Feb 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did change the variable to make the test clearer 🙄
Here it is, using the unittest's "assert" 50b3b19

tuf/download.py Show resolved Hide resolved
The new class FetcherInterface defines an interface for
abstract network download which can be implemented for a
variety of network libraries and configurations.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
A concrete implementation of FetcherInterface based on
the Requests library.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Abstract the network IO. Move the network operations from
tuf.download to the RequestsFercher class which is TUF's
implementation of the abstract FetcherInterface.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Initialize Updater with an external implementation of
FetcherInterface. If not provided, tuf.fetcher.RequestsFetcher
is used.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Pass RequestsFetcher object to tuf.download functions.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Add unit test for requests_fetcher.py

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
- Update RequestsFetcher.fetch to return a generator object.
- Update _download_file to skip connection time when calculating
average download speed.
- Write chunk to temp file before exiting the fetcher loop
on error.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
On Windows (Github Actions) the lookup for 'localhost' takes 1 second.
This is because:
- Windows retries connect() with a timeout
- the machine has IPv6 and IPv4 but Testserver only binds the port on IPv4
- the test clients connect to 'localhost'

Since socketserver.TCPServer does not seem to support IPv6 before 3.8,
just replace 'localhost' with '127.0.0.1' in client-side URLs.

See theupdateframework#1257

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Use '>=' as the defensive version of the equality check.

Add a comment describing the need of a chunks() generator.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
@sechkova
Copy link
Contributor Author

Addressed the latest comments and rebased on the latest changes from the development branch.

sechkova and others added 6 commits February 24, 2021 11:41
Add test cases to test_fetcher and test_download that
decrease default chunk size and download data in more
than one chunk.

Small code-style improvements.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Calls to safe_download and unsafe_download leave
a temporary file unclosed.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
A custom error is required so that updater is able to special case
403 & 404 status codes.

Rewrite the test case a bit to be more readable.

Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
* Move FetcherInterface to tuf/client/ directory: This way everything
  inside that directory is clearly part of client API, and everything
  outside _may_ be more of an implementation detail (settings is still
  an unfortunate exception)
* Keep RequestsFetcher in tuf/ for same reasons: it's just the default
  implementation, not explicitly part of client API

An even clearer division would be if we moved all the client specific
implementation details (download.py, mirrors.py, requests_fetcher.py)
to tuf/client/_internal/ but that's a larger change...

Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
Use a common test level constant for defining
the host address forming the download URL on
the client side.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Ensure that the newly added files' docstrings adhere to the
recently adopted code style guideline (theupdateframework#1232).

Small code style improvements in comments and imports.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
@jku
Copy link
Member

jku commented Mar 1, 2021

Had another look, I agree with my previous self: LGTM.

Notes:

  • The whole thing is a mostly faithful reproduction of the earlier code, just in a RequestsFetcher.fetch() generator function
  • requests_fetcher.py, download.py and mirrors.py are implementation details of updater.py: this is why download.py "API change" was seen as reasonable in the PR
  • download.py does not provide much value anymore but I think keeping it is the correct call to keep the PR to a reasonable size
  • This PR fixes the issues that I can see with pip integration (ability to tweak network stack tweakables, ability to do progress notifications, ability to compress metadata on the wire)
  • The test server address fix could be more integrated (like TestServerProcess.get_base_url() that returns something like "http://127.0.0.1:1234/" but I think this is good enough for this PR)

There's an approval already so I'll merge this soon if no complaints appear.

@sechkova sechkova mentioned this pull request Mar 1, 2021
17 tasks
@lukpueh
Copy link
Member

lukpueh commented Mar 1, 2021

Thanks for all the efforts you put into this PR, @sechkova and @jku! And kudos to the reviewers, @joshuagl and @MVrachev. ❤️

I won't add any meaningful comments here, but based on a quick glance through the diff and all the other comments and discussions above, I'm sure the approval is well justified. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants