Check validity of downloads #1091

Closed
vsoch opened this issue Oct 24, 2017 · 11 comments

@vsoch
Collaborator

vsoch commented Oct 24, 2017

We should, as a sanity check, make sure that docker layers (and other bits) are downloaded entirely and completely before we dump them into the image. I've seen a few issues come up that (possibly?) are related to a bad download, meaning one that completes (but is corrupt for some other reason).
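As a rough sketch of what such a check could look like: Docker registry manifests identify each layer by a digest of the form `sha256:<hex>`, so a downloaded file can be hashed and compared against it before it is extracted. The function name `verify_layer` and its parameters are made up for illustration and are not part of Singularity's code.

```python
import hashlib


def verify_layer(path, expected_digest, chunk_size=1 << 20):
    """Return True if the file at `path` matches a digest like 'sha256:<hex>'.

    `verify_layer` is a hypothetical helper, not an existing Singularity function.
    """
    algorithm, _, expected_hex = expected_digest.partition(":")
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large layers are never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == expected_hex.lower()
```

A caller would raise an error and discard the file whenever this returns False, rather than dumping the layer into the image.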

@dtrudg
Contributor

dtrudg commented Apr 17, 2018

In Slack @vsoch came across this issue again - a Cloudflare ban page had been returned with a 200 success status. A content check is needed to address this, not just status code checks.
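A minimal sketch of a content check, assuming layers arrive as gzip-compressed tarballs (so the body starts with the gzip magic bytes `1f 8b`). The helper name `looks_like_layer` is hypothetical, and this is only a cheap heuristic that would sit alongside a proper digest comparison, not replace it.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def looks_like_layer(body, content_type=""):
    """Heuristic: True if a response body could be a gzipped layer tarball.

    Hypothetical helper; assumes layers are gzip-compressed.
    """
    # An HTML error page served with status 200 is rejected outright.
    if "text/html" in content_type.lower():
        return False
    return body[:2] == GZIP_MAGIC
```

With this in place, a ban page like the one pasted in this thread would be rejected even though the status code was 200.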

@vsoch
Collaborator Author

vsoch commented Apr 18, 2018

This is a fun investigation :) Here is the full response:

<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js ie6 oldie" lang="en-US"> <![endif]-->
<!--[if IE 7]>    <html class="no-js ie7 oldie" lang="en-US"> <![endif]-->
<!--[if IE 8]>    <html class="no-js ie8 oldie" lang="en-US"> <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en-US"> <!--<![endif]-->
<head>
<title>Access denied | production.cloudflare.docker.com used Cloudflare to restrict access</title>
<meta charset="UTF-8" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1" />
<meta name="robots" content="noindex, nofollow" />
<meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1" />
<link rel="stylesheet" id="cf_styles-css" href="/cdn-cgi/styles/cf.errors.css" type="text/css" media="screen,projection" />
<!--[if lt IE 9]><link rel="stylesheet" id='cf_styles-ie-css' href="/cdn-cgi/styles/cf.errors.ie.css" type="text/css" media="screen,projection" /><![endif]-->
<style type="text/css">body{margin:0;padding:0}</style>
<!--[if gte IE 10]><!--><script type="text/javascript" src="/cdn-cgi/scripts/zepto.min.js"></script><!--<![endif]-->
<!--[if gte IE 10]><!--><script type="text/javascript" src="/cdn-cgi/scripts/cf.common.js"></script><!--<![endif]-->
</head>
<body>
  <div id="cf-wrapper">
    <div class="cf-alert cf-alert-error cf-cookie-error" id="cookie-alert" data-translate="enable_cookies">Please enable cookies.</div>
    <div id="cf-error-details" class="cf-error-details-wrapper">
      <div class="cf-wrapper cf-header cf-error-overview">
        <h1>
          <span class="cf-error-type" data-translate="error">Error</span>
          <span class="cf-error-code">1010</span>
          <small class="heading-ray-id">Ray ID: 40ca50f5684e37ce &bull; 2018-04-16 23:08:51 UTC</small>
        </h1>
        <h2 class="cf-subheadline">Access denied</h2>
      </div><!-- /.header -->
      <section></section><!-- spacer -->
      <div class="cf-section cf-wrapper">
        <div class="cf-columns two">
          <div class="cf-column">
            <h2 data-translate="what_happened">What happened?</h2>
            <p>The owner of this website (production.cloudflare.docker.com) has banned your access based on your browser's signature (40ca50f5684e37ce-ua48).</p>
          </div>
          
        </div>
      </div><!-- /.section -->
      <div class="cf-error-footer cf-wrapper">
  <p>
    <span class="cf-footer-item">Cloudflare Ray ID: <strong>40ca50f5684e37ce</strong></span>
    <span class="cf-footer-separator">&bull;</span>
    <span class="cf-footer-item"><span>Your IP</span>: 96.10.226.142</span>
    <span class="cf-footer-separator">&bull;</span>
    <span class="cf-footer-item"><span>Performance &amp; security by</span> <a href="https://www.cloudflare.com/5xx-error-landing?utm_source=error_footer" id="brand_link" target="_blank">Cloudflare</a></span>
    
  </p>
</div><!-- /.error-footer -->
    </div><!-- /#cf-error-details -->
  </div><!-- /#cf-wrapper -->
  <script type="text/javascript">
  window._cf_translation = {};
  
  
</script>
</body>
</html>

I think the number 40ca50f5684e37ce-ua48 is a specific id (the first part before the dash) tied to a user agent code, one from the cloudflare lookup that docker production is using. It looks a lot like one of these rules --> https://api.cloudflare.com/#user-agent-blocking-rules-list-useragent-rules

The Ray ID looks like something that is passed around that would help us get support, if needed --> https://support.cloudflare.com/hc/en-us/articles/200169746-Adding-the-CF-RAY-header-to-your-logs.

If I search that page for error code 1010 there are three options:

  • 1010 | Invalid or missing record
  • 1010 | Failed to validate SAN :
  • 1010 | Bulk deal limit reached (this is a zone error code)

This one --> https://support.cloudflare.com/hc/en-us/articles/200171806-Error-1010-The-owner-of-this-website-has-banned-your-access-based-on-your-browser-s-signature actually matches most closely, and the error code matches too, but it isn't super informative. The weird dance through proxies and removing headers has me thinking this is probably it.

I know that requests automatically adds a user agent (Python Requests something) but I couldn't figure out what old school urllib and urllib2 would return, especially after all the ways it's passed around and used in Singularity. It could be any of the following:

  1. If the User-Agent is just missing, we could try adding one, one that is specific to the user at the time (so if installed in a container it wouldn't be shared)
  2. If the User Agent is present but too common and then reaching some rate limit, we could still try changing it, but maybe to one of the top used ones so it's not flagged as likely to be a bot. My thinking behind this one is that I was making requests from a very base (commonly used) container... and if everyone is making requests from that container, or even one large user, it would make sense to be blocked.
  3. If the issue is that the request is going through a redirect, and the redirect always passes on the same user agent, that would be an issue for the service providing the redirect, because all users would be then sharing that user agent.

So some sanity checks to try -

  1. Trace the user agent through the requests / responses
  2. Determine if a user agent is consistent between different hosts in the "same" container, or on a single host and a container used by it.
  3. Add a custom user agent, one that is most common --> https://techblog.willshouse.com/2012/01/03/most-common-user-agents/
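For the first and third checks, here is a small sketch using Python 3's `urllib.request` (the thread concerns Python 2's urllib/urllib2, where the module layout differs). The helper names, the URL, and the `Singularity-test/0.0` string in the usage note are placeholders, not values Singularity actually uses. Nothing is sent over the network; the code only inspects what would be sent.

```python
import urllib.request


def default_user_agent():
    """Return the User-Agent that urllib attaches when none is set."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs applied to every request.
    return dict(opener.addheaders).get("User-agent")


def build_request(url, user_agent):
    """Build (but do not send) a request with an explicit User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

Calling `build_request("https://example.invalid/v2/", "Singularity-test/0.0").get_header("User-agent")` returns the custom string, which is one way to confirm what a request would carry before it leaves the container. Note that `Request` stores header names capitalized as `User-agent`.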

@vsoch
Collaborator Author

vsoch commented Apr 18, 2018

Also pinging @thiell here because he helped with the investigation above! (And might be interested too).

@dtrudg
Contributor

dtrudg commented Apr 18, 2018

It would be valid to set a custom user-agent, but I'm not sure that would change things much here unless you are making requests from a common IP shared with loads of people - and those people also happen to be making requests to Docker hub using python urllib.

AFAIK the default user agent for urllib only really varies based on the version of python installed at the location you use it.

Redirection shouldn't be an issue - when a redirect response is issued, the original client makes a new request (sending its own user-agent) directly to the referred address. A redirect isn't a 'passed on' request, it results in a brand new request by the originator. I think what you are getting at is proxying - If you happen to be sitting behind a web proxy then it's a lot easier to hit rate limits, as all users will be going out from a small set of IPs.

Where are you doing these Singularity pulls from, @v? How many do you think you are doing? Also, are you using any authentication to the Docker registry? There is a known bug, with a PR fix, that can easily hit rate limits if you make a lot of pulls and happen to specify invalid credentials: #1406

@vsoch
Collaborator Author

vsoch commented Apr 18, 2018

The pulls are from the Tunel container --> https://singularityhub.github.io/interface and I have hit it twice, always with a vanilla ubuntu container. If I go away for a few hours and come back it usually works again! I really wasn't doing more than maybe a few an hour, for testing different views. I first thought it was something in Tunel, but when I went to the base command line in the container I still had the issue. I was hoping it was something related to running Singularity in Docker, where some common algorithm means that anyone using the same ubuntu container would generate the same user agent? There definitely isn't any authentication here, beyond just standard pull stuffs.

@dtrudg
Contributor

dtrudg commented Apr 18, 2018

Hi @v - I meant where is Tunel running? Are you on a University network behind a shared proxy? On a cloud service? etc. When you say you still had the issue in the base command line, do you mean just doing stuff from the command line triggered it? Or did the problem remain after it had become apparent when using Tunel?

This is a tough one to reason out if you are only doing a few pulls per hour. I've yet to trigger a Cloudflare block doing anything, and I built about 2000 Singularity containers from docker pulls over a weekend once from a residential IP, and the same thing in a shorter time-span from a University IP.

The user agent looks like this: Python-urllib/2.7. It's the same for everyone using the same version of Python. If you are blocked, it's almost certainly associated with at least user agent + IP address, though.

@vsoch
Collaborator Author

vsoch commented Apr 18, 2018

oh! Of course! One time I was on shared public wireless, the second time on my own wireless, which is possibly provided by the same company (but different router of course). It was on my computer (Ubuntu 16.04), using Docker version 18.03.0-ce, build 0520e24 and the latest development branch 2.x. When I say command line, I mean the Docker container command line where Singularity is installed (not directly on the host). This was the first time I've seen it, which is why I thought it was something specific about being in a Docker container. But most stuffs is working ok! :)

@dtrudg
Contributor

dtrudg commented Apr 18, 2018

Okay - I have a theory on what might be leading to cloudflare bans here.

The python code to pull docker images is abusing the authentication token mechanism of the docker registry pretty badly.

  1. Initial request related to the pull happens, gets a 401 as a token is needed
  2. update_token does its stuff, requesting a 9000s expiry token - this is as it should be, but 9000s is crazy long.
  3. Manifests are retrieved.
  4. multiprocessing of layer downloads starts
  5. After each layer download update_token is called, even if we are only 0.5s into the 9000s token expiry. Note that because of multiprocessing these calls can be concurrent

If I pull e.g. dctrud/docker-aufs-sanity there are 5 small layers and it takes less than 3 seconds to get all of them, but Singularity is making 9 requests for auth tokens when there should only be one.

This type of behaviour could definitely be grounds for a temporary API ban.

We need 2 fixes here:

  1. Update token shouldn't request a new token if the original hasn't yet expired
  2. We really need to checksum downloaded layers, and throw up an error at point of download if checksum doesn't match.
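The first fix could be sketched roughly as follows. `TokenCache`, `fetch_token`, and `margin` are hypothetical names for illustration and do not reflect how update_token is actually written; `fetch_token` stands in for whatever call requests a token from the registry and is assumed to return a `(token, lifetime_seconds)` pair.

```python
import time


class TokenCache:
    """Reuse an auth token until it is close to expiry (hypothetical sketch)."""

    def __init__(self, fetch_token, margin=30, clock=time.monotonic):
        self._fetch_token = fetch_token  # returns (token, lifetime_seconds)
        self._margin = margin            # refresh this many seconds early
        self._clock = clock              # injectable so it can be tested
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        # Only go back to the registry when there is no token yet,
        # or the current one is within `margin` seconds of expiring.
        if self._token is None or now >= self._expires_at - self._margin:
            self._token, lifetime = self._fetch_token()
            self._expires_at = now + lifetime
        return self._token
```

This sketch only covers a single process. Since the layer downloads run under multiprocessing, the token would presumably need to be fetched once in the parent and handed to the workers, otherwise each worker would still request its own.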

@vsoch
Collaborator Author

vsoch commented Apr 18, 2018

This was just reported again by @sbutcher

<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js ie6 oldie" lang="en-US"> <![endif]-->
<!--[if IE 7]>    <html class="no-js ie7 oldie" lang="en-US"> <![endif]-->
<!--[if IE 8]>    <html class="no-js ie8 oldie" lang="en-US"> <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en-US"> <!--<![endif]-->
<head>
<title>Access denied | production.cloudflare.docker.com used Cloudflare to restrict access</title>
<meta charset="UTF-8" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1" />
<meta name="robots" content="noindex, nofollow" />
<meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1" />
<link rel="stylesheet" id="cf_styles-css" href="/cdn-cgi/styles/cf.errors.css" type="text/css" media="screen,projection" />
<!--[if lt IE 9]><link rel="stylesheet" id='cf_styles-ie-css' href="/cdn-cgi/styles/cf.errors.ie.css" type="text/css" media="screen,projection" /><![endif]-->
<style type="text/css">body{margin:0;padding:0}</style>
<!--[if gte IE 10]><!--><script type="text/javascript" src="/cdn-cgi/scripts/zepto.min.js"></script><!--<![endif]-->
<!--[if gte IE 10]><!--><script type="text/javascript" src="/cdn-cgi/scripts/cf.common.js"></script><!--<![endif]-->
</head>
<body>
  <div id="cf-wrapper">
    <div class="cf-alert cf-alert-error cf-cookie-error" id="cookie-alert" data-translate="enable_cookies">Please enable cookies.</div>
    <div id="cf-error-details" class="cf-error-details-wrapper">
      <div class="cf-wrapper cf-header cf-error-overview">
        <h1>
          <span class="cf-error-type" data-translate="error">Error</span>
          <span class="cf-error-code">1010</span>
          <small class="heading-ray-id">Ray ID: 40d8311b28e00a8a &bull; 2018-04-18 15:33:47 UTC</small>
        </h1>
        <h2 class="cf-subheadline">Access denied</h2>
      </div><!-- /.header -->
      <section></section><!-- spacer -->
      <div class="cf-section cf-wrapper">
        <div class="cf-columns two">
          <div class="cf-column">
            <h2 data-translate="what_happened">What happened?</h2>
            <p>The owner of this website (production.cloudflare.docker.com) has banned your access based on your browser's signature (40d8311b28e00a8a-ua48).</p>
          </div>
          
        </div>
      </div><!-- /.section -->
      <div class="cf-error-footer cf-wrapper">
  <p>
    <span class="cf-footer-item">Cloudflare Ray ID: <strong>40d8311b28e00a8a</strong></span>
    <span class="cf-footer-separator">&bull;</span>
    <span class="cf-footer-item"><span>Your IP</span>: xxx.xx.x.xxx</span>
    <span class="cf-footer-separator">&bull;</span>
    <span class="cf-footer-item"><span>Performance &amp; security by</span> <a href="https://www.cloudflare.com/5xx-error-landing?utm_source=error_footer" id="brand_link" target="_blank">Cloudflare</a></span>
    
  </p>
</div><!-- /.error-footer -->
    </div><!-- /#cf-error-details -->
  </div><!-- /#cf-wrapper -->
  <script type="text/javascript">
  window._cf_translation = {};

@vsoch
Collaborator Author

vsoch commented Apr 18, 2018

@dctrud I think that must be it! +1 on either of those solutions, I would only refresh given expired.

@dtrudg
Contributor

dtrudg commented Apr 28, 2018

Closing - this was included in 2.5.0.

@dtrudg dtrudg closed this as completed Apr 28, 2018