Add generic re-try logic to all of fedimg's operations #34

Open
ralphbean opened this Issue Sep 17, 2015 · 0 comments

ralphbean commented Sep 17, 2015

Right now, the cloud just plain fails sometimes, as in this error email I got today:

[2015-09-17 13:20:12][    fedmsg   ERROR]
Image copy to ap-southeast-2 failed


Process Details
---------------
host:     fedimg01.phx2.fedoraproject.org
PID:       8640
name:     fedmsg-hub
command:  /usr/bin/python /usr/bin/fedmsg-hub

Callstack that lead to the logging statement
--------------------------------------------
  File "/usr/lib64/python2.7/threading.py", line 784 in __bootstrap
    self.__bootstrap_inner()
  File "/usr/lib64/python2.7/threading.py", line 811 in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 764 in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 113 in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 65 in mapstar
    return map(*args)
  File "/usr/lib/python2.7/site-packages/fedimg/uploader.py", line 46 in <lambda>
    results = pool.map(lambda s: s.upload(), services)
  File "/usr/lib/python2.7/site-packages/fedimg/services/ec2.py", line 627 in upload
    ami['region']))
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/fedimg/services/ec2.py", line 605, in upload
    description=self.image_desc)
  File "/usr/lib/python2.7/site-packages/libcloud/compute/drivers/ec2.py", line 2691, in copy_image
    self.connection.request(self.path, params=params).object)
  File "/usr/lib/python2.7/site-packages/libcloud/common/base.py", line 799, in request
    response = responseCls(**kwargs)
  File "/usr/lib/python2.7/site-packages/libcloud/common/base.py", line 143, in __init__
    headers=self.headers)
BaseHTTPError: InternalError: An internal error has occurred

The cloud just doesn't always work 100% of the time; we know that.

So, we should wrap most of fedimg's operations in little re-try loops that try 3 or 4 times before giving up.
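
To make the idea concrete, here is a minimal sketch of what such a re-try wrapper could look like. It is not fedimg code: the `retry` decorator, its parameters, and the `flaky_copy` stand-in are all hypothetical names made up for illustration, and the back-off values are arbitrary. In fedimg itself the `exceptions` argument would presumably be narrowed to whatever libcloud raises for transient failures (the traceback above shows `BaseHTTPError`), but that choice is an assumption.

```python
import functools
import logging
import time

log = logging.getLogger(__name__)


def retry(attempts=3, delay=1.0, backoff=2.0, exceptions=(Exception,),
          sleep=time.sleep):
    """Return a decorator that re-runs a callable when it raises.

    Hypothetical helper, not part of fedimg or libcloud.
    attempts   -- total number of tries, including the first one
    delay      -- seconds to wait before the second try
    backoff    -- multiplier applied to the wait after each failure
    exceptions -- tuple of exception classes that trigger a re-try
    sleep      -- injectable sleep function, handy for tests
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions as exc:
                    # Out of tries: let the last error propagate unchanged.
                    if attempt == attempts:
                        raise
                    log.warning("%s failed (try %d of %d): %s",
                                func.__name__, attempt, attempts, exc)
                    sleep(wait)
                    wait *= backoff
        return wrapper
    return decorator


# Stand-in for a cloud call that fails twice and then succeeds.
calls = []


@retry(attempts=4, delay=0, exceptions=(RuntimeError,))
def flaky_copy():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("InternalError: An internal error has occurred")
    return "copied"
```

Calling `flaky_copy()` here returns normally on the third try. Re-raising the final exception, and not swallowing it, keeps the existing error emails working for failures that persist past the last attempt.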
