[patch] pgloader hangs on connection errors (python 2.4) #1

Closed
dpanech opened this issue Sep 19, 2010 · 3 comments

@dpanech
Contributor

dpanech commented Sep 19, 2010

Hi,

It seems that pgloader hangs after encountering many types of errors, e.g., database connection errors (but possibly other kinds of errors, too). It looks like it hangs after catching (and printing) the exception. I think it may have something to do with multi-threading.

Here's the output of a sample invocation when the specified DB server is down:

$ ./pgloader.py -c pgloader.conf -d
pgloader     INFO     Logger initialized
pgloader     WARNING  path entry '/usr/share/python-support/pgloader/reformat' does not exists, ignored
pgloader     INFO     Reformat path is []
pgloader     INFO     Will consider following sections:
pgloader     INFO       data
pgloader     INFO     Will load 1 section at a time
data         INFO     Loading threads: 1
data         ERROR    could not connect to server: Connection refused
    Is the server running on host "localhost" and accepting
    TCP/IP connections on port 54321?

Exception in thread data:
Traceback (most recent call last):
  File "/usr/lib/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "/home/dpanech/pgloader_git/pgloader/pgloader/pgloader.py", line 831, in run
    self._postinit()
  File "/home/dpanech/pgloader_git/pgloader/pgloader/pgloader.py", line 212, in _postinit
    self.db.reset()
  File "/home/dpanech/pgloader_git/pgloader/pgloader/db.py", line 196, in reset
    raise PGLoader_Error, "Can't connect to database"
PGLoader_Error: Can't connect to database

At this point it hangs and ignores SIGINT (KeyboardInterrupt).
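
One way a hang like this can arise, as a minimal sketch rather than a diagnosis of pgloader itself: a worker thread dies on an uncaught exception before it sets the event the main thread is waiting on. The `Loader` class, its `fail` and `guarded` parameters, and `waits_forever` are hypothetical names used only for illustration, and the code uses Python 3 syntax rather than Python 2.4. A timeout stands in for the indefinite wait so the example terminates.

```python
import threading


class Loader(threading.Thread):
    """Hypothetical worker that signals completion through an Event."""

    def __init__(self, fail, guarded):
        super().__init__()
        self.daemon = True
        self.fail = fail          # simulate a connection error
        self.guarded = guarded    # whether the event is set in a finally block
        self.finished = threading.Event()

    def work(self):
        if self.fail:
            raise RuntimeError("Can't connect to database")

    def run(self):
        if not self.guarded:
            self.work()           # an exception here skips the set() below
            self.finished.set()
            return
        try:
            self.work()
        finally:
            self.finished.set()   # reached on success and on failure


def waits_forever(loader, timeout=0.5):
    """Return True if the main thread would still be waiting after `timeout`."""
    loader.start()
    return not loader.finished.wait(timeout)
```

With `fail=True` and `guarded=False` the event is never set, so a main thread waiting without a timeout would block indefinitely. In Python 2, such a blocking wait generally could not be interrupted by Ctrl-C, which would be consistent with the ignored SIGINT, though that is an assumption about what pgloader's main thread is doing here.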

I'm using python 2.4, psycopg 2.0.13, libpq 8.1.21 on RedHat EL 5 (update 5).

Thanks,
D.

@dpanech
Contributor Author

dpanech commented Sep 20, 2010

Here's a patch that fixes some of these problems:

  • sets a flag ("success") on the loader objects any time an exception is caught in a thread
  • the main function checks all threads for the success flag and returns 0 (all threads finished successfully) or 1 (at least one thread had an exception, or some other exception occurred in the main thread).
  • The script no longer returns the result of "print_summary" to the OS. I'm not sure what that's supposed to do, but it seems wrong. Note that only values 0..127 are portable for sys.exit return codes, with 0 meaning "success". Without this it's impossible to tell whether pgloader succeeded or not after running it from, say, another script.
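
The pattern described in the list above can be sketched in isolation as follows. This is a simplified Python 3 illustration, not the patch itself: the `Loader` class here takes a hypothetical `job` callable, and `load_data` takes a hypothetical `jobs` dictionary, neither of which matches pgloader's real signatures.

```python
import threading


class Loader(threading.Thread):
    """Hypothetical worker that records whether its job succeeded."""

    def __init__(self, job):
        super().__init__()
        self.job = job
        self.success = False
        self.finished = threading.Event()

    def run(self):
        try:
            self.job()
            self.success = True
        except Exception as e:
            # log the error instead of letting the thread die silently
            print("ERROR %s" % e)
        finally:
            # always announce completion so the main thread can stop waiting
            self.finished.set()


def load_data(jobs):
    """Run every job in its own thread; return 0 on success, 1 otherwise."""
    threads = dict((name, Loader(job)) for name, job in jobs.items())
    for loader in threads.values():
        loader.start()
    for loader in threads.values():
        loader.finished.wait()

    # check whether any thread failed
    for loader in threads.values():
        if not loader.success:
            return 1
    return 0
```

A script entry point would then pass the result to the OS with `sys.exit(load_data(jobs))`, so a calling script can test for a non-zero status.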

Also, it would be nice if pgloader returned some well-defined (and documented) error codes to the OS, so calling scripts can check them, maybe something like:

  • 0 -- success
  • 1 -- fatal error
  • 2 -- some records were rejected
  • ...
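
As a rough sketch of what such a scheme might look like in Python 3, covering only the three codes listed above: the constant names, the `exit_code` function, and its `fatal_error` and `rejected_rows` parameters are all hypothetical and do not exist in pgloader.

```python
# Hypothetical exit codes mirroring the list above
EXIT_SUCCESS = 0    # everything was loaded
EXIT_FATAL = 1      # fatal error, e.g. could not connect to the database
EXIT_REJECTED = 2   # the load ran, but some records were rejected


def exit_code(fatal_error, rejected_rows):
    """Map the outcome of a load to a documented process exit code."""
    if fatal_error:
        return EXIT_FATAL
    if rejected_rows > 0:
        return EXIT_REJECTED
    return EXIT_SUCCESS
```

A fatal error takes precedence over rejected records in this sketch; that ordering is a design choice made here for illustration, not something the list above specifies.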

Note that I don't really "know" python, please double check these changes... they seem to work for me though.

    diff --git a/pgloader.py b/pgloader.py
    index 494e57d..d9e0f52 100755
    --- a/pgloader.py
    +++ b/pgloader.py
    @@ -752,23 +752,29 @@ def load_data():
         log.info("All threads are started, wait for them to terminate")
         check_events(finished, log, "processing is over")

    +    # check whether any thread failed
    +    for section, loader in threads.iteritems():
    +        if not loader.success:
    +            return 1
    +
         # total duration
         td = time.time() - begin
    -    retcode = 0

         if SUMMARY and not interrupted:
             try:
    -            retcode = print_summary(None, sections, summary, td)
    +            print_summary(None, sections, summary, td)
                 print
             except PGLoader_Error, e:
                 log.error("Can't print summary: %s" % e)
    +            return 1

             except KeyboardInterrupt:
    -            pass
    +            return 1

    -    return retcode
    +    return 0

     if __name__ == "__main__":
    +    ret = 1
         try:
             ret = load_data()
         except Exception, e:
    diff --git a/pgloader/pgloader.py b/pgloader/pgloader.py
    index 5b1becd..e585419 100644
    --- a/pgloader/pgloader.py
    +++ b/pgloader/pgloader.py
    @@ -826,36 +826,42 @@ class PGLoader(threading.Thread):
             self.sem.acquire()
             self.log.debug("%s acquired starting semaphore" % self.logname)

    -        # postinit (will call db.reset() which will get us connected)
    -        self.log.debug("%s postinit" % self.logname)
    -        self._postinit()
    -
             # tell parent thread we are running now
             self.started.set()
             self.init_time = time.time()        

    +        try:
    +            # postinit (will call db.reset() which will get us connected)
    +            self.log.debug("%s postinit" % self.logname)
    +            self._postinit()
    +
    +            # do the actual processing in do_run
    +            self.do_run()
    +            
    +        except Exception, e:
    +            self.log.error(e)
    +            self.terminate(False)
    +            return
    +
    +        self.terminate()
    +        return
    +
    +    def do_run(self):
    +
             # Announce the beginning of the work
             self.log.info("%s processing" % self.logname)

             if self.section_threads == 1:
    -            try:
    -                # when "No space left on device" where logs are sent,
    -                # we want to catch the exception
    -                if 'reader' in self.__dict__ and self.reader.start is not None:
    -                    self.log.debug("Loading from offset %d to %d" \
    -                                   %  (self.reader.start, self.reader.end))
    -
    -                self.prepare_processing()
    -                self.process()
    -                self.finish_processing()
    +            # when "No space left on device" where logs are sent,
    +            # we want to catch the exception
    +            if 'reader' in self.__dict__ and self.reader.start is not None:
    +                self.log.debug("Loading from offset %d to %d" \
    +                               %  (self.reader.start, self.reader.end))

    -            except Exception, e:
    -                # resources get freed in self.terminate()
    -                self.terminate()
    -                self.log.error(e)
    -                raise
    +            self.prepare_processing()
    +            self.process()
    +            self.finish_processing()

    -            self.terminate()
                 return

             # Mutli-Threaded processing of current section
    @@ -873,10 +879,9 @@ class PGLoader(threading.Thread):
                 # here we need a special thread reading the file
                 self.round_robin_read()

    -        self.terminate()
             return

    -    def terminate(self):
    +    def terminate(self, success = True):
             """ Announce it's over and free the concurrency control semaphore """

             # force PostgreSQL connection closing, do not wait for garbage
    @@ -898,6 +903,7 @@ class PGLoader(threading.Thread):
             except IOError, e:
                 pass

    +        self.success = success
             self.finished.set()
             return

@dpanech
Contributor Author

dpanech commented Sep 20, 2010

BTW is there a mailing list? This issue tracker doesn't support attachments. I had to add a tab to every line in the above patch to prevent it from being re-formatted. Am I doing this wrong? I have never used github (or git or python for that matter) before today, sorry.

@dimitri
Owner

dimitri commented Sep 20, 2010

The github way seems to be: fork the project, use git, push your patches on your fork, then create a pull request ticket --- or just a ticket like this, I can see the patches in the Fork Queue tab too.

I'd appreciate it if you could send me "real" patches, either this way or the plain git way (git send-email, then I git am -s).

dimitri added a commit that referenced this issue Oct 21, 2017
It turns out that when using *print-pretty* in CCL we then have CL reader
references in the output, such as in the following example:

  QUERY: comment on table mysql.base64 is $#1=DXIDC_EMLAQ$Test decoding base64 documents$#1#$

Of course that's wrong, so prevent this from happening by
forcing *print-pretty* to nil in a top-level function. We still turn this on
in the monitor thread when printing error messages as those might contain
recursive data structures.
This issue was closed.