Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Download Attempts-Data Returned #57

Closed
wants to merge 4 commits into from
Closed

Download Attempts-Data Returned #57

wants to merge 4 commits into from

Conversation

timronan
Copy link
Contributor

Verifies that bytes are returned in each download attempt. Creates logical frame work to handle different types of download attempts. Changes output messages for each download attempt.

timronan and others added 4 commits November 28, 2018 14:36
Copy link
Contributor

@chad-earthscope chad-earthscope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not believe this code is doing what is expected:

        p = Popen([command], stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
        output, errors = p.communicate()
        # Print statement is used to add bytes to the output pipe. Semi-kluge but works until we find a better solution.
        # Potentially associated with the asynchronous behavior of the workers downloads.
        print("\033[F")
        # Determines if bytes are returned by the subprocess created by workers.execute.
        if len(output) is not 0:
            ErrorStatistics.bytecount += 1 #Increments the byte count in manager.ErrorStatistics
        return p

The code effective says "if the subprocess produces any data on it's stdout stream, increment a number". This is not the same thing as knowing that a download worker actually received data.

Also, I believe this general _popen call is used for all workers (download, ingestion, indexing) and it would be doing the same logic and accumulation for non-download workers.

print("\033[F")
# Determines if bytes are returned by the subprocess created by workers.execute.
if len(output) is not 0:
ErrorStatistics.bytecount += 1 #Increments the byte count in manager.ErrorStatistics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the actual byte count, it's an incremented value, so confusingly named.

output, errors = p.communicate()
# Print statement is used to add bytes to the output pipe. Semi-kluge but works until we find a better solution.
# Potentially associated with the asynchronous behavior of the workers downloads.
print("\033[F")
Copy link
Contributor

@chad-earthscope chad-earthscope Nov 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a console sequence for "Cursor up one line". It should not be needed at all.

# Potentially associated with the asynchronous behavior of the workers downloads.
print("\033[F")
# Determines if bytes are returned by the subprocess created by workers.execute.
if len(output) is not 0:
Copy link
Contributor

@chad-earthscope chad-earthscope Nov 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is testing if the stdout stream of the called process returned anything, combined with that "Cursor up one line", who knows what it's getting. It's very likely unrelated to anything about bytes downloaded. I printed the output, it is some non-sensical repeat of my debugging print statement.


from .manager import ErrorStatistics
self.errors = ErrorStatistics()
p = Popen([command], stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
Copy link
Contributor

@chad-earthscope chad-earthscope Nov 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to be confirmed if we go with this approach, but this may be high-jacking all stdout and stderr output, preventing those from going into the rover log files.

@chad-earthscope
Copy link
Contributor

Even more problematic: the p.communicate() call blocks/waits for the process to terminate. This effectively turns rover into single-process program, with no parallelism.

In the rover model this _popen call just spins off the worker processes and they work in the background. The check for termination is in Workers.check() method, using process.poll() to determine when worker processes have terminated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants