find_probable_cause_of_failure() is bad at fetching logs #2
Comments
Would it be more appropriate to call out to the |
I'm refactoring the S3 and SSH log fetcher functionality to subclass
This will probably also involve breaking a lot of S3-related code out of |
Yup, that sounds good. Another good way to approach this is to start out by building a standalone utility (in And please, use |
Can do. My current strategy is to copy any relevant functions (ls/get from S3, local, and SSH + dependencies) into instance methods and helpers for fetchers so that |
Sounds like a good plan.
New info: logs have slightly different paths on S3 vs local. Here's a quickref I'll put in the comments:
|
…Now this should fix Yelp#2
I believe this can be closed unless it also encompasses a log fetching/parsing refactor.
Yup, thanks!
merge master into Google dataproc
We currently grab EMR logs from S3. This only works for job flows that shut down after running your job. Technically, it's not supposed to work at all: according to http://developer.amazonwebservices.com/connect/entry.jspa?externalID=3938&categoryID=265, logs aren't copied to S3 until they've been untouched for 5 minutes.
Rather than grabbing the logs from S3 directly, we need to download the relevant logs via SSH if the job flow is still running, or from S3 if it's not, and then parse the log files locally.
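
For illustration only, here is a rough sketch of the selection logic described above: use an SSH-backed fetcher while the job flow is running, fall back to S3 otherwise, and hand the downloaded text to a local parser. Every name here (`LogFetcher`, `InMemoryFetcher`, `choose_fetcher`, `fetch_logs`) is hypothetical and not taken from the project's code, and the paths are placeholders rather than real EMR log paths. The in-memory class stands in for real SSH and S3 access so the example runs on its own.

```python
from abc import ABC, abstractmethod


class LogFetcher(ABC):
    """Hypothetical common interface: list log paths and read one log."""

    @abstractmethod
    def ls(self, prefix=''):
        """Return the log paths that start with prefix."""

    @abstractmethod
    def get(self, path):
        """Return the contents of the log at path."""


class InMemoryFetcher(LogFetcher):
    """Stand-in for an SSH- or S3-backed fetcher; holds {path: text}."""

    def __init__(self, files):
        self.files = dict(files)

    def ls(self, prefix=''):
        return sorted(p for p in self.files if p.startswith(prefix))

    def get(self, path):
        return self.files[path]


def choose_fetcher(job_flow_is_running, ssh_fetcher, s3_fetcher):
    """Pick SSH while the job flow is up, S3 once it has shut down."""
    return ssh_fetcher if job_flow_is_running else s3_fetcher


def fetch_logs(fetcher, prefix=''):
    """Download every matching log so it can be parsed locally."""
    return {path: fetcher.get(path) for path in fetcher.ls(prefix)}


if __name__ == '__main__':
    # Placeholder paths and contents, not real EMR log locations.
    ssh = InMemoryFetcher({'logs/a.txt': 'fresh log from the master node'})
    s3 = InMemoryFetcher({'logs/a.txt': 'log copied to S3 after shutdown'})
    fetcher = choose_fetcher(True, ssh, s3)
    for path, text in fetch_logs(fetcher, 'logs/').items():
        print(path, text)
```

Keeping `ls` and `get` behind one interface means the parsing code does not need to know where a log came from; only `choose_fetcher` cares whether the job flow is still running.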