[BEAM-11904] Remove PrettyPrint from job definition #14122
R: @pabloem I don't know whether this would affect anything in the Dataflow backend, but AFAICT it would result in more compact JSON and fewer user jobs over the limit. @Fokko, I don't think this solves your issue. Even after this change, your job graph size is still over the limit (20 MiB, IIUC): https://cloud.google.com/dataflow/docs/guides/common-errors#job-graph-too-large It might be good to look at what is in the serialized JSON payload for your job and check whether there are any multi-MiB blobs of data that you shouldn't be serializing.
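One way to follow the suggestion of looking for multi-MiB blobs is to walk the parsed job JSON and rank string values by size. The sketch below is in Python and uses a small, hypothetical `job` dictionary as a stand-in; the field names (`steps`, `properties`, `serialized_fn`) are assumptions for illustration, not the actual Dataflow job schema.

```python
def largest_strings(node, path="$", found=None):
    """Collect (size_in_bytes, json_path) for every string value in a parsed JSON tree."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for key, value in node.items():
            largest_strings(value, f"{path}.{key}", found)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            largest_strings(value, f"{path}[{index}]", found)
    elif isinstance(node, str):
        found.append((len(node.encode("utf-8")), path))
    return found


# Hypothetical stand-in for a job definition; a real one would be loaded
# from the serialized payload with json.load().
job = {
    "steps": [
        {"name": "ReadPubSub", "properties": {"serialized_fn": "x" * 50}},
        {"name": "WriteBigQuery", "properties": {"serialized_fn": "y" * 5000}},
    ]
}

# Largest values first, so oversized blobs show up at the top.
report = sorted(largest_strings(job), reverse=True)
for size, path in report[:3]:
    print(f"{size:>8} bytes  {path}")
```

Run against a real payload, the top few entries should point at the steps that contribute most to the job graph size.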
right - often closures with larger-than-intended context can explode the job graph size
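The closure effect can be illustrated with a small Python analogy (the pipeline in this PR is Java, so this is not the Beam serialization path, only the same idea). The helper `captured_bytes` and the `config` contents are hypothetical; the sketch compares the pickled size of what two otherwise equivalent closures capture.

```python
import pickle


def captured_bytes(fn):
    """Approximate the serialized size of the values a closure captures."""
    cells = fn.__closure__ or ()
    return len(pickle.dumps([cell.cell_contents for cell in cells]))


def make_parsers():
    # Hypothetical configuration: one small setting plus a large table.
    config = {"field": "id", "lookup_table": list(range(100_000))}
    field = config["field"]

    heavy = lambda record: record[config["field"]]  # captures the whole config
    light = lambda record: record[field]            # captures a single string
    return heavy, light


heavy, light = make_parsers()
print("heavy closure captures", captured_bytes(heavy), "bytes")
print("light closure captures", captured_bytes(light), "bytes")
```

Both functions return the same result for a record, but the first drags the entire lookup table along whenever it is serialized. Copying only the needed value into a local variable before defining the function keeps the captured context small.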
We're writing data from PubSub to BigQuery, and there are Avro messages that we know upfront, and that we decode from PubSub. I don't see anything huge, but the
@dpcollins-google Thanks for the quick reply. I'm iterating towards a <20 MB job, which is indeed the limit. I noticed that the job definition was pretty-printed, so removing that was a quick win of around 5%.
Can you try using
let me know if that works, and if so, I can get it on stackoverflow / documentation for others
Run Java PostCommit
@pabloem Thanks for the pointer. It looks good at first glance. The UI seems to become unresponsive: "Graph monitoring data exceeded maximum allowable size. Some monitoring data including counters may be incomplete." But if it works, that would be okay for now. :)
hm right, because the UI also passes the job graph in an API call...
@pabloem Seems like I'm unable to see the stack trace of the failed test, can you confirm?
that's correct - though the failed test is broken on the main branch: https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/testReport/
Run Java PostCommit
@pabloem Do you think this is something that we can move forward with?
thanks for rebasing! :)
Run Java PostCommit
Thanks @pabloem
I would like to remove the PrettyPrint from the output. This decreases the size of the job definition.
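The size effect of dropping pretty-printing can be sketched with Python's standard `json` module. The `job` structure below is a made-up stand-in, not the real job definition, and the saving it shows depends on nesting depth and value sizes, so it will not match the roughly 5% reported in this thread.

```python
import json

# Hypothetical job definition with a few hundred small, nested steps.
job = {
    "steps": [
        {
            "name": f"step-{i}",
            "kind": "ParallelDo",
            "properties": {
                "user_name": f"Transform{i}",
                "parallel_input": {"step_name": f"step-{i - 1}"},
            },
        }
        for i in range(200)
    ]
}

pretty = json.dumps(job, indent=2)                # indented, one field per line
compact = json.dumps(job, separators=(",", ":"))  # no insignificant whitespace

saving = 1 - len(compact) / len(pretty)
print(f"pretty:  {len(pretty)} bytes")
print(f"compact: {len(compact)} bytes")
print(f"saving:  {saving:.1%}")

# Both forms parse back to the same data; only whitespace differs.
assert json.loads(pretty) == json.loads(compact)
```

Payloads dominated by large opaque strings gain little from this, which is consistent with the modest saving mentioned in the conversation.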