Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(cli): failure if account cache is malformed #10887

Merged
merged 3 commits into from
Oct 16, 2020

Conversation

rix0rrr
Copy link
Contributor

@rix0rrr rix0rrr commented Oct 15, 2020

Because of concurrent running of integration tests, the account
cache (which is supposed to be a JSON file) can be read in a state
where it's empty or incompletely written, which fails the JSON parse.

If that happens, ignore the error and pretend the cache is empty.

Fixes sporadic concurrency issues in the integration tests.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

Because of concurrent running of integration tests, the account
cache (which is supposed to be a JSON file) can be read in a state
where it's empty or incompletely written, which fails the JSON parse.

If that happens, ignore the error and pretend the cache is empty.

Fixes sporadic concurrency issues in the integration tests.
@rix0rrr rix0rrr requested a review from a team October 15, 2020 10:35
@rix0rrr rix0rrr self-assigned this Oct 15, 2020
@gitpod-io
Copy link

gitpod-io bot commented Oct 15, 2020

@mergify mergify bot added the contribution/core This is a PR that came from AWS. label Oct 15, 2020
Comment on lines 232 to 234
console.log(configuration.context);
console.log(configuration.context.all);
console.log(configuration.context.get(cxapi.ENABLE_DIFF_NO_FAIL));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume these were for diagnostic purposes and should be removed.

@@ -229,6 +229,9 @@ async function initCommandLine() {
return cli.list(args.STACKS, { long: args.long });

case 'diff':
console.log(configuration.context);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentional...?

@@ -83,6 +83,9 @@ export class AccountAccessKeyCache {
// File doesn't exist or is not readable. This is a cache,
// pretend we successfully loaded an empty map.
if (e.code === 'ENOENT' || e.code === 'EACCES') { return {}; }
// File is not JSON, could be corrupted because of concurrent writes.
// Again, an empty cache is fine.
if (e instanceof SyntaxError) { return {}; }
Copy link
Contributor

@iliapolo iliapolo Oct 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im actually not sure this is because of the parallelism of the integ tests, i remember @rix0rrr and I encountered this issue a few months ago and ended up in a rabbit hole of sdk code.

What if we set the CDK_HOME env variable to be unique per execution before we resort to this? I'm not strictly opposed to it but i feel like we are doing it for the wrong reasons now and we might be missing something else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This specifically has nothing to do with the SDK. This file is completely under our control, written and read by us only.

I can't imagine it being anything other than the parallelism we introduced recently, given that I haven't seen this error crop up before that.

I'm open to another explanation of the available facts if you have them, and I'll agree it breaks my mental model of how I thought NodeJS worked as well. Even concurrency isn't the ultimate root cause, I'm not concerned about this fix too much, which is why I'm hesitant to do more complicated workarounds.

This is all to say: I say we ship it :).

Copy link
Contributor Author

@rix0rrr rix0rrr Oct 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh umm. It might also be this:

await fs.ensureFile(this.cacheFile);
await fs.writeJson(this.cacheFile, map, { spaces: 2 });

But even given that I don't mind too much to put this workaround in. It will also catch malformed caches for other reasons (outside our control).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship and may the force be with you 👍

@rix0rrr
Copy link
Contributor Author

rix0rrr commented Oct 16, 2020

For future reference, I wrote a script to test assumptions, running writeJson and readJson in a hot loop.

Result:

read ok: 737
bla.json: Unexpected end of JSON input: 263

About 25% of the reads read a broken file.

var fs = require('fs-extra');

const filename = 'bla.json';

function bigObject(n) {
    if (n === 0) { return  {}; }
    return {
        [`a${n}`]: bigObject(n-1),
        [`b${n}`]: bigObject(n-1),
    };
}

let running = true;

new Promise(async (ok) => {
    const bo = bigObject(10);
    while (running) {
        try {
            await fs.writeJson(filename, bo, { spaces: 2 });
        } catch(e) {
            console.log(e.message);
        }
    }
}).then(console.log);

const counters = {};

function countHit(message) {
    if (message in counters)
        counters[message] += 1;
    else
        counters[message] = 1;
}

new Promise(async (ok) => {
    for (let i = 0; i < 1000; i++) {
        try {
            await fs.readJson(filename);
            countHit('read ok');
        } catch(e) {
            countHit(e.message);
        }
    }

    ok();
}).then(() => {
    running = false;

    // Print a summary
    for (const [k, v] of Object.entries(counters)) {
        console.log(`${k}: ${v}`);
    }
});

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: c0bbcad
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mergify
Copy link
Contributor

mergify bot commented Oct 16, 2020

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@mergify mergify bot merged commit 9b2438a into master Oct 16, 2020
@mergify mergify bot deleted the huijbers/survive-malformed-cache branch October 16, 2020 14:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution/core This is a PR that came from AWS.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants