This repository has been archived by the owner on Apr 4, 2019. It is now read-only.

Run grunt tasks in separate processes #148

Merged
merged 5 commits into master from miiila/try-parallel-slug-creation on Feb 22, 2018

Conversation

@miiila (Contributor) commented Jan 10, 2018

After setting up new environments (staging-qa, staging-pre), black-belt was modified to handle deploys to all of them. Unfortunately, it takes some time and is not so efficient.

This PR presents a rough idea of how to deal with it. Since I am not a skilled Python dev, I'd like to hear your feedback.

My pain points:

  • how to approach the tests? (I have no idea)
  • should the whole script fail if any of the grunt commands fails? (currently it doesn't; it tries to deploy everything it can)

Thanks for any help making black-belt usable again.

@Almad (Contributor) previously requested changes on Jan 12, 2018


My approach to testing would be to extract the grunt commands into something that can be replaced with simple subprocess that is either a short sleep or an exception.

I'd suggest slugs being created in parallel, but deploys done serially in order of importance (since they are quick, and I'd fail if they'd fail on staging before going to prod).

I think the command should definitely fail. As in, when running in parallel, finish what you can, but do not continue with the steps that follow: I don't think we should continue with the deploy if any of the slug builds fails.
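
A minimal sketch of that testing approach, assuming the parallel runner is importable as run_grunt_in_parallel from blackbelt.deployment (the import path and exact signature are guesses, not black-belt's actual API): the real grunt invocations are swapped for trivial subprocesses that either sleep briefly or exit non-zero.

import sys

# Hypothetical import path; adjust to wherever the helper actually lives.
from blackbelt.deployment import run_grunt_in_parallel

# Trivial stand-ins for the real grunt commands: one sleeps briefly,
# the other exits non-zero, simulating a failed slug build.
FAKE_OK = [sys.executable, '-c', 'import time; time.sleep(0.1)']
FAKE_FAIL = [sys.executable, '-c', 'import sys; sys.exit(1)']

def test_all_commands_succeed():
    assert run_grunt_in_parallel((FAKE_OK, FAKE_OK)) == 0

def test_failure_is_reported():
    assert run_grunt_in_parallel((FAKE_OK, FAKE_FAIL)) != 0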

print('\n')

while len(commands):
    nextRound = []
Contributor:


Please adhere to PEP 8: next_round

Contributor Author:


Leftover, it is...


commands = []
for command in grunt_commands:
    with tempfile.NamedTemporaryFile(delete=False) as f:
Contributor:


Why this over mkstemp?

Contributor Author:


Because I wasn't aware of it ;-) Since I don't need the delete attribute of NamedTemporaryFile, I can switch.

Contributor Author:


Well... I'll probably stay with NamedTemporaryFile; mkstemp doesn't support the context manager protocol (a.k.a. with ...), therefore - AFAIK - I'd need to take care of closing the file descriptor myself, which would complicate the whole flow.
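
For illustration, the ergonomic difference between the two looks roughly like this:

import os
import tempfile

# mkstemp returns a raw OS-level descriptor that must be closed manually:
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b'log line\n')
finally:
    os.close(fd)

# NamedTemporaryFile wraps the descriptor in a file object that supports
# the context-manager protocol, so `with` handles the cleanup:
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b'log line\n')
    path = f.name  # the file persists afterwards because delete=False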

@miiila (Contributor Author) commented Jan 12, 2018

@Almad Thanks for the ideas! Let's clarify one thing:

> I'd suggest slugs being created in parallel, but deploys done serially in order of importance (since they are quick, and I'd fail if they'd fail on staging before going to prod).
>
> I think the command should definitely fail. As in, when running in parallel, finish what you can, but do not continue with the steps that follow: I don't think we should continue with the deploy if any of the slug builds fails.

This is definitely valid for grunt create-slug && grunt deploy-slug. Unfortunately, we have grunt deploy as well (https://github.com/apiaryio/black-belt/pull/148/files#diff-4f956f65da142221425f202ac651efaaR17).

I'd suggest splitting it into create-slug commands run in parallel and deploys waiting only for creation success (no tests, no confirmation). Do you agree?

@Almad (Contributor) commented Jan 15, 2018

@miiila hmm, why can't deploy work internally as described as well?

@miiila (Contributor Author) commented Jan 15, 2018 via email

@Almad (Contributor) commented Jan 15, 2018

@miiila I still don't get it. Run slug creation in parallel and wait for the tests to run; once all of this has been done successfully, do a serial deploy and fail/stop the deploy if any of those fails.

Anything I am missing on why this can't be done?
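
A rough sketch of the flow being described, reusing this PR's run_grunt_in_parallel (assumed to be in scope) with illustrative command lists rather than the actual black-belt code:

from subprocess import check_call

def build_and_deploy(create_slug_commands, deploy_commands):
    # 1. Build all slugs in parallel; a non-zero result means at least
    #    one build (or its tests) failed, so nothing gets deployed.
    if run_grunt_in_parallel(create_slug_commands) != 0:
        raise RuntimeError('Slug creation failed, aborting deploy.')

    # 2. Deploy serially, in order of importance; check_call raises
    #    CalledProcessError on the first failure, stopping later deploys.
    for command in deploy_commands:
        check_call(command)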

@miiila (Contributor Author) commented Jan 15, 2018 via email

@miiila force-pushed the miiila/try-parallel-slug-creation branch 2 times, most recently from f29aa8d to 6465aa7 on January 20, 2018 12:58
check_call(['grunt', 'deploy'])
check_call(['grunt', 'deploy', '--app=apiary-staging-pre'])
check_call(['grunt', 'deploy', '--app=apiary-staging-qa'])
slug_creaction_rc = run_grunt_in_parallel((['grunt', 'create-slug'], ['grunt', 'create-slug', '--app=apiary-staging-pre'], ['grunt', 'create-slug', '--app=apiary-staging-qa']))
Contributor:


I'd put the tuple into several lines for better readability:

slug_creaction_rc = run_grunt_in_parallel((
    [...],
    [...],
    [...],
))

Contributor:


Also, from reading just this part of the code, I'm not quite sure what slug_creaction_rc means and what its contents are. I can see further below that it's tested for not being zero, so I guess it's a count of something, but the name doesn't help me understand why/what.

Contributor Author:


Changed to a longer but more revealing name, thank you.

check_call(['grunt', 'deploy', '--app=apiary-staging-qa'])
slug_creaction_rc = run_grunt_in_parallel((['grunt', 'create-slug'], ['grunt', 'create-slug', '--app=apiary-staging-pre'], ['grunt', 'create-slug', '--app=apiary-staging-qa']))
if slug_creaction_rc != 0:
    post_message("Slug creation failed, deploy stopped.", "#deploy-queue")
Contributor:


Looks like mixing tabs and spaces in the indentation.

Contributor Author:


Interesting, I haven't made any changes, but it looks better now :-)

commands = []
for command in grunt_commands:
    with tempfile.NamedTemporaryFile(delete=False) as f:
        app = command[2].split('=')[1] if len(command) > 2 else 'production'
Contributor:


Probably good enough for now, but generally this seems fragile to me. Maybe it could look specifically for --app= or -a? Or maybe I'm not getting the line right.
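
A sketch of what looking specifically for --app= could look like; the PR later introduces a get_grunt_application_name helper, and this is only a guess at its shape, not the actual implementation:

import re

def get_grunt_application_name(command_line):
    # Search explicitly for an `--app=<name>` argument instead of relying
    # on its position in the argv list; default to production when absent.
    match = re.search(r'--app=(\S+)', command_line)
    return match.group(1) if match else 'production'

assert get_grunt_application_name('grunt create-slug') == 'production'
assert get_grunt_application_name(
    'grunt create-slug --app=apiary-staging-qa') == 'apiary-staging-qa'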

Contributor:


Also the line doesn't seem to be tested.

@honzajavorek (Contributor) left a comment


I'm sorry for the jump-in comments. I wanted to help review the last commit, but didn't want to interfere with @Almad's overall review. I focused mainly on the Python code itself, not so much on the logic. Cheers.

@miiila (Contributor Author) commented Jan 22, 2018

So I added a test for the parallel tasks and polished other parts. I'd love to test it better than it is now (6465aa7#diff-89f24b3a707f227aaff587cda0f8268aR145), but that's far beyond my MagicMock skills and I'd rather not invest more time (i.e. feel free to jump in ;-)). The current situation is "good enough" for me (especially given the fact that the deploy method is not tested at all).

So I'd like to ask @Almad for another round of review. Thx.

@kylef (Member) left a comment


It's a shame we support Python 2 😉. This would be rather simpler and more elegant to implement using Python 3 asyncio, since subprocesses can be coroutines and can easily be waited on as a group.
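
For comparison, a rough sketch of what that could look like on modern Python 3 with asyncio (illustrative only, since black-belt targets Python 2 here):

import asyncio

async def run(command):
    # Each subprocess is awaited as a coroutine.
    process = await asyncio.create_subprocess_exec(*command)
    return await process.wait()

async def run_all(commands):
    # gather() waits on the whole group and collects all return codes.
    return await asyncio.gather(*(run(c) for c in commands))

return_codes = asyncio.run(run_all([
    ['grunt', 'create-slug'],
    ['grunt', 'create-slug', '--app=apiary-staging-pre'],
    ['grunt', 'create-slug', '--app=apiary-staging-qa'],
]))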

for command in grunt_commands:
    with tempfile.NamedTemporaryFile(delete=False) as f:
        app = get_grunt_application_name(' '.join(command))
        commands.append({'app': app, 'process': Popen(command, stdout=f), 'log': f.name})
Member:


I'm not sure how this works, but since the file object f is opened with a context manager (with ... as f), it should be closed at the end of the with block.

However, the process will keep running and potentially sending data to the file descriptor long after the with block has closed it.

Is there some magic I am missing? Perhaps due to Popen implementation details this is working somehow.

Contributor Author:


I believe this magic could be the answer (https://stackoverflow.com/a/38959232).

Member:


Still, the file descriptor should be closed outside of the with context:

>>> import tempfile
>>> with tempfile.NamedTemporaryFile(delete=False) as fp:
...     pass
... 
>>> fp
<closed file '<fdopen>', mode 'w+b' at 0x1038ae6f0>
>>> fp.write('x')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file

I guess the implementation of Popen is somehow re-opening or opening from the filename afterwards.
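
The likely explanation (stated here as an assumption worth verifying, not a documented guarantee): Popen duplicates f.fileno() into the child's stdout at spawn time, so the child holds its own copy of the descriptor, which survives the parent closing the file object. A small demonstration:

import subprocess
import sys
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as f:
    # The child receives a duplicate of f.fileno() as its stdout, made
    # while the parent's descriptor is still open.
    process = subprocess.Popen(
        [sys.executable, '-c', 'import time; time.sleep(1); print("late")'],
        stdout=f,
    )
    name = f.name

# The parent's file object is now closed, but the child keeps writing
# through its own copy of the descriptor.
process.wait()
print(open(name).read())  # prints "late"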

while len(commands):
    next_round = []
    for command in commands:
        rc = command['process'].poll()
Member:


If I understand correctly, poll is non-blocking, and thus this while loop will be a tight loop running many times while waiting for the processes. Perhaps it would make sense to add a sleep of a second or so to the outer loop to prevent high CPU load for the user.

Contributor Author:


Good idea, I believe 5 seconds will be OK.
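
The loop with the suggested sleep might then look roughly like this (a sketch built around the PR's command dicts, not the exact final code):

import time

def wait_for_commands(commands, poll_interval=5):
    # Poll the running processes until all have finished, sleeping
    # between rounds to avoid a tight CPU-burning loop.
    return_codes = []
    while commands:
        next_round = []
        for command in commands:
            rc = command['process'].poll()  # non-blocking: None while running
            if rc is None:
                next_round.append(command)
            else:
                return_codes.append(rc)
        commands = next_round
        if commands:
            time.sleep(poll_interval)
    return return_codes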

@abtris (Contributor) commented Jan 24, 2018

Needs a rebase with master.

@miiila force-pushed the miiila/try-parallel-slug-creation branch from c333dc3 to b510dd4 on January 24, 2018 13:00
@abtris (Contributor) commented Feb 1, 2018

@kylef @honzajavorek @Almad can you do a final review so we can merge this PR?

@abtris (Contributor) commented Feb 19, 2018

@kylef @honzajavorek @Almad Anything blocking here? We need to finalize this review to merge this PR.

@abtris (Contributor) left a comment


LGTM

@miiila merged commit 6242f55 into master on Feb 22, 2018
@miiila deleted the miiila/try-parallel-slug-creation branch on February 22, 2018 09:55