-
Notifications
You must be signed in to change notification settings - Fork 7
Async jobs add endtime #16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
rohityadavcloud
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some changes request, overall LGTM good work. I've not tested/checked your marvin test cases though.
| @Param(description = " the created date of the job") | ||
| private Date created; | ||
|
|
||
| @SerializedName(ApiConstants.END_TIME) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be finished or completed?
| expiringUnfinishedAsyncJobSearch.done(); | ||
|
|
||
| expiringCompletedAsyncJobSearch = createSearchBuilder(); | ||
| expiringCompletedAsyncJobSearch.and("created", expiringCompletedAsyncJobSearch.entity().getCreated(), SearchCriteria.Op.LTEQ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is okay only if the 'created' it not used in any of the methods to query the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only used by the Asyncjob garbage collector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I verified; it is only used by the async jobs garbage collector.
| SearchCriteria<AsyncJobVO> sc = expiringCompletedAsyncJobSearch.create(); | ||
| sc.setParameters("created", cutTime); | ||
| public List<AsyncJobVO> getExpiredCompletedJobs(final Date cutTime, final int limit) { | ||
| final SearchCriteria<AsyncJobVO> sc = expiringCompletedAsyncJobSearch.create(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll need to discuss this bit over our syncup call when you're back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: during the upgrade 4.11.1->4.12.0 -> set all the removed field of the async_jobs -> NOW()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update added to schema-41110to41200.sql
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update added.
| job.setLastUpdated(DateUtil.currentGMTTime()); | ||
| final Date currentGMTTime = DateUtil.currentGMTTime(); | ||
| job.setLastUpdated(currentGMTTime); | ||
| job.setRemoved(currentGMTTime); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
| public AsyncJob queryJob(long jobId, boolean updatePollTime) { | ||
| AsyncJobVO job = _jobDao.findById(jobId); | ||
| public AsyncJob queryJob(final long jobId, final boolean updatePollTime) { | ||
| final AsyncJobVO job = _jobDao.findByIdIncludingRemoved(jobId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may have some API response changes, wrt who uses this method and if it can cause any regression (as the method will return a removed job).
| minimum_vms_to_pass = 2 | ||
| for vm_ip in vms_ips: | ||
| ssh_command = "ping -c 5 %s" % vm_ip | ||
| ssh_command = "ping -c 10 %s" % vm_ip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ernjvr is this related change? if it's a bugfix can we have it in a separate PR?
borisstoyanov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ernjvr I see lots of other commits in your PR, can you pull and rebase your code against master and resubmit the pull request
01ee941 to
98ddc26
Compare
| private Date created; | ||
|
|
||
| @SerializedName(ApiConstants.COMPLETED) | ||
| @Param(description = " the removed date of the job") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change the description and variable to completed?
| public interface AsyncJobDao extends GenericDao<AsyncJobVO, Long> { | ||
|
|
||
| @Override | ||
| AsyncJobVO findById(Long id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it was referenced by interface and not implementation. But as discussed, I will remove the override and use findByIdIncludingRemoved.
| } | ||
|
|
||
| @Override | ||
| public AsyncJobVO findById(Long id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The findByIdIncluding removed previous usage was okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverted
| cmd.jobid = self.virtual_machine.jobid | ||
| cmd_response = self.apiclient.queryAsyncJobResult(cmd) | ||
| endtime = cmd_response.endtime | ||
| self.assertIsNotNone(endtime, "Expected 'endtime' field of queryAsyncJobResult to have a timestamp of when the job finished.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmd_response -> get the complete time, then use the dbclient -> async_jobs table -> job id and verify that completed matches what you got in the API response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the API, jobstatus -> success/failure -> check that against db as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| "basic", | ||
| "sg"], | ||
| required_hardware="true") | ||
| def test_async_job_table_values(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Squash the above two test cases into one, asset equal for the job id's removed/completed time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly I'm not a fan of combining separate tests into one, this is not a good practice since it's limiting the test granularity. But in this case maybe it does makes sense to do it since the api result is depended on the DB value and could not work without it. In this case it'll save us the time to deploy a VM which is a lot and it does makes sense to combine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| def test_async_job_table_values(self): | ||
| """ | ||
| Test async job db table for valid values | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add another failure test case. Destroy something twice, etc. job status -> failed (int 2?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried this case and I'm not sure it baheves as expected. I've successfully destroyed a VM and tried to do it again and the job result was successful same as the first try. Should this behave like that? I think it should return an error saying the VM is already in Destroyed state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get the same result as you @borisstoyanov
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ernjvr do you think we should refactor the service if it tries to bring a VM into a state which it is, to prompt an error? It kinda makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@borisstoyanov I investigated the code for the scenario of starting a VM. It also returns a success status when I make a request to start a VM that has already been started. This success status is returned by the VM job handler code. The async job status gets updated with whatever status is returned by the VM job handler code. So, I think changing how the VM handler code deals with duplicate commands for the same resource is out of scope for this feature. I think it should be dealt with in a separate bug fix ticket. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sound the right way to do it, thanks for the investigation @ernjvr, let me log a ticket in community. When we have time we could get back to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rohityadavcloud
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see comments. For integration tests, make use of both the API respones and db client/query result to verify both the job status and the removed time.
| except Exception as e: | ||
| self.debug("Warning! Exception in tearDown: %s" % e) | ||
|
|
||
| @attr( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be formatted to a single line.
| "advancedns", | ||
| "basic", | ||
| "sg"], | ||
| required_hardware="true") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be set to required_hardware="false".
| VirtualMachine.RUNNING) | ||
| self.assertEqual(response[0], PASS, response[1]) | ||
|
|
||
| result = self.dbclient.execute("select * from async_job where uuid='%s'" % self.virtual_machine.jobid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this to line 127, closer to the access on result[0][17], probably name this as db_result to make it more obvious.
| removed = timezone('UTC').localize(result[0][17]) | ||
| removed = removed.astimezone(timezone('CET')) | ||
| removed = removed.strftime("%Y-%m-%dT%H:%M:%S%z") | ||
| self.assertEqual(completed, removed, "Expected 'completed' tag value to be equal to 'removed' db column value.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the word usage tag and use timestamp/datetime instead, it's good enough otherwise. Same at line 137.
rohityadavcloud
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left minor nits, otherwise LGTM. I did not test it, which the internal testing would approve or advise bugs.
| jobstatus_db = result[0][8] | ||
| self.assertEqual(jobstatus_api, jobstatus_db, "Expected 'jobstatus' tag value to be equal to 'job_status' db column value.") | ||
|
|
||
| return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't return at the end .
| domainid=self.domain.id | ||
| ) | ||
| self.cleanup = [self.account] | ||
| return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the return.
| from marvin.lib.base import * | ||
| from marvin.lib.common import * | ||
| from pytz import timezone | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Run pep8 or pylint on this file and fix and lint related issues.
Fixes #11 Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Changed lease time to inifinity * message around DHCP lease infinity Co-authored-by: Andrija Panic <45762285+andrijapanicsb@users.noreply.github.com>
Description
Overview:
There is currently no functional mechanism that captures or persists the end time of when an asynchronous job has finished. As a result, users are not able to do any reporting about the duration of various asynchronous jobs in Cloudstack.
The time stamp of when an asynchronous job has finished should be persisted to the 'removed' column of the async_job database table. This is an existing column which is not being used for any other functionality.
The queryAsyncJobResult API should now include an 'end_time' response tag that will display the value of the 'removed' database column.
When a Cloudstack management server starts up, it finds in the database all the asynchronous jobs it is the owner of, whose statuses are in an 'in_progress' state and updates them to a 'failed' status. At the same time, it should now also mark them as 'removed' by updating the 'removed' field with the current timestamp.
Currently, the garbage collector uses the 'created' database column to find in the database completed asynchronous job records that are older than the 'job.expire.minutes' global setting's timestamp and, then, deletes these records. It should no longer use the 'created' column to do this. It should use the 'removed' column instead.
Types of changes
GitHub Issue/PRs
Screenshots (if appropriate):
How Has This Been Tested?
Checklist:
Testing