Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLOUDSTACK-3009: Fixed resource calculation CPU, RAM for accounts. #3012

Merged
merged 5 commits into from Nov 13, 2018

Conversation

@bwsw
Copy link
Contributor

commented Nov 7, 2018

Description

Bug fix for #3009

The view "service_offering_view" doesn't include removed SOs, as a result when SO is removed, the bug happens. The PR introduces a change for resource calculation changing "service_offering_view" to "service_offering" table which has all service offerings.

Must be fixed in:

  • 4.12
  • 4.11

Fixes: #3009

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Screenshots (if appropriate):

How Has This Been Tested?

The code was tested manually by running SQL inside MySQL console.
Marvin tests should be done.

@bwsw bwsw changed the title Fixed resource calculation CPU, RAM for accounts. WIP: Fixed resource calculation CPU, RAM for accounts. Nov 7, 2018

@rafaelweingartner

This comment has been minimized.

Copy link
Member

commented Nov 7, 2018

I think you should change the view to include this even when it is deleted, instead of changing the joining table/view. What do you think?

@bwsw

This comment has been minimized.

Copy link
Contributor Author

commented Nov 7, 2018

@rafaelweingartner potentially it could lead to many bugs. I think the change could some other places to fix. Now it's a small fix, but can transform into a large one.

@rafaelweingartner

This comment has been minimized.

Copy link
Member

commented Nov 7, 2018

Can you explain further why (your concerns)?I mean, if logically the result is the same when finding the physical interface, why would this affect other places?

@bwsw

This comment has been minimized.

Copy link
Contributor Author

commented Nov 7, 2018

@rafaelweingartner it's not thst issue. It's for resource accounting.

@rafaelweingartner

This comment has been minimized.

Copy link
Member

commented Nov 7, 2018

Ops, I was with the other issue in the cache... Sorry ;)
Let me load this one out of the HDD again...

@rafaelweingartner

This comment has been minimized.

Copy link
Member

commented Nov 7, 2018

Ok, I looked in the code, and the table is used in some other places, specially the ServiceOfferingJoinVO. Therefore, the impacts might be big. It makes sense to use the service_offering if it has everything you need in that context.

@bwsw

This comment has been minimized.

Copy link
Contributor Author

commented Nov 7, 2018

@rafaelweingartner may we add to 4.11.3 as well?

@rafaelweingartner

This comment has been minimized.

Copy link
Member

commented Nov 7, 2018

Sure, but then the process is a little bit different. You need to target 4.11 branch. This will require you to cherry pick your current commit into 4.11.

@bwsw

This comment has been minimized.

Copy link
Contributor Author

commented Nov 7, 2018

@rafaelweingartner Ok. then I leave this one too? Or close this one and open for 4.11?

@rafaelweingartner

This comment has been minimized.

Copy link
Member

commented Nov 7, 2018

Actually, you can keep using this one. You only need to change the target branch here, and then update your bwsw:bug/accounting-bug-3009 branch to use 4.11 + the commit introduced here

@bwsw

This comment has been minimized.

Copy link
Contributor Author

commented Nov 7, 2018

@rafaelweingartner Ok. Got it. I'll try. Not a git pro, thought...

@rafaelweingartner

This comment has been minimized.

Copy link
Member

commented Nov 7, 2018

Ping me on Slack if you get lost.

@bwsw bwsw changed the base branch from master to 4.11 Nov 7, 2018

@bwsw bwsw changed the base branch from 4.11 to master Nov 7, 2018

@bwsw bwsw force-pushed the bwsw:bug/accounting-bug-3009 branch from d580738 to 0f43697 Nov 7, 2018

@bwsw bwsw changed the base branch from master to 4.11 Nov 7, 2018

@bwsw

This comment has been minimized.

Copy link
Contributor Author

commented Nov 7, 2018

@rafaelweingartner Please, take a look. Looks like I managed it.

@rafaelweingartner
Copy link
Member

left a comment

That is it

@rafaelweingartner rafaelweingartner modified the milestones: 4.12.0.0, 4.11.3.0 Nov 7, 2018

@bwsw

This comment has been minimized.

Copy link
Contributor Author

commented Nov 7, 2018

Looks like Jenkins is unhappy with moving merge branch from master to 4.11.

@rafaelweingartner

This comment has been minimized.

Copy link
Member

commented Nov 7, 2018

It is a coincidence. It was not a problem because of the changes, but rather a worker that was working as it should.

@rafaelweingartner

This comment has been minimized.

Copy link
Member

commented Nov 9, 2018

@bwsw what happened with this PR? I mean, there are now a lot of docker files, and other things that are not related to the change I reviewed.

@bwsw

This comment has been minimized.

Copy link
Contributor Author

commented Nov 9, 2018

@rafaelweingartner I want to write additional E2E tests for the code. Thus, I started with Marvin and improved the Dockerfile for the simulator to fit Marvin development needs. It's a WIP work. Please, take a look at the email list for details.

@bwsw

This comment has been minimized.

Copy link
Contributor Author

commented Nov 9, 2018

@rafaelweingartner Sorry about that)

@blueorangutan

This comment has been minimized.

Copy link

commented Nov 12, 2018

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@rhtyd rhtyd modified the milestones: 4.11.3.0, 4.11.2.0 Nov 12, 2018

@rhtyd

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

Moved to 4.11.2.0 as it sounds like a critical business issue if resource counts are wrong.

@blueorangutan

This comment has been minimized.

Copy link

commented Nov 12, 2018

Packaging result: ✔centos6 ✔centos7 ✖debian. JID-2425

@rhtyd

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

@blueorangutan

This comment has been minimized.

Copy link

commented Nov 12, 2018

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@rhtyd
rhtyd approved these changes Nov 12, 2018
rhtyd added 2 commits Nov 12, 2018
@rhtyd

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

Nevermind, I needed to run test quickly - have implemented minor fixes.

ram = int(self.get_resource_amount(RAM_RESOURCE_ID))

self.assertEqual(cores, self.services['service_offering_it_1']['cpunumber'] + self.services['service_offering_it_2']['cpunumber'])
self.assertEqual(ram, self.services['service_offering_it_1']['memory'] + self.services['service_offering_it_2']['memory'])

This comment has been minimized.

Copy link
@borisstoyanov

borisstoyanov Nov 12, 2018

Contributor

Test LGMT, let's wait for the results

@blueorangutan

This comment has been minimized.

Copy link

commented Nov 12, 2018

Trillian test result (tid-3164)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 23884 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3012-t3164-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_deploy_virtio_scsi_vm.py
Intermittent failure detected: /marvin/tests/smoke/test_internal_lb.py
Intermittent failure detected: /marvin/tests/smoke/test_resource_accounting.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_vpn.py
Intermittent failure detected: /marvin/tests/smoke/test_host_maintenance.py
Smoke tests completed. 67 look OK, 1 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_01_so_removal_resource_update Error 26.67 test_resource_accounting.py
@borisstoyanov
Copy link
Contributor

left a comment

^^ this looks like a Trillian problem

2018-11-12 10:24:06,987 DEBUG [c.c.c.CapacityManagerImpl] (API-Job-Executor-34:ctx-cff1820f job-1350 ctx-612653d9 FirstFitRoutingAllocator) (logid:936f6a14) Host: 2 doesn't have cpu capability (cpu:3, speed:1599) to support requested CPU: 4 and requested speed: 100
@bwsw Do you mind if we use 3 CPUs in the offering instead of 4?

@bwsw

This comment has been minimized.

Copy link
Contributor Author

commented Nov 12, 2018

2018-11-12 10:24:59,254 INFO  [o.a.c.a.c.a.v.DeployVMCmdByAdmin] (API-Job-Executor-39:ctx-cda190fb job-1365 ctx-88aeda9f) (logid:7faecac3) Unable to create a deployment
 for VM[User|i-127-159-VM]
com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment for VM[User|i-127-159-VM]Scope=interface com.cloud.dc.DataCenter; id=1
        at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.reserveVirtualMachine(VMEntityManagerImpl.java:215)
        at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.reserve(VirtualMachineEntityImpl.java:200)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:4492)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:4057)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:4044)
        at sun.reflect.GeneratedMethodAccessor535.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:338)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:197)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:107)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:174)
        at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:51)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:174)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
        at com.sun.proxy.$Proxy164.startVirtualMachine(Unknown Source)
        at org.apache.cloudstack.api.command.admin.vm.DeployVMCmdByAdmin.execute(DeployVMCmdByAdmin.java:50)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:150)
        at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:581)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:529)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
2018-11-12 10:24:59,255 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-39:ctx-cda190fb job-1365) (logid:7faecac3) Complete async job-1365, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":533,"errortext":"Unable to create a deployment for VM[User|i-127-159-VM]"}

Doesn't looks like connected with the test itself, rather an env problem.

@bwsw

This comment has been minimized.

Copy link
Contributor Author

commented Nov 12, 2018

@borisstoyanov I don't care, just wanted SOs to vary.

@borisstoyanov

This comment has been minimized.

Copy link
Contributor

commented Nov 12, 2018

@bwsw yes, you're right, issue is that blueorangutan's environment has hosts with 3 cores and we're requesting 4. Perhaps you could just change the offering to request 3?

@bwsw

This comment has been minimized.

Copy link
Contributor Author

commented Nov 12, 2018

@rhtyd
rhtyd approved these changes Nov 12, 2018
@rhtyd

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

@blueorangutan package

@blueorangutan

This comment has been minimized.

Copy link

commented Nov 12, 2018

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

This comment has been minimized.

Copy link

commented Nov 12, 2018

Packaging result: ✔centos6 ✔centos7 ✖debian. JID-2431

@rhtyd

This comment has been minimized.

Copy link
Member

commented Nov 12, 2018

@blueorangutan

This comment has been minimized.

Copy link

commented Nov 12, 2018

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

This comment has been minimized.

Copy link

commented Nov 12, 2018

Trillian test result (tid-3169)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 22024 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3012-t3169-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_outofbandmanagement.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_vpn.py
Smoke tests completed. 68 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

@rhtyd rhtyd merged commit f6e600e into apache:4.11 Nov 13, 2018

2 checks passed

Jenkins This pull request looks good
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.