Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

template downloading fails #2820

Closed
izenk opened this issue Aug 23, 2018 · 19 comments
Closed

template downloading fails #2820

izenk opened this issue Aug 23, 2018 · 19 comments

Comments

@izenk
Copy link

izenk commented Aug 23, 2018

ISSUE TYPE
  • Bug Report
COMPONENT NAME
TEMPLATE
CLOUDSTACK VERSION
4.11.0
4.11.1
CONFIGURATION

ADVANCED NETWORKING

OS / ENVIRONMENT

CentOS Linux release 7.4.1708

SUMMARY

When trying to add template from UI, uploading process fails with error: Stream closed

May be important:
template size is 104 GB
template format is RAW

STEPS TO REPRODUCE

UI
Templates -> Add Template -> fill form (downloading from http)

EXPECTED RESULTS
Template is downloaded and registered
ACTUAL RESULTS

Downloading fails with error: Stream closed

2018-08-23 12:14:34,078 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-2:null) (logid:) SeqA 5-577195: Sending Seq 5-577195:  { An
s: , MgmtId: 90520739393856, via: 5, Ver: v1, Flags: 100010, [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
2018-08-23 12:14:35,813 DEBUG [c.c.a.ApiServlet] (qtp745160567-15:ctx-4e0d3cf8) (logid:f2420e15) ===START===  x.x.x.x -- GET  comman
d=registerTemplate&response=json&projectid=62b2e0d2-5f81-4a6d-8d8b-591283f10e4a&name=test-oracledb-tmpl5&displayText=test-oracledb-tmpl5&u
rl=http%3A%2F%2Ftemplates-host%2Fcscoracle-db.raw&zoneids=4a2e5664-edc7-4a0d-ae4a-fd56e1dc4314&format=RAW&isextractable=false&pass
wordEnabled=false&isdynamicallyscalable=false&osTypeId=b70da464-54f4-11e8-8821-525400843140&hypervisor=KVM&ispublic=true&requireshvm=false
&isfeatured=false&isrouting=false&details%5B0%5D.hypervisortoolsversion=xenserver56&directdownload=false&checksum=&_=1535019275786
2018-08-23 12:14:35,819 DEBUG [c.c.a.ApiServer] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) CIDRs from which account 'Acc
t[projectid]' is allowed to perform API calls: 0.0.0.0/0
2018-08-23 12:14:35,838 DEBUG [c.c.r.ResourceLimitManagerImpl] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) Checking if am
ount of resources of Type = 'template' for Project Name = projectname in Domain Id = 5 is exceeded: Account Resource Limit = 20, Current A
ccount Resource Amount = 6, Requested Resource Amount = 1.
2018-08-23 12:14:35,841 DEBUG [c.c.r.ResourceLimitManagerImpl] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) Checking if do
main resource limits of Type 'template' for Domain Id = 5 is exceeded: Domain Resource Limit = -1, Current Domain Resource Amount = 6, Req
uested Resource Amount = 1.
2018-08-23 12:14:35,842 DEBUG [c.c.r.ResourceLimitManagerImpl] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) Checking if domain resource limits of Type 'template' for Domain Id = 2 is exceeded: Domain Resource Limit = -1, Current Domain Resource Amount = 6, Requested Resource Amount = 1.
2018-08-23 12:14:35,886 DEBUG [c.c.r.ResourceLimitManagerImpl] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) Checking if amount of resources of Type = 'secondary_storage' for Project Name = projectname in Domain Id = 5 is exceeded: Account Resource Limit = 429496729600, Current Account Resource Amount = 217944424448, Requested Resource Amount = 108972212224.
2018-08-23 12:14:35,889 DEBUG [c.c.r.ResourceLimitManagerImpl] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) Checking if domain resource limits of Type 'secondary_storage' for Domain Id = 5 is exceeded: Domain Resource Limit = -1, Current Domain Resource Amount = 217944424448, Requested Resource Amount = 108972212224.
2018-08-23 12:14:35,890 DEBUG [c.c.r.ResourceLimitManagerImpl] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) Checking if domain resource limits of Type 'secondary_storage' for Domain Id = 2 is exceeded: Domain Resource Limit = -1, Current Domain Resource Amount = 217944424448, Requested Resource Amount = 108972212224.
2018-08-23 12:14:35,919 DEBUG [o.a.c.s.i.TemplateDataFactoryImpl] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) template 247 is not in store:1, type:Image
2018-08-23 12:14:35,942 DEBUG [o.a.c.s.i.TemplateDataFactoryImpl] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) template 247 is already in store:1, type:Image
2018-08-23 12:14:35,959 DEBUG [o.a.c.s.i.BaseImageStoreDriverImpl] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) Downloading template to data store 1
2018-08-23 12:14:35,970 DEBUG [o.a.c.s.i.BaseImageStoreDriverImpl] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) Performing image store createTemplate async callback
2018-08-23 12:14:36,003 DEBUG [c.c.h.o.r.Ovm3HypervisorGuru] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) getCommandHostDelegation: class org.apache.cloudstack.storage.command.DownloadCommand
2018-08-23 12:14:36,003 DEBUG [c.c.h.XenServerGuru] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) We are returning the default host to execute commands because the command is not of Copy type.
2018-08-23 12:14:36,004 DEBUG [o.a.c.s.RemoteHostEndPoint] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) Sending command org.apache.cloudstack.storage.command.DownloadCommand to host: 6
2018-08-23 12:14:36,006 DEBUG [c.c.a.t.Request] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) Seq 6-1837468647967162405: Sending  { Cmd , MgmtId: 90520739393856, via: 6(s-2-VM), Ver: v1, Flags: 100011, [{"org.apache.cloudstack.storage.command.DownloadCommand":{"hvm":false,"description":"test-oracledb-tmpl5","checksum":"","maxDownloadSizeInBytes":137438953472,"id":247,"resourceType":"TEMPLATE","installPath":"template/tmpl/47/247","_store":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://secondary-storage","_role":"Image"}},"url":"http://link to template","format":"RAW","accountId":47,"name":"247-47-8efe5f29-c3ee-3016-acd5-e4a4fb76bcfe","secUrl":"nfs://secondary-storage","wait":0}}] }
2018-08-23 12:14:36,006 DEBUG [c.c.r.ResourceLimitManagerImpl] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) Updating resource Type = template count for Account = 47 Operation = increasing Amount = 1
2018-08-23 12:14:36,057 DEBUG [c.c.a.ApiServlet] (qtp745160567-15:ctx-4e0d3cf8 ctx-194618fb) (logid:f2420e15) ===END===  x.x.x.x -- GET  command=registerTemplate&response=json&projectid=62b2e0d2-5f81-4a6d-8d8b-591283f10e4a&name=test-oracledb-tmpl5&displayText=test-oracledb-tmpl5&url=http%3A%2F%2Ftemplate-host%2Fcscoracle-db.raw&zoneids=4a2e5664-edc7-4a0d-ae4a-fd56e1dc4314&format=RAW&isextractable=false&passwordEnabled=false&isdynamicallyscalable=false&osTypeId=b70da464-54f4-11e8-8821-525400843140&hypervisor=KVM&ispublic=true&requireshvm=false&isfeatured=false&isrouting=false&details%5B0%5D.hypervisortoolsversion=xenserver56&directdownload=false&checksum=&_=1535019275786
2018-08-23 12:14:39,155 DEBUG [c.c.a.t.Request] (AgentManager-Handler-13:null) (logid:) Seq 6-1837468647967162405: Processing:  { Ans: , MgmtId: 90520739393856, via: 6, Ver: v1, Flags: 10, [{"com.cloud.agent.api.storage.DownloadAnswer":{"jobId":"ba4f5e6d-3466-45d7-8eb4-381d128422d7","downloadPct":0,"errorString":"Stream closed","downloadStatus":"DOWNLOAD_ERROR","downloadPath":"/mnt/SecStorage/447bd664-f78c-31d8-86c6-0bba5bfa390e/template/tmpl/47/247/dnld5981961215788364666tmp_","installPath":"template/tmpl/47/247","templateSize":0,"templatePhySicalSize":0,"checkSum":"","result":true,"details":"Stream closed","wait":0}}] }
2018-08-23 12:14:39,156 DEBUG [o.a.c.s.i.BaseImageStoreDriverImpl] (RemoteHostEndPoint-4:ctx-58623d61) (logid:6a92f38a) Performing image store createTemplate async callback
2018-08-23 12:14:39,213 WARN  [o.a.c.alerts] (RemoteHostEndPoint-4:ctx-58623d61) (logid:6a92f38a) AlertType:: 28 | dataCenterId:: 1 | podId:: null | clusterId:: null | message:: Failed to register template: d039d030-322c-49b3-8fa6-871edee2fb4b with error: Stream closed
2018-08-23 12:14:39,235 ERROR [o.a.c.s.i.BaseImageStoreDriverImpl] (RemoteHostEndPoint-4:ctx-58623d61) (logid:6a92f38a) Failed to register template: d039d030-322c-49b3-8fa6-871edee2fb4b with error: Stream closed
@izenk
Copy link
Author

izenk commented Aug 23, 2018

Web server from template download is going has no errors. Downloading just stops.

@izenk
Copy link
Author

izenk commented Sep 12, 2018

After upgrading to 4.11.1 problem still exists
log level to DEBUG didn't add any new info - just 'Stream closed' - this is some kind of general message from java library during http connection.
Problems exists only for uploading RAW templates. QCOW2 are uploaded normally.

@andrijapanic-dont-use-this-one

Issue still present in latest 4.11.2 RC3 for RAW - stream closed message.

@DennisKonrad
Copy link
Contributor

DennisKonrad commented Nov 5, 2018

Hi,

we also obseve the same problem.
It happens for us (for example) when uploading a 20GB qcow image.

We tried to upload from http via:
Templates
Storage

both did not work and failed with "Stream closed". There's nothing really interesting in the logs beside that.

image

@DennisKonrad
Copy link
Contributor

@rhtyd Is there a workaround for this?

@DennisKonrad
Copy link
Contributor

DennisKonrad commented Nov 5, 2018

Could this have to do with this hardcored max size defined here:

DownloadManagerImpl.java
// TO DO - Define Volume max size as well long maxDownloadSizeInBytes = (cmd.getMaxDownloadSizeInBytes() == null) ? TemplateDownloader.DEFAULT_MAX_TEMPLATE_SIZE_IN_BYTES : (cmd.getMaxDownloadSizeInBytes());

It is set to:

TemplateDownloader.java
long DEFAULT_MAX_TEMPLATE_SIZE_IN_BYTES = 50L * 1024L * 1024L * 1024L;

This equates to 50MB 50GB if i am not mistaken. @izenk @andrijapanic

Hmm, ok seems like this should work at least for us... But it doesnt. Any ideas where to search?

@DennisKonrad
Copy link
Contributor

DennisKonrad commented Nov 5, 2018

Can someone confirm that HttpTemplateDownloader is used for this?

@DennisKonrad
Copy link
Contributor

Ok, the logs we get in the SSVM look like this when trying to upload a template:

image

Now I'm sure HttpTemplateDownloader is at least involved here.

@DennisKonrad
Copy link
Contributor

@DaanHoogland I saw you did some work on this. Do you have an idea what we can do about this?

@DaanHoogland
Copy link
Contributor

@DennisKonrad It does not ring a bell in terms of code. It does sound like a sequence of characters that is interpreted somewhere in the stack as end of stream. Does the result match the start of the original file or is it corrupted?
And what is the fragment directly after it stops?

HttpTemplateDownloader does the actual downloading on secondary storage. You can add debug statements to it to get more information but keep in mind those statements end up on the ssvm, not on the management server.

@DennisKonrad
Copy link
Contributor

@DaanHoogland For small files it is working. But for files with multiple gb it seems to enter the „copyBytes“ only once. Theres
An empty file created and thats all.

Therefor I suspect the large filesize is the Problem here. Unfortunately it‘s very work intensive for me to Test changes to this piece of code so Im trying to get a strong indication of whats wrong First.

Also my java Programming Skills are more like basic Knowledge :/

@izenk
Copy link
Author

izenk commented Nov 20, 2018

Any news on this?

@DennisKonrad
Copy link
Contributor

@izenk Kinda,
we found out the copyBytes function itself (standalone java code) is working not just for small but also for huge files (>10GB). but we get the "Stream closed" message when using the upload functionality within cloudstack. I suspect some of the surroundings is causing this but thats just guessing.

Because the code is running within the storage vm it's hard for us to test it. We are currently in the process of setting up a testenvironment to narrow down the cause of this problem.

Any help on how to speed this up is much appreciated.
Best regards

@DaanHoogland
Copy link
Contributor

top of head guess: file server, network and vm are the suspects, @DennisKonrad. as you hav already pproven the isolated code of course. So my first step would be to prove it against the fileserver (testing the network is unfortunately implied on that path). If it fails go for another (type of) fileserver over the same networkpath. As a side track, invewtigating the iptables and logging on the SSVM might shed some light as well.

@andrijapanic-dont-use-this-one
Copy link

I seems to NOT have issues with qcow2 template (~14GB realistic size (du -hs), though image is 100GB).

@DennisKonrad
Copy link
Contributor

DennisKonrad commented Jan 15, 2019

This issue solved itself for us without us doing something really.

We updated maybe or something and after that it just worked.

@andrijapanicsb
Copy link
Contributor

still present in master - just tested...only RAW templates seems affected. (i.e. could download qcow2 template 50GB (du -hs shows 49GB), while after converted same template to RAW format, I again get stream closed error message...

@rohityadavcloud rohityadavcloud added this to the 4.13.0.0 milestone May 27, 2019
@rohityadavcloud rohityadavcloud modified the milestones: 4.13.0.0, 4.13.1.0 Jul 4, 2019
@andrijapanicsb
Copy link
Contributor

andrijapanicsb commented Jan 10, 2020

Still present in master - RAW templates don't work - can tested by downloading the raw image from here: http://dl.openvm.eu/cloudstack/macchinina/x86_64/macchinina.img

@DaanHoogland
Copy link
Contributor

tested that image in #3993. I'm not sure if that is enough testing though, @izenk @DennisKonrad @andrijapanicsb have an opinion?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants