Fix the estimated size by adding the size of RPMs (yum cache) #874

Closed
wants to merge 2 commits into from

@bessonc bessonc commented Oct 17, 2019

Related: rhbz#1761337

@coveralls

coveralls commented Oct 17, 2019

Pull Request Test Coverage Report for Build 2225

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at ?%

Totals Coverage Status
Change from base Build 2193: 0.0%
Covered Lines:
Relevant Lines: 0

💛 - Coveralls

anaconda moves the yum cache to the disk once it is partitioned, so the
downloaded rpms also consume space until the installation is finished.
Take this into account when estimating the size.

Related: rhbz#1761337
@bessonc bessonc force-pushed the rhel7-extras branch 5 times, most recently from df3de2e to 40e529d Compare October 18, 2019 15:41
src/pylorax/api/compose.py (outdated, resolved)
@bcl
Contributor

bcl commented Oct 18, 2019

(continuing the bz conversation here). You said using estimate_size() resulted in doubling your rootfs? Here's what I get when I run it:
DEBUG lorax-composer: installed_size = 482524878, template_size=1678969342, metadata_size=415125504

It adds about 400M to the size.
The yumlock.yb.conf.installroot referenced in #875 is where lorax-composer puts the metadata, it is not where packages are installed or cached so it should be a fairly accurate number for just the metadata.
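
For reference, the "about 400M" figure follows directly from the logged metadata_size, and a per-directory total of that kind can be obtained by walking the tree. The sketch below is not lorax-composer's estimate_size(); `tree_size` and `to_mib` are hypothetical helper names used only for illustration.

```python
import os


def tree_size(root):
    """Return the total size in bytes of the files under root (hypothetical helper)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so that a link's target is not counted
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def to_mib(num_bytes):
    """Convert a byte count to MiB."""
    return num_bytes / (1024 * 1024)


# metadata_size value taken from the DEBUG line above
metadata_size = 415125504
print("metadata_size is about %.0f MiB" % to_mib(metadata_size))
```

Calling `tree_size()` on the metadata directory would give a number comparable to metadata_size, assuming the estimate is a plain sum of file sizes.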

@bessonc
Author

bessonc commented Oct 21, 2019

> (continuing the bz conversation here). You said using estimate_size() resulted in doubling your rootfs? Here's what I get when I run it:
> DEBUG lorax-composer: installed_size = 482524878, template_size=1678969342, metadata_size=415125504
>
> It adds about 400M to the size.
> The yumlock.yb.conf.installroot referenced in #875 is where lorax-composer puts the metadata, it is not where packages are installed or cached so it should be a fairly accurate number for just the metadata.

Again, I agree. But on both of my test VMs, I can confirm that metadata_size is about 2 GB. I'm not sure that I didn't break something, so I will try again. Currently, I have:

2.0G	/var/tmp/composer/yum/root/var/tmp/composer/cache/os
1.7G	/var/tmp/composer/yum/root/var/tmp/composer/cache/os/gen
338M	/var/tmp/composer/yum/root/var/tmp/composer/cache/os/gen/primary_db.sqlite
254M	/var/tmp/composer/yum/root/var/tmp/composer/cache/os/gen/filelists_db.sqlite
1.1G	/var/tmp/composer/yum/root/var/tmp/composer/cache/os/gen/other_db.sqlite

But only 400M in the corresponding yum cache:

405M	/var/cache/yum/x86_64/7Server/os
338M	/var/cache/yum/x86_64/7Server/os/gen
338M	/var/cache/yum/x86_64/7Server/os/gen/primary_db.sqlite
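
One way to see which files account for the difference between the two listings is to compare the trees file by file. This is a minimal sketch, assuming both caches use the same relative layout (for example `gen/primary_db.sqlite`); `file_sizes` and `only_in_first` are hypothetical helper names, not part of lorax or yum.

```python
import os


def file_sizes(root):
    """Map each file's path relative to root to its size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            sizes[os.path.relpath(path, root)] = os.path.getsize(path)
    return sizes


def only_in_first(first_root, second_root):
    """Return {relative path: size} for files under first_root that are missing under second_root."""
    first = file_sizes(first_root)
    second = file_sizes(second_root)
    return {rel: size for rel, size in first.items() if rel not in second}


if __name__ == "__main__":
    # Paths taken from the listings above; adjust them for your system.
    composer_cache = "/var/tmp/composer/yum/root/var/tmp/composer/cache/os"
    host_cache = "/var/cache/yum/x86_64/7Server/os"
    extra = only_in_first(composer_cache, host_cache)
    # Print the largest composer-only files first
    for rel, size in sorted(extra.items(), key=lambda item: item[1], reverse=True):
        print("%8.0f MiB  %s" % (size / (1024 * 1024), rel))
```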

EDIT:
1/ Installed a RHEL 7.7 from scratch.
2/ Set up my local repositories for increased bandwidth, and double-checked that there are no duplicates.
3/ yum --enablerepo=extras install lorax lorax-composer composer-cli ; yum update ; reboot
4/ Enabled and started lorax-composer. Once the service is started, /usr/libexec/urlgrabber-ext-down is forked in the background and it downloads metadata files into /var/tmp/composer/yum/root/var/tmp/composer/cache/. After a few time, the disk usage of this directory is 2 GB.

Maybe using cdn.redhat.com (which is slow for me) does not lead to this problem. In the end, my test case doesn't matter much, since it will work with your patch anyway.

EDIT2:
And here is the size of the metadata for my customer (he uses a Satellite server):

3.5G	/var/tmp/composer/yum/root/var/tmp/composer/cache/rhel-7-server-rpms
3.2G	/var/tmp/composer/yum/root/var/tmp/composer/cache/rhel-7-server-rpms/gen
282M	/var/tmp/composer/yum/root/var/tmp/composer/cache/rhel-7-server-rpms/gen/primary.xml
337M	/var/tmp/composer/yum/root/var/tmp/composer/cache/rhel-7-server-rpms/gen/primary.xml.sqlite
497M	/var/tmp/composer/yum/root/var/tmp/composer/cache/rhel-7-server-rpms/gen/filelists.xml
253M	/var/tmp/composer/yum/root/var/tmp/composer/cache/rhel-7-server-rpms/gen/filelists.xml.sqlite
923M	/var/tmp/composer/yum/root/var/tmp/composer/cache/rhel-7-server-rpms/gen/other.xml
944M	/var/tmp/composer/yum/root/var/tmp/composer/cache/rhel-7-server-rpms/gen/other.xml.sqlite

As I said previously, I don't understand why the metadata downloaded by lorax is so large; maybe there is another issue here. But I'm going to go back to your first recommendation, that is to say, including your commit from #875.

Anyway, thanks for your time on this.

anaconda also writes the repo metadata to the disk, so take that into
account when estimating the required size. We do this by using the size
of lorax-composer's copy of the metadata which was used to depsolve the
blueprint.

Resolves: rhbz#1761337
@bcl
Contributor

bcl commented Oct 21, 2019

Thanks for taking another look at this, we have no control over how much metadata is downloaded and depending on the repository being used it may be different than what I am using. When you say 'After a few time, the disk usage of this directory is 2 GB.' do you mean it is smaller the first time?
eg. start from a clean setup (you can stop the service and rm -rf /var/tmp/composer and then restart it).

  • What is the usage under /var/tmp/composer/yum/root/var/tmp/composer/ after a restart?
  • What about after doing a composer-cli blueprint depsolve example-http-server?
  • And after doing composer-cli compose start example-http-server ami?

With the repos I have setup here I am not seeing a significant increase after starting the service.

@bessonc
Author

bessonc commented Oct 23, 2019

> Thanks for taking another look at this, we have no control over how much metadata is downloaded and depending on the repository being used it may be different than what I am using. When you say 'After a few time, the disk usage of this directory is 2 GB.' do you mean it is smaller the first time?

No, I was just talking about the time needed to download 2GB.

> eg. start from a clean setup (you can stop the service and rm -rf /var/tmp/composer and then restart it).

Yep, I have already done this several times.

> * What is the usage under `/var/tmp/composer/yum/root/var/tmp/composer/` after a restart?

2 GB. I finally tested with cdn.redhat.com (with only the base repo) and I got the same size.

> * What about after doing a `composer-cli blueprint depsolve example-http-server`?
> * And after doing `composer-cli compose start example-http-server ami`?

Done, no problem; that doesn't change the size of the cache.

> With the repos I have setup here I am not seeing a significant increase after starting the service.

I don't understand how you can get such a small cache on your system. In fact, a 400M cache is what I have in the standard /var/cache/yum, and that's why I don't understand why the cache in /var/tmp/composer is so large.

@bcl
Contributor

bcl commented Oct 23, 2019

> No, I was just talking about the time needed to download 2GB.

Ok, good, I would have been more puzzled :)

> I don't understand how you can get such a small cache on your system. In fact, a 400M cache is what I have in the standard /var/cache/yum, and that's why I don't understand why the cache in /var/tmp/composer is so large.

I don't use the cdn or a subscription, I am using a local mirror of internal releases.

So the new question is, does the installation image really need that 2G of extra space? If you feel like experimenting, you could calculate the extra metadata size and still display it, but add a constant instead as an experiment, say 500M, to see if that's enough. If it still works then I'm not sure this is a valid way to calculate needed extra space.
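
The experiment suggested here could look roughly like the sketch below: the measured metadata size is still logged, but a constant is added to the estimate instead. This is only an illustration, not the code in src/pylorax/api/compose.py; the function name, the logger name, and the reading of "500M" as 500 MiB are all assumptions.

```python
import logging

log = logging.getLogger("size-experiment")

# Experimental padding: "500M" is read here as 500 MiB (an assumption)
FIXED_METADATA_PADDING = 500 * 1024 * 1024


def estimate_with_fixed_padding(base_estimate, metadata_size, padding=FIXED_METADATA_PADDING):
    """Log the measured metadata size, but pad the estimate with a constant instead."""
    log.debug("metadata_size = %d (measured, not used), padding = %d", metadata_size, padding)
    return base_estimate + padding
```

If images built with the constant padding still install successfully while the measured metadata is much larger, that would suggest the full metadata size is not what the installation actually needs.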

@bessonc
Author

bessonc commented Oct 30, 2019

> If you feel like experimenting, you could calculate the extra metadata size and still display it, but add a constant instead as an experiment, say 500M, to see if that's enough. If it still works then I'm not sure this is a valid way to calculate needed extra space.

I can confirm this is probably not the right way. In my first attempt, when I only added the size of all RPM packages, only a few extra MB were required (since the calculation was over-estimated by 40% for that too). That's why in my 2nd attempt, I only added the size of the metadata (4 files per repo, which is what I see when I refresh my yum repos), and it was enough.

For my test case, I have:

  • initially about 2150 MB (whereas 2850 are needed)
  • after the patch taking into account the size of RPMs, 2840 MB
  • after the patch which includes 4 files corresponding to the metadata usually downloaded by yum, I got ~3200 MB for the rootfs partition. I used the attribute "size", but there is also an "opensize" attribute. I think it could be a good idea to consider this extra size too, as it corresponds to the extracted space needed during the install process.
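
To illustrate the difference between the two attributes, here is a minimal sketch that sums both values for a chosen set of metadata files. It assumes that "size" and "opensize" correspond to the `<size>` and `<open-size>` elements of a repository's repomd.xml; `metadata_sizes` is a hypothetical helper, the sample XML is made up, and which metadata types to include is left as a parameter because the four files are not named above.

```python
import xml.etree.ElementTree as ET

# XML namespace used by repomd.xml
REPO_NS = "{http://linux.duke.edu/metadata/repo}"


def metadata_sizes(repomd_xml, wanted_types):
    """Return (download_bytes, extracted_bytes) summed over the wanted <data> types."""
    root = ET.fromstring(repomd_xml)
    download = extracted = 0
    for data in root.findall(REPO_NS + "data"):
        if data.get("type") not in wanted_types:
            continue
        size = data.findtext(REPO_NS + "size")
        open_size = data.findtext(REPO_NS + "open-size")
        if size is not None:
            download += int(size)
        # Fall back to the compressed size when <open-size> is absent
        if open_size is not None:
            extracted += int(open_size)
        elif size is not None:
            extracted += int(size)
    return download, extracted


# Made-up sample data for illustration only, not taken from a real repository
SAMPLE = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary_db"><size>100</size><open-size>400</open-size></data>
  <data type="filelists_db"><size>200</size><open-size>900</open-size></data>
  <data type="other_db"><size>300</size><open-size>1200</open-size></data>
</repomd>
"""

download, extracted = metadata_sizes(SAMPLE, {"primary_db"})
print(download, extracted)
```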

I don't know why some files are downloaded in the installroot (/var/tmp/composer/yum/root), but not in the "standard" yum cache (in /var/cache/yum), especially:
254M /var/tmp/composer/yum/root/var/tmp/composer/cache/os/gen/filelists_db.sqlite
1.1G /var/tmp/composer/yum/root/var/tmp/composer/cache/os/gen/other_db.sqlite

So I think we don't need to count them.

@bcl
Contributor

bcl commented Nov 21, 2019

See PR #875
The reason for the large difference between host/composer metadata size and what Anaconda actually downloads is the filelists and 'other' metadata. composer actually needs the filelists in order to more accurately calculate the usage.

@bcl bcl closed this Nov 21, 2019