brg #13

Open
paradigm opened this Issue May 22, 2014 · 6 comments

@paradigm
Member

There should be a tool to automate acquiring clients. Tools like debootstrap and pacman can make a client, but they largely depend on a similar client already existing due to dependencies. This tool should be very portable to avoid that limitation. The tentative name for the tool is "brg".

@paradigm paradigm added this to the Hawkey milestone May 22, 2014
@paradigm
Member

Tentative work on brg is being done here. Note the script itself and support scripts.

@meinwald meinwald was assigned by paradigm May 22, 2014
@meinwald

My current work is in https://github.com/meinwald/bedrocklinux-userland/tree/brg

So far I have been mostly working on a CentOS support script, but fixed some small issues in brg and the Debian support script.

Current status of the CentOS support script is that it will install CentOS 6 reliably (at least in my own testing) but I'm working on a strategy to install any version of CentOS. There are two main barriers to this:

  1. In order to accurately resolve dependencies, I need to bootstrap sqlite. The dependencies for that vary between releases.
  2. The sqlite included in CentOS 5 and lower is older and less featureful; specifically, the recursive triggers that simplify the queries are not available.

Current plan to combat this is to use CentOS 6 sqlite regardless of release selected. This does address (1) and (2) but has the disadvantage of requiring that the mirror provided includes CentOS 6.
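
As a rough illustration of the kind of dependency resolution being discussed, here is a minimal Python sketch that walks requires/provides relationships in a sqlite database. It is not the brg support script. The table and column names (`packages`, `provides`, `requires`, `pkgKey`) are a simplified assumption loosely modeled on yum's sqlite metadata, and the rows are made up. The loop runs outside SQL, so it does not depend on recursive triggers.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database using a simplified, assumed schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE packages (pkgKey INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE provides (name TEXT, pkgKey INTEGER);
        CREATE TABLE requires (name TEXT, pkgKey INTEGER);
    """)
    # Hypothetical rows, for illustration only.
    db.executemany("INSERT INTO packages VALUES (?, ?)", [
        (1, "sqlite"), (2, "glibc"), (3, "readline"), (4, "ncurses-libs"),
    ])
    db.executemany("INSERT INTO provides VALUES (?, ?)", [
        ("libc.so.6", 2), ("libreadline.so.6", 3), ("libtinfo.so.5", 4),
    ])
    db.executemany("INSERT INTO requires VALUES (?, ?)", [
        ("libc.so.6", 1), ("libreadline.so.6", 1),
        ("libc.so.6", 3), ("libtinfo.so.5", 3),
        ("libc.so.6", 4),
    ])
    return db


def dependency_closure(db, root):
    """Return the set of package names needed to install `root`."""
    needed, queue = set(), [root]
    while queue:
        name = queue.pop()
        if name in needed:
            continue
        needed.add(name)
        # Find every package that provides something `name` requires.
        rows = db.execute(
            """SELECT DISTINCT prov_pkg.name
               FROM packages AS pkg
               JOIN requires AS req ON req.pkgKey = pkg.pkgKey
               JOIN provides AS prov ON prov.name = req.name
               JOIN packages AS prov_pkg ON prov_pkg.pkgKey = prov.pkgKey
               WHERE pkg.name = ?""",
            (name,),
        )
        queue.extend(row[0] for row in rows)
    return needed


if __name__ == "__main__":
    print(sorted(dependency_closure(build_toy_db(), "sqlite")))
```

The sketch ignores version constraints and cases where several packages provide the same capability, both of which a real resolver would have to handle.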

My goal on this issue is to make all existing support scripts feature complete (support all applicable options in brg) and add a -g option for group/task installs to complement the existing -p for packages. At that point, I'll work with paradigm to determine who will handle future development.

@paradigm
Member

Is the CentOS 6 sqlite requirement just a temporary measure, or planned for the long term? If it is just a work-around for the limitations of CentOS 5's sqlite, with the expectation of using the target release's native sqlite for future releases, that should be fine, I think. However, if the plan is to use it going forward:

  • While I know CentOS 6 is going to be with us for a long time, I'd rather see if we can have something native to the release going forward.
  • While we could use CentOS 6's sqlite for non-CentOS RPM distros, I'd like to see if we can stick with using everything native to the given distro and its repositories. Thus we'll need something else for, say, Fedora.

I expect those requirements may be a bit naive. I could give up on them later if we can't pull it off, but I'd like to try.

Is the concern that you are using sqlite to resolve dependencies, and thus have a catch-22 about what to use to resolve sqlite's dependencies? If so, I may try to find some alternative solution (likely using the XML document instead of sqlite for dependency information).

I could:

  • Try to make my own XML parser in shell/awk. It wouldn't have to be fully-featured; just enough to resolve dependency information. I know it is supposed to be easy to underestimate the difficulty of writing an XML parser, but I'm tempted to give it a shot anyways. Parse it character-by-character (no AWK record/field stuff), use a stack for the current state. The trickiest part would have been dealing with escaped characters, but XML uses things such as &quot; which make it much easier to parse. After those, the hardest things will be comments and CDATA, which don't seem too bad either.
  • Find common packages provided by distros such as CentOS with reliably constant dependencies to help parse the XML. Maybe perl?
  • Use a third party project that is not provided in the Bedrock Linux core or the target distro such as this: http://www.xmlsh.org/HomePage.
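
To make the first option more concrete, here is a minimal sketch of a scanning parser that keeps a stack of open elements and handles entities, comments and CDATA. It is written in Python for readability rather than in shell/awk, so it only illustrates the approach. The sample document and its element names are hypothetical, and the sketch does not validate input or handle DOCTYPE declarations.

```python
ENTITIES = {"&lt;": "<", "&gt;": ">", "&quot;": '"', "&apos;": "'", "&amp;": "&"}


def unescape(text):
    """Replace the five predefined XML entities; &amp; is handled last."""
    for entity in ("&lt;", "&gt;", "&quot;", "&apos;", "&amp;"):
        text = text.replace(entity, ENTITIES[entity])
    return text


def read_tag(text, i):
    """Scan an opening tag at text[i] == '<', honoring quoted '>' characters.

    Returns (name, attrs, self_closing, index_after_tag).
    """
    j, quote = i + 1, None
    while text[j] != ">" or quote:
        if quote:
            if text[j] == quote:
                quote = None
        elif text[j] in "\"'":
            quote = text[j]
        j += 1
    body = text[i + 1:j]
    self_closing = body.endswith("/")
    if self_closing:
        body = body[:-1]
    pieces = body.split(None, 1)
    name, rest, attrs = pieces[0], (pieces[1] if len(pieces) > 1 else ""), {}
    while rest.strip():
        key, _, rest = rest.partition("=")
        rest = rest.lstrip()
        value, _, rest = rest[1:].partition(rest[0])  # rest[0] is the quote
        attrs[key.strip()] = unescape(value)
    return name, attrs, self_closing, j + 1


def parse(text):
    """Return (path, attrs, text) for each element, in the order they close."""
    stack, closed, i = [], [], 0  # stack entries: [name, attrs, text_parts]
    while i < len(text):
        if text.startswith("<!--", i):              # comment: skip it
            i = text.index("-->", i) + 3
        elif text.startswith("<![CDATA[", i):       # CDATA: literal text
            end = text.index("]]>", i)
            if stack:
                stack[-1][2].append(text[i + 9:end])
            i = end + 3
        elif text.startswith("<?", i):              # XML declaration / PI
            i = text.index("?>", i) + 2
        elif text.startswith("</", i):              # closing tag
            end = text.index(">", i)
            name, attrs, parts = stack.pop()
            if name != text[i + 2:end].strip():
                raise ValueError("mismatched closing tag for " + name)
            path = "/".join([entry[0] for entry in stack] + [name])
            closed.append((path, attrs, "".join(parts).strip()))
            i = end + 1
        elif text[i] == "<":                        # opening tag
            name, attrs, self_closing, i = read_tag(text, i)
            if self_closing:
                path = "/".join([entry[0] for entry in stack] + [name])
                closed.append((path, attrs, ""))
            else:
                stack.append([name, attrs, []])
        else:                                       # character data
            end = text.find("<", i)
            end = len(text) if end == -1 else end
            if stack:
                stack[-1][2].append(unescape(text[i:end]))
            i = end
    return closed


# Hypothetical metadata, not copied from a real repository.
SAMPLE = """<?xml version="1.0"?>
<!-- a comment that should be skipped -->
<package>
  <name>sqlite</name>
  <summary><![CDATA[Embeddable <SQL> engine]]></summary>
  <requires>
    <entry name="libc.so.6"/>
    <entry name="sh &amp; friends" note="a > b"/>
  </requires>
</package>"""

if __name__ == "__main__":
    for path, attrs, _ in parse(SAMPLE):
        if path.endswith("requires/entry"):
            print(attrs["name"])
```

Even this small version needs special cases for quoted `>` characters and entity ordering, which gives a sense of how the accuracy concerns arise.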

If none of these are found to be viable, we can certainly fall back to using CentOS 6's sqlite for as long as it is available. CentOS 6 is not going anywhere anytime soon.

I'll be happy to take future development back from you once you're happy with the part you're working on. Didn't mean to imply any pressure by assigning this to you.

@meinwald

Ideally this would just be a workaround for CentOS 5 and older. There are two distinct issues, the first being changing dependencies for sqlite. Due to the infrequent releases of CentOS, I am comfortable, at least for the time being, tracking the dependencies for sqlite and keeping that able to bootstrap on all CentOS 6+, so those should all be able to work natively. I think the vast majority of users will be wanting CentOS 6+ at this point due to the age of CentOS 5, so while I will support CentOS 5 and lower, I think it is reasonable to use a workaround for it.

As far as parsing XML, that would solve the problem, but it is tricky to implement correctly. To comment on your suggestions:

  • New parser in shell/awk. This would work, but it would be difficult to ensure accuracy. While it does not technically need to be a full parser, it would need to be fairly close, as things like order of elements could change. Whitespace convention definitely does change between releases, so it would need to go character by character.
  • Perl specifically has a lot more dependencies than sqlite, so we would be in the same situation. Python and ruby would also have this issue. In a lot of cases programming languages are even built with sqlite support so they would pull in sqlite plus all their things. Even if there were something that would be able to be used almost directly, there tend to be some base packages that get pulled in with almost everything, and sometimes they change names around or split/merge things, so I don't think it's possible to reliably install anything with one set of dependencies that would work against all or most releases. It would be possible to leverage software from an existing guest, but it would add complexity to ensure compatibility against a lot of distros and it would prevent CentOS from being the first installed client.
  • An existing third-party library is probably the best available option for parsing XML, though it would add a dependency.

There are potentially a few ways this can end up working without adding in XML parsing, and it will be hard to see which is best until CentOS 7 is released (RHEL7 beta is out but they organize things a bit differently). I still think the workaround for CentOS 5 and lower is the best we can do now, but if there were other uses for parsing XML in core Bedrock or XML code were otherwise brought into core, I would make use of that.

I haven't had as much time as I thought I would this past week, but I should be able to get the CentOS 5 workaround code written in the next week or so. At that point, it may be clearer what is workable.

@paradigm
Member

For now assume no fancy new XML stuff. If the dependency resolution stuff is sufficiently compartmentalized/modular I expect we can always go back and change it. No pressure on the time tables.

@paradigm paradigm modified the milestone: Hawkey Jun 25, 2014
@paradigm
Member

Updating the issue with the current line of thinking about this project:

There are three strategies being pursued:

  1. Using very portable code, figure out, acquire, and unpackage everything required to bootstrap the target distro. Then, chroot into where it was unpackaged and use the distro's own code to bootstrap what we want for a client. For example, for Debian-based systems, have a busybox-based shell script parse a Debian repository for debootstrap and its dependencies, acquire and extract those into a temporary directory, chroot into that and run debootstrap to get the client, then clean up.
  2. Acquire an installation disk image for the target distro, then mount and chroot into it. Then, use the software on the image to bootstrap a client. The advantage here over strategy (1) is that we do not need to do things such as calculate dependencies in our own code. The disadvantages are that (a) this will likely take much more bandwidth and temporary disk space than would be necessary with (1), (b) the layout of disk images could change between releases, so this would be unlikely to be future-proof, and (c) there is no guarantee the image will have sufficiently flexible software on it for our uses.
  3. Various distros provide pre-made tarballs or images of a filesystem we can use. These include Gentoo (tarball), Arch (tarball) and Fedora (filesystem image). The only downside to these is that most distros do not provide them. There are third party services which do, such as OpenVZ templates, but I would rather not use or recommend them if we can avoid it.
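
The Debian example in strategy (1) can be sketched roughly as follows. This is an illustration in Python, not the busybox shell script described above. It parses a `Packages`-style index and computes which packages to fetch for a given root package. The index contents are hypothetical, version constraints are dropped, and only the first alternative of each `a | b` clause is followed.

```python
def parse_packages(index_text):
    """Map each package name to the dependency names listed for it."""
    deps = {}
    for stanza in index_text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t"):
                continue  # continuation lines are ignored in this sketch
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        names = []
        for field in ("Pre-Depends", "Depends"):
            for clause in fields.get(field, "").split(","):
                # "a (>= 1.0) | b": keep the first alternative, drop the version
                first = clause.split("|")[0].split("(")[0].strip()
                if first:
                    names.append(first.split(":")[0])  # drop ":any" qualifiers
        deps[fields["Package"]] = names
    return deps


def closure(deps, root):
    """Return packages to fetch for `root`, in breadth-first order."""
    needed, queue = [], [root]
    while queue:
        name = queue.pop(0)
        # Names missing from the index (e.g. virtual packages) are skipped.
        if name in needed or name not in deps:
            continue
        needed.append(name)
        queue.extend(deps[name])
    return needed


# Hypothetical index contents, for illustration only.
SAMPLE_INDEX = """\
Package: debootstrap
Depends: wget, gnupg (>= 1.4) | gpgv

Package: wget
Depends: libc6 (>= 2.17), zlib1g

Package: gnupg
Depends: libc6

Package: libc6
Depends: libgcc1

Package: zlib1g
Pre-Depends: libc6
"""

if __name__ == "__main__":
    print(closure(parse_packages(SAMPLE_INDEX), "debootstrap"))
```

The line-oriented `Packages` format is what makes this tractable with simple text tools, in contrast to the XML and sqlite metadata used by RHEL-based distros.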

The primary sticking point at the moment is RHEL-based distros (other than Fedora). Information such as dependencies is provided in both XML and sqlite. Parsing either of these with just busybox is undesirable, making strategy (1) undesirable as well. The CentOS installer, and presumably other RHEL-based installers, does not actually come with yum installed in the installer. Thus, we cannot utilize yum from the installer to determine dependencies. Moreover, option (3) does not seem to exist for CentOS.

To resolve this sticking point, we may have to do one of the following:

  • To allow strategy (1) or (2), we may have to hardcode dependencies for either yum, sqlite or some XML parser for each release of each non-Fedora RHEL distro.
  • To allow strategy (1) or (2), we may have to write our own XML parser in shell, include a pre-existing shell XML parser into the core, or include some non-shell but static XML parser into the core.
  • To allow strategy (2), we may have to write python code to utilize the yum libraries in the installer image.