Ros2 rosindex generation and deploy #568

Closed
sloretz opened this issue Aug 14, 2018 · 14 comments

Comments

@sloretz
Contributor

sloretz commented Aug 14, 2018

I'm looking for input on automating the generation and deployment of a ros2 equivalent of the rosindex website. @mikaelarguedas gave me an explanation of how this might look, and recommended opening a ticket for more comments.

Task background

rosindex is a static website using jekyll and a jekyll ruby plugin. Content is generated from ros/rosdistro. Once generated the result is pushed to a github repo to be hosted using github pages. The entire website is generated at once, meaning content for all ROS distros would be generated at the same time even if only one distro has changed. Generating bouncy + ardent takes less than a minute on my machine.

The website generation job would need to happen in a docker container that has ruby and some other dependencies installed.

Job Dependencies

The website should reflect released packages, so it would need to be regenerated any time a ROS distro is synced from testing to main. Since it generates content from all distros, something needs to be done to avoid generating content for other distros that have changed in ros/rosdistro but not been sync'd to main. Maybe a job could cache the ROS distro specific distribution.yaml file when the sync to main happens?

+-------------------+    +----------------------+
|sync to main ardent+--->+cache rosdistro ardent+-----+
+-------------------+    +----------------------+     |   +----------------------+
                                                      <-->+generate ros2 rosindex|
+-------------------+    +----------------------+     |   +----------------------+
|sync to main bouncy+--->+cache rosdistro bouncy+-----+
+-------------------+    +----------------------+

The output of generate ros2 rosindex is a commit to some GitHub Pages repository, so the job needs a GitHub user with write access. Are there other jobs that push to GitHub which this job could be based on?
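
A minimal sketch of the "cache rosdistro <distro>" step from the diagram above, assuming a local checkout of ros/rosdistro with one `<distro>/distribution.yaml` per distro. The function name and the cache layout are hypothetical, not part of any existing buildfarm job.

```python
import shutil
from pathlib import Path


def cache_distribution_file(rosdistro_checkout, distro, cache_dir):
    """Snapshot <distro>/distribution.yaml from a rosdistro checkout.

    Hypothetical helper: it would run when the distro is synced to main,
    so later index generation reads the snapshot, not the live file.
    """
    src = Path(rosdistro_checkout) / distro / "distribution.yaml"
    dst = Path(cache_dir) / distro / "distribution.yaml"
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    return dst
```

The generation job would then read every distro from the cache directory, so a distro that has changed upstream but has not been synced keeps its previous snapshot.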

@gavanderhoorn
Contributor

With rosindex, are you referring to rosindex by @jbohren?

@tfoote
Member

tfoote commented Aug 14, 2018

@gavanderhoorn yes, we're referring to that.

RE: Push access
We can relatively easily set up SSH keys to allow pushing to GitHub from a specific job.

Re: job dependencies
I would recommend that we not try to couple this with the syncs.

rosindex indexes a lot more than just the released versions so there's actually a lot of content that's updated more asynchronously. It's pulling content from the upstream repositories etc that aren't even released. I would suggest that we plan to run it periodically like we do already with the doc jobs.

Being "synced" to main is not usually what we want for documentation/indexing. We usually have all documentation pull from the branch flagged by the doc tag so that a documentation fix does not require a rerelease of the package to become available. That would cause a lot of unnecessary rebuilding of packages.
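
To illustrate the difference, here is a small sketch that picks the checkout target from a repository's `doc` entry rather than from its `release` entry. The dictionary mirrors the shape of a repository entry in distribution.yaml; the URL and versions are made-up example values, and `doc_checkout_target` is a hypothetical helper.

```python
def doc_checkout_target(repo_entry):
    """Return (url, version) from the 'doc' entry, or None if there is none."""
    doc = repo_entry.get("doc")
    if not doc:
        return None
    return doc["url"], doc["version"]


# Example values only; not taken from a real distribution file.
example_entry = {
    "doc": {
        "type": "git",
        "url": "https://example.com/org/pkg.git",
        "version": "master",
    },
    "release": {"version": "0.5.1-0"},
}
```

With this selection, a documentation fix pushed to the `doc` branch shows up on the next periodic run without a new release.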

@dirk-thomas
Member

> I would recommend that we not try to couple this with the syncs.

👍

The doc_independent-packages job runs daily atm. Adding additional repos to this job can easily be done by configuration: see https://github.com/ros-infrastructure/ros_buildfarm_config/blob/195e00fec02727858afb75a4d99c1febd8a97365/doc-independent-build.yaml#L5-L12

While the job has a different scope at the moment it should be fairly easy to make it work for this use case:

  • The repos are cloned "shallow":
    'python3 -u $WORKSPACE/ros_buildfarm/scripts/wrapper/git.py clone --depth 1 %s $WORKSPACE/repositories/%s' % (repo_url, repo_name),
    You probably want a full clone in order to push back to the repo.
  • For each repo make html is being invoked:
    echo "# BEGIN SUBSECTION: $subdir: make html"
    cd $BASE_DIR/$subdir/doc
    mkdir -p $OUTPUT_DIR/$subdir
    ln -s $OUTPUT_DIR/$subdir _build
    (set -x; make html)
    echo "# END SUBSECTION"
    so wrapping the logic in a Makefile would be all that is necessary to process that new repo.
  • The job uploads the documentation to docs.ros.org but if the package doesn't generate any it would just skip this part.
  • Instead the job could commit the changes during its make invocation.
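
The shallow versus full clone choice from the first bullet could be sketched as follows. This uses plain `git clone` rather than the buildfarm's git.py wrapper, and `clone_command` is a hypothetical helper, not existing ros_buildfarm code.

```python
def clone_command(repo_url, dest, shallow=True):
    """Build the argument list for cloning a repository.

    shallow=True matches the current '--depth 1' behaviour; shallow=False
    yields a full clone, for a job that needs to push back.
    """
    cmd = ["git", "clone"]
    if shallow:
        cmd += ["--depth", "1"]
    return cmd + [repo_url, dest]
```

A repo that has to commit its results would be cloned with `shallow=False`, while every other repo keeps the faster default.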

I am just describing this approach since I think adding a new job type requires a lot more code as well as documentation (the logic needs to run within a custom generated Docker container, there probably needs to be a configuration file, the new job needs to be documented, etc.).

@sloretz
Contributor Author

sloretz commented Aug 23, 2018

@dirk-thomas Wrapping all the logic in a makefile seems doable for rosindex as long as the user running make html has permission to install stuff into the docker container. Besides running make html locally, how do I test a new doc_independent job?

@dirk-thomas
Member

> as long as the user running make html has permission to install stuff into the docker container

I am not sure that is the case.

> how do I test a new doc_independent job?

There is no "local invocation" of that job type. Also, since your make html aims to upload the data, it would affect the live server, which wouldn't be desired.

@sloretz
Contributor Author

sloretz commented Aug 23, 2018

> as long as the user running make html has permission to install stuff into the docker container

> I am not sure that is the case.

Sounds like I need to learn how to set up a new job type. Do you have a link to a past PR that added one?

> Also since your make html aims to upload the data that would be affecting the live server which wouldn't be desired.

It might not matter, but assume I made make html upload to a staging server for the purpose of making sure the logic works. Do I need to set up a full buildfarm on a computer to test the job?

@dirk-thomas
Member

> Sounds like I need to learn how to set up a new job type.

As mentioned above creating new job types is fairly "expensive" so I wouldn't suggest that. Instead the Docker container invoking these Makefiles could be updated to contain additionally required dependencies.

> Do I need to set up a full buildfarm on a computer to test the job?

I would rather suggest duplicating the job (on the existing buildfarm) and iterating on the copy (without actually pushing the results of the other packages to the server).

@hidmic
Contributor

hidmic commented Oct 9, 2018

Alright, we can piggyback rosindex builds on the doc_independent job. But @sloretz has a point, we need additional dependencies to build rosindex that are likely not present in the buildfarm containers.

Off the top of my head, we could either change the doc independent Dockerfile template to install these or change the doc_independent job to perform the necessary setup prior to the build. Does that sound reasonable to you @dirk-thomas ?

@dirk-thomas
Member

> we could either change the doc independent Dockerfile template to install these or change the doc_independent job to perform the necessary setup prior to the build.

As I mentioned in my previous comment:

> Instead the Docker container invoking these Makefiles could be updated to contain additionally required dependencies.

@hidmic
Contributor

hidmic commented Oct 10, 2018

@dirk-thomas Right, and sorry for the ignorance if I'm wrong, but doesn't that mean to add the necessary dependencies to the doc_independent Dockerfile template here? I'm new to the buildfarm ecosystem, but it'd seem that the container image is re-generated (reusing cached layers most likely though) when the Jenkins job runs.

@dirk-thomas
Member

> doesn't that mean to add the necessary dependencies to the doc_independent Dockerfile template here?

Yes.

> it'd seem that the container image is re-generated (reusing cached layers most likely though) when the Jenkins job runs.

Yes.

@hidmic
Contributor

hidmic commented Oct 11, 2018

Alright, dove deeper into the task and there're a couple alternatives.


If we go down the "reuse the doc_independent job" route, I found some itchy details:

  • On dependencies. To build rosindex we need to pull a rather long list of Debian packages (though some we can certainly prune). As suggested by @dirk-thomas, the simplest thing we can do is add them to the Dockerfile here. On top of that come the bundled gems. As suggested by @nuclearsandwich, we can bundle them into the repository. To be resilient to platform changes, we need to rebuild the gems' C extensions, if any (and there are some, e.g. for nokogiri, eventmachine, etc.). For that we need the bundle pristine command (since bundle exec gem pristine --all is not working, see bug ticket), which unfortunately is not available in bundler 1.11.2 as provided in the Ubuntu 16.04 apt sources. So we need an extra gem install bundler step in the Dockerfile.
  • On layout. All provided repositories are assumed to have a doc/ subdirectory from where make html can be run (see here). We do have the target but not the same directory layout. Thus we need to make a special case for rosindex.
  • On networking. To build rosindex, multiple (and by multiple I mean many) repositories have to be fetched. But the doc_independent container won't allow that. The simplest thing to do would be to add --net=host to the docker run statement here. We don't have to make this a special case for rosindex, but I'm not sure if we want to do this for all doc independent repos being built.
  • On output. The doc_independent job rsyncs the generated documentation, see here. Unless we defer rosindex production site updates to another step after the job has run, this logic won't do for us. We could make the output directory a local clone of the production repository (as we do now), and selectively rsync or git push content, or just make rosindex a special case (e.g. using another directory below generated_documentation/). Either way, we also need to set up push access to the production repository.
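
The layout and networking points above could be handled with two small switches, sketched here. Both helpers are hypothetical and do not reflect the current doc_independent implementation; `--net=host` is the docker run flag mentioned in the networking bullet.

```python
from pathlib import Path


def find_make_dir(repo_root):
    """Locate the directory from which 'make html' should run.

    Prefers <repo>/doc, the layout the job assumes today, and falls
    back to the repository root. Returns None if neither has a Makefile.
    """
    root = Path(repo_root)
    for candidate in (root / "doc", root):
        if (candidate / "Makefile").is_file():
            return candidate
    return None


def docker_run_args(image, host_network=False):
    """Build 'docker run' arguments, enabling host networking on request."""
    args = ["docker", "run"]
    if host_network:
        args.append("--net=host")
    return args + [image]
```

Making `host_network` a per-repository setting would avoid opening up networking for every doc independent repo just because rosindex needs it.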

This applies whether we make all these changes to the doc_independent job or fork it into e.g. a doc_index job that's tailored for rosindex. The latter is (potentially much) more work, but IMHO has the benefit of keeping both intent and logic clearer.


A different approach would be a source side container (as suggested by @nuclearsandwich). Everything is then kept along with rosindex sources. The buildfarm becomes just an executor of a scripted build that may as well be run locally. That's quite flexible and probably easier to maintain, but it means that a new job (a clone & run or a clone, build & push one if we want to have separate explicit build and push phases) has to be created from scratch (along with documentation and everything).
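
A rough sketch of what such a scripted build could look like, with the push phase kept optional so the same plan can be run locally. The function, its parameters, and the commit message are all hypothetical; it also assumes the output directory is itself a clone of the deploy repository.

```python
def build_plan(source_url, workdir, output_dir, deploy_url=None, branch=None):
    """Return the commands for the clone, build and (optional) push phases.

    The push phase is only included when both deploy_url and branch are
    given, so a local run stops after the build.
    """
    plan = [
        ["git", "clone", source_url, workdir],
        ["make", "-C", workdir, "html"],
    ]
    if deploy_url and branch:
        plan += [
            ["git", "-C", output_dir, "add", "--all"],
            ["git", "-C", output_dir, "commit", "-m", "Regenerate site"],
            ["git", "-C", output_dir, "push", deploy_url, "HEAD:" + branch],
        ]
    return plan
```

On the buildfarm the job would execute every command in the plan, while a developer would call it without `deploy_url` to get the same clone and build steps and nothing else.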


I wanted to reach out to you guys for some back and forth before going down the (hopefully correct) rabbit hole. Personally, I would like building and deploying to be pretty much the same whether you run it on the buildfarm or locally. That would also simplify automating all this. What do you think?

@dirk-thomas
Member

@hidmic Can this be closed with #576 merged and deployed?

@hidmic
Contributor

hidmic commented Nov 13, 2018

@dirk-thomas Yes it can, I thought I had used a Closes #... sentence in #576.
