Use new Docker images in kitchen (Chef) tests to get Systemd activated #439

Closed
11 tasks done
rshad opened this issue Feb 3, 2020 · 7 comments

rshad commented Feb 3, 2020

Hi all!

We need to change the Docker images used in the Kitchen tests so that they run Systemd, which fits Kitchen tests better when checking whether a service is running, enabled, or disabled.

Tasks

  • Identify the proper Docker images.

  • Adapt kitchen.yml to use ENVIRONMENT variables.

  • Adapt run.sh

  • Adapt the corresponding tests.

  • Verify the changes

    • Ubuntu
      • 14.04
      • 16.04
      • 18.04
  • CentOS 7.

  • Amazon Linux 2.

Kr,

Rshad

rshad self-assigned this Feb 3, 2020
rshad added Chef and removed Chef labels Feb 3, 2020
rshad added this to the Sprint - 106 - DevOps milestone Feb 3, 2020

rshad commented Feb 3, 2020

Working branch: feature-439-docker-images-systemd.

rshad commented Feb 4, 2020

Hi all!

The required tasks are almost completed.

Below, we briefly describe the changes we applied to reach our goals in this issue.

Folders/Files Re-structuring

We first decided to restructure the folders and files under kitchen/wazuh-chef, so that it now looks like this:

├───auxiliary
│   └───cookbooks
│       └───load_attributes
│           └───recipes
├───common
├───environments
│   ├───agent
│   └───manager
│       └───test_environment
└───suites
    └───manager_agent
        └───tests
            └───base
  • auxiliary
    auxiliary/cookbooks contains the auxiliary cookbooks we need for testing purposes. Currently there is only one, load_attributes, which we use to save the node's attributes into a JSON file on the machine where Chef is running.

  • common
    It contains all the files/scripts that can be used by the different testing suites.

  • environments
    It contains the test environment files for each of the tested cookbooks. In this case, those are the manager and the agent.

  • suites
    It contains the different suites or scenarios, each with its kitchen.yml and tests.


Parameterization Variables

Our parameterization variables are environment variables that are set at execution time. These are:

  • IMAGE: Docker image name.
  • PLATFORM: platform name.
  • RELEASE: Docker image release.
  • SUITE_PATH: suite folder path.
  • COOKBOOKS_PATH: cookbooks folder path.
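
Because these values come from the environment, an unset variable renders as an empty string in the ERB expressions of kitchen.yml rather than raising an error. A minimal sketch of a pre-flight check follows, assuming all five variables are mandatory; the helper name missing_kitchen_vars is hypothetical and not part of the repository.

```python
import os

# Variable names taken from the list above. Treating every one of them as
# mandatory is an assumption of this sketch.
REQUIRED_VARS = ("IMAGE", "PLATFORM", "RELEASE", "SUITE_PATH", "COOKBOOKS_PATH")


def missing_kitchen_vars(env=None):
    """Return the required variable names that are unset or empty in env."""
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_kitchen_vars() before invoking Kitchen and aborting when the returned list is non-empty would surface a forgotten variable early, instead of as a malformed platform name.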

Parameterized kitchen.yml

Each testing suite has a dedicated folder under kitchen/wazuh-chef/suites, and each suite has its own kitchen.yml.

We parameterized the platforms section as follows:

platforms:
  - name: <%= ENV['PLATFORM'] %>_<%= ENV['RELEASE'] %>_kitchen_chef
    driver_config:
      image: <%= ENV['IMAGE'] %>
      platform: <%= ENV['PLATFORM'] %>
      forward: 443
      publish_all: true
      run_command: /sbin/init
      privileged: true
      volume:
        - /sys/fs/cgroup:/sys/fs/cgroup:ro
      provision_command:
        - sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
        - python -mplatform | grep -qi debian && apt-get install -y apt-transport-https gnupg2 || yum install -y openssl

Parameterized run.sh

We also parameterized the paths in run.sh as follows:

development_agent_path="$COOKBOOKS_PATH/wazuh_agent/test/environments/development.json"
development_manager_path="$COOKBOOKS_PATH/wazuh_manager/test/environments/development.json"
development_manager_path_master="$COOKBOOKS_PATH/wazuh_manager/test/environments/development-master.json"

Centralized Provisioning commands in kitchen.yml

We no longer run provisioning commands in run.sh, but in kitchen.yml.

platforms:
  - # ...
    driver_config:
      # ...
      provision_command:
        - sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
        - python -mplatform | grep -qi debian && apt-get install -y apt-transport-https gnupg2 || yum install -y openssl

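As you can see, the following command installs the required packages with the package manager that matches the OS distribution. Note that, with the a && b || c pattern, the yum branch also runs if apt-get itself fails on a Debian-based image: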

 python -mplatform | grep -qi debian && apt-get install -y apt-transport-https gnupg2 || yum install -y openssl

Errors and Issues to solve

1. Amazon Linux hostname is not loaded by Chef

On Amazon Linux 2, the agent was not able to register itself with the corresponding manager. It fails when running the task that uses the following arguments:

args = "-m #{agent_auth['host']} -p #{agent_auth['port']} -A #{agent_auth['name']}"

if agent_auth['auto_negotiate']
  args << ' -a ' + agent_auth['auto_negotiate']
end

With the error:

         * execute[/var/ossec/bin/agent-auth -m 172.17.0.3 -p 1515 -A ] action run[2020-02-04T17:06:36+00:00] ERROR: execute[/var/ossecess to exit with [0], but received '1'
       ---- Begin output of /var/ossec/bin/agent-auth -m 172.17.0.3 -p 1515 -A  ----
       STDOUT:
       STDERR: /var/ossec/bin/agent-auth: option requires an argument -- 'A'

It looks like the hostname is not loaded correctly, so the -A option receives an empty value.
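
The failure mode can be illustrated outside Chef. The sketch below is written in Python for illustration only (the recipe itself is Ruby), and the helper build_agent_auth_args is hypothetical. It mirrors the argument building above and refuses an empty agent name instead of passing a bare -A to agent-auth.

```python
def build_agent_auth_args(agent_auth):
    """Build the agent-auth argument list, refusing an empty agent name."""
    name = agent_auth.get("name")
    if not name:
        # An empty name is what produced "-A " with no argument in the log above.
        raise ValueError("agent name is empty; agent-auth -A requires an argument")
    args = ["-m", str(agent_auth["host"]), "-p", str(agent_auth["port"]), "-A", str(name)]
    if agent_auth.get("auto_negotiate"):
        args += ["-a", str(agent_auth["auto_negotiate"])]
    return args
```

A guard of this kind would turn the opaque exit code 1 from agent-auth into an explicit message about the missing hostname.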


2. Docker images created by kitchen are not deleted automatically

We noticed that we had not configured Kitchen to delete the Docker images it creates. To do so, we set remove_images: true in kitchen.yml:

driver:
  name: docker
  use_sudo: false
  remove_images: true
  use_internal_docker_network: true

However, this does not work, as the latest version of the kitchen-docker gem (~ 2.9.0) produces the error:

unknown flag: --type See 'docker --help'. Usage: docker [OPTIONS] COMMAND

This problem was reported in test-kitchen/kitchen-docker#338 and supposedly fixed in test-kitchen/kitchen-docker#340.

To test the fix, we cloned the master branch of https://github.com/test-kitchen/kitchen-docker and replaced the installed gem with it. However, it produced an error when creating the containers, which is reported and fixed in test-kitchen/kitchen-docker#356.

Testing with the corresponding fork branch https://github.com/paulcalabro/kitchen-docker/tree/fix-ip-address-issue resolves that issue and deletes the Docker images, but it produces an exception that originates from the changes made in test-kitchen/kitchen-docker#340.

>>>>>> ------Exception-------
>>>>>> Class: Kitchen::ActionFailed
>>>>>> Message: 1 actions failed.
>>>>>>     Failed to complete #destroy action: [Expected process to exit with [0], but received '1'
---- Begin output of docker -H unix:///var/run/docker.sock rmi 29d6ee71040d ----

We will continue working to fix such errors.

Kr,

Rshad

rshad commented Feb 5, 2020

Hi all!

Regarding the error we got when deleting the Docker images, I created an issue describing it, test-kitchen/kitchen-docker#360, and fixed it in my fork branch. A PR with the fix has also been created: test-kitchen/kitchen-docker#361.

We also adapted the related Dockerfile so that it uses our fork branch of kitchen-docker until the pending PR is merged into the master branch of kitchen-docker and the changes are published in a new release of the gem.

RUN git clone git@github.com:rshad/kitchen-docker.git && \
    cd kitchen-docker && \
    git checkout fix-docker-image-deletion && \
    git pull origin fix-docker-image-deletion

RUN rm -rf /usr/local/bundle/gems/kitchen-docker-2.9.0/* && \
    cp -rf kitchen-docker/* /usr/local/bundle/gems/kitchen-docker-2.9.0/

Kr,

Rshad

rshad commented Feb 5, 2020

Hi all!

Regarding the issue we are facing with the Amazon Linux 2 agent, where the agent is not able to load the node hostname through the Ohai attribute node["hostname"]: we made several attempts to fix it, but none led to a working setup. Below, we describe our study in detail.

  • Assigning the hostname in kitchen.yml
    By default, we do not assign a custom hostname to the created instances; they take the corresponding container ID as the hostname. We suspected that, in the case of amazonlinux, the hostname might not be set, and that setting it manually in kitchen.yml might work, but it did not.

To be sure that the Ohai attributes are loaded correctly, we ran the corresponding command manually once the instances were up and running, and the hostname was set properly.

$ ohai hostname
> [
       <hostname: Container ID>
]

  • Using the Chef function lazy

After investigating this issue, I found that it is probably related to the Ohai attributes not being loaded immediately but needing some time, so we need to delay the variable evaluation. This is where the function lazy comes in.

system cookbook will currently support using node['fqdn'] within templates, but when being used with the variables attribute in resources, you'll need to lazy load.

However, using the function lazy would not help in our case, as we do not want to delay the execution of all the tasks in the recipe agent.pp, but only the related ones, and in a determined order.

  • Using ruby_block

ruby_block guarantees that the code included in it is evaluated lazily (at converge time), but it did not help in our case because it does not support running the execute resource, and our main target task is an execute task:

    execute "#{dir}/bin/agent-auth #{args}" do
      timeout 30
      ignore_failure node['ossec']['ignore_failure']
      only_if { agent_auth['register'] == 'yes' && agent_auth['host'] && !File.size?("#{dir}/etc/client.keys") }
    end

  • Converging two consecutive times

We noticed that, after creating the instance for the amazonlinux agent, the first converge fails to register the agent, but the second one succeeds. This suggests that the target attributes took some time to load, so on the second execution there was no problem.

Kr,

Rshad

rshad commented Feb 6, 2020

Hi all!

We decided to keep supporting Ubuntu 14.04 in our tests, so we needed to adapt the testinfra tests: we added some new tests that use built-in testinfra functions, and these rely on systemd (systemctl) to deal with the target services.

    dist = host.system_info.distribution.lower()
    release = host.system_info.release
    manager = host.service("wazuh-manager")

    # Ubuntu 14.04 has no systemd, so the service checks are skipped there.
    if not (dist == 'ubuntu' and release.startswith("14")):
        with host.sudo():
            assert manager.is_running
            assert manager.is_enabled

In this case, we add a conditional statement to avoid executing such functions if the OS distribution is Ubuntu 14.04.
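
If more tests need the same guard, the condition could be factored into a small helper. This is only a sketch: the names supports_systemd_checks and check_manager_service are hypothetical, and host is assumed to be the fixture provided by testinfra.

```python
def supports_systemd_checks(distribution, release):
    """Return False only for Ubuntu 14.x, the one tested image without systemd."""
    return not (distribution.lower() == "ubuntu" and release.startswith("14"))


def check_manager_service(host):
    """Assert that wazuh-manager is running and enabled where systemd exists."""
    info = host.system_info
    if not supports_systemd_checks(info.distribution, info.release):
        return  # nothing to assert on Ubuntu 14.04
    manager = host.service("wazuh-manager")
    with host.sudo():
        assert manager.is_running
        assert manager.is_enabled
```

Keeping the distribution check in one place means that dropping Ubuntu 14.04 later only requires removing the helper, not editing every test.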


Regarding the issue we faced with Amazon Linux 2, we think that Chef probably does not support Amazon Linux as a Docker container. However, we found no official documentation or resources from Chef on this.

We detected a related WARNING message in the logs:

       [2020-02-06T17:27:07+00:00] WARN: Plugin Network: unable to detect ipaddress
       Creating a new client identity for agent-amazonlinux-latest-kitchen-chef using the validator key.

This message indicates that Chef created a new identity for the instance and therefore does not recognize its attributes.

We found a similar issue in chef/ohai#397, but in that case it is related to Gentoo.

To solve this issue, we finally decided to manually set the hostname for the agent registration.

kr,

Rshad

rshad commented Feb 7, 2020

Hi all!

We added some new changes to finish the required tasks in this issue.

AmazonLinux2 Hostname Issue

To solve the issue that occurs when running the tests on Amazon Linux 2, we decided to manually replace the attribute used as the argument of agent-auth -A <arg>, using the sed command as follows.

The original attribute in wazuh-chef/cookbooks/attributes/authd.rb:

default['ossec']['agent_auth']['name'] = node['hostname']

In this case we need to replace node['hostname'] with a string literal, e.g. "amazon_agent", and this is done as follows:

sed -i 's/node\['.*hostname.*'\]/"amazon_agent"/g' ../cookbooks/wazuh_agent/attributes/authd.rb

This command is included in the auxiliary script kitchen/wazuh-chef/common/run.sh:

if [[ $PLATFORM == *"amazon"* ]]; then
    sed -i 's/node\['.*hostname.*'\]/"amazon_agent"/g' "$COOKBOOKS_PATH/wazuh_agent/attributes/authd.rb"
fi
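
In the sed expression, the inner single quotes close and reopen the shell quoting, so the pattern sed actually receives is node\[.*hostname.*\]. The greedy .* would over-match if a line contained further brackets after the attribute. As a sketch of the same substitution with a tighter pattern, written in Python for illustration (the helper pin_agent_name is hypothetical, and run.sh itself uses sed):

```python
import re

# Tighter than the sed pattern: matches only node['hostname'] or node["hostname"].
HOSTNAME_ATTR = re.compile(r"""node\[['"]hostname['"]\]""")


def pin_agent_name(attributes_source, agent_name="amazon_agent"):
    """Replace node['hostname'] with a double-quoted literal agent name."""
    literal = '"{}"'.format(agent_name)
    # A function replacement keeps backslashes in the name from being
    # interpreted as regex escapes.
    return HOSTNAME_ATTR.sub(lambda _match: literal, attributes_source)
```

The replacement is emitted with double quotes so that the resulting line in authd.rb is a valid Ruby string assignment rather than a reference to an undefined name.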

kitchen.yml Configuration Parameterization

We also adapted kitchen.yml as much as possible to avoid repeating platforms entries. We had been using a specific entry for Ubuntu 14.04, because it failed with run_command: /sbin/init, which is needed for the OS images with Systemd. To solve this, we also parameterized the run_command value as an environment variable, set at execution time.

platforms:
  - name: <%= ENV['PLATFORM'] %>_<%= ENV['RELEASE'] %>_kitchen_chef
    driver_config:
      image: <%= ENV['IMAGE'] %>
      platform: <%= ENV['PLATFORM'] %>
      publish_all: true
      run_command: <%= ENV['RUN_COMMAND'] %>
      privileged: true
      volume:
        - /sys/fs/cgroup:/sys/fs/cgroup:ro
      provision_command:
        - sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
        - awk -F= '/^NAME/{print $2}' /etc/os-release | grep -qi 'debian\|ubuntu' && apt-get install -y apt-transport-https gnupg2 ca-certificates || yum install -y openssl

In the case of Ubuntu 14.04 the value of run_command is:

"/usr/sbin/sshd -D -o UseDNS=no -o UsePAM=no -o PasswordAuthentication=yes -o UsePrivilegeSeparation=no -o PidFile=/tmp/sshd.pid"

This is the default value according to the official documentation.

For the rest of the OS images (Ubuntu 16.04 and 18.04, CentOS 7, and Amazon Linux 2) it is:

/sbin/init
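
The selection rule described above can be sketched as a small function. The name run_command_for is hypothetical, and the exact PLATFORM and RELEASE strings (for example "ubuntu" and "14.04") are assumptions about how the variables are set.

```python
# Default sshd command quoted above, used for images without systemd.
SSHD_RUN_COMMAND = (
    "/usr/sbin/sshd -D -o UseDNS=no -o UsePAM=no -o PasswordAuthentication=yes "
    "-o UsePrivilegeSeparation=no -o PidFile=/tmp/sshd.pid"
)
# Init command used for images that boot systemd.
SYSTEMD_RUN_COMMAND = "/sbin/init"


def run_command_for(platform, release):
    """Pick the RUN_COMMAND value for a platform/release pair."""
    if platform.lower() == "ubuntu" and release.startswith("14"):
        return SSHD_RUN_COMMAND
    return SYSTEMD_RUN_COMMAND
```

The returned string would then be exported as RUN_COMMAND before Kitchen renders kitchen.yml.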

Kr,

Rshad

rshad commented Feb 10, 2020

Hi all!

We set use_cache: false in kitchen.yml so that new Kitchen instances do not reuse old Docker images, if any exist.

For more info, please check https://github.com/wazuh/wazuh-jenkins/issues/1202#issuecomment-584090248.

Kr,

Rshad
