- Getting started
- Development
- Formatting Data for Yupana (without QPC)
- Sending Data to Insights Upload service for Yupana (without QPC)
- Advanced Topics
Full documentation is available through readthedocs.
Yupana is a service that works with the Insights Platform services. Its primary purpose is to receive bulk uploads of hosts. A client creates a specially crafted tarball and sends the file to the Insights Ingress service. The Ingress service notifies Yupana via Kafka that a tarball has arrived for processing. Yupana downloads the tarball, performs top-level validation, and sends the host JSON to the Insights Host Based Inventory service. Yupana does not validate the JSON of a host, and the Host Based Inventory service does not notify Yupana of validation errors.
At this time the Makefile commands only work on macOS. If you develop on another platform, you will need to bring up the Ingress, Host Based Inventory, and dependent services manually. Information for these services can be found at https://github.com/RedHatInsights/insights-ingress-go/ and https://github.com/RedHatInsights/insights-host-inventory/. Follow their READMEs for instructions.
To get started developing against Yupana, first clone local copies of the Yupana, Ingress, and Host Inventory git repositories:
git clone https://github.com/quipucords/yupana
git clone https://github.com/RedHatInsights/insights-ingress-go
git clone https://github.com/RedHatInsights/insights-host-inventory.git
This project is developed using the Django web framework. Many configuration settings can be read in from a .env file. An example file, .env.dev.example, is provided in the repository. To use the defaults, simply run:
cp .env.dev.example .env
Modify as you see fit.
The /etc/hosts file must be updated for Kafka and Minio. Open your /etc/hosts file and add the following lines to the end:
127.0.0.1 kafka
127.0.0.1 minio
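To confirm that both entries are in place, a check along the following lines may help. This is a minimal sketch and not part of the Yupana repository; the function name missing_host_entries is hypothetical.

```python
from typing import Iterable, List


def missing_host_entries(hosts_text: str, required: Iterable[str]) -> List[str]:
    """Return the required hostnames that have no entry in the hosts file text."""
    present = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Each entry has the form: <address> <hostname> [<alias> ...]
        _address, *names = line.split()
        present.update(names)
    return [name for name in required if name not in present]
```

For example, calling missing_host_entries(open("/etc/hosts").read(), ["kafka", "minio"]) should return an empty list once both lines have been added.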
This is a Python project developed using Python 3.6. Make sure you have at least this version installed. A Pipfile is provided. Pipenv is recommended because it combines virtual environment management (virtualenv) and dependency management (pip). To install pipenv, use pip:
pip3 install pipenv
Then project dependencies and a virtual environment can be created using:
pipenv install --dev
First, make sure you have no zombie docker containers that could conflict with the services you are bringing up. Run:
docker ps -a
Review the list for containers that would conflict with the services about to be brought up. It is safest to have none at all, but containers that will not conflict can be left running.
To run the ingress service, yupana, and host inventory service locally, use the following command:
make local-dev-up
To check if the services are up, run:
docker ps --format '{{.Names}}'
You should see the following services up and running.
grafana
yupana_db-host-inventory_1
yupana_db_1
prometheus
insightsingressgo_ingress_1
insightsingressgo_kafka_1
insightsingressgo_zookeeper_1
insightsingressgo_minio_1
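If you prefer to check this list programmatically, the comparison can be sketched as below. The expected names are copied from the list above; the helper name missing_services is hypothetical, and the input is assumed to be the text printed by the docker ps --format command shown earlier.

```python
from typing import List

# Container names expected after `make local-dev-up`, as listed above.
EXPECTED_SERVICES = {
    "grafana",
    "yupana_db-host-inventory_1",
    "yupana_db_1",
    "prometheus",
    "insightsingressgo_ingress_1",
    "insightsingressgo_kafka_1",
    "insightsingressgo_zookeeper_1",
    "insightsingressgo_minio_1",
}


def missing_services(docker_ps_output: str) -> List[str]:
    """Return expected container names that do not appear in the docker ps output."""
    running = {line.strip() for line in docker_ps_output.splitlines() if line.strip()}
    return sorted(EXPECTED_SERVICES - running)
```

An empty result means every expected container is running; extra, unrelated containers are ignored.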
To send the sample data, run the following commands:
- Prepare the sample for sending:
make sample-data
- Locate the temp file name. You will see a message like the following:
The updated report was written to temp/sample_data_ready_1561410754.tar.gz
- Send the temp file to your local Yupana. Copy the name of this file into the upload command as shown below:
make local-upload-data file=temp/sample_data_ready_1561410754.tar.gz
- Watch the Kafka consumer for a message to arrive. You will see something like this in the consumer iTerm window:
{"account": "12345", "rh_account": "12345", "principal": "54321", "request_id": "52df9f748eabcfea", "payload_id": "52df9f748eabcfea", "size": 1132, "service": "qpc", "category": "tar", "b64_identity": "eyJpZGVudGl0eSI6IHsiYWNjb3VudF9udW1iZXIiOiAiMTIzNDUiLCAiaW50ZXJuYWwiOiB7Im9yZ19pZCI6ICI1NDMyMSJ9fX0=", "url": "http://minio:9000/insights-upload-perm-test/52df9f748eabcfea?AWSAccessKeyId=BQA2GEXO711FVBVXDWKM&Signature=WEgFnnKzUTsSJsQ5ouiq9HZG5pI%3D&Expires=1561586445"}
- Look at the Yupana logs to follow the report processing to completion.
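The Kafka message shown in the steps above is plain JSON, and its b64_identity field is a base64-encoded JSON document. The sketch below shows one way to unpack it for inspection. The function name parse_upload_message is hypothetical, and the choice of fields to extract is an assumption for illustration, not a description of what Yupana reads internally.

```python
import base64
import json


def parse_upload_message(raw_message: str) -> dict:
    """Extract a few fields from an ingress upload message, decoding the identity."""
    message = json.loads(raw_message)
    # b64_identity holds base64-encoded JSON describing the uploader.
    identity_json = base64.b64decode(message["b64_identity"]).decode("utf-8")
    identity = json.loads(identity_json)["identity"]
    return {
        "account": message["rh_account"],
        "request_id": message["request_id"],
        "download_url": message["url"],
        "org_id": identity["internal"]["org_id"],
    }
```

For the sample message above, the decoded identity carries account number 12345 and org ID 54321.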
Once all of the services have been brought up, you can view the metrics collected by our app through Prometheus and display them using Grafana.
You can view the running Prometheus server at http://localhost:9090. Here, you can execute queries by typing in the name of the metric you want and pressing the execute button. You can also view the target that we are monitoring (our metrics endpoint) and the configuration of the Prometheus server.
If you would like to change the configuration of the Prometheus server, you can edit the configuration file found here. For example, if you would like finer-grained metrics, you can change the scrape interval for the yupana job before bringing the local development services up. Currently we poll the /metrics endpoint every 10s to mimic the scrape interval used in CI, but you can set this to 1s for more responsive metrics in development.
In order to visualize the metrics that we are collecting, log in to Grafana at http://localhost:3000:
- Log in using admin as the username and secret as the password.
- Once you are logged in, click on Create your first data source, and select Prometheus. Leave all of the defaults, but enter http://docker.for.mac.localhost:9090 into the URL field. Scroll down and click Save & Test.
- Now you can import our development dashboard. Click on the + in the left-hand toolbar and select Import. Next, select Upload .json file in the upper right-hand corner. Now, import dev-grafana.json. Finally, click Import to begin using the yupana dashboard to visualize the data.
To bring down all services run:
make local-dev-down
Yupana uses tox to standardize the environment used when running tests. Essentially, tox manages its own virtual environment and a copy of required dependencies to run tests. To ensure a clean tox environment run:
tox -r
This will rebuild the tox virtual env and then run all tests.
To run unit tests specifically:
tox -e py36
If you would like to run a single test, pass its dotted path after the -- separator:
tox -e py36 -- processor.tests_report_processor.ReportProcessorTests.test_archiving_report
Note: You can specify any module or class to run all tests in the class or module.
To lint the code base:
tox -e lint
To check whether or not the product manifest needs to be updated, run the following:
make check-manifest
If the manifest is out of date, you can run the following to update it:
make manifest
Below is a description of how to create data formatted for the yupana service.
Yupana retrieves data from the Insights platform ingress service. Yupana requires a specially formatted tar.gz file. Files that do not conform to the required format will be marked as invalid and no processing will occur. The tar.gz file must contain a metadata JSON file and one or more report slice JSON files. The file that contains metadata information is named metadata.json, while the files containing host data are named with their uniquely generated UUID4 report_slice_id followed by the .json extension. You can download sample.tar.gz to view an example.
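The archive layout described above can be sketched with Python's standard tarfile module. This is an illustration only: the function name build_report_archive is hypothetical, and placing the files at the root of the archive is an assumption, so compare against sample.tar.gz before relying on it.

```python
import io
import json
import tarfile


def build_report_archive(metadata: dict, slices: list) -> bytes:
    """Pack metadata.json plus one <report_slice_id>.json per slice into a tar.gz."""
    members = {"metadata.json": metadata}
    for report_slice in slices:
        members[report_slice["report_slice_id"] + ".json"] = report_slice

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, content in members.items():
            data = json.dumps(content).encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # tarfile needs the size before writing the member
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```

The returned bytes can be written to a file ending in .tar.gz.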
Metadata should include information about the sender of the data, Host Inventory API version, and the report slices included in the tar.gz file. Below is a sample metadata section for a report with 2 slices:
{
    "report_id": "05f373dd-e20e-4866-b2a4-9b523acfeb6d",
    "host_inventory_api_version": "1.0",
    "source": "satellite",
    "source_metadata": {
        "any_satellite_info_you_want": "some stuff that will not be validated but will be logged"
    },
    "report_slices": {
        "2dd60c11-ee5b-4ddc-8b75-d8d34de86a34": {
            "number_hosts": 1
        },
        "eb45725b-165a-44d9-ad28-c531e3a1d9ac": {
            "number_hosts": 1
        }
    }
}
An API specification of the metadata can be found in metadata.yml.
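As an illustration, metadata of this shape can be derived from the report slices themselves so that the slice IDs and host counts always agree. This is a sketch under assumptions: build_metadata is a hypothetical helper, and the "1.0" API version is simply copied from the sample above. Consult metadata.yml for the authoritative field list.

```python
import uuid
from typing import List, Optional


def build_metadata(slices: List[dict], source: str,
                   source_metadata: Optional[dict] = None) -> dict:
    """Build a metadata dict whose report_slices section matches the given slices."""
    return {
        "report_id": str(uuid.uuid4()),  # fresh UUID4 for every report
        "host_inventory_api_version": "1.0",  # value taken from the sample metadata
        "source": source,
        "source_metadata": source_metadata or {},
        "report_slices": {
            report_slice["report_slice_id"]: {"number_hosts": len(report_slice["hosts"])}
            for report_slice in slices
        },
    }
```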
Each report slice holds a portion of the host inventory data for a given report. A slice may contain at most 10K hosts; slices with more than 10K hosts are discarded with a validation error. Below is a sample report slice:
{
    "report_slice_id": "2dd60c11-ee5b-4ddc-8b75-d8d34de86a34",
    "hosts": [
        {
            "display_name": "dhcp181-3.gsslab.rdu2.redhat.com",
            "fqdn": "dhcp181-3.gsslab.rdu2.redhat.com",
            "bios_uuid": "848F1E42-51ED-8D58-9FA4-E0B433EEC7E3",
            "ip_addresses": [
                "10.10.182.241"
            ],
            "mac_addresses": [
                "00:50:56:9e:f7:d6"
            ],
            "subscription_manager_id": "848F1E42-51ED-8D58-9FA4-E0B433EEC7E3",
            "facts": [
                {
                    "namespace": "satellite",
                    "facts": {
                        "rh_product_certs": [69],
                        "rh_products_installed": [
                            "RHEL"
                        ]
                    }
                }
            ],
            "system_profile": {
                "infrastructure_type": "virtualized",
                "architecture": "x86_64",
                "os_release": "Red Hat Enterprise Linux Server release 6.9 (Santiago)",
                "os_kernel_version": "6.9 (Santiago)",
                "number_of_cpus": 1,
                "number_of_sockets": 1,
                "cores_per_socket": 1
            }
        }
    ]
}
An API specification of the report slices can be found in report_slices.yml. Yupana expects each host to be formatted according to the Insights host based inventory API spec. The host based inventory API specification includes a mandatory account field. Yupana will extract the account number from the Kafka message it receives from the Insights platform ingress service and populate the account field of each host.
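The two slice-level rules described in this section (the host limit and the account field) can be illustrated as follows. This is not Yupana's actual implementation: the function name add_account_to_hosts is hypothetical, and 10K is read here as exactly 10,000.

```python
from typing import List

MAX_HOSTS_PER_SLICE = 10000  # assumption: "10K" is interpreted as 10,000


def add_account_to_hosts(report_slice: dict, account: str) -> List[dict]:
    """Return copies of the slice's hosts with the account field populated."""
    hosts = report_slice.get("hosts", [])
    if len(hosts) > MAX_HOSTS_PER_SLICE:
        raise ValueError(
            "slice has {} hosts, limit is {}".format(len(hosts), MAX_HOSTS_PER_SLICE)
        )
    # dict(host, account=...) makes a shallow copy, leaving the input untouched.
    return [dict(host, account=account) for host in hosts]
```

Here the account value would come from the Kafka message, so senders do not need to set it themselves.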
Data being uploaded to Insights must be in tar.gz format and contain .json files with the JSON structure given above. It is important to note that Yupana processes and tracks reports based on their UUIDs, which means that data with a specific UUID cannot be uploaded more than once: a second upload with the same UUID will be archived and not processed. Therefore, to upload the same data more than once, generate a new UUID before every upload and replace the current one with it. Use the following instructions to prepare and upload a sample or custom report.
Yupana has a sample tar.gz file to showcase how to upload data to Insights. To prepare the sample data for upload, simply run:
make sample-data
This command will use the sample.tar.gz file in the Yupana repository, change the UUIDs within the metadata and each report slice, and save it as a new tar.gz file. Newly generated tar.gz files are located in the temp/ directory.
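Conceptually, the UUID replacement described above amounts to something like the sketch below. It is an assumption about the general approach, not the code behind make sample-data, and refresh_uuids is a hypothetical name.

```python
import copy
import uuid
from typing import List, Tuple


def refresh_uuids(metadata: dict, slices: List[dict]) -> Tuple[dict, List[dict]]:
    """Return copies of the metadata and slices with new report and slice UUIDs."""
    new_metadata = copy.deepcopy(metadata)
    new_metadata["report_id"] = str(uuid.uuid4())
    new_metadata["report_slices"] = {}

    new_slices = []
    for report_slice in slices:
        old_id = report_slice["report_slice_id"]
        new_id = str(uuid.uuid4())
        new_slice = copy.deepcopy(report_slice)
        new_slice["report_slice_id"] = new_id
        new_slices.append(new_slice)
        # Carry the per-slice metadata (e.g. number_hosts) over to the new ID.
        new_metadata["report_slices"][new_id] = copy.deepcopy(
            metadata["report_slices"][old_id]
        )
    return new_metadata, new_slices
```

The slice ID has to change in both the metadata and the slice itself, otherwise the two would no longer match.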
We created a make command that generates an arbitrary report with N hosts. This is useful for end-to-end testing or performance testing. To create a report, run:
make create-report hosts=500000
This command will create a tar.gz file containing N hosts (500,000 in the above example). Newly generated tar.gz files are located in the temp/ directory.
In addition to preparing a sample tar.gz file, you also have the option to prepare your own data for uploading to Insights. To prepare your custom data for upload, simply run:
make custom-data file=<path/to/your-data.tar.gz>
Replace <path/to/your-data.tar.gz> with either the absolute or relative path to the tar.gz file holding your data. This command will copy your data files into the temp/ directory, change the UUIDs, and place the files into a new tar.gz file inside the temp/ directory.
After preparing the data with new UUIDs through either of the above steps, you can upload it to Insights. Before uploading, you must export the following required information as environment variables or add it to your .env file. See .env.external.example.
RH_ACCOUNT_NUMBER=<your-account-number>
RH_ORG_ID=<your-org-id>
INGRESS_URL=<ingress-url>
RH_USERNAME=<your-username>
RH_PASSWORD=<your-password>
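A quick way to catch a missing variable before uploading is sketched below. The variable names come from the list above; missing_settings is a hypothetical helper, and it only inspects exported environment variables, not values kept solely in a .env file.

```python
import os
from typing import List, Mapping, Optional

REQUIRED_SETTINGS = (
    "RH_ACCOUNT_NUMBER",
    "RH_ORG_ID",
    "INGRESS_URL",
    "RH_USERNAME",
    "RH_PASSWORD",
)


def missing_settings(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_SETTINGS if not environ.get(name)]
```

Calling missing_settings() with no arguments checks the current process environment.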
To upload the data, run:
make upload-data file=<path/to/your-data.tar.gz>
You need to replace <path/to/your-data.tar.gz> with either the absolute or relative path to the tar.gz file that you want to upload to Insights.
After running this command, if you see HTTP 202 in your output logs, as in the following lines, your file upload to Insights was successful:
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 202
PostgreSQL is used as the database backend for Yupana. If you modified the .env file, update the docker-compose file so that the database credentials match. Several commands are available for interacting with the database.
Assuming the default .env file values are used, to access the database directly using psql run:
psql postgres -U postgres -h localhost -p 15432
To run a local gunicorn server with yupana do the following:
make server-init
gunicorn config.wsgi -c ./yupana/config/gunicorn.py --chdir=./yupana/
Please refer to Working with OpenShift.
We deploy Yupana to the Insights Dev & Production Clusters (subscriptions-ci, subscriptions-qa, subscriptions-stage, subscriptions-prod) via the deployment pipeline defined by the e2e-deploy repo.
We use a stable branch to release our code to production. You can complete the release process using the following steps:
- Submit a pull request (PR) with the changes that you want to merge from master into the stable branch. In the PR description, create a draft of the release notes. Once the release notes and changes have been approved, and a smoke test has passed, merge the PR. Be sure not to squash commits in order to preserve the history (this may require changing the settings of the repo to allow merge commits).
- Create a release based off of the stable branch. Copy the release notes from your PR description and also record the commit number at the top of the release notes.
- Submit a pull request to the e2e-deploy repository updating the BUILD_VERSION for the CI, QA, and PROD environments. The BUILD_VERSION for CI and QA should always be the BUILD_VERSION for PROD plus 0.0.1. For example, if PROD is 0.2.0, CI and QA should be 0.2.1.
- Once the PR to update the versions has been reviewed and merged, manually kick off a Jenkins deploy job for the subscriptions service set to the production environment.
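The BUILD_VERSION rule from the third step is simple arithmetic on the last version component. A small sketch, assuming versions always have three dot-separated numeric parts and that the patch number simply increments without carrying (next_patch_version is a hypothetical helper):

```python
def next_patch_version(prod_version: str) -> str:
    """Return the CI/QA version for a given PROD version (PROD plus 0.0.1)."""
    # Unpacking raises ValueError unless there are exactly three numeric parts.
    major, minor, patch = (int(part) for part in prod_version.split("."))
    return "{}.{}.{}".format(major, minor, patch + 1)


print(next_patch_version("0.2.0"))  # prints 0.2.1
```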