Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate obs-infraobs-integrations to package-spec v3 #1 #8170

Merged

Conversation

shmsr
Copy link
Member

@shmsr shmsr commented Oct 12, 2023

Migrate a batch of packages from v1/v2 to v3 package-spec. Used the following scripts and some manual changes to generate the changes in this PR:

yq_hack.zsh

#!/bin/bash

set -e
set -u
set -o pipefail
set -x

pushd $1

find data_stream -type f -path "*fields*" -name "*.yml" -exec yq e -i "del .[].release" {} \;
find data_stream -type f -path "*fields*" -name "ecs.yml" -exec yq e -i "unique_by(.name)" {} \;
find data_stream -type f -path "*fields*" -name "*.yml" -exec yq e -i "del .[].fields.[].required" {} \;

yq e -i 'del .release' manifest.yml
yq e -i "with(select(.license != null); .conditions.elastic.subscription = .license) | del .license" manifest.yml

popd

git add -u $1/
git commit -m "[$1]: migration with yq"

ecs-update.zsh

#!/bin/bash

set -u
set -o pipefail
set -x

echo "-- executing ecs-update for $1 --"

ecs-update -format-version=3.0.0 -fix-dotted-yaml-keys -add-owner-type -owner elastic/obs-infraobs-integrations -v -skip-format=true $1

if (( $? )) 
then
    echo "-- write validation for $1 --"
    pushd $1

    go run ../scripting/write_validation.go

    popd

    # Try again?
    git checkout .
    ecs-update -format-version=3.0.0 -fix-dotted-yaml-keys -add-owner-type -owner elastic/obs-infraobs-integrations -v -skip-format=true $1
fi

And another Go program to write the validation.yml by parsing the elastic-package lint errors.

Then I hooked this scripts to run with:

$ pkgs=($(cat .github/CODEOWNERS | grep '@elastic/obs-infraobs-integrations' | cut -d ' ' -f 1 | cut -d '/' -f 3 | sort | uniq))

$ for ((idx=1; idx<=15; idx++))
do
scripting/yq_hack.zsh $pkgs[idx]
done
$ for ((idx=1; idx<=15; idx++))
do
scripting/ecs_update.zsh $pkgs[idx] || break
done

Another variation of the above script:

$ for ((idx=1; idx<=15; idx++))
do
case $pkgs[idx] in
  "citrix_adc" | "azure_app_service" | "apache") echo "-- skip $pkgs[idx] --";;
  "coredns" | "couchbase" | "etcd" | "cockroachdb" | "ceph" | "azure_functions" | "activemq" | "apache_spark" | "apache_tomcat" | "cassandra" | "couchdb" | "airflow") echo "-- done $pkgs[idx] --";;
  *) scripting/ecs_update.zsh $pkgs[idx] || break;;
esac
done

As I am creating a batch of 15 packages per migration, I am using a for loop. In case the migration is not possible for some packages, I put it in the first case to skip them.

Proposed commit message

Migrate packages owned by obs-infraobs-integrations but few couldn't be migrated. Marking them as follows:

  • activemq
  • airflow
  • apache
  • apache_spark
  • apache_tomcat
  • azure_app_service
  • azure_functions
  • cassandra
  • ceph
  • citrix_adc
  • cockroachdb
  • coredns
  • couchbase
  • couchdb
  • etcd

Please see the comment #8170 (comment) in this PR itself to understand why some packages couldn't be migrated and also learn about the challenges faced when migrating.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

@shmsr shmsr requested a review from a team as a code owner October 12, 2023 06:25
@shmsr shmsr self-assigned this Oct 12, 2023
@shmsr
Copy link
Member Author

shmsr commented Oct 12, 2023

Here are some rough notes I recorded while migrating. I have fixed a few and accordingly improved my script. But there are 3 packages that couldn't be migrated.

  • airflow

    • Problem with duplicate field like container.name, container.image.name in ecs and agent.yml
  • apache

    • elastic: subscription already present (improve script)
    • duplicate field tags in error datastream
    • file "/Users/subhamsarkar/go/src/github.com/elastic/integrations/packages/apache/kibana/visualization/apache-22057f20-3a12-11eb-8946-296aab7b13db.json" is invalid: found legacy visualization "Uptime [Metrics Apache]" (metric, TSVB)
  • azure_app_service

    • Error: building package failed: resolving external fields failed: can't resolve fields: field geo.continent_name cannot be reused at top level (see: #1)
  • ceph

    • file "/Users/subhamsarkar/go/src/github.com/elastic/integrations/packages/ceph/data_stream/cluster_status/fields/fields.yml" is invalid: field 0.fields.0.fields.5.fields.5: object_type is required
      • Fixed by adding object_type: keyword (see: #1)
  • citrix_adc

    • Error: building package failed: resolving external fields failed: can't resolve fields: field interface.id cannot be reused at top level
  • cockroachdb

    • file "/Users/subhamsarkar/go/src/github.com/elastic/integrations/build/packages/cockroachdb-1.6.0.zip/data_stream/status/manifest.yml" is invalid: field elasticsearch.index_template.settings: Additional property index.mapping.dimension_fields.limit is not allowed
  • couchbase

    • file "/Users/subhamsarkar/go/src/github.com/elastic/integrations/build/packages/couchbase-1.3.1.zip/data_stream/query_index/fields/fields.yml" is invalid: field 0.fields.1.fields.0: Additional property metrics_type is not allowed
      • Will add to my script
  • cassandra

    • Normalize fields to make elastic-package test -v --report-format xUnit happy. (see: #1)
  • apache_spark

    • Normalize fields to make elastic-package test -v --report-format xUnit happy. (see: #1)

@elasticmachine
Copy link

elasticmachine commented Oct 12, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-10-17T16:50:08.864+0000

  • Duration: 27 min 16 sec

Test stats 🧪

Test Results
Failed 0
Passed 176
Skipped 0
Total 176

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine
Copy link

elasticmachine commented Oct 12, 2023

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (23/23) 💚
Files 100.0% (25/25) 💚 4.545
Classes 100.0% (25/25) 💚 4.545
Methods 93.2% (233/250) 👍 11.662
Lines 95.781% (2225/2323) 👍 6.256
Conditionals 100.0% (0/0) 💚

The format_version in the package manifest changed from 1.0.0 to 3.0.0. Removed
dotted YAML keys from package manifest. Added 'owner.type: elastic' to package
manifest.

[git-generate]
go run github.com/andrewkroh/go-examples/ecs-update@latest -v -format-version=3.0.0 -skip-format -fix-dotted-yaml-keys -add-owner-type packages/activemq
The format_version in the package manifest changed from 1.0.0 to 3.0.0. Removed
dotted YAML keys from package manifest. Added 'owner.type: elastic' to package
manifest.

[git-generate]
go run github.com/andrewkroh/go-examples/ecs-update@latest -v -format-version=3.0.0 -skip-format -fix-dotted-yaml-keys -add-owner-type packages/apache_spark
The format_version in the package manifest changed from 2.3.0 to 3.0.0. Removed
dotted YAML keys from package manifest. Added 'owner.type: elastic' to package
manifest.

[git-generate]
go run github.com/andrewkroh/go-examples/ecs-update@latest -v -format-version=3.0.0 -skip-format -fix-dotted-yaml-keys -add-owner-type packages/apache_tomcat
The format_version in the package manifest changed from 1.0.0 to 3.0.0. Removed
dotted YAML keys from package manifest. Added 'owner.type: elastic' to package
manifest.

[git-generate]
go run github.com/andrewkroh/go-examples/ecs-update@latest -v -format-version=3.0.0 -skip-format -fix-dotted-yaml-keys -add-owner-type packages/cassandra
The format_version in the package manifest changed from 1.0.0 to 3.0.0. Removed
dotted YAML keys from package manifest. Added 'owner.type: elastic' to package
manifest.

[git-generate]
go run github.com/andrewkroh/go-examples/ecs-update@latest -v -format-version=3.0.0 -skip-format -fix-dotted-yaml-keys -add-owner-type packages/couchdb
The format_version in the package manifest changed from 1.0.0 to 3.0.0. Removed
dotted YAML keys from package manifest. Added 'owner.type: elastic' to package
manifest.

[git-generate]
go run github.com/andrewkroh/go-examples/ecs-update@latest -v -format-version=3.0.0 -skip-format -fix-dotted-yaml-keys -add-owner-type packages/airflow
The format_version in the package manifest changed from 2.5.1 to 3.0.0. Removed
dotted YAML keys from package manifest. Added 'owner.type: elastic' to package
manifest.

[git-generate]
go run github.com/andrewkroh/go-examples/ecs-update@latest -v -format-version=3.0.0 -skip-format -fix-dotted-yaml-keys -add-owner-type packages/azure_functions
The format_version in the package manifest changed from 2.0.0 to 3.0.0. Removed
dotted YAML keys from package manifest. Added 'owner.type: elastic' to package
manifest.

[git-generate]
go run github.com/andrewkroh/go-examples/ecs-update@latest -v -format-version=3.0.0 -skip-format -fix-dotted-yaml-keys -add-owner-type packages/ceph
The format_version in the package manifest changed from 1.0.0 to 3.0.0. Removed
dotted YAML keys from package manifest. Added 'owner.type: elastic' to package
manifest.

[git-generate]
go run github.com/andrewkroh/go-examples/ecs-update@latest -v -format-version=3.0.0 -skip-format -fix-dotted-yaml-keys -add-owner-type packages/cockroachdb
The format_version in the package manifest changed from 2.0.0 to 3.0.0. Removed
dotted YAML keys from package manifest. Added 'owner.type: elastic' to package
manifest.

[git-generate]
go run github.com/andrewkroh/go-examples/ecs-update@latest -v -format-version=3.0.0 -skip-format -fix-dotted-yaml-keys -add-owner-type packages/coredns
The format_version in the package manifest changed from 1.0.0 to 3.0.0. Removed
dotted YAML keys from package manifest. Added 'owner.type: elastic' to package
manifest.

[git-generate]
go run github.com/andrewkroh/go-examples/ecs-update@latest -v -format-version=3.0.0 -skip-format -fix-dotted-yaml-keys -add-owner-type packages/couchbase
The format_version in the package manifest changed from 1.0.0 to 3.0.0. Removed
dotted YAML keys from package manifest. Added 'owner.type: elastic' to package
manifest.

[git-generate]
go run github.com/andrewkroh/go-examples/ecs-update@latest -v -format-version=3.0.0 -skip-format -fix-dotted-yaml-keys -add-owner-type packages/etcd
Copy link
Collaborator

@lalit-satapathy lalit-satapathy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have consistent formatting for these event.type changes quotes with and without quotes.

@@ -57,33 +57,6 @@
- name: image.id
type: keyword
description: Image ID for the cloud instance.
- name: container
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were these mappings redundant and why these got removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. If you check packages/cockroachdb/data_stream/status/fields/ecs.yml you'd find the same fields.

@shmsr
Copy link
Member Author

shmsr commented Oct 17, 2023

Can we have consistent formatting for these event.type changes quotes with and without quotes.

Done

Copy link
Contributor

@tommyers-elastic tommyers-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all LGTM - just one outstanding change (nested) and this looks ready to merge. thanks!!

@shmsr
Copy link
Member Author

shmsr commented Oct 18, 2023

@lalit-satapathy / @ishleenk17 PR requires an approval. If there are any further review comments, please let me know. I'll resolve them and merge this PR.

fields:
- name: count
type: long
description: Total number of Placement Groups (pgs) in cluster.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was happeing to these fields before the individual mapping ?
What did the mapping look like ?

Also, under this group, are these the only sub fields that are there since we didn't have that clarity when type was object ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before, the mapping just looks like this:

"state": {
  "type": "object"
},

which does nothing at all. the fields are mapped using default types, which are long and keyword respectively. so there's no behviour change here. it's just more explicit and intentional.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the object is well defined in beats. i'm not sure why we never expclicity mapped them before.

type PgState struct {
	Count     int64  `json:"count"`
	StateName string `json:"state_name"`
}

@@ -14,5 +14,7 @@
external: ecs
- name: container.image.name
external: ecs
- name: container.labels
external: ecs
- name: host
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tommyers-elastic , @andrewkroh - The host group ECS field reference is included in some integrations and some integrations doesn't include the ECS reference. As the host field is common, Is it worth adding validations for this field as well? Just a thought.
e.g. Integration not having host group field reference.

@tommyers-elastic tommyers-elastic merged commit d63214f into elastic:main Oct 18, 2023
4 checks passed
@elasticmachine
Copy link

Package activemq - 0.14.0 containing this change is available at https://epr.elastic.co/search?package=activemq

@elasticmachine
Copy link

Package airflow - 0.5.0 containing this change is available at https://epr.elastic.co/search?package=airflow

@elasticmachine
Copy link

Package apache_spark - 0.8.0 containing this change is available at https://epr.elastic.co/search?package=apache_spark

@elasticmachine
Copy link

Package azure_functions - 0.2.0 containing this change is available at https://epr.elastic.co/search?package=azure_functions

@elasticmachine
Copy link

Package cassandra - 1.10.0 containing this change is available at https://epr.elastic.co/search?package=cassandra

@elasticmachine
Copy link

Package ceph - 1.1.0 containing this change is available at https://epr.elastic.co/search?package=ceph

@elasticmachine
Copy link

Package cockroachdb - 1.7.0 containing this change is available at https://epr.elastic.co/search?package=cockroachdb

@elasticmachine
Copy link

Package coredns - 0.6.0 containing this change is available at https://epr.elastic.co/search?package=coredns

@elasticmachine
Copy link

Package couchbase - 1.4.0 containing this change is available at https://epr.elastic.co/search?package=couchbase

@elasticmachine
Copy link

Package couchdb - 1.1.0 containing this change is available at https://epr.elastic.co/search?package=couchdb

@elasticmachine
Copy link

Package etcd - 0.6.0 containing this change is available at https://epr.elastic.co/search?package=etcd

@shmsr shmsr deleted the migrate-infraobs-ps-v3-batch-1-serverless branch October 30, 2023 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants