Commit

Merge ci_prod to insights_metrics (#1247)
* fix version in Geneva config xml (#1227)

* fix bugs (#1230)

* fix bugs

* fix comment

* update dcr optimization error messages (#1228)

* update dcr optimization error messages

* add additional check for geneva

* redirect dcr parser stderr and stdout to traces file

---------

Co-authored-by: Amol Agrawal <amagraw@microsoft.com>

* update fluent-bit to 2.2.2 in linux (#1229)

* update fluent-bit to 2.2.2 in linux

---------

Co-authored-by: Amol Agrawal <amagraw@microsoft.com>

* update charts, yaml and release notes for 3.1.20 (#1234)

Co-authored-by: Amol Agrawal <amagraw@microsoft.com>

* Geneva -send windows container inventory and perf with RS (#1233)

* Update the geneva feature flag for RS

---------

Co-authored-by: Janvi Jatakia (from Dev Box) <jajataki@microsoft.com>

* Add scan tools to the build pipeline (#1237)

* Add the missing tools to the build pipeline

* update policheck similar to prom metrics

* update binskim

* update trivyignore

* add policheck in windows section

---------

Co-authored-by: Janvi Jatakia (from Dev Box) <jajataki@microsoft.com>

* streamline input plugin code. (#1238)

* streamline input plugin code

---------

Co-authored-by: Amol Agrawal <amagraw@microsoft.com>

* Telemetry optimization: adding addon token adapter traces as metrics (#1231)

* Add token adapter traces as metrics

* update trivyignore

* updating name of mdsd function

* Updating the addon token adapter to discard unnecessary logs

* Update trivyignore

---------

Co-authored-by: Janvi Jatakia (from Dev Box) <jajataki@microsoft.com>

* Update ai instrumentation key for USNAT/USSEC (#1239)

* update ai instrumentation key

* address comments

* resolve comments

* syntax error

---------

Co-authored-by: Janvi Jatakia (from Dev Box) <jajataki@microsoft.com>

* Gangams/logs 50k eps per node (#1235)

* mdsd version 50k changes

* amacore agent integration

* update liveness probe

* handle non-existent file

* refactor code

* fix bugs in mdsd install

* add poll to check amaca port up and running

* fix bug

* configure amaca configport

* try released mdsd version 1.30.3

* fix bug in logs and events profile

* test latest version of mdsd in GIG mode for both arm and x64

* try with build 50k eps changes

* update templates for high log scale mode

* remove libc.so copying

* revert logrotate conf for amaca log

* update mdsd version which has crash fix

* add proxy support for amacore agent

* update mdsd build with amaca gig la changes

* update mdsd build with gig la fixes

* update windows ama build

* mdsd version with 25k buffer size in mdsd

* update mdsd build

* add telemetry and configmap option

* fix bugs

* windows ama build with resource id bug fix

* update mdsd version with qos fixes

* update to use working templates

* add frequency to control amaca log

* mdsd build with qos updates

* trivy ignore update

* log amaca agent version

* improve comments

* add default fluent-bit config for high log scale

* add threading on tail plugin when high log scale enabled

* fix bugs

* fix bug

* fix bugs

* some improvements

* improve comments

* improve code

* update trivyignore

* fix bug

* update trivyignore

* pick GIGLA stream from config when highlogscale enabled

* fix bug

* template updates for high log scale mode

* fix bug

* clean up

* set envvar for ishighlogscale

* set envvar for ishighlogscale

* fix bug

* add log message to troubleshoot duplicate logs

* add log message to troubleshoot duplicate logs

* handle ama bug until fixed

* add storage total limit size

* rename for better reading

* fix pr feedback

* fix pr feedback

* fix pr feedback

* mdsd version update

* fix proxy bug

* fix proxy bug

* update trivy ignore

* clean up the code

* refactor code

* increase storage limit size to 2GB

* increase storage limit size to 10GB

* official mdsd and windows ama versions

* code cleanup

* code cleanup

* mdsd version annotation update

* fix pr feedback

* fix pr feedback

* fix pr feedback

* fix pr feedback

---------

Co-authored-by: Ganga Mahesh Siddem <gangams@microsoft.com>
Co-authored-by: Amol Agrawal <pfrcks@gmail.com>
Co-authored-by: Amol Agrawal <amagraw@microsoft.com>
Co-authored-by: Janvi Jatakia (from Dev Box) <jajataki@microsoft.com>
5 people committed May 10, 2024
1 parent 47449fa commit fc932b9
Showing 54 changed files with 1,142 additions and 3,938 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/run_unit_tests.yml
@@ -13,6 +13,10 @@ jobs:
Golang-Tests:
runs-on: ubuntu-latest
steps:
- name: Setup Go 1.19.x
uses: actions/setup-go@v4
with:
go-version: '1.19.x'
- name: Check out repository code
uses: actions/checkout@v2
- name: Run unit tests
45 changes: 44 additions & 1 deletion .pipelines/azure_pipeline_mergedbranches.yaml
@@ -81,6 +81,9 @@ jobs:
echo "##vso[task.setvariable variable=windowsAMAUrl;isOutput=true]$windowsAMAUrl"
name: setup
- task: CredScan@3
displayName: "SDL : Run credscan"

- task: CopyFiles@2
displayName: "Copy ev2 deployment artifacts"
inputs:
@@ -126,6 +129,12 @@ jobs:
pathToPublish: '$(Build.ArtifactStagingDirectory)'
artifactName: drop

- task: Armory@2
displayName: 'Run ARMory'
inputs:
toolVersion: Latest
targetDirectory: '$(Build.SourcesDirectory)'

- job: build_linux
timeoutInMinutes: 120
dependsOn: common
@@ -454,6 +463,19 @@ jobs:
MaxRetryAttempts: '5'
displayName: 'EsrpCodeSigning for OSS'

- task: BinSkim@4
displayName: 'SDL: run binskim'
inputs:
InputType: 'CommandLine'
arguments: 'analyze --rich-return-code $(Build.ArtifactStagingDirectory)\ossSigning\out_oms.so $(Build.ArtifactStagingDirectory)\ossSigning\perf.so $(Build.ArtifactStagingDirectory)\ossSigning\containerinventory.so $(Build.ArtifactStagingDirectory)\fpSigning\livenessprobe.exe $(Build.ArtifactStagingDirectory)\fpSigning\CertificateGenerator.exe $(Build.ArtifactStagingDirectory)\fpSigning\CertificateGenerator.dll'
retryCountOnTaskFailure: 1

- task: PoliCheck@2
displayName: "SDL : Run PoliCheck"
inputs:
targetType: 'F'
targetArgument: '$(Build.SourcesDirectory)'

- task: PowerShell@2
displayName: Replace files in origin Image
inputs:
@@ -664,6 +686,19 @@ jobs:
MaxRetryAttempts: '5'
displayName: 'EsrpCodeSigning for OSS'

- task: BinSkim@4
displayName: 'SDL: run binskim'
inputs:
InputType: 'CommandLine'
arguments: 'analyze --rich-return-code $(Build.ArtifactStagingDirectory)\ossSigning\out_oms.so $(Build.ArtifactStagingDirectory)\ossSigning\perf.so $(Build.ArtifactStagingDirectory)\ossSigning\containerinventory.so $(Build.ArtifactStagingDirectory)\fpSigning\livenessprobe.exe $(Build.ArtifactStagingDirectory)\fpSigning\CertificateGenerator.exe $(Build.ArtifactStagingDirectory)\fpSigning\CertificateGenerator.dll'
retryCountOnTaskFailure: 1

- task: PoliCheck@2
displayName: "SDL : Run PoliCheck"
inputs:
targetType: 'F'
targetArgument: '$(Build.SourcesDirectory)'

- task: PowerShell@2
displayName: Replace files in origin Image
inputs:
@@ -805,4 +840,12 @@ jobs:
- task: PublishBuildArtifacts@1
inputs:
pathToPublish: '$(Build.ArtifactStagingDirectory)'
artifactName: drop
artifactName: drop

- task: AntiMalware@4
displayName: 'Run MpCmdRun.exe'
inputs:
InputType: Basic
ScanType: CustomScan
FileDirPath: '$(Build.ArtifactStagingDirectory)'
DisableRemediation: false
19 changes: 18 additions & 1 deletion .trivyignore
@@ -13,13 +13,15 @@ CVE-2024-27304
GHSA-7jwh-3vrq-q3m8
CVE-2024-24786
CVE-2024-24557
CVE-2023-45288

#telegraf HIGH
GHSA-m425-mq94-257g
CVE-2023-46129
CVE-2023-47090
CVE-2024-21626
CVE-2023-50658
CVE-2024-3154

# ruby HIGH
CVE-2017-10784
@@ -32,4 +34,19 @@ CVE-2023-5678

#golang MEDIUM
CVE-2023-48795
CVE-2024-24786
CVE-2024-24786
CVE-2023-45288

#stdlib
CVE-2023-45283
CVE-2023-29406
CVE-2023-29409
CVE-2023-39318
CVE-2023-39319
CVE-2023-39326
CVE-2023-45284
CVE-2023-45289
CVE-2023-45290
CVE-2024-24783
CVE-2024-24784
CVE-2024-24785
2 changes: 1 addition & 1 deletion Documentation/Internal/ContainerLogV2-Linux.xml
@@ -1,5 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<MonitoringManagement version="1.1" namespace="<NamespaceForLinuxContainers>" eventVersion="1" timestamp="2016-01-20T00:00:00.000">
<MonitoringManagement version="1.0" namespace="<NamespaceForLinuxContainers>" eventVersion="1" timestamp="2016-01-20T00:00:00.000">
<Accounts>
<Account moniker="<GenevaLogsAccountMoniker>" isDefault="true" />
</Accounts>
29 changes: 29 additions & 0 deletions ReleaseNotes.md
@@ -8,6 +8,35 @@ information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeo
additional questions or comments.

## Release History
### 04/22/2024 -
##### Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:3.1.20 (linux)
##### Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-3.1.20 (windows)
- Linux
- [CBL-Mariner 2.0.20240403](https://github.com/microsoft/azurelinux/releases/tag/2.0.20240403-2.0)
- Golang - 1.20.5
- Ruby - 3.1.3
- MDSD - 1.29.7
- Telegraf - 1.28.5
- Fluent-bit - 2.2.2
- Fluentd - 1.16.3
- Windows
- Ruby - 3.1.1
- Fluent-bit - 2.0.14
- Telegraf - 1.24.2
- Fluentd - 1.16.3
- Windows AMA - 46.9.43
- Golang - 1.20.5
##### Code change log
## What's Changed
- Common
* Containerlogv2 Kubernetes Metadata Grafana Dashboard Private Preview by @wanlonghenry in https://github.com/microsoft/Docker-Provider/pull/1218
* fix bugs by @ganga1980 in https://github.com/microsoft/Docker-Provider/pull/1230
* reduce podsChunkSizeMin (#1225) by @pfrcks in https://github.com/microsoft/Docker-Provider/pull/1226
* fix version in Geneva config xml by @ganga1980 in https://github.com/microsoft/Docker-Provider/pull/1227
- Linux
* update dcr optimization error messages by @pfrcks in https://github.com/microsoft/Docker-Provider/pull/1228
* update fluent-bit to 2.2.2 in linux by @pfrcks in https://github.com/microsoft/Docker-Provider/pull/1229

### 03/29/2024 -
##### Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:3.1.19 (linux)
##### Version mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-3.1.19 (windows)
70 changes: 69 additions & 1 deletion build/common/installer/scripts/fluent-bit-conf-customizer.rb
@@ -25,11 +25,23 @@

@default_service_interval = "15"
@default_mem_buf_limit = "10"
@default_high_log_scale_service_interval = "1"
@default_high_log_scale_max_storage_chunks_up = "500" # Each chunk size is ~2MB
@default_high_log_scale_max_storage_type = "filesystem" # filesystem = memory + filesystem in fluent-bit
@default_high_log_scale_max_storage_total_limit_size = "10G"

def is_number?(value)
true if Integer(value) rescue false
end

def is_high_log_scale_mode?
isHighLogScaleMode = false
if !ENV["IS_HIGH_LOG_SCALE_MODE"].nil? && !ENV["IS_HIGH_LOG_SCALE_MODE"].empty? && ENV["IS_HIGH_LOG_SCALE_MODE"].to_s.downcase == "true"
isHighLogScaleMode = true
end
return isHighLogScaleMode
end

def substituteMultiline(multilineLogging, stacktraceLanguages, new_contents)
if !multilineLogging.nil? && multilineLogging.to_s.downcase == "true"
if !stacktraceLanguages.nil? && !stacktraceLanguages.empty?
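
The `is_high_log_scale_mode?` check added in the hunk above treats the `IS_HIGH_LOG_SCALE_MODE` variable as enabled only when it is set, non-empty, and equal to "true" ignoring case. A minimal sketch of the same idea is shown below. The helper name `flag_enabled?` and the hash-based `env` argument are assumptions made for illustration and are not part of this commit.

```ruby
# Hypothetical helper (not in the commit): reads a boolean feature flag
# from an ENV-like hash. nil.to_s is "", so unset and empty values are
# both treated as disabled, matching the nil/empty checks in the diff.
def flag_enabled?(env, name)
  env[name].to_s.downcase == "true"
end

puts flag_enabled?({ "IS_HIGH_LOG_SCALE_MODE" => "True" }, "IS_HIGH_LOG_SCALE_MODE")
puts flag_enabled?({ "IS_HIGH_LOG_SCALE_MODE" => "" }, "IS_HIGH_LOG_SCALE_MODE")
puts flag_enabled?({}, "IS_HIGH_LOG_SCALE_MODE")
```

Passing the environment in as a hash keeps the sketch testable without mutating the real `ENV`; calling it as `flag_enabled?(ENV, "IS_HIGH_LOG_SCALE_MODE")` should behave the same way for string values.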
@@ -48,6 +60,15 @@ def substituteMultiline(multilineLogging, stacktraceLanguages, new_contents)
return new_contents
end

def substituteStorageTotalLimitSize(new_contents)
if is_high_log_scale_mode?
new_contents = new_contents.gsub("#${AZMON_STORAGE_TOTAL_LIMIT_SIZE_MB}", "storage.total_limit_size " + @default_high_log_scale_max_storage_total_limit_size)
else
new_contents = new_contents.gsub("\n #${AZMON_STORAGE_TOTAL_LIMIT_SIZE_MB}\n", "\n")
end
return new_contents
end

def substituteResourceOptimization(resourceOptimizationEnabled, new_contents)
#Update the config file only in two conditions: 1. Linux and resource optimization is enabled 2. Windows and using aad msi auth and not using geneva logs integration
if (!@isWindows && !resourceOptimizationEnabled.nil? && resourceOptimizationEnabled.to_s.downcase == "true") || (@isWindows && @using_aad_msi_auth && !@geneva_logs_integration)
@@ -63,6 +84,40 @@ def substituteResourceOptimization(resourceOptimizationEnabled, new_contents)
return new_contents
end

def substituteHighLogScaleConfig(enableFbitThreading, storageType, storageMaxChunksUp, new_contents)
begin
if is_high_log_scale_mode? || (!enableFbitThreading.nil? && !enableFbitThreading.empty? && enableFbitThreading.to_s.downcase == "true" )
new_contents = new_contents.gsub("#${AZMON_TAIL_THREADED}", "threaded on")
puts "using threaded on for tail plugin"
else
new_contents = new_contents.gsub("\n #${AZMON_TAIL_THREADED}\n", "\n")
end

if is_high_log_scale_mode?
new_contents = new_contents.gsub("#${AZMON_STORAGE_TYPE}", "storage.type " + @default_high_log_scale_max_storage_type)
puts "using storage.type: #{@default_high_log_scale_max_storage_type} for tail plugin"
elsif !storageType.nil? && !storageType.empty?
new_contents = new_contents.gsub("#${AZMON_STORAGE_TYPE}", "storage.type " + storageType)
puts "using storage.type: #{storageType} for tail plugin"
else
new_contents = new_contents.gsub("\n #${AZMON_STORAGE_TYPE}\n", "\n")
end

if is_high_log_scale_mode?
new_contents = new_contents.gsub("#${AZMON_MAX_STORAGE_CHUNKS_UP}", "storage.max_chunks_up " + @default_high_log_scale_max_storage_chunks_up)
puts "using storage.max_chunks_up: #{@default_high_log_scale_max_storage_chunks_up} for tail plugin"
elsif !storageMaxChunksUp.nil? && !storageMaxChunksUp.empty?
new_contents = new_contents.gsub("#${AZMON_MAX_STORAGE_CHUNKS_UP}", "storage.max_chunks_up " + storageMaxChunksUp)
puts "using storage.max_chunks_up: #{storageMaxChunksUp} for tail plugin"
else
new_contents = new_contents.gsub("\n #${AZMON_MAX_STORAGE_CHUNKS_UP}\n", "\n")
end
rescue => err
puts "config::substituteHighLogScaleConfig failed with an error: #{err}"
end
return new_contents
end

def substituteFluentBitPlaceHolders
begin
# Replace the fluentbit config file with custom values if present
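
`substituteHighLogScaleConfig` in the hunk above follows one pattern for each setting: a commented placeholder such as `#${AZMON_TAIL_THREADED}` is either replaced with a real fluent-bit setting or removed together with its line. The sketch below illustrates that pattern under stated assumptions. The helper `apply_placeholder` and the `TEMPLATE` contents are made up for illustration, and the line removal uses a regular expression rather than the exact-string `gsub` used in the commit.

```ruby
# Illustrative template only; the real fluent-bit config files are not
# shown in this diff. The single-quoted heredoc keeps "#$" literal.
TEMPLATE = <<~'CONF'
  [INPUT]
      Name tail
      #${AZMON_TAIL_THREADED}
      Path /var/log/containers/*.log
CONF

PLACEHOLDER = '#${AZMON_TAIL_THREADED}'

# Hypothetical helper: swap the placeholder for a setting, or drop the
# whole placeholder line (indentation included) when no setting is given.
def apply_placeholder(contents, placeholder, setting)
  if setting.nil? || setting.empty?
    contents.gsub(/^[ \t]*#{Regexp.escape(placeholder)}\n/, "")
  else
    contents.gsub(placeholder, setting)
  end
end

puts apply_placeholder(TEMPLATE, PLACEHOLDER, "threaded on")
puts apply_placeholder(TEMPLATE, PLACEHOLDER, nil)
```

Removing the line with an anchored pattern avoids depending on the exact indentation in front of the placeholder, which the exact-string match in the commit does depend on.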
@@ -79,8 +134,18 @@ def substituteFluentBitPlaceHolders
windowsFluentBitDisabled = ENV["AZMON_WINDOWS_FLUENT_BIT_DISABLED"]
kubernetesMetadataCollection = ENV["AZMON_KUBERNETES_METADATA_ENABLED"]
annotationBasedLogFiltering = ENV["AZMON_ANNOTATION_BASED_LOG_FILTERING"]
storageMaxChunksUp = ENV["FBIT_STORAGE_MAX_CHUNKS_UP"]
storageType = ENV["FBIT_STORAGE_TYPE"]
enableFbitThreading = ENV["ENABLE_FBIT_THREADING"]


serviceInterval = (!interval.nil? && is_number?(interval) && interval.to_i > 0) ? interval : @default_service_interval
serviceInterval = @default_service_interval
if is_high_log_scale_mode?
serviceInterval = @default_high_log_scale_service_interval
puts " using Flush interval: #{serviceInterval}"
elsif (!interval.nil? && is_number?(interval) && interval.to_i > 0)
serviceInterval = interval
end
serviceIntervalSetting = "Flush " + serviceInterval

tailBufferChunkSize = (!bufferChunkSize.nil? && is_number?(bufferChunkSize) && bufferChunkSize.to_i > 0) ? bufferChunkSize : nil
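
The hunk above changes how the `Flush` interval is chosen: high log scale mode takes precedence, then a valid positive configured interval, then the default of "15". A minimal sketch of that precedence follows. The function and constant names are assumptions for illustration; only the values "15" and "1" come from the diff.

```ruby
DEFAULT_INTERVAL = "15"        # mirrors @default_service_interval
HIGH_LOG_SCALE_INTERVAL = "1"  # mirrors @default_high_log_scale_service_interval

# Hypothetical helper: true only for values that parse as an integer > 0.
def positive_integer?(value)
  Integer(value) > 0
rescue ArgumentError, TypeError
  false
end

# Hypothetical helper: high log scale wins, then a valid configured
# interval, then the default.
def flush_interval(high_log_scale, configured)
  return HIGH_LOG_SCALE_INTERVAL if high_log_scale
  return configured if positive_integer?(configured)
  DEFAULT_INTERVAL
end

puts "Flush " + flush_interval(true, "30")
puts "Flush " + flush_interval(false, "30")
puts "Flush " + flush_interval(false, "abc")
```

Early returns keep the precedence order readable top to bottom, which is the same order the `if`/`elsif` chain in the commit expresses.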
@@ -115,6 +180,8 @@ def substituteFluentBitPlaceHolders
new_contents = new_contents.gsub("\n ${TAIL_IGNORE_OLDER}\n", "\n")
end

new_contents = substituteHighLogScaleConfig(enableFbitThreading, storageType, storageMaxChunksUp, new_contents)

if !kubernetesMetadataCollection.nil? && kubernetesMetadataCollection.to_s.downcase == "true"
new_contents = new_contents.gsub("#${KubernetesFilterEnabled}", "")
end
@@ -135,6 +202,7 @@ def substituteFluentBitPlaceHolders

puts "config::Starting to substitute the placeholders in fluent-bit-common.conf file for log collection"
text = File.read(@fluent_bit_common_conf_path)
text = substituteStorageTotalLimitSize(text)
new_contents = substituteMultiline(multilineLogging, stacktraceLanguages, text)
File.open(@fluent_bit_common_conf_path, "w") { |file| file.puts new_contents }
puts "config::Successfully substituted the placeholders in fluent-bit-common.conf file"