Skip to content

Latest commit

 

History

History
257 lines (227 loc) · 9.33 KB

File metadata and controls

257 lines (227 loc) · 9.33 KB

Monitoring

Azure Monitor

  • Centralized ways of getting insights from application to infrastructure
  • You can diagnose, trace and debug issues
  • Uses ML to detect anomalies and reveal hidden patterns
  • Track how customers interact with the application
  • Components
    • • Alerts • Metrics • Action groups • Monitoring & reporting • Dashboard • Logs

High level view

  • Collects data from
    • • Application • Operating system • Resources • Subscription • Tenant
  • Populates stores
    • Metrics & logs
  • Perform functions:
    • Insights: • Application • Container • VM • Monitoring solutions
    • Visualize: • Dashboards • Views • Power BI • Workbooks
    • Analyze: • Metrics Explorer • Log Analytics
    • Respond: • Alerts • Autoscale
    • Integrate: • Event Hubs • Logic Apps • Ingest & Export APIs

Alerts

  • Notifies when important conditions are found in the monitoring data
  • Flow of alerts
    • Alert Rule
      • Target Resource (Signal) → Criteria (Logic Test)
      • Action Group (Actions to do)
      • Monitor condition (Alert State)
  • Alert rules have single of each properties:
    • Target resource
      • Scope & signals for alerting.
      • E.g. VM
    • Signal
      • Emitted by target resource
      • Can be metrics, activity log, application insights and log.
    • Criteria
      • Combination of signal and logic applied on target resources.
      • E.g. less than X CPU usage.
    • Logic
      • User-defined logic to verify that signal is within expected range/values.
      • E.g. less than 30% CPU usage.
    • Alert name
    • Alert description
    • Severity
      • Alert once the criteria specified in the alert.
      • Can range from 0 to 4.
    • Action
      • Specific action taken when the alert is fired.
  • You can alert on:
    • Metric values
    • Log search queries
    • Health of underlying Azure platform
    • More..
  • State of alerts:
    • New: Created or fired
    • Acknowledged: Issue is reviewed.
    • Closed: Issue has been resolved.
      • Can be reopened by changing its state.
    • User changes state from New.

Log types

Resource logs

  • Non-compute resources: Resource metrics
  • Compute resources: Guest OS (e.g. syslog for Linux, event logs for Windows)
  • Azure Monitoring Agents
    • Azure Diagnostics Extension (cloud only)
      • Windows Server and Linux
      • useful for basic resource-level monitoring
      • Deployed automatically to VM when you enable it.
      • Boot diagnostics (serial console)
    • Log Analytics Agent (hybrid solution)
      • Can collect logs from Azure & on-prem systems to same namespace.
  • 🤗 Formerly known as diagnostic logs 1

Application logs

  • Trace event streams
  • Programmed in application itself.
  • Application Insights
    • Instrumentation tool
    • HTTP requests
    • Dependency Calls (to e.g. SQL, external services, background services)

Activity logs

  • Azure infrastructure logs
  • E.g.
    • Who created VM?
    • Who configured this VNet?
    • Traffic stream from NSG?
  • Types include: • Administrative events • Service health events • Autoscale events • Recommendations • Security alerts • Alerts
  • ❗ Stored for 19 days
  • Through a diagnostic setting, they can be streamed into
    • Storage account
    • Log analytics workspace
    • Azure marketplace partner
    • Event Hub

NSG (Network Security Group) Flow logging

  • Flow logs handled by NSGs.
  • Plot using
    • In-built Azure plotting tool Network Watcher
    • Power BI

Azure Cost Management

  • In portal it can be eached through "Cost analysis" blade of desired scope.
  • In "Cost analysis" you can filter by "Tag"s.
  • Cost Management shows organizational cost and usage patterns with advanced analytics
  • Reports show your internal and external costs for usage and Azure Marketplace charges
  • You can automate periodically export of your costs
    • 💡 You can also see daily usage data in Portal: Azure Account Center → Billing history → Current period → Download usage
  • Data is consumed by other Azure resources
  • Predictive analytics are also available.

Metrics

  • Collected one-minute frequency
  • Uniquely identified in a namespace.
  • 💡 Stored for 93 days
    • Collected in Azure metrics database (time series database)
    • 💡 Copy to Log Analytics for long term storage
  • Holds value properties: Time, Type, Resource, Value, Multiple Dimensions
  • Value:
    • Health of application: can help to identify route cause.
    • Valuable when combined with other metrics.
  • Sources of metrics:
    • Platform metrics
      • Each resource provides
      • Visibility into health and performance
    • Application metrics
      • Generated by application insights
      • Detect performance issues & track trends
    • Custom metrics
      • ❗ Must be created in same region as the resource that has the metrics
  • Use-cases: • Metrics explorer • Metric Alert Rule • Auto Scale • Route & Stream • Archive • Access

Third party tools

  • ITSM
    • IT as a Service
    • Helps to design, plan, deliver, operate, and control information technology (IT) services
    • Azure ITSM Connector
      • Bi-directional connection layer between and your ITSM tool(s)
      • Use cases:
        • Create ITSM work items based on Azure alerts.
        • Sync ITSM incident/change request data to Azure.
  • SIEM
    • Security information and event management
    • E.g. Splunk (there's an open source add-on to send to Event Hubs)
      • 💡 You could even use Azure Sentinel as a SIEM tool.

Action groups

  • Name: Unique identifier
  • Action type
    • Voice call or SMS
      • ❗ Up to 10 SMS / voice call actions in an action group.
      • ❗ No more than 1 SMS / Voice call every 5 minutes.
    • Webhook
      • ❗ Up to 10 webhook call actions in an action group.
      • It'll retry 2 times: first after 10, then 100 seconds.
    • Logic App
      • ❗ Up to 10 logic app actions in an action group.
    • Automation runbook
      • ❗ Up to 10 Runbook actions in an action group.
    • Azure Function
    • ITSM
      • ❗ Up to 10 ITSM actions in an action group.
    • Email
      • ❗ Up to 1000 e-mail actions in an action group.
      • ❗ No more than 100 emails in an hour.
    • Push notification
      • Azure App Push
      • ❗ Up to 10 Azure app actions in an action group.
  • Details: corresponding phone number, email address, webhooks URI, or ITSM connection details.

Monitoring and reporting on spend

  • Two ways to understand Azure bill to compare usage and costs (invoice):
    1. Using usage file
      • Detailed usage CSV file shows charges & daily usage in billing period
        • Download:
          1. Sign into the Azure account Center as the Account Administrator
          2. Select the subscription for which you want the invoice and usage information
          3. Select billing history → Download usage
        • Select billing history
    2. Using Azure portal
      • Subscription → Cost analysis → Filter by Timespan
  • See estimated costs on Portal: Subscription → Usage and estimated costs

Log Analytics

  • Old: OMS, new: Embedded in Azure Monitor as Logs.
  • It's a dataware house for telemetry
    • It converts any schema to a table schema that allows you to query.
      • Uses KQL (pipe-based) language to query.
  • All monitoring roads lead t o Azure Log Analytics
    • There's always an integration from an logging Azure component to Log Analytics.
  • You can download agents in Workspace → Connect
    • Agents do not require VPN
    • System Center Operations Manager
      • Can send data to Log Analytics from cloud/on-prem servers.
  • Azure Data Explorer
    • Query language is used & viewed
  • Alert rule
    • Based on each query that run on regular intervals, results are evaluated to trigger an alert.
    • Target
      • Specific Aure resource
    • Criteria
      • Specific logic to trigger an action
      • Log Alerts
        • Describes where signal is custom query based on Log Analytics
    • Action
      • Call to send a notification
    • Set-up in Log Analytics → Alerts
  • Export
    • • Excel • PowerBI
  • Application Insights data is used in a different partition in Log Analytics.
    • E.g. requests, traces, usages
    • Allows you to cross application queries
  • Function
    • Queries can be saved as functions to be used within another query.
  • Requires log analytics workspace

Create performance baselines

  • Baseline
    • Configuration management term
    • Signifies an agreed-upon description of product attributes, per unit time, which serves as a basis for defining change.
    • 💡 It's not only recommended but mandatory for team to develop a baseline.
      • Gather diagnostics for long enough time.
        • Capture all peaks and values over ordinary usage.
        • Enable streams and create baseline
      • Even analyze those and agree upon which performance ranges are acceptable to define SLAs.
      • Helps to isolate problem
  • Baselining in Azure
    1. Continuous monitoring
    2. Normal operational parameters
    3. Alerts on deviations
    4. Take proactive corrective actions
  • Baselines actions
    • Enable diagnostics monitoring and telemetry, e.g.:
      • Azure IaaS resources
      • Azure App Service apps
    • Creating performance baselines
      • Analyze diagnostics output
      • Plot metrics