This is where we provide a service to a customer to provide value.
- Urgent operational problems are handled here.
- Bigger issues are fed back into earlier stages.
Work with what you've got
- Service Operation requires you to be flexible
- Don't have the benefit of re-engineering from the start
- Operations must continue alongside Continual Service Improvements
Key Takeaways
- This stage never ends.
- Provides users with value.
- Faults are quickly fixed or sent back to an earlier stage.
- Deliver and support IT services
- Minimize the impact of failure
- Control access to services
Scope
- Services
- People
- Processes
- Event Management
- Incident Management
- Problem Management
- Request Fulfillment
- Access Management
- Technology
- Balance
- Internal IT vs external business requirements
- Stability vs responsiveness
- Cost vs quality
- Proactive vs reactive
- Communication
- Between users and customers
- Between operational teams/shifts
- In performance reporting
- With projects and programs
- For any changes and releases
- For any failures and emergencies
To manage events throughout their lifecycle (usually short).
An event is a change of state that is significant for the IT service management.
Functions
- Detect changes of state that affect any CI
- Identify the type of event and take action
- Trigger other Service Management processes
- Capture performance info to compare against SLA targets
- Provide inputs into CSI
Alert - automated warning that something has occurred
Event Types
- Informational - shows everything is operating properly
- Considered a completed event and logged in the CMS
- Warning - something isn't operating properly; some threshold has been crossed
- Trigger Problem Managment process to determine cause
- Logged in CMS
- Exceptions - an error condition has occurred; performance is currently unacceptable
- Trigger the Incident Management process
- or create a Change Management issue
Restore normal service oepration as quickly as possible while maintaining service quality by minimizing adverse impact.
Covers any event that disrupts or may disrupt services.
Incident - any unplanned interruption or reduction in quality to an IT service or failure of a CI.
- Identification - Triggered by trigger
- Exception in Event Management
- Trechnician discovers issue
- System auto-detects
- User calls to complain
- Logging - Service desk logs all incidents
- Categorization - Incident vs service request
- Determine Priority - What is the impact/urgency?
- Determined by SLA, often by timeline
- Initial Diagnosis & Escalation
- Tier 1 is triage.
- Service Desk escalates if it needs a specialist
- Escalation
- Functional - most common, requires a specialist
- Hierarchical - referred to management due to severity, affected user, or permissions
- Resolution and Restoration
- Complete investigation and appropriate correction
- Solution reported back to Service Desk and the user
- Closure - just because tech says it's fixed doesn't mean it's fixed.
- Check with user that it is fixed.
- Close the ticket with what was wrong and how it was fixed.
An incident labeled as "major" during prioritization, based on predefined criteria.
- Reported to Service Desk Manager and the IT Director
Major Incident Teams
- pre-planned responses for major likely events
- Service Desk still own the incident
- Generic Models - most common, used for handling initial logging and categorization
- Specific Models - once categorized, analyst can move to a specific model for common issues
Incident matching
- Help desk should try to match incident to known existing incidents
- Can identify trends to be addressed
- Can match the symptoms to a database of known incidents
Problem Management focuses on the long-term solution and fixing the root cuase.
Incident Management focuses on correcting issues as quickly as possible.
To manage problems throughout their lifecycle, minimizing impact of underlying errors and prevent recurrence of related incidents.
Scope
- Triggered by Event Management, Incident Management, or Problem Management
- Implements solutions through Change Management and Release & Deployment
- Prevents issues using Availability Management and Capacity Management
- Problem - underlying cause of incidents or possible incidents
- Workaround - method to minimize impact of an incident until a permanent fix is ready
- Known Error - exists when you have an incident and a current workaround, allows business operations to continue
- Known Error Database (KEDB) - details problems, workarounds, and known errors
- contains error records and problem records
- exists in CMS
- Error records - problem with associated workaround
- Problem records - problem without an associated workaround
- Incident Matching - matching incident to a known error
- Initiating the Process - decided by Service Desk supervisors and managers
- Problem Models - predefined steps to investigate a given problem
- Major Problems - Conduct a postmortem.
Manage lifecycle for all service requests and deliver value directly to users.
- Managerial request - use Business Relationship or Service Level Management processes
- Individual request - use the Request Fulfillment process
Functions
- Increase user satisfaction
- Provide access to standard services
- Provide information and help
- Handle complaints, comments, and compliments
Key Takeaways
- Handle requests, not necessarily solve them
- All requests should be recorded
- Requests can trigger other processes
- Some requests are impossible
Provide users with access rights to use a given service.
Functions
- Operate per the IT Security Management policy
- Handle access rights as approved/directed
- Ensure changes occur by process
- Oversee access to services with Event Management
Triggers
- Service request
- Change Management process
- Release & Deployment Management process
Provide a single point of contact for all IT services.
Efficient and Effective
- Improved customer satisfaction
- Consistency in support to users
- Speedy service restoration for failures
- Quality and speed for fixing incidents
- Effective teamwork and delegation
- Better management of infrastrcuture through CMS
- Proactive service improvement and provisioning
- Capture of metrics
- Log & Categorize Incidents
- Log & Categorize Service Requests
- Prioritizing Incidents
- Diagnose and correct Tier 1 incidents
- Restore services (if possible)
- Escalate incidents and SRs
- Communicate status with users
- current incidents and problems
- future rollout of services
- Close incidents and SRs
- Maintain CMS data
- Communicate between users and service provider
- Follow-up with users to determine their satisfaction
- Knowledge of business
- Technical acumen and skill
- Communication skills
- Methodical approach to fixing problems
- Ownership and responsibility
Help Desk -> Service Desk - more all-encompassing
- Local service desk - located physically close to customers
- Centralized service desk - better use of resources, improves consistency, and centralizes management
- Virtual service desk - no central location, but still shared resources, consistency, and centralized management
- Follow-the-sun - combines types for 24/7 coverage across time zones
- Specialist Service Desks - internal variation on Tier 1
Provide a stable platform for services to be delivered and perform day-to-day IT oeprations.
IT Operations Control
- Monitors infrastructure for optimal performance
- Carries out tasks to keep services up
- IT Operations Control (NOC)/Operations Bridge
- Activities
- Monitoring performance and event management
- Scheduling jobs, roll-outs, and upgrades
- Perform backups and restorations
- Produce reports and metrics
- Maintain infrastructure
- Service desk for the services, rather than for the customer
Facilities Management
- Concerned with physical environment of infrastructure
- Close relationship with IT Operations Control
Provide technical resources to various phases.
- Custodian of technical knowledge and skills related to infrastructure
- Source of resources needed for ITIL lifecycle
- Plan, implement, and maintain a stable technical infrastructure
Functions
- Produce a platform that works as best as possible
- Provide guideance to IT Operations Management to maintain operations
- Keep technical infrastructure in the best condition possible
- Provide additional support during incidents and problem management
Provide application resources and identify software requirements and their sourcing.
- Custodian of technical knowledge and skills related to applications
- Source of application resources needed for entire ITIL lifecycle
- Decides internal/external
Functions
-
Design good applications
-
Ensure applications provide functionality (utility)
-
Provide technical skills to keep applications in good condition
-
Provide additional support during incidents and problem management
-
App Development - design and construct an application solution to gain initial utility
-
App Management - ongoing oversight, management, and improvement of applications for utility and warranty
Processes first, tools second. Fit the tools to the processes.
- Self-help functionality: Allow users conduct their own routine SRs quickly for standard changes.
- Workflow and process control: Ticketing systems
- Configuration Management System: automate regular processes and functions
- Remote Control of User Desktop: Take control of user's computer for fixes or training
- Diagnostic scripting: Collect data quickly, created by Tier 2 and 3 analysts
- Dashboards and reporting: Real-time performance information and metrics to NOC personnel and management
- What is the best description of an incident model?
Documented set of steps to be followed when dealing with a specific type of incident - What is the best description of an incident?
Occurrence of significance regarding a specific configuration item (CI) - What is the best description of the purpose of Facilities Management?
Manage the phsyical IT environment - If a monitoring tool detect a failure within the technical infrastructure, how soon should incident management be invoked?
Immediately - What is not a benefit of using an incident model?
- Helps to capture valuable management information
- Ensures all incidents are easy to resolve
- Helps provide faster resolution of incidents
- Helps to achieve consistency in the handling of specific types of incidents
- What is not a purpose of the Service Operation phase of the ITIL lifecycle?
- Manage the technology and applications being used to deliver services
- Deliver value to users and customers day-by-day
- Monitor the performance of configuration items in the live environment
- Complete user acceptance testing
- What is the purpose of the Request Fulfillment process?
Handling service requests from users - Which statement(s), if any, concering Problem Management is correct?
(1) Problem Management must submit the required RFCs through Change Management for the permanent solution of faults
(2) Problem Management provides valuable management information regarding the costs of resolving and preventing problems
Both statements - What is the best action that should be taken when a workaround has been found during Problem Management?
Update the open problem record with the details of the workaround, and if any related incidents are open, inform incident management that a workaround is now available - Which is not correct concerning the Service Desk?
- Service Desk analysts try to restore normal service as quickly as possible
- The Service Desk handles service requests and incidents
- The Service Desk is an ITIL function
- The Service Desk is a Service Operations process
- The term IT Operations Controls refers to... Overseeing and monitoring of operational events and activities
- What is the benefit of using an incident model?
It helps achieve speed, consistency, and accuracy - What is the correct sequence of activities for incident handling?
Identification, logging, categorization, prioritization, investigation, diagnosis, resolution, recovery, closure - What is the best example of a workaround?
The Service Desk temporarily reassigns prints to an alternative shared printer while a broken printer is being fixed. - What is the best description of a major problem review?
A major problem review is usually conducted by the problem manager and is held to learn lessons and improve future arrangements.