From 1c2310cdc424e8cd91f6457de40e82e5739ccfbe Mon Sep 17 00:00:00 2001 From: Peter Willis Date: Tue, 20 Nov 2018 13:59:02 -0500 Subject: [PATCH] Import more initial docs and changes to some files --- README.md | 19 +- calms.md | 11 + guides/docker/linux/README.md | 2 +- guides/docker/linux/machine/README.md | 2 +- metrics.md | 9 + practices/abstraction/README.md | 49 +++++ practices/agile/.gitignore | 0 practices/ci/continuous_delivery_features.md | 17 ++ .../continuous_integration_best_practices.md | 31 +++ practices/declarative/README.md | 15 ++ practices/deployment.md | 9 + practices/deployment_models.md | 45 +++++ practices/immutable/.gitignore | 0 practices/monitoring_models.md | 13 ++ practices/monitoring_reliability.md | 12 ++ practices/scaling.md | 5 + practices/scaling_models.md | 9 + practices/testing_models.md | 40 ++++ practices/writing_user_docs.md | 191 ++++++++++++++++++ site-documentation/README.md | 24 +++ what-is-devops.md | 35 +--- 21 files changed, 508 insertions(+), 30 deletions(-) create mode 100644 calms.md create mode 100644 metrics.md create mode 100644 practices/abstraction/README.md create mode 100644 practices/agile/.gitignore create mode 100644 practices/ci/continuous_delivery_features.md create mode 100644 practices/ci/continuous_integration_best_practices.md create mode 100644 practices/declarative/README.md create mode 100644 practices/deployment.md create mode 100644 practices/deployment_models.md create mode 100644 practices/immutable/.gitignore create mode 100644 practices/monitoring_models.md create mode 100644 practices/monitoring_reliability.md create mode 100644 practices/scaling.md create mode 100644 practices/scaling_models.md create mode 100644 practices/testing_models.md create mode 100644 practices/writing_user_docs.md create mode 100644 site-documentation/README.md diff --git a/README.md b/README.md index 5f64648..347d6ea 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,23 @@ # DevOps Yoga -Seeking DevOps enlightenment 
through practice. +Seeking DevOps enlightenment through practice + +![][construction_anim_2] ![][construction_anim_2] ![][construction_anim_2] ![][construction_anim_2] ![][construction_anim_2] + +![][construction_anim_1] ![alt-text][construction_bar_rotate] ![][headdesk] + +![][construction_anim_2] ![][construction_anim_2] ![][construction_anim_2] ![][construction_anim_2] ![][construction_anim_2] + +[What is DevOps?][1] + +More content is coming, but we need your help! Visit the [GitHub repo](https://github.com/peterwwillis/devopsyoga-content) to help contribute. Send us an e-mail, open an issue, or submit a pull request! + +[1]: what-is-devops.md +[construction_bar_rotate]: https://web.archive.org/web/20091027000226im_/http://de.geocities.com/cad_klaus_e/construction_bar_rotate_md_wht.gif +[construction_anim_1]: https://web.archive.org/web/20091027071000im_/http://geocities.com/jpdetroitusa/CONSTRUCTION_ANIMEE.gif +[headdesk]: https://web.archive.org/web/20091024114538im_/http://www.geocities.com/paradisesurfing/head_construction.gif +[construction_anim_2]: https://web.archive.org/web/20090821184406im_/http://geocities.com/Piano_Wizard/construction.gif --- The content here is licensed with the CC-BY-SA-4.0 license, included in the [LICENSE](LICENSE) file. +([GitHub](https://github.com/peterwwillis/devopsyoga-content)) diff --git a/calms.md b/calms.md new file mode 100644 index 0000000..a4f02b8 --- /dev/null +++ b/calms.md @@ -0,0 +1,11 @@ +# Culture, Automation, Lean, Metrics, Sharing + +I need content, please edit me!
+ +## Links + - http://conferences.unicom.co.uk/balancing-devops/ + - https://www.slideshare.net/devopsguys/dev-opsguys-devops-101-for-recruiters/19-CALMS_ModelCultureAutomationLean_Hearts_Minds_Embrace + - https://futroninc.com/2015/06/keep-c-a-l-m-s-and-be-agile/ + - https://devopscollective.org/automation-is-not-devops/ + - https://www.forrester.com/report/CALMSS+A+Model+For+Assessing+Modern+Service+Delivery/-/E-RES122633# + diff --git a/guides/docker/linux/README.md b/guides/docker/linux/README.md index 4cc9698..c3293a0 100644 --- a/guides/docker/linux/README.md +++ b/guides/docker/linux/README.md @@ -1,4 +1,4 @@ -# Docker on Linux +# Guide to Docker on Linux - [Using Docker in a Docker Machine](./machine/) diff --git a/guides/docker/linux/machine/README.md b/guides/docker/linux/machine/README.md index c935c21..77f18c0 100644 --- a/guides/docker/linux/machine/README.md +++ b/guides/docker/linux/machine/README.md @@ -1,4 +1,4 @@ -# Setting Up Docker on Linux +# Guide To Setting Up Docker Machine on Linux This guide sets up Docker Machine (a virtual machine with Docker installed) in order to control Docker in a pre-built environment. diff --git a/metrics.md b/metrics.md new file mode 100644 index 0000000..b562429 --- /dev/null +++ b/metrics.md @@ -0,0 +1,9 @@ +# DevOps Metrics + +## Links + - https://devops.com/closing-gap-continuous-delivery-metrics-matter/ + - https://devops.com/tag/metrics/ + - https://techbeacon.com/how-use-metrics-measurement-drive-devops + - https://techbeacon.com/devops-value-stream-mapping-why-you-need-metrics + - https://www.slideshare.net/DevOpstastic/value-stream-mapping-and-metrics-that-matter + diff --git a/practices/abstraction/README.md b/practices/abstraction/README.md new file mode 100644 index 0000000..36b3440 --- /dev/null +++ b/practices/abstraction/README.md @@ -0,0 +1,49 @@ +# Abstracting Infrastructure + +A large amount of the cloud infrastructure we use is built on details which can be abstracted and ignored. 
Here is a short list of some of those details, and how they can be abstracted. + +## Components that are abstracted + +## Types of abstraction + +### Services +A service is *"A valuable action, deed, or effort performed to satisfy a need or to fulfill a demand."* + +There are many things we manage as infrastructure that can be seen as services. AWS and OpenStack provide us the service of running programs and storing & retrieving data. Web servers provide us the service of taking a request and returning a result. Services often depend on other services, and through automation and layers of abstraction, we simplify the process of supporting these services for our customers. + +### Credential management +A credential is *"Evidence of authority, status, rights, or entitlement to privileges."* + +Each instance of a service in a cloud system requires its own credentials. By assigning a credential to a service, we can run multiple instances of the same service, isolate them in independent access control domains, or grant access to them as needed. + +Credentials work with an access control system to determine which parts of the system, and which users, can access a resource or perform particular operations on it. All services and users in a distributed system require credentials to operate securely. + +However, the specific credentials often do not affect the infrastructure directly. As long as they are applied correctly, the infrastructure does not care what your credentials are. Therefore we can easily abstract the credentials away in an automated system. + +### Users and Groups +There may be multiple users of your system, and each user may need different access to different services. By managing groups and assigning users to them, we can abstract away the details of users. Furthermore, if we put groups inside of other groups, we can abstract away the entire hierarchy of user access to services.
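
The idea of groups nested inside other groups can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group names (`Web`, `Web-Admins`, `Web-ReadOnly`), the user names, and the `effective_users` helper are all hypothetical and are not taken from any real access control system.

```python
from typing import Dict, Optional, Set

# Hypothetical group definitions: each group lists its direct users
# and any groups nested inside it.
GROUPS: Dict[str, Dict[str, Set[str]]] = {
    "Web": {"users": set(), "groups": {"Web-Admins", "Web-ReadOnly"}},
    "Web-Admins": {"users": {"alice"}, "groups": set()},
    "Web-ReadOnly": {"users": {"bob", "carol"}, "groups": set()},
}


def effective_users(
    group: str,
    groups: Dict[str, Dict[str, Set[str]]] = GROUPS,
    _seen: Optional[Set[str]] = None,
) -> Set[str]:
    """Return every user reachable from `group`, following nested groups."""
    seen = set() if _seen is None else _seen
    if group in seen:
        # Guard against groups that (accidentally) contain each other.
        return set()
    seen.add(group)

    entry = groups[group]
    users = set(entry["users"])
    for child in entry["groups"]:
        users |= effective_users(child, groups, seen)
    return users


if __name__ == "__main__":
    print(sorted(effective_users("Web")))
```

Granting access to the outer `Web` group is then enough to cover everyone in the groups nested beneath it, without listing individual users anywhere.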
+ +If we know we will have multiple projects, and that most users will at least be assigned to a single project, we can start by creating one group for every project. In this way we can control the groups of users who have access to entire projects. This is mostly only useful for groups of administrators, but can also be useful to provide read-only access to projects without sensitive data. + +### Roles + +As we saw above, assigning administrators to a project's group lets us manage who can access all the instances of a project. But if there are multiple projects, we would have to manage those admins on every project. Instead we can create a special group called a 'role', which is made up of all the users who should have a particular kind of access to a project, or across projects, and add that 'role' to the project. + +### Customers +Some projects can become large, and the use of the project can vary greatly between users and groups. For this purpose we have created a 'customer' designator, to be a sort of project super-group. + +We know our project 'Web' has a group 'Web', but there may be lots of users who need different access to 'Web', with different kinds of access to data. If we can separate groups of users into 'Customer A', 'Customer B', etc, we can split up how the 'Web' project is used, but inherit the shared properties of the 'Web' project. + +We can create a group called 'Web-Customer\_A', and assign users 1,2,3 to this group, and assign users 4,5,6 to 'Web-Customer\_B'. + +Now we can not only manage the users who have access to all of the 'Web' project, but we can also control which users have what access to what parts of which customer. + +(why do we call this designator 'Customer'? Because a customer by definition is "someone who is being served by a business or individual", and we treat our individual groups of users as customers of our service) + +In this way, we not only know there is a group that encompasses all of the 'Web' project instances. 
+ +### Example: Running Jenkins +For our project 'Web', we have an instance "Web-Customer\_A", where jobs for 'Customer A' are run for the 'Web' project. To start Jenkins for this instance "Web-Customer\_A", we will use credentials called "Web-Customer\_A-Jenkins". This ensures that this instance of Jenkins can only be controlled by a user or service that works with 'Customer A' on the 'Web' project. + +We derived the name of the credential from the project, the customer, and the service name. All of this is known by the time we deploy our customer's project, so an automated process can create the credential as well, and assign it as necessary. + diff --git a/practices/agile/.gitignore b/practices/agile/.gitignore new file mode 100644 index 0000000..e69de29 diff --git a/practices/ci/continuous_delivery_features.md b/practices/ci/continuous_delivery_features.md new file mode 100644 index 0000000..476d6f9 --- /dev/null +++ b/practices/ci/continuous_delivery_features.md @@ -0,0 +1,17 @@ +# Mapping | CI-CD + +## Input + * credentials + * job + * pipeline + * inventory + * version control files + * other build info + * action (build, stop) + * build child nodes +## Output + * job status + * job result + * output/files generated + * logs + diff --git a/practices/ci/continuous_integration_best_practices.md b/practices/ci/continuous_integration_best_practices.md new file mode 100644 index 0000000..8eea6e2 --- /dev/null +++ b/practices/ci/continuous_integration_best_practices.md @@ -0,0 +1,31 @@ +# Best Practices | CI-CD + +## Best Practices + +https://en.wikipedia.org/wiki/Continuous_integration#Best_practices + +### Maintain a code repository + +### Automate the build + +### Make the build self-testing + +### Everyone commits to the baseline every day + +### Every commit (to baseline) should be built + +### Keep the build fast + +### Test in a clone of the production environment + +### Make it easy to get the latest deliverables + +### Everyone can see the results of the 
latest build + +### Automate deployment + +## Notes + + * Beware large monolithic projects. The bigger they get, the more tests there are, and the harder it is to have a successful build as different parts of it constantly change. Segment your project or split it off if it's getting too difficult to keep the builds clean. + + * Use Multi-stage Continuous Integration to better manage large projects. https://en.wikipedia.org/wiki/Multi-stage_continuous_integration diff --git a/practices/declarative/README.md b/practices/declarative/README.md new file mode 100644 index 0000000..081e548 --- /dev/null +++ b/practices/declarative/README.md @@ -0,0 +1,15 @@ +# Declarative Programming of Infrastructure + +Declarative programming is a programming paradigm used to describe *what* a program must accomplish, rather than *how* to accomplish it. This form of programming is used to build cloud infrastructure, with solutions such as AWS CloudFormation, and HashiCorp Terraform. + +## Benefits of Declarative Infrastructure + I need content, please edit me! + +## Problems of Declarative Infrastructure + +### Non-idempotence +Not all resources used by your infrastructure are idempotent - that is, they cannot be applied multiple times without changing the initial result. Therefore, declarative infrastructure cannot be said to be fully idempotent. You will have to write tests and workarounds to ensure your result is as expected. + +### Making specific changes is hard +You know you want to modify job C that runs on server type A, but you don't want to change job C on server type B. You will probably have to copy the job and make a new one with the change, and make it run only on server type A. Or, even more complicated, add specific logic to job C to behave differently only on server type A. These kinds of divergent changes make your code more complex over time, and thus harder to manage. 
+ diff --git a/practices/deployment.md b/practices/deployment.md new file mode 100644 index 0000000..1d96337 --- /dev/null +++ b/practices/deployment.md @@ -0,0 +1,9 @@ +# Deployment + +# Undeployment + +The process of deployment can include building infrastructure, copying files, changing security rules, and opening network paths for application traffic. If these changes are not removed when they are no longer necessary, various problems emerge: wasteful use of resources, security vulnerabilities, and overall confusion over legacy configurations. + +Undeployment processes effectively reverse the deployment process. They can serve as one part of a rollback procedure, and they help shut down resources that are no longer in use, which reduces waste. + + diff --git a/practices/deployment_models.md b/practices/deployment_models.md new file mode 100644 index 0000000..b0bca2a --- /dev/null +++ b/practices/deployment_models.md @@ -0,0 +1,45 @@ +# Deployment Models + +There are two basic deployment models we will focus on: simple and short, and complicated and long. + +## Simple Short Deployment + +A simple short deployment model involves taking build artifacts, putting them in some remote place, and performing some steps to make the new build artifacts live. It happens immediately and there are no other considerations. + +## Complicated Long Deployment + +A complicated long deployment model takes the simple short model and extends it to support large, customer-facing, potentially volatile situations. It may have multiple dependencies and require orchestration of multiple components. It may need to be paused, resumed, or aborted. It may need to have tests run at each phase before continuing. + +## Dependency-driven deployment + +Dependency-driven deployment involves deploying using a hierarchy of dependencies. Dependencies that have already been deployed successfully may be skipped to save time.
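
A minimal sketch of dependency-driven deployment, assuming the dependency hierarchy is available as a simple mapping. The component names and the `plan_deployment` helper are hypothetical; the ordering itself uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Set

# Hypothetical components: each key depends on the components in its set.
DEPENDENCIES: Dict[str, Set[str]] = {
    "network": set(),
    "database": {"network"},
    "app": {"database"},
    "load_balancer": {"app", "network"},
}


def plan_deployment(
    dependencies: Dict[str, Set[str]], already_deployed: Set[str]
) -> List[str]:
    """Return the components still to deploy, in dependency order.

    Components listed in `already_deployed` are skipped to save time.
    """
    order = TopologicalSorter(dependencies).static_order()
    return [name for name in order if name not in already_deployed]


if __name__ == "__main__":
    # Nothing deployed yet: every component, dependencies first.
    print(plan_deployment(DEPENDENCIES, set()))
    # Network and database already succeeded: only the rest remains.
    print(plan_deployment(DEPENDENCIES, {"network", "database"}))
```

Skipping a dependency is only safe if it has not changed since its last successful deploy, so a real process would likely compare versions or checksums rather than rely on a plain "already deployed" set.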
+ +## Blue/Green deployment + +In this model, two environments exist: one with an old version of your app, and one with a new one. By switching traffic from one environment to the other you can fail back quickly in the event of a problem with the deploy or issues experienced after deploy. + +Potential pitfalls: + + * Long-running transactions in the old environment need to be phased out safely + * If you use separate databases for these environments, you may need to migrate data from one to the other. + * If you use the same database for these environments, you may have problems in the new environment that affect the old one. + * If you use a database service that supports snapshots, definitely use them. + +## Blue/Turquoise/Green deployment + +This is the same as Blue/Green deployment, but with an extra step to handle moving applications on a shared data store. + +When switching apps that are using a shared database, the database may need to be modified to support the new app code, which would break the old application. This normally requires a maintenance outage. To avoid that outage, "Turquoise" is an intermediate environment with the old application, and with a database that supports both the old and new application. Blue is switched to Turquoise, where the Blue app keeps running, but on a Green-compatible database. Then you switch to the Green stack, which is both the new app and the new database. + +https://www.digitalocean.com/community/tutorials/how-to-use-blue-green-deployments-to-release-software-safely +http://blog.dixo.net/2015/02/blue-turquoise-green-deployment/ +https://www.slideshare.net/mikebrittain/mbrittain-continuous-deploymentalm3public/50 + +## Canary deployment + +Canary deployment is similar to Blue/Green deployment except it is not an all-or-nothing cutover. Instead, a small percentage of users will use the new environment, and the rate increases as confidence grows, until the new environment gets 100% of traffic.
The old environment can then be removed. + +## Continuous deployment + +https://www.slideshare.net/mikebrittain/mbrittain-continuous-deploymentalm3public/ + diff --git a/practices/immutable/.gitignore b/practices/immutable/.gitignore new file mode 100644 index 0000000..e69de29 diff --git a/practices/monitoring_models.md b/practices/monitoring_models.md new file mode 100644 index 0000000..b46e20d --- /dev/null +++ b/practices/monitoring_models.md @@ -0,0 +1,13 @@ + +# Alerts + +Alerts are a form of monitoring that require immediate action. + +# Tickets + +Tickets are a form of monitoring that require eventual action. + +# Logging + +Logs are a form of monitoring that require no action. Graphs are a type of logging, or log reporting. + diff --git a/practices/monitoring_reliability.md b/practices/monitoring_reliability.md new file mode 100644 index 0000000..6936d35 --- /dev/null +++ b/practices/monitoring_reliability.md @@ -0,0 +1,12 @@ +# Monitoring Reliability + + +Reliability is a function of mean-time-to-failure (MTTF), and mean-time-to-recovery (MTTR). + +MTTF is important in determining when a process is not reliable. + +## MTTR + +MTTR is important in determining how quickly you can resolve an issue and limit the impact on a service. + +As humans add latency to the MTTR, automated systems are useful to reduce the amount of time it takes to resolve an issue. Playbooks, or runbooks, also help reduce MTTR. diff --git a/practices/scaling.md b/practices/scaling.md new file mode 100644 index 0000000..6524ede --- /dev/null +++ b/practices/scaling.md @@ -0,0 +1,5 @@ +# Scaling + +Scaling is the process of adjusting to changing load. It is necessary to reduce cost and handle varying amounts of resource use and demand. Not every application and customer will need to be scaled, but some part of any given system usually needs to be scaled at some point in its lifecycle. 
+ + diff --git a/practices/scaling_models.md b/practices/scaling_models.md new file mode 100644 index 0000000..946cc2a --- /dev/null +++ b/practices/scaling_models.md @@ -0,0 +1,9 @@ +# Scaling Models + +## Static scaling + +Static scaling is the process of changing scale once, to a fixed size. + +## Dynamic scaling + +Dynamic scaling is the process of changing scale whenever necessary to a variable size. diff --git a/practices/testing_models.md b/practices/testing_models.md new file mode 100644 index 0000000..50b447b --- /dev/null +++ b/practices/testing_models.md @@ -0,0 +1,40 @@ +# Testing Models + +There are multiple models of tests, some of which depend on or come after the others. + +## Unit tests + +Unit tests are tests of specific individual functionality. This test does not take any other part into consideration. + +## Integration tests + +Integration tests are tests of the integration of multiple components as a whole. This may involve multiple dependencies. + +## System tests + +System tests are tests to verify an entire system functions as expected. This includes smoke tests, performance tests, regression tests, etc. + +## Production tests + +These tests verify that live production systems work as expected. If one of these fails, there is a significant unexpected problem which may be customer-facing, which should trigger alerts and potentially trigger a rollback. + +## Verification tests + +Verification tests, also known as configuration tests, are intended to validate expectations. Instead of verifying if something is working, they simply verify that something looks the way it should. If this test fails, it means something unexpected has occurred, which may need to trigger a redeploy or rollback. Examples of failure would include file corruption, security penetrations, mistakes in configuration, unexpected results of automation processes, etc. + +## Stress tests + +Stress tests are used to find the limits of a component of a system. 
They are basically intended to break something, and record what it took to break it, and what the result was. + +## Canary test + +Canary tests are used to wait a period of time in an attempt to find unexpected problems before continuing a process. On failure, they should immediately revert any changes, and fail whatever process depends on them. Canary tests may use sampling of expected metrics (such as error rates, traffic rates, resource use, load) to trigger a failure when something unexpected occurs. + +## Issue-driven tests + +Issue-driven tests are tests created from reported issues. As each issue is documented and resolved, a test case is created to check for that issue, so that one will know immediately if it re-occurs. + +## Chaos tests + +Chaos tests are used to identify unexpected bugs and fix them before they become a larger problem. + diff --git a/practices/writing_user_docs.md b/practices/writing_user_docs.md new file mode 100644 index 0000000..3132c35 --- /dev/null +++ b/practices/writing_user_docs.md @@ -0,0 +1,191 @@ +# Writing End User Documentation + +### Who is this documentation for? + - anyone who isn't deeply familiar with your work + - people who need to learn how to access & use something you provide + +### Why is it important? + - quickly consuming a service reduces toil, or muda + - _toil_: + "manual, repetitive, automatable, tactical, devoid of enduring value, + and that scales linearly as a service grows" + - _muda_: waste. 
in this case: + - motion muda, or waste of unnecessary downtime + - waiting muda + - confusion muda + - excess time learning means less time doing + - lean principles + - eliminate waste + - amplify learning + - deliver as fast as possible + - empower the team / individual agency + - improve integrity + - iterating over documents improves quality of resulting work & processes + - actually painful to have to work hard to learn simple things + - creating good docs exposes weaknesses in on-boarding + + +### Types of docs + + * ##### FAQs + - first docs a user should see + - read your Slack chat and email and write down every question a user asks + - put the question in your FAQ + - ask yourself what you need to use the product + - what information do i need about my account or organization + - what account access do i need + - what network access do i need + - what information does the user need to provide to you + - what parts of the product are non-obvious, non-intuitive or hidden + - what would a five year old need to know to use your product? + - if longer than a few pages, split up into multiple FAQs + + * ##### quick-start docs + - second docs a user should see + - basically a runbook + - learn by doing + - should be 90% commands to run + - minimize prerequisites as much as possible + + * ##### on-boarding workflows + - third docs a user should see + - assume the user knows nothing and has nothing set up + - provide steps, screenshots, videos + - specify prerequisites + - provide all prerequisites to the user + - provide a link that explains how to set each prerequisite up + - actually sit with user and go through onboarding + - document the steps that have problems + - fix them manually for the user + - create a ticket/issue to go back later and fix the problem step + - commit documentation to version control + + * ##### on-boarding workflows - part 2 + - break up a section to keep it from getting too long!
keep it readable + - onboarding can take days/weeks + - if a user has problems: + - create a ticket to track a user's onboarding + - document in the ticket the problems found by user and solutions provided + - ask user if their problem is solved before closing ticket + - include chat logs + email in ticket + - add tags to the ticket: 'onboarding, faq' + - add the problem+solution to FAQ + - can make an independent page for issue, link to it from FAQ question + + * ##### technical docs + - last docs a user should see + - used for: + - building the fine details/implementation + - troubleshooting + - organize into categories & sections + - explain everything someone would need to troubleshoot their own problem + - one line of code may need four lines of documentation + - if a component such as a data type/format is used, + provide link to the format's docs + + +### Organization + + * ##### organizing information in a doc + - put prerequisites at the top of each document + - shuffle where information lives in the doc as you edit it + + * ##### organizing docs + - hierarchy is useful + - have a TOC (see below: "table of contents") in the root that includes + all sub-docs + + * ##### linking docs + - link pages and major sections from your table of contents + - link to other pages/docs for prerequisites covered in other parts of docs + + + +## Writing the docs + + +### Structure + + * ##### table of contents + - create a TOC hierarchy for each doc + (see above: "Organization" / "organizing docs") + - TOC at the base of all docs; the more comprehensive the better + + * ##### the document content + + * ##### examples + - just show the code/config/commands + - provide one config file with all config options and defaults + - provide examples & configs for different use cases + - commands that you can actually run + - highlight the example with a different background, font style, size + + * ##### demos + - in lieu of documentation, a __BRIEF__ recorded demo is appropriate + - 
helps show flow of steps in a UI + - provide in a format that can be paused, rather than as a .gif + + * ##### linked guides + - in lieu of documentation, a concise blog entry can be helpful + - will not replace the experience of the user for your product + + * ##### index + - keyword tags + - list of words/phrases/concepts and link to each page + + * ##### glossary + - provide in addition to index + - put all acronyms in glossary + - put any words/phrases which may have multiple interpretations + + +### Editing + + * ##### formats + - Markdown is simple, widely available, can be converted to many formats + - ASCII is the most compatible thing in the world + - Monospace font size 12 + + * ##### versioning + - version control and collaboration + - git + - make sure the repo is public + - allow people to submit PRs + - put the link to the docs repo in the doc! + - wiki + + * ##### making it readable / skimmable + - break document up into multiple heading sizes/styles + - indent each part of document + - break up steps with bulleted or numbered lists + - when necessary, number each section and sub-section + + +### After the docs are written + + * ##### feedback loop + - provide _easy_ mechanism to allow users to give feedback on the docs, including + - the author + - the team + - slack, email, ticket queue + - incorporate feedback into docs + + +### Tips + + * ##### help the user get help + - always provide the user a way to find help if something goes wrong + - provide links to support outside the scope of your team + - password problem? provide links to password reset tools + - network connectivity problem? 
provide link to network team support + - reduces muda/toil by not requiring user to find these elsewhere + + * ##### don't: + - assume they have accounts, an environment, firewall access, etc + - assume they have used X technology before + - provide enough simple steps that somebody with zero knowledge can get + the thing up and running + + * ##### do: + - assume they just joined the company and have no idea where anything is + diff --git a/site-documentation/README.md b/site-documentation/README.md new file mode 100644 index 0000000..c79759e --- /dev/null +++ b/site-documentation/README.md @@ -0,0 +1,24 @@ +# Site Documentation + +This site is designed with the following in mind: + + - Collaboratively edited and maintained + + As with all Wikis, this site should be written and maintained by volunteers, for the purposes of documenting information they may need or want to share. + + - Wiki-like, but with pull request-based changes. + + The layout of the site should mirror that of a Wiki, but changes are submitted as pull requests, rather than edited directly. + + - Hosted on Github Pages + + The hosting provider may change in the future. + + - Featured Blogs + + At first there will be a site-wide blog, but user blogs should also be supported in the future. The purpose is mainly for editorial or otherwise lengthy explanations of a topic that are not appropriate for Wiki content. Also, news about the Wiki. + + - Uses Jekyll to manage content + + Reliance on Liquid template language, to interface better with Jekyll. + diff --git a/what-is-devops.md b/what-is-devops.md index a1d4b75..93b6917 100644 --- a/what-is-devops.md +++ b/what-is-devops.md @@ -1,17 +1,16 @@ # What is DevOps? -If you search for "What is DevOps?", you will find hundreds of pages that try to explain the concept. The simplest way I can describe it as follows. +If you search for "What is DevOps?", you will find hundreds of pages that try to explain the concept. 
However, there is broad acceptance of what DevOps means throughout the industry: https://devops.com/surprise-broad-agreement-on-the-definition-of-devops/ -The goal of DevOps is to improve the quality of software while speeding up its delivery and operation. The way this is achieved is through a specific culture, a set of processes, and tools. +The goal of DevOps is to improve the quality of software while speeding up its delivery and operation. The way this is achieved is through a specific culture, a set of processes and tools, working closely with other teams, and constantly working to review and improve all of these aspects to reduce waste and increase quality, and thus business value. -Another way to think of DevOps is as agile methods, applied throughout an organization, for the benefit of a software product. +Another way to think of DevOps is as agile methods, applied throughout an organization, to help a software product succeed. Implementing DevOps is a big undertaking, but not because the tools or processes are complicated. The biggest challenge is in changing everyone's mindset in order to become comfortable with new ways to work together. ## What is DevOps Culture? - -At its core, DevOps culture is inspired by the Agile Manifesto; it was even sometimes referred to as "agile infrastructure". But as DevOps encompasses many different roles and responsibilities, it doesn't strictly follow only agile methods. +At its core, DevOps culture is inspired by the Agile Manifesto. DevOps was even sometimes referred to as "agile infrastructure" [citation needed]. But as DevOps encompasses many different roles and responsibilities, it doesn't strictly follow only agile methods [citation needed]. ### Lack of Silos @@ -50,32 +49,14 @@ Ultimately, it's clear from the results that the different organizations involve Similar situations happen to products all over the world, all the time. 
An agile culture won't solve these problems, but it will facilitate teamwork, which often leads to better outcomes. -## What are DevOps Processes? - -These are ways of working that enable the goals of DevOps. - - - -There is no simple definition of DevOps, because it encompasses many different aspects of software development, software operation, and other entities in a business that enable and utilize software. - -One definition focuses on the most popular aspects of DevOps: - - - -The term has evolved over time, but it has roots in some simple ideas. - -The most common definition of DevOps involves teams of software development and operations engineers working together closely. Traditionally these and other departments were separated from each other, which caused problems such as miscommunication and slowness. - - - - +## What are DevOps Practices? At its core, DevOps is just a collection of engineering, business, and cultural practices. When fully implemented, those practices should result in measurably more stable products that can be delivered quicker than using more traditional methods. -## Agile software development +### Agile software development The development and production of software should follow an Agile software development lifecycle. Frequent integration and deployment of code should result in faster and less error-prone software testing and delivery. Focus should be on testing and deploying small amounts of work frequently in short cycles, rather than trying to ship perfect code over long development cycles. -## Continuously improving processes +### Continuously improving processes In every aspect of a product, there should be a focus on repetition and improvement. Frequent repetition of process allows the process not to become stale, and for inefficiencies to be caught early. A review of any inefficiency found should result in an improvement to the process. 
Examples would be continuous integration and testing, but also things like failover recovery, customer service feedback of critical issues, etc. -## Immutable, version-controlled state +### Immutable, version-controlled state Wherever possible, a process should take as input a version controlled object, and produce a version controlled artifact. These artifacts should be immutable, and as little state should change as possible in the use of these artifacts. This prevents changes from causing unexpected errors, and allows for simple recovery in the event of errors.