Import more initial docs and changes to some files
peterwwillis committed Nov 20, 2018
1 parent c5f226c commit 1c2310c
Showing 21 changed files with 508 additions and 30 deletions.
19 changes: 18 additions & 1 deletion README.md
@@ -1,6 +1,23 @@
# DevOps Yoga

Seeking DevOps enlightenment through practice.
Seeking DevOps enlightenment through practice

![][construction_anim_2] ![][construction_anim_2] ![][construction_anim_2] ![][construction_anim_2] ![][construction_anim_2]

![][construction_anim_1] ![alt-text][construction_bar_rotate] ![][headdesk]

![][construction_anim_2] ![][construction_anim_2] ![][construction_anim_2] ![][construction_anim_2] ![][construction_anim_2]

[What is DevOps?][1]

More content is coming, but we need your help! Visit the [GitHub repo](https://github.com/peterwwillis/devopsyoga-content) to help contribute. Send us an e-mail, open an issue, or submit a pull request!

[1]: what-is-devops.md
[construction_bar_rotate]: https://web.archive.org/web/20091027000226im_/http://de.geocities.com/cad_klaus_e/construction_bar_rotate_md_wht.gif
[construction_anim_1]: https://web.archive.org/web/20091027071000im_/http://geocities.com/jpdetroitusa/CONSTRUCTION_ANIMEE.gif
[headdesk]: https://web.archive.org/web/20091024114538im_/http://www.geocities.com/paradisesurfing/head_construction.gif
[construction_anim_2]: https://web.archive.org/web/20090821184406im_/http://geocities.com/Piano_Wizard/construction.gif

---
The content here is licensed with the CC-BY-SA-4.0 license, included in the [LICENSE](LICENSE) file.
([GitHub](https://github.com/peterwwillis/devopsyoga-content))
11 changes: 11 additions & 0 deletions calms.md
@@ -0,0 +1,11 @@
# Culture, Automation, Lean, Metrics, Sharing

I need content, please edit me!

## Links
- http://conferences.unicom.co.uk/balancing-devops/
- https://www.slideshare.net/devopsguys/dev-opsguys-devops-101-for-recruiters/19-CALMS_ModelCultureAutomationLean_Hearts_Minds_Embrace
- https://futroninc.com/2015/06/keep-c-a-l-m-s-and-be-agile/
- https://devopscollective.org/automation-is-not-devops/
- https://www.forrester.com/report/CALMSS+A+Model+For+Assessing+Modern+Service+Delivery/-/E-RES122633#

2 changes: 1 addition & 1 deletion guides/docker/linux/README.md
@@ -1,4 +1,4 @@
# Docker on Linux
# Guide to Docker on Linux

- [Using Docker in a Docker Machine](./machine/)

2 changes: 1 addition & 1 deletion guides/docker/linux/machine/README.md
@@ -1,4 +1,4 @@
# Setting Up Docker on Linux
# Guide To Setting Up Docker Machine on Linux

This guide sets up Docker Machine (a virtual machine with Docker installed) in order to control Docker in a pre-built environment.

9 changes: 9 additions & 0 deletions metrics.md
@@ -0,0 +1,9 @@
# DevOps Metrics

## Links
- https://devops.com/closing-gap-continuous-delivery-metrics-matter/
- https://devops.com/tag/metrics/
- https://techbeacon.com/how-use-metrics-measurement-drive-devops
- https://techbeacon.com/devops-value-stream-mapping-why-you-need-metrics
- https://www.slideshare.net/DevOpstastic/value-stream-mapping-and-metrics-that-matter

49 changes: 49 additions & 0 deletions practices/abstraction/README.md
@@ -0,0 +1,49 @@
# Abstracting Infrastructure

A large amount of the cloud infrastructure we use is built on details which can be abstracted and ignored. Here is a short list of some of those details, and how they can be abstracted.

## Components that are abstracted

## Types of abstraction

### Services
A service is *"A valuable action, deed, or effort performed to satisfy a need or to fulfill a demand."*

There are many things we manage as infrastructure that can be seen as services. AWS and OpenStack provide us the service of running programs and storing & retrieving data. Web servers provide us the service of taking a request and returning a result. Services may often depend on other services, and through automation and layers of abstraction, we simplify the process of supporting these services for our customers.

### Credential management
A credential is *"Evidence of authority, status, rights, or entitlement to privileges."*

Each instance of a service in a cloud system requires its own credentials. By assigning a credential to each instance, we can run multiple instances of the same service, isolate them in independent access control domains, and grant access to them as needed.

Credentials work with an access control system to determine which parts of the system, and which users, may access a resource or perform particular operations on it. All services and users in a distributed system require credentials to operate securely.

However, the specific credentials often do not affect the infrastructure directly. As long as they are applied correctly, the infrastructure does not care what your credentials are. Therefore we can easily abstract the credentials away in an automated system.

### Users and Groups
There may be multiple users of your system, and each user may need different access to different services. By managing groups and assigning users to them, we can abstract away the details of users. Furthermore, if we put groups inside of other groups, we can abstract away the entire hierarchy of user access to services.

If we know we will have multiple projects, and that most users will at least be assigned to a single project, we can start by creating one group for every project. In this way we can control the groups of users who have access to entire projects. This is mostly only useful for groups of administrators, but can also be useful to provide read-only access to projects without sensitive data.

### Roles

As we saw above, assigning administrators to a project's group lets us manage who can access all the instances of a project. But if there are multiple projects, we would have to manage those admins on every project. Instead we can create a special group called a 'role', which is made up of all the users who should have a particular kind of access to a project, or across projects, and add that 'role' to the project.
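
The nesting of groups and roles described above can be sketched in a few lines. This is only an illustration under stated assumptions: the group names, the `role:` prefix, the user names, and the `expand_members` helper are all hypothetical and do not come from any particular access control system.

```python
def expand_members(group: str, groups: dict[str, set[str]]) -> set[str]:
    """Return every user in a group, following nested groups and roles."""
    users: set[str] = set()
    seen: set[str] = set()
    pending = [group]
    while pending:
        current = pending.pop()
        if current in seen:
            continue  # guard against groups that contain each other
        seen.add(current)
        for member in groups.get(current, set()):
            if member in groups:
                pending.append(member)  # a nested group or role
            else:
                users.add(member)       # a plain user
    return users


# Hypothetical data: one 'role' group is added to two project groups.
groups = {
    "Web": {"role:admins", "alice"},
    "API": {"role:admins"},
    "role:admins": {"bob", "carol"},
}
print(sorted(expand_members("Web", groups)))  # ['alice', 'bob', 'carol']
```

Adding or removing an administrator now means editing only the role, not every project group.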

### Customers
Some projects can become large, and the use of the project can vary greatly between users and groups. For this purpose we have created a 'customer' designator, to be a sort of project super-group.

We know our project 'Web' has a group 'Web', but there may be lots of users who need different access to 'Web', with different kinds of access to data. If we can separate groups of users into 'Customer A', 'Customer B', etc, we can split up how the 'Web' project is used, but inherit the shared properties of the 'Web' project.

We can create a group called 'Web-Customer\_A', and assign users 1,2,3 to this group, and assign users 4,5,6 to 'Web-Customer\_B'.

Now we can not only manage the users who have access to all of the 'Web' project, but we can also control which users have what access to what parts of which customer.

(Why do we call this designator 'Customer'? Because a customer is by definition "someone who is being served by a business or individual", and we treat our individual groups of users as customers of our service.)

In this way, we know there is a group that encompasses all of the 'Web' project instances.

### Example: Running Jenkins
For our project 'Web', we have an instance "Web-Customer\_A", where jobs for 'Customer A' are run for the 'Web' project. To start Jenkins for this instance "Web-Customer\_A", we will use credentials called "Web-Customer\_A-Jenkins". This ensures that this instance of Jenkins can only be controlled by a user or service that works with 'Customer A' on the 'Web' project.

We derived the name of the credential from the project, the customer, and the service name. All of this is known by the time we deploy our customer's project, so an automated process can create the credential as well, and assign it as necessary.
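
As a rough sketch of how an automated process might derive such a name, the function below joins the three parts with hyphens. The `credential_name` helper and its rule of turning spaces into underscores are assumptions made for illustration, chosen only so the result matches the name used in this example.

```python
def credential_name(project: str, customer: str, service: str) -> str:
    """Join project, customer, and service into one credential name."""
    parts = [project, customer, service]
    # Assumption: spaces become underscores, as in "Web-Customer_A-Jenkins".
    return "-".join(part.strip().replace(" ", "_") for part in parts)


print(credential_name("Web", "Customer A", "Jenkins"))  # Web-Customer_A-Jenkins
```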

Empty file added practices/agile/.gitignore
Empty file.
17 changes: 17 additions & 0 deletions practices/ci/continuous_delivery_features.md
@@ -0,0 +1,17 @@
# Mapping | CI-CD

## Input
* credentials
* job
* pipeline
* inventory
* version control files
* other build info
* action (build, stop)
* build child nodes
## Output
* job status
* job result
* output/files generated
* logs

31 changes: 31 additions & 0 deletions practices/ci/continuous_integration_best_practices.md
@@ -0,0 +1,31 @@
# Best Practices | CI-CD

## Best Practices

https://en.wikipedia.org/wiki/Continuous_integration#Best_practices

### Maintain a code repository

### Automate the build

### Make the build self-testing

### Everyone commits to the baseline every day

### Every commit (to baseline) should be built

### Keep the build fast

### Test in a clone of the production environment

### Make it easy to get the latest deliverables

### Everyone can see the results of the latest build

### Automate deployment

## Notes

* Beware large monolithic projects. The bigger they get, the more tests there are, and the harder it is to have a successful build as different parts of it constantly change. Segment your project or split it off if it's getting too difficult to keep the builds clean.

* Use Multi-stage Continuous Integration to better manage large projects. https://en.wikipedia.org/wiki/Multi-stage_continuous_integration
15 changes: 15 additions & 0 deletions practices/declarative/README.md
@@ -0,0 +1,15 @@
# Declarative Programming of Infrastructure

Declarative programming is a programming paradigm used to describe *what* a program must accomplish, rather than *how* to accomplish it. This form of programming is used to build cloud infrastructure, with solutions such as AWS CloudFormation, and HashiCorp Terraform.
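
To make the *what* versus *how* distinction concrete, here is a toy sketch in which the caller only declares the desired state and a planner works out the actions. It is not how CloudFormation or Terraform are implemented; the `plan` function and the resource names are hypothetical.

```python
def plan(desired: dict[str, dict], actual: dict[str, dict]) -> list[tuple[str, str]]:
    """Compare desired and actual state and list the actions needed."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


# The declaration says what should exist, not which steps to run.
desired = {"web": {"size": "small"}, "db": {"size": "large"}}
actual = {"web": {"size": "tiny"}, "cache": {"size": "small"}}
print(plan(desired, actual))
# [('update', 'web'), ('create', 'db'), ('delete', 'cache')]
```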

## Benefits of Declarative Infrastructure
<!-- TODO --> I need content, please edit me!

## Problems of Declarative Infrastructure

### Non-idempotence
Not all resources used by your infrastructure are idempotent. An idempotent operation can be applied multiple times without changing the result of the first application; some resources do not behave this way. Therefore, declarative infrastructure cannot be said to be fully idempotent. You will have to write tests and workarounds to ensure your result is as expected.
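
The difference is easy to see in a small, self-contained sketch. Both helper functions below are hypothetical and operate on a list of configuration lines rather than on real infrastructure.

```python
def ensure_line(lines: list[str], line: str) -> list[str]:
    """Idempotent: the result is the same however often it is applied."""
    return lines if line in lines else lines + [line]


def append_line(lines: list[str], line: str) -> list[str]:
    """Not idempotent: every application changes the result again."""
    return lines + [line]


config = ["port=80"]

once = ensure_line(config, "debug=off")
twice = ensure_line(once, "debug=off")
print(once == twice)  # True

once = append_line(config, "debug=off")
twice = append_line(once, "debug=off")
print(once == twice)  # False
```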

### Making specific changes is hard
You know you want to modify job C that runs on server type A, but you don't want to change job C on server type B. You will probably have to copy the job and make a new one with the change, and make it run only on server type A. Or, even more complicated, add specific logic to job C to behave differently only on server type A. These kinds of divergent changes make your code more complex over time, and thus harder to manage.

9 changes: 9 additions & 0 deletions practices/deployment.md
@@ -0,0 +1,9 @@
# Deployment

# Undeployment

The process of deployment can include building infrastructure, copying files, changing security rules, and opening network paths for application traffic. If these changes are not removed when they are no longer necessary, various problems emerge: wasteful use of resources, security vulnerabilities, and overall confusion over legacy configurations.

Undeployment processes effectively reverse the deployment process. In practice, they can serve as one part of a rollback procedure, and they can help shut down resources that are no longer in use, which reduces waste.
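
One way to picture "reversing the deployment process" is to record an undo action for each deployment step and replay those actions in reverse order. The `Deployment` class and the step descriptions below are hypothetical; real steps would call your own tooling instead of appending to a list.

```python
from typing import Callable


class Deployment:
    """Record an undo action for every deployment step that succeeds."""

    def __init__(self) -> None:
        self._undo: list[Callable[[], None]] = []

    def step(self, do: Callable[[], None], undo: Callable[[], None]) -> None:
        do()
        self._undo.append(undo)  # only remembered once the step has run

    def undeploy(self) -> None:
        # Undo in reverse order, so later steps are removed first.
        while self._undo:
            self._undo.pop()()


log: list[str] = []
deployment = Deployment()
deployment.step(lambda: log.append("create server"), lambda: log.append("destroy server"))
deployment.step(lambda: log.append("open port 443"), lambda: log.append("close port 443"))
deployment.undeploy()
print(log)
# ['create server', 'open port 443', 'close port 443', 'destroy server']
```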


45 changes: 45 additions & 0 deletions practices/deployment_models.md
@@ -0,0 +1,45 @@
# Deployment Models

There are two basic deployment models we will focus on: simple and short, and complicated and long.

## Simple Short Deployment

A simple short deployment model involves taking build artifacts, putting them in some remote place, and performing some steps to make the new build artifacts live. It happens immediately and there are no other considerations.

## Complicated Long Deployment

A complicated long deployment model takes the simple short model and extends it to support large, customer-facing, potentially volatile situations. It may have multiple dependencies and require orchestration of multiple components. It may need to be paused, resumed, or aborted. It may need to have tests run at each phase before continuing.

## Dependency-driven deployment

Dependency-driven deployment involves deploying using a hierarchy of dependencies. Dependencies that have already been deployed successfully may be skipped to save time.
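
A minimal sketch of this idea, assuming Python 3.9 or later for the standard library `graphlib` module. The component names and the `deploy_order` helper are made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def deploy_order(dependencies: dict[str, set[str]], already_deployed: set[str]) -> list[str]:
    """Order components so dependencies come first, skipping deployed ones."""
    order = TopologicalSorter(dependencies).static_order()
    return [name for name in order if name not in already_deployed]


# Each key depends on the components in its set.
dependencies = {
    "app": {"database", "cache"},
    "database": {"network"},
    "cache": {"network"},
}
print(deploy_order(dependencies, already_deployed={"network"}))
```

The relative order of `database` and `cache` is not fixed, since neither depends on the other, but `app` always comes last.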

## Blue/Green deployment

In this model, two environments exist: one with an old version of your app, and one with a new one. By switching traffic from one environment to the other you can fail back quickly in the event of a problem with the deploy or issues experienced after deploy.

Potential pitfalls:

* Long-running transactions in the old environment need to be phased out safely.
* If you use separate databases for these environments, you may need to migrate data from one to the other.
* If you use the same database for these environments, you may have problems in the new environment that affect the old one.
* If you use a database service that supports snapshots, definitely use them.
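
The switch-and-fail-back behavior can be sketched as follows. The `Router` class stands in for whatever actually directs traffic (a load balancer, DNS, and so on), and the `healthy` callback is a placeholder for your own checks; both are assumptions for illustration, and none of the pitfalls listed above are handled here.

```python
from typing import Callable


class Router:
    """Hypothetical traffic switch between two environments."""

    def __init__(self, live: str, idle: str) -> None:
        self.live, self.idle = live, idle

    def switch(self) -> None:
        self.live, self.idle = self.idle, self.live


def cut_over(router: Router, healthy: Callable[[str], bool]) -> bool:
    """Send traffic to the idle environment; fail back if it is unhealthy."""
    router.switch()
    if healthy(router.live):
        return True
    router.switch()  # fail back to the previous environment
    return False


router = Router(live="blue", idle="green")
succeeded = cut_over(router, healthy=lambda env: env != "green")
print(succeeded, router.live)  # False blue
```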

## Blue/Turquoise/Green deployment

This is the same as Blue/Green deployment, but with an extra step to handle moving applications on a shared data store.

When switching apps that are using a shared database, the database may need to be modified to support the new app code, which would break the old application. This normally requires a maintenance outage. To avoid that outage, "Turquoise" is an intermediate environment with the old application, and with a database that supports both the old and new application. Blue is switched to Turquoise, where the Blue app keeps running, but on a Green-compatible database. Then you switch to the Green stack, which is both the new app and the new database.

https://www.digitalocean.com/community/tutorials/how-to-use-blue-green-deployments-to-release-software-safely
http://blog.dixo.net/2015/02/blue-turquoise-green-deployment/
https://www.slideshare.net/mikebrittain/mbrittain-continuous-deploymentalm3public/50

## Canary deployment

Canary deployment is similar to Blue/Green deployment except it is not an all-or-nothing cutover. Instead, a small percentage of users will use the new environment, and the rate increases as confidence grows, until the new environment gets 100% of traffic. The old environment can then be removed.
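
A rough sketch of the gradual ramp, under assumptions of our own: the canary share doubles at each step while the observed error rate stays at or below a threshold, and drops to zero otherwise. The function names, the doubling rule, and the 1% threshold are illustrative choices, not a prescribed policy.

```python
import random


def choose_environment(canary_percent: float, rng: random.Random) -> str:
    """Route one request: roughly canary_percent of them go to 'new'."""
    return "new" if rng.random() * 100 < canary_percent else "old"


def next_percent(current: float, error_rate: float, max_error_rate: float = 0.01) -> float:
    """Double the canary share while errors stay low; otherwise drop to zero."""
    if error_rate > max_error_rate:
        return 0.0
    return min(100.0, max(1.0, current * 2))


percent = 1.0
for observed_error_rate in [0.0, 0.0, 0.002]:
    percent = next_percent(percent, observed_error_rate)
print(percent)  # 8.0
```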

## Continuous deployment

https://www.slideshare.net/mikebrittain/mbrittain-continuous-deploymentalm3public/

Empty file added practices/immutable/.gitignore
Empty file.
13 changes: 13 additions & 0 deletions practices/monitoring_models.md
@@ -0,0 +1,13 @@

# Alerts

Alerts are a form of monitoring that requires immediate action.

# Tickets

Tickets are a form of monitoring that requires eventual action.

# Logging

Logs are a form of monitoring that requires no action. Graphs are a type of logging, or log reporting.
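
The three categories can be expressed as a small decision function. The `Urgency` enum and the two yes/no questions are a hypothetical way of encoding the definitions above.

```python
from enum import Enum


class Urgency(Enum):
    ALERT = "immediate action"
    TICKET = "eventual action"
    LOG = "no action"


def classify(needs_action: bool, can_wait: bool) -> Urgency:
    """Pick a monitoring category from two yes/no questions."""
    if not needs_action:
        return Urgency.LOG
    return Urgency.TICKET if can_wait else Urgency.ALERT


print(classify(needs_action=True, can_wait=False).name)  # ALERT
```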

12 changes: 12 additions & 0 deletions practices/monitoring_reliability.md
@@ -0,0 +1,12 @@
# Monitoring Reliability


Reliability is a function of mean-time-to-failure (MTTF), and mean-time-to-recovery (MTTR).

MTTF is important in determining when a process is not reliable.
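
One common way to combine the two figures is steady-state availability, MTTF / (MTTF + MTTR). The sketch below assumes both values are measured in the same unit (hours here); the function name is our own.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time a service is up: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


# A service that fails every 999 hours on average and takes 1 hour to recover.
print(f"{availability(999, 1):.3%}")  # 99.900%
```

Lengthening MTTF or shortening MTTR both raise the result.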

## MTTR

MTTR is important in determining how quickly you can resolve an issue and limit the impact on a service.

Because humans add latency to recovery, automated systems are useful to reduce the amount of time it takes to resolve an issue. Playbooks, or runbooks, also help reduce MTTR.
5 changes: 5 additions & 0 deletions practices/scaling.md
@@ -0,0 +1,5 @@
# Scaling

Scaling is the process of adjusting to changing load. It is necessary to reduce cost and handle varying amounts of resource use and demand. Not every application and customer will need to be scaled, but some part of any given system usually needs to be scaled at some point in its lifecycle.


9 changes: 9 additions & 0 deletions practices/scaling_models.md
@@ -0,0 +1,9 @@
# Scaling Models

## Static scaling

Static scaling is the process of changing scale once, to a fixed size.

## Dynamic scaling

Dynamic scaling is the process of changing scale whenever necessary to a variable size.
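
A short sketch contrasting the two models. The capacity figure, the minimum and maximum bounds, and both function names are assumptions for illustration.

```python
import math


def static_size(fixed_instances: int) -> int:
    """Static scaling: the size is chosen once and does not follow load."""
    return fixed_instances


def dynamic_size(load: float, capacity_per_instance: float,
                 minimum: int = 1, maximum: int = 10) -> int:
    """Dynamic scaling: the size follows current load, within fixed bounds."""
    needed = math.ceil(load / capacity_per_instance)
    return max(minimum, min(maximum, needed))


print(static_size(4))          # 4
print(dynamic_size(250, 100))  # 3
```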
40 changes: 40 additions & 0 deletions practices/testing_models.md
@@ -0,0 +1,40 @@
# Testing Models

There are multiple models of tests, some of which depend on or come after the others.

## Unit tests

Unit tests are tests of specific individual functionality. A unit test does not take any other part of the system into consideration.

## Integration tests

Integration tests are tests of the integration of multiple components as a whole. This may involve multiple dependencies.

## System tests

System tests are tests to verify an entire system functions as expected. This includes smoke tests, performance tests, regression tests, etc.

## Production tests

These tests verify that live production systems work as expected. If one of these fails, there is a significant unexpected problem which may be customer-facing, which should trigger alerts and potentially trigger a rollback.

## Verification tests

Verification tests, also known as configuration tests, are intended to validate expectations. Instead of verifying if something is working, they simply verify that something looks the way it should. If this test fails, it means something unexpected has occurred, which may need to trigger a redeploy or rollback. Examples of failure would include file corruption, security penetrations, mistakes in configuration, unexpected results of automation processes, etc.
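
As one example of checking that something "looks the way it should", a verification test can compare a file's checksum against a value recorded earlier. The sketch works on in-memory bytes to stay self-contained; the sample content and the idea of recording the checksum at deploy time are assumptions.

```python
import hashlib


def matches_expected(content: bytes, expected_sha256: str) -> bool:
    """Verify that content still has the checksum recorded earlier."""
    return hashlib.sha256(content).hexdigest() == expected_sha256


config = b"port=443\n"
expected = hashlib.sha256(config).hexdigest()  # assumed to be recorded at deploy time

print(matches_expected(config, expected))       # True
print(matches_expected(b"port=80\n", expected)) # False
```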

## Stress tests

Stress tests are used to find the limits of a component of a system. They are basically intended to break something, and record what it took to break it, and what the result was.

## Canary tests

Canary tests are used to wait a period of time in an attempt to find unexpected problems before continuing a process. On failure, they should immediately revert any changes, and fail whatever process depends on them. Canary tests may use sampling of expected metrics (such as error rates, traffic rates, resource use, load) to trigger a failure when something unexpected occurs.
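
The waiting-and-sampling behavior might look like the sketch below. The `sample_error_rate` callback stands in for a query against your own metrics system, and reverting changes on failure is left to the caller; both are assumptions.

```python
import time
from typing import Callable


def canary_test(sample_error_rate: Callable[[], float], max_error_rate: float,
                samples: int, interval_seconds: float = 0.0) -> bool:
    """Sample a metric over a period; fail as soon as it exceeds the limit."""
    for _ in range(samples):
        if sample_error_rate() > max_error_rate:
            return False  # the caller should revert and fail the process
        time.sleep(interval_seconds)
    return True


readings = iter([0.0, 0.01, 0.2])
print(canary_test(lambda: next(readings), max_error_rate=0.05, samples=3))  # False
```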

## Issue-driven tests

Issue-driven tests are tests created from reported issues. As each issue is documented and resolved, a test case is created to check for that issue, so that one will know immediately if it re-occurs.

## Chaos tests

Chaos tests are used to identify unexpected bugs and fix them before they become a larger problem.
