khodayard edited this page Apr 15, 2020 · 17 revisions

Start here

I couldn't find a good, complete guide on the web for installing a working (production-ready) OpenShift (OKD) 3.11 cluster with all of its requirements and dynamic storage provisioning. So this is a guide to installing an OKD 3.11 cluster with the following spec:

  1. All Nodes with CentOS 7.7.1908 (Core)

  2. 20 Nodes

    • 1 Edge Server (DNS, Chrony/NTP, Router, Ansible Management) (33.33.33.33) (hostname: dns##.okd.lcl)
    • 2 External Load Balancers (10.1.1.35 (VIP), 10.1.1.36, 10.1.1.37) (hostname: xlb##.okd.lcl) (VIP name: xlb.okd.lcl)
    • 2 Internal Load Balancers (10.1.1.30 (VIP), 10.1.1.31, 10.1.1.32) (hostname: nlb##.okd.lcl) (VIP name: nlb.okd.lcl)
    • 3 Master Nodes (10.1.1.21, 10.1.1.22, 10.1.1.23) (hostname: mst##.okd.lcl)
    • 3 Worker Nodes (10.1.1.41, 10.1.1.42, 10.1.1.43) (hostname: wrk##.okd.lcl)
    • 3 Infra Nodes (10.1.1.51, 10.1.1.52, 10.1.1.53) (hostname: inf##.okd.lcl)
    • 3 GlusterFS Storage Nodes (10.1.1.61, 10.1.1.62, 10.1.1.63) (hostname: glf##.okd.lcl)
    • 3 GlusterFS Registry Storage Nodes (10.1.1.71, 10.1.1.72, 10.1.1.73) (hostname: glr##.okd.lcl)
  3. Only the edge server has access to the internet; all the other servers are routed through it, as it is set as the DNS server, Chrony server, and network gateway for the rest.

  4. As may be obvious, we have two networks:

    • One which can access the internet (public), with the IP 33.33.33.33, provided by the IaaS provider. We only have one IP address in this range, and it is set on the router.
    • One which is private and fully under our control, with the IP range 10.1.1.0/24.
  5. The main domain name for this cluster is okd.lcl, and the OKD router will work with *.apps.okd.lcl, which is set in the DNS zone.
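For illustration, the okd.lcl zone with that wildcard could be sketched as a BIND zone file like the one below. The serial, the TTLs, the edge server's internal IP (10.1.1.1), and pointing *.apps at the external load-balancer VIP are assumptions for the sketch, not values taken from this cluster:

```
; okd.lcl zone -- illustrative sketch only
$TTL 3600
@       IN  SOA dns01.okd.lcl. admin.okd.lcl. (
            2020041501 ; serial (placeholder)
            3600 900 604800 86400 )
        IN  NS  dns01.okd.lcl.
dns01   IN  A   10.1.1.1      ; edge server (assumed internal IP)
xlb     IN  A   10.1.1.35     ; external LB VIP
nlb     IN  A   10.1.1.30     ; internal LB VIP
mst01   IN  A   10.1.1.21
mst02   IN  A   10.1.1.22
mst03   IN  A   10.1.1.23
*.apps  IN  A   10.1.1.35    ; wildcard for OKD routes (assumed to target the external VIP)
```

The remaining hosts (wrk##, inf##, glf##, glr##, and the individual LB nodes) would get A records in the same pattern.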

Documents Source

I've mainly used the OKD documentation here: https://docs.okd.io/3.11/welcome/index.html and especially here: https://docs.okd.io/3.11/install/index.html. These docs are very good, but some aspects can be ambiguous for newbies like me.

Architecture

Comprehensive documentation about the OKD architecture can be found here: https://docs.okd.io/3.11/architecture/index.html. A quick intro to my implementation could look like this:

  • The control plane will consist of three nodes (mst##.okd.lcl); these nodes will host all APIs, controllers, and etcd. I've decided to host the etcd instances alongside the other master components, as I was going to have enough resources available on the master nodes.
  • The SDN will be initiated from the master nodes but will include all nodes; Open vSwitch will handle the SDN.
  • I've separated the infra nodes (inf##.okd.lcl) from the compute nodes (wrk##.okd.lcl): the infra nodes will host OKD's own load, such as the router and registry, while the compute nodes will host the clients' load.
  • I've considered two GlusterFS clusters, each with 3 nodes:
    • glf##.okd.lcl will host clients' storage and provide "dynamic storage provisioning" for them.

    • glr##.okd.lcl will host docker registry storage.

    • The OKD docs recommend separating these two clusters, which seems reasonable given their different tasks, loads, and purposes.

    • The OKD docs also recommend using gluster-block only for metrics and logging, and it does not bring much of a performance change, so I've decided not to use it.

    • The same applies to Gluster S3 storage: as it's in tech preview, I've skipped it.

    • Such an implementation could hardly be done with fewer resources than this, but you can add resources as much as you want. Of course this cluster is scalable, but even before installation you can add HDDs to the GlusterFS nodes to provide more storage for your registry or your clients' storage, and so on.

    • Independently of this openshift-ansible installation, I've made my load balancers highly available and created a separate load-balancer cluster for external access to this cluster using HAProxy and Keepalived. I've talked about it in detail here.
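Putting the architecture above together, the openshift-ansible inventory it implies might look roughly like the sketch below. The group names and the openshift_node_group_name values are the standard openshift-ansible 3.11 ones, but the numbered hostnames (assuming ## runs 01–03), the /dev/sdb device paths, and the variable values are assumptions for the sketch, not my exact inventory:

```
# Illustrative sketch of an openshift-ansible 3.11 inventory for the layout above
[OSEv3:children]
masters
etcd
nodes
glusterfs
glusterfs_registry

[OSEv3:vars]
openshift_deployment_type=origin
openshift_release=v3.11
openshift_master_cluster_method=native
# VIP names from the DNS zone; HA itself is handled by the external
# haproxy/keepalived cluster mentioned above
openshift_master_cluster_hostname=nlb.okd.lcl
openshift_master_cluster_public_hostname=xlb.okd.lcl
openshift_master_default_subdomain=apps.okd.lcl

[masters]
mst0[1:3].okd.lcl

[etcd]
mst0[1:3].okd.lcl

[nodes]
mst0[1:3].okd.lcl openshift_node_group_name='node-config-master'
inf0[1:3].okd.lcl openshift_node_group_name='node-config-infra'
wrk0[1:3].okd.lcl openshift_node_group_name='node-config-compute'
glf0[1:3].okd.lcl openshift_node_group_name='node-config-compute'
glr0[1:3].okd.lcl openshift_node_group_name='node-config-compute'

# raw block devices handed to heketi for dynamic provisioning (assumed paths)
[glusterfs]
glf0[1:3].okd.lcl glusterfs_devices='[ "/dev/sdb" ]'

[glusterfs_registry]
glr0[1:3].okd.lcl glusterfs_devices='[ "/dev/sdb" ]'
```

Keeping [glusterfs] and [glusterfs_registry] as separate groups is what makes openshift-ansible deploy the two independent GlusterFS clusters described above.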