Update documentatio for Frinx Machine 1.9
Jozef Volak committed Feb 15, 2022
1 parent a52221b commit 242fe70
Showing 9 changed files with 293 additions and 101 deletions.
136 changes: 54 additions & 82 deletions CHANGELOG.md
@@ -1,117 +1,89 @@
# Frinx Machine 1.8 RELEASE NOTE:
# Frinx Machine 1.9 RELEASE NOTE:
-----------------
## Frinx Machine
* Credentials and certificates via docker secret

* KrakenD custom certs via docker secrets

* Multinode deployment, multiple placement methods can be used

* Uniconfig and Traefik settings via docker config

* Authorization and authentication with Azure AD (AAA)

* Added high-performance resource limits

<br>

## Updated Services
### Uniconfig 4.2.9:

* RPC install-multiple-nodes

* Fix issues and improve stability

* transactions, load balancing (without multizone support)

### Uniconfig
* Leaf-ref validation

### Conductor:
* Introduction of transaction idle-timeout

* External Storage with Postgres DB

* Exposed in workflows proxy and also in KrakenD

* Removed AAA

### Micros:
* Bug fixing

* Add transactions and load-balancing to CLI, NETCONF, UNICONFIG workers

* Add worker for installing multiple devices in parallel

* Add base workflows to pypi package


### Frinx Frontend:
<br>

* Editor for workflow JSON's in UI

* Used new KrakenD conductor endpoints

### Monitoring

### Sample Topology:
* InfluxDB instead of Prometheus

* Simulation of new device types (Junos, iosxr 653, iosxr 663)

* new simulated CLI cisco and Junos

* Telegraf instead of node-exported and cadvisor

### Demo Workflows:
### Conductor

* new workflows

* device identification (name, version, …) based on IP address

* LLDP one device

* Sanitize log4j vulnerability

### KrakenD:
### Workflow-proxy

* removed obsolete endpoints

* updated queries for searching

* fixed query parsing for Uniconfig RPC’s

* Fix RBAC issues

### Workflow Proxy:
* OpenAPI with AAA

* updated endpoint for searching

* add policy headers

* swagger ui

* Event sanitize

### Monitoring services:
### Inventory

* Monitoring services in global mode

* Dashboards prepared for multi-node deployment

* Replaced host id by hostname

* Transaction id to uniconfig API communication

### Device Inventory:
* Remove snapshots

###
* Uniconfig zone tenant defined via env variable

Rest-api changes:
=================
### Frinx-Frontend

### Removed krakend endpoints:
* Bug fixing

* **POST** - /api/uniflow/executions

* **GET** - /api/uniflow/schedule/{name}/{b}

* **POST** - /api/uniconfig/rests/operations

* **PUT, GET, PATCH, DELETE, POST** - /api/uniconfig/rests/data


### Changed KrakenD endpoints:
### KrakenD

<table data-layout="full-width" data-local-id="b5b465ff-93a0-439f-9c10-c17c09c1dcc2" class="confluenceTable"><colgroup><col style="width: 231.0px;"><col style="width: 734.0px;"><col style="width: 111.0px;"><col style="width: 724.0px;"></colgroup><tbody><tr><th class="confluenceTh"><p><strong>METHOD</strong></p></th><th class="confluenceTh"><p><strong>OLD ENDPOINT</strong></p></th><th class="confluenceTh"><p><strong>METHOD</strong></p></th><th class="confluenceTh"><p><strong>NEW ENDPOINT</strong></p></th></tr><tr><td class="confluenceTd"><p>PUT</p></td><td class="confluenceTd"><p>/api/uniflow/metadata</p></td><td class="confluenceTd"><p>PUT</p></td><td class="confluenceTd"><p>/api/uniflow/metadata/workflow</p></td></tr><tr><td class="confluenceTd"><p>GET</p></td><td class="confluenceTd"><p>/api/uniflow/metadata/workflow/:name/:version</p></td><td class="confluenceTd"><p>GET</p></td><td class="confluenceTd"><p>/api/uniflow/metadata/workflow/{name}?version=</p></td></tr><tr><td class="confluenceTd"><p>DELETE</p></td><td class="confluenceTd"><p>/api/uniflow/bulk/terminate</p></td><td class="confluenceTd"><p>DELETE</p></td><td class="confluenceTd"><p>/api/uniflow/workflow/bulk/terminate</p></td></tr><tr><td class="confluenceTd"><p>PUT</p></td><td class="confluenceTd"><p>/api/uniflow/bulk/pause</p></td><td class="confluenceTd"><p>PUT</p></td><td class="confluenceTd"><p>/api/uniflow/workflow/bulk/pause</p></td></tr><tr><td class="confluenceTd"><p>PUT</p></td><td class="confluenceTd"><p>/api/uniflow/bulk/resume</p></td><td class="confluenceTd"><p>PUT</p></td><td class="confluenceTd"><p>/api/uniflow/workflow/bulk/resume</p></td></tr><tr><td class="confluenceTd"><p>POST</p></td><td class="confluenceTd"><p>/api/uniflow/bulk/retry</p></td><td class="confluenceTd"><p>POST</p></td><td class="confluenceTd"><p>/api/uniflow/workflow/bulk/retry</p></td></tr><tr><td class="confluenceTd"><p>POST</p></td><td class="confluenceTd"><p>/api/uniflow/bulk/restart</p></td><td 
class="confluenceTd"><p>POST</p></td><td class="confluenceTd"><p>/api/uniflow/workflow/bulk/restart</p></td></tr></tbody></table>
* KrakenD Azure plugin with role claims to the header

### Changed KrakenD query inputs
* KrakenD Azure plugin with optional group claims to the header

<table data-layout="full-width" data-local-id="5c679ae8-0c97-4b4c-a589-fd8917db2b87" class="confluenceTable"><colgroup><col style="width: 605.0px;"><col style="width: 577.0px;"><col style="width: 618.0px;"></colgroup><tbody><tr><th class="confluenceTh"><p><strong>ENDPOINT</strong></p></th><th class="confluenceTh"><p><strong>OLD QUERY</strong></p></th><th class="confluenceTh"><p><strong>NEW QUERY</strong></p></th></tr><tr><td class="confluenceTd"><p>/api/uniflow/hierarchical</p></td><td class="confluenceTd"><p>?freeText=(workflowId:)AND(status:)&amp;start=&amp;size=</p></td><td class="confluenceTd"><p>?workflowId=&amp;status=&amp;start=&amp;size=&amp;order=</p><p>order inputs : DESC (default), ASC</p></td></tr><tr><td class="confluenceTd"><p>/api/uniflow/executions</p></td><td class="confluenceTd"><p>?q=&amp;h=&amp;freeText=(workflowId:)AND(status:)&amp;start=&amp;size=</p></td><td class="confluenceTd"><p>?q=&amp;h=&amp;workflowId=&amp;status=&amp;start=&amp;size=&amp;order=</p><p>order inputs : DESC (default), ASC</p></td></tr><tr><td class="confluenceTd"><p>/workflow/{a}</p></td><td class="confluenceTd"><p>?*</p></td><td class="confluenceTd"><p>?includeTask=</p></td></tr><tr><td class="confluenceTd"><p>/metadata/taskdefs/{name}</p></td><td class="confluenceTd"><p>?*</p></td><td class="confluenceTd"><p>?archiveWorfklow=</p></td></tr><tr><td class="confluenceTd"><p>/metadata/workflow/{name}</p></td><td class="confluenceTd"><p>?*</p></td><td class="confluenceTd"><p>?version=</p></td></tr></tbody></table>
* Validate certs when starting a container

### Removed workflow-proxy endpoints
### Resource manager

* Add desired value for vlan strategy

* Rewrite and refactor ipv4 strategy

* **GET** - /schedule/metadata/workflow

* Update unique-id strategy

### Changed workflow-proxy endpoints
## Rest-api changes

<table data-layout="full-width" data-local-id="d7a2dfd7-0839-48c9-8d57-aa31f3e595fe" class="confluenceTable"><colgroup><col style="width: 231.0px;"><col style="width: 734.0px;"><col style="width: 111.0px;"><col style="width: 724.0px;"></colgroup><tbody><tr><th class="confluenceTh"><p><strong>METHOD</strong></p></th><th class="confluenceTh"><p><strong>OLD ENDPOINT</strong></p></th><th class="confluenceTh"><p><strong>METHOD</strong></p></th><th class="confluenceTh"><p><strong>NEW ENDPOINT</strong></p></th></tr><tr><td class="confluenceTd"><p>GET</p></td><td class="confluenceTd"><p>/shedule/?</p></td><td class="confluenceTd"><p>GET</p></td><td class="confluenceTd"><p>/schedule</p></td></tr><tr><td class="confluenceTd"><p>GET</p></td><td class="confluenceTd"><p>/metadata/workflow/:name/:version</p></td><td class="confluenceTd"><p>GET</p></td><td class="confluenceTd"><p>/metadata/workflow/{name}?version=</p></td></tr><tr><td class="confluenceTd"><p>PUT</p></td><td class="confluenceTd"><p>/metadata</p></td><td class="confluenceTd"><p>PUT</p></td><td class="confluenceTd"><p>/metadata/workflow</p></td></tr><tr><td class="confluenceTd"><p>DELETE</p></td><td class="confluenceTd"><p>/bulk/terminate</p></td><td class="confluenceTd"><p>DELETE</p></td><td class="confluenceTd"><p>/workflow/bulk/terminate</p></td></tr><tr><td class="confluenceTd"><p>PUT</p></td><td class="confluenceTd"><p>/bulk/pause</p></td><td class="confluenceTd"><p>PUT</p></td><td class="confluenceTd"><p>/workflow/bulk/pause</p></td></tr><tr><td class="confluenceTd"><p>PUT</p></td><td class="confluenceTd"><p>/bulk/resume</p></td><td class="confluenceTd"><p>PUT</p></td><td class="confluenceTd"><p>/workflow/bulk/resume</p></td></tr><tr><td class="confluenceTd"><p>POST</p></td><td class="confluenceTd"><p>/bulk/retry</p></td><td class="confluenceTd"><p>POST</p></td><td class="confluenceTd"><p>/workflow/bulk/retry</p></td></tr><tr><td class="confluenceTd"><p>POST</p></td><td 
class="confluenceTd"><p>/bulk/restart</p></td><td class="confluenceTd"><p>POST</p></td><td class="confluenceTd"><p>/workflow/bulk/restart</p></td></tr></tbody></table>
### New workflow-proxy endpoints

### Changed workflow-proxy query inputs
* **GET** - /oauth2-redirect.html : Swagger UI redirect url

* **POST** - /api/uniflow/docs/token : CORS fixing token change url

### Removed workflow-proxy endpoints

<table data-layout="full-width" data-local-id="1ef74ee3-665e-4f01-8ce4-2a4b45a3f0c4" class="confluenceTable"><colgroup><col style="width: 605.0px;"><col style="width: 577.0px;"><col style="width: 618.0px;"></colgroup><tbody><tr><th class="confluenceTh"><p><strong>ENDPOINT</strong></p></th><th class="confluenceTh"><p><strong>OLD QUERY</strong></p></th><th class="confluenceTh"><p><strong>NEW QUERY</strong></p></th></tr><tr><td class="confluenceTd"><p>/hierarchical</p></td><td class="confluenceTd"><p>?freeText=(workflowId:)AND(status:)&amp;start=&amp;size=</p></td><td class="confluenceTd"><p>?workflowId=&amp;status=&amp;start=&amp;size=</p></td></tr><tr><td class="confluenceTd"><p>/executions</p></td><td class="confluenceTd"><p>?q=&amp;h=&amp;freeText=(workflowId:)AND(status:)&amp;start=&amp;size=</p></td><td class="confluenceTd"><p>?q=&amp;h=&amp;workflowId=&amp;status=&amp;start=&amp;size=</p></td></tr></tbody></table>
* **GET** - /api/uniflow/workflow/{a}
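
The "Changed KrakenD query inputs" table above replaces the `freeText=(workflowId:)AND(status:)` expression with separate `workflowId`, `status` and `order` parameters. A minimal sketch of building the new query string for `/api/uniflow/hierarchical` follows. The helper name `hierarchical_query` and the `start`/`size` defaults are assumptions made for illustration; only the parameter names and the `DESC` default come from the table. Values are not URL-encoded here.

```sh
# Hypothetical helper: print the new-style query path for /api/uniflow/hierarchical.
# Arguments: <workflowId> <status> [start] [size] [order]
# The start=0 and size=10 defaults are assumptions; order defaults to DESC as in the table.
hierarchical_query() {
    workflow_id="$1"
    status="$2"
    start="${3:-0}"
    size="${4:-10}"
    order="${5:-DESC}"
    printf '/api/uniflow/hierarchical?workflowId=%s&status=%s&start=%s&size=%s&order=%s\n' \
        "$workflow_id" "$status" "$start" "$size" "$order"
}
```

Calling `hierarchical_query my_workflow RUNNING` prints a path that can be appended to the KrakenD host when migrating a client from the old `freeText` form.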
123 changes: 104 additions & 19 deletions README.md
@@ -15,17 +15,20 @@ Minimal hardware requirements (See [resource limitation](#resource-limitation))
Production: (default)

- 24GB RAM
- 4x CPU
- 8x CPU

Development:

- 16GB RAM
- 4x CPU

It is recommended to deploy with 30GB of disk space or more depending on your data retention policies and deployment scale.
It is recommended to deploy with 40GB of disk space or more depending on your data retention policies and deployment scale.

To deploy an FM swarm cluster you need at least one machine with Ubuntu 18.04/20.04 installed.

## Frinx Machine architecture
Architecture overview [here](docs/fm_architecture.md).

# How to install and run FRINX Machine
You can deploy the FM either locally with all services running on a single node, or you can distribute UniFlow and UniConfig instances across multiple nodes. UniFlow is always running on the docker swarm manager node.

@@ -77,12 +80,52 @@ $ ./install.sh \
--http-proxy "ip:port" \
--https-proxy "ip:port" \
--no-proxy "ip:port,ip:port,..."

# or use env variables if they are configured
$ ./install.sh \
--proxy-conf "${USER}/.docker/config.json" \
--http-proxy "${http_proxy}" \
--https-proxy "${https_proxy}" \
--no-proxy "${no_proxy}"

```
To disable the proxy, remove config.json and clear the content of the UC_PROXY_* variables in the .env file. For example: UC_PROXY_HTTP_ENV="".

For more info see: https://docs.docker.com/network/proxy/
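
The proxy reset described above can be sketched as a small shell function. This is an illustration, not part of the FM scripts: the name `disable_fm_proxy` and its arguments are hypothetical, and it assumes each `UC_PROXY_*` setting sits on its own line in the `.env` file.

```sh
# Hypothetical helper: blank every UC_PROXY_* assignment in an .env file
# and, if a second argument is given, remove that docker proxy config file.
# Arguments: <env_file> [proxy_conf]
disable_fm_proxy() {
    env_file="$1"
    proxy_conf="${2:-}"
    # Rewrite e.g. UC_PROXY_HTTP_ENV="ip:port" to UC_PROXY_HTTP_ENV=""
    sed -i.bak -E 's/^(UC_PROXY_[A-Za-z0-9_]*)=.*/\1=""/' "$env_file"
    rm -f "${env_file}.bak"
    if [ -n "$proxy_conf" ]; then
        rm -f "$proxy_conf"
    fi
}
```

For example, `disable_fm_proxy .env "$HOME/.docker/config.json"` would clear the variables and delete the config file. Check that the config file holds nothing else you need before removing it.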
</br></br>

### Enable Azure AD authorization

Frinx Machine supports authentication and authorization via Azure AD.
For configuration details, see [Azure AD configuration](docs/azure_ad.md).

Use the `azure_ad.sh` script for configuration.

You need to define:
- tenant name: organization name (single tenant, e.g. `yourAdName.onmicrosoft.com`), or `common` for multi-tenant
- tenant id: GUID of the AD tenant, e.g. aaaaaaaa_bbbb_cccc_dddd_eeeeeeeeeeee

- client id: GUID of the application for the KrakenD plugin (see [KrakenD Azure Plugin docs](https://github.com/FRINXio/krakend-azure-plugin))
- client secret: application secret
- redirect uri: IP/DNS of the server from which frinx-frontend is accessed

```sh
# print help for configuration
$ ./azure_ad.sh configure -h
# example for multi-tenant
$ ./azure_ad.sh configure --azure_enable \
--tenant_name 'common' \
--tenant_id 'aaaaaaaa_bbbb_cccc_dddd_eeeeeeeeeeee' \
--client_id 'aaaaaaaa_bbbb_cccc_dddd_eeeeeeeeeeee' \
--client_secret '79A4Q~RL5pELYji-KU58UfSeGoRVGco8f20~K' \
--redirect_url 'localhost'

# validate configuration environment variables
# exit code 0 - variables are correct, 1 - variables are misconfigured (an error message is printed)
$ ./azure_ad.sh validate; echo $?
```
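
The `.env` description of AZURE_TENANT_ID says the tenant id is written with '-' replaced by '_', which matches the example values above. A minimal sketch of that conversion, assuming the GUID was copied from the Azure portal in its usual hyphenated form (the function name `guid_to_underscores` is hypothetical):

```sh
# Hypothetical helper: convert a hyphenated GUID to the underscore form
# used in the examples above.
guid_to_underscores() {
    printf '%s\n' "$1" | tr '-' '_'
}
```

The result can be passed to `--tenant_id` or `--client_id`, for example `--tenant_id "$(guid_to_underscores "$TENANT_GUID")"`, where `TENANT_GUID` is a placeholder variable.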
<br>

### Install/Update docker secrets (KrakenD HTTPS/TLS)
During installation, docker secrets are created and are used for establishing HTTPS/TLS connections. These secrets contain private and public keys and are generated from files in the ./config/certificates folder.

Expand Down Expand Up @@ -182,33 +225,61 @@ Before the Frinx Machine is start, is necessary to generate unique configuration
For generating these files use `generate_uc_compose.sh`.

You need to define:
- uniconfig zone name: must be unique name
- swarm node-id: where will be deployed (use docker node ls)
- uniconfig zone name: must be a unique name - <service_name>
- folder path: where the composefiles for multinode deployment are stored
- instances: how many uniconfig instances will be started per zone (redundancy)

- swarm node placement method: swarm node identifier, select one of them (use `docker node ls` for info)
* node id : unique ID of the node
* node hostname : unique node hostname
* node role : manager/worker1/worker2 ...
* node label : node with label zone=<NODE_LABEL>

The script checks the input placement values against the node readiness status. By default, files are not generated when
nodes are not ready. Generation of the composefiles can be forced with the `--force` flag.
The default folder path is `./composefiles/uniconfig`, but it can be different (outside the FM repo folder).

```sh
$ ./generate_uc_compose.sh -s <service_name> -n <node_id> -f <path_to_folder> -i <instances>
# Check all nodes in cluster (from manager node)
$ docker node ls

m4lyotjrwc059u76dkdksyfsp * frinx-manager Ready Active Leader 20.10.5
vrybz35tsmtp23gd9byoimq1z frinx-worker1 Ready Active 20.10.5
li5msj11609ss58n7mafa9cbt frinx-worker2 Ready Active 20.10.5

# Check settings of nodes in cluster
docker node inspect <HOSTNAME> --format "{{.Description.Hostname}} {{.ID}} {{.Spec.Labels.zone}} {{.Spec.Role}}"

<Hostname> <ID> <Label> <Role>
frinx-manager m4lyotjrwc059u76dkdksyfsp uniflow manager

# Generate composefiles using one of the placement methods
$ ./generate_uc_compose.sh -s <service_name> -f <path_to_folder> -i <instances> --hostname <Hostname>
$ ./generate_uc_compose.sh -s <service_name> -f <path_to_folder> -i <instances> --node-id <ID>
$ ./generate_uc_compose.sh -s <service_name> -f <path_to_folder> -i <instances> --label <Label>
$ ./generate_uc_compose.sh -s <service_name> -f <path_to_folder> -i <instances> --role <Role>

# Label swarm node with zone label
docker node update <NODE_HOSTNAME> --label-add zone=<UNIQUE_LABEL>

# Force generating of composefiles, e.g.
$ ./generate_uc_compose.sh -s <service_name> -f <path_to_folder> -i <instances> --role <Role> --force
```
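
Because the generator skips nodes that are not ready, it can help to list them before running it. The sketch below is not how `generate_uc_compose.sh` performs its check; it is a hypothetical pre-check that reads `<hostname> <status>` pairs, as printed by `docker node ls --format '{{.Hostname}} {{.Status}}'`, and prints the hostnames whose status is not `Ready`.

```sh
# Hypothetical pre-check: print hostnames of swarm nodes that are not Ready.
# Reads "<hostname> <status>" pairs on stdin.
not_ready_nodes() {
    awk '$2 != "Ready" { print $1 }'
}

# Usage on the manager node:
#   docker node ls --format '{{.Hostname}} {{.Status}}' | not_ready_nodes
```

An empty result means every node reports `Ready`, so the `--force` flag should not be needed.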


<br>

### Upload configuration files on worker node
### Preparing worker/slave node for multi-node deployment

To deploy UniConfig to a worker node, distribute the default UniConfig configuration to `/opt` directory on the worker node (SCP used as an example).
To deploy UniConfig to a worker node, create the cache folder and remove old UniConfig volumes.

From the worker node:
```sh
$ sudo install -o $USER -g $USER -m 755 -d /opt/frinx
# create cache volume for uniconfig-controller
mkdir -p /opt/frinx/<SERVICE_NAME>/uniconfig-controller/cache/

# if an older FM was started on this node, remove docker persistent volumes
$ docker volume prune -f
```

From the manager node:
```sh
# path_to_folder contain generated files from previous steps
$ scp -r <path_to_folder>/opt/frinx/* username@host:/opt/frinx
docker volume prune --filter label=fm
```
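
When a worker node hosts several UniConfig zones, the cache folder step can be repeated per service name. A minimal sketch, assuming the folder layout shown above; the function name `prepare_uc_cache` and the `BASE_DIR` override are hypothetical and exist only so the loop can be tried outside `/opt/frinx`.

```sh
# Hypothetical helper: create the uniconfig-controller cache folder
# for every service (zone) name given as an argument.
# BASE_DIR defaults to /opt/frinx; override it for a dry run.
prepare_uc_cache() {
    base_dir="${BASE_DIR:-/opt/frinx}"
    for service in "$@"; do
        mkdir -p "${base_dir}/${service}/uniconfig-controller/cache"
    done
}
```

For example, `prepare_uc_cache zone1 zone2` creates both cache folders in one call, provided the user can write to the base directory.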
</br>

@@ -229,6 +300,20 @@ NOTE: The deployment might take a while as the worker node needs to download all

## Preparing Environment
The FRINX-Machine repository contains an **env.template** (used for creating .env) and a **.env** file in which the default FM configuration settings are stored. In the .env file, the settings are divided into these groups:
* **Common settings**
> * JWT_PRODUCTION: enable/disable Azure AD authorization
* **Azure AD settings** (See [AzureAD Installation manual](docs/azure_ad.md) )
> * AZURE_LOGIN_URL : url for logging to Azure AD, default: https://login.microsoftonline.com
> * AZURE_TENANT_NAME : tenant domain name
> * AZURE_TENANT_ID : tenant id where '-' are replaced with '_'
> * AZURE_CLIENT_ID : App (Client) ID
> * AZURE_CLIENT_SECRET : App (Client) secret
> * REDIRECT_URI : IP/DNS of server without scheme (http(s)://) !!!

* **RBAC settings**
> * ADMIN_GROUP : super admin group with all permissions, default: network-admin
* **Temporary settings** - Created by FM scripts, **do not change them**
> * UC_PROXY_* : use docker proxy in Uniconfig Service ( See [Installation](#installation) )
@@ -297,11 +382,11 @@ $ ./config/docker-security/bench_security.sh
### Monitoring services

Frinx Machine collects logs and metrics/statistics from services.
* Metrics: Prometheus
* Metrics: InfluxDB
* Logs: Loki
* Node monitoring: node-exporter
* Swarm monitoring: google/cadvisor
* Visualization: Grafana (url 127.0.0.1:3000)
* Node monitoring: Telegraf
* Swarm monitoring: Telegraf
* Visualization: Grafana (url 127.0.0.1:3000, user: admin, password: admin)

NOTE: The monitoring system consumes a lot of disk space. For longer monitoring periods, make sure enough free disk space is available.
30 GB or more is optimal.
Binary file added docs/assets/azure_api_permissions.png
Binary file added docs/assets/azure_client_secret.png
Binary file added docs/assets/azure_tenant.png
Binary file added docs/assets/azure_token_configuration.png
4 changes: 4 additions & 0 deletions docs/assets/fm_architecture.svg
