fee-flow

This project is a learning course on front-end monitoring that I put together; 370 people have enrolled in it.

Introduction to fee

fee (灯塔, "Lighthouse") is a front-end monitoring system. It is the main front-end monitoring system at 贝壳找房 (Beike), serving over a hundred product lines across the company. Its strengths: a simple, lightweight architecture that supports private deployment. It collects front-end device, system, and environment information; supports configurable alerting on page JS errors, resource errors, and performance metrics; and uses the reported error details to help you locate and fix problems quickly.

fee is an excellent open-source front-end monitoring project, but because it involves back-end components it has a certain barrier for front-end developers: following only the official docs and code, you may find it hard to get the project running. This guide organizes the setup so the project runs smoothly.

Architecture diagram

[architecture diagram image]

1. nginx

  1. Configure the log format (log_format)
  2. Configure the tracking (dig) endpoint and set the access_log path
  3. In the tracking endpoint, use nginx's built-in empty_gif module to serve an empty GIF as the static resource

In addition, you need to:

  1. Create a folder to hold the nginx logs

    mkdir -p /var/log/nginx/ferms
  2. Make sure the user that runs nginx has read/write permission on this folder

    -rw-r--r-- 1 nginx nginx 14201 Sep  1 14:35 202009-01-14-35.log
  3. To keep the dig logs from filling the disk, clean them up periodically, e.g. every 1 to 3 days (see the cleanup sketch below)
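
A minimal cleanup sketch (it assumes the /var/log/nginx/ferms path above; the one-day retention is a placeholder to adjust to your own policy):

# delete dig logs older than one day; schedule this via cron (e.g. crontab -e) to run daily
find /var/log/nginx/ferms/ -name '*.log' -mtime +1 -delete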

A sample nginx configuration covering the points above (note that log_format must live inside the http block):

http {
    log_format ferms '$time_iso8601     -       -       $remote_addr    $http_host      $status $request_time   $request_length $body_bytes_sent        15d04347-be16-b9ab-0029-24e4b6645950    9689c3ea-5155-2df7-a719-e90d2dedeb2c 937ba755-116a-18e6-0735-312cba23b00c        -       -       $request_uri    $http_user_agent        -       sample=-&_UC_agent=-&device_id=-&-      -   -';

    server {
        listen 80;
        server_name ferms.xxx.com;
        return 301 https://ferms.xxx.com$request_uri;
    }
    server {
        listen 443 ssl;
        server_name ferms.xxx.com;

        ssl_certificate      /etc/nginx/ssl/xxx.crt;
        ssl_certificate_key  /etc/nginx/ssl/xxx.key;

        if ($time_iso8601 ~ "^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})") {
            set $year $1;
            set $month $2;
            set $day $3;
            set $hour $4;
            set $minute $5;
        }
        access_log /var/log/nginx/ferms/$year$month-$day-$hour-$minute.log ferms;
        location = /dig.gif {
            empty_gif;
        }
    }
}
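
After changing the configuration, it is worth validating and reloading nginx (standard nginx commands; adjust if your distribution wraps nginx in a service manager):

# check configuration syntax
nginx -t
# reload without dropping connections
nginx -s reload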

1.1 Test the nginx service

  1. Visit https://ferms.xxx.com/dig.gif?a=1 (e.g. with the curl sketch below)
  2. Check whether a log file such as 202009-01-14-35.log appears under /var/log/nginx/ferms/
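
For example, with curl (ferms.xxx.com is the placeholder domain from the config above):

# request the tracking pixel; expect a 200 status (-k skips certificate verification, drop it if your cert is valid)
curl -sk 'https://ferms.xxx.com/dig.gif?a=1' -o /dev/null -w '%{http_code}\n'
# then confirm a fresh log file appeared
ls -lt /var/log/nginx/ferms/ | head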

2. kafka

2.1 Configure hosts

On the machine that runs the tracking (dig) service, add hosts entries for the kafka cluster's hostnames

# vim `/etc/hosts`
# kafka
xxx.xxx.xxx.xxx kafka1
yyy.yyy.yyy.yyy kafka2
zzz.zzz.zzz.zzz kafka3
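
A quick way to confirm the entries resolve (kafka1/kafka2/kafka3 are the example hostnames above):

# look up the hosts entries
getent hosts kafka1 kafka2 kafka3
# optionally check reachability
ping -c 1 kafka1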

2.2 Download kafka

For easier debugging, you can download kafka locally.

Mirror download address

cd /your/path/to/kafka_2.12-2.5.0

2.3 Common kafka commands

List topics

./bin/kafka-topics.sh --bootstrap-server xxx.xxx.xxx.xxx:9092,yyy.yyy.yyy.yyy:9092,zzz.zzz.zzz.zzz:9092 --list
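
If the fee topic does not exist yet (and broker auto-creation is disabled), create it first; the partition and replication values below are placeholders to adapt to your cluster:

./bin/kafka-topics.sh --create --bootstrap-server xxx.xxx.xxx.xxx:9092 --topic fee --partitions 3 --replication-factor 1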

Produce messages

./bin/kafka-console-producer.sh --broker-list xxx.xxx.xxx.xxx:9092,yyy.yyy.yyy.yyy:9092,zzz.zzz.zzz.zzz:9092 --topic fee

Consume messages

./bin/kafka-console-consumer.sh --bootstrap-server=xxx.xxx.xxx.xxx:9092,yyy.yyy.yyy.yyy:9092,zzz.zzz.zzz.zzz:9092 --topic fee --from-beginning

Check a consumer group's consumption status

./bin/kafka-consumer-groups.sh --bootstrap-server xxx.xxx.xxx.xxx:9092,yyy.yyy.yyy.yyy:9092,zzz.zzz.zzz.zzz:9092 --describe --group test-group

Start kafka

./bin/kafka-server-start.sh ./config/server.properties &
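
Note that Kafka 2.5 still depends on ZooKeeper; for a single-node debug setup you would normally start it first, using the properties file shipped with the distribution:

./bin/zookeeper-server-start.sh ./config/zookeeper.properties &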

References

Kafka 概念、单机搭建与使用 (Kafka concepts, single-node setup and usage)

3. filebeat

3.1 Install filebeat

Official installation documentation

  1. Download and install the public signing key:

    sudo rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch

  2. Create a file with a .repo extension (for example, elastic.repo) in your /etc/yum.repos.d/ directory and add the following lines:

    [elastic-7.x]
    name=Elastic repository for 7.x packages
    baseurl=https://artifacts.elastic.co/packages/7.x/yum
    gpgcheck=1
    gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
    enabled=1
    autorefresh=1
    type=rpm-md

  3. Install filebeat:

    sudo yum install filebeat

After installation:

  • The bin directory is /usr/share/filebeat
  • The configuration file is /etc/filebeat/filebeat.yml
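
A quick sanity check that the package installed correctly (standard filebeat and rpm commands):

# print the installed filebeat version
filebeat version
# confirm the rpm package is present
rpm -q filebeat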

3.2 Configure filebeat

vim /etc/filebeat/filebeat.yml
  1. In older filebeat versions the filebeat.inputs section was called filebeat.prospectors; with the current version, a config that still uses the old name fails to start with:
Exiting: 1 error: setting 'filebeat.prospectors' has been removed

So change it to:

# ============================== Filebeat inputs ===============================
# change filebeat.prospectors:
# to:
filebeat.inputs:
  2. Set enabled to true under filebeat.inputs

  3. Set paths under filebeat.inputs to the nginx log file path

# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/ferms/*.log
  4. Set hosts and topic under output.kafka
# ================================== Outputs ===================================

output.kafka:
  hosts: ["xxx.xxx.xxx.xxx:9092","yyy.yyy.yyy.yyy:9092","zzz.zzz.zzz.zzz:9092"]
  topic: "fee"
  5. Have filebeat drop some unneeded fields:
# ================================= Processors =================================
processors:
  - drop_fields:
      fields: ["@timestamp", "@metadata", "log", "input", "ecs", "host", "agent"]

See the official docs for the details of drop_fields configuration

The complete configuration is as follows:

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

# ============================== Filebeat inputs ===============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/nginx/ferms/*.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false

  # Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
  #multiline.match: after

# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

# ======================= Elasticsearch template setting =======================

setup.template.settings:
  index.number_of_shards: 3
  #index.codec: best_compression
  #_source.enabled: false


# ================================== General ===================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

# ================================= Dashboards =================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

# =================================== Kibana ===================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

# =============================== Elastic Cloud ================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

# ================================== Outputs ===================================

# Configure what output to use when sending the data collected by the beat.

output.kafka:
  hosts: ["xxx.xxx.xxx.xxx:9092","yyy.yyy.yyy.yyy:9092","zzz.zzz.zzz.zzz:9092"]
  topic: "fee"

# ---------------------------- Elasticsearch Output ----------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["localhost:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"

# ------------------------------ Logstash Output -------------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

# ================================= Processors =================================
processors:
  - drop_fields:
      fields: ["@timestamp", "@metadata", "log", "input", "ecs", "host", "agent"]

# ================================== Logging ===================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
# logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

# ============================= X-Pack Monitoring ==============================
# Filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:

# ============================== Instrumentation ===============================

# Instrumentation support for the filebeat.
#instrumentation:
    # Set to true to enable instrumentation of filebeat.
    #enabled: false

    # Environment in which filebeat is running on (eg: staging, production, etc.)
    #environment: ""

    # APM Server hosts to report instrumentation results to.
    #hosts:
    #  - http://localhost:8200

    # API Key for the APM Server(s).
    # If api_key is set then secret_token will be ignored.
    #api_key:

    # Secret token for the APM Server(s).
    #secret_token:


# ================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true

3.3 Start, check status, and stop filebeat

Start via the system service:

service filebeat start
service filebeat status
service filebeat stop

Or start directly from the binary:

/usr/share/filebeat/bin/filebeat -e -c /etc/filebeat/filebeat.yml
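
Either way, filebeat can check the configuration file and try to reach the configured output before you rely on it (standard filebeat subcommands):

filebeat test config -c /etc/filebeat/filebeat.yml
filebeat test output -c /etc/filebeat/filebeat.yml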

3.4 Test

./bin/kafka-console-consumer.sh --bootstrap-server=xxx.xxx.xxx.xxx:9092,yyy.yyy.yyy.yyy:9092,zzz.zzz.zzz.zzz:9092 --topic fee --from-beginning

The message content received by kafka:

{
    "@timestamp": "2020-07-17T06:52:48.899Z",
    "@metadata": {
        "beat": "filebeat",
        "type": "_doc",
        "version": "7.8.0"
    },
    "ecs": {
        "version": "1.5.0"
    },
    "host": {
        "name": "high-52"
    },
    "log": {
        "offset": 0,
        "file": {
            "path": "/etc/nginx/logs/dig-log/202007-17-14-52.log"
        }
    },
    "message": "17/Jul/2020:14:52:41 +0800                      -       -       - xxx.xxx.com  304 0.000 594 0      15d04347-be16-b9ab-0029-24e4b6645950    -       -       9689c3ea-5155-2df7-a719-e90d2dedeb2c 937ba755-116a-18e6-0735-312cba23b00c    GET /dig.gif?a=iAmAMessageFromBrowser HTTP/1.1  -       Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36     -       sample=-&_UC_agent=-&lianjia_device_id=-&-      -       -       -",
    "input": {
        "type": "log"
    },
    "agent": {
        "id": "5d6667a6-31eb-48e4-b9a5-0881c38a00d0",
        "name": "high-52",
        "type": "filebeat",
        "version": "7.8.0",
        "hostname": "high-52",
        "ephemeral_id": "3e8701c7-a37a-47d6-9469-267aec61ea0b"
    }
}

3.5 Common filebeat issues

  1. When you run into problems, set the log level to debug and inspect the detailed logs (see also the journalctl note below)

    # vim /etc/filebeat/filebeat.yml
    logging.level: debug
    tail -f  /var/lib/filebeat/registry/filebeat/log.json
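
If filebeat runs under systemd (as it typically does when managed with the service commands above), its own log output can also be followed with journalctl:

journalctl -u filebeat -f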

3.6 filebeat references

filebeat command reference

4. mysql

  1. Create the fee database

  2. Then run npm run prod_fee Utils:GenerateSQL 1 '2018-01' '2020-10' > init.sql to generate the database SQL

    `
    Utils:GenerateSQL
    {projectIdList: project id list, comma-separated}
    {startAtYm: table-creation start month (inclusive), in ${DATE_FORMAT.COMMAND_ARGUMENT_BY_MONTH} format}
    {finishAtYm: table-creation end month (inclusive), in ${DATE_FORMAT.COMMAND_ARGUMENT_BY_MONTH} format}
    `
  3. Run mysql -u root -h 127.0.0.1 fee -p < init.sql to create the tables (you can verify with the SHOW TABLES sketch below)

  4. Create a new project with an insert statement

    -- id, sampling rate, display name, project id, owner info
    -- project_test_id
    REPLACE INTO `t_o_project` (`id`, `rate`, `display_name`, `project_name`, `c_desc`, `is_delete`, `create_ucid`, `update_ucid`, `create_time`, `update_time`) VALUES (1, 10000, '测试项目', 'project_test_id', '测试项目负责人', 0, '', '', 0, 0);
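
To confirm the tables were created (a quick check, assuming the fee database from step 1):

mysql -u root -h 127.0.0.1 -p -e 'USE fee; SHOW TABLES;'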

5. fee server

5.1 Environment check

To make sure node-rdkafka compiles and runs, check that gcc, gcc-c++ and zlib-devel are installed; otherwise you may run into problems that are hard to track down.

yum -y install gcc gcc-c++ zlib-devel

5.2 Configure the fee server

  1. Configure server/src/configs/common.js

    const production = {
        use: {
            kafka: true, // whether to use kafka. If you don't have kafka, set this to false and specify nginxLogFilePath below
        }
        // ...
    }
  2. Configure server/src/configs/kafka.js

    const production = {
        'group.id': 'fee',
        'metadata.broker.list': 'xxx.xxx.xxx.xxx:9092, yyy.yyy.yyy.yyy:9092, zzz.zzz.zzz.zzz:9092',
    }
  3. Configure server/src/configs/mysql.js

    const production = {
        host: 'xxx.com',
        port: '3306',
        user: 'xxx',
        password: 'xxx',
        database: 'fee'
    }
  4. Configure server/src/configs/redis.js

    const production = {
        host: 'xxx.com',
        port: '6379',
        db: '4',
        password: 'xxx'
    }
  5. Configure server/src/library/redis/index.js

    class RedisClient {
      constructor (isTest = false) {
        this.redisClient = new Redis({
          // ...
          password: redisConfig.password, // pass the password from the redis config
          // ...
        })
      }
    }
  6. Modify server/src/app.js

    const app = express()
    // set the directory that holds the view templates
    app.set('views', path.join(__dirname, '../public'))
    // set the template engine to ejs
    // app.set('view engine', 'ejs')
    app.engine('html', ejs.renderFile)
    app.set('view engine', 'html')
  7. Modify server/bin/run.sh

    FEE_COMMAND="Task:Manager" # for production, just set this to Task:Manager
  8. Modify server/src/commands/save_log/parseKafkaLog.js

    // adapt to filebeat: unwrap the raw nginx log line from the filebeat JSON message
    if (content.includes('@timestamp') && content.includes('@metadata')) {
        const dataObj = JSON.parse(content);
        content = dataObj.message || '';
    }

5.3 Start the server

Start with pm2 (daemonized):

pm2 start ./pm2_fee_task_manager.json --env production
pm2 start ./pm2_fee_app.json --env production

Watch the server logs:

tail -f ./log/pm2/command/task-manager-out.log
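
pm2's own commands are handy for checking that both processes stay up and for tailing their output:

# list managed processes and their status
pm2 list
# stream logs from all pm2-managed processes
pm2 logs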

5.4 Debugging

  1. Start the server in watch mode so changes are picked up in real time

    npm run watch
  2. Start in development mode

    npm run dev
  3. Start in the foreground

    sh ./bin/run.sh run production
  4. Debug kafka consumption

    NODE_ENV=testing node dist/fee.js SaveLog:Kafka
  5. Debug parsing and database writes

    # [by hour] parse the nginx logs and compute minute-level performance metrics within the given time range
    NODE_ENV=testing node dist/fee.js Parse:Monitor "2020-09-01 10:26"  "2020-09-01 14:27"
  6. Debug the summary jobs

    NODE_ENV=testing node dist/fee.js Summary:Performance "2020-09-01 14" hour
    NODE_ENV=testing node dist/fee.js Summary:Performance "2020-09-01" day
    NODE_ENV=testing node dist/fee.js Summary:Performance "2020-09" month
