[srv.] Integrated observation system for DU members (怼员) #14

Closed
5 of 25 tasks
ZoomQuiet opened this issue Apr 23, 2017 · 8 comments

@ZoomQuiet
Owner

ZoomQuiet commented Apr 23, 2017

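Background

The activity of all 自怼圈 (DU circle) members needs to be tracked, tallied, and displayed automatically;
the same need exists in every GitHub-based project and course...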

Analysis

GitHub officially provides a very RESTful API that covers all GitHub activity,
but there is not yet a customizable automation system that does what we need.
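As a rough illustration of the API mentioned above, the sketch below builds the list URLs for the four activity types discussed in this issue. The helper name `list_url`, the `ENDPOINTS` table, and the `owner`/`repo` placeholders are assumptions for illustration; only the URL paths come from the GitHub REST API.

```python
# Hypothetical helper: builds GitHub REST API list URLs for one repository.
API_ROOT = "https://api.github.com"

# Path suffixes under /repos/{owner}/{repo}/ for the four activity types.
ENDPOINTS = {
    "commits": "commits",
    "commit_comments": "comments",        # comments attached to commits
    "issues": "issues",                   # note: this list also includes pull requests
    "issue_comments": "issues/comments",
}


def list_url(owner, repo, kind, page=1, per_page=100):
    """Return the paginated list URL for one kind of activity."""
    return "{}/repos/{}/{}/{}?page={}&per_page={}".format(
        API_ROOT, owner, repo, ENDPOINTS[kind], page, per_page
    )
```

For example, `list_url("owner", "repo", "commits")` gives the first page of commits; fetching and paging through the results is left out of this sketch.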

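Plan

~ If it doesn't exist, build it ourselves

  • Preparation:
  • Initialization:
    • Django prototype site ~ 437623b
    • Trial deployment on Heroku
    • Automated API tests
  • Iteration:
    • Explore database deployment
      • Heroku-Redis service
    • Initialize historical data
    • Automatic daily increments
      • webhook <-- GitHub
    • Weekly statistics charts
    • Daily statistics charts
    • Clickable drill-down charts
    • ...
  • Release:
    • Organize documentation
    • Internal documentation review
    • DU release
    • Guide continuous improvement
    • ...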

References

Progress

@bambooom
Collaborator

😱 Python... I've nearly forgotten all of it... where should I even start... orz

@ZoomQuiet
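Owner Author

@bambooom Indeed so, ( ̄▽ ̄)

  • A great opportunity: start over, the hard way (笨办法)
  • That is how you work through an 开智 project full-stack and become a heavyweight in your own right...
  • And of course, the front end here is yours by default...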

@xpgeng
Collaborator

xpgeng commented Apr 23, 2017

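I've now seen the data requirements in the README, so I deleted my earlier comment...

  • Data warehouse
    • commits:
      • commit: who / time / lines changed / files changed / url / sha
      • -> ranking: day / week / all-time
      • -> per person: daily / time-of-day / volume trends
    • comments
      • <- who / time / lines changed / files changed / url / sha
      • Issue creation
      • Issue replies
      • commit-comments
      • -> ranking: day / week / all-time
      • -> per person: daily / time-of-day / volume trends

I'll claim the standalone scripts.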

@ZoomQuiet
Owner Author

Reference: 0009973

@xpgeng @sunoonlee

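Data fields for all transactions

~ what my gut says we should collect

The question is: what logic should we use to organize them together, so that the statistics can be queried simply?

  • Because what we care about is member behavior
  • thinking in terms of this structure is probably workable: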
    • author
      • commits
      • commits-comments
      • issues
      • issues-comments
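One possible reading of the author-centered structure above, as a minimal sketch: collected records are grouped first by author, then by activity type. The function name `group_by_author` and the `(kind, record)` input shape are assumptions made for this example, not something fixed in this thread.

```python
from collections import defaultdict

# The four activity types listed under "author" above.
KINDS = ("commits", "commits-comments", "issues", "issues-comments")


def group_by_author(events):
    """Group (kind, record) pairs by record['author'], then by kind.

    `events` is an iterable of (kind, record) tuples where `kind` is one of
    KINDS and `record` is a dict that has at least an 'author' key.
    """
    grouped = defaultdict(lambda: {kind: [] for kind in KINDS})
    for kind, record in events:
        grouped[record["author"]][kind].append(record)
    return dict(grouped)
```

With this shape, a per-member query becomes a plain lookup such as `grouped["some-login"]["commits"]`, and counts are `len(...)` of each list.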

Commits

{'count': int
, 'commits':[{'sha':        ''
        , 'author':     ''
        , 'files':      int
        , 'changes':    int
        , 'additions':  int
        , 'deletions':  int
        , 'created_at': [str_to_timestamp?]
        }
        ,,,]
}

Commit-Comments

{'count': int
, 'comments':[{'id':        ''
        , 'author':     ''
        , 'body':       ''
        , 'words':      int
        , 'sha':        ''
        , 'created_at': [str_to_timestamp?]
        }
        ,,,]
}

Issue

{'count': int
, 'comments':[{'id':        ''
        , 'number':     int
        , 'comments':   int
        , 'author':     ''
        , 'body':       ''
        , 'words':      int
        , 'state':      ''
        , 'created_at': [str_to_timestamp?]
        , 'updated_at': [str_to_timestamp?]
        }
        ,,,]
}

Issue-Comments

{'count': int
, 'comments':[{'id':        ''
        , 'number':     int
        , 'author':     ''
        , 'body':       ''
        , 'words':      int
        , 'created_at': [str_to_timestamp?]
        , 'updated_at': [str_to_timestamp?]
        }
        ,,,]
}
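The `[str_to_timestamp?]` placeholders above suggest converting the API's time strings to POSIX timestamps. A minimal sketch, assuming the strings arrive in the UTC form the GitHub API normally returns (for example `2017-04-23T00:00:00Z`); the function name is taken from the placeholder, the implementation is an assumption:

```python
import calendar
import time


def str_to_timestamp(text):
    """Convert a UTC time string like '2017-04-23T00:00:00Z' to a POSIX timestamp."""
    parsed = time.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    # calendar.timegm interprets the struct_time as UTC, unlike time.mktime.
    return calendar.timegm(parsed)
```

Storing integers rather than strings keeps the `created_at` and `updated_at` fields directly sortable and easy to bucket by day.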

@ZoomQuiet
Owner Author

ZoomQuiet commented Apr 24, 2017

@bambooom @xpgeng @sunoonlee Indeed so, ( ̄▽ ̄)

Let's evaluate together whether the following JSON document is sufficient and easy to use:


=> .pkl {
    'ALL_SHA':[] , # commit sha collections
    'ALL_CID':[] , # comment id collections
    'ALL_UID':[] , # author id collections
    'ALL_DUR':{
        uid:{
            'CI_COUNT': int, # count commit
            'CI_LINES': int, # count commit lines
            
            'CC_COUNT': int, # count comment
            'CC_WORDS': int, # count comment words
            
            'IU_COUNT': int, # count Issue 
            'IU_WORDS': int, # count Issue words
            
            'IC_COUNT': int, # count Issue comments
            'IC_WORDS': int, # count Issue comments words
            
            'author':       '',
            
            'COMMITS': {
                sha:{
                    'files':        int,
                    'changes':      int,
                    'additions':    int,
                    'deletions':    int,
                    'created_at':   POSIX timestamp,
                    }
                , , ,
                },
            
            'CCOMMENTS': {
                ccid:{
                    'body':        '',
                    'words':      int,
                    'sha':    '',
                    'created_at':   POSIX timestamp,
                    }
                , , ,
                },
            'ISSUES': {
                isid:{
                    'number':        int,
                    'comments':      int, # need updating
                    'words':    int,
                    'state':    '',
                    'created_at':   POSIX timestamp,
                    'updated_at':   POSIX timestamp,
                    }
                , , ,
                },
            'ICOMMENTS': {
                icid:{
                    'number':        int,
                    'words':    int,
                    'created_at':   POSIX timestamp,
                    'updated_at':   POSIX timestamp,
                    }
                , , ,
                },
            , , ,
            } # <- uid
        } #<- ALL_DUR
    }
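To make the proposed `.pkl` layout concrete, here is a small sketch that fills in the commit-related part of the structure and round-trips it through `pickle`. The helper names `new_store` and `add_commit` are hypothetical, and treating `CI_LINES` as the sum of each commit's `changes` is an assumption; the thread does not say which field feeds it.

```python
import pickle


def new_store():
    """Empty top-level structure, following the layout proposed above."""
    return {"ALL_SHA": [], "ALL_CID": [], "ALL_UID": [], "ALL_DUR": {}}


def add_commit(store, uid, sha, record):
    """Record one commit for `uid`; return False if `sha` was already counted."""
    if sha in store["ALL_SHA"]:
        return False
    store["ALL_SHA"].append(sha)
    if uid not in store["ALL_UID"]:
        store["ALL_UID"].append(uid)
    user = store["ALL_DUR"].setdefault(
        uid, {"CI_COUNT": 0, "CI_LINES": 0, "COMMITS": {}}
    )
    user["COMMITS"][sha] = record
    user["CI_COUNT"] += 1
    user["CI_LINES"] += record["changes"]  # assumption: lines == 'changes'
    return True


def dump_store(store):
    """Serialize the whole structure to bytes, as a .pkl file would hold it."""
    return pickle.dumps(store)


def load_store(blob):
    """Inverse of dump_store."""
    return pickle.loads(blob)
```

The `ALL_SHA` list doubles as the de-duplication guard here; a set would make the membership test faster, at the cost of departing from the list shown in the proposal.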
    

@ZoomQuiet
Owner Author

<-- c895d0f

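  • Found that commits-comments can only be extracted for one branch at a time
  • The cause looks like an HTTPS caching issue?
  • Either way, the only option is to use fabric to run the extractions concurrently and automatically
  • Also, this makes real-time de-duplication via global sha/uid/cid impossible
  • So I'm starting the migration to a local Redis database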
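The de-duplication concern above is one thing a shared Redis SET handles well, because `SADD` reports how many members were newly added. The sketch below shows the idea; `first_seen` is a hypothetical helper, the key `ALL:SHA` is borrowed from the key design later in this thread, and `InMemorySets` is only a stand-in so the example runs without a server. With redis-py, a `redis.Redis()` client exposes the same `sadd(name, *values)` call.

```python
class InMemorySets:
    """Stand-in for a Redis client that implements only sadd, for demonstration."""

    def __init__(self):
        self._sets = {}

    def sadd(self, name, *values):
        """Add values to the set at `name`; return how many were new."""
        target = self._sets.setdefault(name, set())
        before = len(target)
        target.update(values)
        return len(target) - before


def first_seen(client, key, member):
    """Return True only the first time `member` is added to the set at `key`."""
    return client.sadd(key, member) == 1


# Usage: several concurrent workers could share one set of seen commit SHAs.
client = InMemorySets()
seen_first = first_seen(client, "ALL:SHA", "c895d0f")
seen_again = first_seen(client, "ALL:SHA", "c895d0f")
```

Because the check and the insert happen in one command, two workers that hit the same commit cannot both believe they saw it first.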

@ZoomQuiet
Owner Author

ZoomQuiet commented Apr 25, 2017

Redis key design

~ based on the statistics we currently need
@xpgeng @bambooom @sunoonlee

  • 170425 ZQ init.
  • 170426 ZQ fix ASET
  • 170427 ZQ append BINGLOG
  • 170428 ZQ append orphan collections

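Daily statistics

~ centered on DU members

  • All-time ranking
    • top 5 by count/lines: Commit
    • top 5 by count/lines: Commits-Comments
    • top 5 by count/lines: Issue
    • top 5 by count/lines: Issue-Comments
  • Current-week ranking
    • top 5 by count/lines: Commit
    • top 5 by count/lines: Commits-Comments
    • top 5 by count/lines: Issue
    • top 5 by count/lines: Issue-Comments
  • Ranking for any given day
    • top 5 by count/lines: Commit
    • top 5 by count/lines: Commits-Comments
    • top 5 by count/lines: Issue
    • top 5 by count/lines: Issue-Comments
  • Drill down on any DU member:
    • All-time Commit/Commits-Comments/Issue/Issue-Comments
    • Commit/Commits-Comments/Issue/Issue-Comments on a given day
    • A specific Commit/Commits-Comments/Issue/Issue-Comments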
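The key design that follows defines all-time and per-day keys but no weekly key, so one way to serve the weekly ranking is to combine the seven daily keys of a week (for instance with Redis `ZUNIONSTORE`). The sketch below only generates the key names; the helper names and the choice of Monday as the start of the week are assumptions.

```python
from datetime import date, timedelta


def daily_key(kind, day, field):
    """Build a daily key such as 'CI:170425:COUNT' from a date."""
    return "{}:{}:{}".format(kind, day.strftime("%y%m%d"), field)


def week_keys(kind, day, field):
    """Daily keys for Monday through Sunday of the week containing `day`."""
    monday = day - timedelta(days=day.weekday())
    return [daily_key(kind, monday + timedelta(days=i), field) for i in range(7)]


# Example: the daily commit-count keys for the week of 2017-04-25.
keys = week_keys("CI", date(2017, 4, 25), "COUNT")
```

The resulting list can be passed to a union of sorted sets, after which the top 5 for the week is an ordinary ranking query on the combined key.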

Key-value pairs

Key design pattern: object-type:id:field

=> Redis:

ALL:SHA -> SET # commit sha collections
ALL:CID -> SET # commit/issue comment id collections
ALL:UID -> SET # author id collections
ALL:IID -> SET # issue number collections
orphan:SHA -> SET # commit sha collections
orphan:CID -> SET # commit/issue comment id collections
orphan:IID -> SET # issue number collections

binglog -> SET # UUID for all action for anti re-count/incr
                    +- urlsafe_b64encode([seq])
                                           |
                 /-------------------------+---------------\
                  [orig KEY]-[created_at]-[amount]
                     |           |           +- amount actually added
                     |           +- POSIX timestamp
                     +- case KEY such as:
                             +- CI:ALL:COUNT
                             +- CI:ALL:LINES
                             +-  ...
                             +- IC:{yymmdd}:COUNT
                             +- IC:{yymmdd}:LINES

CI:ALL:COUNT -> ZSET -> {int:uid,,,} # count commit time
CI:ALL:LINES -> ZSET -> {int:uid,,,} # count commit lines 
CC:ALL:COUNT -> ZSET -> {int:uid,,,} # count commit-comment time 
CC:ALL:WORDS -> ZSET -> {int:uid,,,} # count commit-comment WORDS  
IU:ALL:COUNT -> ZSET -> {int:uid,,,} # count issue time 
IU:ALL:WORDS -> ZSET -> {int:uid,,,} # count issue WORDS  
IC:ALL:COUNT -> ZSET -> {int:uid,,,} # count issue-comment time 
IC:ALL:WORDS -> ZSET -> {int:uid,,,} # count issue-comment WORDS  

CI:{yymmdd}:COUNT -> ZSET -> {int:uid,,,} # daily count commit time
CI:{yymmdd}:LINES -> ZSET -> {int:uid,,,} # daily count commit lines 
CC:{yymmdd}:COUNT -> ZSET -> {int:uid,,,} # daily count commit-comment time 
CC:{yymmdd}:WORDS -> ZSET -> {int:uid,,,} # daily count commit-comment WORDS  
IU:{yymmdd}:COUNT -> ZSET -> {int:uid,,,} # daily count issue time 
IU:{yymmdd}:WORDS -> ZSET -> {int:uid,,,} # daily count issue WORDS  
IC:{yymmdd}:COUNT -> ZSET -> {int:uid,,,} # daily count issue-comment time 
IC:{yymmdd}:WORDS -> ZSET -> {int:uid,,,} # daily count issue-comment WORDS  

ALL:{uid}:CIS -> SET # user commits sha collections
ALL:{uid}:CCS -> SET # user commits-comments cid collections
ALL:{uid}:IUS -> SET # user issue iuid? collections
ALL:{uid}:ICS -> SET # user issue-comments cid collections

USR:{uid}:INFO -> HASH ->   login: ''
                            created_at: ''
                            disk_usage: ''
                            email: ''
                            following: ''
                            location: ''
                            blog: ''
,,,
CI:{uid}:{sha} -> HASH ->   files:      int
                            changes:    int
                            additions:  int
                            deletions:  int
                            created_at: POSIX timestamp
,,,
CC:{uid}:{ccid} -> HASH ->  body:       ''
                            words:      int
                            sha:        ''
                            created_at: POSIX timestamp
,,,
IU:{uid}:{iuid} -> HASH ->  number:     int
                            comments:   int
                            words:      int
                            state:      ''
                            created_at: POSIX timestamp
,,,
IC:{uid}:{iuid} -> HASH ->  number:     int
                            words:      int
                            created_at: POSIX timestamp
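A sketch of how the `binglog` set and the ZSET counters above might work together: each increment is turned into a token built from `[orig KEY]-[created_at]-[amount]`, and the counter is only bumped if that token is new. All function names are hypothetical, and `FakeRedis` is an in-memory stand-in so the example runs without a server. The `zincrby(name, amount, value)` argument order matches redis-py 3.0 and later; older releases used a different order. The token follows the diagram literally, so it contains no uid or sha.

```python
import base64


class FakeRedis:
    """In-memory stand-in exposing only the three calls used below."""

    def __init__(self):
        self.sets = {}
        self.zsets = {}

    def sadd(self, name, *values):
        target = self.sets.setdefault(name, set())
        before = len(target)
        target.update(values)
        return len(target) - before

    def zincrby(self, name, amount, value):
        scores = self.zsets.setdefault(name, {})
        scores[value] = scores.get(value, 0) + amount
        return scores[value]

    def zrevrange(self, name, start, end, withscores=False):
        ranked = sorted(
            self.zsets.get(name, {}).items(), key=lambda kv: kv[1], reverse=True
        )
        stop = None if end == -1 else end + 1
        chunk = ranked[start:stop]
        return chunk if withscores else [member for member, _ in chunk]


def binglog_token(key, created_at, amount):
    """urlsafe_b64encode of '[orig KEY]-[created_at]-[amount]'."""
    raw = "{}-{}-{}".format(key, created_at, amount)
    return base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")


def count_once(client, key, uid, created_at, amount):
    """Add `amount` to uid's score at `key` unless this action was already logged."""
    token = binglog_token(key, created_at, amount)
    if client.sadd("binglog", token) == 0:
        return False  # token already present: skip to avoid double counting
    client.zincrby(key, amount, uid)
    return True


def top5(client, key):
    """Five highest-scoring members of the ZSET at `key`, best first."""
    return client.zrevrange(key, 0, 4, withscores=True)
```

Re-running a historical import then becomes safe: repeated actions produce the same token and are skipped, while the sorted sets stay ready for the top 5 queries.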
    

@ZoomQuiet
Owner Author

--> README.md

ARCHIVED to the wiki ;-) archived / collected / committed


3 participants