add Chinese info #251

Merged
merged 13 commits into from
Dec 19, 2020

Conversation

wengzhenjie
Contributor

@open-digger-bot open-digger-bot bot added the pull/sql SQL related pull label Dec 15, 2020
@wengzhenjie
Contributor Author

I extracted information about Chinese developers and organizations by processing the given text, along with the full names of some Chinese repos. I then used the GitHub API to get each repo_id and actor_id. For the developers and organizations, I also fetched all repos under them, so this PR adds more repos.

@frank-zsy
Contributor

There are a few things to note about this component.

  • We don't really need to list every repo id of Chinese projects. The database has an org_id field, so we can filter by organization id instead. That keeps the manifest much shorter and also covers new projects created under the org.
  • We need to find a way to inject the repo and org ids into the SQL rather than filtering in the post-processor, which would cause a very heavy network load.
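The two points above can be sketched as follows. This is a minimal illustration only, not the code from this PR: the `manifest` shape, the id values, and the helper names `toIdList` and `buildFilter` are all assumptions made for the example. It builds a filter on `org_id` and `repo_id` that the database evaluates, so rows for other repos never leave the server.

```javascript
// Hypothetical manifest: the ids below are placeholders, not real GitHub ids.
const manifest = { orgIds: [1001, 1002], repoIds: [2001] };

// Reject anything that is not a positive integer so the list is safe to
// splice into SQL text.
function toIdList(ids) {
  ids.forEach((id) => {
    if (!Number.isInteger(id) || id <= 0) {
      throw new TypeError(`invalid id: ${id}`);
    }
  });
  return ids.join(', ');
}

// Build a filter that the database evaluates, so only matching rows are
// returned to the caller.
function buildFilter({ orgIds = [], repoIds = [] }) {
  const parts = [];
  if (orgIds.length > 0) parts.push(`org_id IN (${toIdList(orgIds)})`);
  if (repoIds.length > 0) parts.push(`repo_id IN (${toIdList(repoIds)})`);
  if (parts.length === 0) throw new Error('manifest has no org or repo ids');
  return `(${parts.join(' OR ')})`;
}

console.log(buildFilter(manifest));
```

Filtering by org id means a repo created under a listed org later is picked up without touching the manifest, while the explicit repo ids cover projects that live outside those orgs.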

@wengzhenjie
Contributor Author

> There are a few things to note about this component.
>
>   • We don't really need to list every repo id of Chinese projects. The database has an org_id field, so we can filter by organization id instead. That keeps the manifest much shorter and also covers new projects created under the org.
>   • We need to find a way to inject the repo and org ids into the SQL rather than filtering in the post-processor, which would cause a very heavy network load.

OK, I will use org_id to filter repos. However, by my current count the number of actors is very small. Can we get more actor info some other way?

@frank-zsy
Contributor

I am not sure what you mean. Do you want more information than the actor info in the log data? Like what?

@wengzhenjie
Contributor Author

> I am not sure what you mean. Do you want more information than the actor info in the log data? Like what?

Yeah. I only got 60 actors, but we show the top 20 active developers. The data set looks small.

@frank-zsy
Contributor

What do you mean by 60 actors? For one repo, or what? There are more than 13 million active developer accounts globally.

@wengzhenjie
Contributor Author

wengzhenjie commented Dec 16, 2020

> What do you mean by 60 actors? For one repo, or what? There are more than 13 million active developer accounts globally.

I got the actors' logins from the text, and the text contains only a small number of Chinese actors.

@frank-zsy
Contributor

frank-zsy commented Dec 16, 2020

Oh, I see. Actually, we cannot identify all developer accounts from China. The accounts in the text mean that all repos under those accounts should be considered Chinese repos. We do not make a report for Chinese developers because we cannot tell whether a contributor's account is Chinese or not.

@wengzhenjie
Contributor Author

> Oh, I see. Actually, we cannot identify all developer accounts from China. The accounts in the text mean that all repos under those accounts should be considered Chinese repos. We do not make a report for Chinese developers because we cannot tell whether a contributor's account is Chinese or not.

Shall we show the top active Chinese developers in the report?

@frank-zsy
Contributor

> Shall we show the top active Chinese developers in the report?

I don't think it is necessary, and we did not do that in the 2019 report either.

@wengzhenjie
Contributor Author

> > Shall we show the top active Chinese developers in the report?
>
> I don't think it is necessary, and we did not do that in the 2019 report either.

OK.

@frank-zsy
Contributor

I assume you will need a pre-processor to construct your SQL config. Without one, this could get very hard.

@wengzhenjie
Contributor Author

> I assume you will need a pre-processor to construct your SQL config. Without one, this could get very hard.

I don't understand this. I use the GitHub API to get the ids. If I use a pre-processor, I need to configure the token. Can you explain your reasoning about the pre-processor?

@frank-zsy
Contributor

I mean, how can you pass the repo id list into the SQL renderer? You need to construct the render param string even if you already have all the repo ids and org ids in the JSON config.
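As a rough sketch of what constructing the render param string could look like: the pre-processor flattens each id list from the JSON config into one string, and the template only has to substitute a plain placeholder. Everything here is hypothetical. The `config` shape, the `preProcess` function, and the `render` function are assumptions for illustration, and `render` is a stand-in for simple `{{name}}` substitution, not the project's actual template engine or its API.

```javascript
// Hypothetical JSON config holding the id lists (placeholder values).
const config = { orgIds: [1001, 1002], repoIds: [2001, 2002] };

// Hypothetical pre-processor: flattens each list into a string that the
// template can splice in as-is, with no loops needed in the template.
function preProcess(cfg) {
  const join = (ids) => ids.map((id) => {
    if (!Number.isInteger(id)) throw new TypeError(`invalid id: ${id}`);
    return String(id);
  }).join(',');
  return { orgIds: join(cfg.orgIds), repoIds: join(cfg.repoIds) };
}

// Stand-in renderer: replaces {{name}} with params[name] and nothing more.
function render(template, params) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, name) => {
    if (!(name in params)) throw new Error(`missing render param: ${name}`);
    return params[name];
  });
}

const template = 'WHERE org_id IN ({{orgIds}}) OR repo_id IN ({{repoIds}})';
console.log(render(template, preProcess(config)));
```

Because the lists are joined before rendering, the template engine never needs to iterate. Note that an empty list would render as `IN ()`, which is not valid SQL, so a real pre-processor would have to handle that case.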

@wengzhenjie
Contributor Author

> I mean, how can you pass the repo id list into the SQL renderer? You need to construct the render param string even if you already have all the repo ids and org ids in the JSON config.

I pass repo_id in the HAVING clause.
[image]

@frank-zsy
Contributor

frank-zsy commented Dec 17, 2020

We need to figure out whether this works in the current render lib; the template engine may not be that powerful.

BTW, we use pope as our template engine, and according to its documentation, the values function may not be supported.

@wengzhenjie
Contributor Author

> We need to figure out whether this works in the current render lib; the template engine may not be that powerful.

Sorry, I am not sure it works. I will fix it.

@frank-zsy
Contributor

/sql-run

@open-digger-bot
Contributor

I found SQL component activity-repo-top-Chinese in this PR, the SQL run result data is:

{"data":[],"elapsed":73708}

The rendered text is:

{"err":"data.forEach is not a function"}

Please check whether the result is as expected.
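For context on the error above: "data.forEach is not a function" is the message JavaScript produces when `forEach` is called on a value that does not have that method, typically something that is not an array. The thread does not show what value the post-processor actually received, so the following is only a sketch of a defensive check. The function name `toRows` and the row format are assumptions, not this PR's code.

```javascript
// Hypothetical post-processor: the real one in this PR is not shown in the thread.
function toRows(data) {
  // Array.prototype.forEach only exists on arrays, so check before iterating
  // and fail with a message that names the actual problem.
  if (!Array.isArray(data)) {
    throw new TypeError(`expected an array of rows, got ${typeof data}`);
  }
  const rows = [];
  data.forEach((row, index) => rows.push(`${index + 1} ${row.repo_name}`));
  return rows;
}

console.log(toRows([]));
console.log(toRows([{ repo_name: 'flutter/flutter' }]));
```

An empty array is a valid input and simply yields no rows, so an empty result set on its own would not trigger this error.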

I found SQL component activity-repo-top in this PR, the SQL run result data is:

{"data":[{"repo_name":"flutter/flutter","repo_language":"Dart","repo_activity":34619.81,"developer_count":"16500","issue_comment":"126299","open_issue":"14674","open_pull":"7167","pull_review_comment":"18647","merge_pull":"4938","commits":"21269","additions":"793564","deletions":"410017"},{"repo_name":"microsoft/vscode","repo_language":"TypeScript","repo_activity":26995.22,"developer_count":"13959","issue_comment":"100223","open_issue":"16228","open_pull":"1796","pull_review_comment":"1769","merge_pull":"1370","commits":"5078","additions":"170697","deletions":"103532"},{"repo_name":"MicrosoftDocs/azure-docs","repo_language":"PowerShell","repo_activity":23863.37,"developer_count":"9447","issue_comment":"86879","open_issue":"11667","open_pull":"3082","pull_review_comment":"880","merge_pull":"1833","commits":"3136","additions":"13469","deletions":"8692"},{"repo_name":"home-assistant/core","repo_language":"Python","repo_activity":22238.52,"developer_count":"8052","issue_comment":"75895","open_issue":"5296","open_pull":"7752","pull_review_comment":"30543","merge_pull":"6635","commits":"33962","additions":"1134406","deletions":"495762"},{"repo_name":"tensorflow/tensorflow","repo_language":"C++","repo_activity":21387.47,"developer_count":"9549","issue_comment":"63149","open_issue":"6158","open_pull":"2961","pull_review_comment":"7981","merge_pull":"2180","commits":"6428","additions":"249529","deletions":"67695"},{"repo_name":"kubernetes/kubernetes","repo_language":"Go","repo_activity":19710.15,"developer_count":"6050","issue_comment":"237966","open_issue":"3642","open_pull":"6763","pull_review_comment":"31153","merge_pull":"4797","commits":"6726","additions":"1326284","deletions":"728298"},{"repo_name":"NixOS/nixpkgs","repo_language":"Nix","repo_activity":18212.28,"developer_count":"2994","issue_comment":"82299","open_issue":"4313","open_pull":"19243","pull_review_comment":"29267","merge_pull":"15890","commits":"37535","additions":"973883","deletions":"625134"},{"repo_name":"pytorch/pytorch","repo_language":"C++","repo_activity":14324.99,"developer_count":"4756","issue_comment":"68225","open_issue":"4816","open_pull":"11644","pull_review_comment":"38613","merge_pull":"336","commits":"675","additions":"40085","deletions":"19724"},{"repo_name":"dotnet/runtime","repo_language":"C#","repo_activity":13977.79,"developer_count":"3695","issue_comment":"82959","open_issue":"7019","open_pull":"7483","pull_review_comment":"41016","merge_pull":"6480","commits":"27031","additions":"1965405","deletions":"1134424"},{"repo_name":"DefinitelyTyped/DefinitelyTyped","repo_language":"TypeScript","repo_activity":13089.64,"developer_count":"4029","issue_comment":"52774","open_issue":"532","open_pull":"6354","pull_review_comment":"6462","merge_pull":"5350","commits":"13236","additions":"1119459","deletions":"707027"}],"elapsed":97182}

The rendered text is:

| # | name | language | activity | developer_count | issue_comment | open_issue | open_pull | pull_review_comment | merge_pull | pull_commits | pull_additions | pull_deletions |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | flutter/flutter | Dart | 34619.81 | 16500 | 126299 | 14674 | 7167 | 18647 | 4938 | 21269 | 793564 | 410017 |
| 2 | microsoft/vscode | TypeScript | 26995.22 | 13959 | 100223 | 16228 | 1796 | 1769 | 1370 | 5078 | 170697 | 103532 |
| 3 | MicrosoftDocs/azure-docs | PowerShell | 23863.37 | 9447 | 86879 | 11667 | 3082 | 880 | 1833 | 3136 | 13469 | 8692 |
| 4 | home-assistant/core | Python | 22238.52 | 8052 | 75895 | 5296 | 7752 | 30543 | 6635 | 33962 | 1134406 | 495762 |
| 5 | tensorflow/tensorflow | C++ | 21387.47 | 9549 | 63149 | 6158 | 2961 | 7981 | 2180 | 6428 | 249529 | 67695 |
| 6 | kubernetes/kubernetes | Go | 19710.15 | 6050 | 237966 | 3642 | 6763 | 31153 | 4797 | 6726 | 1326284 | 728298 |
| 7 | NixOS/nixpkgs | Nix | 18212.28 | 2994 | 82299 | 4313 | 19243 | 29267 | 15890 | 37535 | 973883 | 625134 |
| 8 | pytorch/pytorch | C++ | 14324.99 | 4756 | 68225 | 4816 | 11644 | 38613 | 336 | 675 | 40085 | 19724 |
| 9 | dotnet/runtime | C# | 13977.79 | 3695 | 82959 | 7019 | 7483 | 41016 | 6480 | 27031 | 1965405 | 1134424 |
| 10 | DefinitelyTyped/DefinitelyTyped | TypeScript | 13089.64 | 4029 | 52774 | 532 | 6354 | 6462 | 5350 | 13236 | 1119459 | 707027 |

Please check whether the result is as expected.

@open-digger-bot open-digger-bot bot added the pull/sql-runned SQL related pull and the sql has been verified label Dec 19, 2020
sqls/activity-repo-top-Chinese/pre-processor.js (outdated, resolved)
sqls/activity-repo-top-Chinese/sql (resolved)
@frank-zsy
Contributor

/approve

@open-digger-bot open-digger-bot bot added the pull/approved If a pull is approved, it will be automatically merged label Dec 19, 2020
@open-digger-bot open-digger-bot bot merged commit 9a722ba into X-lab2017:master Dec 19, 2020
@frank-zsy frank-zsy removed the pull/approved If a pull is approved, it will be automatically merged label Dec 19, 2020
@frank-zsy
Contributor

Approved by mistake.

/sql-run

@open-digger-bot
Contributor

I found SQL component activity-repo-top-Chinese in this PR, the SQL run result data is:

{"data":[],"elapsed":57615}

The rendered text is:

{"err":"data.forEach is not a function"}

Please check whether the result is as expected.

I found SQL component activity-repo-top in this PR, the SQL run result data is:

{"data":[{"repo_name":"flutter/flutter","repo_language":"Dart","repo_activity":34619.81,"developer_count":"16500","issue_comment":"126299","open_issue":"14674","open_pull":"7167","pull_review_comment":"18647","merge_pull":"4938","commits":"21269","additions":"793564","deletions":"410017"},{"repo_name":"microsoft/vscode","repo_language":"TypeScript","repo_activity":26995.22,"developer_count":"13959","issue_comment":"100223","open_issue":"16228","open_pull":"1796","pull_review_comment":"1769","merge_pull":"1370","commits":"5078","additions":"170697","deletions":"103532"},{"repo_name":"MicrosoftDocs/azure-docs","repo_language":"PowerShell","repo_activity":23863.37,"developer_count":"9447","issue_comment":"86879","open_issue":"11667","open_pull":"3082","pull_review_comment":"880","merge_pull":"1833","commits":"3136","additions":"13469","deletions":"8692"},{"repo_name":"home-assistant/core","repo_language":"Python","repo_activity":22238.52,"developer_count":"8052","issue_comment":"75895","open_issue":"5296","open_pull":"7752","pull_review_comment":"30543","merge_pull":"6635","commits":"33962","additions":"1134406","deletions":"495762"},{"repo_name":"tensorflow/tensorflow","repo_language":"C++","repo_activity":21387.47,"developer_count":"9549","issue_comment":"63149","open_issue":"6158","open_pull":"2961","pull_review_comment":"7981","merge_pull":"2180","commits":"6428","additions":"249529","deletions":"67695"},{"repo_name":"kubernetes/kubernetes","repo_language":"Go","repo_activity":19710.15,"developer_count":"6050","issue_comment":"237966","open_issue":"3642","open_pull":"6763","pull_review_comment":"31153","merge_pull":"4797","commits":"6726","additions":"1326284","deletions":"728298"},{"repo_name":"NixOS/nixpkgs","repo_language":"Nix","repo_activity":18212.28,"developer_count":"2994","issue_comment":"82299","open_issue":"4313","open_pull":"19243","pull_review_comment":"29267","merge_pull":"15890","commits":"37535","additions":"973883","deletions":"625134"},{"repo_name":"pytorch/pytorch","repo_language":"C++","repo_activity":14324.99,"developer_count":"4756","issue_comment":"68225","open_issue":"4816","open_pull":"11644","pull_review_comment":"38613","merge_pull":"336","commits":"675","additions":"40085","deletions":"19724"},{"repo_name":"dotnet/runtime","repo_language":"C#","repo_activity":13977.79,"developer_count":"3695","issue_comment":"82959","open_issue":"7019","open_pull":"7483","pull_review_comment":"41016","merge_pull":"6480","commits":"27031","additions":"1965405","deletions":"1134424"},{"repo_name":"DefinitelyTyped/DefinitelyTyped","repo_language":"TypeScript","repo_activity":13089.64,"developer_count":"4029","issue_comment":"52774","open_issue":"532","open_pull":"6354","pull_review_comment":"6462","merge_pull":"5350","commits":"13236","additions":"1119459","deletions":"707027"}],"elapsed":81260}

The rendered text is:

| # | name | language | activity | developer_count | issue_comment | open_issue | open_pull | pull_review_comment | merge_pull | pull_commits | pull_additions | pull_deletions |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | flutter/flutter | Dart | 34619.81 | 16500 | 126299 | 14674 | 7167 | 18647 | 4938 | 21269 | 793564 | 410017 |
| 2 | microsoft/vscode | TypeScript | 26995.22 | 13959 | 100223 | 16228 | 1796 | 1769 | 1370 | 5078 | 170697 | 103532 |
| 3 | MicrosoftDocs/azure-docs | PowerShell | 23863.37 | 9447 | 86879 | 11667 | 3082 | 880 | 1833 | 3136 | 13469 | 8692 |
| 4 | home-assistant/core | Python | 22238.52 | 8052 | 75895 | 5296 | 7752 | 30543 | 6635 | 33962 | 1134406 | 495762 |
| 5 | tensorflow/tensorflow | C++ | 21387.47 | 9549 | 63149 | 6158 | 2961 | 7981 | 2180 | 6428 | 249529 | 67695 |
| 6 | kubernetes/kubernetes | Go | 19710.15 | 6050 | 237966 | 3642 | 6763 | 31153 | 4797 | 6726 | 1326284 | 728298 |
| 7 | NixOS/nixpkgs | Nix | 18212.28 | 2994 | 82299 | 4313 | 19243 | 29267 | 15890 | 37535 | 973883 | 625134 |
| 8 | pytorch/pytorch | C++ | 14324.99 | 4756 | 68225 | 4816 | 11644 | 38613 | 336 | 675 | 40085 | 19724 |
| 9 | dotnet/runtime | C# | 13977.79 | 3695 | 82959 | 7019 | 7483 | 41016 | 6480 | 27031 | 1965405 | 1134424 |
| 10 | DefinitelyTyped/DefinitelyTyped | TypeScript | 13089.64 | 4029 | 52774 | 532 | 6354 | 6462 | 5350 | 13236 | 1119459 | 707027 |

Please check whether the result is as expected.
